CEPH-Devel Archive on lore.kernel.org
* cleanup updating the size of block devices v3
@ 2020-11-16 14:56 Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 01/78] block: remove the call to __invalidate_device in check_disk_size_change Christoph Hellwig
                   ` (79 more replies)
  0 siblings, 80 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Hi Jens,

this series builds on top of the work that went into the last merge window,
and makes sure we have a single coherent interface for updating the size of a
block device.

Changes since v2:
 - rebased to the set_capacity_revalidate_and_notify in mainline
 - keep the loop_set_size function
 - fix two mixed up acks
 
Changes since v1:
 - minor spelling fixes



* [PATCH 01/78] block: remove the call to __invalidate_device in check_disk_size_change
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
@ 2020-11-16 14:56 ` Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 02/78] loop: let set_capacity_revalidate_and_notify update the bdev size Christoph Hellwig
                   ` (78 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

__invalidate_device without the kill_dirty parameter just invalidates
various clean entries in caches, which doesn't really help us with
anything, but can cause all kinds of horrible lock orders due to how
it calls into the file system.  The only reason this hasn't been a
major issue is because so many people use partitions, for which no
invalidation was performed anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 fs/block_dev.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 9e84b1928b9401..66ebf594c97f47 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1334,12 +1334,6 @@ static void check_disk_size_change(struct gendisk *disk,
 		i_size_write(bdev->bd_inode, disk_size);
 	}
 	spin_unlock(&bdev->bd_size_lock);
-
-	if (bdev_size > disk_size) {
-		if (__invalidate_device(bdev, false))
-			pr_warn("VFS: busy inodes on resized disk %s\n",
-				disk->disk_name);
-	}
 }
 
 /**
-- 
2.29.2



* [PATCH 02/78] loop: let set_capacity_revalidate_and_notify update the bdev size
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 01/78] block: remove the call to __invalidate_device in check_disk_size_change Christoph Hellwig
@ 2020-11-16 14:56 ` Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 03/78] nvme: " Christoph Hellwig
                   ` (77 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

There is no good reason to call revalidate_disk_size separately.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/block/loop.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index a58084c2ed7ceb..0a0c0c3a68ec4c 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -251,12 +251,8 @@ loop_validate_block_size(unsigned short bsize)
  */
 static void loop_set_size(struct loop_device *lo, loff_t size)
 {
-	struct block_device *bdev = lo->lo_device;
-
-	bd_set_nr_sectors(bdev, size);
-
-	if (!set_capacity_revalidate_and_notify(lo->lo_disk, size, false))
-		kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE);
+	if (!set_capacity_revalidate_and_notify(lo->lo_disk, size, true))
+		kobject_uevent(&disk_to_dev(lo->lo_disk)->kobj, KOBJ_CHANGE);
 }
 
 static inline int
-- 
2.29.2



* [PATCH 03/78] nvme: let set_capacity_revalidate_and_notify update the bdev size
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 01/78] block: remove the call to __invalidate_device in check_disk_size_change Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 02/78] loop: let set_capacity_revalidate_and_notify update the bdev size Christoph Hellwig
@ 2020-11-16 14:56 ` Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 04/78] sd: update the bdev size in sd_revalidate_disk Christoph Hellwig
                   ` (76 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

There is no good reason to call revalidate_disk_size separately.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/core.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9b01afcb7777b8..f6c6479da0e9ec 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2053,7 +2053,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 			capacity = 0;
 	}
 
-	set_capacity_revalidate_and_notify(disk, capacity, false);
+	set_capacity_revalidate_and_notify(disk, capacity, true);
 
 	nvme_config_discard(disk, ns);
 	nvme_config_write_zeroes(disk, ns);
@@ -2134,7 +2134,6 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_id_ns *id)
 		blk_stack_limits(&ns->head->disk->queue->limits,
 				 &ns->queue->limits, 0);
 		blk_queue_update_readahead(ns->head->disk->queue);
-		nvme_update_bdev_size(ns->head->disk);
 		blk_mq_unfreeze_queue(ns->head->disk->queue);
 	}
 #endif
@@ -3963,8 +3962,6 @@ static void nvme_validate_ns(struct nvme_ns *ns, struct nvme_ns_ids *ids)
 	 */
 	if (ret && ret != -ENOMEM && !(ret > 0 && !(ret & NVME_SC_DNR)))
 		nvme_ns_remove(ns);
-	else
-		revalidate_disk_size(ns->disk, true);
 }
 
 static void nvme_validate_or_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
-- 
2.29.2



* [PATCH 04/78] sd: update the bdev size in sd_revalidate_disk
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2020-11-16 14:56 ` [PATCH 03/78] nvme: " Christoph Hellwig
@ 2020-11-16 14:56 ` Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 05/78] block: remove the update_bdev parameter to set_capacity_revalidate_and_notify Christoph Hellwig
                   ` (75 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

This avoids the extra call to revalidate_disk_size in sd_rescan and
is otherwise a no-op because the size did not change, or we are in
the probe path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/sd.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 656bcf4940d6d1..4a34dd5b153196 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1750,10 +1750,8 @@ static int sd_sync_cache(struct scsi_disk *sdkp, struct scsi_sense_hdr *sshdr)
 static void sd_rescan(struct device *dev)
 {
 	struct scsi_disk *sdkp = dev_get_drvdata(dev);
-	int ret;
 
-	ret = sd_revalidate_disk(sdkp->disk);
-	revalidate_disk_size(sdkp->disk, ret == 0);
+	sd_revalidate_disk(sdkp->disk);
 }
 
 static int sd_ioctl(struct block_device *bdev, fmode_t mode,
@@ -3266,7 +3264,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	sdkp->first_scan = 0;
 
 	set_capacity_revalidate_and_notify(disk,
-		logical_to_sectors(sdp, sdkp->capacity), false);
+		logical_to_sectors(sdp, sdkp->capacity), true);
 	sd_config_write_same(sdkp);
 	kfree(buffer);
 
@@ -3276,7 +3274,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	 * capacity to 0.
 	 */
 	if (sd_zbc_revalidate_zones(sdkp))
-		set_capacity_revalidate_and_notify(disk, 0, false);
+		set_capacity_revalidate_and_notify(disk, 0, true);
 
  out:
 	return 0;
-- 
2.29.2



* [PATCH 05/78] block: remove the update_bdev parameter to set_capacity_revalidate_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2020-11-16 14:56 ` [PATCH 04/78] sd: update the bdev size in sd_revalidate_disk Christoph Hellwig
@ 2020-11-16 14:56 ` Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 06/78] nbd: remove the call to set_blocksize Christoph Hellwig
                   ` (74 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke, Petr Vorel

The update_bdev argument is always set to true, so remove it.  Also
rename the function to the slightly less verbose set_capacity_and_notify,
as propagating the disk size to the block device isn't really
revalidation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
---
 block/genhd.c                | 13 +++++--------
 drivers/block/loop.c         |  2 +-
 drivers/block/virtio_blk.c   |  2 +-
 drivers/block/xen-blkfront.c |  2 +-
 drivers/nvme/host/core.c     |  2 +-
 drivers/scsi/sd.c            |  5 ++---
 include/linux/genhd.h        |  3 +--
 7 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 9387f050c248a7..8c350fecfe8bfe 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -46,17 +46,15 @@ static void disk_del_events(struct gendisk *disk);
 static void disk_release_events(struct gendisk *disk);
 
 /*
- * Set disk capacity and notify if the size is not currently
- * zero and will not be set to zero
+ * Set disk capacity and notify if the size is not currently zero and will not
+ * be set to zero.  Returns true if a uevent was sent, otherwise false.
  */
-bool set_capacity_revalidate_and_notify(struct gendisk *disk, sector_t size,
-					bool update_bdev)
+bool set_capacity_and_notify(struct gendisk *disk, sector_t size)
 {
 	sector_t capacity = get_capacity(disk);
 
 	set_capacity(disk, size);
-	if (update_bdev)
-		revalidate_disk_size(disk, true);
+	revalidate_disk_size(disk, true);
 
 	if (capacity != size && capacity != 0 && size != 0) {
 		char *envp[] = { "RESIZE=1", NULL };
@@ -67,8 +65,7 @@ bool set_capacity_revalidate_and_notify(struct gendisk *disk, sector_t size,
 
 	return false;
 }
-
-EXPORT_SYMBOL_GPL(set_capacity_revalidate_and_notify);
+EXPORT_SYMBOL_GPL(set_capacity_and_notify);
 
 /*
  * Format the device name of the indicated disk into the supplied buffer and
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 0a0c0c3a68ec4c..84a36c242e5550 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -251,7 +251,7 @@ loop_validate_block_size(unsigned short bsize)
  */
 static void loop_set_size(struct loop_device *lo, loff_t size)
 {
-	if (!set_capacity_revalidate_and_notify(lo->lo_disk, size, true))
+	if (!set_capacity_and_notify(lo->lo_disk, size))
 		kobject_uevent(&disk_to_dev(lo->lo_disk)->kobj, KOBJ_CHANGE);
 }
 
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index a314b9382442b6..3e812b4c32e669 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -470,7 +470,7 @@ static void virtblk_update_capacity(struct virtio_blk *vblk, bool resize)
 		   cap_str_10,
 		   cap_str_2);
 
-	set_capacity_revalidate_and_notify(vblk->disk, capacity, true);
+	set_capacity_and_notify(vblk->disk, capacity);
 }
 
 static void virtblk_config_changed_work(struct work_struct *work)
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 48629d3433b4c3..79521e33d30ed5 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2370,7 +2370,7 @@ static void blkfront_connect(struct blkfront_info *info)
 			return;
 		printk(KERN_INFO "Setting capacity to %Lu\n",
 		       sectors);
-		set_capacity_revalidate_and_notify(info->gd, sectors, true);
+		set_capacity_and_notify(info->gd, sectors);
 
 		return;
 	case BLKIF_STATE_SUSPENDED:
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f6c6479da0e9ec..6c144e748f8cae 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2053,7 +2053,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 			capacity = 0;
 	}
 
-	set_capacity_revalidate_and_notify(disk, capacity, true);
+	set_capacity_and_notify(disk, capacity);
 
 	nvme_config_discard(disk, ns);
 	nvme_config_write_zeroes(disk, ns);
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 4a34dd5b153196..a2a4f385833d6c 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3263,8 +3263,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 
 	sdkp->first_scan = 0;
 
-	set_capacity_revalidate_and_notify(disk,
-		logical_to_sectors(sdp, sdkp->capacity), true);
+	set_capacity_and_notify(disk, logical_to_sectors(sdp, sdkp->capacity));
 	sd_config_write_same(sdkp);
 	kfree(buffer);
 
@@ -3274,7 +3273,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	 * capacity to 0.
 	 */
 	if (sd_zbc_revalidate_zones(sdkp))
-		set_capacity_revalidate_and_notify(disk, 0, true);
+		set_capacity_and_notify(disk, 0);
 
  out:
 	return 0;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 03da3f603d309c..4b22bfd9336e1a 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -315,8 +315,7 @@ static inline int get_disk_ro(struct gendisk *disk)
 extern void disk_block_events(struct gendisk *disk);
 extern void disk_unblock_events(struct gendisk *disk);
 extern void disk_flush_events(struct gendisk *disk, unsigned int mask);
-bool set_capacity_revalidate_and_notify(struct gendisk *disk, sector_t size,
-		bool update_bdev);
+bool set_capacity_and_notify(struct gendisk *disk, sector_t size);
 
 /* drivers/char/random.c */
 extern void add_disk_randomness(struct gendisk *disk) __latent_entropy;
-- 
2.29.2
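
Editorial note: the notification rule in set_capacity_and_notify can be modelled in user space. The sketch below is an illustration only, not kernel code: struct model_disk and model_set_capacity_and_notify are hypothetical names, the block device size update is reduced to a plain assignment, and the uevent is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct gendisk; only what this sketch needs. */
struct model_disk {
	uint64_t capacity;          /* disk capacity in 512-byte sectors */
	uint64_t bdev_size;         /* size seen through the block device */
	unsigned int resize_events; /* counts "RESIZE=1" uevents */
};

/*
 * Models the rule from the patch: always store the new size, but only
 * notify when the size changed and neither old nor new size is zero.
 * Returns true if a (modelled) uevent was sent, otherwise false.
 */
static bool model_set_capacity_and_notify(struct model_disk *disk,
					  uint64_t size)
{
	uint64_t old = disk->capacity;

	disk->capacity = size;
	disk->bdev_size = size; /* simplified revalidate_disk_size() */

	if (old != size && old != 0 && size != 0) {
		disk->resize_events++;
		return true;
	}
	return false;
}
```

In this model the initial size assignment (old capacity 0) and a shrink to 0 do not count as a resize, which is why callers such as loop_set_size check the return value and send their own event when false comes back.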



* [PATCH 06/78] nbd: remove the call to set_blocksize
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2020-11-16 14:56 ` [PATCH 05/78] block: remove the update_bdev parameter to set_capacity_revalidate_and_notify Christoph Hellwig
@ 2020-11-16 14:56 ` Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 07/78] nbd: move the task_recv check into nbd_size_update Christoph Hellwig
                   ` (73 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Block drivers have no business setting the file system concept of a
block size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 drivers/block/nbd.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index aaae9220f3a008..a9a0b49ff16101 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -296,7 +296,7 @@ static void nbd_size_clear(struct nbd_device *nbd)
 	}
 }
 
-static void nbd_size_update(struct nbd_device *nbd, bool start)
+static void nbd_size_update(struct nbd_device *nbd)
 {
 	struct nbd_config *config = nbd->config;
 	struct block_device *bdev = bdget_disk(nbd->disk, 0);
@@ -311,11 +311,9 @@ static void nbd_size_update(struct nbd_device *nbd, bool start)
 	blk_queue_physical_block_size(nbd->disk->queue, config->blksize);
 	set_capacity(nbd->disk, nr_sectors);
 	if (bdev) {
-		if (bdev->bd_disk) {
+		if (bdev->bd_disk)
 			bd_set_nr_sectors(bdev, nr_sectors);
-			if (start)
-				set_blocksize(bdev, config->blksize);
-		} else
+		else
 			set_bit(GD_NEED_PART_SCAN, &nbd->disk->state);
 		bdput(bdev);
 	}
@@ -329,7 +327,7 @@ static void nbd_size_set(struct nbd_device *nbd, loff_t blocksize,
 	config->blksize = blocksize;
 	config->bytesize = blocksize * nr_blocks;
 	if (nbd->task_recv != NULL)
-		nbd_size_update(nbd, false);
+		nbd_size_update(nbd);
 }
 
 static void nbd_complete_rq(struct request *req)
@@ -1309,7 +1307,7 @@ static int nbd_start_device(struct nbd_device *nbd)
 		args->index = i;
 		queue_work(nbd->recv_workq, &args->work);
 	}
-	nbd_size_update(nbd, true);
+	nbd_size_update(nbd);
 	return error;
 }
 
-- 
2.29.2



* [PATCH 07/78] nbd: move the task_recv check into nbd_size_update
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2020-11-16 14:56 ` [PATCH 06/78] nbd: remove the call to set_blocksize Christoph Hellwig
@ 2020-11-16 14:56 ` Christoph Hellwig
  2020-11-16 14:56 ` [PATCH 08/78] nbd: refactor size updates Christoph Hellwig
                   ` (72 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

nbd_size_update is about to acquire a few more callers, so lift the check
into the function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 drivers/block/nbd.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index a9a0b49ff16101..48054051e281e6 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -299,8 +299,11 @@ static void nbd_size_clear(struct nbd_device *nbd)
 static void nbd_size_update(struct nbd_device *nbd)
 {
 	struct nbd_config *config = nbd->config;
-	struct block_device *bdev = bdget_disk(nbd->disk, 0);
 	sector_t nr_sectors = config->bytesize >> 9;
+	struct block_device *bdev;
+
+	if (!nbd->task_recv)
+		return;
 
 	if (config->flags & NBD_FLAG_SEND_TRIM) {
 		nbd->disk->queue->limits.discard_granularity = config->blksize;
@@ -309,7 +312,9 @@ static void nbd_size_update(struct nbd_device *nbd)
 	}
 	blk_queue_logical_block_size(nbd->disk->queue, config->blksize);
 	blk_queue_physical_block_size(nbd->disk->queue, config->blksize);
+
 	set_capacity(nbd->disk, nr_sectors);
+	bdev = bdget_disk(nbd->disk, 0);
 	if (bdev) {
 		if (bdev->bd_disk)
 			bd_set_nr_sectors(bdev, nr_sectors);
@@ -326,8 +331,7 @@ static void nbd_size_set(struct nbd_device *nbd, loff_t blocksize,
 	struct nbd_config *config = nbd->config;
 	config->blksize = blocksize;
 	config->bytesize = blocksize * nr_blocks;
-	if (nbd->task_recv != NULL)
-		nbd_size_update(nbd);
+	nbd_size_update(nbd);
 }
 
 static void nbd_complete_rq(struct request *req)
-- 
2.29.2



* [PATCH 08/78] nbd: refactor size updates
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2020-11-16 14:56 ` [PATCH 07/78] nbd: move the task_recv check into nbd_size_update Christoph Hellwig
@ 2020-11-16 14:56 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 09/78] nbd: validate the block size in nbd_set_size Christoph Hellwig
                   ` (71 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Merge nbd_size_set and nbd_size_update into a single function that also
updates the nbd_config fields.  This new function takes the device size
in bytes as the first argument, and the blocksize as the second argument,
simplifying the calculations required in most callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 drivers/block/nbd.c | 44 ++++++++++++++++++--------------------------
 1 file changed, 18 insertions(+), 26 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 48054051e281e6..6e8f2ff715c661 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -296,28 +296,30 @@ static void nbd_size_clear(struct nbd_device *nbd)
 	}
 }
 
-static void nbd_size_update(struct nbd_device *nbd)
+static void nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
+		loff_t blksize)
 {
-	struct nbd_config *config = nbd->config;
-	sector_t nr_sectors = config->bytesize >> 9;
 	struct block_device *bdev;
 
+	nbd->config->bytesize = bytesize;
+	nbd->config->blksize = blksize;
+
 	if (!nbd->task_recv)
 		return;
 
-	if (config->flags & NBD_FLAG_SEND_TRIM) {
-		nbd->disk->queue->limits.discard_granularity = config->blksize;
-		nbd->disk->queue->limits.discard_alignment = config->blksize;
+	if (nbd->config->flags & NBD_FLAG_SEND_TRIM) {
+		nbd->disk->queue->limits.discard_granularity = blksize;
+		nbd->disk->queue->limits.discard_alignment = blksize;
 		blk_queue_max_discard_sectors(nbd->disk->queue, UINT_MAX);
 	}
-	blk_queue_logical_block_size(nbd->disk->queue, config->blksize);
-	blk_queue_physical_block_size(nbd->disk->queue, config->blksize);
+	blk_queue_logical_block_size(nbd->disk->queue, blksize);
+	blk_queue_physical_block_size(nbd->disk->queue, blksize);
 
-	set_capacity(nbd->disk, nr_sectors);
+	set_capacity(nbd->disk, bytesize >> 9);
 	bdev = bdget_disk(nbd->disk, 0);
 	if (bdev) {
 		if (bdev->bd_disk)
-			bd_set_nr_sectors(bdev, nr_sectors);
+			bd_set_nr_sectors(bdev, bytesize >> 9);
 		else
 			set_bit(GD_NEED_PART_SCAN, &nbd->disk->state);
 		bdput(bdev);
@@ -325,15 +327,6 @@ static void nbd_size_update(struct nbd_device *nbd)
 	kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);
 }
 
-static void nbd_size_set(struct nbd_device *nbd, loff_t blocksize,
-			 loff_t nr_blocks)
-{
-	struct nbd_config *config = nbd->config;
-	config->blksize = blocksize;
-	config->bytesize = blocksize * nr_blocks;
-	nbd_size_update(nbd);
-}
-
 static void nbd_complete_rq(struct request *req)
 {
 	struct nbd_cmd *cmd = blk_mq_rq_to_pdu(req);
@@ -1311,7 +1304,7 @@ static int nbd_start_device(struct nbd_device *nbd)
 		args->index = i;
 		queue_work(nbd->recv_workq, &args->work);
 	}
-	nbd_size_update(nbd);
+	nbd_set_size(nbd, config->bytesize, config->blksize);
 	return error;
 }
 
@@ -1390,15 +1383,14 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
 			arg = NBD_DEF_BLKSIZE;
 		if (!nbd_is_valid_blksize(arg))
 			return -EINVAL;
-		nbd_size_set(nbd, arg,
-			     div_s64(config->bytesize, arg));
+		nbd_set_size(nbd, config->bytesize, arg);
 		return 0;
 	case NBD_SET_SIZE:
-		nbd_size_set(nbd, config->blksize,
-			     div_s64(arg, config->blksize));
+		nbd_set_size(nbd, arg, config->blksize);
 		return 0;
 	case NBD_SET_SIZE_BLOCKS:
-		nbd_size_set(nbd, config->blksize, arg);
+		nbd_set_size(nbd, arg * config->blksize,
+			     config->blksize);
 		return 0;
 	case NBD_SET_TIMEOUT:
 		nbd_set_cmd_timeout(nbd, arg);
@@ -1828,7 +1820,7 @@ static int nbd_genl_size_set(struct genl_info *info, struct nbd_device *nbd)
 	}
 
 	if (bytes != config->bytesize || bsize != config->blksize)
-		nbd_size_set(nbd, bsize, div64_u64(bytes, bsize));
+		nbd_set_size(nbd, bytes, bsize);
 	return 0;
 }
 
-- 
2.29.2
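
Editorial note: the change in argument convention can be illustrated with a small user-space model. This is a sketch under assumptions, not the driver code: struct model_config, model_size_set_old and model_set_size_new are hypothetical names that keep only the two nbd_config fields touched by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two nbd_config fields involved. */
struct model_config {
	int64_t bytesize;
	int64_t blksize;
};

/* Old convention: (blocksize, nr_blocks); the byte size is derived. */
static void model_size_set_old(struct model_config *c, int64_t blocksize,
			       int64_t nr_blocks)
{
	c->blksize = blocksize;
	c->bytesize = blocksize * nr_blocks;
}

/* New convention: (bytesize, blksize); both values are stored as passed. */
static void model_set_size_new(struct model_config *c, int64_t bytesize,
			       int64_t blksize)
{
	c->bytesize = bytesize;
	c->blksize = blksize;
}
```

With the old convention a caller holding a byte count had to divide by the block size first, so in this model a size that is not a multiple of the block size is truncated (10000 bytes with 4096-byte blocks becomes 8192). With the new convention the byte count is stored unchanged.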



* [PATCH 09/78] nbd: validate the block size in nbd_set_size
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2020-11-16 14:56 ` [PATCH 08/78] nbd: refactor size updates Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 10/78] nbd: use set_capacity_and_notify Christoph Hellwig
                   ` (70 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Move the validation of the block size from the callers into nbd_set_size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 drivers/block/nbd.c | 47 +++++++++++++++------------------------------
 1 file changed, 15 insertions(+), 32 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 6e8f2ff715c661..7478a5e02bc1ed 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -296,16 +296,21 @@ static void nbd_size_clear(struct nbd_device *nbd)
 	}
 }
 
-static void nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
+static int nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
 		loff_t blksize)
 {
 	struct block_device *bdev;
 
+	if (!blksize)
+		blksize = NBD_DEF_BLKSIZE;
+	if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize))
+		return -EINVAL;
+
 	nbd->config->bytesize = bytesize;
 	nbd->config->blksize = blksize;
 
 	if (!nbd->task_recv)
-		return;
+		return 0;
 
 	if (nbd->config->flags & NBD_FLAG_SEND_TRIM) {
 		nbd->disk->queue->limits.discard_granularity = blksize;
@@ -325,6 +330,7 @@ static void nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
 		bdput(bdev);
 	}
 	kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);
+	return 0;
 }
 
 static void nbd_complete_rq(struct request *req)
@@ -1304,8 +1310,7 @@ static int nbd_start_device(struct nbd_device *nbd)
 		args->index = i;
 		queue_work(nbd->recv_workq, &args->work);
 	}
-	nbd_set_size(nbd, config->bytesize, config->blksize);
-	return error;
+	return nbd_set_size(nbd, config->bytesize, config->blksize);
 }
 
 static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
@@ -1347,14 +1352,6 @@ static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
 		nbd_config_put(nbd);
 }
 
-static bool nbd_is_valid_blksize(unsigned long blksize)
-{
-	if (!blksize || !is_power_of_2(blksize) || blksize < 512 ||
-	    blksize > PAGE_SIZE)
-		return false;
-	return true;
-}
-
 static void nbd_set_cmd_timeout(struct nbd_device *nbd, u64 timeout)
 {
 	nbd->tag_set.timeout = timeout * HZ;
@@ -1379,19 +1376,12 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
 	case NBD_SET_SOCK:
 		return nbd_add_socket(nbd, arg, false);
 	case NBD_SET_BLKSIZE:
-		if (!arg)
-			arg = NBD_DEF_BLKSIZE;
-		if (!nbd_is_valid_blksize(arg))
-			return -EINVAL;
-		nbd_set_size(nbd, config->bytesize, arg);
-		return 0;
+		return nbd_set_size(nbd, config->bytesize, arg);
 	case NBD_SET_SIZE:
-		nbd_set_size(nbd, arg, config->blksize);
-		return 0;
+		return nbd_set_size(nbd, arg, config->blksize);
 	case NBD_SET_SIZE_BLOCKS:
-		nbd_set_size(nbd, arg * config->blksize,
-			     config->blksize);
-		return 0;
+		return nbd_set_size(nbd, arg * config->blksize,
+				    config->blksize);
 	case NBD_SET_TIMEOUT:
 		nbd_set_cmd_timeout(nbd, arg);
 		return 0;
@@ -1809,18 +1799,11 @@ static int nbd_genl_size_set(struct genl_info *info, struct nbd_device *nbd)
 	if (info->attrs[NBD_ATTR_SIZE_BYTES])
 		bytes = nla_get_u64(info->attrs[NBD_ATTR_SIZE_BYTES]);
 
-	if (info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]) {
+	if (info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES])
 		bsize = nla_get_u64(info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]);
-		if (!bsize)
-			bsize = NBD_DEF_BLKSIZE;
-		if (!nbd_is_valid_blksize(bsize)) {
-			printk(KERN_ERR "Invalid block size %llu\n", bsize);
-			return -EINVAL;
-		}
-	}
 
 	if (bytes != config->bytesize || bsize != config->blksize)
-		nbd_set_size(nbd, bytes, bsize);
+		return nbd_set_size(nbd, bytes, bsize);
 	return 0;
 }
 
-- 
2.29.2
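
Editorial note: the block size check that this patch centralizes can be tried out in user space. The sketch below is illustrative only: model_check_blksize and the MODEL_* constants are hypothetical names, and the values 1024 and 4096 are assumptions standing in for NBD_DEF_BLKSIZE and PAGE_SIZE.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_DEF_BLKSIZE 1024LL /* assumed value of NBD_DEF_BLKSIZE */
#define MODEL_PAGE_SIZE   4096LL /* assumed PAGE_SIZE; arch dependent */

static bool model_is_power_of_2(long long n)
{
	return n > 0 && (n & (n - 1)) == 0;
}

/*
 * Same order of checks as the patch: zero selects the default, then the
 * value must be a power of two between 512 and the page size.
 * On success *blksize holds the effective block size.
 */
static int model_check_blksize(long long *blksize)
{
	if (!*blksize)
		*blksize = MODEL_DEF_BLKSIZE;
	if (*blksize < 512 || *blksize > MODEL_PAGE_SIZE ||
	    !model_is_power_of_2(*blksize))
		return -EINVAL;
	return 0;
}
```

Because the default is substituted before the range check, a block size of 0 is accepted and replaced rather than rejected, matching what the callers did individually before this patch.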



* [PATCH 10/78] nbd: use set_capacity_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 09/78] nbd: validate the block size in nbd_set_size Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 11/78] aoe: don't call set_capacity from irq context Christoph Hellwig
                   ` (69 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use set_capacity_and_notify to update the disk and block device sizes and
send a RESIZE uevent to userspace.  Note that blktests relies on uevents
also being sent for updates that do not change the device size, so the
explicit kobject_uevent remains for that case.
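
The return-value contract relied on above can be sketched in plain
userspace C.  This is only a model, not kernel code: struct disk_model
and both model_* functions are hypothetical names, and the rule that no
RESIZE uevent is sent when the old or new size is 0 is an assumption
based on the nvme patch description later in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for struct gendisk; not the kernel type. */
struct disk_model {
	uint64_t capacity;	/* size in 512-byte sectors */
	unsigned int uevents;	/* uevents delivered so far */
};

/*
 * Model of the helper's contract: always store the new size, and
 * return true only if a RESIZE uevent was sent.  Unchanged sizes and
 * transitions to or from 0 are assumed to send nothing.
 */
static bool model_set_capacity_and_notify(struct disk_model *d, uint64_t size)
{
	uint64_t old = d->capacity;

	d->capacity = size;
	if (size == old || old == 0 || size == 0)
		return false;
	d->uevents++;
	return true;
}

/* Model of the tail of nbd_set_size() after this patch. */
static void model_nbd_set_size(struct disk_model *d, uint64_t bytesize)
{
	/* Fall back to an explicit change event if the helper sent none. */
	if (!model_set_capacity_and_notify(d, bytesize >> 9))
		d->uevents++;
}
```

In this model every call to model_nbd_set_size() produces exactly one
event, whether or not the size changed, which is the behaviour the
commit message says blktests depends on.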

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 drivers/block/nbd.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 7478a5e02bc1ed..45b0423ef2c53d 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -299,8 +299,6 @@ static void nbd_size_clear(struct nbd_device *nbd)
 static int nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
 		loff_t blksize)
 {
-	struct block_device *bdev;
-
 	if (!blksize)
 		blksize = NBD_DEF_BLKSIZE;
 	if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize))
@@ -320,16 +318,9 @@ static int nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
 	blk_queue_logical_block_size(nbd->disk->queue, blksize);
 	blk_queue_physical_block_size(nbd->disk->queue, blksize);
 
-	set_capacity(nbd->disk, bytesize >> 9);
-	bdev = bdget_disk(nbd->disk, 0);
-	if (bdev) {
-		if (bdev->bd_disk)
-			bd_set_nr_sectors(bdev, bytesize >> 9);
-		else
-			set_bit(GD_NEED_PART_SCAN, &nbd->disk->state);
-		bdput(bdev);
-	}
-	kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);
+	set_bit(GD_NEED_PART_SCAN, &nbd->disk->state);
+	if (!set_capacity_and_notify(nbd->disk, bytesize >> 9))
+		kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);
 	return 0;
 }
 
-- 
2.29.2



* [PATCH 11/78] aoe: don't call set_capacity from irq context
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 10/78] nbd: use set_capacity_and_notify Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 12/78] dm: use set_capacity_and_notify Christoph Hellwig
                   ` (68 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Updating the block device size from irq context can lead to torn
writes of the 64-bit value, and prevents us from using normal
process context locking primitives to serialize access to the 64-bit
nr_sectors value.  Defer the set_capacity to the already existing
workqueue handler, where it can be merged with the update of the
block device size by using set_capacity_and_notify.  As an extra
bonus this also adds proper uevent notifications for the resize.
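
The deferral pattern described above can be sketched as a small
userspace model.  Nothing here is the aoe driver itself: struct
aoedev_model, the MODEL_* flags and both functions are hypothetical
stand-ins, and recording the size in an ssize field before scheduling
the work is an assumption about driver context not shown in the diff.

```c
#include <stdint.h>

#define MODEL_NEWSIZE 0x1u	/* stand-in for DEVFL_NEWSIZE */
#define MODEL_UP      0x2u	/* stand-in for DEVFL_UP */

struct aoedev_model {
	unsigned int flags;
	uint64_t ssize;		/* size reported by the target, in sectors */
	uint64_t capacity;	/* size currently published by the disk */
};

/* "irq" side: only record the size and request deferred work. */
static void model_ataid_complete(struct aoedev_model *d, uint64_t ssize)
{
	d->ssize = ssize;
	d->flags |= MODEL_NEWSIZE;
	/* the driver would call schedule_work() at this point */
}

/* Worker side: publish the size from process context. */
static void model_sleepwork(struct aoedev_model *d)
{
	if (d->flags & MODEL_NEWSIZE) {
		/* set_capacity_and_notify() takes this role in the patch */
		d->capacity = d->ssize;
		d->flags |= MODEL_UP;
		d->flags &= ~MODEL_NEWSIZE;
	}
}
```

The point of the split is that the published 64-bit capacity is only
ever written by the worker, so the interrupt path never touches it.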

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/aoe/aoecmd.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 313f0b946fe2b3..ac720bdcd983e7 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -890,19 +890,13 @@ void
 aoecmd_sleepwork(struct work_struct *work)
 {
 	struct aoedev *d = container_of(work, struct aoedev, work);
-	struct block_device *bd;
-	u64 ssize;
 
 	if (d->flags & DEVFL_GDALLOC)
 		aoeblk_gdalloc(d);
 
 	if (d->flags & DEVFL_NEWSIZE) {
-		ssize = get_capacity(d->gd);
-		bd = bdget_disk(d->gd, 0);
-		if (bd) {
-			bd_set_nr_sectors(bd, ssize);
-			bdput(bd);
-		}
+		set_capacity_and_notify(d->gd, d->ssize);
+
 		spin_lock_irq(&d->lock);
 		d->flags |= DEVFL_UP;
 		d->flags &= ~DEVFL_NEWSIZE;
@@ -971,10 +965,9 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
 	d->geo.start = 0;
 	if (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE))
 		return;
-	if (d->gd != NULL) {
-		set_capacity(d->gd, ssize);
+	if (d->gd != NULL)
 		d->flags |= DEVFL_NEWSIZE;
-	} else
+	else
 		d->flags |= DEVFL_GDALLOC;
 	schedule_work(&d->work);
 }
-- 
2.29.2



* [PATCH 12/78] dm: use set_capacity_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (10 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 11/78] aoe: don't call set_capacity from irq context Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2021-02-12 15:45   ` Mike Snitzer
  2020-11-16 14:57 ` [PATCH 13/78] pktcdvd: " Christoph Hellwig
                   ` (67 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index c18fc25485186d..62ad44925e73ec 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1971,8 +1971,7 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 	if (size != dm_get_size(md))
 		memset(&md->geometry, 0, sizeof(md->geometry));
 
-	set_capacity(md->disk, size);
-	bd_set_nr_sectors(md->bdev, size);
+	set_capacity_and_notify(md->disk, size);
 
 	dm_table_event_callback(t, event_callback, md);
 
-- 
2.29.2



* [PATCH 13/78] pktcdvd: use set_capacity_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (11 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 12/78] dm: use set_capacity_and_notify Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 14/78] nvme: use set_capacity_and_notify in nvme_set_queue_dying Christoph Hellwig
                   ` (66 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/pktcdvd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 467dbd06b7cdb1..4326401cede445 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2130,8 +2130,7 @@ static int pkt_open_dev(struct pktcdvd_device *pd, fmode_t write)
 	}
 
 	set_capacity(pd->disk, lba << 2);
-	set_capacity(pd->bdev->bd_disk, lba << 2);
-	bd_set_nr_sectors(pd->bdev, lba << 2);
+	set_capacity_and_notify(pd->bdev->bd_disk, lba << 2);
 
 	q = bdev_get_queue(pd->bdev);
 	if (write) {
-- 
2.29.2



* [PATCH 14/78] nvme: use set_capacity_and_notify in nvme_set_queue_dying
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (12 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 13/78] pktcdvd: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 15/78] drbd: use set_capacity_and_notify Christoph Hellwig
                   ` (65 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Use the block layer helper to update both the disk and block device
sizes.  Contrary to the name, no notification is sent in this case,
as a size of 0 is special-cased.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/core.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6c144e748f8cae..bc89e8659c403f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -93,16 +93,6 @@ static void nvme_put_subsystem(struct nvme_subsystem *subsys);
 static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
 					   unsigned nsid);
 
-static void nvme_update_bdev_size(struct gendisk *disk)
-{
-	struct block_device *bdev = bdget_disk(disk, 0);
-
-	if (bdev) {
-		bd_set_nr_sectors(bdev, get_capacity(disk));
-		bdput(bdev);
-	}
-}
-
 /*
  * Prepare a queue for teardown.
  *
@@ -119,8 +109,7 @@ static void nvme_set_queue_dying(struct nvme_ns *ns)
 	blk_set_queue_dying(ns->queue);
 	blk_mq_unquiesce_queue(ns->queue);
 
-	set_capacity(ns->disk, 0);
-	nvme_update_bdev_size(ns->disk);
+	set_capacity_and_notify(ns->disk, 0);
 }
 
 static void nvme_queue_scan(struct nvme_ctrl *ctrl)
-- 
2.29.2



* [PATCH 15/78] drbd: use set_capacity_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (13 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 14/78] nvme: use set_capacity_and_notify in nvme_set_queue_dying Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 16/78] rbd: " Christoph Hellwig
                   ` (64 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/drbd/drbd_main.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 65b95aef8dbc95..1c8c18b2a25f33 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2036,8 +2036,7 @@ void drbd_set_my_capacity(struct drbd_device *device, sector_t size)
 {
 	char ppb[10];
 
-	set_capacity(device->vdisk, size);
-	revalidate_disk_size(device->vdisk, false);
+	set_capacity_and_notify(device->vdisk, size);
 
 	drbd_info(device, "size = %s (%llu KB)\n",
 		ppsize(ppb, size>>1), (unsigned long long)size>>1);
@@ -2068,8 +2067,7 @@ void drbd_device_cleanup(struct drbd_device *device)
 	}
 	D_ASSERT(device, first_peer_device(device)->connection->net_conf == NULL);
 
-	set_capacity(device->vdisk, 0);
-	revalidate_disk_size(device->vdisk, false);
+	set_capacity_and_notify(device->vdisk, 0);
 	if (device->bitmap) {
 		/* maybe never allocated. */
 		drbd_bm_resize(device, 0, 1);
-- 
2.29.2



* [PATCH 16/78] rbd: use set_capacity_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (14 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 15/78] drbd: use set_capacity_and_notify Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 17/78] rnbd: " Christoph Hellwig
                   ` (63 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ilya Dryomov <idryomov@gmail.com>
---
 drivers/block/rbd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index f84128abade319..b7a194ffda55b4 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -4920,8 +4920,7 @@ static void rbd_dev_update_size(struct rbd_device *rbd_dev)
 	    !test_bit(RBD_DEV_FLAG_REMOVING, &rbd_dev->flags)) {
 		size = (sector_t)rbd_dev->mapping.size / SECTOR_SIZE;
 		dout("setting size to %llu sectors", (unsigned long long)size);
-		set_capacity(rbd_dev->disk, size);
-		revalidate_disk_size(rbd_dev->disk, true);
+		set_capacity_and_notify(rbd_dev->disk, size);
 	}
 }
 
-- 
2.29.2



* [PATCH 17/78] rnbd: use set_capacity_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (15 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 16/78] rbd: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 18/78] zram: " Christoph Hellwig
                   ` (62 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/block/rnbd/rnbd-clt.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 8b2411ccbda97c..bb13d7dd195a08 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -100,8 +100,7 @@ static int rnbd_clt_change_capacity(struct rnbd_clt_dev *dev,
 	rnbd_clt_info(dev, "Device size changed from %zu to %zu sectors\n",
 		       dev->nsectors, new_nsectors);
 	dev->nsectors = new_nsectors;
-	set_capacity(dev->gd, dev->nsectors);
-	revalidate_disk_size(dev->gd, true);
+	set_capacity_and_notify(dev->gd, dev->nsectors);
 	return 0;
 }
 
-- 
2.29.2



* [PATCH 18/78] zram: use set_capacity_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (16 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 17/78] rnbd: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 19/78] dm-raid: " Christoph Hellwig
                   ` (61 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/zram/zram_drv.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 1b697208d66157..6d15d51cee2b7e 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1695,7 +1695,7 @@ static void zram_reset_device(struct zram *zram)
 	disksize = zram->disksize;
 	zram->disksize = 0;
 
-	set_capacity(zram->disk, 0);
+	set_capacity_and_notify(zram->disk, 0);
 	part_stat_set_all(&zram->disk->part0, 0);
 
 	up_write(&zram->init_lock);
@@ -1741,9 +1741,7 @@ static ssize_t disksize_store(struct device *dev,
 
 	zram->comp = comp;
 	zram->disksize = disksize;
-	set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT);
-
-	revalidate_disk_size(zram->disk, true);
+	set_capacity_and_notify(zram->disk, zram->disksize >> SECTOR_SHIFT);
 	up_write(&zram->init_lock);
 
 	return len;
@@ -1790,7 +1788,6 @@ static ssize_t reset_store(struct device *dev,
 	/* Make sure all the pending I/O are finished */
 	fsync_bdev(bdev);
 	zram_reset_device(zram);
-	revalidate_disk_size(zram->disk, true);
 	bdput(bdev);
 
 	mutex_lock(&bdev->bd_mutex);
-- 
2.29.2



* [PATCH 19/78] dm-raid: use set_capacity_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (17 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 18/78] zram: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 20/78] md: " Christoph Hellwig
                   ` (60 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-raid.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 9c1f7c4de65b35..294f34d2d61bae 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -700,8 +700,7 @@ static void rs_set_capacity(struct raid_set *rs)
 {
 	struct gendisk *gendisk = dm_disk(dm_table_get_md(rs->ti->table));
 
-	set_capacity(gendisk, rs->md.array_sectors);
-	revalidate_disk_size(gendisk, true);
+	set_capacity_and_notify(gendisk, rs->md.array_sectors);
 }
 
 /*
-- 
2.29.2



* [PATCH 20/78] md: use set_capacity_and_notify
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (18 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 19/78] dm-raid: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 21/78] md: remove a spurious call to revalidate_disk_size in update_size Christoph Hellwig
                   ` (59 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Song Liu <song@kernel.org>
---
 drivers/md/md-cluster.c |  6 ++----
 drivers/md/md-linear.c  |  3 +--
 drivers/md/md.c         | 24 ++++++++++--------------
 3 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
index 4aaf4820b6f625..87442dc59f6ca3 100644
--- a/drivers/md/md-cluster.c
+++ b/drivers/md/md-cluster.c
@@ -581,8 +581,7 @@ static int process_recvd_msg(struct mddev *mddev, struct cluster_msg *msg)
 		process_metadata_update(mddev, msg);
 		break;
 	case CHANGE_CAPACITY:
-		set_capacity(mddev->gendisk, mddev->array_sectors);
-		revalidate_disk_size(mddev->gendisk, true);
+		set_capacity_and_notify(mddev->gendisk, mddev->array_sectors);
 		break;
 	case RESYNCING:
 		set_bit(MD_RESYNCING_REMOTE, &mddev->recovery);
@@ -1296,8 +1295,7 @@ static void update_size(struct mddev *mddev, sector_t old_dev_sectors)
 		if (ret)
 			pr_err("%s:%d: failed to send CHANGE_CAPACITY msg\n",
 			       __func__, __LINE__);
-		set_capacity(mddev->gendisk, mddev->array_sectors);
-		revalidate_disk_size(mddev->gendisk, true);
+		set_capacity_and_notify(mddev->gendisk, mddev->array_sectors);
 	} else {
 		/* revert to previous sectors */
 		ret = mddev->pers->resize(mddev, old_dev_sectors);
diff --git a/drivers/md/md-linear.c b/drivers/md/md-linear.c
index 5ab22069b5be9c..98f1b4b2bdcef8 100644
--- a/drivers/md/md-linear.c
+++ b/drivers/md/md-linear.c
@@ -200,9 +200,8 @@ static int linear_add(struct mddev *mddev, struct md_rdev *rdev)
 		"copied raid_disks doesn't match mddev->raid_disks");
 	rcu_assign_pointer(mddev->private, newconf);
 	md_set_array_sectors(mddev, linear_size(mddev, 0, 0));
-	set_capacity(mddev->gendisk, mddev->array_sectors);
+	set_capacity_and_notify(mddev->gendisk, mddev->array_sectors);
 	mddev_resume(mddev);
-	revalidate_disk_size(mddev->gendisk, true);
 	kfree_rcu(oldconf, rcu);
 	return 0;
 }
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 98bac4f304ae26..32e375d50fee17 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5355,10 +5355,9 @@ array_size_store(struct mddev *mddev, const char *buf, size_t len)
 
 	if (!err) {
 		mddev->array_sectors = sectors;
-		if (mddev->pers) {
-			set_capacity(mddev->gendisk, mddev->array_sectors);
-			revalidate_disk_size(mddev->gendisk, true);
-		}
+		if (mddev->pers)
+			set_capacity_and_notify(mddev->gendisk,
+						mddev->array_sectors);
 	}
 	mddev_unlock(mddev);
 	return err ?: len;
@@ -6107,8 +6106,7 @@ int do_md_run(struct mddev *mddev)
 	md_wakeup_thread(mddev->thread);
 	md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
 
-	set_capacity(mddev->gendisk, mddev->array_sectors);
-	revalidate_disk_size(mddev->gendisk, true);
+	set_capacity_and_notify(mddev->gendisk, mddev->array_sectors);
 	clear_bit(MD_NOT_READY, &mddev->flags);
 	mddev->changed = 1;
 	kobject_uevent(&disk_to_dev(mddev->gendisk)->kobj, KOBJ_CHANGE);
@@ -6423,10 +6421,9 @@ static int do_md_stop(struct mddev *mddev, int mode,
 			if (rdev->raid_disk >= 0)
 				sysfs_unlink_rdev(mddev, rdev);
 
-		set_capacity(disk, 0);
+		set_capacity_and_notify(disk, 0);
 		mutex_unlock(&mddev->open_mutex);
 		mddev->changed = 1;
-		revalidate_disk_size(disk, true);
 
 		if (mddev->ro)
 			mddev->ro = 0;
@@ -7257,8 +7254,8 @@ static int update_size(struct mddev *mddev, sector_t num_sectors)
 		if (mddev_is_clustered(mddev))
 			md_cluster_ops->update_size(mddev, old_dev_sectors);
 		else if (mddev->queue) {
-			set_capacity(mddev->gendisk, mddev->array_sectors);
-			revalidate_disk_size(mddev->gendisk, true);
+			set_capacity_and_notify(mddev->gendisk,
+						mddev->array_sectors);
 		}
 	}
 	return rv;
@@ -9035,10 +9032,9 @@ void md_do_sync(struct md_thread *thread)
 		mddev_lock_nointr(mddev);
 		md_set_array_sectors(mddev, mddev->pers->size(mddev, 0, 0));
 		mddev_unlock(mddev);
-		if (!mddev_is_clustered(mddev)) {
-			set_capacity(mddev->gendisk, mddev->array_sectors);
-			revalidate_disk_size(mddev->gendisk, true);
-		}
+		if (!mddev_is_clustered(mddev))
+			set_capacity_and_notify(mddev->gendisk,
+						mddev->array_sectors);
 	}
 
 	spin_lock(&mddev->lock);
-- 
2.29.2



* [PATCH 21/78] md: remove a spurious call to revalidate_disk_size in update_size
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (19 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 20/78] md: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 22/78] virtio-blk: remove a spurious call to revalidate_disk_size Christoph Hellwig
                   ` (58 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

None of the ->resize methods updates the disk size, so calling
revalidate_disk_size here won't do anything.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Song Liu <song@kernel.org>
---
 drivers/md/md-cluster.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
index 87442dc59f6ca3..35e2690c1803dd 100644
--- a/drivers/md/md-cluster.c
+++ b/drivers/md/md-cluster.c
@@ -1299,8 +1299,6 @@ static void update_size(struct mddev *mddev, sector_t old_dev_sectors)
 	} else {
 		/* revert to previous sectors */
 		ret = mddev->pers->resize(mddev, old_dev_sectors);
-		if (!ret)
-			revalidate_disk_size(mddev->gendisk, true);
 		ret = __sendmsg(cinfo, &cmsg);
 		if (ret)
 			pr_err("%s:%d: failed to send METADATA_UPDATED msg\n",
-- 
2.29.2



* [PATCH 22/78] virtio-blk: remove a spurious call to revalidate_disk_size
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (20 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 21/78] md: remove a spurious call to revalidate_disk_size in update_size Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 23/78] block: unexport revalidate_disk_size Christoph Hellwig
                   ` (57 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

revalidate_disk_size just updates the block device size from the disk
size.  Thus calling it from virtblk_update_cache_mode doesn't actually
do anything.
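
Why the removed call is a no-op can be illustrated with a tiny userspace
model.  struct sizes_model and model_revalidate_disk_size() are
hypothetical names; the only behaviour modelled is the one stated above,
namely that revalidation copies the disk size to the block device.

```c
#include <stdint.h>

struct sizes_model {
	uint64_t disk_sectors;	/* what set_capacity() stores */
	uint64_t bdev_sectors;	/* what the block device reports */
};

/*
 * Model of revalidate_disk_size(): copy the disk size to the block
 * device.  Returns 1 if the block device size changed, 0 otherwise.
 */
static int model_revalidate_disk_size(struct sizes_model *s)
{
	if (s->bdev_sectors == s->disk_sectors)
		return 0;	/* already in sync, nothing to do */
	s->bdev_sectors = s->disk_sectors;
	return 1;
}
```

A cache mode update does not alter the disk size, so in this model a
revalidation at that point always returns 0.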

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/block/virtio_blk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 3e812b4c32e669..145606dc52db1e 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -598,7 +598,6 @@ static void virtblk_update_cache_mode(struct virtio_device *vdev)
 	struct virtio_blk *vblk = vdev->priv;
 
 	blk_queue_write_cache(vblk->disk->queue, writeback, false);
-	revalidate_disk_size(vblk->disk, true);
 }
 
 static const char *const virtblk_cache_types[] = {
-- 
2.29.2



* [PATCH 23/78] block: unexport revalidate_disk_size
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (21 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 22/78] virtio-blk: remove a spurious call to revalidate_disk_size Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 24/78] mtd_blkdevs: don't override BLKFLSBUF Christoph Hellwig
                   ` (56 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

revalidate_disk_size is now only called from set_capacity_and_notify,
so drop the export.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/block_dev.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 66ebf594c97f47..d8664f5c1ff669 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1362,7 +1362,6 @@ void revalidate_disk_size(struct gendisk *disk, bool verbose)
 		bdput(bdev);
 	}
 }
-EXPORT_SYMBOL(revalidate_disk_size);
 
 void bd_set_nr_sectors(struct block_device *bdev, sector_t sectors)
 {
-- 
2.29.2



* [PATCH 24/78] mtd_blkdevs: don't override BLKFLSBUF
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (22 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 23/78] block: unexport revalidate_disk_size Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 25/78] block: don't call into the driver for BLKFLSBUF Christoph Hellwig
                   ` (55 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Richard Weinberger

BLKFLSBUF is not supposed to actually send a flush command to the device,
but to tear down buffer cache structures.  Remove the mtd_blkdevs
implementation and just use the default semantics instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/mtd_blkdevs.c | 28 ----------------------------
 1 file changed, 28 deletions(-)

diff --git a/drivers/mtd/mtd_blkdevs.c b/drivers/mtd/mtd_blkdevs.c
index 0c05f77f9b216e..fb8e12d590a13a 100644
--- a/drivers/mtd/mtd_blkdevs.c
+++ b/drivers/mtd/mtd_blkdevs.c
@@ -298,38 +298,10 @@ static int blktrans_getgeo(struct block_device *bdev, struct hd_geometry *geo)
 	return ret;
 }
 
-static int blktrans_ioctl(struct block_device *bdev, fmode_t mode,
-			      unsigned int cmd, unsigned long arg)
-{
-	struct mtd_blktrans_dev *dev = blktrans_dev_get(bdev->bd_disk);
-	int ret = -ENXIO;
-
-	if (!dev)
-		return ret;
-
-	mutex_lock(&dev->lock);
-
-	if (!dev->mtd)
-		goto unlock;
-
-	switch (cmd) {
-	case BLKFLSBUF:
-		ret = dev->tr->flush ? dev->tr->flush(dev) : 0;
-		break;
-	default:
-		ret = -ENOTTY;
-	}
-unlock:
-	mutex_unlock(&dev->lock);
-	blktrans_dev_put(dev);
-	return ret;
-}
-
 static const struct block_device_operations mtd_block_ops = {
 	.owner		= THIS_MODULE,
 	.open		= blktrans_open,
 	.release	= blktrans_release,
-	.ioctl		= blktrans_ioctl,
 	.getgeo		= blktrans_getgeo,
 };
 
-- 
2.29.2


* [PATCH 25/78] block: don't call into the driver for BLKFLSBUF
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (23 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 24/78] mtd_blkdevs: don't override BLKFLSBUF Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 26/78] block: add a new set_read_only method Christoph Hellwig
                   ` (54 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

BLKFLSBUF is entirely contained in the block core, and there is no
good reason to give the driver a hook into processing it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/ioctl.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 3fbc382eb926d4..c6d8863f040945 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -369,15 +369,8 @@ static inline int is_unrecognized_ioctl(int ret)
 static int blkdev_flushbuf(struct block_device *bdev, fmode_t mode,
 		unsigned cmd, unsigned long arg)
 {
-	int ret;
-
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
-
-	ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
-	if (!is_unrecognized_ioctl(ret))
-		return ret;
-
 	fsync_bdev(bdev);
 	invalidate_bdev(bdev);
 	return 0;
-- 
2.29.2


* [PATCH 26/78] block: add a new set_read_only method
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (24 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 25/78] block: don't call into the driver for BLKFLSBUF Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 27/78] rbd: implement ->set_read_only to hook into BLKROSET processing Christoph Hellwig
                   ` (53 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Add a new method to allow for driver-specific processing when setting or
clearing the block device read-only state.  This allows replacing the
cumbersome and error-prone override of the whole ioctl implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/ioctl.c          | 5 +++++
 include/linux/blkdev.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/block/ioctl.c b/block/ioctl.c
index c6d8863f040945..a6fa16b9770593 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -389,6 +389,11 @@ static int blkdev_roset(struct block_device *bdev, fmode_t mode,
 		return ret;
 	if (get_user(n, (int __user *)arg))
 		return -EFAULT;
+	if (bdev->bd_disk->fops->set_read_only) {
+		ret = bdev->bd_disk->fops->set_read_only(bdev, n);
+		if (ret)
+			return ret;
+	}
 	set_device_ro(bdev, n);
 	return 0;
 }
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 639cae2c158b59..5c1ba8a8d2bc7e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1850,6 +1850,7 @@ struct block_device_operations {
 	void (*unlock_native_capacity) (struct gendisk *);
 	int (*revalidate_disk) (struct gendisk *);
 	int (*getgeo)(struct block_device *, struct hd_geometry *);
+	int (*set_read_only)(struct block_device *bdev, bool ro);
 	/* this callback is with swap_lock and sometimes page table lock held */
 	void (*swap_slot_free_notify) (struct block_device *, unsigned long);
 	int (*report_zones)(struct gendisk *, sector_t sector,
-- 
2.29.2
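
For readers following the series, here is a minimal userspace sketch of the
control flow this patch adds to blkdev_roset(): an optional driver hook runs
first and can veto the change, and only then is the read-only policy updated.
It is a model under stated assumptions, not kernel code.  All model_* names
are hypothetical, and the example hook (refusing to make read-only hardware
writable) is only one plausible use of the method.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct model_bdev;

struct model_ops {
	/* Optional hook: may veto or prepare for the read-only change. */
	int (*set_read_only)(struct model_bdev *bdev, bool ro);
};

struct model_bdev {
	const struct model_ops *ops;
	bool hw_read_only;	/* assumption: device can never be writable */
	bool policy;		/* models bd_part->policy */
};

/* Mirrors the order of operations in the patched blkdev_roset(). */
static int model_roset(struct model_bdev *bdev, int n)
{
	int ret;

	if (bdev->ops->set_read_only) {
		ret = bdev->ops->set_read_only(bdev, n);
		if (ret)
			return ret;	/* driver vetoed, policy untouched */
	}
	bdev->policy = n;
	return 0;
}

/* Example driver hook: refuse to make read-only hardware writable. */
static int model_drv_set_read_only(struct model_bdev *bdev, bool ro)
{
	if (!ro && bdev->hw_read_only)
		return -EROFS;
	return 0;
}

/* A driver without the hook, and one that implements it. */
static const struct model_ops model_plain_ops = { .set_read_only = NULL };
static const struct model_ops model_hooked_ops = {
	.set_read_only = model_drv_set_read_only,
};
```

The point of the design is that the driver only sees the already-decoded
boolean, so it no longer has to copy the argument from user space or
recognize the ioctl command itself.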


* [PATCH 27/78] rbd: implement ->set_read_only to hook into BLKROSET processing
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (25 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 26/78] block: add a new set_read_only method Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 28/78] md: " Christoph Hellwig
                   ` (52 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Implement the ->set_read_only method instead of parsing the actual
ioctl command.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ilya Dryomov <idryomov@gmail.com>
---
 drivers/block/rbd.c | 40 ++++------------------------------------
 1 file changed, 4 insertions(+), 36 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index b7a194ffda55b4..2ed79b09439a82 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -692,12 +692,9 @@ static void rbd_release(struct gendisk *disk, fmode_t mode)
 	put_device(&rbd_dev->dev);
 }
 
-static int rbd_ioctl_set_ro(struct rbd_device *rbd_dev, unsigned long arg)
+static int rbd_set_read_only(struct block_device *bdev, bool ro)
 {
-	int ro;
-
-	if (get_user(ro, (int __user *)arg))
-		return -EFAULT;
+	struct rbd_device *rbd_dev = bdev->bd_disk->private_data;
 
 	/*
 	 * Both images mapped read-only and snapshots can't be marked
@@ -710,43 +707,14 @@ static int rbd_ioctl_set_ro(struct rbd_device *rbd_dev, unsigned long arg)
 		rbd_assert(!rbd_is_snap(rbd_dev));
 	}
 
-	/* Let blkdev_roset() handle it */
-	return -ENOTTY;
-}
-
-static int rbd_ioctl(struct block_device *bdev, fmode_t mode,
-			unsigned int cmd, unsigned long arg)
-{
-	struct rbd_device *rbd_dev = bdev->bd_disk->private_data;
-	int ret;
-
-	switch (cmd) {
-	case BLKROSET:
-		ret = rbd_ioctl_set_ro(rbd_dev, arg);
-		break;
-	default:
-		ret = -ENOTTY;
-	}
-
-	return ret;
-}
-
-#ifdef CONFIG_COMPAT
-static int rbd_compat_ioctl(struct block_device *bdev, fmode_t mode,
-				unsigned int cmd, unsigned long arg)
-{
-	return rbd_ioctl(bdev, mode, cmd, arg);
+	return 0;
 }
-#endif /* CONFIG_COMPAT */
 
 static const struct block_device_operations rbd_bd_ops = {
 	.owner			= THIS_MODULE,
 	.open			= rbd_open,
 	.release		= rbd_release,
-	.ioctl			= rbd_ioctl,
-#ifdef CONFIG_COMPAT
-	.compat_ioctl		= rbd_compat_ioctl,
-#endif
+	.set_read_only		= rbd_set_read_only,
 };
 
 /*
-- 
2.29.2


* [PATCH 28/78] md: implement ->set_read_only to hook into BLKROSET processing
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (26 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 27/78] rbd: implement ->set_read_only to hook into BLKROSET processing Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 17:37   ` Song Liu
  2020-11-16 14:57 ` [PATCH 29/78] dasd: " Christoph Hellwig
                   ` (51 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Implement the ->set_read_only method instead of parsing the actual
ioctl command.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/md.c | 62 ++++++++++++++++++++++++-------------------------
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 32e375d50fee17..fa31b71a72a35d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7477,7 +7477,6 @@ static inline bool md_ioctl_valid(unsigned int cmd)
 {
 	switch (cmd) {
 	case ADD_NEW_DISK:
-	case BLKROSET:
 	case GET_ARRAY_INFO:
 	case GET_BITMAP_FILE:
 	case GET_DISK_INFO:
@@ -7504,7 +7503,6 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 	int err = 0;
 	void __user *argp = (void __user *)arg;
 	struct mddev *mddev = NULL;
-	int ro;
 	bool did_set_md_closing = false;
 
 	if (!md_ioctl_valid(cmd))
@@ -7684,35 +7682,6 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 			goto unlock;
 		}
 		break;
-
-	case BLKROSET:
-		if (get_user(ro, (int __user *)(arg))) {
-			err = -EFAULT;
-			goto unlock;
-		}
-		err = -EINVAL;
-
-		/* if the bdev is going readonly the value of mddev->ro
-		 * does not matter, no writes are coming
-		 */
-		if (ro)
-			goto unlock;
-
-		/* are we are already prepared for writes? */
-		if (mddev->ro != 1)
-			goto unlock;
-
-		/* transitioning to readauto need only happen for
-		 * arrays that call md_write_start
-		 */
-		if (mddev->pers) {
-			err = restart_array(mddev);
-			if (err == 0) {
-				mddev->ro = 2;
-				set_disk_ro(mddev->gendisk, 0);
-			}
-		}
-		goto unlock;
 	}
 
 	/*
@@ -7806,6 +7775,36 @@ static int md_compat_ioctl(struct block_device *bdev, fmode_t mode,
 }
 #endif /* CONFIG_COMPAT */
 
+static int md_set_read_only(struct block_device *bdev, bool ro)
+{
+	struct mddev *mddev = bdev->bd_disk->private_data;
+	int err;
+
+	err = mddev_lock(mddev);
+	if (err)
+		return err;
+
+	if (!mddev->raid_disks && !mddev->external) {
+		err = -ENODEV;
+		goto out_unlock;
+	}
+
+	/*
+	 * Transitioning to read-auto need only happen for arrays that call
+	 * md_write_start and which are not ready for writes yet.
+	 */
+	if (!ro && mddev->ro == 1 && mddev->pers) {
+		err = restart_array(mddev);
+		if (err)
+			goto out_unlock;
+		mddev->ro = 2;
+	}
+
+out_unlock:
+	mddev_unlock(mddev);
+	return err;
+}
+
 static int md_open(struct block_device *bdev, fmode_t mode)
 {
 	/*
@@ -7883,6 +7882,7 @@ const struct block_device_operations md_fops =
 #endif
 	.getgeo		= md_getgeo,
 	.check_events	= md_check_events,
+	.set_read_only	= md_set_read_only,
 };
 
 static int md_thread(void *arg)
-- 
2.29.2
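
As an illustration of the state handling in md_set_read_only() above, here is
a small userspace sketch.  It is a simplified model, not the md code: locking
is omitted, restart_array() is replaced by a preset result field, and all
model_* names and the MODEL_MD_* constants are hypothetical labels for the
values 0, 1 and 2 of mddev->ro.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names for the values of mddev->ro used in the patch. */
enum model_md_ro {
	MODEL_MD_RW = 0,
	MODEL_MD_RDONLY = 1,
	MODEL_MD_READ_AUTO = 2,
};

struct model_mddev {
	int raid_disks;
	bool external;
	bool has_personality;	/* models mddev->pers != NULL */
	int ro;
	int restart_result;	/* what the modeled restart_array() returns */
};

/* Same decisions as md_set_read_only(), without mddev_lock(). */
static int model_md_set_read_only(struct model_mddev *mddev, bool ro)
{
	if (!mddev->raid_disks && !mddev->external)
		return -ENODEV;

	/* Only a read-only array that is being made writable is restarted. */
	if (!ro && mddev->ro == MODEL_MD_RDONLY && mddev->has_personality) {
		if (mddev->restart_result)
			return mddev->restart_result;
		mddev->ro = MODEL_MD_READ_AUTO;
	}
	return 0;
}
```

In this model a request to set the array read-only leaves the ro state
alone and returns 0, so the caller goes on to update the block layer flag.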


* [PATCH 29/78] dasd: implement ->set_read_only to hook into BLKROSET processing
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (27 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 28/78] md: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 30/78] block: don't call into the driver for BLKROSET Christoph Hellwig
                   ` (50 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Implement the ->set_read_only method instead of parsing the actual
ioctl command.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/s390/block/dasd.c       |  1 +
 drivers/s390/block/dasd_int.h   |  3 ++-
 drivers/s390/block/dasd_ioctl.c | 27 +++++++++------------------
 3 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index eb17fea8075c6f..db24e04ee9781e 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -3394,6 +3394,7 @@ dasd_device_operations = {
 	.ioctl		= dasd_ioctl,
 	.compat_ioctl	= dasd_ioctl,
 	.getgeo		= dasd_getgeo,
+	.set_read_only	= dasd_set_read_only,
 };
 
 /*******************************************************************************
diff --git a/drivers/s390/block/dasd_int.h b/drivers/s390/block/dasd_int.h
index fa552f9f166671..c59a0d63b506e6 100644
--- a/drivers/s390/block/dasd_int.h
+++ b/drivers/s390/block/dasd_int.h
@@ -844,7 +844,8 @@ int dasd_scan_partitions(struct dasd_block *);
 void dasd_destroy_partitions(struct dasd_block *);
 
 /* externals in dasd_ioctl.c */
-int  dasd_ioctl(struct block_device *, fmode_t, unsigned int, unsigned long);
+int dasd_ioctl(struct block_device *, fmode_t, unsigned int, unsigned long);
+int dasd_set_read_only(struct block_device *bdev, bool ro);
 
 /* externals in dasd_proc.c */
 int dasd_proc_init(void);
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index cb6427fb9f3d16..3359559517bfcf 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -532,28 +532,22 @@ static int dasd_ioctl_information(struct dasd_block *block, void __user *argp,
 /*
  * Set read only
  */
-static int
-dasd_ioctl_set_ro(struct block_device *bdev, void __user *argp)
+int dasd_set_read_only(struct block_device *bdev, bool ro)
 {
 	struct dasd_device *base;
-	int intval, rc;
+	int rc;
 
-	if (!capable(CAP_SYS_ADMIN))
-		return -EACCES;
+	/* do not manipulate hardware state for partitions */
 	if (bdev_is_partition(bdev))
-		// ro setting is not allowed for partitions
-		return -EINVAL;
-	if (get_user(intval, (int __user *)argp))
-		return -EFAULT;
+		return 0;
+
 	base = dasd_device_from_gendisk(bdev->bd_disk);
 	if (!base)
 		return -ENODEV;
-	if (!intval && test_bit(DASD_FLAG_DEVICE_RO, &base->flags)) {
-		dasd_put_device(base);
-		return -EROFS;
-	}
-	set_disk_ro(bdev->bd_disk, intval);
-	rc = dasd_set_feature(base->cdev, DASD_FEATURE_READONLY, intval);
+	if (!ro && test_bit(DASD_FLAG_DEVICE_RO, &base->flags))
+		rc = -EROFS;
+	else
+		rc = dasd_set_feature(base->cdev, DASD_FEATURE_READONLY, ro);
 	dasd_put_device(base);
 	return rc;
 }
@@ -633,9 +627,6 @@ int dasd_ioctl(struct block_device *bdev, fmode_t mode,
 	case BIODASDPRRST:
 		rc = dasd_ioctl_reset_profile(block);
 		break;
-	case BLKROSET:
-		rc = dasd_ioctl_set_ro(bdev, argp);
-		break;
 	case DASDAPIVER:
 		rc = dasd_ioctl_api_version(argp);
 		break;
-- 
2.29.2


* [PATCH 30/78] block: don't call into the driver for BLKROSET
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (28 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 29/78] dasd: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:19   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 31/78] loop: use set_disk_ro Christoph Hellwig
                   ` (49 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Now that all drivers that want to hook into setting or clearing the
read-only flag use the set_read_only method, this code can be removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/ioctl.c | 23 -----------------------
 1 file changed, 23 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index a6fa16b9770593..96cb4544736468 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -346,26 +346,6 @@ static int blkdev_pr_clear(struct block_device *bdev,
 	return ops->pr_clear(bdev, c.key);
 }
 
-/*
- * Is it an unrecognized ioctl? The correct returns are either
- * ENOTTY (final) or ENOIOCTLCMD ("I don't know this one, try a
- * fallback"). ENOIOCTLCMD gets turned into ENOTTY by the ioctl
- * code before returning.
- *
- * Confused drivers sometimes return EINVAL, which is wrong. It
- * means "I understood the ioctl command, but the parameters to
- * it were wrong".
- *
- * We should aim to just fix the broken drivers, the EINVAL case
- * should go away.
- */
-static inline int is_unrecognized_ioctl(int ret)
-{
-	return	ret == -EINVAL ||
-		ret == -ENOTTY ||
-		ret == -ENOIOCTLCMD;
-}
-
 static int blkdev_flushbuf(struct block_device *bdev, fmode_t mode,
 		unsigned cmd, unsigned long arg)
 {
@@ -384,9 +364,6 @@ static int blkdev_roset(struct block_device *bdev, fmode_t mode,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
-	if (!is_unrecognized_ioctl(ret))
-		return ret;
 	if (get_user(n, (int __user *)arg))
 		return -EFAULT;
 	if (bdev->bd_disk->fops->set_read_only) {
-- 
2.29.2


* [PATCH 31/78] loop: use set_disk_ro
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (29 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 30/78] block: don't call into the driver for BLKROSET Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:20   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 32/78] block: remove set_device_ro Christoph Hellwig
                   ` (48 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use set_disk_ro instead of set_device_ro to match all other block
drivers and to ensure all partitions mirror the read-only flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/loop.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 84a36c242e5550..41caf799df721f 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1134,7 +1134,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 	if (error)
 		goto out_unlock;
 
-	set_device_ro(bdev, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
+	set_disk_ro(lo->lo_disk, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
 
 	lo->use_dio = lo->lo_flags & LO_FLAGS_DIRECT_IO;
 	lo->lo_device = bdev;
-- 
2.29.2


* [PATCH 32/78] block: remove set_device_ro
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (30 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 31/78] loop: use set_disk_ro Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:20   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 33/78] block: remove __blkdev_driver_ioctl Christoph Hellwig
                   ` (47 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Fold set_device_ro into its only remaining caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/genhd.c         | 7 -------
 block/ioctl.c         | 2 +-
 include/linux/genhd.h | 1 -
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 8c350fecfe8bfe..b0f0b0cac9aa7f 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1843,13 +1843,6 @@ static void set_disk_ro_uevent(struct gendisk *gd, int ro)
 	kobject_uevent_env(&disk_to_dev(gd)->kobj, KOBJ_CHANGE, envp);
 }
 
-void set_device_ro(struct block_device *bdev, int flag)
-{
-	bdev->bd_part->policy = flag;
-}
-
-EXPORT_SYMBOL(set_device_ro);
-
 void set_disk_ro(struct gendisk *disk, int flag)
 {
 	struct disk_part_iter piter;
diff --git a/block/ioctl.c b/block/ioctl.c
index 96cb4544736468..04255dc5f3bff3 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -371,7 +371,7 @@ static int blkdev_roset(struct block_device *bdev, fmode_t mode,
 		if (ret)
 			return ret;
 	}
-	set_device_ro(bdev, n);
+	bdev->bd_part->policy = n;
 	return 0;
 }
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 4b22bfd9336e1a..8427ad8bef520d 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -304,7 +304,6 @@ extern void del_gendisk(struct gendisk *gp);
 extern struct gendisk *get_gendisk(dev_t dev, int *partno);
 extern struct block_device *bdget_disk(struct gendisk *disk, int partno);
 
-extern void set_device_ro(struct block_device *bdev, int flag);
 extern void set_disk_ro(struct gendisk *disk, int flag);
 
 static inline int get_disk_ro(struct gendisk *disk)
-- 
2.29.2


* [PATCH 33/78] block: remove __blkdev_driver_ioctl
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (31 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 32/78] block: remove set_device_ro Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:22   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 34/78] block: propagate BLKROSET to all partitions Christoph Hellwig
                   ` (46 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Just open code it in the few callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/ioctl.c               | 25 +++++--------------------
 drivers/block/pktcdvd.c     |  6 ++++--
 drivers/md/bcache/request.c |  5 +++--
 drivers/md/dm.c             |  5 ++++-
 include/linux/blkdev.h      |  2 --
 5 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 04255dc5f3bff3..6b785181344fe1 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -219,23 +219,6 @@ static int compat_put_ulong(compat_ulong_t __user *argp, compat_ulong_t val)
 }
 #endif
 
-int __blkdev_driver_ioctl(struct block_device *bdev, fmode_t mode,
-			unsigned cmd, unsigned long arg)
-{
-	struct gendisk *disk = bdev->bd_disk;
-
-	if (disk->fops->ioctl)
-		return disk->fops->ioctl(bdev, mode, cmd, arg);
-
-	return -ENOTTY;
-}
-/*
- * For the record: _GPL here is only because somebody decided to slap it
- * on the previous export.  Sheer idiocy, since it wasn't copyrightable
- * at all and could be open-coded without any exports by anybody who cares.
- */
-EXPORT_SYMBOL_GPL(__blkdev_driver_ioctl);
-
 #ifdef CONFIG_COMPAT
 /*
  * This is the equivalent of compat_ptr_ioctl(), to be used by block
@@ -594,10 +577,12 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
 	}
 
 	ret = blkdev_common_ioctl(bdev, mode, cmd, arg, argp);
-	if (ret == -ENOIOCTLCMD)
-		return __blkdev_driver_ioctl(bdev, mode, cmd, arg);
+	if (ret != -ENOIOCTLCMD)
+		return ret;
 
-	return ret;
+	if (!bdev->bd_disk->fops->ioctl)
+		return -ENOTTY;
+	return bdev->bd_disk->fops->ioctl(bdev, mode, cmd, arg);
 }
 EXPORT_SYMBOL_GPL(blkdev_ioctl); /* for /dev/raw */
 
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 4326401cede445..b8bb8ec7538d9b 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2583,9 +2583,11 @@ static int pkt_ioctl(struct block_device *bdev, fmode_t mode, unsigned int cmd,
 	case CDROM_LAST_WRITTEN:
 	case CDROM_SEND_PACKET:
 	case SCSI_IOCTL_SEND_COMMAND:
-		ret = __blkdev_driver_ioctl(pd->bdev, mode, cmd, arg);
+		if (!bdev->bd_disk->fops->ioctl)
+			ret = -ENOTTY;
+		else
+			ret = bdev->bd_disk->fops->ioctl(bdev, mode, cmd, arg);
 		break;
-
 	default:
 		pkt_dbg(2, pd, "Unknown ioctl (%x)\n", cmd);
 		ret = -ENOTTY;
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 21432638314562..afac8d07c1bd00 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -1230,8 +1230,9 @@ static int cached_dev_ioctl(struct bcache_device *d, fmode_t mode,
 
 	if (dc->io_disable)
 		return -EIO;
-
-	return __blkdev_driver_ioctl(dc->bdev, mode, cmd, arg);
+	if (!dc->bdev->bd_disk->fops->ioctl)
+		return -ENOTTY;
+	return dc->bdev->bd_disk->fops->ioctl(dc->bdev, mode, cmd, arg);
 }
 
 void bch_cached_dev_request_init(struct cached_dev *dc)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 62ad44925e73ec..54739f1b579bc8 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -570,7 +570,10 @@ static int dm_blk_ioctl(struct block_device *bdev, fmode_t mode,
 		}
 	}
 
-	r =  __blkdev_driver_ioctl(bdev, mode, cmd, arg);
+	if (!bdev->bd_disk->fops->ioctl)
+		r = -ENOTTY;
+	else
+		r = bdev->bd_disk->fops->ioctl(bdev, mode, cmd, arg);
 out:
 	dm_unprepare_ioctl(md, srcu_idx);
 	return r;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 5c1ba8a8d2bc7e..05b346a68c2eee 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1867,8 +1867,6 @@ extern int blkdev_compat_ptr_ioctl(struct block_device *, fmode_t,
 #define blkdev_compat_ptr_ioctl NULL
 #endif
 
-extern int __blkdev_driver_ioctl(struct block_device *, fmode_t, unsigned int,
-				 unsigned long);
 extern int bdev_read_page(struct block_device *, sector_t, struct page *);
 extern int bdev_write_page(struct block_device *, sector_t, struct page *,
 						struct writeback_control *);
-- 
2.29.2
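
The open-coded dispatch in blkdev_ioctl() above can be sketched in userspace
as follows.  This is a hedged model, not the kernel implementation: the
model_* functions and MODEL_CMD_* command numbers are hypothetical, and
MODEL_ENOIOCTLCMD stands in for the kernel-internal ENOIOCTLCMD value, which
userspace errno.h does not define.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for the kernel-internal ENOIOCTLCMD (not in userspace errno.h). */
#define MODEL_ENOIOCTLCMD	515

/* Hypothetical command numbers for the model. */
#define MODEL_CMD_COMMON	1u
#define MODEL_CMD_DRIVER	2u

struct model_disk_ops {
	int (*ioctl)(unsigned int cmd, unsigned long arg);	/* optional */
};

/* Models blkdev_common_ioctl(): handles core commands only. */
static int model_common_ioctl(unsigned int cmd, unsigned long arg)
{
	(void)arg;
	if (cmd == MODEL_CMD_COMMON)
		return 0;
	return -MODEL_ENOIOCTLCMD;	/* "not mine, try the driver" */
}

/* Models the tail of blkdev_ioctl() after this patch. */
static int model_blkdev_ioctl(const struct model_disk_ops *ops,
			      unsigned int cmd, unsigned long arg)
{
	int ret = model_common_ioctl(cmd, arg);

	if (ret != -MODEL_ENOIOCTLCMD)
		return ret;

	if (!ops->ioctl)
		return -ENOTTY;
	return ops->ioctl(cmd, arg);
}

/* Example driver handler that knows a single private command. */
static int model_driver_ioctl(unsigned int cmd, unsigned long arg)
{
	(void)arg;
	return cmd == MODEL_CMD_DRIVER ? 42 : -ENOTTY;
}
```

The NULL check followed by the call is all that the removed helper did,
which is why each caller can carry its own copy.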


* [PATCH 34/78] block: propagate BLKROSET to all partitions
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (32 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 33/78] block: remove __blkdev_driver_ioctl Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:23   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 35/78] block: cleanup del_gendisk a bit Christoph Hellwig
                   ` (45 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

When setting the whole device read-only (or clearing the read-only
state), also update the policy for all partitions.  The s390 dasd
driver has always been doing this and it makes a lot of sense.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/ioctl.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 6b785181344fe1..22f394d118c302 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -354,7 +354,10 @@ static int blkdev_roset(struct block_device *bdev, fmode_t mode,
 		if (ret)
 			return ret;
 	}
-	bdev->bd_part->policy = n;
+	if (bdev_is_partition(bdev))
+		bdev->bd_part->policy = n;
+	else
+		set_disk_ro(bdev->bd_disk, n);
 	return 0;
 }
 
-- 
2.29.2
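
The propagation rule introduced above can be sketched in userspace like this.
It is a simplified model under stated assumptions: the disk is a fixed array
of policy flags with index 0 standing for the whole device, and all model_*
names and MODEL_MAX_PARTS are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PARTS	4	/* hypothetical partition limit */

struct model_disk {
	/* Index 0 models the whole device, 1..N the partitions. */
	bool policy[MODEL_MAX_PARTS + 1];
};

struct model_bdev {
	struct model_disk *disk;
	int partno;		/* 0 for the whole device */
};

/* Models set_disk_ro(): the flag is applied to every entry of the disk. */
static void model_set_disk_ro(struct model_disk *disk, bool ro)
{
	for (size_t i = 0; i <= MODEL_MAX_PARTS; i++)
		disk->policy[i] = ro;
}

/* Models the final step of blkdev_roset() after this patch. */
static void model_roset_apply(struct model_bdev *bdev, bool ro)
{
	if (bdev->partno != 0)
		bdev->disk->policy[bdev->partno] = ro;	/* this partition only */
	else
		model_set_disk_ro(bdev->disk, ro);	/* whole device and partitions */
}
```

A partition can still be toggled on its own afterwards, so the whole-device
setting acts as a bulk update rather than a lock on the partitions.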


* [PATCH 35/78] block: cleanup del_gendisk a bit
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (33 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 34/78] block: propagate BLKROSET to all partitions Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 36/78] block: open code kobj_map into block/genhd.c Christoph Hellwig
                   ` (44 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Merge three hidden gendisk checks into one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 block/genhd.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index b0f0b0cac9aa7f..8180195b76634b 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -892,6 +892,9 @@ void del_gendisk(struct gendisk *disk)
 
 	might_sleep();
 
+	if (WARN_ON_ONCE(!disk->queue))
+		return;
+
 	blk_integrity_del(disk);
 	disk_del_events(disk);
 
@@ -914,20 +917,18 @@ void del_gendisk(struct gendisk *disk)
 	disk->flags &= ~GENHD_FL_UP;
 	up_write(&disk->lookup_sem);
 
-	if (!(disk->flags & GENHD_FL_HIDDEN))
+	if (!(disk->flags & GENHD_FL_HIDDEN)) {
 		sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi");
-	if (disk->queue) {
+
 		/*
 		 * Unregister bdi before releasing device numbers (as they can
 		 * get reused and we'd get clashes in sysfs).
 		 */
-		if (!(disk->flags & GENHD_FL_HIDDEN))
-			bdi_unregister(disk->queue->backing_dev_info);
-		blk_unregister_queue(disk);
-	} else {
-		WARN_ON(1);
+		bdi_unregister(disk->queue->backing_dev_info);
 	}
 
+	blk_unregister_queue(disk);
+	
 	if (!(disk->flags & GENHD_FL_HIDDEN))
 		blk_unregister_region(disk_devt(disk), disk->minors);
 	/*
-- 
2.29.2


* [PATCH 36/78] block: open code kobj_map into block/genhd.c
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (34 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 35/78] block: cleanup del_gendisk a bit Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 37/78] block: split block_class_lock Christoph Hellwig
                   ` (43 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Copy and paste the kobj_map functionality into the block code in preparation
for completely rewriting it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/genhd.c | 130 +++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 117 insertions(+), 13 deletions(-)
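
Note (not part of the commit): a minimal userspace sketch of the data
structure being open coded here, assuming each region stays within one
major.  Regions hang off a bucket chosen by major % 255, and each bucket is
kept sorted by ascending range so that the first covering entry is the
narrowest one.  All demo_* names are hypothetical, and locking, module
references and the probe/lock callbacks are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MINORBITS	20
#define DEMO_BUCKETS	255
#define DEMO_MAJOR(dev)		((unsigned int)((dev) >> DEMO_MINORBITS))
#define DEMO_MKDEV(ma, mi)	(((uint32_t)(ma) << DEMO_MINORBITS) | (mi))

struct demo_region {
	struct demo_region *next;
	uint32_t dev;		/* first device number covered */
	unsigned long range;	/* count of consecutive device numbers */
	int tag;		/* stands in for the probe/lock/data triple */
};

struct demo_region *demo_map[DEMO_BUCKETS];

/* Insert a region, keeping its bucket sorted by ascending range. */
int demo_register_region(uint32_t dev, unsigned long range, int tag)
{
	struct demo_region **s = &demo_map[DEMO_MAJOR(dev) % DEMO_BUCKETS];
	struct demo_region *p;

	assert(range > 0);
	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->dev = dev;
	p->range = range;
	p->tag = tag;

	while (*s && (*s)->range < range)
		s = &(*s)->next;
	p->next = *s;
	*s = p;
	return 0;
}

/* Return the tag of the narrowest region covering dev, or -1 if none. */
int demo_lookup(uint32_t dev)
{
	struct demo_region *p;

	for (p = demo_map[DEMO_MAJOR(dev) % DEMO_BUCKETS]; p; p = p->next) {
		if (p->dev > dev || p->dev + p->range - 1 < dev)
			continue;
		return p->tag;	/* first hit has the smallest range */
	}
	return -1;
}
```

With a 256-minor region and a 16-minor region registered on the same major,
a lookup inside the 16-minor window resolves to the smaller region, and any
other minor of that major falls through to the larger one.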

diff --git a/block/genhd.c b/block/genhd.c
index 8180195b76634b..482f7b89802010 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -17,7 +17,6 @@
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/kmod.h>
-#include <linux/kobj_map.h>
 #include <linux/mutex.h>
 #include <linux/idr.h>
 #include <linux/log2.h>
@@ -29,6 +28,16 @@
 static DEFINE_MUTEX(block_class_lock);
 static struct kobject *block_depr;
 
+struct bdev_map {
+	struct bdev_map *next;
+	dev_t dev;
+	unsigned long range;
+	struct module *owner;
+	struct kobject *(*probe)(dev_t, int *, void *);
+	int (*lock)(dev_t, void *);
+	void *data;
+} *bdev_map[255];
+
 /* for extended dynamic devt allocation, currently only one major is used */
 #define NR_EXT_DEVT		(1 << MINORBITS)
 
@@ -517,8 +526,6 @@ void unregister_blkdev(unsigned int major, const char *name)
 
 EXPORT_SYMBOL(unregister_blkdev);
 
-static struct kobj_map *bdev_map;
-
 /**
  * blk_mangle_minor - scatter minor numbers apart
  * @minor: minor number to mangle
@@ -645,16 +652,60 @@ void blk_register_region(dev_t devt, unsigned long range, struct module *module,
 			 struct kobject *(*probe)(dev_t, int *, void *),
 			 int (*lock)(dev_t, void *), void *data)
 {
-	kobj_map(bdev_map, devt, range, module, probe, lock, data);
-}
+	unsigned n = MAJOR(devt + range - 1) - MAJOR(devt) + 1;
+	unsigned index = MAJOR(devt);
+	unsigned i;
+	struct bdev_map *p;
+
+	n = min(n, 255u);
+	p = kmalloc_array(n, sizeof(struct bdev_map), GFP_KERNEL);
+	if (p == NULL)
+		return;
 
+	for (i = 0; i < n; i++, p++) {
+		p->owner = module;
+		p->probe = probe;
+		p->lock = lock;
+		p->dev = devt;
+		p->range = range;
+		p->data = data;
+	}
+
+	mutex_lock(&block_class_lock);
+	for (i = 0, p -= n; i < n; i++, p++, index++) {
+		struct bdev_map **s = &bdev_map[index % 255];
+		while (*s && (*s)->range < range)
+			s = &(*s)->next;
+		p->next = *s;
+		*s = p;
+	}
+	mutex_unlock(&block_class_lock);
+}
 EXPORT_SYMBOL(blk_register_region);
 
 void blk_unregister_region(dev_t devt, unsigned long range)
 {
-	kobj_unmap(bdev_map, devt, range);
-}
+	unsigned n = MAJOR(devt + range - 1) - MAJOR(devt) + 1;
+	unsigned index = MAJOR(devt);
+	unsigned i;
+	struct bdev_map *found = NULL;
 
+	mutex_lock(&block_class_lock);
+	for (i = 0; i < min(n, 255u); i++, index++) {
+		struct bdev_map **s;
+		for (s = &bdev_map[index % 255]; *s; s = &(*s)->next) {
+			struct bdev_map *p = *s;
+			if (p->dev == devt && p->range == range) {
+				*s = p->next;
+				if (!found)
+					found = p;
+				break;
+			}
+		}
+	}
+	mutex_unlock(&block_class_lock);
+	kfree(found);
+}
 EXPORT_SYMBOL(blk_unregister_region);
 
 static struct kobject *exact_match(dev_t devt, int *partno, void *data)
@@ -976,6 +1027,47 @@ static ssize_t disk_badblocks_store(struct device *dev,
 	return badblocks_store(disk->bb, page, len, 0);
 }
 
+static struct gendisk *lookup_gendisk(dev_t dev, int *partno)
+{
+	struct kobject *kobj;
+	struct bdev_map *p;
+	unsigned long best = ~0UL;
+
+retry:
+	mutex_lock(&block_class_lock);
+	for (p = bdev_map[MAJOR(dev) % 255]; p; p = p->next) {
+		struct kobject *(*probe)(dev_t, int *, void *);
+		struct module *owner;
+		void *data;
+
+		if (p->dev > dev || p->dev + p->range - 1 < dev)
+			continue;
+		if (p->range - 1 >= best)
+			break;
+		if (!try_module_get(p->owner))
+			continue;
+		owner = p->owner;
+		data = p->data;
+		probe = p->probe;
+		best = p->range - 1;
+		*partno = dev - p->dev;
+		if (p->lock && p->lock(dev, data) < 0) {
+			module_put(owner);
+			continue;
+		}
+		mutex_unlock(&block_class_lock);
+		kobj = probe(dev, partno, data);
+		/* Currently ->owner protects _only_ ->probe() itself. */
+		module_put(owner);
+		if (kobj)
+			return dev_to_disk(kobj_to_dev(kobj));
+		goto retry;
+	}
+	mutex_unlock(&block_class_lock);
+	return NULL;
+}
+
+
 /**
  * get_gendisk - get partitioning information for a given device
  * @devt: device to get partitioning information for
@@ -993,11 +1085,7 @@ struct gendisk *get_gendisk(dev_t devt, int *partno)
 	might_sleep();
 
 	if (MAJOR(devt) != BLOCK_EXT_MAJOR) {
-		struct kobject *kobj;
-
-		kobj = kobj_lookup(bdev_map, devt, partno);
-		if (kobj)
-			disk = dev_to_disk(kobj_to_dev(kobj));
+		disk = lookup_gendisk(devt, partno);
 	} else {
 		struct hd_struct *part;
 
@@ -1210,6 +1298,22 @@ static struct kobject *base_probe(dev_t devt, int *partno, void *data)
 	return NULL;
 }
 
+static void bdev_map_init(void)
+{
+	struct bdev_map *base;
+	int i;
+
+	base = kzalloc(sizeof(*base), GFP_KERNEL);
+	if (!base)
+		panic("cannot allocate bdev_map");
+
+	base->dev = 1;
+	base->range = ~0 ;
+	base->probe = base_probe;
+	for (i = 0; i < 255; i++)
+		bdev_map[i] = base;
+}
+
 static int __init genhd_device_init(void)
 {
 	int error;
@@ -1218,7 +1322,7 @@ static int __init genhd_device_init(void)
 	error = class_register(&block_class);
 	if (unlikely(error))
 		return error;
-	bdev_map = kobj_map_init(base_probe, &block_class_lock);
+	bdev_map_init();
 	blk_dev_init();
 
 	register_blkdev(BLOCK_EXT_MAJOR, "blkext");
-- 
2.29.2


^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 37/78] block: split block_class_lock
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (35 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 36/78] block: open code kobj_map in block/genhd.c Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 38/78] block: rework requesting modules for unclaimed devices Christoph Hellwig
                   ` (42 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Split the block_class_lock mutex into one each to protect bdev_map
and major_names.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 block/genhd.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 482f7b89802010..2a20372756625e 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -25,7 +25,6 @@
 
 #include "blk.h"
 
-static DEFINE_MUTEX(block_class_lock);
 static struct kobject *block_depr;
 
 struct bdev_map {
@@ -37,6 +36,7 @@ struct bdev_map {
 	int (*lock)(dev_t, void *);
 	void *data;
 } *bdev_map[255];
+static DEFINE_MUTEX(bdev_map_lock);
 
 /* for extended dynamic devt allocation, currently only one major is used */
 #define NR_EXT_DEVT		(1 << MINORBITS)
@@ -400,6 +400,7 @@ static struct blk_major_name {
 	int major;
 	char name[16];
 } *major_names[BLKDEV_MAJOR_HASH_SIZE];
+static DEFINE_MUTEX(major_names_lock);
 
 /* index in the above - for now: assume no multimajor ranges */
 static inline int major_to_index(unsigned major)
@@ -412,11 +413,11 @@ void blkdev_show(struct seq_file *seqf, off_t offset)
 {
 	struct blk_major_name *dp;
 
-	mutex_lock(&block_class_lock);
+	mutex_lock(&major_names_lock);
 	for (dp = major_names[major_to_index(offset)]; dp; dp = dp->next)
 		if (dp->major == offset)
 			seq_printf(seqf, "%3d %s\n", dp->major, dp->name);
-	mutex_unlock(&block_class_lock);
+	mutex_unlock(&major_names_lock);
 }
 #endif /* CONFIG_PROC_FS */
 
@@ -445,7 +446,7 @@ int register_blkdev(unsigned int major, const char *name)
 	struct blk_major_name **n, *p;
 	int index, ret = 0;
 
-	mutex_lock(&block_class_lock);
+	mutex_lock(&major_names_lock);
 
 	/* temporary */
 	if (major == 0) {
@@ -498,7 +499,7 @@ int register_blkdev(unsigned int major, const char *name)
 		kfree(p);
 	}
 out:
-	mutex_unlock(&block_class_lock);
+	mutex_unlock(&major_names_lock);
 	return ret;
 }
 
@@ -510,7 +511,7 @@ void unregister_blkdev(unsigned int major, const char *name)
 	struct blk_major_name *p = NULL;
 	int index = major_to_index(major);
 
-	mutex_lock(&block_class_lock);
+	mutex_lock(&major_names_lock);
 	for (n = &major_names[index]; *n; n = &(*n)->next)
 		if ((*n)->major == major)
 			break;
@@ -520,7 +521,7 @@ void unregister_blkdev(unsigned int major, const char *name)
 		p = *n;
 		*n = p->next;
 	}
-	mutex_unlock(&block_class_lock);
+	mutex_unlock(&major_names_lock);
 	kfree(p);
 }
 
@@ -671,7 +672,7 @@ void blk_register_region(dev_t devt, unsigned long range, struct module *module,
 		p->data = data;
 	}
 
-	mutex_lock(&block_class_lock);
+	mutex_lock(&bdev_map_lock);
 	for (i = 0, p -= n; i < n; i++, p++, index++) {
 		struct bdev_map **s = &bdev_map[index % 255];
 		while (*s && (*s)->range < range)
@@ -679,7 +680,7 @@ void blk_register_region(dev_t devt, unsigned long range, struct module *module,
 		p->next = *s;
 		*s = p;
 	}
-	mutex_unlock(&block_class_lock);
+	mutex_unlock(&bdev_map_lock);
 }
 EXPORT_SYMBOL(blk_register_region);
 
@@ -690,7 +691,7 @@ void blk_unregister_region(dev_t devt, unsigned long range)
 	unsigned i;
 	struct bdev_map *found = NULL;
 
-	mutex_lock(&block_class_lock);
+	mutex_lock(&bdev_map_lock);
 	for (i = 0; i < min(n, 255u); i++, index++) {
 		struct bdev_map **s;
 		for (s = &bdev_map[index % 255]; *s; s = &(*s)->next) {
@@ -703,7 +704,7 @@ void blk_unregister_region(dev_t devt, unsigned long range)
 			}
 		}
 	}
-	mutex_unlock(&block_class_lock);
+	mutex_unlock(&bdev_map_lock);
 	kfree(found);
 }
 EXPORT_SYMBOL(blk_unregister_region);
@@ -1034,7 +1035,7 @@ static struct gendisk *lookup_gendisk(dev_t dev, int *partno)
 	unsigned long best = ~0UL;
 
 retry:
-	mutex_lock(&block_class_lock);
+	mutex_lock(&bdev_map_lock);
 	for (p = bdev_map[MAJOR(dev) % 255]; p; p = p->next) {
 		struct kobject *(*probe)(dev_t, int *, void *);
 		struct module *owner;
@@ -1055,7 +1056,7 @@ static struct gendisk *lookup_gendisk(dev_t dev, int *partno)
 			module_put(owner);
 			continue;
 		}
-		mutex_unlock(&block_class_lock);
+		mutex_unlock(&bdev_map_lock);
 		kobj = probe(dev, partno, data);
 		/* Currently ->owner protects _only_ ->probe() itself. */
 		module_put(owner);
@@ -1063,7 +1064,7 @@ static struct gendisk *lookup_gendisk(dev_t dev, int *partno)
 			return dev_to_disk(kobj_to_dev(kobj));
 		goto retry;
 	}
-	mutex_unlock(&block_class_lock);
+	mutex_unlock(&bdev_map_lock);
 	return NULL;
 }
 
-- 
2.29.2


^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 38/78] block: rework requesting modules for unclaimed devices
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (36 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 37/78] block: split block_class_lock Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 39/78] block: add an optional probe callback to major_names Christoph Hellwig
                   ` (41 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Instead of reusing the ranges in bdev_map, add a new helper that is
called if no range was found.  This is a first step to peel away and
eventually remove the complex ranges structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 block/genhd.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 2a20372756625e..8391e7d83a6920 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1028,6 +1028,13 @@ static ssize_t disk_badblocks_store(struct device *dev,
 	return badblocks_store(disk->bb, page, len, 0);
 }
 
+static void request_gendisk_module(dev_t devt)
+{
+	if (request_module("block-major-%d-%d", MAJOR(devt), MINOR(devt)) > 0)
+		/* Make old-style 2.4 aliases work */
+		request_module("block-major-%d", MAJOR(devt));
+}
+
 static struct gendisk *lookup_gendisk(dev_t dev, int *partno)
 {
 	struct kobject *kobj;
@@ -1052,6 +1059,14 @@ static struct gendisk *lookup_gendisk(dev_t dev, int *partno)
 		probe = p->probe;
 		best = p->range - 1;
 		*partno = dev - p->dev;
+
+		if (!probe) {
+			mutex_unlock(&bdev_map_lock);
+			module_put(owner);
+			request_gendisk_module(dev);
+			goto retry;
+		}
+
 		if (p->lock && p->lock(dev, data) < 0) {
 			module_put(owner);
 			continue;
@@ -1290,15 +1305,6 @@ static const struct seq_operations partitions_op = {
 };
 #endif
 
-
-static struct kobject *base_probe(dev_t devt, int *partno, void *data)
-{
-	if (request_module("block-major-%d-%d", MAJOR(devt), MINOR(devt)) > 0)
-		/* Make old-style 2.4 aliases work */
-		request_module("block-major-%d", MAJOR(devt));
-	return NULL;
-}
-
 static void bdev_map_init(void)
 {
 	struct bdev_map *base;
@@ -1310,7 +1316,6 @@ static void bdev_map_init(void)
 
 	base->dev = 1;
 	base->range = ~0 ;
-	base->probe = base_probe;
 	for (i = 0; i < 255; i++)
 		bdev_map[i] = base;
 }
-- 
2.29.2


^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 39/78] block: add an optional probe callback to major_names
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (37 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 38/78] block: rework requesting modules for unclaimed devices Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 40/78] ide: remove ide_{,un}register_region Christoph Hellwig
                   ` (40 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Add a callback to the major_names array that allows a driver to override
how to probe for a dev_t that doesn't currently have a gendisk registered.
This will help separate the lookup of the gendisk by dev_t from the probe
action for a dev_t that is not currently registered.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 block/genhd.c         | 21 ++++++++++++++++++---
 include/linux/genhd.h |  5 ++++-
 2 files changed, 22 insertions(+), 4 deletions(-)
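
Note (not part of the commit): a minimal userspace sketch of the dispatch
this patch adds, assuming a plain counter in place of request_module().  A
major registered with a callback has that callback run for an unclaimed
device number; a major registered without one falls back to the module
request.  All demo_* names are hypothetical, and locking and dynamic major
allocation are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_HASH_SIZE 255

typedef void (*demo_probe_fn)(unsigned int major, unsigned int minor);

struct demo_major_name {
	struct demo_major_name *next;
	unsigned int major;
	demo_probe_fn probe;	/* optional; NULL means "use the fallback" */
};

struct demo_major_name *demo_major_names[DEMO_HASH_SIZE];
int demo_probe_calls;		/* times a driver callback ran */
int demo_modprobe_calls;	/* times the request_module() stand-in ran */

/* Register a major with an optional probe callback; -1 if taken or OOM. */
int demo_register_major(unsigned int major, demo_probe_fn probe)
{
	struct demo_major_name **n = &demo_major_names[major % DEMO_HASH_SIZE];
	struct demo_major_name *p;

	assert(major > 0);	/* dynamic allocation (major 0) is not modeled */
	for (; *n; n = &(*n)->next)
		if ((*n)->major == major)
			return -1;

	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->major = major;
	p->probe = probe;
	p->next = NULL;
	*n = p;
	return 0;
}

/* Callers that need no callback use the shorter form. */
#define demo_register_plain(major) demo_register_major(major, NULL)

/* Example driver callback: claims the probe and only counts the call. */
void demo_counting_probe(unsigned int major, unsigned int minor)
{
	(void)major;
	(void)minor;
	demo_probe_calls++;
}

/* Called when no disk is registered for (major, minor). */
void demo_request_disk(unsigned int major, unsigned int minor)
{
	struct demo_major_name *p;

	for (p = demo_major_names[major % DEMO_HASH_SIZE]; p; p = p->next) {
		if (p->major == major && p->probe) {
			p->probe(major, minor);
			return;
		}
	}
	demo_modprobe_calls++;	/* stand-in for the module request */
}
```

A callback that does nothing at all is still useful in this scheme: its mere
presence suppresses the fallback module request for that major.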

diff --git a/block/genhd.c b/block/genhd.c
index 8391e7d83a6920..dc8690bc281c16 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -399,6 +399,7 @@ static struct blk_major_name {
 	struct blk_major_name *next;
 	int major;
 	char name[16];
+	void (*probe)(dev_t devt);
 } *major_names[BLKDEV_MAJOR_HASH_SIZE];
 static DEFINE_MUTEX(major_names_lock);
 
@@ -441,7 +442,8 @@ void blkdev_show(struct seq_file *seqf, off_t offset)
  * See Documentation/admin-guide/devices.txt for the list of allocated
  * major numbers.
  */
-int register_blkdev(unsigned int major, const char *name)
+int __register_blkdev(unsigned int major, const char *name,
+		void (*probe)(dev_t devt))
 {
 	struct blk_major_name **n, *p;
 	int index, ret = 0;
@@ -480,6 +482,7 @@ int register_blkdev(unsigned int major, const char *name)
 	}
 
 	p->major = major;
+	p->probe = probe;
 	strlcpy(p->name, name, sizeof(p->name));
 	p->next = NULL;
 	index = major_to_index(major);
@@ -502,8 +505,7 @@ int register_blkdev(unsigned int major, const char *name)
 	mutex_unlock(&major_names_lock);
 	return ret;
 }
-
-EXPORT_SYMBOL(register_blkdev);
+EXPORT_SYMBOL(__register_blkdev);
 
 void unregister_blkdev(unsigned int major, const char *name)
 {
@@ -1030,6 +1032,19 @@ static ssize_t disk_badblocks_store(struct device *dev,
 
 static void request_gendisk_module(dev_t devt)
 {
+	unsigned int major = MAJOR(devt);
+	struct blk_major_name **n;
+
+	mutex_lock(&major_names_lock);
+	for (n = &major_names[major_to_index(major)]; *n; n = &(*n)->next) {
+		if ((*n)->major == major && (*n)->probe) {
+			(*n)->probe(devt);
+			mutex_unlock(&major_names_lock);
+			return;
+		}
+	}
+	mutex_unlock(&major_names_lock);
+
 	if (request_module("block-major-%d-%d", MAJOR(devt), MINOR(devt)) > 0)
 		/* Make old-style 2.4 aliases work */
 		request_module("block-major-%d", MAJOR(devt));
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 8427ad8bef520d..04f6a6bf577a90 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -366,7 +366,10 @@ extern void blk_unregister_region(dev_t devt, unsigned long range);
 
 #define alloc_disk(minors) alloc_disk_node(minors, NUMA_NO_NODE)
 
-int register_blkdev(unsigned int major, const char *name);
+int __register_blkdev(unsigned int major, const char *name,
+		void (*probe)(dev_t devt));
+#define register_blkdev(major, name) \
+	__register_blkdev(major, name, NULL)
 void unregister_blkdev(unsigned int major, const char *name);
 
 void revalidate_disk_size(struct gendisk *disk, bool verbose);
-- 
2.29.2


^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 40/78] ide: remove ide_{,un}register_region
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (38 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 39/78] block: add an optional probe callback to major_names Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 41/78] swim: don't call blk_register_region Christoph Hellwig
                   ` (39 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

There is no need to ever register the fake gendisk used for ide-tape.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ide/ide-probe.c | 32 --------------------------------
 drivers/ide/ide-tape.c  |  2 --
 include/linux/ide.h     |  3 ---
 3 files changed, 37 deletions(-)

diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c
index 1ddc45a04418cd..076d34b381720f 100644
--- a/drivers/ide/ide-probe.c
+++ b/drivers/ide/ide-probe.c
@@ -929,38 +929,6 @@ static struct kobject *ata_probe(dev_t dev, int *part, void *data)
 	return NULL;
 }
 
-static struct kobject *exact_match(dev_t dev, int *part, void *data)
-{
-	struct gendisk *p = data;
-	*part &= (1 << PARTN_BITS) - 1;
-	return &disk_to_dev(p)->kobj;
-}
-
-static int exact_lock(dev_t dev, void *data)
-{
-	struct gendisk *p = data;
-
-	if (!get_disk_and_module(p))
-		return -1;
-	return 0;
-}
-
-void ide_register_region(struct gendisk *disk)
-{
-	blk_register_region(MKDEV(disk->major, disk->first_minor),
-			    disk->minors, NULL, exact_match, exact_lock, disk);
-}
-
-EXPORT_SYMBOL_GPL(ide_register_region);
-
-void ide_unregister_region(struct gendisk *disk)
-{
-	blk_unregister_region(MKDEV(disk->major, disk->first_minor),
-			      disk->minors);
-}
-
-EXPORT_SYMBOL_GPL(ide_unregister_region);
-
 void ide_init_disk(struct gendisk *disk, ide_drive_t *drive)
 {
 	ide_hwif_t *hwif = drive->hwif;
diff --git a/drivers/ide/ide-tape.c b/drivers/ide/ide-tape.c
index 6f26634b22bbec..88b96437b22e62 100644
--- a/drivers/ide/ide-tape.c
+++ b/drivers/ide/ide-tape.c
@@ -1822,7 +1822,6 @@ static void ide_tape_remove(ide_drive_t *drive)
 
 	ide_proc_unregister_driver(drive, tape->driver);
 	device_del(&tape->dev);
-	ide_unregister_region(tape->disk);
 
 	mutex_lock(&idetape_ref_mutex);
 	put_device(&tape->dev);
@@ -2026,7 +2025,6 @@ static int ide_tape_probe(ide_drive_t *drive)
 		      "n%s", tape->name);
 
 	g->fops = &idetape_block_ops;
-	ide_register_region(g);
 
 	return 0;
 
diff --git a/include/linux/ide.h b/include/linux/ide.h
index 62653769509f89..2c300689a51a5c 100644
--- a/include/linux/ide.h
+++ b/include/linux/ide.h
@@ -1493,9 +1493,6 @@ static inline void ide_acpi_port_init_devices(ide_hwif_t *hwif) { ; }
 static inline void ide_acpi_set_state(ide_hwif_t *hwif, int on) {}
 #endif
 
-void ide_register_region(struct gendisk *);
-void ide_unregister_region(struct gendisk *);
-
 void ide_check_nien_quirk_list(ide_drive_t *);
 void ide_undecoded_slave(ide_drive_t *);
 
-- 
2.29.2


^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 41/78] swim: don't call blk_register_region
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (39 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 40/78] ide: remove ide_{,un}register_region Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 42/78] sd: use __register_blkdev to avoid a modprobe for an unregistered dev_t Christoph Hellwig
                   ` (38 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

The swim driver (unlike various other floppy drivers) doesn't have
magic device nodes for certain modes, and already registers a gendisk
for each of the floppies supported by a device.  Thus the region
registered is a no-op and can be removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/block/swim.c | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/drivers/block/swim.c b/drivers/block/swim.c
index 52dd1efa00f9c5..cc6a0bc6c005a7 100644
--- a/drivers/block/swim.c
+++ b/drivers/block/swim.c
@@ -745,18 +745,6 @@ static const struct block_device_operations floppy_fops = {
 	.check_events	 = floppy_check_events,
 };
 
-static struct kobject *floppy_find(dev_t dev, int *part, void *data)
-{
-	struct swim_priv *swd = data;
-	int drive = (*part & 3);
-
-	if (drive >= swd->floppy_count)
-		return NULL;
-
-	*part = 0;
-	return get_disk_and_module(swd->unit[drive].disk);
-}
-
 static int swim_add_floppy(struct swim_priv *swd, enum drive_location location)
 {
 	struct floppy_state *fs = &swd->unit[swd->floppy_count];
@@ -846,9 +834,6 @@ static int swim_floppy_init(struct swim_priv *swd)
 		add_disk(swd->unit[drive].disk);
 	}
 
-	blk_register_region(MKDEV(FLOPPY_MAJOR, 0), 256, THIS_MODULE,
-			    floppy_find, NULL, swd);
-
 	return 0;
 
 exit_put_disks:
@@ -932,8 +917,6 @@ static int swim_remove(struct platform_device *dev)
 	int drive;
 	struct resource *res;
 
-	blk_unregister_region(MKDEV(FLOPPY_MAJOR, 0), 256);
-
 	for (drive = 0; drive < swd->floppy_count; drive++) {
 		del_gendisk(swd->unit[drive].disk);
 		blk_cleanup_queue(swd->unit[drive].disk->queue);
-- 
2.29.2


^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 42/78] sd: use __register_blkdev to avoid a modprobe for an unregistered dev_t
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (40 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 41/78] swim: don't call blk_register_region Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 43/78] brd: use __register_blkdev to allocate devices on demand Christoph Hellwig
                   ` (37 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Switch from using blk_register_region to the probe callback passed to
__register_blkdev to disable the request_module call for an unclaimed
dev_t in the SD majors.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/sd.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index a2a4f385833d6c..679c2c02504763 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -630,13 +630,11 @@ static struct scsi_driver sd_template = {
 };
 
 /*
- * Dummy kobj_map->probe function.
- * The default ->probe function will call modprobe, which is
- * pointless as this module is already loaded.
+ * Don't request a new module, as that could deadlock in a multipath
+ * environment.
  */
-static struct kobject *sd_default_probe(dev_t devt, int *partno, void *data)
+static void sd_default_probe(dev_t devt)
 {
-	return NULL;
 }
 
 /*
@@ -3525,9 +3523,6 @@ static int sd_remove(struct device *dev)
 
 	free_opal_dev(sdkp->opal_dev);
 
-	blk_register_region(devt, SD_MINORS, NULL,
-			    sd_default_probe, NULL, NULL);
-
 	mutex_lock(&sd_ref_mutex);
 	dev_set_drvdata(dev, NULL);
 	put_device(&sdkp->dev);
@@ -3717,11 +3712,9 @@ static int __init init_sd(void)
 	SCSI_LOG_HLQUEUE(3, printk("init_sd: sd driver entry point\n"));
 
 	for (i = 0; i < SD_MAJORS; i++) {
-		if (register_blkdev(sd_major(i), "sd") != 0)
+		if (__register_blkdev(sd_major(i), "sd", sd_default_probe))
 			continue;
 		majors++;
-		blk_register_region(sd_major(i), SD_MINORS, NULL,
-				    sd_default_probe, NULL, NULL);
 	}
 
 	if (!majors)
@@ -3794,10 +3787,8 @@ static void __exit exit_sd(void)
 
 	class_unregister(&sd_disk_class);
 
-	for (i = 0; i < SD_MAJORS; i++) {
-		blk_unregister_region(sd_major(i), SD_MINORS);
+	for (i = 0; i < SD_MAJORS; i++)
 		unregister_blkdev(sd_major(i), "sd");
-	}
 }
 
 module_init(init_sd);
-- 
2.29.2


^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 43/78] brd: use __register_blkdev to allocate devices on demand
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (41 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 42/78] sd: use __register_blkdev to avoid a modprobe for an unregistered dev_t Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 44/78] loop: " Christoph Hellwig
                   ` (36 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Use the simpler mechanism attached to major_names to allocate a brd device
when a currently unregistered minor is accessed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/block/brd.c | 39 +++++++++++----------------------------
 1 file changed, 11 insertions(+), 28 deletions(-)
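
Note (not part of the commit): a minimal userspace sketch of the
find-or-create step the new probe callback performs, assuming a singly
linked list in place of the kernel list and no locking.  The device index is
the minor divided by the number of minors reserved per device.  All demo_*
names and the value 16 are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_brd {
	struct demo_brd *next;
	int number;
};

struct demo_brd *demo_brd_devices;
int demo_max_part = 16;	/* minors reserved per device; hypothetical value */

/*
 * Find or create the device that owns @minor.
 * Returns 1 if a device was created, 0 if it already existed,
 * -1 on allocation failure.  The real driver holds a mutex here.
 */
int demo_brd_probe(unsigned int minor)
{
	struct demo_brd *brd;
	int i;

	assert(demo_max_part > 0);
	i = (int)(minor / (unsigned int)demo_max_part);

	for (brd = demo_brd_devices; brd; brd = brd->next)
		if (brd->number == i)
			return 0;

	brd = malloc(sizeof(*brd));
	if (!brd)
		return -1;
	brd->number = i;
	brd->next = demo_brd_devices;
	demo_brd_devices = brd;
	return 1;
}
```

Probing two minors that share an index creates the device only once; the
second call finds the existing entry and leaves the list unchanged.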

diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index cc49a921339f77..c43a6ab4b1f39f 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -426,14 +426,15 @@ static void brd_free(struct brd_device *brd)
 	kfree(brd);
 }
 
-static struct brd_device *brd_init_one(int i, bool *new)
+static void brd_probe(dev_t dev)
 {
 	struct brd_device *brd;
+	int i = MINOR(dev) / max_part;
 
-	*new = false;
+	mutex_lock(&brd_devices_mutex);
 	list_for_each_entry(brd, &brd_devices, brd_list) {
 		if (brd->brd_number == i)
-			goto out;
+			goto out_unlock;
 	}
 
 	brd = brd_alloc(i);
@@ -442,9 +443,9 @@ static struct brd_device *brd_init_one(int i, bool *new)
 		add_disk(brd->brd_disk);
 		list_add_tail(&brd->brd_list, &brd_devices);
 	}
-	*new = true;
-out:
-	return brd;
+
+out_unlock:
+	mutex_unlock(&brd_devices_mutex);
 }
 
 static void brd_del_one(struct brd_device *brd)
@@ -454,23 +455,6 @@ static void brd_del_one(struct brd_device *brd)
 	brd_free(brd);
 }
 
-static struct kobject *brd_probe(dev_t dev, int *part, void *data)
-{
-	struct brd_device *brd;
-	struct kobject *kobj;
-	bool new;
-
-	mutex_lock(&brd_devices_mutex);
-	brd = brd_init_one(MINOR(dev) / max_part, &new);
-	kobj = brd ? get_disk_and_module(brd->brd_disk) : NULL;
-	mutex_unlock(&brd_devices_mutex);
-
-	if (new)
-		*part = 0;
-
-	return kobj;
-}
-
 static inline void brd_check_and_reset_par(void)
 {
 	if (unlikely(!max_part))
@@ -510,11 +494,12 @@ static int __init brd_init(void)
 	 *	dynamically.
 	 */
 
-	if (register_blkdev(RAMDISK_MAJOR, "ramdisk"))
+	if (__register_blkdev(RAMDISK_MAJOR, "ramdisk", brd_probe))
 		return -EIO;
 
 	brd_check_and_reset_par();
 
+	mutex_lock(&brd_devices_mutex);
 	for (i = 0; i < rd_nr; i++) {
 		brd = brd_alloc(i);
 		if (!brd)
@@ -532,9 +517,7 @@ static int __init brd_init(void)
 		brd->brd_disk->queue = brd->brd_queue;
 		add_disk(brd->brd_disk);
 	}
-
-	blk_register_region(MKDEV(RAMDISK_MAJOR, 0), 1UL << MINORBITS,
-				  THIS_MODULE, brd_probe, NULL, NULL);
+	mutex_unlock(&brd_devices_mutex);
 
 	pr_info("brd: module loaded\n");
 	return 0;
@@ -544,6 +527,7 @@ static int __init brd_init(void)
 		list_del(&brd->brd_list);
 		brd_free(brd);
 	}
+	mutex_unlock(&brd_devices_mutex);
 	unregister_blkdev(RAMDISK_MAJOR, "ramdisk");
 
 	pr_info("brd: module NOT loaded !!!\n");
@@ -557,7 +541,6 @@ static void __exit brd_exit(void)
 	list_for_each_entry_safe(brd, next, &brd_devices, brd_list)
 		brd_del_one(brd);
 
-	blk_unregister_region(MKDEV(RAMDISK_MAJOR, 0), 1UL << MINORBITS);
 	unregister_blkdev(RAMDISK_MAJOR, "ramdisk");
 
 	pr_info("brd: module unloaded\n");
-- 
2.29.2


^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 44/78] loop: use __register_blkdev to allocate devices on demand
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (42 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 43/78] brd: use __register_blkdev to allocate devices on demand Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 45/78] md: " Christoph Hellwig
                   ` (35 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Use the simpler mechanism attached to major_name to allocate a loop device
when a currently unregistered minor is accessed.
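
The lookup-or-create pattern behind the new probe callback can be
sketched in user space as follows.  This is only an illustration under
stated assumptions: a fixed-size table stands in for the driver's idr,
and demo_probe, demo_table and DEMO_MAX are hypothetical names that do
not exist in the kernel.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; not kernel APIs. */
#define DEMO_MAX 8

static bool demo_table[DEMO_MAX];	/* true once a device exists at idx */

/*
 * Model of an on-demand probe: map a minor number to a device index,
 * ignore out-of-range requests, and create the device only if it is
 * missing.  Returns true if this call created a new device.
 */
static bool demo_probe(unsigned int minor, unsigned int part_shift,
		       unsigned int max_devs)
{
	unsigned int idx = minor >> part_shift;

	/* max_devs == 0 means "no configured limit" in this sketch */
	if (max_devs && idx >= max_devs)
		return false;
	if (idx >= DEMO_MAX)
		return false;	/* table bound of the sketch itself */
	if (demo_table[idx])
		return false;	/* already present: nothing to do */

	demo_table[idx] = true;
	return true;
}
```

Unlike the old blk_register_region callback, the probe in this model
does not hand a device back to the caller; it only makes sure that one
exists for the index.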

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/block/loop.c | 30 ++++++++----------------------
 1 file changed, 8 insertions(+), 22 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 41caf799df721f..9a27d4f1c08aac 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -2231,24 +2231,18 @@ static int loop_lookup(struct loop_device **l, int i)
 	return ret;
 }
 
-static struct kobject *loop_probe(dev_t dev, int *part, void *data)
+static void loop_probe(dev_t dev)
 {
+	int idx = MINOR(dev) >> part_shift;
 	struct loop_device *lo;
-	struct kobject *kobj;
-	int err;
+
+	if (max_loop && idx >= max_loop)
+		return;
 
 	mutex_lock(&loop_ctl_mutex);
-	err = loop_lookup(&lo, MINOR(dev) >> part_shift);
-	if (err < 0)
-		err = loop_add(&lo, MINOR(dev) >> part_shift);
-	if (err < 0)
-		kobj = NULL;
-	else
-		kobj = get_disk_and_module(lo->lo_disk);
+	if (loop_lookup(&lo, idx) < 0)
+		loop_add(&lo, idx);
 	mutex_unlock(&loop_ctl_mutex);
-
-	*part = 0;
-	return kobj;
 }
 
 static long loop_control_ioctl(struct file *file, unsigned int cmd,
@@ -2368,14 +2362,11 @@ static int __init loop_init(void)
 		goto err_out;
 
 
-	if (register_blkdev(LOOP_MAJOR, "loop")) {
+	if (__register_blkdev(LOOP_MAJOR, "loop", loop_probe)) {
 		err = -EIO;
 		goto misc_out;
 	}
 
-	blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
-				  THIS_MODULE, loop_probe, NULL, NULL);
-
 	/* pre-create number of devices given by config or max_loop */
 	mutex_lock(&loop_ctl_mutex);
 	for (i = 0; i < nr; i++)
@@ -2401,16 +2392,11 @@ static int loop_exit_cb(int id, void *ptr, void *data)
 
 static void __exit loop_exit(void)
 {
-	unsigned long range;
-
-	range = max_loop ? max_loop << part_shift : 1UL << MINORBITS;
-
 	mutex_lock(&loop_ctl_mutex);
 
 	idr_for_each(&loop_index_idr, &loop_exit_cb, NULL);
 	idr_destroy(&loop_index_idr);
 
-	blk_unregister_region(MKDEV(LOOP_MAJOR, 0), range);
 	unregister_blkdev(LOOP_MAJOR, "loop");
 
 	misc_deregister(&loop_misc);
-- 
2.29.2



* [PATCH 45/78] md: use __register_blkdev to allocate devices on demand
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (43 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 44/78] loop: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 46/78] ide: switch to __register_blkdev for command set probing Christoph Hellwig
                   ` (34 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Use the simpler mechanism attached to major_name to allocate an md device
when a currently unregistered minor is accessed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/md.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index fa31b71a72a35d..b2edf5e0f965b5 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5764,11 +5764,12 @@ static int md_alloc(dev_t dev, char *name)
 	return error;
 }
 
-static struct kobject *md_probe(dev_t dev, int *part, void *data)
+static void md_probe(dev_t dev)
 {
+	if (MAJOR(dev) == MD_MAJOR && MINOR(dev) >= 512)
+		return;
 	if (create_on_open)
 		md_alloc(dev, NULL);
-	return NULL;
 }
 
 static int add_named_array(const char *val, const struct kernel_param *kp)
@@ -6532,7 +6533,7 @@ static void autorun_devices(int part)
 			break;
 		}
 
-		md_probe(dev, NULL, NULL);
+		md_probe(dev);
 		mddev = mddev_find(dev);
 		if (!mddev || !mddev->gendisk) {
 			if (mddev)
@@ -9563,18 +9564,15 @@ static int __init md_init(void)
 	if (!md_rdev_misc_wq)
 		goto err_rdev_misc_wq;
 
-	if ((ret = register_blkdev(MD_MAJOR, "md")) < 0)
+	ret = __register_blkdev(MD_MAJOR, "md", md_probe);
+	if (ret < 0)
 		goto err_md;
 
-	if ((ret = register_blkdev(0, "mdp")) < 0)
+	ret = __register_blkdev(0, "mdp", md_probe);
+	if (ret < 0)
 		goto err_mdp;
 	mdp_major = ret;
 
-	blk_register_region(MKDEV(MD_MAJOR, 0), 512, THIS_MODULE,
-			    md_probe, NULL, NULL);
-	blk_register_region(MKDEV(mdp_major, 0), 1UL<<MINORBITS, THIS_MODULE,
-			    md_probe, NULL, NULL);
-
 	register_reboot_notifier(&md_notifier);
 	raid_table_header = register_sysctl_table(raid_root_table);
 
@@ -9841,9 +9839,6 @@ static __exit void md_exit(void)
 	struct list_head *tmp;
 	int delay = 1;
 
-	blk_unregister_region(MKDEV(MD_MAJOR,0), 512);
-	blk_unregister_region(MKDEV(mdp_major,0), 1U << MINORBITS);
-
 	unregister_blkdev(MD_MAJOR,"md");
 	unregister_blkdev(mdp_major, "mdp");
 	unregister_reboot_notifier(&md_notifier);
-- 
2.29.2



* [PATCH 46/78] ide: switch to __register_blkdev for command set probing
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (44 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 45/78] md: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 47/78] floppy: use a separate gendisk for each media format Christoph Hellwig
                   ` (33 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

ide is the last user of the blk_register_region framework apart from the
tracking of allocated gendisks.  Switch to __register_blkdev, even though
that does not let us trivially find out which command set to probe for.
As a result we now always request all four command set modules when a
user tries to access an unclaimed ide device node.  That only loads a few
potentially unneeded modules for a fringe use case of a deprecated and
soon to be removed driver, so it makes no practical difference.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ide/ide-probe.c | 34 ++++++----------------------------
 1 file changed, 6 insertions(+), 28 deletions(-)

diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c
index 076d34b381720f..1c1567bb519429 100644
--- a/drivers/ide/ide-probe.c
+++ b/drivers/ide/ide-probe.c
@@ -902,31 +902,12 @@ static int init_irq (ide_hwif_t *hwif)
 	return 1;
 }
 
-static int ata_lock(dev_t dev, void *data)
+static void ata_probe(dev_t dev)
 {
-	/* FIXME: we want to pin hwif down */
-	return 0;
-}
-
-static struct kobject *ata_probe(dev_t dev, int *part, void *data)
-{
-	ide_hwif_t *hwif = data;
-	int unit = *part >> PARTN_BITS;
-	ide_drive_t *drive = hwif->devices[unit];
-
-	if ((drive->dev_flags & IDE_DFLAG_PRESENT) == 0)
-		return NULL;
-
-	if (drive->media == ide_disk)
-		request_module("ide-disk");
-	if (drive->media == ide_cdrom || drive->media == ide_optical)
-		request_module("ide-cd");
-	if (drive->media == ide_tape)
-		request_module("ide-tape");
-	if (drive->media == ide_floppy)
-		request_module("ide-floppy");
-
-	return NULL;
+	request_module("ide-disk");
+	request_module("ide-cd");
+	request_module("ide-tape");
+	request_module("ide-floppy");
 }
 
 void ide_init_disk(struct gendisk *disk, ide_drive_t *drive)
@@ -967,7 +948,7 @@ static int hwif_init(ide_hwif_t *hwif)
 		return 0;
 	}
 
-	if (register_blkdev(hwif->major, hwif->name))
+	if (__register_blkdev(hwif->major, hwif->name, ata_probe))
 		return 0;
 
 	if (!hwif->sg_max_nents)
@@ -989,8 +970,6 @@ static int hwif_init(ide_hwif_t *hwif)
 		goto out;
 	}
 
-	blk_register_region(MKDEV(hwif->major, 0), MAX_DRIVES << PARTN_BITS,
-			    THIS_MODULE, ata_probe, ata_lock, hwif);
 	return 1;
 
 out:
@@ -1582,7 +1561,6 @@ static void ide_unregister(ide_hwif_t *hwif)
 	/*
 	 * Remove us from the kernel's knowledge
 	 */
-	blk_unregister_region(MKDEV(hwif->major, 0), MAX_DRIVES<<PARTN_BITS);
 	kfree(hwif->sg_table);
 	unregister_blkdev(hwif->major, hwif->name);
 
-- 
2.29.2



* [PATCH 47/78] floppy: use a separate gendisk for each media format
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (45 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 46/78] ide: switch to __register_blkdev for command set probing Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 48/78] amiflop: use separate gendisks for Amiga vs MS-DOS mode Christoph Hellwig
                   ` (32 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

The floppy driver usually autodetects the media when used with the
normal /dev/fd? devices, which also are the only nodes created by udev.
But it also supports various aliases that force a given media format.
That is currently supported using the blk_register_region framework
which finds the floppy gendisk even for a 'mismatched' dev_t.  The
problem with this (besides the code complexity) is that it creates
multiple struct block_device instances for the whole device of a
single gendisk, which can lead to interesting issues in code not
aware of that fact.

To fix this, create a separate gendisk for each alias the first time it
is accessed.
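
The minor number layout that selects a drive and a forced media format
can be sketched in user space as follows.  The two decode helpers mirror
the expressions used in floppy_probe; the encode helper assumes that
TOMINOR(x) expands to ((x & 3) | ((x & 4) << 5)), which is not shown in
this patch.  The fd_* helper names are hypothetical and exist only for
this illustration.

```c
/* Hypothetical helpers for illustration only; not part of floppy.c. */

/* Bits 0-1 select the unit, bit 7 supplies bit 2 of the drive number. */
static unsigned int fd_minor_to_drive(unsigned int minor)
{
	return (minor & 3) | ((minor & 0x80) >> 5);
}

/* Bits 2-6 hold the media format index; 0 is the plain fd? node. */
static unsigned int fd_minor_to_type(unsigned int minor)
{
	return (minor >> 2) & 0x1f;
}

/*
 * Inverse mapping, as used for first_minor in floppy_alloc_disk.
 * Assumes TOMINOR(x) is ((x & 3) | ((x & 4) << 5)).
 */
static unsigned int fd_make_minor(unsigned int drive, unsigned int type)
{
	return ((drive & 3) | ((drive & 4) << 5)) | (type << 2);
}
```

With this layout every (drive, type) pair maps to its own minor, which
is what allows each alias to get its own gendisk.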

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/floppy.c | 154 ++++++++++++++++++++++++++---------------
 1 file changed, 97 insertions(+), 57 deletions(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 7df79ae6b0a1e1..dfe1dfc901ccc2 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -402,7 +402,6 @@ static struct floppy_drive_params drive_params[N_DRIVE];
 static struct floppy_drive_struct drive_state[N_DRIVE];
 static struct floppy_write_errors write_errors[N_DRIVE];
 static struct timer_list motor_off_timer[N_DRIVE];
-static struct gendisk *disks[N_DRIVE];
 static struct blk_mq_tag_set tag_sets[N_DRIVE];
 static struct block_device *opened_bdev[N_DRIVE];
 static DEFINE_MUTEX(open_lock);
@@ -477,6 +476,8 @@ static struct floppy_struct floppy_type[32] = {
 	{ 3200,20,2,80,0,0x1C,0x00,0xCF,0x2C,"H1600" }, /* 31 1.6MB 3.5"    */
 };
 
+static struct gendisk *disks[N_DRIVE][ARRAY_SIZE(floppy_type)];
+
 #define SECTSIZE (_FD_SECTSIZE(*floppy))
 
 /* Auto-detection: Disk type used until the next media change occurs. */
@@ -4111,7 +4112,7 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
 
 	new_dev = MINOR(bdev->bd_dev);
 	drive_state[drive].fd_device = new_dev;
-	set_capacity(disks[drive], floppy_sizes[new_dev]);
+	set_capacity(disks[drive][ITYPE(new_dev)], floppy_sizes[new_dev]);
 	if (old_dev != -1 && old_dev != new_dev) {
 		if (buffer_drive == drive)
 			buffer_track = -1;
@@ -4579,15 +4580,58 @@ static bool floppy_available(int drive)
 	return true;
 }
 
-static struct kobject *floppy_find(dev_t dev, int *part, void *data)
+static int floppy_alloc_disk(unsigned int drive, unsigned int type)
 {
-	int drive = (*part & 3) | ((*part & 0x80) >> 5);
-	if (drive >= N_DRIVE || !floppy_available(drive))
-		return NULL;
-	if (((*part >> 2) & 0x1f) >= ARRAY_SIZE(floppy_type))
-		return NULL;
-	*part = 0;
-	return get_disk_and_module(disks[drive]);
+	struct gendisk *disk;
+	int err;
+
+	disk = alloc_disk(1);
+	if (!disk)
+		return -ENOMEM;
+
+	disk->queue = blk_mq_init_queue(&tag_sets[drive]);
+	if (IS_ERR(disk->queue)) {
+		err = PTR_ERR(disk->queue);
+		disk->queue = NULL;
+		put_disk(disk);
+		return err;
+	}
+
+	blk_queue_bounce_limit(disk->queue, BLK_BOUNCE_HIGH);
+	blk_queue_max_hw_sectors(disk->queue, 64);
+	disk->major = FLOPPY_MAJOR;
+	disk->first_minor = TOMINOR(drive) | (type << 2);
+	disk->fops = &floppy_fops;
+	disk->events = DISK_EVENT_MEDIA_CHANGE;
+	if (type)
+		sprintf(disk->disk_name, "fd%d_type%d", drive, type);
+	else
+		sprintf(disk->disk_name, "fd%d", drive);
+	/* to be cleaned up... */
+	disk->private_data = (void *)(long)drive;
+	disk->flags |= GENHD_FL_REMOVABLE;
+
+	disks[drive][type] = disk;
+	return 0;
+}
+
+static DEFINE_MUTEX(floppy_probe_lock);
+
+static void floppy_probe(dev_t dev)
+{
+	unsigned int drive = (MINOR(dev) & 3) | ((MINOR(dev) & 0x80) >> 5);
+	unsigned int type = (MINOR(dev) >> 2) & 0x1f;
+
+	if (drive >= N_DRIVE || !floppy_available(drive) ||
+	    type >= ARRAY_SIZE(floppy_type))
+		return;
+
+	mutex_lock(&floppy_probe_lock);
+	if (!disks[drive][type]) {
+		if (floppy_alloc_disk(drive, type) == 0)
+			add_disk(disks[drive][type]);
+	}
+	mutex_unlock(&floppy_probe_lock);
 }
 
 static int __init do_floppy_init(void)
@@ -4609,33 +4653,25 @@ static int __init do_floppy_init(void)
 		return -ENOMEM;
 
 	for (drive = 0; drive < N_DRIVE; drive++) {
-		disks[drive] = alloc_disk(1);
-		if (!disks[drive]) {
-			err = -ENOMEM;
+		memset(&tag_sets[drive], 0, sizeof(tag_sets[drive]));
+		tag_sets[drive].ops = &floppy_mq_ops;
+		tag_sets[drive].nr_hw_queues = 1;
+		tag_sets[drive].nr_maps = 1;
+		tag_sets[drive].queue_depth = 2;
+		tag_sets[drive].numa_node = NUMA_NO_NODE;
+		tag_sets[drive].flags = BLK_MQ_F_SHOULD_MERGE;
+		err = blk_mq_alloc_tag_set(&tag_sets[drive]);
+		if (err)
 			goto out_put_disk;
-		}
 
-		disks[drive]->queue = blk_mq_init_sq_queue(&tag_sets[drive],
-							   &floppy_mq_ops, 2,
-							   BLK_MQ_F_SHOULD_MERGE);
-		if (IS_ERR(disks[drive]->queue)) {
-			err = PTR_ERR(disks[drive]->queue);
-			disks[drive]->queue = NULL;
+		err = floppy_alloc_disk(drive, 0);
+		if (err)
 			goto out_put_disk;
-		}
-
-		blk_queue_bounce_limit(disks[drive]->queue, BLK_BOUNCE_HIGH);
-		blk_queue_max_hw_sectors(disks[drive]->queue, 64);
-		disks[drive]->major = FLOPPY_MAJOR;
-		disks[drive]->first_minor = TOMINOR(drive);
-		disks[drive]->fops = &floppy_fops;
-		disks[drive]->events = DISK_EVENT_MEDIA_CHANGE;
-		sprintf(disks[drive]->disk_name, "fd%d", drive);
 
 		timer_setup(&motor_off_timer[drive], motor_off_callback, 0);
 	}
 
-	err = register_blkdev(FLOPPY_MAJOR, "fd");
+	err = __register_blkdev(FLOPPY_MAJOR, "fd", floppy_probe);
 	if (err)
 		goto out_put_disk;
 
@@ -4643,9 +4679,6 @@ static int __init do_floppy_init(void)
 	if (err)
 		goto out_unreg_blkdev;
 
-	blk_register_region(MKDEV(FLOPPY_MAJOR, 0), 256, THIS_MODULE,
-			    floppy_find, NULL, NULL);
-
 	for (i = 0; i < 256; i++)
 		if (ITYPE(i))
 			floppy_sizes[i] = floppy_type[ITYPE(i)].size;
@@ -4673,7 +4706,7 @@ static int __init do_floppy_init(void)
 	if (fdc_state[0].address == -1) {
 		cancel_delayed_work(&fd_timeout);
 		err = -ENODEV;
-		goto out_unreg_region;
+		goto out_unreg_driver;
 	}
 #if N_FDC > 1
 	fdc_state[1].address = FDC2;
@@ -4684,7 +4717,7 @@ static int __init do_floppy_init(void)
 	if (err) {
 		cancel_delayed_work(&fd_timeout);
 		err = -EBUSY;
-		goto out_unreg_region;
+		goto out_unreg_driver;
 	}
 
 	/* initialise drive state */
@@ -4761,10 +4794,8 @@ static int __init do_floppy_init(void)
 		if (err)
 			goto out_remove_drives;
 
-		/* to be cleaned up... */
-		disks[drive]->private_data = (void *)(long)drive;
-		disks[drive]->flags |= GENHD_FL_REMOVABLE;
-		device_add_disk(&floppy_device[drive].dev, disks[drive], NULL);
+		device_add_disk(&floppy_device[drive].dev, disks[drive][0],
+				NULL);
 	}
 
 	return 0;
@@ -4772,30 +4803,27 @@ static int __init do_floppy_init(void)
 out_remove_drives:
 	while (drive--) {
 		if (floppy_available(drive)) {
-			del_gendisk(disks[drive]);
+			del_gendisk(disks[drive][0]);
 			platform_device_unregister(&floppy_device[drive]);
 		}
 	}
 out_release_dma:
 	if (atomic_read(&usage_count))
 		floppy_release_irq_and_dma();
-out_unreg_region:
-	blk_unregister_region(MKDEV(FLOPPY_MAJOR, 0), 256);
+out_unreg_driver:
 	platform_driver_unregister(&floppy_driver);
 out_unreg_blkdev:
 	unregister_blkdev(FLOPPY_MAJOR, "fd");
 out_put_disk:
 	destroy_workqueue(floppy_wq);
 	for (drive = 0; drive < N_DRIVE; drive++) {
-		if (!disks[drive])
+		if (!disks[drive][0])
 			break;
-		if (disks[drive]->queue) {
-			del_timer_sync(&motor_off_timer[drive]);
-			blk_cleanup_queue(disks[drive]->queue);
-			disks[drive]->queue = NULL;
-			blk_mq_free_tag_set(&tag_sets[drive]);
-		}
-		put_disk(disks[drive]);
+		del_timer_sync(&motor_off_timer[drive]);
+		blk_cleanup_queue(disks[drive][0]->queue);
+		disks[drive][0]->queue = NULL;
+		blk_mq_free_tag_set(&tag_sets[drive]);
+		put_disk(disks[drive][0]);
 	}
 	return err;
 }
@@ -5006,9 +5034,8 @@ module_init(floppy_module_init);
 
 static void __exit floppy_module_exit(void)
 {
-	int drive;
+	int drive, i;
 
-	blk_unregister_region(MKDEV(FLOPPY_MAJOR, 0), 256);
 	unregister_blkdev(FLOPPY_MAJOR, "fd");
 	platform_driver_unregister(&floppy_driver);
 
@@ -5018,10 +5045,16 @@ static void __exit floppy_module_exit(void)
 		del_timer_sync(&motor_off_timer[drive]);
 
 		if (floppy_available(drive)) {
-			del_gendisk(disks[drive]);
+			for (i = 0; i < ARRAY_SIZE(floppy_type); i++) {
+				if (disks[drive][i])
+					del_gendisk(disks[drive][i]);
+			}
 			platform_device_unregister(&floppy_device[drive]);
 		}
-		blk_cleanup_queue(disks[drive]->queue);
+		for (i = 0; i < ARRAY_SIZE(floppy_type); i++) {
+			if (disks[drive][i])
+				blk_cleanup_queue(disks[drive][i]->queue);
+		}
 		blk_mq_free_tag_set(&tag_sets[drive]);
 
 		/*
@@ -5029,10 +5062,17 @@ static void __exit floppy_module_exit(void)
 		 * queue reference in put_disk().
 		 */
 		if (!(allowed_drive_mask & (1 << drive)) ||
-		    fdc_state[FDC(drive)].version == FDC_NONE)
-			disks[drive]->queue = NULL;
+		    fdc_state[FDC(drive)].version == FDC_NONE) {
+			for (i = 0; i < ARRAY_SIZE(floppy_type); i++) {
+				if (disks[drive][i])
+					disks[drive][i]->queue = NULL;
+			}
+		}
 
-		put_disk(disks[drive]);
+		for (i = 0; i < ARRAY_SIZE(floppy_type); i++) {
+			if (disks[drive][i])
+				put_disk(disks[drive][i]);
+		}
 	}
 
 	cancel_delayed_work_sync(&fd_timeout);
-- 
2.29.2



* [PATCH 48/78] amiflop: use separate gendisks for Amiga vs MS-DOS mode
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (46 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 47/78] floppy: use a separate gendisk for each media format Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 49/78] ataflop: use a separate gendisk for each media format Christoph Hellwig
                   ` (31 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Use separate gendisks (which share a tag_set) for the native Amiga vs
the MS-DOS mode instead of redirecting the gendisk lookup using a probe
callback.  This avoids potential problems with aliased block_device
instances and will eventually allow for removing the blk_register_region
framework.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/block/amiflop.c | 98 +++++++++++++++++++++++------------------
 1 file changed, 55 insertions(+), 43 deletions(-)

diff --git a/drivers/block/amiflop.c b/drivers/block/amiflop.c
index 71c2b156455860..9e2d0c6a387721 100644
--- a/drivers/block/amiflop.c
+++ b/drivers/block/amiflop.c
@@ -201,7 +201,7 @@ struct amiga_floppy_struct {
 	int busy;			/* true when drive is active */
 	int dirty;			/* true when trackbuf is not on disk */
 	int status;			/* current error code for unit */
-	struct gendisk *gendisk;
+	struct gendisk *gendisk[2];
 	struct blk_mq_tag_set tag_set;
 };
 
@@ -1669,6 +1669,11 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
 		return -EBUSY;
 	}
 
+	if (unit[drive].type->code == FD_NODRIVE) {
+		mutex_unlock(&amiflop_mutex);
+		return -ENXIO;
+	}
+
 	if (mode & (FMODE_READ|FMODE_WRITE)) {
 		bdev_check_media_change(bdev);
 		if (mode & FMODE_WRITE) {
@@ -1695,7 +1700,7 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
 	unit[drive].dtype=&data_types[system];
 	unit[drive].blocks=unit[drive].type->heads*unit[drive].type->tracks*
 		data_types[system].sects*unit[drive].type->sect_mult;
-	set_capacity(unit[drive].gendisk, unit[drive].blocks);
+	set_capacity(unit[drive].gendisk[system], unit[drive].blocks);
 
 	printk(KERN_INFO "fd%d: accessing %s-disk with %s-layout\n",drive,
 	       unit[drive].type->name, data_types[system].name);
@@ -1772,36 +1777,68 @@ static const struct blk_mq_ops amiflop_mq_ops = {
 	.queue_rq = amiflop_queue_rq,
 };
 
-static struct gendisk *fd_alloc_disk(int drive)
+static int fd_alloc_disk(int drive, int system)
 {
 	struct gendisk *disk;
 
 	disk = alloc_disk(1);
 	if (!disk)
 		goto out;
-
-	disk->queue = blk_mq_init_sq_queue(&unit[drive].tag_set, &amiflop_mq_ops,
-						2, BLK_MQ_F_SHOULD_MERGE);
-	if (IS_ERR(disk->queue)) {
-		disk->queue = NULL;
+	disk->queue = blk_mq_init_queue(&unit[drive].tag_set);
+	if (IS_ERR(disk->queue))
 		goto out_put_disk;
-	}
 
+	disk->major = FLOPPY_MAJOR;
+	disk->first_minor = drive + system;
+	disk->fops = &floppy_fops;
+	disk->events = DISK_EVENT_MEDIA_CHANGE;
+	if (system)
+		sprintf(disk->disk_name, "fd%d_msdos", drive);
+	else
+		sprintf(disk->disk_name, "fd%d", drive);
+	disk->private_data = &unit[drive];
+	set_capacity(disk, 880 * 2);
+
+	unit[drive].gendisk[system] = disk;
+	add_disk(disk);
+	return 0;
+
+out_put_disk:
+	disk->queue = NULL;
+	put_disk(disk);
+out:
+	return -ENOMEM;
+}
+
+static int fd_alloc_drive(int drive)
+{
 	unit[drive].trackbuf = kmalloc(FLOPPY_MAX_SECTORS * 512, GFP_KERNEL);
 	if (!unit[drive].trackbuf)
-		goto out_cleanup_queue;
+		goto out;
 
-	return disk;
+	memset(&unit[drive].tag_set, 0, sizeof(unit[drive].tag_set));
+	unit[drive].tag_set.ops = &amiflop_mq_ops;
+	unit[drive].tag_set.nr_hw_queues = 1;
+	unit[drive].tag_set.nr_maps = 1;
+	unit[drive].tag_set.queue_depth = 2;
+	unit[drive].tag_set.numa_node = NUMA_NO_NODE;
+	unit[drive].tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
+	if (blk_mq_alloc_tag_set(&unit[drive].tag_set))
+		goto out_cleanup_trackbuf;
 
-out_cleanup_queue:
-	blk_cleanup_queue(disk->queue);
-	disk->queue = NULL;
+	pr_cont(" fd%d", drive);
+
+	if (fd_alloc_disk(drive, 0) || fd_alloc_disk(drive, 1))
+		goto out_cleanup_tagset;
+	return 0;
+
+out_cleanup_tagset:
 	blk_mq_free_tag_set(&unit[drive].tag_set);
-out_put_disk:
-	put_disk(disk);
+out_cleanup_trackbuf:
+	kfree(unit[drive].trackbuf);
 out:
 	unit[drive].type->code = FD_NODRIVE;
-	return NULL;
+	return -ENOMEM;
 }
 
 static int __init fd_probe_drives(void)
@@ -1812,29 +1849,16 @@ static int __init fd_probe_drives(void)
 	drives=0;
 	nomem=0;
 	for(drive=0;drive<FD_MAX_UNITS;drive++) {
-		struct gendisk *disk;
 		fd_probe(drive);
 		if (unit[drive].type->code == FD_NODRIVE)
 			continue;
 
-		disk = fd_alloc_disk(drive);
-		if (!disk) {
+		if (fd_alloc_drive(drive) < 0) {
 			pr_cont(" no mem for fd%d", drive);
 			nomem = 1;
 			continue;
 		}
-		unit[drive].gendisk = disk;
 		drives++;
-
-		pr_cont(" fd%d",drive);
-		disk->major = FLOPPY_MAJOR;
-		disk->first_minor = drive;
-		disk->fops = &floppy_fops;
-		disk->events = DISK_EVENT_MEDIA_CHANGE;
-		sprintf(disk->disk_name, "fd%d", drive);
-		disk->private_data = &unit[drive];
-		set_capacity(disk, 880*2);
-		add_disk(disk);
 	}
 	if ((drives > 0) || (nomem == 0)) {
 		if (drives == 0)
@@ -1846,15 +1870,6 @@ static int __init fd_probe_drives(void)
 	return -ENOMEM;
 }
  
-static struct kobject *floppy_find(dev_t dev, int *part, void *data)
-{
-	int drive = *part & 3;
-	if (unit[drive].type->code == FD_NODRIVE)
-		return NULL;
-	*part = 0;
-	return get_disk_and_module(unit[drive].gendisk);
-}
-
 static int __init amiga_floppy_probe(struct platform_device *pdev)
 {
 	int i, ret;
@@ -1884,9 +1899,6 @@ static int __init amiga_floppy_probe(struct platform_device *pdev)
 	if (fd_probe_drives() < 1) /* No usable drives */
 		goto out_probe;
 
-	blk_register_region(MKDEV(FLOPPY_MAJOR, 0), 256, THIS_MODULE,
-				floppy_find, NULL, NULL);
-
 	/* initialize variables */
 	timer_setup(&motor_on_timer, motor_on_callback, 0);
 	motor_on_timer.expires = 0;
-- 
2.29.2



* [PATCH 49/78] ataflop: use a separate gendisk for each media format
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (47 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 48/78] amiflop: use separate gendisks for Amiga vs MS-DOS mode Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 50/78] z2ram: reindent Christoph Hellwig
                   ` (30 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

The Atari floppy driver usually autodetects the media when used with the
normal /dev/fd? devices, which also are the only nodes created by udev.
But it also supports various aliases that force a given media format.
That is currently supported using the blk_register_region framework
which finds the floppy gendisk even for a 'mismatched' dev_t.  The
problem with this (besides the code complexity) is that it creates
multiple struct block_device instances for the whole device of a
single gendisk, which can lead to interesting issues in code not
aware of that fact.

To fix this, create a separate gendisk for each alias the first time it
is accessed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/ataflop.c | 135 +++++++++++++++++++++++++---------------
 1 file changed, 86 insertions(+), 49 deletions(-)

diff --git a/drivers/block/ataflop.c b/drivers/block/ataflop.c
index 3e881fdb06e0ad..104b713f4055af 100644
--- a/drivers/block/ataflop.c
+++ b/drivers/block/ataflop.c
@@ -297,7 +297,7 @@ static struct atari_floppy_struct {
 	unsigned int wpstat;	/* current state of WP signal (for
 				   disk change detection) */
 	int flags;		/* flags */
-	struct gendisk *disk;
+	struct gendisk *disk[NUM_DISK_MINORS];
 	int ref;
 	int type;
 	struct blk_mq_tag_set tag_set;
@@ -723,12 +723,16 @@ static void fd_error( void )
 
 static int do_format(int drive, int type, struct atari_format_descr *desc)
 {
-	struct request_queue *q = unit[drive].disk->queue;
+	struct request_queue *q;
 	unsigned char	*p;
 	int sect, nsect;
 	unsigned long	flags;
 	int ret;
 
+	if (type)
+		type--;
+
+	q = unit[drive].disk[type]->queue;
 	blk_mq_freeze_queue(q);
 	blk_mq_quiesce_queue(q);
 
@@ -738,7 +742,7 @@ static int do_format(int drive, int type, struct atari_format_descr *desc)
 	local_irq_restore(flags);
 
 	if (type) {
-		if (--type >= NUM_DISK_MINORS ||
+		if (type >= NUM_DISK_MINORS ||
 		    minor2disktype[type].drive_types > DriveType) {
 			ret = -EINVAL;
 			goto out;
@@ -1154,7 +1158,7 @@ static void fd_rwsec_done1(int status)
 			    if (SUDT[-1].blocks > ReqBlock) {
 				/* try another disk type */
 				SUDT--;
-				set_capacity(unit[SelectedDrive].disk,
+				set_capacity(unit[SelectedDrive].disk[0],
 							SUDT->blocks);
 			    } else
 				Probing = 0;
@@ -1169,7 +1173,7 @@ static void fd_rwsec_done1(int status)
 /* record not found, but not probing. Maybe stretch wrong ? Restart probing */
 			if (SUD.autoprobe) {
 				SUDT = atari_disk_type + StartDiskType[DriveType];
-				set_capacity(unit[SelectedDrive].disk,
+				set_capacity(unit[SelectedDrive].disk[0],
 							SUDT->blocks);
 				Probing = 1;
 			}
@@ -1515,7 +1519,7 @@ static blk_status_t ataflop_queue_rq(struct blk_mq_hw_ctx *hctx,
 		if (!UDT) {
 			Probing = 1;
 			UDT = atari_disk_type + StartDiskType[DriveType];
-			set_capacity(floppy->disk, UDT->blocks);
+			set_capacity(bd->rq->rq_disk, UDT->blocks);
 			UD.autoprobe = 1;
 		}
 	} 
@@ -1533,7 +1537,7 @@ static blk_status_t ataflop_queue_rq(struct blk_mq_hw_ctx *hctx,
 		}
 		type = minor2disktype[type].index;
 		UDT = &atari_disk_type[type];
-		set_capacity(floppy->disk, UDT->blocks);
+		set_capacity(bd->rq->rq_disk, UDT->blocks);
 		UD.autoprobe = 0;
 	}
 
@@ -1658,7 +1662,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode,
 				    printk (KERN_INFO "floppy%d: setting %s %p!\n",
 				        drive, dtp->name, dtp);
 				UDT = dtp;
-				set_capacity(floppy->disk, UDT->blocks);
+				set_capacity(disk, UDT->blocks);
 
 				if (cmd == FDDEFPRM) {
 				  /* save settings as permanent default type */
@@ -1702,7 +1706,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode,
 			return -EINVAL;
 
 		UDT = dtp;
-		set_capacity(floppy->disk, UDT->blocks);
+		set_capacity(disk, UDT->blocks);
 
 		return 0;
 	case FDMSGON:
@@ -1725,7 +1729,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode,
 		UDT = NULL;
 		/* MSch: invalidate default_params */
 		default_params[drive].blocks  = 0;
-		set_capacity(floppy->disk, MAX_DISK_SIZE * 2);
+		set_capacity(disk, MAX_DISK_SIZE * 2);
 		fallthrough;
 	case FDFMTEND:
 	case FDFLUSH:
@@ -1962,14 +1966,50 @@ static const struct blk_mq_ops ataflop_mq_ops = {
 	.commit_rqs = ataflop_commit_rqs,
 };
 
-static struct kobject *floppy_find(dev_t dev, int *part, void *data)
+static int ataflop_alloc_disk(unsigned int drive, unsigned int type)
 {
-	int drive = *part & 3;
-	int type  = *part >> 2;
+	struct gendisk *disk;
+	int ret;
+
+	disk = alloc_disk(1);
+	if (!disk)
+		return -ENOMEM;
+
+	disk->queue = blk_mq_init_queue(&unit[drive].tag_set);
+	if (IS_ERR(disk->queue)) {
+		ret = PTR_ERR(disk->queue);
+		disk->queue = NULL;
+		put_disk(disk);
+		return ret;
+	}
+
+	disk->major = FLOPPY_MAJOR;
+	disk->first_minor = drive + (type << 2);
+	sprintf(disk->disk_name, "fd%d", drive);
+	disk->fops = &floppy_fops;
+	disk->events = DISK_EVENT_MEDIA_CHANGE;
+	disk->private_data = &unit[drive];
+	set_capacity(disk, MAX_DISK_SIZE * 2);
+
+	unit[drive].disk[type] = disk;
+	return 0;
+}
+
+static DEFINE_MUTEX(ataflop_probe_lock);
+
+static void ataflop_probe(dev_t dev)
+{
+	int drive = MINOR(dev) & 3;
+	int type  = MINOR(dev) >> 2;
+
 	if (drive >= FD_MAX_UNITS || type > NUM_DISK_MINORS)
-		return NULL;
-	*part = 0;
-	return get_disk_and_module(unit[drive].disk);
+		return;
+	mutex_lock(&ataflop_probe_lock);
+	if (!unit[drive].disk[type]) {
+		if (ataflop_alloc_disk(drive, type) == 0)
+			add_disk(unit[drive].disk[type]);
+	}
+	mutex_unlock(&ataflop_probe_lock);
 }
 
 static int __init atari_floppy_init (void)
@@ -1981,23 +2021,26 @@ static int __init atari_floppy_init (void)
 		/* Amiga, Mac, ... don't have Atari-compatible floppy :-) */
 		return -ENODEV;
 
-	if (register_blkdev(FLOPPY_MAJOR,"fd"))
-		return -EBUSY;
+	mutex_lock(&ataflop_probe_lock);
+	ret = __register_blkdev(FLOPPY_MAJOR, "fd", ataflop_probe);
+	if (ret)
+		goto out_unlock;
 
 	for (i = 0; i < FD_MAX_UNITS; i++) {
-		unit[i].disk = alloc_disk(1);
-		if (!unit[i].disk) {
-			ret = -ENOMEM;
+		memset(&unit[i].tag_set, 0, sizeof(unit[i].tag_set));
+		unit[i].tag_set.ops = &ataflop_mq_ops;
+		unit[i].tag_set.nr_hw_queues = 1;
+		unit[i].tag_set.nr_maps = 1;
+		unit[i].tag_set.queue_depth = 2;
+		unit[i].tag_set.numa_node = NUMA_NO_NODE;
+		unit[i].tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
+		ret = blk_mq_alloc_tag_set(&unit[i].tag_set);
+		if (ret)
 			goto err;
-		}
 
-		unit[i].disk->queue = blk_mq_init_sq_queue(&unit[i].tag_set,
-							   &ataflop_mq_ops, 2,
-							   BLK_MQ_F_SHOULD_MERGE);
-		if (IS_ERR(unit[i].disk->queue)) {
-			put_disk(unit[i].disk);
-			ret = PTR_ERR(unit[i].disk->queue);
-			unit[i].disk->queue = NULL;
+		ret = ataflop_alloc_disk(i, 0);
+		if (ret) {
+			blk_mq_free_tag_set(&unit[i].tag_set);
 			goto err;
 		}
 	}
@@ -2027,19 +2070,9 @@ static int __init atari_floppy_init (void)
 	for (i = 0; i < FD_MAX_UNITS; i++) {
 		unit[i].track = -1;
 		unit[i].flags = 0;
-		unit[i].disk->major = FLOPPY_MAJOR;
-		unit[i].disk->first_minor = i;
-		sprintf(unit[i].disk->disk_name, "fd%d", i);
-		unit[i].disk->fops = &floppy_fops;
-		unit[i].disk->events = DISK_EVENT_MEDIA_CHANGE;
-		unit[i].disk->private_data = &unit[i];
-		set_capacity(unit[i].disk, MAX_DISK_SIZE * 2);
-		add_disk(unit[i].disk);
+		add_disk(unit[i].disk[0]);
 	}
 
-	blk_register_region(MKDEV(FLOPPY_MAJOR, 0), 256, THIS_MODULE,
-				floppy_find, NULL, NULL);
-
 	printk(KERN_INFO "Atari floppy driver: max. %cD, %strack buffering\n",
 	       DriveType == 0 ? 'D' : DriveType == 1 ? 'H' : 'E',
 	       UseTrackbuffer ? "" : "no ");
@@ -2049,14 +2082,14 @@ static int __init atari_floppy_init (void)
 
 err:
 	while (--i >= 0) {
-		struct gendisk *disk = unit[i].disk;
-
-		blk_cleanup_queue(disk->queue);
+		blk_cleanup_queue(unit[i].disk[0]->queue);
+		put_disk(unit[i].disk[0]);
 		blk_mq_free_tag_set(&unit[i].tag_set);
-		put_disk(unit[i].disk);
 	}
 
 	unregister_blkdev(FLOPPY_MAJOR, "fd");
+out_unlock:
+	mutex_unlock(&ataflop_probe_lock);
 	return ret;
 }
 
@@ -2101,13 +2134,17 @@ __setup("floppy=", atari_floppy_setup);
 
 static void __exit atari_floppy_exit(void)
 {
-	int i;
-	blk_unregister_region(MKDEV(FLOPPY_MAJOR, 0), 256);
+	int i, type;
+
 	for (i = 0; i < FD_MAX_UNITS; i++) {
-		del_gendisk(unit[i].disk);
-		blk_cleanup_queue(unit[i].disk->queue);
+		for (type = 0; type < NUM_DISK_MINORS; type++) {
+			if (!unit[i].disk[type])
+				continue;
+			del_gendisk(unit[i].disk[type]);
+			blk_cleanup_queue(unit[i].disk[type]->queue);
+			put_disk(unit[i].disk[type]);
+		}
 		blk_mq_free_tag_set(&unit[i].tag_set);
-		put_disk(unit[i].disk);
 	}
 	unregister_blkdev(FLOPPY_MAJOR, "fd");
 
-- 
2.29.2



* [PATCH 50/78] z2ram: reindent
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (48 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 49/78] ataflop: use a separate gendisk for each media format Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 51/78] z2ram: use separate gendisk for the different modes Christoph Hellwig
                   ` (29 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Reindent the driver using Lindent, as its formatting was far from the
normal Linux coding style.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/block/z2ram.c | 493 ++++++++++++++++++++----------------------
 1 file changed, 236 insertions(+), 257 deletions(-)

diff --git a/drivers/block/z2ram.c b/drivers/block/z2ram.c
index 0e734802ee7cc6..eafecc9a72b38d 100644
--- a/drivers/block/z2ram.c
+++ b/drivers/block/z2ram.c
@@ -42,7 +42,6 @@
 
 #include <linux/zorro.h>
 
-
 #define Z2MINOR_COMBINED      (0)
 #define Z2MINOR_Z2ONLY        (1)
 #define Z2MINOR_CHIPONLY      (2)
@@ -50,17 +49,17 @@
 #define Z2MINOR_MEMLIST2      (5)
 #define Z2MINOR_MEMLIST3      (6)
 #define Z2MINOR_MEMLIST4      (7)
-#define Z2MINOR_COUNT         (8) /* Move this down when adding a new minor */
+#define Z2MINOR_COUNT         (8)	/* Move this down when adding a new minor */
 
 #define Z2RAM_CHUNK1024       ( Z2RAM_CHUNKSIZE >> 10 )
 
 static DEFINE_MUTEX(z2ram_mutex);
-static u_long *z2ram_map    = NULL;
-static u_long z2ram_size    = 0;
-static int z2_count         = 0;
-static int chip_count       = 0;
-static int list_count       = 0;
-static int current_device   = -1;
+static u_long *z2ram_map = NULL;
+static u_long z2ram_size = 0;
+static int z2_count = 0;
+static int chip_count = 0;
+static int list_count = 0;
+static int current_device = -1;
 
 static DEFINE_SPINLOCK(z2ram_lock);
 
@@ -71,7 +70,7 @@ static blk_status_t z2_queue_rq(struct blk_mq_hw_ctx *hctx,
 {
 	struct request *req = bd->rq;
 	unsigned long start = blk_rq_pos(req) << 9;
-	unsigned long len  = blk_rq_cur_bytes(req);
+	unsigned long len = blk_rq_cur_bytes(req);
 
 	blk_mq_start_request(req);
 
@@ -92,7 +91,7 @@ static blk_status_t z2_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 		if (len < size)
 			size = len;
-		addr += z2ram_map[ start >> Z2RAM_CHUNKSHIFT ];
+		addr += z2ram_map[start >> Z2RAM_CHUNKSHIFT];
 		if (rq_data_dir(req) == READ)
 			memcpy(buffer, (char *)addr, size);
 		else
@@ -106,228 +105,214 @@ static blk_status_t z2_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return BLK_STS_OK;
 }
 
-static void
-get_z2ram( void )
+static void get_z2ram(void)
 {
-    int i;
-
-    for ( i = 0; i < Z2RAM_SIZE / Z2RAM_CHUNKSIZE; i++ )
-    {
-	if ( test_bit( i, zorro_unused_z2ram ) )
-	{
-	    z2_count++;
-	    z2ram_map[z2ram_size++] = (unsigned long)ZTWO_VADDR(Z2RAM_START) +
-				      (i << Z2RAM_CHUNKSHIFT);
-	    clear_bit( i, zorro_unused_z2ram );
+	int i;
+
+	for (i = 0; i < Z2RAM_SIZE / Z2RAM_CHUNKSIZE; i++) {
+		if (test_bit(i, zorro_unused_z2ram)) {
+			z2_count++;
+			z2ram_map[z2ram_size++] =
+			    (unsigned long)ZTWO_VADDR(Z2RAM_START) +
+			    (i << Z2RAM_CHUNKSHIFT);
+			clear_bit(i, zorro_unused_z2ram);
+		}
 	}
-    }
 
-    return;
+	return;
 }
 
-static void
-get_chipram( void )
+static void get_chipram(void)
 {
 
-    while ( amiga_chip_avail() > ( Z2RAM_CHUNKSIZE * 4 ) )
-    {
-	chip_count++;
-	z2ram_map[ z2ram_size ] =
-	    (u_long)amiga_chip_alloc( Z2RAM_CHUNKSIZE, "z2ram" );
+	while (amiga_chip_avail() > (Z2RAM_CHUNKSIZE * 4)) {
+		chip_count++;
+		z2ram_map[z2ram_size] =
+		    (u_long) amiga_chip_alloc(Z2RAM_CHUNKSIZE, "z2ram");
 
-	if ( z2ram_map[ z2ram_size ] == 0 )
-	{
-	    break;
+		if (z2ram_map[z2ram_size] == 0) {
+			break;
+		}
+
+		z2ram_size++;
 	}
 
-	z2ram_size++;
-    }
-	
-    return;
+	return;
 }
 
 static int z2_open(struct block_device *bdev, fmode_t mode)
 {
-    int device;
-    int max_z2_map = ( Z2RAM_SIZE / Z2RAM_CHUNKSIZE ) *
-	sizeof( z2ram_map[0] );
-    int max_chip_map = ( amiga_chip_size / Z2RAM_CHUNKSIZE ) *
-	sizeof( z2ram_map[0] );
-    int rc = -ENOMEM;
-
-    device = MINOR(bdev->bd_dev);
-
-    mutex_lock(&z2ram_mutex);
-    if ( current_device != -1 && current_device != device )
-    {
-	rc = -EBUSY;
-	goto err_out;
-    }
-
-    if ( current_device == -1 )
-    {
-	z2_count   = 0;
-	chip_count = 0;
-	list_count = 0;
-	z2ram_size = 0;
-
-	/* Use a specific list entry. */
-	if (device >= Z2MINOR_MEMLIST1 && device <= Z2MINOR_MEMLIST4) {
-		int index = device - Z2MINOR_MEMLIST1 + 1;
-		unsigned long size, paddr, vaddr;
-
-		if (index >= m68k_realnum_memory) {
-			printk( KERN_ERR DEVICE_NAME
-				": no such entry in z2ram_map\n" );
-		        goto err_out;
-		}
-
-		paddr = m68k_memory[index].addr;
-		size = m68k_memory[index].size & ~(Z2RAM_CHUNKSIZE-1);
-
-#ifdef __powerpc__
-		/* FIXME: ioremap doesn't build correct memory tables. */
-		{
-			vfree(vmalloc (size));
-		}
+	int device;
+	int max_z2_map = (Z2RAM_SIZE / Z2RAM_CHUNKSIZE) * sizeof(z2ram_map[0]);
+	int max_chip_map = (amiga_chip_size / Z2RAM_CHUNKSIZE) *
+	    sizeof(z2ram_map[0]);
+	int rc = -ENOMEM;
 
-		vaddr = (unsigned long)ioremap_wt(paddr, size);
+	device = MINOR(bdev->bd_dev);
 
-#else
-		vaddr = (unsigned long)z_remap_nocache_nonser(paddr, size);
-#endif
-		z2ram_map = 
-			kmalloc_array(size / Z2RAM_CHUNKSIZE,
-                                      sizeof(z2ram_map[0]),
-                                      GFP_KERNEL);
-		if ( z2ram_map == NULL )
-		{
-		    printk( KERN_ERR DEVICE_NAME
-			": cannot get mem for z2ram_map\n" );
-		    goto err_out;
-		}
+	mutex_lock(&z2ram_mutex);
+	if (current_device != -1 && current_device != device) {
+		rc = -EBUSY;
+		goto err_out;
+	}
 
-		while (size) {
-			z2ram_map[ z2ram_size++ ] = vaddr;
-			size -= Z2RAM_CHUNKSIZE;
-			vaddr += Z2RAM_CHUNKSIZE;
-			list_count++;
-		}
+	if (current_device == -1) {
+		z2_count = 0;
+		chip_count = 0;
+		list_count = 0;
+		z2ram_size = 0;
 
-		if ( z2ram_size != 0 )
-		    printk( KERN_INFO DEVICE_NAME
-			": using %iK List Entry %d Memory\n",
-			list_count * Z2RAM_CHUNK1024, index );
-	} else
-
-	switch ( device )
-	{
-	    case Z2MINOR_COMBINED:
-
-		z2ram_map = kmalloc( max_z2_map + max_chip_map, GFP_KERNEL );
-		if ( z2ram_map == NULL )
-		{
-		    printk( KERN_ERR DEVICE_NAME
-			": cannot get mem for z2ram_map\n" );
-		    goto err_out;
-		}
+		/* Use a specific list entry. */
+		if (device >= Z2MINOR_MEMLIST1 && device <= Z2MINOR_MEMLIST4) {
+			int index = device - Z2MINOR_MEMLIST1 + 1;
+			unsigned long size, paddr, vaddr;
 
-		get_z2ram();
-		get_chipram();
-
-		if ( z2ram_size != 0 )
-		    printk( KERN_INFO DEVICE_NAME 
-			": using %iK Zorro II RAM and %iK Chip RAM (Total %dK)\n",
-			z2_count * Z2RAM_CHUNK1024,
-			chip_count * Z2RAM_CHUNK1024,
-			( z2_count + chip_count ) * Z2RAM_CHUNK1024 );
-
-	    break;
-
-    	    case Z2MINOR_Z2ONLY:
-		z2ram_map = kmalloc( max_z2_map, GFP_KERNEL );
-		if ( z2ram_map == NULL )
-		{
-		    printk( KERN_ERR DEVICE_NAME
-			": cannot get mem for z2ram_map\n" );
-		    goto err_out;
-		}
+			if (index >= m68k_realnum_memory) {
+				printk(KERN_ERR DEVICE_NAME
+				       ": no such entry in z2ram_map\n");
+				goto err_out;
+			}
 
-		get_z2ram();
+			paddr = m68k_memory[index].addr;
+			size = m68k_memory[index].size & ~(Z2RAM_CHUNKSIZE - 1);
 
-		if ( z2ram_size != 0 )
-		    printk( KERN_INFO DEVICE_NAME 
-			": using %iK of Zorro II RAM\n",
-			z2_count * Z2RAM_CHUNK1024 );
+#ifdef __powerpc__
+			/* FIXME: ioremap doesn't build correct memory tables. */
+			{
+				vfree(vmalloc(size));
+			}
 
-	    break;
+			vaddr = (unsigned long)ioremap_wt(paddr, size);
 
-	    case Z2MINOR_CHIPONLY:
-		z2ram_map = kmalloc( max_chip_map, GFP_KERNEL );
-		if ( z2ram_map == NULL )
-		{
-		    printk( KERN_ERR DEVICE_NAME
-			": cannot get mem for z2ram_map\n" );
-		    goto err_out;
+#else
+			vaddr =
+			    (unsigned long)z_remap_nocache_nonser(paddr, size);
+#endif
+			z2ram_map =
+			    kmalloc_array(size / Z2RAM_CHUNKSIZE,
+					  sizeof(z2ram_map[0]), GFP_KERNEL);
+			if (z2ram_map == NULL) {
+				printk(KERN_ERR DEVICE_NAME
+				       ": cannot get mem for z2ram_map\n");
+				goto err_out;
+			}
+
+			while (size) {
+				z2ram_map[z2ram_size++] = vaddr;
+				size -= Z2RAM_CHUNKSIZE;
+				vaddr += Z2RAM_CHUNKSIZE;
+				list_count++;
+			}
+
+			if (z2ram_size != 0)
+				printk(KERN_INFO DEVICE_NAME
+				       ": using %iK List Entry %d Memory\n",
+				       list_count * Z2RAM_CHUNK1024, index);
+		} else
+			switch (device) {
+			case Z2MINOR_COMBINED:
+
+				z2ram_map =
+				    kmalloc(max_z2_map + max_chip_map,
+					    GFP_KERNEL);
+				if (z2ram_map == NULL) {
+					printk(KERN_ERR DEVICE_NAME
+					       ": cannot get mem for z2ram_map\n");
+					goto err_out;
+				}
+
+				get_z2ram();
+				get_chipram();
+
+				if (z2ram_size != 0)
+					printk(KERN_INFO DEVICE_NAME
+					       ": using %iK Zorro II RAM and %iK Chip RAM (Total %dK)\n",
+					       z2_count * Z2RAM_CHUNK1024,
+					       chip_count * Z2RAM_CHUNK1024,
+					       (z2_count +
+						chip_count) * Z2RAM_CHUNK1024);
+
+				break;
+
+			case Z2MINOR_Z2ONLY:
+				z2ram_map = kmalloc(max_z2_map, GFP_KERNEL);
+				if (z2ram_map == NULL) {
+					printk(KERN_ERR DEVICE_NAME
+					       ": cannot get mem for z2ram_map\n");
+					goto err_out;
+				}
+
+				get_z2ram();
+
+				if (z2ram_size != 0)
+					printk(KERN_INFO DEVICE_NAME
+					       ": using %iK of Zorro II RAM\n",
+					       z2_count * Z2RAM_CHUNK1024);
+
+				break;
+
+			case Z2MINOR_CHIPONLY:
+				z2ram_map = kmalloc(max_chip_map, GFP_KERNEL);
+				if (z2ram_map == NULL) {
+					printk(KERN_ERR DEVICE_NAME
+					       ": cannot get mem for z2ram_map\n");
+					goto err_out;
+				}
+
+				get_chipram();
+
+				if (z2ram_size != 0)
+					printk(KERN_INFO DEVICE_NAME
+					       ": using %iK Chip RAM\n",
+					       chip_count * Z2RAM_CHUNK1024);
+
+				break;
+
+			default:
+				rc = -ENODEV;
+				goto err_out;
+
+				break;
+			}
+
+		if (z2ram_size == 0) {
+			printk(KERN_NOTICE DEVICE_NAME
+			       ": no unused ZII/Chip RAM found\n");
+			goto err_out_kfree;
 		}
 
-		get_chipram();
-
-		if ( z2ram_size != 0 )
-		    printk( KERN_INFO DEVICE_NAME 
-			": using %iK Chip RAM\n",
-			chip_count * Z2RAM_CHUNK1024 );
-		    
-	    break;
-
-	    default:
-		rc = -ENODEV;
-		goto err_out;
-	
-	    break;
+		current_device = device;
+		z2ram_size <<= Z2RAM_CHUNKSHIFT;
+		set_capacity(z2ram_gendisk, z2ram_size >> 9);
 	}
 
-	if ( z2ram_size == 0 )
-	{
-	    printk( KERN_NOTICE DEVICE_NAME
-		": no unused ZII/Chip RAM found\n" );
-	    goto err_out_kfree;
-	}
-
-	current_device = device;
-	z2ram_size <<= Z2RAM_CHUNKSHIFT;
-	set_capacity(z2ram_gendisk, z2ram_size >> 9);
-    }
-
-    mutex_unlock(&z2ram_mutex);
-    return 0;
+	mutex_unlock(&z2ram_mutex);
+	return 0;
 
 err_out_kfree:
-    kfree(z2ram_map);
+	kfree(z2ram_map);
 err_out:
-    mutex_unlock(&z2ram_mutex);
-    return rc;
+	mutex_unlock(&z2ram_mutex);
+	return rc;
 }
 
-static void
-z2_release(struct gendisk *disk, fmode_t mode)
+static void z2_release(struct gendisk *disk, fmode_t mode)
 {
-    mutex_lock(&z2ram_mutex);
-    if ( current_device == -1 ) {
-    	mutex_unlock(&z2ram_mutex);
-    	return;
-    }
-    mutex_unlock(&z2ram_mutex);
-    /*
-     * FIXME: unmap memory
-     */
+	mutex_lock(&z2ram_mutex);
+	if (current_device == -1) {
+		mutex_unlock(&z2ram_mutex);
+		return;
+	}
+	mutex_unlock(&z2ram_mutex);
+	/*
+	 * FIXME: unmap memory
+	 */
 }
 
-static const struct block_device_operations z2_fops =
-{
-	.owner		= THIS_MODULE,
-	.open		= z2_open,
-	.release	= z2_release,
+static const struct block_device_operations z2_fops = {
+	.owner = THIS_MODULE,
+	.open = z2_open,
+	.release = z2_release,
 };
 
 static struct kobject *z2_find(dev_t dev, int *part, void *data)
@@ -340,89 +325,83 @@ static struct request_queue *z2_queue;
 static struct blk_mq_tag_set tag_set;
 
 static const struct blk_mq_ops z2_mq_ops = {
-	.queue_rq	= z2_queue_rq,
+	.queue_rq = z2_queue_rq,
 };
 
-static int __init 
-z2_init(void)
+static int __init z2_init(void)
 {
-    int ret;
+	int ret;
 
-    if (!MACH_IS_AMIGA)
-	return -ENODEV;
+	if (!MACH_IS_AMIGA)
+		return -ENODEV;
 
-    ret = -EBUSY;
-    if (register_blkdev(Z2RAM_MAJOR, DEVICE_NAME))
-	goto err;
+	ret = -EBUSY;
+	if (register_blkdev(Z2RAM_MAJOR, DEVICE_NAME))
+		goto err;
 
-    ret = -ENOMEM;
-    z2ram_gendisk = alloc_disk(1);
-    if (!z2ram_gendisk)
-	goto out_disk;
+	ret = -ENOMEM;
+	z2ram_gendisk = alloc_disk(1);
+	if (!z2ram_gendisk)
+		goto out_disk;
 
-    z2_queue = blk_mq_init_sq_queue(&tag_set, &z2_mq_ops, 16,
+	z2_queue = blk_mq_init_sq_queue(&tag_set, &z2_mq_ops, 16,
 					BLK_MQ_F_SHOULD_MERGE);
-    if (IS_ERR(z2_queue)) {
-	ret = PTR_ERR(z2_queue);
-	z2_queue = NULL;
-	goto out_queue;
-    }
+	if (IS_ERR(z2_queue)) {
+		ret = PTR_ERR(z2_queue);
+		z2_queue = NULL;
+		goto out_queue;
+	}
 
-    z2ram_gendisk->major = Z2RAM_MAJOR;
-    z2ram_gendisk->first_minor = 0;
-    z2ram_gendisk->fops = &z2_fops;
-    sprintf(z2ram_gendisk->disk_name, "z2ram");
+	z2ram_gendisk->major = Z2RAM_MAJOR;
+	z2ram_gendisk->first_minor = 0;
+	z2ram_gendisk->fops = &z2_fops;
+	sprintf(z2ram_gendisk->disk_name, "z2ram");
 
-    z2ram_gendisk->queue = z2_queue;
-    add_disk(z2ram_gendisk);
-    blk_register_region(MKDEV(Z2RAM_MAJOR, 0), Z2MINOR_COUNT, THIS_MODULE,
-				z2_find, NULL, NULL);
+	z2ram_gendisk->queue = z2_queue;
+	add_disk(z2ram_gendisk);
+	blk_register_region(MKDEV(Z2RAM_MAJOR, 0), Z2MINOR_COUNT, THIS_MODULE,
+			    z2_find, NULL, NULL);
 
-    return 0;
+	return 0;
 
 out_queue:
-    put_disk(z2ram_gendisk);
+	put_disk(z2ram_gendisk);
 out_disk:
-    unregister_blkdev(Z2RAM_MAJOR, DEVICE_NAME);
+	unregister_blkdev(Z2RAM_MAJOR, DEVICE_NAME);
 err:
-    return ret;
+	return ret;
 }
 
 static void __exit z2_exit(void)
 {
-    int i, j;
-    blk_unregister_region(MKDEV(Z2RAM_MAJOR, 0), Z2MINOR_COUNT);
-    unregister_blkdev(Z2RAM_MAJOR, DEVICE_NAME);
-    del_gendisk(z2ram_gendisk);
-    put_disk(z2ram_gendisk);
-    blk_cleanup_queue(z2_queue);
-    blk_mq_free_tag_set(&tag_set);
-
-    if ( current_device != -1 )
-    {
-	i = 0;
-
-	for ( j = 0 ; j < z2_count; j++ )
-	{
-	    set_bit( i++, zorro_unused_z2ram ); 
-	}
+	int i, j;
+	blk_unregister_region(MKDEV(Z2RAM_MAJOR, 0), Z2MINOR_COUNT);
+	unregister_blkdev(Z2RAM_MAJOR, DEVICE_NAME);
+	del_gendisk(z2ram_gendisk);
+	put_disk(z2ram_gendisk);
+	blk_cleanup_queue(z2_queue);
+	blk_mq_free_tag_set(&tag_set);
+
+	if (current_device != -1) {
+		i = 0;
+
+		for (j = 0; j < z2_count; j++) {
+			set_bit(i++, zorro_unused_z2ram);
+		}
 
-	for ( j = 0 ; j < chip_count; j++ )
-	{
-	    if ( z2ram_map[ i ] )
-	    {
-		amiga_chip_free( (void *) z2ram_map[ i++ ] );
-	    }
-	}
+		for (j = 0; j < chip_count; j++) {
+			if (z2ram_map[i]) {
+				amiga_chip_free((void *)z2ram_map[i++]);
+			}
+		}
 
-	if ( z2ram_map != NULL )
-	{
-	    kfree( z2ram_map );
+		if (z2ram_map != NULL) {
+			kfree(z2ram_map);
+		}
 	}
-    }
 
-    return;
-} 
+	return;
+}
 
 module_init(z2_init);
 module_exit(z2_exit);
-- 
2.29.2



* [PATCH 51/78] z2ram: use separate gendisk for the different modes
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (49 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 50/78] z2ram: reindent Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 52/78] block: switch gendisk lookup to a simple xarray Christoph Hellwig
                   ` (28 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke

Use separate gendisks (which share a tag_set) for the different operating
modes instead of redirecting the gendisk lookup using a probe callback.
This avoids potential problems with aliased block_device instances and
will eventually allow for removing the blk_register_region framework.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/block/z2ram.c | 100 ++++++++++++++++++++++++------------------
 1 file changed, 58 insertions(+), 42 deletions(-)
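
For readers skimming the diff, here is a minimal userspace sketch of
the naming scheme that results from registering one gendisk per minor.
It is only an illustration, not part of the patch: z2ram_disk_name()
and z2ram_all_names() are hypothetical helpers, and the sketch assumes
nothing beyond the sprintf() calls visible in z2ram_register_disk()
below.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define Z2MINOR_COUNT 8		/* value taken from drivers/block/z2ram.c */
#define DISK_NAME_LEN 32	/* size of the kernel's disk_name buffer */

/*
 * Hypothetical helper, not in the patch: formats the name the way
 * z2ram_register_disk() does, "z2ram" for minor 0 and "z2ram<N>"
 * for every other minor.
 */
static void z2ram_disk_name(char *buf, size_t len, int minor)
{
	if (minor)
		snprintf(buf, len, "z2ram%d", minor);
	else
		snprintf(buf, len, "z2ram");
}

/* Fill names[] with one entry per minor, mirroring the loop in z2_init(). */
static void z2ram_all_names(char names[Z2MINOR_COUNT][DISK_NAME_LEN])
{
	int i;

	for (i = 0; i < Z2MINOR_COUNT; i++)
		z2ram_disk_name(names[i], DISK_NAME_LEN, i);
}
```

Calling z2ram_all_names() on a Z2MINOR_COUNT-sized array yields the
name of the disk behind each minor; minor 0 keeps the old "z2ram" name
so the default device node does not change.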

diff --git a/drivers/block/z2ram.c b/drivers/block/z2ram.c
index eafecc9a72b38d..c1d20818e64920 100644
--- a/drivers/block/z2ram.c
+++ b/drivers/block/z2ram.c
@@ -63,7 +63,7 @@ static int current_device = -1;
 
 static DEFINE_SPINLOCK(z2ram_lock);
 
-static struct gendisk *z2ram_gendisk;
+static struct gendisk *z2ram_gendisk[Z2MINOR_COUNT];
 
 static blk_status_t z2_queue_rq(struct blk_mq_hw_ctx *hctx,
 				const struct blk_mq_queue_data *bd)
@@ -283,7 +283,7 @@ static int z2_open(struct block_device *bdev, fmode_t mode)
 
 		current_device = device;
 		z2ram_size <<= Z2RAM_CHUNKSHIFT;
-		set_capacity(z2ram_gendisk, z2ram_size >> 9);
+		set_capacity(z2ram_gendisk[device], z2ram_size >> 9);
 	}
 
 	mutex_unlock(&z2ram_mutex);
@@ -315,71 +315,87 @@ static const struct block_device_operations z2_fops = {
 	.release = z2_release,
 };
 
-static struct kobject *z2_find(dev_t dev, int *part, void *data)
-{
-	*part = 0;
-	return get_disk_and_module(z2ram_gendisk);
-}
-
-static struct request_queue *z2_queue;
 static struct blk_mq_tag_set tag_set;
 
 static const struct blk_mq_ops z2_mq_ops = {
 	.queue_rq = z2_queue_rq,
 };
 
+static int z2ram_register_disk(int minor)
+{
+	struct request_queue *q;
+	struct gendisk *disk;
+
+	disk = alloc_disk(1);
+	if (!disk)
+		return -ENOMEM;
+
+	q = blk_mq_init_queue(&tag_set);
+	if (IS_ERR(q)) {
+		put_disk(disk);
+		return PTR_ERR(q);
+	}
+
+	disk->major = Z2RAM_MAJOR;
+	disk->first_minor = minor;
+	disk->fops = &z2_fops;
+	if (minor)
+		sprintf(disk->disk_name, "z2ram%d", minor);
+	else
+		sprintf(disk->disk_name, "z2ram");
+	disk->queue = q;
+
+	z2ram_gendisk[minor] = disk;
+	add_disk(disk);
+	return 0;
+}
+
 static int __init z2_init(void)
 {
-	int ret;
+	int ret, i;
 
 	if (!MACH_IS_AMIGA)
 		return -ENODEV;
 
-	ret = -EBUSY;
 	if (register_blkdev(Z2RAM_MAJOR, DEVICE_NAME))
-		goto err;
-
-	ret = -ENOMEM;
-	z2ram_gendisk = alloc_disk(1);
-	if (!z2ram_gendisk)
-		goto out_disk;
-
-	z2_queue = blk_mq_init_sq_queue(&tag_set, &z2_mq_ops, 16,
-					BLK_MQ_F_SHOULD_MERGE);
-	if (IS_ERR(z2_queue)) {
-		ret = PTR_ERR(z2_queue);
-		z2_queue = NULL;
-		goto out_queue;
+		return -EBUSY;
+
+	tag_set.ops = &z2_mq_ops;
+	tag_set.nr_hw_queues = 1;
+	tag_set.nr_maps = 1;
+	tag_set.queue_depth = 16;
+	tag_set.numa_node = NUMA_NO_NODE;
+	tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
+	ret = blk_mq_alloc_tag_set(&tag_set);
+	if (ret)
+		goto out_unregister_blkdev;
+
+	for (i = 0; i < Z2MINOR_COUNT; i++) {
+		ret = z2ram_register_disk(i);
+		if (ret && i == 0)
+			goto out_free_tagset;
 	}
 
-	z2ram_gendisk->major = Z2RAM_MAJOR;
-	z2ram_gendisk->first_minor = 0;
-	z2ram_gendisk->fops = &z2_fops;
-	sprintf(z2ram_gendisk->disk_name, "z2ram");
-
-	z2ram_gendisk->queue = z2_queue;
-	add_disk(z2ram_gendisk);
-	blk_register_region(MKDEV(Z2RAM_MAJOR, 0), Z2MINOR_COUNT, THIS_MODULE,
-			    z2_find, NULL, NULL);
-
 	return 0;
 
-out_queue:
-	put_disk(z2ram_gendisk);
-out_disk:
+out_free_tagset:
+	blk_mq_free_tag_set(&tag_set);
+out_unregister_blkdev:
 	unregister_blkdev(Z2RAM_MAJOR, DEVICE_NAME);
-err:
 	return ret;
 }
 
 static void __exit z2_exit(void)
 {
 	int i, j;
-	blk_unregister_region(MKDEV(Z2RAM_MAJOR, 0), Z2MINOR_COUNT);
+
 	unregister_blkdev(Z2RAM_MAJOR, DEVICE_NAME);
-	del_gendisk(z2ram_gendisk);
-	put_disk(z2ram_gendisk);
-	blk_cleanup_queue(z2_queue);
+
+	for (i = 0; i < Z2MINOR_COUNT; i++) {
+		del_gendisk(z2ram_gendisk[i]);
+		blk_cleanup_queue(z2ram_gendisk[i]->queue);
+		put_disk(z2ram_gendisk[i]);
+	}
 	blk_mq_free_tag_set(&tag_set);
 
 	if (current_device != -1) {
-- 
2.29.2



* [PATCH 52/78] block: switch gendisk lookup to a simple xarray
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (50 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 51/78] z2ram: use separate gendisk for the different modes Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 53/78] blk-cgroup: fix a hd_struct leak in blkcg_fill_root_iostats Christoph Hellwig
                   ` (27 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel,
	Hannes Reinecke, Greg Kroah-Hartman

Now that bdev_map is only used for finding gendisks, we can use
a simple xarray instead of the regions tracking structure for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/genhd.c         | 208 ++++++++----------------------------------
 include/linux/genhd.h |   7 --
 2 files changed, 37 insertions(+), 178 deletions(-)
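
As a rough illustration of the new lookup scheme, the following
userspace sketch models one map entry per minor, all pointing at the
owning disk, with the partition number derived from the offset to the
first minor. Everything in it is a stand-in: struct disk_model, the
fixed-size map_model[] array and the *_model() functions are
hypothetical and only approximate what xa_insert(), xa_erase() and
xa_load() do in the diff below. Locking, module autoloading and
reference counting are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAP_SLOTS 64		/* hypothetical size; the real xarray is sparse */

struct disk_model {		/* stand-in for struct gendisk */
	unsigned int devt;	/* device number of the first minor */
	unsigned int minors;	/* number of minors the disk covers */
};

static struct disk_model *map_model[MAP_SLOTS];	/* stand-in for bdev_map */

/*
 * Like blk_register_region(): one entry per minor, all pointing at
 * the same disk.  Simplification: the whole registration is refused
 * if any slot is taken, whereas the patch warns once per failed
 * xa_insert() and carries on.
 */
static int register_region_model(struct disk_model *disk)
{
	unsigned int i;

	if (disk->devt + disk->minors > MAP_SLOTS)
		return -1;
	for (i = 0; i < disk->minors; i++)
		if (map_model[disk->devt + i])
			return -1;
	for (i = 0; i < disk->minors; i++)
		map_model[disk->devt + i] = disk;
	return 0;
}

/* Like blk_unregister_region(); only call after a successful register. */
static void unregister_region_model(struct disk_model *disk)
{
	unsigned int i;

	for (i = 0; i < disk->minors; i++)
		map_model[disk->devt + i] = NULL;
}

/*
 * Like the non-extended branch of get_gendisk(): load the entry for
 * devt, then compute the partition number relative to the first minor.
 */
static struct disk_model *lookup_model(unsigned int devt, int *partno)
{
	struct disk_model *disk;

	if (devt >= MAP_SLOTS)
		return NULL;
	disk = map_model[devt];
	if (disk)
		*partno = (int)(devt - disk->devt);
	return disk;
}
```

With a disk registered at device number 16 covering 4 minors, looking
up 18 returns that disk with a partition number of 2, and looking up
20 returns NULL. No range search or probe callback is involved, which
is the simplification this patch is after.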

diff --git a/block/genhd.c b/block/genhd.c
index dc8690bc281c16..4a224a3c8e1071 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -27,15 +27,7 @@
 
 static struct kobject *block_depr;
 
-struct bdev_map {
-	struct bdev_map *next;
-	dev_t dev;
-	unsigned long range;
-	struct module *owner;
-	struct kobject *(*probe)(dev_t, int *, void *);
-	int (*lock)(dev_t, void *);
-	void *data;
-} *bdev_map[255];
+static DEFINE_XARRAY(bdev_map);
 static DEFINE_MUTEX(bdev_map_lock);
 
 /* for extended dynamic devt allocation, currently only one major is used */
@@ -646,85 +638,26 @@ static char *bdevt_str(dev_t devt, char *buf)
 	return buf;
 }
 
-/*
- * Register device numbers dev..(dev+range-1)
- * range must be nonzero
- * The hash chain is sorted on range, so that subranges can override.
- */
-void blk_register_region(dev_t devt, unsigned long range, struct module *module,
-			 struct kobject *(*probe)(dev_t, int *, void *),
-			 int (*lock)(dev_t, void *), void *data)
-{
-	unsigned n = MAJOR(devt + range - 1) - MAJOR(devt) + 1;
-	unsigned index = MAJOR(devt);
-	unsigned i;
-	struct bdev_map *p;
-
-	n = min(n, 255u);
-	p = kmalloc_array(n, sizeof(struct bdev_map), GFP_KERNEL);
-	if (p == NULL)
-		return;
-
-	for (i = 0; i < n; i++, p++) {
-		p->owner = module;
-		p->probe = probe;
-		p->lock = lock;
-		p->dev = devt;
-		p->range = range;
-		p->data = data;
-	}
+static void blk_register_region(struct gendisk *disk)
+{
+	int i;
 
 	mutex_lock(&bdev_map_lock);
-	for (i = 0, p -= n; i < n; i++, p++, index++) {
-		struct bdev_map **s = &bdev_map[index % 255];
-		while (*s && (*s)->range < range)
-			s = &(*s)->next;
-		p->next = *s;
-		*s = p;
+	for (i = 0; i < disk->minors; i++) {
+		if (xa_insert(&bdev_map, disk_devt(disk) + i, disk, GFP_KERNEL))
+			WARN_ON_ONCE(1);
 	}
 	mutex_unlock(&bdev_map_lock);
 }
-EXPORT_SYMBOL(blk_register_region);
 
-void blk_unregister_region(dev_t devt, unsigned long range)
+static void blk_unregister_region(struct gendisk *disk)
 {
-	unsigned n = MAJOR(devt + range - 1) - MAJOR(devt) + 1;
-	unsigned index = MAJOR(devt);
-	unsigned i;
-	struct bdev_map *found = NULL;
+	int i;
 
 	mutex_lock(&bdev_map_lock);
-	for (i = 0; i < min(n, 255u); i++, index++) {
-		struct bdev_map **s;
-		for (s = &bdev_map[index % 255]; *s; s = &(*s)->next) {
-			struct bdev_map *p = *s;
-			if (p->dev == devt && p->range == range) {
-				*s = p->next;
-				if (!found)
-					found = p;
-				break;
-			}
-		}
-	}
+	for (i = 0; i < disk->minors; i++)
+		xa_erase(&bdev_map, disk_devt(disk) + i);
 	mutex_unlock(&bdev_map_lock);
-	kfree(found);
-}
-EXPORT_SYMBOL(blk_unregister_region);
-
-static struct kobject *exact_match(dev_t devt, int *partno, void *data)
-{
-	struct gendisk *p = data;
-
-	return &disk_to_dev(p)->kobj;
-}
-
-static int exact_lock(dev_t devt, void *data)
-{
-	struct gendisk *p = data;
-
-	if (!get_disk_and_module(p))
-		return -1;
-	return 0;
 }
 
 static void disk_scan_partitions(struct gendisk *disk)
@@ -870,8 +803,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 		ret = bdi_register(bdi, "%u:%u", MAJOR(devt), MINOR(devt));
 		WARN_ON(ret);
 		bdi_set_owner(bdi, dev);
-		blk_register_region(disk_devt(disk), disk->minors, NULL,
-				    exact_match, exact_lock, disk);
+		blk_register_region(disk);
 	}
 	register_disk(parent, disk, groups);
 	if (register_queue)
@@ -984,7 +916,7 @@ void del_gendisk(struct gendisk *disk)
 	blk_unregister_queue(disk);
 	
 	if (!(disk->flags & GENHD_FL_HIDDEN))
-		blk_unregister_region(disk_devt(disk), disk->minors);
+		blk_unregister_region(disk);
 	/*
 	 * Remove gendisk pointer from idr so that it cannot be looked up
 	 * while RCU period before freeing gendisk is running to prevent
@@ -1050,54 +982,22 @@ static void request_gendisk_module(dev_t devt)
 		request_module("block-major-%d", MAJOR(devt));
 }
 
-static struct gendisk *lookup_gendisk(dev_t dev, int *partno)
+static bool get_disk_and_module(struct gendisk *disk)
 {
-	struct kobject *kobj;
-	struct bdev_map *p;
-	unsigned long best = ~0UL;
-
-retry:
-	mutex_lock(&bdev_map_lock);
-	for (p = bdev_map[MAJOR(dev) % 255]; p; p = p->next) {
-		struct kobject *(*probe)(dev_t, int *, void *);
-		struct module *owner;
-		void *data;
-
-		if (p->dev > dev || p->dev + p->range - 1 < dev)
-			continue;
-		if (p->range - 1 >= best)
-			break;
-		if (!try_module_get(p->owner))
-			continue;
-		owner = p->owner;
-		data = p->data;
-		probe = p->probe;
-		best = p->range - 1;
-		*partno = dev - p->dev;
-
-		if (!probe) {
-			mutex_unlock(&bdev_map_lock);
-			module_put(owner);
-			request_gendisk_module(dev);
-			goto retry;
-		}
+	struct module *owner;
 
-		if (p->lock && p->lock(dev, data) < 0) {
-			module_put(owner);
-			continue;
-		}
-		mutex_unlock(&bdev_map_lock);
-		kobj = probe(dev, partno, data);
-		/* Currently ->owner protects _only_ ->probe() itself. */
+	if (!disk->fops)
+		return false;
+	owner = disk->fops->owner;
+	if (owner && !try_module_get(owner))
+		return false;
+	if (!kobject_get_unless_zero(&disk_to_dev(disk)->kobj)) {
 		module_put(owner);
-		if (kobj)
-			return dev_to_disk(kobj_to_dev(kobj));
-		goto retry;
+		return false;
 	}
-	mutex_unlock(&bdev_map_lock);
-	return NULL;
-}
+	return true;
 
+}
 
 /**
  * get_gendisk - get partitioning information for a given device
@@ -1116,7 +1016,19 @@ struct gendisk *get_gendisk(dev_t devt, int *partno)
 	might_sleep();
 
 	if (MAJOR(devt) != BLOCK_EXT_MAJOR) {
-		disk = lookup_gendisk(devt, partno);
+		mutex_lock(&bdev_map_lock);
+		disk = xa_load(&bdev_map, devt);
+		if (!disk) {
+			mutex_unlock(&bdev_map_lock);
+			request_gendisk_module(devt);
+			mutex_lock(&bdev_map_lock);
+			disk = xa_load(&bdev_map, devt);
+		}
+		if (disk && !get_disk_and_module(disk))
+			disk = NULL;
+		if (disk)
+			*partno = devt - disk_devt(disk);
+		mutex_unlock(&bdev_map_lock);
 	} else {
 		struct hd_struct *part;
 
@@ -1320,21 +1232,6 @@ static const struct seq_operations partitions_op = {
 };
 #endif
 
-static void bdev_map_init(void)
-{
-	struct bdev_map *base;
-	int i;
-
-	base = kzalloc(sizeof(*base), GFP_KERNEL);
-	if (!base)
-		panic("cannot allocate bdev_map");
-
-	base->dev = 1;
-	base->range = ~0 ;
-	for (i = 0; i < 255; i++)
-		bdev_map[i] = base;
-}
-
 static int __init genhd_device_init(void)
 {
 	int error;
@@ -1343,7 +1240,6 @@ static int __init genhd_device_init(void)
 	error = class_register(&block_class);
 	if (unlikely(error))
 		return error;
-	bdev_map_init();
 	blk_dev_init();
 
 	register_blkdev(BLOCK_EXT_MAJOR, "blkext");
@@ -1892,35 +1788,6 @@ struct gendisk *__alloc_disk_node(int minors, int node_id)
 }
 EXPORT_SYMBOL(__alloc_disk_node);
 
-/**
- * get_disk_and_module - increments the gendisk and gendisk fops module refcount
- * @disk: the struct gendisk to increment the refcount for
- *
- * This increments the refcount for the struct gendisk, and the gendisk's
- * fops module owner.
- *
- * Context: Any context.
- */
-struct kobject *get_disk_and_module(struct gendisk *disk)
-{
-	struct module *owner;
-	struct kobject *kobj;
-
-	if (!disk->fops)
-		return NULL;
-	owner = disk->fops->owner;
-	if (owner && !try_module_get(owner))
-		return NULL;
-	kobj = kobject_get_unless_zero(&disk_to_dev(disk)->kobj);
-	if (kobj == NULL) {
-		module_put(owner);
-		return NULL;
-	}
-	return kobj;
-
-}
-EXPORT_SYMBOL(get_disk_and_module);
-
 /**
  * put_disk - decrements the gendisk refcount
  * @disk: the struct gendisk to decrement the refcount for
@@ -1957,7 +1824,6 @@ void put_disk_and_module(struct gendisk *disk)
 		module_put(owner);
 	}
 }
-EXPORT_SYMBOL(put_disk_and_module);
 
 static void set_disk_ro_uevent(struct gendisk *gd, int ro)
 {
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 04f6a6bf577a90..46553d6d602563 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -338,15 +338,8 @@ int blk_add_partitions(struct gendisk *disk, struct block_device *bdev);
 int blk_drop_partitions(struct block_device *bdev);
 
 extern struct gendisk *__alloc_disk_node(int minors, int node_id);
-extern struct kobject *get_disk_and_module(struct gendisk *disk);
 extern void put_disk(struct gendisk *disk);
 extern void put_disk_and_module(struct gendisk *disk);
-extern void blk_register_region(dev_t devt, unsigned long range,
-			struct module *module,
-			struct kobject *(*probe)(dev_t, int *, void *),
-			int (*lock)(dev_t, void *),
-			void *data);
-extern void blk_unregister_region(dev_t devt, unsigned long range);
 
 #define alloc_disk_node(minors, node_id)				\
 ({									\
-- 
2.29.2



* [PATCH 53/78] blk-cgroup: fix a hd_struct leak in blkcg_fill_root_iostats
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (51 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 52/78] block: switch gendisk lookup to a simple xarray Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:25   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 54/78] block: remove a duplicate __disk_get_part prototype Christoph Hellwig
                   ` (26 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

disk_get_part needs to be paired with a disk_put_part.
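
The pairing rule can be illustrated with a minimal user-space sketch. Everything named sketch_* below is hypothetical and only models the reference count; it is not the kernel's hd_struct code. The point is that every successful "get" inside the loop is balanced by a "put" before the next iteration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted partition object. */
struct sketch_part {
	int refs;
};

/* Take a reference on partition partno, or return NULL if out of range. */
static struct sketch_part *sketch_get_part(struct sketch_part *parts, int nr,
					   int partno)
{
	if (partno < 0 || partno >= nr)
		return NULL;
	parts[partno].refs++;
	return &parts[partno];
}

/* Drop a reference taken by sketch_get_part(). */
static void sketch_put_part(struct sketch_part *part)
{
	if (part)
		part->refs--;
}

/* Visit every partition; each get is paired with a put, so nothing leaks. */
static int sketch_visit_all(struct sketch_part *parts, int nr)
{
	int visited = 0;

	for (int i = 0; i < nr; i++) {
		struct sketch_part *part = sketch_get_part(parts, nr, i);

		if (!part)
			continue;
		visited++;		/* stand-in for the per-partition work */
		sketch_put_part(part);	/* the call this patch adds */
	}
	return visited;
}
```

Without the sketch_put_part() call, every pass over the loop would leave each reference count one higher than before, which is the kind of leak the patch fixes.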

Fixes: ef45fe470e1 ("blk-cgroup: show global disk stats in root cgroup io.stat")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-cgroup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index c68bdf58c9a6e1..54fbe1e80cc41a 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -849,6 +849,7 @@ static void blkcg_fill_root_iostats(void)
 			blkg_iostat_set(&blkg->iostat.cur, &tmp);
 			u64_stats_update_end(&blkg->iostat.sync);
 		}
+		disk_put_part(part);
 	}
 }
 
-- 
2.29.2



* [PATCH 54/78] block: remove a duplicate __disk_get_part prototype
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (52 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 53/78] blk-cgroup: fix a hd_struct leak in blkcg_fill_root_iostats Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:25   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 55/78] block: change the hash used for looking up block devices Christoph Hellwig
                   ` (25 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/genhd.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 46553d6d602563..22f5b9fd96f8bf 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -250,7 +250,6 @@ static inline dev_t part_devt(struct hd_struct *part)
 	return part_to_dev(part)->devt;
 }
 
-extern struct hd_struct *__disk_get_part(struct gendisk *disk, int partno);
 extern struct hd_struct *disk_get_part(struct gendisk *disk, int partno);
 
 static inline void disk_put_part(struct hd_struct *part)
-- 
2.29.2



* [PATCH 55/78] block: change the hash used for looking up block devices
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (53 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 54/78] block: remove a duplicate __disk_get_part prototype Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:26   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 56/78] init: refactor name_to_dev_t Christoph Hellwig
                   ` (24 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Adding the minor to the major creates tons of pointless hash collisions.
Just use the dev_t itself, which is 32 bits wide and thus guaranteed to
fit into ino_t.
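
A small user-space sketch of why the old hash collides. The SKETCH_* macros are stand-ins that mirror the kernel's 20-bit minor layout, and old_hash/new_hash are illustrative names only: any two device numbers whose major and minor sum to the same value land on the same key, while the dev_t itself is unique per device.

```c
#include <stdint.h>

/* User-space stand-ins for the kernel's dev_t helpers (20 minor bits). */
#define SKETCH_MINORBITS	20
#define SKETCH_MINORMASK	((1U << SKETCH_MINORBITS) - 1)
#define SKETCH_MKDEV(ma, mi)	(((uint32_t)(ma) << SKETCH_MINORBITS) | (uint32_t)(mi))
#define SKETCH_MAJOR(dev)	((uint32_t)(dev) >> SKETCH_MINORBITS)
#define SKETCH_MINOR(dev)	((uint32_t)(dev) & SKETCH_MINORMASK)

/* The removed scheme: distinct devices with equal major+minor sums collide. */
static unsigned long old_hash(uint32_t dev)
{
	return SKETCH_MAJOR(dev) + SKETCH_MINOR(dev);
}

/* The replacement scheme: the device number itself is the key. */
static unsigned long new_hash(uint32_t dev)
{
	return dev;
}
```

For example, 8:1 and 9:0 both hash to 9 under the old scheme but keep distinct keys under the new one.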

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/block_dev.c | 26 ++------------------------
 1 file changed, 2 insertions(+), 24 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index d8664f5c1ff669..29db12c3bb501c 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -870,35 +870,12 @@ void __init bdev_cache_init(void)
 	blockdev_superblock = bd_mnt->mnt_sb;   /* For writeback */
 }
 
-/*
- * Most likely _very_ bad one - but then it's hardly critical for small
- * /dev and can be fixed when somebody will need really large one.
- * Keep in mind that it will be fed through icache hash function too.
- */
-static inline unsigned long hash(dev_t dev)
-{
-	return MAJOR(dev)+MINOR(dev);
-}
-
-static int bdev_test(struct inode *inode, void *data)
-{
-	return BDEV_I(inode)->bdev.bd_dev == *(dev_t *)data;
-}
-
-static int bdev_set(struct inode *inode, void *data)
-{
-	BDEV_I(inode)->bdev.bd_dev = *(dev_t *)data;
-	return 0;
-}
-
 static struct block_device *bdget(dev_t dev)
 {
 	struct block_device *bdev;
 	struct inode *inode;
 
-	inode = iget5_locked(blockdev_superblock, hash(dev),
-			bdev_test, bdev_set, &dev);
-
+	inode = iget_locked(blockdev_superblock, dev);
 	if (!inode)
 		return NULL;
 
@@ -910,6 +887,7 @@ static struct block_device *bdget(dev_t dev)
 		bdev->bd_super = NULL;
 		bdev->bd_inode = inode;
 		bdev->bd_part_count = 0;
+		bdev->bd_dev = dev;
 		inode->i_mode = S_IFBLK;
 		inode->i_rdev = dev;
 		inode->i_bdev = bdev;
-- 
2.29.2



* [PATCH 56/78] init: refactor name_to_dev_t
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (54 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 55/78] block: change the hash used for looking up block devices Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:31   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 57/78] init: refactor devt_from_partuuid Christoph Hellwig
                   ` (23 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Split each case into a self-contained helper.
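
As an illustration of one such helper, here is a user-space sketch of the "<major>:<minor>" parsing used by devt_from_devnum in the diff below. The function name sketch_parse_devnum is hypothetical, and the hexadecimal fallback branch is left out. The trailing %c conversion is the trick: if it matches, there was unexpected text after the numbers, so only a return value of exactly 2 (or 3 for the offset form) is accepted.

```c
#include <stdio.h>

/*
 * Return 1 if name is exactly "<maj>:<min>" or "<maj>:<min>:<offset>:",
 * storing the numbers in *maj and *min; return 0 otherwise.
 * Note that *maj and *min may be written even when parsing fails.
 */
static int sketch_parse_devnum(const char *name, unsigned *maj, unsigned *min)
{
	unsigned offset;
	char dummy;

	/* A count of 2 means both numbers parsed and nothing followed them. */
	if (sscanf(name, "%u:%u%c", maj, min, &dummy) == 2)
		return 1;
	/* A count of 3 means the offset form parsed with nothing after it. */
	if (sscanf(name, "%u:%u:%u:%c", maj, min, &offset, &dummy) == 3)
		return 1;
	return 0;
}
```

Because the helper is self-contained, it can be exercised on its own with strings such as "8:1" (accepted) or "8:1x" (rejected).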

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/genhd.h |   7 +-
 init/do_mounts.c      | 183 +++++++++++++++++++++---------------------
 2 files changed, 91 insertions(+), 99 deletions(-)

diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 22f5b9fd96f8bf..ca5e356084c353 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -388,18 +388,13 @@ static inline void bd_unlink_disk_holder(struct block_device *bdev,
 }
 #endif /* CONFIG_SYSFS */
 
+dev_t blk_lookup_devt(const char *name, int partno);
 #ifdef CONFIG_BLOCK
 void printk_all_partitions(void);
-dev_t blk_lookup_devt(const char *name, int partno);
 #else /* CONFIG_BLOCK */
 static inline void printk_all_partitions(void)
 {
 }
-static inline dev_t blk_lookup_devt(const char *name, int partno)
-{
-	dev_t devt = MKDEV(0, 0);
-	return devt;
-}
 #endif /* CONFIG_BLOCK */
 
 #endif /* _LINUX_GENHD_H */
diff --git a/init/do_mounts.c b/init/do_mounts.c
index b5f9604d0c98a2..aef2f24461c7f1 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -90,7 +90,6 @@ static int match_dev_by_uuid(struct device *dev, const void *data)
 	return 0;
 }
 
-
 /**
  * devt_from_partuuid - looks up the dev_t of a partition by its UUID
  * @uuid_str:	char array containing ascii UUID
@@ -186,7 +185,83 @@ static int match_dev_by_label(struct device *dev, const void *data)
 
 	return 0;
 }
-#endif
+
+static dev_t devt_from_partlabel(const char *label)
+{
+	struct device *dev;
+	dev_t devt = 0;
+
+	dev = class_find_device(&block_class, NULL, label, &match_dev_by_label);
+	if (dev) {
+		devt = dev->devt;
+		put_device(dev);
+	}
+
+	return devt;
+}
+
+static dev_t devt_from_devname(const char *name)
+{
+	dev_t devt = 0;
+	int part;
+	char s[32];
+	char *p;
+
+	if (strlen(name) > 31)
+		return 0;
+	strcpy(s, name);
+	for (p = s; *p; p++) {
+		if (*p == '/')
+			*p = '!';
+	}
+
+	devt = blk_lookup_devt(s, 0);
+	if (devt)
+		return devt;
+
+	/*
+	 * Try non-existent, but valid partition, which may only exist after
+	 * opening the device, like partitioned md devices.
+	 */
+	while (p > s && isdigit(p[-1]))
+		p--;
+	if (p == s || !*p || *p == '0')
+		return 0;
+
+	/* try disk name without <part number> */
+	part = simple_strtoul(p, NULL, 10);
+	*p = '\0';
+	devt = blk_lookup_devt(s, part);
+	if (devt)
+		return devt;
+
+	/* try disk name without p<part number> */
+	if (p < s + 2 || !isdigit(p[-2]) || p[-1] != 'p')
+		return 0;
+	p[-1] = '\0';
+	return blk_lookup_devt(s, part);
+}
+#endif /* CONFIG_BLOCK */
+
+static dev_t devt_from_devnum(const char *name)
+{
+	unsigned maj, min, offset;
+	dev_t devt = 0;
+	char *p, dummy;
+
+	if (sscanf(name, "%u:%u%c", &maj, &min, &dummy) == 2 ||
+	    sscanf(name, "%u:%u:%u:%c", &maj, &min, &offset, &dummy) == 3) {
+		devt = MKDEV(maj, min);
+		if (maj != MAJOR(devt) || min != MINOR(devt))
+			return 0;
+	} else {
+		devt = new_decode_dev(simple_strtoul(name, &p, 16));
+		if (*p)
+			return 0;
+	}
+
+	return devt;
+}
 
 /*
  *	Convert a name into device number.  We accept the following variants:
@@ -218,101 +293,23 @@ static int match_dev_by_label(struct device *dev, const void *data)
  *	name contains slashes, the device name has them replaced with
  *	bangs.
  */
-
 dev_t name_to_dev_t(const char *name)
 {
-	char s[32];
-	char *p;
-	dev_t res = 0;
-	int part;
-
+	if (strcmp(name, "/dev/nfs") == 0)
+		return Root_NFS;
+	if (strcmp(name, "/dev/cifs") == 0)
+		return Root_CIFS;
+	if (strcmp(name, "/dev/ram") == 0)
+		return Root_RAM0;
 #ifdef CONFIG_BLOCK
-	if (strncmp(name, "PARTUUID=", 9) == 0) {
-		name += 9;
-		res = devt_from_partuuid(name);
-		if (!res)
-			goto fail;
-		goto done;
-	} else if (strncmp(name, "PARTLABEL=", 10) == 0) {
-		struct device *dev;
-
-		dev = class_find_device(&block_class, NULL, name + 10,
-					&match_dev_by_label);
-		if (!dev)
-			goto fail;
-
-		res = dev->devt;
-		put_device(dev);
-		goto done;
-	}
+	if (strncmp(name, "PARTUUID=", 9) == 0)
+		return devt_from_partuuid(name + 9);
+	if (strncmp(name, "PARTLABEL=", 10) == 0)
+		return devt_from_partlabel(name + 10);
+	if (strncmp(name, "/dev/", 5) == 0)
+		return devt_from_devname(name + 5);
 #endif
-
-	if (strncmp(name, "/dev/", 5) != 0) {
-		unsigned maj, min, offset;
-		char dummy;
-
-		if ((sscanf(name, "%u:%u%c", &maj, &min, &dummy) == 2) ||
-		    (sscanf(name, "%u:%u:%u:%c", &maj, &min, &offset, &dummy) == 3)) {
-			res = MKDEV(maj, min);
-			if (maj != MAJOR(res) || min != MINOR(res))
-				goto fail;
-		} else {
-			res = new_decode_dev(simple_strtoul(name, &p, 16));
-			if (*p)
-				goto fail;
-		}
-		goto done;
-	}
-
-	name += 5;
-	res = Root_NFS;
-	if (strcmp(name, "nfs") == 0)
-		goto done;
-	res = Root_CIFS;
-	if (strcmp(name, "cifs") == 0)
-		goto done;
-	res = Root_RAM0;
-	if (strcmp(name, "ram") == 0)
-		goto done;
-
-	if (strlen(name) > 31)
-		goto fail;
-	strcpy(s, name);
-	for (p = s; *p; p++)
-		if (*p == '/')
-			*p = '!';
-	res = blk_lookup_devt(s, 0);
-	if (res)
-		goto done;
-
-	/*
-	 * try non-existent, but valid partition, which may only exist
-	 * after revalidating the disk, like partitioned md devices
-	 */
-	while (p > s && isdigit(p[-1]))
-		p--;
-	if (p == s || !*p || *p == '0')
-		goto fail;
-
-	/* try disk name without <part number> */
-	part = simple_strtoul(p, NULL, 10);
-	*p = '\0';
-	res = blk_lookup_devt(s, part);
-	if (res)
-		goto done;
-
-	/* try disk name without p<part number> */
-	if (p < s + 2 || !isdigit(p[-2]) || p[-1] != 'p')
-		goto fail;
-	p[-1] = '\0';
-	res = blk_lookup_devt(s, part);
-	if (res)
-		goto done;
-
-fail:
-	return 0;
-done:
-	return res;
+	return devt_from_devnum(name);
 }
 EXPORT_SYMBOL_GPL(name_to_dev_t);
 
-- 
2.29.2



* [PATCH 57/78] init: refactor devt_from_partuuid
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (55 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 56/78] init: refactor name_to_dev_t Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:33   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 58/78] init: cleanup match_dev_by_uuid and match_dev_by_label Christoph Hellwig
                   ` (22 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

The code in devt_from_partuuid is very convoluted.  Refactor a bit by
sanitizing the goto and variable name usage.
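
The resulting control flow can be sketched in user space as follows. The name sketch_split_partuuid is hypothetical, and the sketch assumes the slash is located with strchr as in the existing function; it only covers the syntax check, not the device lookup. All syntax errors funnel into a single label, while the success path returns directly.

```c
#include <stdio.h>
#include <string.h>

/*
 * Split "<uuid>[/PARTNROFF=<n>]" into the UUID length and the offset.
 * Returns 0 on success and -1 on invalid syntax.
 */
static int sketch_split_partuuid(const char *str, size_t *uuid_len, int *offset)
{
	const char *slash = strchr(str, '/');
	char c = 0;

	*offset = 0;
	if (slash) {
		/* Exactly one conversion: a number with nothing after it. */
		if (sscanf(slash + 1, "PARTNROFF=%d%c", offset, &c) != 1)
			goto invalid;
		*uuid_len = (size_t)(slash - str);
	} else {
		*uuid_len = strlen(str);
	}

	if (!*uuid_len)
		goto invalid;
	return 0;

invalid:
	/* Single error exit, mirroring the clear_root_wait label. */
	fprintf(stderr, "invalid PARTUUID syntax: %s\n", str);
	return -1;
}
```

Compared with carrying a flag to a shared "done" label, the dedicated error label keeps the reporting code out of the success path.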

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 init/do_mounts.c | 68 ++++++++++++++++++++++--------------------------
 1 file changed, 31 insertions(+), 37 deletions(-)

diff --git a/init/do_mounts.c b/init/do_mounts.c
index aef2f24461c7f1..afa26a4028d25e 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -105,13 +105,10 @@ static int match_dev_by_uuid(struct device *dev, const void *data)
  */
 static dev_t devt_from_partuuid(const char *uuid_str)
 {
-	dev_t res = 0;
 	struct uuidcmp cmp;
 	struct device *dev = NULL;
-	struct gendisk *disk;
-	struct hd_struct *part;
+	dev_t devt = 0;
 	int offset = 0;
-	bool clear_root_wait = false;
 	char *slash;
 
 	cmp.uuid = uuid_str;
@@ -120,52 +117,49 @@ static dev_t devt_from_partuuid(const char *uuid_str)
 	/* Check for optional partition number offset attributes. */
 	if (slash) {
 		char c = 0;
+
 		/* Explicitly fail on poor PARTUUID syntax. */
-		if (sscanf(slash + 1,
-			   "PARTNROFF=%d%c", &offset, &c) != 1) {
-			clear_root_wait = true;
-			goto done;
-		}
+		if (sscanf(slash + 1, "PARTNROFF=%d%c", &offset, &c) != 1)
+			goto clear_root_wait;
 		cmp.len = slash - uuid_str;
 	} else {
 		cmp.len = strlen(uuid_str);
 	}
 
-	if (!cmp.len) {
-		clear_root_wait = true;
-		goto done;
-	}
+	if (!cmp.len)
+		goto clear_root_wait;
 
-	dev = class_find_device(&block_class, NULL, &cmp,
-				&match_dev_by_uuid);
+	dev = class_find_device(&block_class, NULL, &cmp, &match_dev_by_uuid);
 	if (!dev)
-		goto done;
-
-	res = dev->devt;
+		return 0;
 
-	/* Attempt to find the partition by offset. */
-	if (!offset)
-		goto no_offset;
+	if (offset) {
+		/*
+		 * Attempt to find the requested partition by adding an offset
+		 * to the partition number found by UUID.
+		 */
+		struct hd_struct *part;
 
-	res = 0;
-	disk = part_to_disk(dev_to_part(dev));
-	part = disk_get_part(disk, dev_to_part(dev)->partno + offset);
-	if (part) {
-		res = part_devt(part);
-		put_device(part_to_dev(part));
+		part = disk_get_part(dev_to_disk(dev),
+				     dev_to_part(dev)->partno + offset);
+		if (part) {
+			devt = part_devt(part);
+			put_device(part_to_dev(part));
+		}
+	} else {
+		devt = dev->devt;
 	}
 
-no_offset:
 	put_device(dev);
-done:
-	if (clear_root_wait) {
-		pr_err("VFS: PARTUUID= is invalid.\n"
-		       "Expected PARTUUID=<valid-uuid-id>[/PARTNROFF=%%d]\n");
-		if (root_wait)
-			pr_err("Disabling rootwait; root= is invalid.\n");
-		root_wait = 0;
-	}
-	return res;
+	return devt;
+
+clear_root_wait:
+	pr_err("VFS: PARTUUID= is invalid.\n"
+	       "Expected PARTUUID=<valid-uuid-id>[/PARTNROFF=%%d]\n");
+	if (root_wait)
+		pr_err("Disabling rootwait; root= is invalid.\n");
+	root_wait = 0;
+	return 0;
 }
 
 /**
-- 
2.29.2



* [PATCH 58/78] init: cleanup match_dev_by_uuid and match_dev_by_label
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (56 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 57/78] init: refactor devt_from_partuuid Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:34   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 59/78] mtip32xx: remove the call to fsync_bdev on removal Christoph Hellwig
                   ` (21 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Avoid a totally pointless goto label, and use the same style of
comparison for both helpers.
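
A user-space sketch of the shared style, with a hypothetical sketch_info structure standing in for the partition metadata: both helpers reject first (missing info or a non-matching string) and fall through to the match.

```c
#include <string.h>
#include <strings.h>	/* strncasecmp (POSIX) */

/* Hypothetical stand-in for the partition metadata being matched. */
struct sketch_info {
	const char *uuid;
	const char *volname;
};

/* Case-insensitive match on the first len bytes of the UUID. */
static int sketch_match_uuid(const struct sketch_info *info, const char *uuid,
			     size_t len)
{
	if (!info || strncasecmp(uuid, info->uuid, len))
		return 0;
	return 1;
}

/* Exact match on the volume label. */
static int sketch_match_label(const struct sketch_info *info, const char *label)
{
	if (!info || strcmp(label, info->volname))
		return 0;
	return 1;
}
```

Both functions now read the same way: one combined rejection test, then the positive return.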

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 init/do_mounts.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/init/do_mounts.c b/init/do_mounts.c
index afa26a4028d25e..5879edf083b318 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -79,15 +79,10 @@ static int match_dev_by_uuid(struct device *dev, const void *data)
 	const struct uuidcmp *cmp = data;
 	struct hd_struct *part = dev_to_part(dev);
 
-	if (!part->info)
-		goto no_match;
-
-	if (strncasecmp(cmp->uuid, part->info->uuid, cmp->len))
-		goto no_match;
-
+	if (!part->info ||
+	    strncasecmp(cmp->uuid, part->info->uuid, cmp->len))
+		return 0;
 	return 1;
-no_match:
-	return 0;
 }
 
 /**
@@ -174,10 +169,9 @@ static int match_dev_by_label(struct device *dev, const void *data)
 	const char *label = data;
 	struct hd_struct *part = dev_to_part(dev);
 
-	if (part->info && !strcmp(label, part->info->volname))
-		return 1;
-
-	return 0;
+	if (!part->info || strcmp(label, part->info->volname))
+		return 0;
+	return 1;
 }
 
 static dev_t devt_from_partlabel(const char *label)
-- 
2.29.2



* [PATCH 59/78] mtip32xx: remove the call to fsync_bdev on removal
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (57 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 58/78] init: cleanup match_dev_by_uuid and match_dev_by_label Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:35   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 60/78] zram: remove the claim mechanism Christoph Hellwig
                   ` (20 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

del_gendisk already calls fsync_bdev for every partition, so there is
no need to do this twice.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/mtip32xx/mtip32xx.c | 15 ---------------
 drivers/block/mtip32xx/mtip32xx.h |  2 --
 2 files changed, 17 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 153e2cdecb4d40..53ac59d19ae530 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3687,7 +3687,6 @@ static int mtip_block_initialize(struct driver_data *dd)
 	/* Enable the block device and add it to /dev */
 	device_add_disk(&dd->pdev->dev, dd->disk, NULL);
 
-	dd->bdev = bdget_disk(dd->disk, 0);
 	/*
 	 * Now that the disk is active, initialize any sysfs attributes
 	 * managed by the protocol layer.
@@ -3721,9 +3720,6 @@ static int mtip_block_initialize(struct driver_data *dd)
 	return rv;
 
 kthread_run_error:
-	bdput(dd->bdev);
-	dd->bdev = NULL;
-
 	/* Delete our gendisk. This also removes the device from /dev */
 	del_gendisk(dd->disk);
 
@@ -3804,14 +3800,6 @@ static int mtip_block_remove(struct driver_data *dd)
 	blk_mq_tagset_busy_iter(&dd->tags, mtip_no_dev_cleanup, dd);
 	blk_mq_unquiesce_queue(dd->queue);
 
-	/*
-	 * Delete our gendisk structure. This also removes the device
-	 * from /dev
-	 */
-	if (dd->bdev) {
-		bdput(dd->bdev);
-		dd->bdev = NULL;
-	}
 	if (dd->disk) {
 		if (test_bit(MTIP_DDF_INIT_DONE_BIT, &dd->dd_flag))
 			del_gendisk(dd->disk);
@@ -4206,9 +4194,6 @@ static void mtip_pci_remove(struct pci_dev *pdev)
 	} while (atomic_read(&dd->irq_workers_active) != 0 &&
 		time_before(jiffies, to));
 
-	if (!dd->sr)
-		fsync_bdev(dd->bdev);
-
 	if (atomic_read(&dd->irq_workers_active) != 0) {
 		dev_warn(&dd->pdev->dev,
 			"Completion workers still active!\n");
diff --git a/drivers/block/mtip32xx/mtip32xx.h b/drivers/block/mtip32xx/mtip32xx.h
index e22a7f0523bf30..88f4206310e4c8 100644
--- a/drivers/block/mtip32xx/mtip32xx.h
+++ b/drivers/block/mtip32xx/mtip32xx.h
@@ -463,8 +463,6 @@ struct driver_data {
 
 	int isr_binding;
 
-	struct block_device *bdev;
-
 	struct list_head online_list; /* linkage for online list */
 
 	struct list_head remove_list; /* linkage for removing list */
-- 
2.29.2



* [PATCH 60/78] zram: remove the claim mechanism
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (58 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 59/78] mtip32xx: remove the call to fsync_bdev on removal Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:37   ` Hannes Reinecke
  2020-11-26  1:11   ` Minchan Kim
  2020-11-16 14:57 ` [PATCH 61/78] zram: do not call set_blocksize Christoph Hellwig
                   ` (19 subsequent siblings)
  79 siblings, 2 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

The zram claim mechanism was added to ensure no new opens come in
during teardown.  But the proper way to achieve that is to call
del_gendisk first, which takes care of all that.  Once del_gendisk
is called in the right place, the reset side can also be simplified
as no I/O can be outstanding on a block device that is not open.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/zram/zram_drv.c | 76 ++++++++++-------------------------
 1 file changed, 21 insertions(+), 55 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 6d15d51cee2b7e..3641434a9b154d 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1756,64 +1756,33 @@ static ssize_t disksize_store(struct device *dev,
 static ssize_t reset_store(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t len)
 {
-	int ret;
-	unsigned short do_reset;
-	struct zram *zram;
+	struct zram *zram = dev_to_zram(dev);
 	struct block_device *bdev;
+	unsigned short do_reset;
+	int ret = 0;
 
 	ret = kstrtou16(buf, 10, &do_reset);
 	if (ret)
 		return ret;
-
 	if (!do_reset)
 		return -EINVAL;
 
-	zram = dev_to_zram(dev);
 	bdev = bdget_disk(zram->disk, 0);
 	if (!bdev)
 		return -ENOMEM;
 
 	mutex_lock(&bdev->bd_mutex);
-	/* Do not reset an active device or claimed device */
-	if (bdev->bd_openers || zram->claim) {
-		mutex_unlock(&bdev->bd_mutex);
-		bdput(bdev);
-		return -EBUSY;
-	}
-
-	/* From now on, anyone can't open /dev/zram[0-9] */
-	zram->claim = true;
+	if (bdev->bd_openers)
+		ret = -EBUSY;
+	else
+		zram_reset_device(zram);
 	mutex_unlock(&bdev->bd_mutex);
-
-	/* Make sure all the pending I/O are finished */
-	fsync_bdev(bdev);
-	zram_reset_device(zram);
 	bdput(bdev);
 
-	mutex_lock(&bdev->bd_mutex);
-	zram->claim = false;
-	mutex_unlock(&bdev->bd_mutex);
-
-	return len;
-}
-
-static int zram_open(struct block_device *bdev, fmode_t mode)
-{
-	int ret = 0;
-	struct zram *zram;
-
-	WARN_ON(!mutex_is_locked(&bdev->bd_mutex));
-
-	zram = bdev->bd_disk->private_data;
-	/* zram was claimed to reset so open request fails */
-	if (zram->claim)
-		ret = -EBUSY;
-
-	return ret;
+	return ret ? ret : len;
 }
 
 static const struct block_device_operations zram_devops = {
-	.open = zram_open,
 	.submit_bio = zram_submit_bio,
 	.swap_slot_free_notify = zram_slot_free_notify,
 	.rw_page = zram_rw_page,
@@ -1821,7 +1790,6 @@ static const struct block_device_operations zram_devops = {
 };
 
 static const struct block_device_operations zram_wb_devops = {
-	.open = zram_open,
 	.submit_bio = zram_submit_bio,
 	.swap_slot_free_notify = zram_slot_free_notify,
 	.owner = THIS_MODULE
@@ -1972,34 +1940,32 @@ static int zram_add(void)
 	return ret;
 }
 
-static int zram_remove(struct zram *zram)
+static bool zram_busy(struct zram *zram)
 {
 	struct block_device *bdev;
+	bool busy = false;
 
 	bdev = bdget_disk(zram->disk, 0);
-	if (!bdev)
-		return -ENOMEM;
-
-	mutex_lock(&bdev->bd_mutex);
-	if (bdev->bd_openers || zram->claim) {
-		mutex_unlock(&bdev->bd_mutex);
+	if (bdev) {
+		if (bdev->bd_openers)
+			busy = true;
 		bdput(bdev);
-		return -EBUSY;
 	}
 
-	zram->claim = true;
-	mutex_unlock(&bdev->bd_mutex);
+	return busy;
+}
 
-	zram_debugfs_unregister(zram);
+static int zram_remove(struct zram *zram)
+{
+	if (zram_busy(zram))
+		return -EBUSY;
 
-	/* Make sure all the pending I/O are finished */
-	fsync_bdev(bdev);
+	del_gendisk(zram->disk);
+	zram_debugfs_unregister(zram);
 	zram_reset_device(zram);
-	bdput(bdev);
 
 	pr_info("Removed device: %s\n", zram->disk->disk_name);
 
-	del_gendisk(zram->disk);
 	blk_cleanup_queue(zram->disk->queue);
 	put_disk(zram->disk);
 	kfree(zram);
-- 
2.29.2



* [PATCH 61/78] zram: do not call set_blocksize
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (59 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 60/78] zram: remove the claim mechanism Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:38   ` Hannes Reinecke
  2020-11-26  1:16   ` Minchan Kim
  2020-11-16 14:57 ` [PATCH 62/78] loop: " Christoph Hellwig
                   ` (18 subsequent siblings)
  79 siblings, 2 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

set_blocksize is used by file systems to use their preferred buffer cache
block size.  Block drivers should not set it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/zram/zram_drv.c | 11 +----------
 drivers/block/zram/zram_drv.h |  1 -
 2 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 3641434a9b154d..d00b5761ec0b21 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -403,13 +403,10 @@ static void reset_bdev(struct zram *zram)
 		return;
 
 	bdev = zram->bdev;
-	if (zram->old_block_size)
-		set_blocksize(bdev, zram->old_block_size);
 	blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
 	/* hope filp_close flush all of IO */
 	filp_close(zram->backing_dev, NULL);
 	zram->backing_dev = NULL;
-	zram->old_block_size = 0;
 	zram->bdev = NULL;
 	zram->disk->fops = &zram_devops;
 	kvfree(zram->bitmap);
@@ -454,7 +451,7 @@ static ssize_t backing_dev_store(struct device *dev,
 	struct file *backing_dev = NULL;
 	struct inode *inode;
 	struct address_space *mapping;
-	unsigned int bitmap_sz, old_block_size = 0;
+	unsigned int bitmap_sz;
 	unsigned long nr_pages, *bitmap = NULL;
 	struct block_device *bdev = NULL;
 	int err;
@@ -509,14 +506,8 @@ static ssize_t backing_dev_store(struct device *dev,
 		goto out;
 	}
 
-	old_block_size = block_size(bdev);
-	err = set_blocksize(bdev, PAGE_SIZE);
-	if (err)
-		goto out;
-
 	reset_bdev(zram);
 
-	zram->old_block_size = old_block_size;
 	zram->bdev = bdev;
 	zram->backing_dev = backing_dev;
 	zram->bitmap = bitmap;
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index f2fd46daa76045..712354a4207c77 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -118,7 +118,6 @@ struct zram {
 	bool wb_limit_enable;
 	u64 bd_wb_limit;
 	struct block_device *bdev;
-	unsigned int old_block_size;
 	unsigned long *bitmap;
 	unsigned long nr_pages;
 #endif
-- 
2.29.2



* [PATCH 62/78] loop: do not call set_blocksize
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (60 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 61/78] zram: do not call set_blocksize Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:38   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 63/78] bcache: remove a superflous lookup_bdev all Christoph Hellwig
                   ` (17 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

set_blocksize is used by file systems to select their preferred buffer cache
block size.  Block drivers should not call it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/loop.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 9a27d4f1c08aac..b42c728620c9e4 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1164,9 +1164,6 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 	size = get_loop_size(lo, file);
 	loop_set_size(lo, size);
 
-	set_blocksize(bdev, S_ISBLK(inode->i_mode) ?
-		      block_size(inode->i_bdev) : PAGE_SIZE);
-
 	lo->lo_state = Lo_bound;
 	if (part_shift)
 		lo->lo_flags |= LO_FLAGS_PARTSCAN;
-- 
2.29.2



* [PATCH 63/78] bcache: remove a superflous lookup_bdev all
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (61 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 62/78] loop: " Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-16 14:57 ` [PATCH 64/78] dm: simplify flush_bio initialization in __send_empty_flush Christoph Hellwig
                   ` (16 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Don't bother calling lookup_bdev just to produce a slightly different
error message; doing so makes no functional difference.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/bcache/super.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 46a00134a36ae1..d36ccdda16ed2e 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -2538,15 +2538,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 				  sb);
 	if (IS_ERR(bdev)) {
 		if (bdev == ERR_PTR(-EBUSY)) {
-			bdev = lookup_bdev(strim(path));
-			mutex_lock(&bch_register_lock);
-			if (!IS_ERR(bdev) && bch_is_open(bdev))
-				err = "device already registered";
-			else
-				err = "device busy";
-			mutex_unlock(&bch_register_lock);
-			if (!IS_ERR(bdev))
-				bdput(bdev);
+			err = "device busy";
 			if (attr == &ksysfs_register_quiet)
 				goto done;
 		}
-- 
2.29.2



* [PATCH 64/78] dm: simplify flush_bio initialization in __send_empty_flush
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (62 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 63/78] bcache: remove a superflous lookup_bdev all Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:41   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 65/78] dm: remove the block_device reference in struct mapped_device Christoph Hellwig
                   ` (15 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

We don't really need the struct block_device to initialize a bio.  So
switch from using bio_set_dev to manually setting up bi_disk (bi_partno
will always be zero and has been cleared by bio_init already).

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/dm.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 54739f1b579bc8..6d7eb72d41f9ea 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1422,18 +1422,12 @@ static int __send_empty_flush(struct clone_info *ci)
 	 */
 	bio_init(&flush_bio, NULL, 0);
 	flush_bio.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC;
+	flush_bio.bi_disk = ci->io->md->disk;
+	bio_associate_blkg(&flush_bio);
+
 	ci->bio = &flush_bio;
 	ci->sector_count = 0;
 
-	/*
-	 * Empty flush uses a statically initialized bio, as the base for
-	 * cloning.  However, blkg association requires that a bdev is
-	 * associated with a gendisk, which doesn't happen until the bdev is
-	 * opened.  So, blkg association is done at issue time of the flush
-	 * rather than when the device is created in alloc_dev().
-	 */
-	bio_set_dev(ci->bio, ci->io->md->bdev);
-
 	BUG_ON(bio_has_data(ci->bio));
 	while ((ti = dm_table_get_target(ci->map, target_nr++)))
 		__send_duplicate_bios(ci, ti, ti->num_flush_bios, NULL);
-- 
2.29.2



* [PATCH 65/78] dm: remove the block_device reference in struct mapped_device
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (63 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 64/78] dm: simplify flush_bio initialization in __send_empty_flush Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:43   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 66/78] block: keep a block_device reference for each hd_struct Christoph Hellwig
                   ` (14 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Get rid of the long-lasting struct block_device reference in
struct mapped_device.  The only remaining user is the freeze code,
where we can trivially look up the block device at freeze time
and release the reference at thaw time.
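
The reference lifetime described above can be modelled in user space.  The
following is only a minimal sketch under stated assumptions, not kernel code:
struct toy_bdev, struct toy_md, toy_bdget, toy_bdput, toy_lock_fs and
toy_unlock_fs are hypothetical stand-ins for the block device, the mapped
device, bdget_disk/bdput and lock_fs/unlock_fs, and the reference count is a
plain int without any locking.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct block_device: only a reference count. */
struct toy_bdev {
	int refcount;
};

/*
 * Hypothetical stand-in for struct mapped_device after this change:
 * no long-lived bdev pointer, only the handle kept while frozen.
 */
struct toy_md {
	struct toy_bdev *disk_bdev;	/* what a lookup by disk would find */
	struct toy_bdev *frozen;	/* non-NULL only between freeze and thaw */
};

/* Look up the device and take a reference (models bdget_disk). */
static struct toy_bdev *toy_bdget(struct toy_md *md)
{
	if (!md->disk_bdev)
		return NULL;
	md->disk_bdev->refcount++;
	return md->disk_bdev;
}

/* Drop a reference (models bdput). */
static void toy_bdput(struct toy_bdev *bdev)
{
	bdev->refcount--;
}

/* Take the reference at freeze time; returns 0 on success, -1 on failure. */
static int toy_lock_fs(struct toy_md *md)
{
	struct toy_bdev *bdev = toy_bdget(md);

	if (!bdev)
		return -1;
	md->frozen = bdev;
	return 0;
}

/* Drop the reference at thaw time; does nothing if not frozen. */
static void toy_unlock_fs(struct toy_md *md)
{
	if (!md->frozen)
		return;
	toy_bdput(md->frozen);
	md->frozen = NULL;
}
```

In this model the extra reference exists only between toy_lock_fs() and
toy_unlock_fs(), which is the point of the change: nothing has to be held
for the whole lifetime of the mapped device.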

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/dm-core.h |  2 --
 drivers/md/dm.c      | 22 +++++++++++-----------
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index d522093cb39dda..b1b400ed76fe90 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -107,8 +107,6 @@ struct mapped_device {
 	/* kobject and completion */
 	struct dm_kobject_holder kobj_holder;
 
-	struct block_device *bdev;
-
 	struct dm_stats stats;
 
 	/* for blk-mq request-based DM support */
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 6d7eb72d41f9ea..c789ffea2badde 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1744,11 +1744,6 @@ static void cleanup_mapped_device(struct mapped_device *md)
 
 	cleanup_srcu_struct(&md->io_barrier);
 
-	if (md->bdev) {
-		bdput(md->bdev);
-		md->bdev = NULL;
-	}
-
 	mutex_destroy(&md->suspend_lock);
 	mutex_destroy(&md->type_lock);
 	mutex_destroy(&md->table_devices_lock);
@@ -1840,10 +1835,6 @@ static struct mapped_device *alloc_dev(int minor)
 	if (!md->wq)
 		goto bad;
 
-	md->bdev = bdget_disk(md->disk, 0);
-	if (!md->bdev)
-		goto bad;
-
 	dm_stats_init(&md->stats);
 
 	/* Populate the mapping, nobody knows we exist yet */
@@ -2384,12 +2375,17 @@ struct dm_table *dm_swap_table(struct mapped_device *md, struct dm_table *table)
  */
 static int lock_fs(struct mapped_device *md)
 {
+	struct block_device *bdev;
 	int r;
 
 	WARN_ON(md->frozen_sb);
 
-	md->frozen_sb = freeze_bdev(md->bdev);
+	bdev = bdget_disk(md->disk, 0);
+	if (!bdev)
+		return -ENOMEM;
+	md->frozen_sb = freeze_bdev(bdev);
 	if (IS_ERR(md->frozen_sb)) {
+		bdput(bdev);
 		r = PTR_ERR(md->frozen_sb);
 		md->frozen_sb = NULL;
 		return r;
@@ -2402,10 +2398,14 @@ static int lock_fs(struct mapped_device *md)
 
 static void unlock_fs(struct mapped_device *md)
 {
+	struct block_device *bdev;
+
 	if (!test_bit(DMF_FROZEN, &md->flags))
 		return;
 
-	thaw_bdev(md->bdev, md->frozen_sb);
+	bdev = md->frozen_sb->s_bdev;
+	thaw_bdev(bdev, md->frozen_sb);
+	bdput(bdev);
 	md->frozen_sb = NULL;
 	clear_bit(DMF_FROZEN, &md->flags);
 }
-- 
2.29.2



* [PATCH 66/78] block: keep a block_device reference for each hd_struct
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (64 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 65/78] dm: remove the block_device reference in struct mapped_device Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:50   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 67/78] block: simplify the block device claiming interface Christoph Hellwig
                   ` (13 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

To simplify block device lookup and a few other upcoming areas, make sure
that we always have a struct block_device available for each disk and
each partition.  The only downside of this is that each device and
partition uses a little more memory.  The upside will be that a lot of
code can be simplified.

With that, all we need to do to look up the block device is to look up the
inode and do a few sanity checks on the gendisk, instead of the separate
lookup for the gendisk.

As part of the change switch bdget() to only find existing block devices,
given that we know that the block_device structure must be allocated at
probe / partition scan time.

blk-cgroup needed a bit of special treatment as it was the only place that
wanted to look up a gendisk outside of the normal blkdev_get path.  It is
switched to look up using the block device hash now that this is the
primary lookup path.
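
The allocate-early, publish-later, lookup-only scheme described above can be
sketched in user space.  This is only an illustration under stated
assumptions, not the kernel implementation: struct sim_bdev, sim_hash,
sim_bdev_alloc, sim_bdev_add and sim_bdget are hypothetical names that model
bdev_alloc, bdev_add and the new bdget, and a small chained hash table stands
in for the inode hash.

```c
#include <stddef.h>
#include <stdlib.h>

#define SIM_HASH_SIZE 16

/* Hypothetical stand-in for struct block_device plus its inode. */
struct sim_bdev {
	unsigned int dev;	/* device number, 0 until added */
	int partno;		/* 0 for the whole disk */
	int hashed;		/* 1 once inserted into the lookup table */
	struct sim_bdev *next;	/* hash chain */
};

static struct sim_bdev *sim_hash[SIM_HASH_SIZE];

/* Models bdev_alloc(): create the object up front, not yet findable. */
static struct sim_bdev *sim_bdev_alloc(int partno)
{
	struct sim_bdev *bdev = calloc(1, sizeof(*bdev));

	if (!bdev)
		return NULL;
	bdev->partno = partno;
	return bdev;
}

/* Models bdev_add(): assign the device number and publish the object. */
static void sim_bdev_add(struct sim_bdev *bdev, unsigned int dev)
{
	unsigned int slot = dev % SIM_HASH_SIZE;

	bdev->dev = dev;
	bdev->next = sim_hash[slot];
	sim_hash[slot] = bdev;
	bdev->hashed = 1;
}

/* Models the new bdget(): find an existing object, never allocate one. */
static struct sim_bdev *sim_bdget(unsigned int dev)
{
	struct sim_bdev *bdev;

	for (bdev = sim_hash[dev % SIM_HASH_SIZE]; bdev; bdev = bdev->next)
		if (bdev->dev == dev)
			return bdev;
	return NULL;
}
```

A lookup before sim_bdev_add() returns NULL even though the object already
exists, which mirrors the split between allocation at probe / partition scan
time and making the device visible once everything is set up.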

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-cgroup.c         |  42 ++++-----
 block/blk-iocost.c         |  36 +++----
 block/blk.h                |   1 -
 block/genhd.c              | 188 +++----------------------------------
 block/partitions/core.c    |  28 +++---
 fs/block_dev.c             | 133 +++++++++++++++-----------
 include/linux/blk-cgroup.h |   4 +-
 include/linux/blkdev.h     |   3 +
 include/linux/genhd.h      |   4 +-
 9 files changed, 153 insertions(+), 286 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 54fbe1e80cc41a..4c0ae0f6bce02d 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -556,22 +556,22 @@ static struct blkcg_gq *blkg_lookup_check(struct blkcg *blkcg,
 }
 
 /**
- * blkg_conf_prep - parse and prepare for per-blkg config update
+ * blkcg_conf_get_bdev - parse and open bdev for per-blkg config update
  * @inputp: input string pointer
  *
  * Parse the device node prefix part, MAJ:MIN, of per-blkg config update
- * from @input and get and return the matching gendisk.  *@inputp is
+ * from @input and get and return the matching bdev.  *@inputp is
  * updated to point past the device node prefix.  Returns an ERR_PTR()
  * value on error.
  *
  * Use this function iff blkg_conf_prep() can't be used for some reason.
  */
-struct gendisk *blkcg_conf_get_disk(char **inputp)
+struct block_device *blkcg_conf_get_bdev(char **inputp)
 {
 	char *input = *inputp;
 	unsigned int major, minor;
-	struct gendisk *disk;
-	int key_len, part;
+	struct block_device *bdev;
+	int key_len;
 
 	if (sscanf(input, "%u:%u%n", &major, &minor, &key_len) != 2)
 		return ERR_PTR(-EINVAL);
@@ -581,16 +581,16 @@ struct gendisk *blkcg_conf_get_disk(char **inputp)
 		return ERR_PTR(-EINVAL);
 	input = skip_spaces(input);
 
-	disk = get_gendisk(MKDEV(major, minor), &part);
-	if (!disk)
+	bdev = bdget(MKDEV(major, minor));
+	if (!bdev)
 		return ERR_PTR(-ENODEV);
-	if (part) {
-		put_disk_and_module(disk);
+	if (bdev_is_partition(bdev)) {
+		bdput(bdev);
 		return ERR_PTR(-ENODEV);
 	}
 
 	*inputp = input;
-	return disk;
+	return bdev;
 }
 
 /**
@@ -607,18 +607,18 @@ struct gendisk *blkcg_conf_get_disk(char **inputp)
  */
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 		   char *input, struct blkg_conf_ctx *ctx)
-	__acquires(rcu) __acquires(&disk->queue->queue_lock)
+	__acquires(rcu) __acquires(&bdev->bd_disk->queue->queue_lock)
 {
-	struct gendisk *disk;
+	struct block_device *bdev;
 	struct request_queue *q;
 	struct blkcg_gq *blkg;
 	int ret;
 
-	disk = blkcg_conf_get_disk(&input);
-	if (IS_ERR(disk))
-		return PTR_ERR(disk);
+	bdev = blkcg_conf_get_bdev(&input);
+	if (IS_ERR(bdev))
+		return PTR_ERR(bdev);
 
-	q = disk->queue;
+	q = bdev->bd_disk->queue;
 
 	rcu_read_lock();
 	spin_lock_irq(&q->queue_lock);
@@ -689,7 +689,7 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 			goto success;
 	}
 success:
-	ctx->disk = disk;
+	ctx->bdev = bdev;
 	ctx->blkg = blkg;
 	ctx->body = input;
 	return 0;
@@ -700,7 +700,7 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	spin_unlock_irq(&q->queue_lock);
 	rcu_read_unlock();
 fail:
-	put_disk_and_module(disk);
+	bdput(bdev);
 	/*
 	 * If queue was bypassing, we should retry.  Do so after a
 	 * short msleep().  It isn't strictly necessary but queue
@@ -723,11 +723,11 @@ EXPORT_SYMBOL_GPL(blkg_conf_prep);
  * with blkg_conf_prep().
  */
 void blkg_conf_finish(struct blkg_conf_ctx *ctx)
-	__releases(&ctx->disk->queue->queue_lock) __releases(rcu)
+	__releases(&ctx->bdev->bd_disk->queue->queue_lock) __releases(rcu)
 {
-	spin_unlock_irq(&ctx->disk->queue->queue_lock);
+	spin_unlock_irq(&ctx->bdev->bd_disk->queue->queue_lock);
 	rcu_read_unlock();
-	put_disk_and_module(ctx->disk);
+	bdput(ctx->bdev);
 }
 EXPORT_SYMBOL_GPL(blkg_conf_finish);
 
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index bbe86d1199dc5b..bd8bfccf6b9ec3 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -3120,23 +3120,23 @@ static const match_table_t qos_tokens = {
 static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 			     size_t nbytes, loff_t off)
 {
-	struct gendisk *disk;
+	struct block_device *bdev;
 	struct ioc *ioc;
 	u32 qos[NR_QOS_PARAMS];
 	bool enable, user;
 	char *p;
 	int ret;
 
-	disk = blkcg_conf_get_disk(&input);
-	if (IS_ERR(disk))
-		return PTR_ERR(disk);
+	bdev = blkcg_conf_get_bdev(&input);
+	if (IS_ERR(bdev))
+		return PTR_ERR(bdev);
 
-	ioc = q_to_ioc(disk->queue);
+	ioc = q_to_ioc(bdev->bd_disk->queue);
 	if (!ioc) {
-		ret = blk_iocost_init(disk->queue);
+		ret = blk_iocost_init(bdev->bd_disk->queue);
 		if (ret)
 			goto err;
-		ioc = q_to_ioc(disk->queue);
+		ioc = q_to_ioc(bdev->bd_disk->queue);
 	}
 
 	spin_lock_irq(&ioc->lock);
@@ -3231,12 +3231,12 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	ioc_refresh_params(ioc, true);
 	spin_unlock_irq(&ioc->lock);
 
-	put_disk_and_module(disk);
+	bdput(bdev);
 	return nbytes;
 einval:
 	ret = -EINVAL;
 err:
-	put_disk_and_module(disk);
+	bdput(bdev);
 	return ret;
 }
 
@@ -3287,23 +3287,23 @@ static const match_table_t i_lcoef_tokens = {
 static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,
 				    size_t nbytes, loff_t off)
 {
-	struct gendisk *disk;
+	struct block_device *bdev;
 	struct ioc *ioc;
 	u64 u[NR_I_LCOEFS];
 	bool user;
 	char *p;
 	int ret;
 
-	disk = blkcg_conf_get_disk(&input);
-	if (IS_ERR(disk))
-		return PTR_ERR(disk);
+	bdev = blkcg_conf_get_bdev(&input);
+	if (IS_ERR(bdev))
+		return PTR_ERR(bdev);
 
-	ioc = q_to_ioc(disk->queue);
+	ioc = q_to_ioc(bdev->bd_disk->queue);
 	if (!ioc) {
-		ret = blk_iocost_init(disk->queue);
+		ret = blk_iocost_init(bdev->bd_disk->queue);
 		if (ret)
 			goto err;
-		ioc = q_to_ioc(disk->queue);
+		ioc = q_to_ioc(bdev->bd_disk->queue);
 	}
 
 	spin_lock_irq(&ioc->lock);
@@ -3356,13 +3356,13 @@ static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,
 	ioc_refresh_params(ioc, true);
 	spin_unlock_irq(&ioc->lock);
 
-	put_disk_and_module(disk);
+	bdput(bdev);
 	return nbytes;
 
 einval:
 	ret = -EINVAL;
 err:
-	put_disk_and_module(disk);
+	bdput(bdev);
 	return ret;
 }
 
diff --git a/block/blk.h b/block/blk.h
index dfab98465db9a5..d74159bf61eb8f 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -352,7 +352,6 @@ struct hd_struct *disk_map_sector_rcu(struct gendisk *disk, sector_t sector);
 
 int blk_alloc_devt(struct hd_struct *part, dev_t *devt);
 void blk_free_devt(dev_t devt);
-void blk_invalidate_devt(dev_t devt);
 char *disk_name(struct gendisk *hd, int partno, char *buf);
 #define ADDPART_FLAG_NONE	0
 #define ADDPART_FLAG_RAID	1
diff --git a/block/genhd.c b/block/genhd.c
index 4a224a3c8e1071..40ec5473a21dd2 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -27,17 +27,9 @@
 
 static struct kobject *block_depr;
 
-static DEFINE_XARRAY(bdev_map);
-static DEFINE_MUTEX(bdev_map_lock);
-
 /* for extended dynamic devt allocation, currently only one major is used */
 #define NR_EXT_DEVT		(1 << MINORBITS)
-
-/* For extended devt allocation.  ext_devt_lock prevents look up
- * results from going away underneath its user.
- */
-static DEFINE_SPINLOCK(ext_devt_lock);
-static DEFINE_IDR(ext_devt_idr);
+static DEFINE_IDA(ext_devt_ida);
 
 static void disk_check_events(struct disk_events *ev,
 			      unsigned int *clearing_ptr);
@@ -578,14 +570,7 @@ int blk_alloc_devt(struct hd_struct *part, dev_t *devt)
 		return 0;
 	}
 
-	/* allocate ext devt */
-	idr_preload(GFP_KERNEL);
-
-	spin_lock_bh(&ext_devt_lock);
-	idx = idr_alloc(&ext_devt_idr, part, 0, NR_EXT_DEVT, GFP_NOWAIT);
-	spin_unlock_bh(&ext_devt_lock);
-
-	idr_preload_end();
+	idx = ida_alloc_range(&ext_devt_ida, 0, NR_EXT_DEVT, GFP_KERNEL);
 	if (idx < 0)
 		return idx == -ENOSPC ? -EBUSY : idx;
 
@@ -604,26 +589,8 @@ int blk_alloc_devt(struct hd_struct *part, dev_t *devt)
  */
 void blk_free_devt(dev_t devt)
 {
-	if (devt == MKDEV(0, 0))
-		return;
-
-	if (MAJOR(devt) == BLOCK_EXT_MAJOR) {
-		spin_lock_bh(&ext_devt_lock);
-		idr_remove(&ext_devt_idr, blk_mangle_minor(MINOR(devt)));
-		spin_unlock_bh(&ext_devt_lock);
-	}
-}
-
-/*
- * We invalidate devt by assigning NULL pointer for devt in idr.
- */
-void blk_invalidate_devt(dev_t devt)
-{
-	if (MAJOR(devt) == BLOCK_EXT_MAJOR) {
-		spin_lock_bh(&ext_devt_lock);
-		idr_replace(&ext_devt_idr, NULL, blk_mangle_minor(MINOR(devt)));
-		spin_unlock_bh(&ext_devt_lock);
-	}
+	if (MAJOR(devt) == BLOCK_EXT_MAJOR)
+		ida_free(&ext_devt_ida, blk_mangle_minor(MINOR(devt)));
 }
 
 static char *bdevt_str(dev_t devt, char *buf)
@@ -638,28 +605,6 @@ static char *bdevt_str(dev_t devt, char *buf)
 	return buf;
 }
 
-static void blk_register_region(struct gendisk *disk)
-{
-	int i;
-
-	mutex_lock(&bdev_map_lock);
-	for (i = 0; i < disk->minors; i++) {
-		if (xa_insert(&bdev_map, disk_devt(disk) + i, disk, GFP_KERNEL))
-			WARN_ON_ONCE(1);
-	}
-	mutex_unlock(&bdev_map_lock);
-}
-
-static void blk_unregister_region(struct gendisk *disk)
-{
-	int i;
-
-	mutex_lock(&bdev_map_lock);
-	for (i = 0; i < disk->minors; i++)
-		xa_erase(&bdev_map, disk_devt(disk) + i);
-	mutex_unlock(&bdev_map_lock);
-}
-
 static void disk_scan_partitions(struct gendisk *disk)
 {
 	struct block_device *bdev;
@@ -803,7 +748,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 		ret = bdi_register(bdi, "%u:%u", MAJOR(devt), MINOR(devt));
 		WARN_ON(ret);
 		bdi_set_owner(bdi, dev);
-		blk_register_region(disk);
+		bdev_add(disk->part0.bdev, devt);
 	}
 	register_disk(parent, disk, groups);
 	if (register_queue)
@@ -914,16 +859,6 @@ void del_gendisk(struct gendisk *disk)
 	}
 
 	blk_unregister_queue(disk);
-	
-	if (!(disk->flags & GENHD_FL_HIDDEN))
-		blk_unregister_region(disk);
-	/*
-	 * Remove gendisk pointer from idr so that it cannot be looked up
-	 * while RCU period before freeing gendisk is running to prevent
-	 * use-after-free issues. Note that the device number stays
-	 * "in-use" until we really free the gendisk.
-	 */
-	blk_invalidate_devt(disk_devt(disk));
 
 	kobject_put(disk->part0.holder_dir);
 	kobject_put(disk->slave_dir);
@@ -962,7 +897,7 @@ static ssize_t disk_badblocks_store(struct device *dev,
 	return badblocks_store(disk->bb, page, len, 0);
 }
 
-static void request_gendisk_module(dev_t devt)
+void blk_request_module(dev_t devt)
 {
 	unsigned int major = MAJOR(devt);
 	struct blk_major_name **n;
@@ -982,84 +917,6 @@ static void request_gendisk_module(dev_t devt)
 		request_module("block-major-%d", MAJOR(devt));
 }
 
-static bool get_disk_and_module(struct gendisk *disk)
-{
-	struct module *owner;
-
-	if (!disk->fops)
-		return false;
-	owner = disk->fops->owner;
-	if (owner && !try_module_get(owner))
-		return false;
-	if (!kobject_get_unless_zero(&disk_to_dev(disk)->kobj)) {
-		module_put(owner);
-		return false;
-	}
-	return true;
-
-}
-
-/**
- * get_gendisk - get partitioning information for a given device
- * @devt: device to get partitioning information for
- * @partno: returned partition index
- *
- * This function gets the structure containing partitioning
- * information for the given device @devt.
- *
- * Context: can sleep
- */
-struct gendisk *get_gendisk(dev_t devt, int *partno)
-{
-	struct gendisk *disk = NULL;
-
-	might_sleep();
-
-	if (MAJOR(devt) != BLOCK_EXT_MAJOR) {
-		mutex_lock(&bdev_map_lock);
-		disk = xa_load(&bdev_map, devt);
-		if (!disk) {
-			mutex_unlock(&bdev_map_lock);
-			request_gendisk_module(devt);
-			mutex_lock(&bdev_map_lock);
-			disk = xa_load(&bdev_map, devt);
-		}
-		if (disk && !get_disk_and_module(disk))
-			disk = NULL;
-		if (disk)
-			*partno = devt - disk_devt(disk);
-		mutex_unlock(&bdev_map_lock);
-	} else {
-		struct hd_struct *part;
-
-		spin_lock_bh(&ext_devt_lock);
-		part = idr_find(&ext_devt_idr, blk_mangle_minor(MINOR(devt)));
-		if (part && get_disk_and_module(part_to_disk(part))) {
-			*partno = part->partno;
-			disk = part_to_disk(part);
-		}
-		spin_unlock_bh(&ext_devt_lock);
-	}
-
-	if (!disk)
-		return NULL;
-
-	/*
-	 * Synchronize with del_gendisk() to not return disk that is being
-	 * destroyed.
-	 */
-	down_read(&disk->lookup_sem);
-	if (unlikely((disk->flags & GENHD_FL_HIDDEN) ||
-		     !(disk->flags & GENHD_FL_UP))) {
-		up_read(&disk->lookup_sem);
-		put_disk_and_module(disk);
-		disk = NULL;
-	} else {
-		up_read(&disk->lookup_sem);
-	}
-	return disk;
-}
-
 /**
  * bdget_disk - do bdget() by gendisk and partition number
  * @disk: gendisk of interest
@@ -1557,11 +1414,6 @@ int disk_expand_part_tbl(struct gendisk *disk, int partno)
  *
  * This function releases all allocated resources of the gendisk.
  *
- * The struct gendisk refcount is incremented with get_gendisk() or
- * get_disk_and_module(), and its refcount is decremented with
- * put_disk_and_module() or put_disk(). Once the refcount reaches 0 this
- * function is called.
- *
  * Drivers which used __device_add_disk() have a gendisk with a request_queue
  * assigned. Since the request_queue sits on top of the gendisk for these
  * drivers we also call blk_put_queue() for them, and we expect the
@@ -1746,9 +1598,13 @@ struct gendisk *__alloc_disk_node(int minors, int node_id)
 	if (!disk)
 		return NULL;
 
+	disk->part0.bdev = bdev_alloc(disk, 0);
+	if (!disk->part0.bdev)
+		goto out_free_disk;
+
 	disk->part0.dkstats = alloc_percpu(struct disk_stats);
 	if (!disk->part0.dkstats)
-		goto out_free_disk;
+		goto out_bdput;
 
 	init_rwsem(&disk->lookup_sem);
 	disk->node_id = node_id;
@@ -1782,6 +1638,8 @@ struct gendisk *__alloc_disk_node(int minors, int node_id)
 
 out_free_part0:
 	hd_free_part(&disk->part0);
+out_bdput:
+	bdput(disk->part0.bdev);
 out_free_disk:
 	kfree(disk);
 	return NULL;
@@ -1805,26 +1663,6 @@ void put_disk(struct gendisk *disk)
 }
 EXPORT_SYMBOL(put_disk);
 
-/**
- * put_disk_and_module - decrements the module and gendisk refcount
- * @disk: the struct gendisk to decrement the refcount for
- *
- * This is a counterpart of get_disk_and_module() and thus also of
- * get_gendisk().
- *
- * Context: Any context, but the last reference must not be dropped from
- *          atomic context.
- */
-void put_disk_and_module(struct gendisk *disk)
-{
-	if (disk) {
-		struct module *owner = disk->fops->owner;
-
-		put_disk(disk);
-		module_put(owner);
-	}
-}
-
 static void set_disk_ro_uevent(struct gendisk *gd, int ro)
 {
 	char event[] = "DISK_RO=1";
diff --git a/block/partitions/core.c b/block/partitions/core.c
index a02e224115943d..8b44f46ab1fbfc 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -340,12 +340,11 @@ void delete_partition(struct hd_struct *part)
 	device_del(part_to_dev(part));
 
 	/*
-	 * Remove gendisk pointer from idr so that it cannot be looked up
-	 * while RCU period before freeing gendisk is running to prevent
-	 * use-after-free issues. Note that the device number stays
-	 * "in-use" until we really free the gendisk.
+	 * Remove the block device from the inode hash, so that it cannot be
+	 * looked up while waiting for the RCU grace period.
 	 */
-	blk_invalidate_devt(part_devt(part));
+	bdput(part->bdev);
+
 	percpu_ref_kill(&part->ref);
 }
 
@@ -402,11 +401,14 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 	if (!p)
 		return ERR_PTR(-EBUSY);
 
+	err = -ENOMEM;
 	p->dkstats = alloc_percpu(struct disk_stats);
-	if (!p->dkstats) {
-		err = -ENOMEM;
+	if (!p->dkstats)
 		goto out_free;
-	}
+
+	p->bdev = bdev_alloc(disk, partno);
+	if (!p->bdev)
+		goto out_free_stats;
 
 	hd_sects_seq_init(p);
 	pdev = part_to_dev(p);
@@ -420,10 +422,8 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 		struct partition_meta_info *pinfo;
 
 		pinfo = kzalloc_node(sizeof(*pinfo), GFP_KERNEL, disk->node_id);
-		if (!pinfo) {
-			err = -ENOMEM;
-			goto out_free_stats;
-		}
+		if (!pinfo)
+			goto out_bdput;
 		memcpy(pinfo, info, sizeof(*info));
 		p->info = pinfo;
 	}
@@ -470,6 +470,7 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 	}
 
 	/* everything is up and running, commence */
+	bdev_add(p->bdev, devt);
 	rcu_assign_pointer(ptbl->part[partno], p);
 
 	/* suppress uevent if the disk suppresses it */
@@ -479,11 +480,14 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 
 out_free_info:
 	kfree(p->info);
+out_bdput:
+	bdput(p->bdev);
 out_free_stats:
 	free_percpu(p->dkstats);
 out_free:
 	kfree(p);
 	return ERR_PTR(err);
+
 out_remove_file:
 	device_remove_file(pdev, &dev_attr_whole_disk);
 out_del:
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 29db12c3bb501c..f36788d7699302 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -795,6 +795,12 @@ static void bdev_free_inode(struct inode *inode)
 	kmem_cache_free(bdev_cachep, BDEV_I(inode));
 }
 
+static void bdev_destroy_inode(struct inode *inode)
+{
+	if (inode->i_rdev)
+		put_device(disk_to_dev(I_BDEV(inode)->bd_disk));
+}
+
 static void init_once(void *foo)
 {
 	struct bdev_inode *ei = (struct bdev_inode *) foo;
@@ -829,6 +835,7 @@ static const struct super_operations bdev_sops = {
 	.statfs = simple_statfs,
 	.alloc_inode = bdev_alloc_inode,
 	.free_inode = bdev_free_inode,
+	.destroy_inode = bdev_destroy_inode,
 	.drop_inode = generic_delete_inode,
 	.evict_inode = bdev_evict_inode,
 };
@@ -870,34 +877,51 @@ void __init bdev_cache_init(void)
 	blockdev_superblock = bd_mnt->mnt_sb;   /* For writeback */
 }
 
-static struct block_device *bdget(dev_t dev)
+struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 {
 	struct block_device *bdev;
 	struct inode *inode;
 
-	inode = iget_locked(blockdev_superblock, dev);
+	inode = new_inode(blockdev_superblock);
 	if (!inode)
 		return NULL;
 
-	bdev = &BDEV_I(inode)->bdev;
+	bdev = I_BDEV(inode);
+	spin_lock_init(&bdev->bd_size_lock);
+	bdev->bd_disk = disk;
+	bdev->bd_partno = partno;
+	bdev->bd_contains = NULL;
+	bdev->bd_super = NULL;
+	bdev->bd_inode = inode;
+	bdev->bd_part_count = 0;
+
+	inode->i_mode = S_IFBLK;
+	inode->i_rdev = 0;
+	inode->i_bdev = bdev;
+	inode->i_data.a_ops = &def_blk_aops;
 
-	if (inode->i_state & I_NEW) {
-		spin_lock_init(&bdev->bd_size_lock);
-		bdev->bd_contains = NULL;
-		bdev->bd_super = NULL;
-		bdev->bd_inode = inode;
-		bdev->bd_part_count = 0;
-		bdev->bd_dev = dev;
-		inode->i_mode = S_IFBLK;
-		inode->i_rdev = dev;
-		inode->i_bdev = bdev;
-		inode->i_data.a_ops = &def_blk_aops;
-		mapping_set_gfp_mask(&inode->i_data, GFP_USER);
-		unlock_new_inode(inode);
-	}
 	return bdev;
 }
 
+void bdev_add(struct block_device *bdev, dev_t dev)
+{
+	bdev->bd_dev = dev;
+	get_device(disk_to_dev(bdev->bd_disk));
+	bdev->bd_inode->i_rdev = dev;
+	bdev->bd_inode->i_ino = dev;
+	insert_inode_hash(bdev->bd_inode);
+}
+
+struct block_device *bdget(dev_t dev)
+{
+	struct inode *inode;
+
+	inode = ilookup(blockdev_superblock, dev);
+	if (!inode)
+		return NULL;
+	return &BDEV_I(inode)->bdev;
+}
+
 /**
  * bdgrab -- Grab a reference to an already referenced block device
  * @bdev:	Block device to grab a reference to.
@@ -957,6 +981,10 @@ static struct block_device *bd_acquire(struct inode *inode)
 		bd_forget(inode);
 
 	bdev = bdget(inode->i_rdev);
+	if (!bdev) {
+		blk_request_module(inode->i_rdev);
+		bdev = bdget(inode->i_rdev);
+	}
 	if (bdev) {
 		spin_lock(&bdev_lock);
 		if (!inode->i_bdev) {
@@ -1067,27 +1095,6 @@ int bd_prepare_to_claim(struct block_device *bdev, struct block_device *whole,
 }
 EXPORT_SYMBOL_GPL(bd_prepare_to_claim); /* only for the loop driver */
 
-static struct gendisk *bdev_get_gendisk(struct block_device *bdev, int *partno)
-{
-	struct gendisk *disk = get_gendisk(bdev->bd_dev, partno);
-
-	if (!disk)
-		return NULL;
-	/*
-	 * Now that we hold gendisk reference we make sure bdev we looked up is
-	 * not stale. If it is, it means device got removed and created before
-	 * we looked up gendisk and we fail open in such case. Associating
-	 * unhashed bdev with newly created gendisk could lead to two bdevs
-	 * (and thus two independent caches) being associated with one device
-	 * which is bad.
-	 */
-	if (inode_unhashed(bdev->bd_inode)) {
-		put_disk_and_module(disk);
-		return NULL;
-	}
-	return disk;
-}
-
 static void bd_clear_claiming(struct block_device *whole, void *holder)
 {
 	lockdep_assert_held(&bdev_lock);
@@ -1404,6 +1411,24 @@ int bdev_disk_changed(struct block_device *bdev, bool invalidate)
  */
 EXPORT_SYMBOL_GPL(bdev_disk_changed);
 
+/*
+ * Synchronize with del_gendisk() to not return a disk that is being destroyed.
+ * Callers need to drop the reference on disk->fops->owner.
+ */
+static int bdev_get_gendisk(struct gendisk *disk)
+{
+	down_read(&disk->lookup_sem);
+	if ((disk->flags & (GENHD_FL_HIDDEN | GENHD_FL_UP)) != GENHD_FL_UP)
+		goto out_unlock;
+	if (disk->fops->owner && !try_module_get(disk->fops->owner))
+		goto out_unlock;
+	up_read(&disk->lookup_sem);
+	return 0;
+out_unlock:
+	up_read(&disk->lookup_sem);
+	return -ENXIO;
+}
+
 /*
  * bd_mutex locking:
  *
@@ -1415,19 +1440,17 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 		int for_part)
 {
 	struct block_device *whole = NULL, *claiming = NULL;
-	struct gendisk *disk;
+	struct gendisk *disk = bdev->bd_disk;
 	int ret;
-	int partno;
 	bool first_open = false, unblock_events = true, need_restart;
 
  restart:
 	need_restart = false;
-	ret = -ENXIO;
-	disk = bdev_get_gendisk(bdev, &partno);
-	if (!disk)
+	ret = bdev_get_gendisk(bdev->bd_disk);
+	if (ret)
 		goto out;
 
-	if (partno) {
+	if (bdev->bd_partno) {
 		whole = bdget_disk(disk, 0);
 		if (!whole) {
 			ret = -ENOMEM;
@@ -1450,13 +1473,11 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 	mutex_lock_nested(&bdev->bd_mutex, for_part);
 	if (!bdev->bd_openers) {
 		first_open = true;
-		bdev->bd_disk = disk;
 		bdev->bd_contains = bdev;
-		bdev->bd_partno = partno;
 
-		if (!partno) {
+		if (!bdev->bd_partno) {
 			ret = -ENXIO;
-			bdev->bd_part = disk_get_part(disk, partno);
+			bdev->bd_part = disk_get_part(disk, 0);
 			if (!bdev->bd_part)
 				goto out_clear;
 
@@ -1494,7 +1515,7 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 			if (ret)
 				goto out_clear;
 			bdev->bd_contains = bdgrab(whole);
-			bdev->bd_part = disk_get_part(disk, partno);
+			bdev->bd_part = disk_get_part(disk, bdev->bd_partno);
 			if (!(disk->flags & GENHD_FL_UP) ||
 			    !bdev->bd_part || !bdev->bd_part->nr_sects) {
 				ret = -ENXIO;
@@ -1541,16 +1562,15 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 	if (unblock_events)
 		disk_unblock_events(disk);
 
-	/* only one opener holds refs to the module and disk */
+	/* only one opener holds the module reference */
 	if (!first_open)
-		put_disk_and_module(disk);
+		module_put(disk->fops->owner);
 	if (whole)
 		bdput(whole);
 	return 0;
 
  out_clear:
 	disk_put_part(bdev->bd_part);
-	bdev->bd_disk = NULL;
 	bdev->bd_part = NULL;
 	if (bdev != bdev->bd_contains)
 		__blkdev_put(bdev->bd_contains, mode, 1);
@@ -1564,7 +1584,7 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
  	if (whole)
 		bdput(whole);
  out_put_disk:
-	put_disk_and_module(disk);
+	module_put(disk->fops->owner);
 	if (need_restart)
 		goto restart;
  out:
@@ -1680,6 +1700,10 @@ struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
 	int err;
 
 	bdev = bdget(dev);
+	if (!bdev) {
+		blk_request_module(dev);
+		bdev = bdget(dev);
+	}
 	if (!bdev)
 		return ERR_PTR(-ENOMEM);
 
@@ -1755,12 +1779,11 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 	if (!bdev->bd_openers) {
 		disk_put_part(bdev->bd_part);
 		bdev->bd_part = NULL;
-		bdev->bd_disk = NULL;
 		if (bdev != bdev->bd_contains)
 			victim = bdev->bd_contains;
 		bdev->bd_contains = NULL;
 
-		put_disk_and_module(disk);
+		module_put(disk->fops->owner);
 	}
 	mutex_unlock(&bdev->bd_mutex);
 	bdput(bdev);
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index c8fc9792ac776d..064f14daedebca 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -197,12 +197,12 @@ void blkcg_print_blkgs(struct seq_file *sf, struct blkcg *blkcg,
 u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v);
 
 struct blkg_conf_ctx {
-	struct gendisk			*disk;
+	struct block_device		*bdev;
 	struct blkcg_gq			*blkg;
 	char				*body;
 };
 
-struct gendisk *blkcg_conf_get_disk(char **inputp);
+struct block_device *blkcg_conf_get_bdev(char **inputp);
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 		   char *input, struct blkg_conf_ctx *ctx);
 void blkg_conf_finish(struct blkg_conf_ctx *ctx);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 05b346a68c2eee..044d9dd159d882 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1994,6 +1994,9 @@ void bd_abort_claiming(struct block_device *bdev, struct block_device *whole,
 		void *holder);
 void blkdev_put(struct block_device *bdev, fmode_t mode);
 
+struct block_device *bdev_alloc(struct gendisk *disk, u8 partno);
+void bdev_add(struct block_device *bdev, dev_t dev);
+struct block_device *bdget(dev_t dev);
 struct block_device *I_BDEV(struct inode *inode);
 struct block_device *bdget_part(struct hd_struct *part);
 struct block_device *bdgrab(struct block_device *bdev);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index ca5e356084c353..ab5fca99764e7a 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -65,6 +65,7 @@ struct hd_struct {
 	struct disk_stats __percpu *dkstats;
 	struct percpu_ref ref;
 
+	struct block_device *bdev;
 	struct device __dev;
 	struct kobject *holder_dir;
 	int policy, partno;
@@ -300,7 +301,6 @@ static inline void add_disk_no_queue_reg(struct gendisk *disk)
 }
 
 extern void del_gendisk(struct gendisk *gp);
-extern struct gendisk *get_gendisk(dev_t dev, int *partno);
 extern struct block_device *bdget_disk(struct gendisk *disk, int partno);
 
 extern void set_disk_ro(struct gendisk *disk, int flag);
@@ -338,7 +338,6 @@ int blk_drop_partitions(struct block_device *bdev);
 
 extern struct gendisk *__alloc_disk_node(int minors, int node_id);
 extern void put_disk(struct gendisk *disk);
-extern void put_disk_and_module(struct gendisk *disk);
 
 #define alloc_disk_node(minors, node_id)				\
 ({									\
@@ -389,6 +388,7 @@ static inline void bd_unlink_disk_holder(struct block_device *bdev,
 #endif /* CONFIG_SYSFS */
 
 dev_t blk_lookup_devt(const char *name, int partno);
+void blk_request_module(dev_t devt);
 #ifdef CONFIG_BLOCK
 void printk_all_partitions(void);
 #else /* CONFIG_BLOCK */
-- 
2.29.2



* [PATCH 67/78] block: simplify the block device claiming interface
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (65 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 66/78] block: keep a block_device reference for each hd_struct Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:51   ` Hannes Reinecke
  2020-11-16 14:57 ` [PATCH 68/78] block: remove ->bd_contains Christoph Hellwig
                   ` (12 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Stop passing the whole device as a separate argument, given that it
can be trivially deduced from the block device itself.
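
As an illustration only, the interface change can be modelled in userspace C
roughly as follows.  This is a simplified sketch, not the kernel code: the
struct and function names (claim_bdev, claim_prepare, claim_abort) are
hypothetical, there is no locking, and where the kernel waits for a claim in
progress this model simply fails with -EBUSY.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model, not the kernel's struct block_device. */
struct claim_bdev {
	struct claim_bdev *contains;	/* whole device; self for a whole disk */
	void *holder;			/* exclusive holder, or NULL */
	void *claiming;			/* claim in progress (whole device only) */
};

/*
 * The whole device is derived from @bdev rather than passed in by the caller.
 * Returns 0 if @bdev can be claimed by @holder, -EBUSY otherwise.
 */
static int claim_prepare(struct claim_bdev *bdev, void *holder)
{
	struct claim_bdev *whole = bdev->contains;

	if (bdev->holder && bdev->holder != holder)
		return -EBUSY;
	if (whole->claiming && whole->claiming != holder)
		return -EBUSY;	/* simplification: the kernel would wait here */
	whole->claiming = holder;
	return 0;
}

/* Drop a claim in progress; again only @bdev and @holder are needed. */
static void claim_abort(struct claim_bdev *bdev, void *holder)
{
	struct claim_bdev *whole = bdev->contains;

	if (whole->claiming == holder)
		whole->claiming = NULL;
}
```

Callers such as the loop driver then only need to remember whether they
claimed the device (for example by re-checking the open mode) instead of
carrying a second block_device pointer around.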

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/loop.c   | 12 +++-----
 fs/block_dev.c         | 69 +++++++++++++++++++-----------------------
 include/linux/blkdev.h |  6 ++--
 3 files changed, 38 insertions(+), 49 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index b42c728620c9e4..599e94a7e69259 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1071,7 +1071,6 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 	struct file	*file;
 	struct inode	*inode;
 	struct address_space *mapping;
-	struct block_device *claimed_bdev = NULL;
 	int		error;
 	loff_t		size;
 	bool		partscan;
@@ -1090,8 +1089,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 	 * here to avoid changing device under exclusive owner.
 	 */
 	if (!(mode & FMODE_EXCL)) {
-		claimed_bdev = bdev->bd_contains;
-		error = bd_prepare_to_claim(bdev, claimed_bdev, loop_configure);
+		error = bd_prepare_to_claim(bdev, loop_configure);
 		if (error)
 			goto out_putf;
 	}
@@ -1178,15 +1176,15 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 	mutex_unlock(&loop_ctl_mutex);
 	if (partscan)
 		loop_reread_partitions(lo, bdev);
-	if (claimed_bdev)
-		bd_abort_claiming(bdev, claimed_bdev, loop_configure);
+	if (!(mode & FMODE_EXCL))
+		bd_abort_claiming(bdev, loop_configure);
 	return 0;
 
 out_unlock:
 	mutex_unlock(&loop_ctl_mutex);
 out_bdev:
-	if (claimed_bdev)
-		bd_abort_claiming(bdev, claimed_bdev, loop_configure);
+	if (!(mode & FMODE_EXCL))
+		bd_abort_claiming(bdev, loop_configure);
 out_putf:
 	fput(file);
 out:
diff --git a/fs/block_dev.c b/fs/block_dev.c
index f36788d7699302..fd4df132a97590 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -110,24 +110,20 @@ EXPORT_SYMBOL(invalidate_bdev);
 int truncate_bdev_range(struct block_device *bdev, fmode_t mode,
 			loff_t lstart, loff_t lend)
 {
-	struct block_device *claimed_bdev = NULL;
-	int err;
-
 	/*
 	 * If we don't hold exclusive handle for the device, upgrade to it
 	 * while we discard the buffer cache to avoid discarding buffers
 	 * under live filesystem.
 	 */
 	if (!(mode & FMODE_EXCL)) {
-		claimed_bdev = bdev->bd_contains;
-		err = bd_prepare_to_claim(bdev, claimed_bdev,
-					  truncate_bdev_range);
+		int err = bd_prepare_to_claim(bdev, truncate_bdev_range);
 		if (err)
 			return err;
 	}
+
 	truncate_inode_pages_range(bdev->bd_inode->i_mapping, lstart, lend);
-	if (claimed_bdev)
-		bd_abort_claiming(bdev, claimed_bdev, truncate_bdev_range);
+	if (!(mode & FMODE_EXCL))
+		bd_abort_claiming(bdev, truncate_bdev_range);
 	return 0;
 }
 EXPORT_SYMBOL(truncate_bdev_range);
@@ -1055,7 +1051,6 @@ static bool bd_may_claim(struct block_device *bdev, struct block_device *whole,
 /**
  * bd_prepare_to_claim - claim a block device
  * @bdev: block device of interest
- * @whole: the whole device containing @bdev, may equal @bdev
  * @holder: holder trying to claim @bdev
  *
  * Claim @bdev.  This function fails if @bdev is already claimed by another
@@ -1065,9 +1060,10 @@ static bool bd_may_claim(struct block_device *bdev, struct block_device *whole,
  * RETURNS:
  * 0 if @bdev can be claimed, -EBUSY otherwise.
  */
-int bd_prepare_to_claim(struct block_device *bdev, struct block_device *whole,
-		void *holder)
+int bd_prepare_to_claim(struct block_device *bdev, void *holder)
 {
+	struct block_device *whole = bdev->bd_contains;
+
 retry:
 	spin_lock(&bdev_lock);
 	/* if someone else claimed, fail */
@@ -1107,15 +1103,15 @@ static void bd_clear_claiming(struct block_device *whole, void *holder)
 /**
  * bd_finish_claiming - finish claiming of a block device
  * @bdev: block device of interest
- * @whole: whole block device
  * @holder: holder that has claimed @bdev
  *
  * Finish exclusive open of a block device. Mark the device as exlusively
  * open by the holder and wake up all waiters for exclusive open to finish.
  */
-static void bd_finish_claiming(struct block_device *bdev,
-		struct block_device *whole, void *holder)
+static void bd_finish_claiming(struct block_device *bdev, void *holder)
 {
+	struct block_device *whole = bdev->bd_contains;
+
 	spin_lock(&bdev_lock);
 	BUG_ON(!bd_may_claim(bdev, whole, holder));
 	/*
@@ -1140,11 +1136,10 @@ static void bd_finish_claiming(struct block_device *bdev,
  * also used when exclusive open is not actually desired and we just needed
  * to block other exclusive openers for a while.
  */
-void bd_abort_claiming(struct block_device *bdev, struct block_device *whole,
-		       void *holder)
+void bd_abort_claiming(struct block_device *bdev, void *holder)
 {
 	spin_lock(&bdev_lock);
-	bd_clear_claiming(whole, holder);
+	bd_clear_claiming(bdev->bd_contains, holder);
 	spin_unlock(&bdev_lock);
 }
 EXPORT_SYMBOL(bd_abort_claiming);
@@ -1439,7 +1434,7 @@ static int bdev_get_gendisk(struct gendisk *disk)
 static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 		int for_part)
 {
-	struct block_device *whole = NULL, *claiming = NULL;
+	struct block_device *whole = NULL;
 	struct gendisk *disk = bdev->bd_disk;
 	int ret;
 	bool first_open = false, unblock_events = true, need_restart;
@@ -1460,11 +1455,7 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 
 	if (!for_part && (mode & FMODE_EXCL)) {
 		WARN_ON_ONCE(!holder);
-		if (whole)
-			claiming = whole;
-		else
-			claiming = bdev;
-		ret = bd_prepare_to_claim(bdev, claiming, holder);
+		ret = bd_prepare_to_claim(bdev, holder);
 		if (ret)
 			goto out_put_whole;
 	}
@@ -1541,21 +1532,23 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 		}
 	}
 	bdev->bd_openers++;
-	if (for_part)
+	if (for_part) {
 		bdev->bd_part_count++;
-	if (claiming)
-		bd_finish_claiming(bdev, claiming, holder);
+	} else if (mode & FMODE_EXCL) {
+		bd_finish_claiming(bdev, holder);
 
-	/*
-	 * Block event polling for write claims if requested.  Any write holder
-	 * makes the write_holder state stick until all are released.  This is
-	 * good enough and tracking individual writeable reference is too
-	 * fragile given the way @mode is used in blkdev_get/put().
-	 */
-	if (claiming && (mode & FMODE_WRITE) && !bdev->bd_write_holder &&
-	    (disk->flags & GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE)) {
-		bdev->bd_write_holder = true;
-		unblock_events = false;
+		/*
+		 * Block event polling for write claims if requested.  Any write
+		 * holder makes the write_holder state stick until all are
+		 * released.  This is good enough and tracking individual
+		 * writeable reference is too fragile given the way @mode is
+		 * used in blkdev_get/put().
+		 */
+		if ((mode & FMODE_WRITE) && !bdev->bd_write_holder &&
+		    (disk->flags & GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE)) {
+			bdev->bd_write_holder = true;
+			unblock_events = false;
+		}
 	}
 	mutex_unlock(&bdev->bd_mutex);
 
@@ -1576,8 +1569,8 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 		__blkdev_put(bdev->bd_contains, mode, 1);
 	bdev->bd_contains = NULL;
  out_unlock_bdev:
-	if (claiming)
-		bd_abort_claiming(bdev, claiming, holder);
+	if (!for_part && (mode & FMODE_EXCL))
+		bd_abort_claiming(bdev, holder);
 	mutex_unlock(&bdev->bd_mutex);
 	disk_unblock_events(disk);
  out_put_whole:
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 044d9dd159d882..696b2f9c5529d8 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1988,10 +1988,8 @@ void blkdev_show(struct seq_file *seqf, off_t offset);
 struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
 		void *holder);
 struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder);
-int bd_prepare_to_claim(struct block_device *bdev, struct block_device *whole,
-		void *holder);
-void bd_abort_claiming(struct block_device *bdev, struct block_device *whole,
-		void *holder);
+int bd_prepare_to_claim(struct block_device *bdev, void *holder);
+void bd_abort_claiming(struct block_device *bdev, void *holder);
 void blkdev_put(struct block_device *bdev, fmode_t mode);
 
 struct block_device *bdev_alloc(struct gendisk *disk, u8 partno);
-- 
2.29.2



* [PATCH 68/78] block: remove ->bd_contains
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (66 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 67/78] block: simplify the block device claiming interface Christoph Hellwig
@ 2020-11-16 14:57 ` Christoph Hellwig
  2020-11-20  7:52   ` Hannes Reinecke
  2020-11-16 14:58 ` [PATCH 69/78] block: remove the nr_sects field in struct hd_struct Christoph Hellwig
                   ` (11 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Now that each gendisk has a reference to the block_device referencing
it, we can just use that everywhere and get rid of ->bd_contains.
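
A minimal userspace sketch of the lookup that replaces the stored pointer is
shown below.  The sk_* types are hypothetical stand-ins for gendisk, hd_struct
and block_device; only the shape of the macro follows the bdev_whole()
definition added by this patch, and the partition check is an assumption
based on the bd_partno tests in the diff.

```c
#include <stdbool.h>
#include <stddef.h>

struct sk_bdev;

/* Hypothetical stand-ins for hd_struct and gendisk. */
struct sk_part {
	struct sk_bdev *bdev;		/* block device backing this partition */
};

struct sk_disk {
	struct sk_part part0;		/* partition 0 represents the whole disk */
};

struct sk_bdev {
	struct sk_disk *disk;
	unsigned char partno;		/* 0 for the whole device */
};

/* Reach the whole-device bdev through the disk instead of a stored pointer. */
#define sk_bdev_whole(_bdev) \
	((_bdev)->disk->part0.bdev)

/* Assumed check: a non-zero partition number means a partition. */
static bool sk_bdev_is_partition(const struct sk_bdev *bdev)
{
	return bdev->partno != 0;
}
```

For a whole-device bdev the macro yields the bdev itself, which is the
property the old "bd_contains == bdev" comparisons relied on.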

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/scsicam.c    |  2 +-
 fs/block_dev.c            | 50 +++++++++++++--------------------------
 include/linux/blk_types.h |  4 +++-
 3 files changed, 20 insertions(+), 36 deletions(-)

diff --git a/drivers/scsi/scsicam.c b/drivers/scsi/scsicam.c
index 682cf08ab04153..f1553a453616fd 100644
--- a/drivers/scsi/scsicam.c
+++ b/drivers/scsi/scsicam.c
@@ -32,7 +32,7 @@
  */
 unsigned char *scsi_bios_ptable(struct block_device *dev)
 {
-	struct address_space *mapping = dev->bd_contains->bd_inode->i_mapping;
+	struct address_space *mapping = bdev_whole(dev)->bd_inode->i_mapping;
 	unsigned char *res = NULL;
 	struct page *page;
 
diff --git a/fs/block_dev.c b/fs/block_dev.c
index fd4df132a97590..2348f218d45deb 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -886,7 +886,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 	spin_lock_init(&bdev->bd_size_lock);
 	bdev->bd_disk = disk;
 	bdev->bd_partno = partno;
-	bdev->bd_contains = NULL;
 	bdev->bd_super = NULL;
 	bdev->bd_inode = inode;
 	bdev->bd_part_count = 0;
@@ -1062,7 +1061,7 @@ static bool bd_may_claim(struct block_device *bdev, struct block_device *whole,
  */
 int bd_prepare_to_claim(struct block_device *bdev, void *holder)
 {
-	struct block_device *whole = bdev->bd_contains;
+	struct block_device *whole = bdev_whole(bdev);
 
 retry:
 	spin_lock(&bdev_lock);
@@ -1110,7 +1109,7 @@ static void bd_clear_claiming(struct block_device *whole, void *holder)
  */
 static void bd_finish_claiming(struct block_device *bdev, void *holder)
 {
-	struct block_device *whole = bdev->bd_contains;
+	struct block_device *whole = bdev_whole(bdev);
 
 	spin_lock(&bdev_lock);
 	BUG_ON(!bd_may_claim(bdev, whole, holder));
@@ -1139,7 +1138,7 @@ static void bd_finish_claiming(struct block_device *bdev, void *holder)
 void bd_abort_claiming(struct block_device *bdev, void *holder)
 {
 	spin_lock(&bdev_lock);
-	bd_clear_claiming(bdev->bd_contains, holder);
+	bd_clear_claiming(bdev_whole(bdev), holder);
 	spin_unlock(&bdev_lock);
 }
 EXPORT_SYMBOL(bd_abort_claiming);
@@ -1434,7 +1433,6 @@ static int bdev_get_gendisk(struct gendisk *disk)
 static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 		int for_part)
 {
-	struct block_device *whole = NULL;
 	struct gendisk *disk = bdev->bd_disk;
 	int ret;
 	bool first_open = false, unblock_events = true, need_restart;
@@ -1445,26 +1443,17 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 	if (ret)
 		goto out;
 
-	if (bdev->bd_partno) {
-		whole = bdget_disk(disk, 0);
-		if (!whole) {
-			ret = -ENOMEM;
-			goto out_put_disk;
-		}
-	}
-
 	if (!for_part && (mode & FMODE_EXCL)) {
 		WARN_ON_ONCE(!holder);
 		ret = bd_prepare_to_claim(bdev, holder);
 		if (ret)
-			goto out_put_whole;
+			goto out_put_disk;
 	}
 
 	disk_block_events(disk);
 	mutex_lock_nested(&bdev->bd_mutex, for_part);
 	if (!bdev->bd_openers) {
 		first_open = true;
-		bdev->bd_contains = bdev;
 
 		if (!bdev->bd_partno) {
 			ret = -ENXIO;
@@ -1502,10 +1491,10 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 				goto out_clear;
 		} else {
 			BUG_ON(for_part);
-			ret = __blkdev_get(whole, mode, NULL, 1);
+			bdgrab(bdev_whole(bdev));
+			ret = __blkdev_get(bdev_whole(bdev), mode, NULL, 1);
 			if (ret)
 				goto out_clear;
-			bdev->bd_contains = bdgrab(whole);
 			bdev->bd_part = disk_get_part(disk, bdev->bd_partno);
 			if (!(disk->flags & GENHD_FL_UP) ||
 			    !bdev->bd_part || !bdev->bd_part->nr_sects) {
@@ -1519,7 +1508,7 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 		if (bdev->bd_bdi == &noop_backing_dev_info)
 			bdev->bd_bdi = bdi_get(disk->queue->backing_dev_info);
 	} else {
-		if (bdev->bd_contains == bdev) {
+		if (!bdev->bd_partno) {
 			ret = 0;
 			if (bdev->bd_disk->fops->open)
 				ret = bdev->bd_disk->fops->open(bdev, mode);
@@ -1558,24 +1547,18 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 	/* only one opener holds the module reference */
 	if (!first_open)
 		module_put(disk->fops->owner);
-	if (whole)
-		bdput(whole);
 	return 0;
 
  out_clear:
 	disk_put_part(bdev->bd_part);
 	bdev->bd_part = NULL;
-	if (bdev != bdev->bd_contains)
-		__blkdev_put(bdev->bd_contains, mode, 1);
-	bdev->bd_contains = NULL;
+	if (bdev_is_partition(bdev))
+		__blkdev_put(bdev_whole(bdev), mode, 1);
  out_unlock_bdev:
 	if (!for_part && (mode & FMODE_EXCL))
 		bd_abort_claiming(bdev, holder);
 	mutex_unlock(&bdev->bd_mutex);
 	disk_unblock_events(disk);
- out_put_whole:
- 	if (whole)
-		bdput(whole);
  out_put_disk:
 	module_put(disk->fops->owner);
 	if (need_restart)
@@ -1765,16 +1748,15 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 
 		bdev_write_inode(bdev);
 	}
-	if (bdev->bd_contains == bdev) {
+	if (!bdev_is_partition(bdev)) {
 		if (disk->fops->release)
 			disk->fops->release(disk, mode);
 	}
 	if (!bdev->bd_openers) {
 		disk_put_part(bdev->bd_part);
 		bdev->bd_part = NULL;
-		if (bdev != bdev->bd_contains)
-			victim = bdev->bd_contains;
-		bdev->bd_contains = NULL;
+		if (bdev_is_partition(bdev))
+			victim = bdev_whole(bdev);
 
 		module_put(disk->fops->owner);
 	}
@@ -1789,6 +1771,7 @@ void blkdev_put(struct block_device *bdev, fmode_t mode)
 	mutex_lock(&bdev->bd_mutex);
 
 	if (mode & FMODE_EXCL) {
+		struct block_device *whole = bdev_whole(bdev);
 		bool bdev_free;
 
 		/*
@@ -1799,13 +1782,12 @@ void blkdev_put(struct block_device *bdev, fmode_t mode)
 		spin_lock(&bdev_lock);
 
 		WARN_ON_ONCE(--bdev->bd_holders < 0);
-		WARN_ON_ONCE(--bdev->bd_contains->bd_holders < 0);
+		WARN_ON_ONCE(--whole->bd_holders < 0);
 
-		/* bd_contains might point to self, check in a separate step */
 		if ((bdev_free = !bdev->bd_holders))
 			bdev->bd_holder = NULL;
-		if (!bdev->bd_contains->bd_holders)
-			bdev->bd_contains->bd_holder = NULL;
+		if (!whole->bd_holders)
+			whole->bd_holder = NULL;
 
 		spin_unlock(&bdev_lock);
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d9b69bbde5cc54..041caca25fc787 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -32,7 +32,6 @@ struct block_device {
 #ifdef CONFIG_SYSFS
 	struct list_head	bd_holder_disks;
 #endif
-	struct block_device *	bd_contains;
 	u8			bd_partno;
 	struct hd_struct *	bd_part;
 	/* number of times partitions within this device have been opened. */
@@ -48,6 +47,9 @@ struct block_device {
 	struct mutex		bd_fsfreeze_mutex;
 } __randomize_layout;
 
+#define bdev_whole(_bdev) \
+	((_bdev)->bd_disk->part0.bdev)
+
 /*
  * Block error status values.  See block/blk-core:blk_errors for the details.
  * Alpha cannot write a byte atomically, so we need to use 32-bit value.
-- 
2.29.2



* [PATCH 69/78] block: remove the nr_sects field in struct hd_struct
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (67 preceding siblings ...)
  2020-11-16 14:57 ` [PATCH 68/78] block: remove ->bd_contains Christoph Hellwig
@ 2020-11-16 14:58 ` Christoph Hellwig
  2020-11-20  7:55   ` Hannes Reinecke
  2020-11-16 14:58 ` [PATCH 70/78] block: replace bd_mutex with a per-gendisk mutex Christoph Hellwig
                   ` (10 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Now that the hd_struct always has a block device attached to it, there is
no need for having two size fields that can get out of sync.

Additionally, the field in hd_struct did not use proper serialization,
possibly allowing for torn writes.  By only using the block_device field
this problem also gets fixed.
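
The idea of a single, untorn size field can be sketched in userspace C as
below.  This is a model under stated assumptions, not the kernel
implementation: the patch stores the size in bytes with i_size_write() under
bd_size_lock, whereas this sketch swaps in a C11 atomic to get the same
"no torn 64-bit value" property.  The size_* names and MODEL_SECTOR_SHIFT
are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_SECTOR_SHIFT 9	/* 512-byte sectors */

/* Hypothetical model: the size lives in exactly one place, in bytes. */
struct size_bdev {
	_Atomic int64_t size_bytes;
};

/* Writers convert sectors to bytes and publish the value atomically. */
static void size_set_nr_sectors(struct size_bdev *bdev, uint64_t sectors)
{
	atomic_store(&bdev->size_bytes,
		     (int64_t)(sectors << MODEL_SECTOR_SHIFT));
}

/* Readers derive the sector count from the same single field. */
static uint64_t size_nr_sectors(struct size_bdev *bdev)
{
	return (uint64_t)atomic_load(&bdev->size_bytes) >> MODEL_SECTOR_SHIFT;
}
```

Because every reader goes through the one accessor, there is no second copy
of the size that could disagree with it.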

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bio.c                        |  2 +-
 block/blk-core.c                   |  2 +-
 block/blk.h                        | 53 ----------------------
 block/genhd.c                      | 34 +++++++-------
 block/partitions/core.c            | 17 ++++---
 drivers/block/loop.c               |  1 -
 drivers/block/nbd.c                |  2 +-
 drivers/block/xen-blkback/common.h |  4 +-
 drivers/md/bcache/super.c          |  2 +-
 drivers/s390/block/dasd_ioctl.c    |  4 +-
 drivers/target/target_core_pscsi.c |  7 +--
 fs/block_dev.c                     | 73 +-----------------------------
 fs/f2fs/super.c                    |  2 +-
 fs/pstore/blk.c                    |  2 +-
 include/linux/genhd.h              | 29 +++---------
 kernel/trace/blktrace.c            |  2 +-
 16 files changed, 47 insertions(+), 189 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index fa01bef35bb1fe..0c5269997434d6 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -613,7 +613,7 @@ void guard_bio_eod(struct bio *bio)
 	rcu_read_lock();
 	part = __disk_get_part(bio->bi_disk, bio->bi_partno);
 	if (part)
-		maxsector = part_nr_sects_read(part);
+		maxsector = bdev_nr_sectors(part->bdev);
 	else
 		maxsector = get_capacity(bio->bi_disk);
 	rcu_read_unlock();
diff --git a/block/blk-core.c b/block/blk-core.c
index 2db8bda43b6e6d..988f45094a387b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -755,7 +755,7 @@ static inline int blk_partition_remap(struct bio *bio)
 		goto out;
 
 	if (bio_sectors(bio)) {
-		if (bio_check_eod(bio, part_nr_sects_read(p)))
+		if (bio_check_eod(bio, bdev_nr_sectors(p->bdev)))
 			goto out;
 		bio->bi_iter.bi_sector += p->start_sect;
 		trace_block_bio_remap(bio->bi_disk->queue, bio, part_devt(p),
diff --git a/block/blk.h b/block/blk.h
index d74159bf61eb8f..7d10bb24eb282d 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -386,59 +386,6 @@ static inline void hd_free_part(struct hd_struct *part)
 	percpu_ref_exit(&part->ref);
 }
 
-/*
- * Any access of part->nr_sects which is not protected by partition
- * bd_mutex or gendisk bdev bd_mutex, should be done using this
- * accessor function.
- *
- * Code written along the lines of i_size_read() and i_size_write().
- * CONFIG_PREEMPTION case optimizes the case of UP kernel with preemption
- * on.
- */
-static inline sector_t part_nr_sects_read(struct hd_struct *part)
-{
-#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
-	sector_t nr_sects;
-	unsigned seq;
-	do {
-		seq = read_seqcount_begin(&part->nr_sects_seq);
-		nr_sects = part->nr_sects;
-	} while (read_seqcount_retry(&part->nr_sects_seq, seq));
-	return nr_sects;
-#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
-	sector_t nr_sects;
-
-	preempt_disable();
-	nr_sects = part->nr_sects;
-	preempt_enable();
-	return nr_sects;
-#else
-	return part->nr_sects;
-#endif
-}
-
-/*
- * Should be called with mutex lock held (typically bd_mutex) of partition
- * to provide mutual exlusion among writers otherwise seqcount might be
- * left in wrong state leaving the readers spinning infinitely.
- */
-static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
-{
-#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
-	preempt_disable();
-	write_seqcount_begin(&part->nr_sects_seq);
-	part->nr_sects = size;
-	write_seqcount_end(&part->nr_sects_seq);
-	preempt_enable();
-#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
-	preempt_disable();
-	part->nr_sects = size;
-	preempt_enable();
-#else
-	part->nr_sects = size;
-#endif
-}
-
 int bio_add_hw_page(struct request_queue *q, struct bio *bio,
 		struct page *page, unsigned int len, unsigned int offset,
 		unsigned int max_sectors, bool *same_page);
diff --git a/block/genhd.c b/block/genhd.c
index 40ec5473a21dd2..7832968ce3fbb7 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -38,6 +38,16 @@ static void disk_add_events(struct gendisk *disk);
 static void disk_del_events(struct gendisk *disk);
 static void disk_release_events(struct gendisk *disk);
 
+void set_capacity(struct gendisk *disk, sector_t sectors)
+{
+	struct block_device *bdev = disk->part0.bdev;
+
+	spin_lock(&bdev->bd_size_lock);
+	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
+	spin_unlock(&bdev->bd_size_lock);
+}
+EXPORT_SYMBOL(set_capacity);
+
 /*
  * Set disk capacity and notify if the size is not currently zero and will not
  * be set to zero.  Returns true if a uevent was sent, otherwise false.
@@ -47,11 +57,12 @@ bool set_capacity_and_notify(struct gendisk *disk, sector_t size)
 	sector_t capacity = get_capacity(disk);
 
 	set_capacity(disk, size);
-	revalidate_disk_size(disk, true);
 
 	if (capacity != size && capacity != 0 && size != 0) {
 		char *envp[] = { "RESIZE=1", NULL };
 
+		pr_info("%s: detected capacity change from %lld to %lld\n",
+		       disk->disk_name, capacity, size);
 		kobject_uevent_env(&disk_to_dev(disk)->kobj, KOBJ_CHANGE, envp);
 		return true;
 	}
@@ -246,7 +257,7 @@ struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter)
 		part = rcu_dereference(ptbl->part[piter->idx]);
 		if (!part)
 			continue;
-		if (!part_nr_sects_read(part) &&
+		if (!bdev_nr_sectors(part->bdev) &&
 		    !(piter->flags & DISK_PITER_INCL_EMPTY) &&
 		    !(piter->flags & DISK_PITER_INCL_EMPTY_PART0 &&
 		      piter->idx == 0))
@@ -283,7 +294,7 @@ EXPORT_SYMBOL_GPL(disk_part_iter_exit);
 static inline int sector_in_part(struct hd_struct *part, sector_t sector)
 {
 	return part->start_sect <= sector &&
-		sector < part->start_sect + part_nr_sects_read(part);
+		sector < part->start_sect + bdev_nr_sectors(part->bdev);
 }
 
 /**
@@ -981,7 +992,7 @@ void __init printk_all_partitions(void)
 
 			printk("%s%s %10llu %s %s", is_part0 ? "" : "  ",
 			       bdevt_str(part_devt(part), devt_buf),
-			       (unsigned long long)part_nr_sects_read(part) >> 1
+			       bdev_nr_sectors(part->bdev) >> 1
 			       , disk_name(disk, part->partno, name_buf),
 			       part->info ? part->info->uuid : "");
 			if (is_part0) {
@@ -1074,7 +1085,7 @@ static int show_partition(struct seq_file *seqf, void *v)
 	while ((part = disk_part_iter_next(&piter)))
 		seq_printf(seqf, "%4d  %7d %10llu %s\n",
 			   MAJOR(part_devt(part)), MINOR(part_devt(part)),
-			   (unsigned long long)part_nr_sects_read(part) >> 1,
+			   bdev_nr_sectors(part->bdev) >> 1,
 			   disk_name(sgp, part->partno, buf));
 	disk_part_iter_exit(&piter);
 
@@ -1156,8 +1167,7 @@ ssize_t part_size_show(struct device *dev,
 {
 	struct hd_struct *p = dev_to_part(dev);
 
-	return sprintf(buf, "%llu\n",
-		(unsigned long long)part_nr_sects_read(p));
+	return sprintf(buf, "%llu\n", bdev_nr_sectors(p->bdev));
 }
 
 ssize_t part_stat_show(struct device *dev,
@@ -1616,16 +1626,6 @@ struct gendisk *__alloc_disk_node(int minors, int node_id)
 	ptbl = rcu_dereference_protected(disk->part_tbl, 1);
 	rcu_assign_pointer(ptbl->part[0], &disk->part0);
 
-	/*
-	 * set_capacity() and get_capacity() currently don't use
-	 * seqcounter to read/update the part0->nr_sects. Still init
-	 * the counter as we can read the sectors in IO submission
-	 * patch using seqence counters.
-	 *
-	 * TODO: Ideally set_capacity() and get_capacity() should be
-	 * converted to make use of bd_mutex and sequence counters.
-	 */
-	hd_sects_seq_init(&disk->part0);
 	if (hd_ref_init(&disk->part0))
 		goto out_free_part0;
 
diff --git a/block/partitions/core.c b/block/partitions/core.c
index 8b44f46ab1fbfc..573ef5a03fc104 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -85,6 +85,13 @@ static int (*check_part[])(struct parsed_partitions *) = {
 	NULL
 };
 
+static void bdev_set_nr_sectors(struct block_device *bdev, sector_t sectors)
+{
+	spin_lock(&bdev->bd_size_lock);
+	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
+	spin_unlock(&bdev->bd_size_lock);
+}
+
 static struct parsed_partitions *allocate_partitions(struct gendisk *hd)
 {
 	struct parsed_partitions *state;
@@ -295,7 +302,7 @@ static void hd_struct_free_work(struct work_struct *work)
 	put_device(disk_to_dev(disk));
 
 	part->start_sect = 0;
-	part->nr_sects = 0;
+	bdev_set_nr_sectors(part->bdev, 0);
 	part_stat_set_all(part, 0);
 	put_device(part_to_dev(part));
 }
@@ -410,11 +417,10 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 	if (!p->bdev)
 		goto out_free_stats;
 
-	hd_sects_seq_init(p);
 	pdev = part_to_dev(p);
 
 	p->start_sect = start;
-	p->nr_sects = len;
+	bdev_set_nr_sectors(p->bdev, len);
 	p->partno = partno;
 	p->policy = get_disk_ro(disk);
 
@@ -508,7 +514,7 @@ static bool partition_overlaps(struct gendisk *disk, sector_t start,
 	disk_part_iter_init(&piter, disk, DISK_PITER_INCL_EMPTY);
 	while ((part = disk_part_iter_next(&piter))) {
 		if (part->partno == skip_partno ||
-		    start >= part->start_sect + part->nr_sects ||
+		    start >= part->start_sect + bdev_nr_sectors(part->bdev) ||
 		    start + length <= part->start_sect)
 			continue;
 		overlap = true;
@@ -599,8 +605,7 @@ int bdev_resize_partition(struct block_device *bdev, int partno,
 	if (partition_overlaps(bdev->bd_disk, start, length, partno))
 		goto out_unlock;
 
-	part_nr_sects_write(part, length);
-	bd_set_nr_sectors(bdevp, length);
+	bdev_set_nr_sectors(bdevp, length);
 
 	ret = 0;
 out_unlock:
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 599e94a7e69259..9d2587f6167cd8 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1243,7 +1243,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 	set_capacity(lo->lo_disk, 0);
 	loop_sysfs_exit(lo);
 	if (bdev) {
-		bd_set_nr_sectors(bdev, 0);
 		/* let user-space know about this change */
 		kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE);
 	}
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 45b0423ef2c53d..014683968ce174 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1132,7 +1132,7 @@ static void nbd_bdev_reset(struct block_device *bdev)
 {
 	if (bdev->bd_openers > 1)
 		return;
-	bd_set_nr_sectors(bdev, 0);
+	set_capacity(bdev->bd_disk, 0);
 }
 
 static void nbd_parse_flags(struct nbd_device *nbd)
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index c6ea5d38c509a6..0762db247b41b3 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -358,9 +358,7 @@ struct pending_req {
 };
 
 
-#define vbd_sz(_v)	((_v)->bdev->bd_part ? \
-			 (_v)->bdev->bd_part->nr_sects : \
-			  get_capacity((_v)->bdev->bd_disk))
+#define vbd_sz(_v)	bdev_nr_sectors((_v)->bdev)
 
 #define xen_blkif_get(_b) (atomic_inc(&(_b)->refcnt))
 #define xen_blkif_put(_b)				\
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index d36ccdda16ed2e..ea2b80c3e44c38 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1408,7 +1408,7 @@ static int cached_dev_init(struct cached_dev *dc, unsigned int block_size)
 			q->limits.raid_partial_stripes_expensive;
 
 	ret = bcache_device_init(&dc->disk, block_size,
-			 dc->bdev->bd_part->nr_sects - dc->sb.data_offset,
+			 bdev_nr_sectors(dc->bdev) - dc->sb.data_offset,
 			 dc->bdev, &bcache_cached_ops);
 	if (ret)
 		return ret;
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index 3359559517bfcf..304eba1acf163c 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -54,8 +54,6 @@ dasd_ioctl_enable(struct block_device *bdev)
 		return -ENODEV;
 
 	dasd_enable_device(base);
-	/* Formatting the dasd device can change the capacity. */
-	bd_set_nr_sectors(bdev, get_capacity(base->block->gdp));
 	dasd_put_device(base);
 	return 0;
 }
@@ -88,7 +86,7 @@ dasd_ioctl_disable(struct block_device *bdev)
 	 * Set i_size to zero, since read, write, etc. check against this
 	 * value.
 	 */
-	bd_set_nr_sectors(bdev, 0);
+	set_capacity(bdev->bd_disk, 0);
 	dasd_put_device(base);
 	return 0;
 }
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index 4e37fa9b409d52..a70c33c49f0960 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -1027,12 +1027,7 @@ static u32 pscsi_get_device_type(struct se_device *dev)
 
 static sector_t pscsi_get_blocks(struct se_device *dev)
 {
-	struct pscsi_dev_virt *pdv = PSCSI_DEV(dev);
-
-	if (pdv->pdv_bd && pdv->pdv_bd->bd_part)
-		return pdv->pdv_bd->bd_part->nr_sects;
-
-	return 0;
+	return bdev_nr_sectors(PSCSI_DEV(dev)->pdv_bd);
 }
 
 static void pscsi_req_done(struct request *req, blk_status_t status)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 2348f218d45deb..14b6dbfa9dda2a 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1286,70 +1286,6 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
 EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
 #endif
 
-/**
- * check_disk_size_change - checks for disk size change and adjusts bdev size.
- * @disk: struct gendisk to check
- * @bdev: struct bdev to adjust.
- * @verbose: if %true log a message about a size change if there is any
- *
- * This routine checks to see if the bdev size does not match the disk size
- * and adjusts it if it differs. When shrinking the bdev size, its all caches
- * are freed.
- */
-static void check_disk_size_change(struct gendisk *disk,
-		struct block_device *bdev, bool verbose)
-{
-	loff_t disk_size, bdev_size;
-
-	spin_lock(&bdev->bd_size_lock);
-	disk_size = (loff_t)get_capacity(disk) << 9;
-	bdev_size = i_size_read(bdev->bd_inode);
-	if (disk_size != bdev_size) {
-		if (verbose) {
-			printk(KERN_INFO
-			       "%s: detected capacity change from %lld to %lld\n",
-			       disk->disk_name, bdev_size, disk_size);
-		}
-		i_size_write(bdev->bd_inode, disk_size);
-	}
-	spin_unlock(&bdev->bd_size_lock);
-}
-
-/**
- * revalidate_disk_size - checks for disk size change and adjusts bdev size.
- * @disk: struct gendisk to check
- * @verbose: if %true log a message about a size change if there is any
- *
- * This routine checks to see if the bdev size does not match the disk size
- * and adjusts it if it differs. When shrinking the bdev size, its all caches
- * are freed.
- */
-void revalidate_disk_size(struct gendisk *disk, bool verbose)
-{
-	struct block_device *bdev;
-
-	/*
-	 * Hidden disks don't have associated bdev so there's no point in
-	 * revalidating them.
-	 */
-	if (disk->flags & GENHD_FL_HIDDEN)
-		return;
-
-	bdev = bdget_disk(disk, 0);
-	if (bdev) {
-		check_disk_size_change(disk, bdev, verbose);
-		bdput(bdev);
-	}
-}
-
-void bd_set_nr_sectors(struct block_device *bdev, sector_t sectors)
-{
-	spin_lock(&bdev->bd_size_lock);
-	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
-	spin_unlock(&bdev->bd_size_lock);
-}
-EXPORT_SYMBOL(bd_set_nr_sectors);
-
 static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part);
 
 int bdev_disk_changed(struct block_device *bdev, bool invalidate)
@@ -1383,8 +1319,6 @@ int bdev_disk_changed(struct block_device *bdev, bool invalidate)
 			disk->fops->revalidate_disk(disk);
 	}
 
-	check_disk_size_change(disk, bdev, !invalidate);
-
 	if (get_capacity(disk)) {
 		ret = blk_add_partitions(disk, bdev);
 		if (ret == -EAGAIN)
@@ -1472,10 +1406,8 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 					need_restart = true;
 			}
 
-			if (!ret) {
-				bd_set_nr_sectors(bdev, get_capacity(disk));
+			if (!ret)
 				set_init_blocksize(bdev);
-			}
 
 			/*
 			 * If the device is invalidated, rescan partition
@@ -1497,11 +1429,10 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 				goto out_clear;
 			bdev->bd_part = disk_get_part(disk, bdev->bd_partno);
 			if (!(disk->flags & GENHD_FL_UP) ||
-			    !bdev->bd_part || !bdev->bd_part->nr_sects) {
+			    !bdev->bd_part || !bdev_nr_sectors(bdev)) {
 				ret = -ENXIO;
 				goto out_clear;
 			}
-			bd_set_nr_sectors(bdev, bdev->bd_part->nr_sects);
 			set_init_blocksize(bdev);
 		}
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 00eff2f5180790..d4e7fab352bacb 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -3151,7 +3151,7 @@ static int f2fs_report_zone_cb(struct blk_zone *zone, unsigned int idx,
 static int init_blkz_info(struct f2fs_sb_info *sbi, int devi)
 {
 	struct block_device *bdev = FDEV(devi).bdev;
-	sector_t nr_sectors = bdev->bd_part->nr_sects;
+	sector_t nr_sectors = bdev_nr_sectors(bdev);
 	struct f2fs_report_zones_args rep_zone_arg;
 	int ret;
 
diff --git a/fs/pstore/blk.c b/fs/pstore/blk.c
index fcd5563dde063c..777a26f7bbe2aa 100644
--- a/fs/pstore/blk.c
+++ b/fs/pstore/blk.c
@@ -245,7 +245,7 @@ static struct block_device *psblk_get_bdev(void *holder,
 			return bdev;
 	}
 
-	nr_sects = part_nr_sects_read(bdev->bd_part);
+	nr_sects = bdev_nr_sectors(bdev);
 	if (!nr_sects) {
 		pr_err("not enough space for '%s'\n", blkdev);
 		blkdev_put(bdev, mode);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index ab5fca99764e7a..e01618dfafc05c 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -52,15 +52,6 @@ struct partition_meta_info {
 
 struct hd_struct {
 	sector_t start_sect;
-	/*
-	 * nr_sects is protected by sequence counter. One might extend a
-	 * partition while IO is happening to it and update of nr_sects
-	 * can be non-atomic on 32bit machines with 64bit sector_t.
-	 */
-	sector_t nr_sects;
-#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
-	seqcount_t nr_sects_seq;
-#endif
 	unsigned long stamp;
 	struct disk_stats __percpu *dkstats;
 	struct percpu_ref ref;
@@ -259,13 +250,6 @@ static inline void disk_put_part(struct hd_struct *part)
 		put_device(part_to_dev(part));
 }
 
-static inline void hd_sects_seq_init(struct hd_struct *p)
-{
-#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
-	seqcount_init(&p->nr_sects_seq);
-#endif
-}
-
 /*
  * Smarter partition iterator without context limits.
  */
@@ -323,13 +307,15 @@ static inline sector_t get_start_sect(struct block_device *bdev)
 {
 	return bdev->bd_part->start_sect;
 }
-static inline sector_t get_capacity(struct gendisk *disk)
+
+static inline sector_t bdev_nr_sectors(struct block_device *bdev)
 {
-	return disk->part0.nr_sects;
+	return i_size_read(bdev->bd_inode) >> 9;
 }
-static inline void set_capacity(struct gendisk *disk, sector_t size)
+
+static inline sector_t get_capacity(struct gendisk *disk)
 {
-	disk->part0.nr_sects = size;
+	return bdev_nr_sectors(disk->part0.bdev);
 }
 
 int bdev_disk_changed(struct block_device *bdev, bool invalidate);
@@ -363,10 +349,9 @@ int __register_blkdev(unsigned int major, const char *name,
 	__register_blkdev(major, name, NULL)
 void unregister_blkdev(unsigned int major, const char *name);
 
-void revalidate_disk_size(struct gendisk *disk, bool verbose);
 bool bdev_check_media_change(struct block_device *bdev);
 int __invalidate_device(struct block_device *bdev, bool kill_dirty);
-void bd_set_nr_sectors(struct block_device *bdev, sector_t sectors);
+void set_capacity(struct gendisk *disk, sector_t size);
 
 /* for drivers/char/raw.c: */
 int blkdev_ioctl(struct block_device *, fmode_t, unsigned, unsigned long);
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index f1022945e3460b..7076d588a50d69 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -465,7 +465,7 @@ static void blk_trace_setup_lba(struct blk_trace *bt,
 
 	if (part) {
 		bt->start_lba = part->start_sect;
-		bt->end_lba = part->start_sect + part->nr_sects;
+		bt->end_lba = part->start_sect + bdev_nr_sectors(bdev);
 	} else {
 		bt->start_lba = 0;
 		bt->end_lba = -1ULL;
-- 
2.29.2



* [PATCH 70/78] block: replace bd_mutex with a per-gendisk mutex
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (68 preceding siblings ...)
  2020-11-16 14:58 ` [PATCH 69/78] block: remove the nr_sects field in struct hd_struct Christoph Hellwig
@ 2020-11-16 14:58 ` Christoph Hellwig
  2020-11-20  7:58   ` Hannes Reinecke
  2020-11-16 14:58 ` [PATCH 71/78] block: add a bdev_kobj helper Christoph Hellwig
                   ` (9 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

bd_mutex is primarily used for synchronizing the block device open and
release path, which recurses from partitions to the whole disk device.
The fact that we have two locks makes life unnecessarily complex due
to lock order constraints.  Replace the two levels of locking with a
single mutex in the gendisk structure.
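
As a rough userspace illustration of the single-lock scheme (not kernel
code): pthread mutexes stand in for kernel mutexes, and every sketch_*
name below is an assumption made up for this example.  The real open
path recurses through __blkdev_get(); the sketch collapses that into
one function to show that a single per-disk lock can cover both the
partition and the whole-disk open counts.

```c
/* Userspace analogy only; all sketch_* names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for struct gendisk: owns the one lock. */
struct sketch_disk {
	pthread_mutex_t mutex;		/* plays the role of disk->mutex */
};

/* Stand-in for struct block_device. */
struct sketch_bdev {
	struct sketch_disk *disk;	/* like bdev->bd_disk */
	struct sketch_bdev *whole;	/* whole-disk device, NULL if this is it */
	int openers;			/* like bdev->bd_openers */
};

void sketch_disk_init(struct sketch_disk *disk)
{
	pthread_mutex_init(&disk->mutex, NULL);
}

/*
 * Open a whole disk or a partition.  The per-disk mutex protects both
 * counters, so no partition-then-whole lock ordering is required.
 */
void sketch_open(struct sketch_bdev *bdev)
{
	pthread_mutex_lock(&bdev->disk->mutex);
	if (bdev->whole)
		bdev->whole->openers++;	/* a partition open also pins the disk */
	bdev->openers++;
	pthread_mutex_unlock(&bdev->disk->mutex);
}

/* Undo one sketch_open() under the same single lock. */
void sketch_release(struct sketch_bdev *bdev)
{
	pthread_mutex_lock(&bdev->disk->mutex);
	assert(bdev->openers > 0);
	bdev->openers--;
	if (bdev->whole)
		bdev->whole->openers--;
	pthread_mutex_unlock(&bdev->disk->mutex);
}
```

A caller would set up one sketch_disk, point the whole-disk and
partition sketch_bdev at it, and pair each sketch_open() with a
sketch_release().  With two per-device locks the same operations would
need a fixed acquisition order, which is the constraint this patch
removes.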

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/genhd.c                   |  7 ++--
 block/ioctl.c                   |  4 +-
 block/partitions/core.c         | 22 +++++-----
 drivers/block/loop.c            | 14 +++----
 drivers/block/xen-blkfront.c    |  8 ++--
 drivers/block/zram/zram_drv.c   |  4 +-
 drivers/block/zram/zram_drv.h   |  2 +-
 drivers/md/md.h                 |  7 +---
 drivers/s390/block/dasd_genhd.c |  8 ++--
 drivers/scsi/sd.c               |  4 +-
 fs/block_dev.c                  | 71 +++++++++++++++++----------------
 fs/btrfs/volumes.c              |  2 +-
 fs/super.c                      |  8 ++--
 include/linux/blk_types.h       |  1 -
 include/linux/genhd.h           |  1 +
 15 files changed, 80 insertions(+), 83 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 7832968ce3fbb7..999f7142b04e7d 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1350,7 +1350,7 @@ static const struct attribute_group *disk_attr_groups[] = {
  * original ptbl is freed using RCU callback.
  *
  * LOCKING:
- * Matching bd_mutex locked or the caller is the only user of @disk.
+ * disk->mutex locked or the caller is the only user of @disk.
  */
 static void disk_replace_part_tbl(struct gendisk *disk,
 				  struct disk_part_tbl *new_ptbl)
@@ -1375,7 +1375,7 @@ static void disk_replace_part_tbl(struct gendisk *disk,
  * uses RCU to allow unlocked dereferencing for stats and other stuff.
  *
  * LOCKING:
- * Matching bd_mutex locked or the caller is the only user of @disk.
+ * disk->mutex locked or the caller is the only user of @disk.
  * Might sleep.
  *
  * RETURNS:
@@ -1616,6 +1616,7 @@ struct gendisk *__alloc_disk_node(int minors, int node_id)
 	if (!disk->part0.dkstats)
 		goto out_bdput;
 
+	mutex_init(&disk->mutex);
 	init_rwsem(&disk->lookup_sem);
 	disk->node_id = node_id;
 	if (disk_expand_part_tbl(disk, 0)) {
@@ -1842,7 +1843,7 @@ void disk_unblock_events(struct gendisk *disk)
  * doesn't clear the events from @disk->ev.
  *
  * CONTEXT:
- * If @mask is non-zero must be called with bdev->bd_mutex held.
+ * If @mask is non-zero must be called with disk->mutex held.
  */
 void disk_flush_events(struct gendisk *disk, unsigned int mask)
 {
diff --git a/block/ioctl.c b/block/ioctl.c
index 22f394d118c302..18adf9b16a30f6 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -99,9 +99,9 @@ static int blkdev_reread_part(struct block_device *bdev)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&bdev->bd_disk->mutex);
 	ret = bdev_disk_changed(bdev, false);
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 
 	return ret;
 }
diff --git a/block/partitions/core.c b/block/partitions/core.c
index 573ef5a03fc104..e50b5ca17df550 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -328,7 +328,7 @@ int hd_ref_init(struct hd_struct *part)
 }
 
 /*
- * Must be called either with bd_mutex held, before a disk can be opened or
+ * Must be called either with disk->mutex held, before a disk can be opened or
  * after all disk users are gone.
  */
 void delete_partition(struct hd_struct *part)
@@ -363,7 +363,7 @@ static ssize_t whole_disk_show(struct device *dev,
 static DEVICE_ATTR(whole_disk, 0444, whole_disk_show, NULL);
 
 /*
- * Must be called either with bd_mutex held, before a disk can be opened or
+ * Must be called either with disk->mutex held, before a disk can be opened or
  * after all disk users are gone.
  */
 static struct hd_struct *add_partition(struct gendisk *disk, int partno,
@@ -530,15 +530,15 @@ int bdev_add_partition(struct block_device *bdev, int partno,
 {
 	struct hd_struct *part;
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&bdev->bd_disk->mutex);
 	if (partition_overlaps(bdev->bd_disk, start, length, -1)) {
-		mutex_unlock(&bdev->bd_mutex);
+		mutex_unlock(&bdev->bd_disk->mutex);
 		return -EBUSY;
 	}
 
 	part = add_partition(bdev->bd_disk, partno, start, length,
 			ADDPART_FLAG_NONE, NULL);
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 	return PTR_ERR_OR_ZERO(part);
 }
 
@@ -552,8 +552,7 @@ int bdev_del_partition(struct block_device *bdev, int partno)
 	if (!bdevp)
 		return -ENXIO;
 
-	mutex_lock(&bdevp->bd_mutex);
-	mutex_lock_nested(&bdev->bd_mutex, 1);
+	mutex_lock(&bdev->bd_disk->mutex);
 
 	ret = -ENXIO;
 	part = disk_get_part(bdev->bd_disk, partno);
@@ -570,8 +569,7 @@ int bdev_del_partition(struct block_device *bdev, int partno)
 	delete_partition(part);
 	ret = 0;
 out_unlock:
-	mutex_unlock(&bdev->bd_mutex);
-	mutex_unlock(&bdevp->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 	bdput(bdevp);
 	if (part)
 		disk_put_part(part);
@@ -594,8 +592,7 @@ int bdev_resize_partition(struct block_device *bdev, int partno,
 	if (!bdevp)
 		goto out_put_part;
 
-	mutex_lock(&bdevp->bd_mutex);
-	mutex_lock_nested(&bdev->bd_mutex, 1);
+	mutex_lock(&bdev->bd_disk->mutex);
 
 	ret = -EINVAL;
 	if (start != part->start_sect)
@@ -609,8 +606,7 @@ int bdev_resize_partition(struct block_device *bdev, int partno,
 
 	ret = 0;
 out_unlock:
-	mutex_unlock(&bdevp->bd_mutex);
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 	bdput(bdevp);
 out_put_part:
 	disk_put_part(part);
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 9d2587f6167cd8..91e47c5b52f1cb 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -651,9 +651,9 @@ static void loop_reread_partitions(struct loop_device *lo,
 {
 	int rc;
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&bdev->bd_disk->mutex);
 	rc = bdev_disk_changed(bdev, false);
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 	if (rc)
 		pr_warn("%s: partition scan of loop%d (%s) failed (rc=%d)\n",
 			__func__, lo->lo_number, lo->lo_file_name, rc);
@@ -746,7 +746,7 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
 	mutex_unlock(&loop_ctl_mutex);
 	/*
 	 * We must drop file reference outside of loop_ctl_mutex as dropping
-	 * the file ref can take bd_mutex which creates circular locking
+	 * the file ref can take disk->mutex which creates circular locking
 	 * dependency.
 	 */
 	fput(old_file);
@@ -1258,7 +1258,7 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 	mutex_unlock(&loop_ctl_mutex);
 	if (partscan) {
 		/*
-		 * bd_mutex has been held already in release path, so don't
+		 * disk->mutex has been held already in release path, so don't
 		 * acquire it if this function is called in such case.
 		 *
 		 * If the reread partition isn't from release path, lo_refcnt
@@ -1266,10 +1266,10 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 		 * current holder is released.
 		 */
 		if (!release)
-			mutex_lock(&bdev->bd_mutex);
+			mutex_lock(&bdev->bd_disk->mutex);
 		err = bdev_disk_changed(bdev, false);
 		if (!release)
-			mutex_unlock(&bdev->bd_mutex);
+			mutex_unlock(&bdev->bd_disk->mutex);
 		if (err)
 			pr_warn("%s: partition scan of loop%d failed (rc=%d)\n",
 				__func__, lo_number, err);
@@ -1297,7 +1297,7 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 	 * Need not hold loop_ctl_mutex to fput backing file.
 	 * Calling fput holding loop_ctl_mutex triggers a circular
 	 * lock dependency possibility warning as fput can take
-	 * bd_mutex which is usually taken before loop_ctl_mutex.
+	 * disk->mutex which is usually taken before loop_ctl_mutex.
 	 */
 	if (filp)
 		fput(filp);
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 79521e33d30ed5..5b1f99ca77b734 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2162,7 +2162,7 @@ static void blkfront_closing(struct blkfront_info *info)
 		return;
 	}
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&info->gd->mutex);
 
 	if (bdev->bd_openers) {
 		xenbus_dev_error(xbdev, -EBUSY,
@@ -2173,7 +2173,7 @@ static void blkfront_closing(struct blkfront_info *info)
 		xenbus_frontend_closed(xbdev);
 	}
 
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&info->gd->mutex);
 	bdput(bdev);
 }
 
@@ -2536,7 +2536,7 @@ static int blkfront_remove(struct xenbus_device *xbdev)
 	 * isn't closed yet, we let release take care of it.
 	 */
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&info->gd->mutex);
 	info = disk->private_data;
 
 	dev_warn(disk_to_dev(disk),
@@ -2551,7 +2551,7 @@ static int blkfront_remove(struct xenbus_device *xbdev)
 		mutex_unlock(&blkfront_mutex);
 	}
 
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&info->gd->mutex);
 	bdput(bdev);
 
 	return 0;
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d00b5761ec0b21..0b156f09e208df 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1762,12 +1762,12 @@ static ssize_t reset_store(struct device *dev,
 	if (!bdev)
 		return -ENOMEM;
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&zram->disk->mutex);
 	if (bdev->bd_openers)
 		ret = -EBUSY;
 	else
 		zram_reset_device(zram);
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&zram->disk->mutex);
 	bdput(bdev);
 
 	return ret ? ret : len;
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 712354a4207c77..b300632c17c172 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -111,7 +111,7 @@ struct zram {
 	/*
 	 * zram is claimed so open request will be failed
 	 */
-	bool claim; /* Protected by bdev->bd_mutex */
+	bool claim; /* Protected by bdev->bd_disk->mutex */
 	struct file *backing_dev;
 #ifdef CONFIG_ZRAM_WRITEBACK
 	spinlock_t wb_limit_lock;
diff --git a/drivers/md/md.h b/drivers/md/md.h
index ccfb69868c2ec9..28712d3498de2c 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -394,11 +394,8 @@ struct mddev {
 	/* 'open_mutex' avoids races between 'md_open' and 'do_md_stop', so
 	 * that we are never stopping an array while it is open.
 	 * 'reconfig_mutex' protects all other reconfiguration.
-	 * These locks are separate due to conflicting interactions
-	 * with bdev->bd_mutex.
-	 * Lock ordering is:
-	 *  reconfig_mutex -> bd_mutex
-	 *  bd_mutex -> open_mutex:  e.g. __blkdev_get -> md_open
+	 * These locks are separate due to historically conflicting
+	 * interactions with block layer locks.
 	 */
 	struct mutex			open_mutex;
 	struct mutex			reconfig_mutex;
diff --git a/drivers/s390/block/dasd_genhd.c b/drivers/s390/block/dasd_genhd.c
index a9698fba9b76ce..7b5f475b500e8c 100644
--- a/drivers/s390/block/dasd_genhd.c
+++ b/drivers/s390/block/dasd_genhd.c
@@ -109,9 +109,9 @@ int dasd_scan_partitions(struct dasd_block *block)
 		return -ENODEV;
 	}
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&bdev->bd_disk->mutex);
 	rc = bdev_disk_changed(bdev, false);
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 	if (rc)
 		DBF_DEV_EVENT(DBF_ERR, block->base,
 				"scan partitions error, rc %d", rc);
@@ -145,9 +145,9 @@ void dasd_destroy_partitions(struct dasd_block *block)
 	bdev = block->bdev;
 	block->bdev = NULL;
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&bdev->bd_disk->mutex);
 	blk_drop_partitions(bdev);
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 
 	/* Matching blkdev_put to the blkdev_get in dasd_scan_partitions. */
 	blkdev_put(bdev, FMODE_READ);
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 679c2c02504763..68c752ef3ed575 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1398,7 +1398,7 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
  *	In the latter case @inode and @filp carry an abridged amount
  *	of information as noted above.
  *
- *	Locking: called with bdev->bd_mutex held.
+ *	Locking: called with bdev->bd_disk->mutex held.
  **/
 static int sd_open(struct block_device *bdev, fmode_t mode)
 {
@@ -1474,7 +1474,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
  *	Note: may block (uninterruptible) if error recovery is underway
  *	on this disk.
  *
- *	Locking: called with bdev->bd_mutex held.
+ *	Locking: called with bdev->bd_disk->mutex held.
  **/
 static void sd_release(struct gendisk *disk, fmode_t mode)
 {
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 14b6dbfa9dda2a..4b59ace9632f65 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -803,7 +803,6 @@ static void init_once(void *foo)
 	struct block_device *bdev = &ei->bdev;
 
 	memset(bdev, 0, sizeof(*bdev));
-	mutex_init(&bdev->bd_mutex);
 #ifdef CONFIG_SYSFS
 	INIT_LIST_HEAD(&bdev->bd_holder_disks);
 #endif
@@ -1204,7 +1203,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	struct bd_holder_disk *holder;
 	int ret = 0;
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&bdev->bd_disk->mutex);
 
 	WARN_ON_ONCE(!bdev->bd_holder);
 
@@ -1249,7 +1248,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 out_free:
 	kfree(holder);
 out_unlock:
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(bd_link_disk_holder);
@@ -1268,7 +1267,7 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
 {
 	struct bd_holder_disk *holder;
 
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&bdev->bd_disk->mutex);
 
 	holder = bd_find_holder_disk(bdev, disk);
 
@@ -1281,7 +1280,7 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
 		kfree(holder);
 	}
 
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 }
 EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
 #endif
@@ -1293,7 +1292,7 @@ int bdev_disk_changed(struct block_device *bdev, bool invalidate)
 	struct gendisk *disk = bdev->bd_disk;
 	int ret;
 
-	lockdep_assert_held(&bdev->bd_mutex);
+	lockdep_assert_held(&bdev->bd_disk->mutex);
 
 	clear_bit(GD_NEED_PART_SCAN, &bdev->bd_disk->state);
 
@@ -1357,13 +1356,6 @@ static int bdev_get_gendisk(struct gendisk *disk)
 	return -ENXIO;
 }
 
-/*
- * bd_mutex locking:
- *
- *  mutex_lock(part->bd_mutex)
- *    mutex_lock_nested(whole->bd_mutex, 1)
- */
-
 static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 		int for_part)
 {
@@ -1377,15 +1369,18 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 	if (ret)
 		goto out;
 
-	if (!for_part && (mode & FMODE_EXCL)) {
-		WARN_ON_ONCE(!holder);
-		ret = bd_prepare_to_claim(bdev, holder);
-		if (ret)
-			goto out_put_disk;
+	if (!for_part) {
+		if (mode & FMODE_EXCL) {
+			WARN_ON_ONCE(!holder);
+			ret = bd_prepare_to_claim(bdev, holder);
+			if (ret)
+				goto out_put_disk;
+		}
+
+		disk_block_events(disk);
+		mutex_lock(&disk->mutex);
 	}
 
-	disk_block_events(disk);
-	mutex_lock_nested(&bdev->bd_mutex, for_part);
 	if (!bdev->bd_openers) {
 		first_open = true;
 
@@ -1470,10 +1465,14 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 			unblock_events = false;
 		}
 	}
-	mutex_unlock(&bdev->bd_mutex);
 
-	if (unblock_events)
-		disk_unblock_events(disk);
+	if (!for_part) {
+		mutex_unlock(&disk->mutex);
+
+		if (unblock_events)
+			disk_unblock_events(disk);
+	}
+
 
 	/* only one opener holds the module reference */
 	if (!first_open)
@@ -1486,10 +1485,12 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 	if (bdev_is_partition(bdev))
 		__blkdev_put(bdev_whole(bdev), mode, 1);
  out_unlock_bdev:
-	if (!for_part && (mode & FMODE_EXCL))
-		bd_abort_claiming(bdev, holder);
-	mutex_unlock(&bdev->bd_mutex);
-	disk_unblock_events(disk);
+	if (!for_part) {
+		if (mode & FMODE_EXCL)
+			bd_abort_claiming(bdev, holder);
+		mutex_unlock(&disk->mutex);
+		disk_unblock_events(disk);
+	}
  out_put_disk:
 	module_put(disk->fops->owner);
 	if (need_restart)
@@ -1668,9 +1669,10 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 	if (bdev->bd_openers == 1)
 		sync_blockdev(bdev);
 
-	mutex_lock_nested(&bdev->bd_mutex, for_part);
 	if (for_part)
 		bdev->bd_part_count--;
+	else
+		mutex_lock(&disk->mutex);
 
 	if (!--bdev->bd_openers) {
 		WARN_ON_ONCE(bdev->bd_holders);
@@ -1691,7 +1693,8 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 
 		module_put(disk->fops->owner);
 	}
-	mutex_unlock(&bdev->bd_mutex);
+	if (!for_part)
+		mutex_unlock(&disk->mutex);
 	bdput(bdev);
 	if (victim)
 		__blkdev_put(victim, mode, 1);
@@ -1699,7 +1702,7 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 
 void blkdev_put(struct block_device *bdev, fmode_t mode)
 {
-	mutex_lock(&bdev->bd_mutex);
+	mutex_lock(&bdev->bd_disk->mutex);
 
 	if (mode & FMODE_EXCL) {
 		struct block_device *whole = bdev_whole(bdev);
@@ -1707,7 +1710,7 @@ void blkdev_put(struct block_device *bdev, fmode_t mode)
 
 		/*
 		 * Release a claim on the device.  The holder fields
-		 * are protected with bdev_lock.  bd_mutex is to
+		 * are protected with bdev_lock.  disk->mutex is to
 		 * synchronize disk_holder unlinking.
 		 */
 		spin_lock(&bdev_lock);
@@ -1739,7 +1742,7 @@ void blkdev_put(struct block_device *bdev, fmode_t mode)
 	 */
 	disk_flush_events(bdev->bd_disk, DISK_EVENT_MEDIA_CHANGE);
 
-	mutex_unlock(&bdev->bd_mutex);
+	mutex_unlock(&bdev->bd_disk->mutex);
 
 	__blkdev_put(bdev, mode, 0);
 }
@@ -2039,10 +2042,10 @@ void iterate_bdevs(void (*func)(struct block_device *, void *), void *arg)
 		old_inode = inode;
 		bdev = I_BDEV(inode);
 
-		mutex_lock(&bdev->bd_mutex);
+		mutex_lock(&bdev->bd_disk->mutex);
 		if (bdev->bd_openers)
 			func(bdev, arg);
-		mutex_unlock(&bdev->bd_mutex);
+		mutex_unlock(&bdev->bd_disk->mutex);
 
 		spin_lock(&blockdev_superblock->s_inode_list_lock);
 	}
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a6406b3b8c2b4f..ce43732f945f45 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1237,7 +1237,7 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 	lockdep_assert_held(&uuid_mutex);
 	/*
 	 * The device_list_mutex cannot be taken here in case opening the
-	 * underlying device takes further locks like bd_mutex.
+	 * underlying device takes further locks like disk->mutex.
 	 *
 	 * We also don't need the lock here as this is called during mount and
 	 * exclusion is provided by uuid_mutex
diff --git a/fs/super.c b/fs/super.c
index 98bb0629ee108e..b327a82bc1946b 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1328,9 +1328,9 @@ int get_tree_bdev(struct fs_context *fc,
 		}
 
 		/*
-		 * s_umount nests inside bd_mutex during
+		 * s_umount nests inside disk->mutex during
 		 * __invalidate_device().  blkdev_put() acquires
-		 * bd_mutex and can't be called under s_umount.  Drop
+		 * disk->mutex and can't be called under s_umount.  Drop
 		 * s_umount temporarily.  This is safe as we're
 		 * holding an active reference.
 		 */
@@ -1403,9 +1403,9 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
 		}
 
 		/*
-		 * s_umount nests inside bd_mutex during
+		 * s_umount nests inside disk->mutex during
 		 * __invalidate_device().  blkdev_put() acquires
-		 * bd_mutex and can't be called under s_umount.  Drop
+		 * disk->mutex and can't be called under s_umount.  Drop
 		 * s_umount temporarily.  This is safe as we're
 		 * holding an active reference.
 		 */
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 041caca25fc787..0735e335ca6c0a 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -24,7 +24,6 @@ struct block_device {
 	int			bd_openers;
 	struct inode *		bd_inode;	/* will die */
 	struct super_block *	bd_super;
-	struct mutex		bd_mutex;	/* open/close mutex */
 	void *			bd_claiming;
 	void *			bd_holder;
 	int			bd_holders;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index e01618dfafc05c..bc0469cc8fb0dc 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -186,6 +186,7 @@ struct gendisk {
 	unsigned long state;
 #define GD_NEED_PART_SCAN		0
 	struct rw_semaphore lookup_sem;
+	struct mutex mutex;		/* open/close mutex */
 	struct kobject *slave_dir;
 
 	struct timer_rand_state *random;
-- 
2.29.2



* [PATCH 71/78] block: add a bdev_kobj helper
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (69 preceding siblings ...)
  2020-11-16 14:58 ` [PATCH 70/78] block: replace bd_mutex with a per-gendisk mutex Christoph Hellwig
@ 2020-11-16 14:58 ` Christoph Hellwig
  2020-11-20  7:59   ` Hannes Reinecke
  2020-11-16 14:58 ` [PATCH 72/78] block: use disk_part_iter_exit in disk_part_iter_next Christoph Hellwig
                   ` (8 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Add a little helper to find the kobject for a struct block_device.
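
As a rough userspace illustration of what the helper expands to (the
struct definitions below are cut-down stand-ins rather than the real
kernel layouts, and bdev_sysfs_name is a hypothetical caller that is not
part of this patch):

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; only the fields used here. */
struct kobject { const char *name; };
struct device { struct kobject kobj; };
struct hd_struct { struct device __dev; };
struct block_device { struct hd_struct *bd_part; };

/* Mirrors the existing genhd.h accessor for the embedded struct device. */
#define part_to_dev(part)	(&((part)->__dev))

/* Same shape as the helper added by this patch. */
#define bdev_kobj(_bdev) \
	(&part_to_dev((_bdev)->bd_part)->kobj)

/* Hypothetical caller: name of the kobject backing a block device. */
static const char *bdev_sysfs_name(struct block_device *bdev)
{
	return bdev_kobj(bdev)->name;
}
```

Callers then no longer need a local struct hd_struct pointer just to
reach the kobject, which is what the conversions below take advantage of.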

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/bcache/super.c |  7 ++-----
 drivers/md/md.c           |  4 +---
 fs/btrfs/sysfs.c          | 15 +++------------
 include/linux/blk_types.h |  3 +++
 4 files changed, 9 insertions(+), 20 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index ea2b80c3e44c38..f6edacc81527c7 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1447,8 +1447,7 @@ static int register_bdev(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
 		goto err;
 
 	err = "error creating kobject";
-	if (kobject_add(&dc->disk.kobj, &part_to_dev(bdev->bd_part)->kobj,
-			"bcache"))
+	if (kobject_add(&dc->disk.kobj, bdev_kobj(bdev), "bcache"))
 		goto err;
 	if (bch_cache_accounting_add_kobjs(&dc->accounting, &dc->disk.kobj))
 		goto err;
@@ -2342,9 +2341,7 @@ static int register_cache(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
 		goto err;
 	}
 
-	if (kobject_add(&ca->kobj,
-			&part_to_dev(bdev->bd_part)->kobj,
-			"bcache")) {
+	if (kobject_add(&ca->kobj, bdev_kobj(bdev), "bcache")) {
 		err = "error calling kobject_add";
 		ret = -ENOMEM;
 		goto out;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index b2edf5e0f965b5..7ce6047c856ea2 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2414,7 +2414,6 @@ EXPORT_SYMBOL(md_integrity_add_rdev);
 static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
 {
 	char b[BDEVNAME_SIZE];
-	struct kobject *ko;
 	int err;
 
 	/* prevent duplicates */
@@ -2477,9 +2476,8 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
 	if ((err = kobject_add(&rdev->kobj, &mddev->kobj, "dev-%s", b)))
 		goto fail;
 
-	ko = &part_to_dev(rdev->bdev->bd_part)->kobj;
 	/* failure here is OK */
-	err = sysfs_create_link(&rdev->kobj, ko, "block");
+	err = sysfs_create_link(&rdev->kobj, bdev_kobj(rdev->bdev), "block");
 	rdev->sysfs_state = sysfs_get_dirent_safe(rdev->kobj.sd, "state");
 	rdev->sysfs_unack_badblocks =
 		sysfs_get_dirent_safe(rdev->kobj.sd, "unacknowledged_bad_blocks");
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 279d9262b676d4..24b6c6dc69000a 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1232,8 +1232,6 @@ int btrfs_sysfs_add_space_info_type(struct btrfs_fs_info *fs_info,
 
 void btrfs_sysfs_remove_device(struct btrfs_device *device)
 {
-	struct hd_struct *disk;
-	struct kobject *disk_kobj;
 	struct kobject *devices_kobj;
 
 	/*
@@ -1243,11 +1241,8 @@ void btrfs_sysfs_remove_device(struct btrfs_device *device)
 	devices_kobj = device->fs_info->fs_devices->devices_kobj;
 	ASSERT(devices_kobj);
 
-	if (device->bdev) {
-		disk = device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
-		sysfs_remove_link(devices_kobj, disk_kobj->name);
-	}
+	if (device->bdev)
+		sysfs_remove_link(devices_kobj, bdev_kobj(device->bdev)->name);
 
 	if (device->devid_kobj.state_initialized) {
 		kobject_del(&device->devid_kobj);
@@ -1353,11 +1348,7 @@ int btrfs_sysfs_add_device(struct btrfs_device *device)
 	nofs_flag = memalloc_nofs_save();
 
 	if (device->bdev) {
-		struct hd_struct *disk;
-		struct kobject *disk_kobj;
-
-		disk = device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+		struct kobject *disk_kobj = bdev_kobj(device->bdev);
 
 		ret = sysfs_create_link(devices_kobj, disk_kobj, disk_kobj->name);
 		if (ret) {
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 0735e335ca6c0a..5a5ccacb804cdb 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -49,6 +49,9 @@ struct block_device {
 #define bdev_whole(_bdev) \
 	((_bdev)->bd_disk->part0.bdev)
 
+#define bdev_kobj(_bdev) \
+	(&part_to_dev((_bdev)->bd_part)->kobj)
+
 /*
  * Block error status values.  See block/blk-core:blk_errors for the details.
  * Alpha cannot write a byte atomically, so we need to use 32-bit value.
-- 
2.29.2



* [PATCH 72/78] block: use disk_part_iter_exit in disk_part_iter_next
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (70 preceding siblings ...)
  2020-11-16 14:58 ` [PATCH 71/78] block: add a bdev_kobj helper Christoph Hellwig
@ 2020-11-16 14:58 ` Christoph Hellwig
  2020-11-20  7:59   ` Hannes Reinecke
  2020-11-16 14:58 ` [PATCH 73/78] block: use put_device in put_disk Christoph Hellwig
                   ` (7 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Call disk_part_iter_exit in disk_part_iter_next instead of duplicating
the functionality.
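
The pattern can be sketched in userspace roughly as follows.  This is
only a model under simplifying assumptions: struct part, struct
part_iter and the plain integer reference count are made up for
illustration, and there is no RCU or reverse iteration as in the real
iterator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a partition with a plain reference count. */
struct part { int refs; };

struct part_iter {
	struct part **tbl;	/* partition table, entries may be NULL */
	int len;
	int idx;
	struct part *cur;	/* partition currently held by the iterator */
};

/* Drop the reference on the current partition, if any. */
static void part_iter_exit(struct part_iter *it)
{
	if (it->cur)
		it->cur->refs--;
	it->cur = NULL;
}

/* Advance to the next non-NULL entry, reusing the exit helper. */
static struct part *part_iter_next(struct part_iter *it)
{
	/* put the last partition instead of open coding the put */
	part_iter_exit(it);

	for (; it->idx < it->len; it->idx++) {
		struct part *p = it->tbl[it->idx];

		if (!p)
			continue;
		p->refs++;
		it->cur = p;
		it->idx++;
		break;
	}
	return it->cur;
}
```

With a single place that drops the reference, a later change to how the
reference is held only has to touch the exit helper.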

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/genhd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 999f7142b04e7d..56bc37e98ed852 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -230,8 +230,7 @@ struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter)
 	int inc, end;
 
 	/* put the last partition */
-	disk_put_part(piter->part);
-	piter->part = NULL;
+	disk_part_iter_exit(piter);
 
 	/* get part_tbl */
 	rcu_read_lock();
-- 
2.29.2



* [PATCH 73/78] block: use put_device in put_disk
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (71 preceding siblings ...)
  2020-11-16 14:58 ` [PATCH 72/78] block: use disk_part_iter_exit in disk_part_iter_next Christoph Hellwig
@ 2020-11-16 14:58 ` Christoph Hellwig
  2020-11-20  8:02   ` Hannes Reinecke
  2020-11-16 14:58 ` [PATCH 74/78] block: merge struct block_device and struct hd_struct Christoph Hellwig
                   ` (6 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use put_device to put the device instead of poking into the internals
and using kobject_put.
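
For readers less familiar with the driver core, the layering looks
roughly like the userspace mock below.  The types are reduced to a bare
counter and the mock kobject_put only decrements it (the real one also
releases the object when the count hits zero); the gendisk layout is an
assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Mock types: a kobject is reduced to its reference count. */
struct kobject { int refcount; };
struct device { struct kobject kobj; };
struct gendisk { struct device dev; };	/* simplified layout */

#define disk_to_dev(disk)	(&(disk)->dev)

/* Mock: only decrements, does not free anything. */
static void kobject_put(struct kobject *kobj)
{
	if (kobj)
		kobj->refcount--;
}

/* The device-level put wraps the kobject-level one. */
static void put_device(struct device *dev)
{
	if (dev)
		kobject_put(&dev->kobj);
}

/* After this patch: go through the device API, not the kobject. */
static void put_disk(struct gendisk *disk)
{
	if (disk)
		put_device(disk_to_dev(disk));
}
```

The effect on the reference count is the same; the caller just no longer
depends on struct device embedding a kobject.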

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/genhd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/genhd.c b/block/genhd.c
index 56bc37e98ed852..f1e20ec1b62887 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1659,7 +1659,7 @@ EXPORT_SYMBOL(__alloc_disk_node);
 void put_disk(struct gendisk *disk)
 {
 	if (disk)
-		kobject_put(&disk_to_dev(disk)->kobj);
+		put_device(disk_to_dev(disk));
 }
 EXPORT_SYMBOL(put_disk);
 
-- 
2.29.2



* [PATCH 74/78] block: merge struct block_device and struct hd_struct
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (72 preceding siblings ...)
  2020-11-16 14:58 ` [PATCH 73/78] block: use put_device in put_disk Christoph Hellwig
@ 2020-11-16 14:58 ` Christoph Hellwig
  2020-11-20  8:58   ` Hannes Reinecke
  2020-11-16 14:58 ` [PATCH 75/78] block: stop using bdget_disk for partition 0 Christoph Hellwig
                   ` (5 subsequent siblings)
  79 siblings, 1 reply; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Instead of having two structures that represent each block device with
different lifetime rules, merge them into a single one.  This also
greatly simplifies the reference counting rules, as we can use the inode
reference count as the main reference count for the new struct
block_device, with the device model reference front ending it for device
model interaction.  The percpu refcount in struct hd_struct is entirely
gone given that struct block_device must be opened and thus valid for
the duration of the I/O.
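
To illustrate what the merged structure means for the I/O path, here is
a small userspace model of the partition remap.  It is a sketch, not the
kernel code: bd_nr_sectors, lookup_part and remap_sector are made-up
names (the kernel derives the size from the inode and looks the entry up
under RCU), while bd_start_sect and bd_partno follow the fields used in
the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* One structure now carries what used to be split across two. */
struct block_device {
	sector_t bd_start_sect;	/* offset of the partition on the disk */
	sector_t bd_nr_sectors;	/* assumption: stand-in for the inode size */
	uint8_t bd_partno;
};

struct gendisk {
	struct block_device *part_tbl[4];	/* simplified fixed table */
	int len;
};

/* Bounds-checked table lookup, no reference taken (no RCU here). */
static struct block_device *lookup_part(struct gendisk *disk, int partno)
{
	if (partno < 0 || partno >= disk->len)
		return NULL;
	return disk->part_tbl[partno];
}

/*
 * Turn a partition-relative sector into a whole-disk sector.
 * Returns 0 on success, -1 if the partition does not exist or the
 * range [*sector, *sector + nr) runs past the end of the partition.
 */
static int remap_sector(struct gendisk *disk, int partno,
		sector_t *sector, sector_t nr)
{
	struct block_device *p = lookup_part(disk, partno);

	if (!p)
		return -1;
	if (nr && (nr > p->bd_nr_sectors ||
		   *sector > p->bd_nr_sectors - nr))
		return -1;
	*sector += p->bd_start_sect;
	return 0;
}
```

Note that the lookup hands back the block device directly, so there is
no second object whose lifetime has to be tracked separately.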

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bio.c                        |   6 +-
 block/blk-cgroup.c                 |   9 +-
 block/blk-core.c                   |  85 +++++-----
 block/blk-flush.c                  |   2 +-
 block/blk-lib.c                    |   2 +-
 block/blk-merge.c                  |   6 +-
 block/blk-mq.c                     |  11 +-
 block/blk-mq.h                     |   5 +-
 block/blk.h                        |  38 ++---
 block/genhd.c                      | 242 +++++++++++------------------
 block/ioctl.c                      |   4 +-
 block/partitions/core.c            | 221 +++++++-------------------
 drivers/block/drbd/drbd_receiver.c |   2 +-
 drivers/block/drbd/drbd_worker.c   |   2 +-
 drivers/block/zram/zram_drv.c      |   2 +-
 drivers/md/bcache/request.c        |   4 +-
 drivers/md/dm.c                    |   8 +-
 drivers/md/md.c                    |   4 +-
 drivers/nvme/target/admin-cmd.c    |  20 +--
 drivers/s390/block/dasd.c          |   8 +-
 fs/block_dev.c                     |  68 +++-----
 fs/ext4/super.c                    |  18 +--
 fs/ext4/sysfs.c                    |  10 +-
 fs/f2fs/checkpoint.c               |   5 +-
 fs/f2fs/f2fs.h                     |   2 +-
 fs/f2fs/super.c                    |   6 +-
 fs/f2fs/sysfs.c                    |   9 --
 include/linux/blk_types.h          |  23 ++-
 include/linux/blkdev.h             |  13 +-
 include/linux/genhd.h              |  67 ++------
 include/linux/part_stat.h          |  17 +-
 init/do_mounts.c                   |  20 +--
 kernel/trace/blktrace.c            |  54 ++-----
 33 files changed, 351 insertions(+), 642 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 0c5269997434d6..4df1ecd53baf8f 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -608,12 +608,12 @@ void bio_truncate(struct bio *bio, unsigned new_size)
 void guard_bio_eod(struct bio *bio)
 {
 	sector_t maxsector;
-	struct hd_struct *part;
+	struct block_device *part;
 
 	rcu_read_lock();
-	part = __disk_get_part(bio->bi_disk, bio->bi_partno);
+	part = __bdget_disk(bio->bi_disk, bio->bi_partno);
 	if (part)
-		maxsector = bdev_nr_sectors(part->bdev);
+		maxsector = bdev_nr_sectors(part);
 	else
 		maxsector = get_capacity(bio->bi_disk);
 	rcu_read_unlock();
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 4c0ae0f6bce02d..fb5076223f10f2 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -820,9 +820,9 @@ static void blkcg_fill_root_iostats(void)
 
 	class_dev_iter_init(&iter, &block_class, NULL, &disk_type);
 	while ((dev = class_dev_iter_next(&iter))) {
-		struct gendisk *disk = dev_to_disk(dev);
-		struct hd_struct *part = disk_get_part(disk, 0);
-		struct blkcg_gq *blkg = blk_queue_root_blkg(disk->queue);
+		struct block_device *bdev = dev_to_bdev(dev);
+		struct blkcg_gq *blkg =
+			blk_queue_root_blkg(bdev->bd_disk->queue);
 		struct blkg_iostat tmp;
 		int cpu;
 
@@ -830,7 +830,7 @@ static void blkcg_fill_root_iostats(void)
 		for_each_possible_cpu(cpu) {
 			struct disk_stats *cpu_dkstats;
 
-			cpu_dkstats = per_cpu_ptr(part->dkstats, cpu);
+			cpu_dkstats = per_cpu_ptr(bdev->bd_stats, cpu);
 			tmp.ios[BLKG_IOSTAT_READ] +=
 				cpu_dkstats->ios[STAT_READ];
 			tmp.ios[BLKG_IOSTAT_WRITE] +=
@@ -849,7 +849,6 @@ static void blkcg_fill_root_iostats(void)
 			blkg_iostat_set(&blkg->iostat.cur, &tmp);
 			u64_stats_update_end(&blkg->iostat.sync);
 		}
-		disk_put_part(part);
 	}
 }
 
diff --git a/block/blk-core.c b/block/blk-core.c
index 988f45094a387b..192607c98e87c5 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -119,7 +119,7 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
 	rq->tag = BLK_MQ_NO_TAG;
 	rq->internal_tag = BLK_MQ_NO_TAG;
 	rq->start_time_ns = ktime_get_ns();
-	rq->part = NULL;
+	rq->bdev = NULL;
 	refcount_set(&rq->ref, 1);
 	blk_crypto_rq_set_defaults(rq);
 }
@@ -666,9 +666,9 @@ static int __init setup_fail_make_request(char *str)
 }
 __setup("fail_make_request=", setup_fail_make_request);
 
-static bool should_fail_request(struct hd_struct *part, unsigned int bytes)
+static bool should_fail_request(struct block_device *bdev, unsigned int bytes)
 {
-	return part->make_it_fail && should_fail(&fail_make_request, bytes);
+	return bdev->bd_make_it_fail && should_fail(&fail_make_request, bytes);
 }
 
 static int __init fail_make_request_debugfs(void)
@@ -683,19 +683,19 @@ late_initcall(fail_make_request_debugfs);
 
 #else /* CONFIG_FAIL_MAKE_REQUEST */
 
-static inline bool should_fail_request(struct hd_struct *part,
-					unsigned int bytes)
+static inline bool should_fail_request(struct block_device *bdev,
+		unsigned int bytes)
 {
 	return false;
 }
 
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
-static inline bool bio_check_ro(struct bio *bio, struct hd_struct *part)
+static inline bool bio_check_ro(struct bio *bio, struct block_device *bdev)
 {
 	const int op = bio_op(bio);
 
-	if (part->policy && op_is_write(op)) {
+	if (bdev->bd_policy && op_is_write(op)) {
 		char b[BDEVNAME_SIZE];
 
 		if (op_is_flush(bio->bi_opf) && !bio_sectors(bio))
@@ -703,7 +703,7 @@ static inline bool bio_check_ro(struct bio *bio, struct hd_struct *part)
 
 		WARN_ONCE(1,
 		       "Trying to write to read-only block-device %s (partno %d)\n",
-			bio_devname(bio, b), part->partno);
+			bio_devname(bio, b), bdev->bd_partno);
 		/* Older lvm-tools actually trigger this */
 		return false;
 	}
@@ -713,7 +713,7 @@ static inline bool bio_check_ro(struct bio *bio, struct hd_struct *part)
 
 static noinline int should_fail_bio(struct bio *bio)
 {
-	if (should_fail_request(&bio->bi_disk->part0, bio->bi_iter.bi_size))
+	if (should_fail_request(bio->bi_disk->part0, bio->bi_iter.bi_size))
 		return -EIO;
 	return 0;
 }
@@ -742,11 +742,11 @@ static inline int bio_check_eod(struct bio *bio, sector_t maxsector)
  */
 static inline int blk_partition_remap(struct bio *bio)
 {
-	struct hd_struct *p;
+	struct block_device *p;
 	int ret = -EIO;
 
 	rcu_read_lock();
-	p = __disk_get_part(bio->bi_disk, bio->bi_partno);
+	p = __bdget_disk(bio->bi_disk, bio->bi_partno);
 	if (unlikely(!p))
 		goto out;
 	if (unlikely(should_fail_request(p, bio->bi_iter.bi_size)))
@@ -755,11 +755,11 @@ static inline int blk_partition_remap(struct bio *bio)
 		goto out;
 
 	if (bio_sectors(bio)) {
-		if (bio_check_eod(bio, bdev_nr_sectors(p->bdev)))
+		if (bio_check_eod(bio, bdev_nr_sectors(p)))
 			goto out;
-		bio->bi_iter.bi_sector += p->start_sect;
-		trace_block_bio_remap(bio->bi_disk->queue, bio, part_devt(p),
-				      bio->bi_iter.bi_sector - p->start_sect);
+		bio->bi_iter.bi_sector += p->bd_start_sect;
+		trace_block_bio_remap(bio->bi_disk->queue, bio, p->bd_dev,
+				      bio->bi_iter.bi_sector - p->bd_start_sect);
 	}
 	bio->bi_partno = 0;
 	ret = 0;
@@ -829,7 +829,7 @@ static noinline_for_stack bool submit_bio_checks(struct bio *bio)
 		if (unlikely(blk_partition_remap(bio)))
 			goto end_io;
 	} else {
-		if (unlikely(bio_check_ro(bio, &bio->bi_disk->part0)))
+		if (unlikely(bio_check_ro(bio, bio->bi_disk->part0)))
 			goto end_io;
 		if (unlikely(bio_check_eod(bio, get_capacity(bio->bi_disk))))
 			goto end_io;
@@ -1201,7 +1201,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 		return ret;
 
 	if (rq->rq_disk &&
-	    should_fail_request(&rq->rq_disk->part0, blk_rq_bytes(rq)))
+	    should_fail_request(rq->rq_disk->part0, blk_rq_bytes(rq)))
 		return BLK_STS_IOERR;
 
 	if (blk_crypto_insert_cloned_request(rq))
@@ -1260,30 +1260,29 @@ unsigned int blk_rq_err_bytes(const struct request *rq)
 }
 EXPORT_SYMBOL_GPL(blk_rq_err_bytes);
 
-static void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
+static void update_io_ticks(struct block_device *part, unsigned long now,
+		bool end)
 {
 	unsigned long stamp;
 again:
-	stamp = READ_ONCE(part->stamp);
+	stamp = READ_ONCE(part->bd_stamp);
 	if (unlikely(stamp != now)) {
-		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
+		if (likely(cmpxchg(&part->bd_stamp, stamp, now) == stamp))
 			__part_stat_add(part, io_ticks, end ? now - stamp : 1);
 	}
-	if (part->partno) {
-		part = &part_to_disk(part)->part0;
+	if (part->bd_partno) {
+		part = part->bd_disk->part0;
 		goto again;
 	}
 }
 
 static void blk_account_io_completion(struct request *req, unsigned int bytes)
 {
-	if (req->part && blk_do_io_stat(req)) {
+	if (req->bdev && blk_do_io_stat(req)) {
 		const int sgrp = op_stat_group(req_op(req));
-		struct hd_struct *part;
 
 		part_stat_lock();
-		part = req->part;
-		part_stat_add(part, sectors[sgrp], bytes >> 9);
+		part_stat_add(req->bdev, sectors[sgrp], bytes >> 9);
 		part_stat_unlock();
 	}
 }
@@ -1295,20 +1294,15 @@ void blk_account_io_done(struct request *req, u64 now)
 	 * normal IO on queueing nor completion.  Accounting the
 	 * containing request is enough.
 	 */
-	if (req->part && blk_do_io_stat(req) &&
+	if (req->bdev && blk_do_io_stat(req) &&
 	    !(req->rq_flags & RQF_FLUSH_SEQ)) {
 		const int sgrp = op_stat_group(req_op(req));
-		struct hd_struct *part;
 
 		part_stat_lock();
-		part = req->part;
-
-		update_io_ticks(part, jiffies, true);
-		part_stat_inc(part, ios[sgrp]);
-		part_stat_add(part, nsecs[sgrp], now - req->start_time_ns);
+		update_io_ticks(req->bdev, jiffies, true);
+		part_stat_inc(req->bdev, ios[sgrp]);
+		part_stat_add(req->bdev, nsecs[sgrp], now - req->start_time_ns);
 		part_stat_unlock();
-
-		hd_struct_put(part);
 	}
 }
 
@@ -1317,15 +1311,15 @@ void blk_account_io_start(struct request *rq)
 	if (!blk_do_io_stat(rq))
 		return;
 
-	rq->part = disk_map_sector_rcu(rq->rq_disk, blk_rq_pos(rq));
+	rq->bdev = disk_map_sector_rcu(rq->rq_disk, blk_rq_pos(rq));
 
 	part_stat_lock();
-	update_io_ticks(rq->part, jiffies, false);
+	update_io_ticks(rq->bdev, jiffies, false);
 	part_stat_unlock();
 }
 
-static unsigned long __part_start_io_acct(struct hd_struct *part,
-					  unsigned int sectors, unsigned int op)
+static unsigned long __part_start_io_acct(struct block_device *part,
+		unsigned int sectors, unsigned int op)
 {
 	const int sgrp = op_stat_group(op);
 	unsigned long now = READ_ONCE(jiffies);
@@ -1340,8 +1334,8 @@ static unsigned long __part_start_io_acct(struct hd_struct *part,
 	return now;
 }
 
-unsigned long part_start_io_acct(struct gendisk *disk, struct hd_struct **part,
-				 struct bio *bio)
+unsigned long part_start_io_acct(struct gendisk *disk,
+		struct block_device **part, struct bio *bio)
 {
 	*part = disk_map_sector_rcu(disk, bio->bi_iter.bi_sector);
 
@@ -1352,11 +1346,11 @@ EXPORT_SYMBOL_GPL(part_start_io_acct);
 unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
 				 unsigned int op)
 {
-	return __part_start_io_acct(&disk->part0, sectors, op);
+	return __part_start_io_acct(disk->part0, sectors, op);
 }
 EXPORT_SYMBOL(disk_start_io_acct);
 
-static void __part_end_io_acct(struct hd_struct *part, unsigned int op,
+static void __part_end_io_acct(struct block_device *part, unsigned int op,
 			       unsigned long start_time)
 {
 	const int sgrp = op_stat_group(op);
@@ -1370,18 +1364,17 @@ static void __part_end_io_acct(struct hd_struct *part, unsigned int op,
 	part_stat_unlock();
 }
 
-void part_end_io_acct(struct hd_struct *part, struct bio *bio,
+void part_end_io_acct(struct block_device *part, struct bio *bio,
 		      unsigned long start_time)
 {
 	__part_end_io_acct(part, bio_op(bio), start_time);
-	hd_struct_put(part);
 }
 EXPORT_SYMBOL_GPL(part_end_io_acct);
 
 void disk_end_io_acct(struct gendisk *disk, unsigned int op,
 		      unsigned long start_time)
 {
-	__part_end_io_acct(&disk->part0, op, start_time);
+	__part_end_io_acct(disk->part0, op, start_time);
 }
 EXPORT_SYMBOL(disk_end_io_acct);
 
diff --git a/block/blk-flush.c b/block/blk-flush.c
index e32958f0b68750..9507dcdd58814c 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -139,7 +139,7 @@ static void blk_flush_queue_rq(struct request *rq, bool add_front)
 
 static void blk_account_io_flush(struct request *rq)
 {
-	struct hd_struct *part = &rq->rq_disk->part0;
+	struct block_device *part = rq->rq_disk->part0;
 
 	part_stat_lock();
 	part_stat_inc(part, ios[STAT_FLUSH]);
diff --git a/block/blk-lib.c b/block/blk-lib.c
index e90614fd8d6a42..752f9c7220622a 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -65,7 +65,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 
 	/* In case the discard request is in a partition */
 	if (bdev_is_partition(bdev))
-		part_offset = bdev->bd_part->start_sect;
+		part_offset = bdev->bd_start_sect;
 
 	while (nr_sects) {
 		sector_t granularity_aligned_lba, req_sects;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index bcf5e458060337..3ec0d322e4a769 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -681,10 +681,8 @@ static void blk_account_io_merge_request(struct request *req)
 {
 	if (blk_do_io_stat(req)) {
 		part_stat_lock();
-		part_stat_inc(req->part, merges[op_stat_group(req_op(req))]);
+		part_stat_inc(req->bdev, merges[op_stat_group(req_op(req))]);
 		part_stat_unlock();
-
-		hd_struct_put(req->part);
 	}
 }
 
@@ -906,7 +904,7 @@ static void blk_account_io_merge_bio(struct request *req)
 		return;
 
 	part_stat_lock();
-	part_stat_inc(req->part, merges[op_stat_group(req_op(req))]);
+	part_stat_inc(req->bdev, merges[op_stat_group(req_op(req))]);
 	part_stat_unlock();
 }
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 55bcee5dc0320c..a28475e6405de9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -95,7 +95,7 @@ static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
 }
 
 struct mq_inflight {
-	struct hd_struct *part;
+	struct block_device *part;
 	unsigned int inflight[2];
 };
 
@@ -105,13 +105,14 @@ static bool blk_mq_check_inflight(struct blk_mq_hw_ctx *hctx,
 {
 	struct mq_inflight *mi = priv;
 
-	if (rq->part == mi->part && blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
+	if (rq->bdev == mi->part && blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
 		mi->inflight[rq_data_dir(rq)]++;
 
 	return true;
 }
 
-unsigned int blk_mq_in_flight(struct request_queue *q, struct hd_struct *part)
+unsigned int blk_mq_in_flight(struct request_queue *q,
+		struct block_device *part)
 {
 	struct mq_inflight mi = { .part = part };
 
@@ -120,7 +121,7 @@ unsigned int blk_mq_in_flight(struct request_queue *q, struct hd_struct *part)
 	return mi.inflight[0] + mi.inflight[1];
 }
 
-void blk_mq_in_flight_rw(struct request_queue *q, struct hd_struct *part,
+void blk_mq_in_flight_rw(struct request_queue *q, struct block_device *part,
 			 unsigned int inflight[2])
 {
 	struct mq_inflight mi = { .part = part };
@@ -300,7 +301,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	INIT_HLIST_NODE(&rq->hash);
 	RB_CLEAR_NODE(&rq->rb_node);
 	rq->rq_disk = NULL;
-	rq->part = NULL;
+	rq->bdev = NULL;
 #ifdef CONFIG_BLK_RQ_ALLOC_TIME
 	rq->alloc_time_ns = alloc_time_ns;
 #endif
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a52703c98b7736..395fbc6c59d1eb 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -182,8 +182,9 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
 	return hctx->nr_ctx && hctx->tags;
 }
 
-unsigned int blk_mq_in_flight(struct request_queue *q, struct hd_struct *part);
-void blk_mq_in_flight_rw(struct request_queue *q, struct hd_struct *part,
+unsigned int blk_mq_in_flight(struct request_queue *q,
+		struct block_device *bdev);
+void blk_mq_in_flight_rw(struct request_queue *q, struct block_device *bdev,
 			 unsigned int inflight[2]);
 
 static inline void blk_mq_put_dispatch_budget(struct request_queue *q)
diff --git a/block/blk.h b/block/blk.h
index 7d10bb24eb282d..90dd2047c6cd29 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -215,7 +215,15 @@ static inline void elevator_exit(struct request_queue *q,
 	__elevator_exit(q, e);
 }
 
-struct hd_struct *__disk_get_part(struct gendisk *disk, int partno);
+static inline struct block_device *__bdget_disk(struct gendisk *disk,
+		int partno)
+{
+	struct disk_part_tbl *ptbl = rcu_dereference(disk->part_tbl);
+
+	if (unlikely(partno < 0 || partno >= ptbl->len))
+		return NULL;
+	return rcu_dereference(ptbl->part[partno]);
+}
 
 ssize_t part_size_show(struct device *dev, struct device_attribute *attr,
 		char *buf);
@@ -348,43 +356,21 @@ void blk_queue_free_zone_bitmaps(struct request_queue *q);
 static inline void blk_queue_free_zone_bitmaps(struct request_queue *q) {}
 #endif
 
-struct hd_struct *disk_map_sector_rcu(struct gendisk *disk, sector_t sector);
+struct block_device *disk_map_sector_rcu(struct gendisk *disk, sector_t sector);
 
-int blk_alloc_devt(struct hd_struct *part, dev_t *devt);
+int blk_alloc_devt(struct block_device *bdev, dev_t *devt);
 void blk_free_devt(dev_t devt);
 char *disk_name(struct gendisk *hd, int partno, char *buf);
 #define ADDPART_FLAG_NONE	0
 #define ADDPART_FLAG_RAID	1
 #define ADDPART_FLAG_WHOLEDISK	2
-void delete_partition(struct hd_struct *part);
+void delete_partition(struct block_device *part);
 int bdev_add_partition(struct block_device *bdev, int partno,
 		sector_t start, sector_t length);
 int bdev_del_partition(struct block_device *bdev, int partno);
 int bdev_resize_partition(struct block_device *bdev, int partno,
 		sector_t start, sector_t length);
 int disk_expand_part_tbl(struct gendisk *disk, int target);
-int hd_ref_init(struct hd_struct *part);
-
-/* no need to get/put refcount of part0 */
-static inline int hd_struct_try_get(struct hd_struct *part)
-{
-	if (part->partno)
-		return percpu_ref_tryget_live(&part->ref);
-	return 1;
-}
-
-static inline void hd_struct_put(struct hd_struct *part)
-{
-	if (part->partno)
-		percpu_ref_put(&part->ref);
-}
-
-static inline void hd_free_part(struct hd_struct *part)
-{
-	free_percpu(part->dkstats);
-	kfree(part->info);
-	percpu_ref_exit(&part->ref);
-}
 
 int bio_add_hw_page(struct request_queue *q, struct bio *bio,
 		struct page *page, unsigned int len, unsigned int offset,
diff --git a/block/genhd.c b/block/genhd.c
index f1e20ec1b62887..5dcb8b8902daae 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -40,7 +40,7 @@ static void disk_release_events(struct gendisk *disk);
 
 void set_capacity(struct gendisk *disk, sector_t sectors)
 {
-	struct block_device *bdev = disk->part0.bdev;
+	struct block_device *bdev = disk->part0;
 
 	spin_lock(&bdev->bd_size_lock);
 	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
@@ -93,13 +93,14 @@ const char *bdevname(struct block_device *bdev, char *buf)
 }
 EXPORT_SYMBOL(bdevname);
 
-static void part_stat_read_all(struct hd_struct *part, struct disk_stats *stat)
+static void part_stat_read_all(struct block_device *part,
+		struct disk_stats *stat)
 {
 	int cpu;
 
 	memset(stat, 0, sizeof(struct disk_stats));
 	for_each_possible_cpu(cpu) {
-		struct disk_stats *ptr = per_cpu_ptr(part->dkstats, cpu);
+		struct disk_stats *ptr = per_cpu_ptr(part->bd_stats, cpu);
 		int group;
 
 		for (group = 0; group < NR_STAT_GROUPS; group++) {
@@ -113,7 +114,7 @@ static void part_stat_read_all(struct hd_struct *part, struct disk_stats *stat)
 	}
 }
 
-static unsigned int part_in_flight(struct hd_struct *part)
+static unsigned int part_in_flight(struct block_device *part)
 {
 	unsigned int inflight = 0;
 	int cpu;
@@ -128,7 +129,8 @@ static unsigned int part_in_flight(struct hd_struct *part)
 	return inflight;
 }
 
-static void part_in_flight_rw(struct hd_struct *part, unsigned int inflight[2])
+static void part_in_flight_rw(struct block_device *part,
+		unsigned int inflight[2])
 {
 	int cpu;
 
@@ -144,42 +146,6 @@ static void part_in_flight_rw(struct hd_struct *part, unsigned int inflight[2])
 		inflight[1] = 0;
 }
 
-struct hd_struct *__disk_get_part(struct gendisk *disk, int partno)
-{
-	struct disk_part_tbl *ptbl = rcu_dereference(disk->part_tbl);
-
-	if (unlikely(partno < 0 || partno >= ptbl->len))
-		return NULL;
-	return rcu_dereference(ptbl->part[partno]);
-}
-
-/**
- * disk_get_part - get partition
- * @disk: disk to look partition from
- * @partno: partition number
- *
- * Look for partition @partno from @disk.  If found, increment
- * reference count and return it.
- *
- * CONTEXT:
- * Don't care.
- *
- * RETURNS:
- * Pointer to the found partition on success, NULL if not found.
- */
-struct hd_struct *disk_get_part(struct gendisk *disk, int partno)
-{
-	struct hd_struct *part;
-
-	rcu_read_lock();
-	part = __disk_get_part(disk, partno);
-	if (part)
-		get_device(part_to_dev(part));
-	rcu_read_unlock();
-
-	return part;
-}
-
 /**
  * disk_part_iter_init - initialize partition iterator
  * @piter: iterator to initialize
@@ -224,7 +190,7 @@ EXPORT_SYMBOL_GPL(disk_part_iter_init);
  * CONTEXT:
  * Don't care.
  */
-struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter)
+struct block_device *disk_part_iter_next(struct disk_part_iter *piter)
 {
 	struct disk_part_tbl *ptbl;
 	int inc, end;
@@ -251,19 +217,18 @@ struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter)
 
 	/* iterate to the next partition */
 	for (; piter->idx != end; piter->idx += inc) {
-		struct hd_struct *part;
+		struct block_device *part;
 
 		part = rcu_dereference(ptbl->part[piter->idx]);
 		if (!part)
 			continue;
-		if (!bdev_nr_sectors(part->bdev) &&
+		if (!bdev_nr_sectors(part) &&
 		    !(piter->flags & DISK_PITER_INCL_EMPTY) &&
 		    !(piter->flags & DISK_PITER_INCL_EMPTY_PART0 &&
 		      piter->idx == 0))
 			continue;
 
-		get_device(part_to_dev(part));
-		piter->part = part;
+		piter->part = bdgrab(part);
 		piter->idx += inc;
 		break;
 	}
@@ -285,15 +250,16 @@ EXPORT_SYMBOL_GPL(disk_part_iter_next);
  */
 void disk_part_iter_exit(struct disk_part_iter *piter)
 {
-	disk_put_part(piter->part);
+	if (piter->part)
+		bdput(piter->part);
 	piter->part = NULL;
 }
 EXPORT_SYMBOL_GPL(disk_part_iter_exit);
 
-static inline int sector_in_part(struct hd_struct *part, sector_t sector)
+static inline int sector_in_part(struct block_device *part, sector_t sector)
 {
-	return part->start_sect <= sector &&
-		sector < part->start_sect + bdev_nr_sectors(part->bdev);
+	return part->bd_start_sect <= sector &&
+		sector < part->bd_start_sect + bdev_nr_sectors(part);
 }
 
 /**
@@ -313,36 +279,28 @@ static inline int sector_in_part(struct hd_struct *part, sector_t sector)
  * Found partition on success, part0 is returned if no partition matches
  * or the matched partition is being deleted.
  */
-struct hd_struct *disk_map_sector_rcu(struct gendisk *disk, sector_t sector)
+struct block_device *disk_map_sector_rcu(struct gendisk *disk, sector_t sector)
 {
 	struct disk_part_tbl *ptbl;
-	struct hd_struct *part;
+	struct block_device *part;
 	int i;
 
 	rcu_read_lock();
 	ptbl = rcu_dereference(disk->part_tbl);
 
 	part = rcu_dereference(ptbl->last_lookup);
-	if (part && sector_in_part(part, sector) && hd_struct_try_get(part))
+	if (part && sector_in_part(part, sector))
 		goto out_unlock;
 
 	for (i = 1; i < ptbl->len; i++) {
 		part = rcu_dereference(ptbl->part[i]);
-
 		if (part && sector_in_part(part, sector)) {
-			/*
-			 * only live partition can be cached for lookup,
-			 * so use-after-free on cached & deleting partition
-			 * can be avoided
-			 */
-			if (!hd_struct_try_get(part))
-				break;
 			rcu_assign_pointer(ptbl->last_lookup, part);
 			goto out_unlock;
 		}
 	}
 
-	part = &disk->part0;
+	part = disk->part0;
 out_unlock:
 	rcu_read_unlock();
 	return part;
@@ -557,7 +515,7 @@ static int blk_mangle_minor(int minor)
 
 /**
  * blk_alloc_devt - allocate a dev_t for a partition
- * @part: partition to allocate dev_t for
+ * @bdev: partition to allocate dev_t for
  * @devt: out parameter for resulting dev_t
  *
  * Allocate a dev_t for block device.
@@ -569,14 +527,14 @@ static int blk_mangle_minor(int minor)
  * CONTEXT:
  * Might sleep.
  */
-int blk_alloc_devt(struct hd_struct *part, dev_t *devt)
+int blk_alloc_devt(struct block_device *bdev, dev_t *devt)
 {
-	struct gendisk *disk = part_to_disk(part);
+	struct gendisk *disk = bdev->bd_disk;
 	int idx;
 
 	/* in consecutive minor range? */
-	if (part->partno < disk->minors) {
-		*devt = MKDEV(disk->major, disk->first_minor + part->partno);
+	if (bdev->bd_partno < disk->minors) {
+		*devt = MKDEV(disk->major, disk->first_minor + bdev->bd_partno);
 		return 0;
 	}
 
@@ -633,7 +591,7 @@ static void register_disk(struct device *parent, struct gendisk *disk,
 {
 	struct device *ddev = disk_to_dev(disk);
 	struct disk_part_iter piter;
-	struct hd_struct *part;
+	struct block_device *part;
 	int err;
 
 	ddev->parent = parent;
@@ -665,7 +623,8 @@ static void register_disk(struct device *parent, struct gendisk *disk,
 	 */
 	pm_runtime_set_memalloc_noio(ddev, true);
 
-	disk->part0.holder_dir = kobject_create_and_add("holders", &ddev->kobj);
+	disk->part0->bd_holder_dir =
+		kobject_create_and_add("holders", &ddev->kobj);
 	disk->slave_dir = kobject_create_and_add("slaves", &ddev->kobj);
 
 	if (disk->flags & GENHD_FL_HIDDEN) {
@@ -682,7 +641,7 @@ static void register_disk(struct device *parent, struct gendisk *disk,
 	/* announce possible partitions */
 	disk_part_iter_init(&piter, disk, 0);
 	while ((part = disk_part_iter_next(&piter)))
-		kobject_uevent(&part_to_dev(part)->kobj, KOBJ_ADD);
+		kobject_uevent(bdev_kobj(part), KOBJ_ADD);
 	disk_part_iter_exit(&piter);
 
 	if (disk->queue->backing_dev_info->dev) {
@@ -731,7 +690,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 
 	disk->flags |= GENHD_FL_UP;
 
-	retval = blk_alloc_devt(&disk->part0, &devt);
+	retval = blk_alloc_devt(disk->part0, &devt);
 	if (retval) {
 		WARN_ON(1);
 		return;
@@ -758,7 +717,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 		ret = bdi_register(bdi, "%u:%u", MAJOR(devt), MINOR(devt));
 		WARN_ON(ret);
 		bdi_set_owner(bdi, dev);
-		bdev_add(disk->part0.bdev, devt);
+		bdev_add(disk->part0, devt);
 	}
 	register_disk(parent, disk, groups);
 	if (register_queue)
@@ -788,14 +747,8 @@ void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
 }
 EXPORT_SYMBOL(device_add_disk_no_queue_reg);
 
-static void invalidate_partition(struct gendisk *disk, int partno)
+static void invalidate_partition(struct block_device *bdev)
 {
-	struct block_device *bdev;
-
-	bdev = bdget_disk(disk, partno);
-	if (!bdev)
-		return;
-
 	fsync_bdev(bdev);
 	__invalidate_device(bdev, true);
 
@@ -804,7 +757,6 @@ static void invalidate_partition(struct gendisk *disk, int partno)
 	 * as last inode reference is dropped.
 	 */
 	remove_inode_hash(bdev->bd_inode);
-	bdput(bdev);
 }
 
 /**
@@ -829,7 +781,7 @@ static void invalidate_partition(struct gendisk *disk, int partno)
 void del_gendisk(struct gendisk *disk)
 {
 	struct disk_part_iter piter;
-	struct hd_struct *part;
+	struct block_device *part;
 
 	might_sleep();
 
@@ -848,12 +800,12 @@ void del_gendisk(struct gendisk *disk)
 	disk_part_iter_init(&piter, disk,
 			     DISK_PITER_INCL_EMPTY | DISK_PITER_REVERSE);
 	while ((part = disk_part_iter_next(&piter))) {
-		invalidate_partition(disk, part->partno);
+		invalidate_partition(part);
 		delete_partition(part);
 	}
 	disk_part_iter_exit(&piter);
 
-	invalidate_partition(disk, 0);
+	invalidate_partition(disk->part0);
 	set_capacity(disk, 0);
 	disk->flags &= ~GENHD_FL_UP;
 	up_write(&disk->lookup_sem);
@@ -870,11 +822,11 @@ void del_gendisk(struct gendisk *disk)
 
 	blk_unregister_queue(disk);
 
-	kobject_put(disk->part0.holder_dir);
+	kobject_put(disk->part0->bd_holder_dir);
 	kobject_put(disk->slave_dir);
 
-	part_stat_set_all(&disk->part0, 0);
-	disk->part0.stamp = 0;
+	part_stat_set_all(disk->part0, 0);
+	disk->part0->bd_stamp = 0;
 	if (!sysfs_deprecated)
 		sysfs_remove_link(block_depr, dev_name(disk_to_dev(disk)));
 	pm_runtime_set_memalloc_noio(disk_to_dev(disk), false);
@@ -942,13 +894,13 @@ void blk_request_module(dev_t devt)
  */
 struct block_device *bdget_disk(struct gendisk *disk, int partno)
 {
-	struct hd_struct *part;
 	struct block_device *bdev = NULL;
 
-	part = disk_get_part(disk, partno);
-	if (part)
-		bdev = bdget_part(part);
-	disk_put_part(part);
+	rcu_read_lock();
+	bdev = __bdget_disk(disk, partno);
+	if (bdev)
+		bdgrab(bdev);
+	rcu_read_unlock();
 
 	return bdev;
 }
@@ -968,7 +920,7 @@ void __init printk_all_partitions(void)
 	while ((dev = class_dev_iter_next(&iter))) {
 		struct gendisk *disk = dev_to_disk(dev);
 		struct disk_part_iter piter;
-		struct hd_struct *part;
+		struct block_device *part;
 		char name_buf[BDEVNAME_SIZE];
 		char devt_buf[BDEVT_SIZE];
 
@@ -987,13 +939,14 @@ void __init printk_all_partitions(void)
 		 */
 		disk_part_iter_init(&piter, disk, DISK_PITER_INCL_PART0);
 		while ((part = disk_part_iter_next(&piter))) {
-			bool is_part0 = part == &disk->part0;
+			bool is_part0 = part == disk->part0;
 
 			printk("%s%s %10llu %s %s", is_part0 ? "" : "  ",
-			       bdevt_str(part_devt(part), devt_buf),
-			       bdev_nr_sectors(part->bdev) >> 1
-			       , disk_name(disk, part->partno, name_buf),
-			       part->info ? part->info->uuid : "");
+			       bdevt_str(part->bd_dev, devt_buf),
+			       bdev_nr_sectors(part) >> 1,
+			       disk_name(disk, part->bd_partno, name_buf),
+			       part->bd_meta_info ?
+					part->bd_meta_info->uuid : "");
 			if (is_part0) {
 				if (dev->parent && dev->parent->driver)
 					printk(" driver: %s\n",
@@ -1069,7 +1022,7 @@ static int show_partition(struct seq_file *seqf, void *v)
 {
 	struct gendisk *sgp = v;
 	struct disk_part_iter piter;
-	struct hd_struct *part;
+	struct block_device *part;
 	char buf[BDEVNAME_SIZE];
 
 	/* Don't show non-partitionable removeable devices or empty devices */
@@ -1083,9 +1036,9 @@ static int show_partition(struct seq_file *seqf, void *v)
 	disk_part_iter_init(&piter, sgp, DISK_PITER_INCL_PART0);
 	while ((part = disk_part_iter_next(&piter)))
 		seq_printf(seqf, "%4d  %7d %10llu %s\n",
-			   MAJOR(part_devt(part)), MINOR(part_devt(part)),
-			   bdev_nr_sectors(part->bdev) >> 1,
-			   disk_name(sgp, part->partno, buf));
+			   MAJOR(part->bd_dev), MINOR(part->bd_dev),
+			   bdev_nr_sectors(part) >> 1,
+			   disk_name(sgp, part->bd_partno, buf));
 	disk_part_iter_exit(&piter);
 
 	return 0;
@@ -1164,24 +1117,22 @@ static ssize_t disk_ro_show(struct device *dev,
 ssize_t part_size_show(struct device *dev,
 		       struct device_attribute *attr, char *buf)
 {
-	struct hd_struct *p = dev_to_part(dev);
-
-	return sprintf(buf, "%llu\n", bdev_nr_sectors(p->bdev));
+	return sprintf(buf, "%llu\n", bdev_nr_sectors(dev_to_bdev(dev)));
 }
 
 ssize_t part_stat_show(struct device *dev,
 		       struct device_attribute *attr, char *buf)
 {
-	struct hd_struct *p = dev_to_part(dev);
-	struct request_queue *q = part_to_disk(p)->queue;
+	struct block_device *bdev = dev_to_bdev(dev);
+	struct request_queue *q = bdev->bd_disk->queue;
 	struct disk_stats stat;
 	unsigned int inflight;
 
-	part_stat_read_all(p, &stat);
+	part_stat_read_all(bdev, &stat);
 	if (queue_is_mq(q))
-		inflight = blk_mq_in_flight(q, p);
+		inflight = blk_mq_in_flight(q, bdev);
 	else
-		inflight = part_in_flight(p);
+		inflight = part_in_flight(bdev);
 
 	return sprintf(buf,
 		"%8lu %8lu %8llu %8u "
@@ -1216,14 +1167,14 @@ ssize_t part_stat_show(struct device *dev,
 ssize_t part_inflight_show(struct device *dev, struct device_attribute *attr,
 			   char *buf)
 {
-	struct hd_struct *p = dev_to_part(dev);
-	struct request_queue *q = part_to_disk(p)->queue;
+	struct block_device *bdev = dev_to_bdev(dev);
+	struct request_queue *q = bdev->bd_disk->queue;
 	unsigned int inflight[2];
 
 	if (queue_is_mq(q))
-		blk_mq_in_flight_rw(q, p, inflight);
+		blk_mq_in_flight_rw(q, bdev, inflight);
 	else
-		part_in_flight_rw(p, inflight);
+		part_in_flight_rw(bdev, inflight);
 
 	return sprintf(buf, "%8u %8u\n", inflight[0], inflight[1]);
 }
@@ -1271,16 +1222,14 @@ static DEVICE_ATTR(badblocks, 0644, disk_badblocks_show, disk_badblocks_store);
 ssize_t part_fail_show(struct device *dev,
 		       struct device_attribute *attr, char *buf)
 {
-	struct hd_struct *p = dev_to_part(dev);
-
-	return sprintf(buf, "%d\n", p->make_it_fail);
+	return sprintf(buf, "%d\n", dev_to_bdev(dev)->make_it_fail);
 }
 
 ssize_t part_fail_store(struct device *dev,
 			struct device_attribute *attr,
 			const char *buf, size_t count)
 {
-	struct hd_struct *p = dev_to_part(dev);
+	struct block_device *p = dev_to_bdev(dev);
 	int i;
 
 	if (count > 0 && sscanf(buf, "%d", &i) > 0)
@@ -1441,9 +1390,9 @@ static void disk_release(struct device *dev)
 	disk_release_events(disk);
 	kfree(disk->random);
 	disk_replace_part_tbl(disk, NULL);
-	hd_free_part(&disk->part0);
 	if (disk->queue)
 		blk_put_queue(disk->queue);
+	bdput(disk->part0);
 	kfree(disk);
 }
 struct class block_class = {
@@ -1479,7 +1428,7 @@ static int diskstats_show(struct seq_file *seqf, void *v)
 {
 	struct gendisk *gp = v;
 	struct disk_part_iter piter;
-	struct hd_struct *hd;
+	struct block_device *hd;
 	char buf[BDEVNAME_SIZE];
 	unsigned int inflight;
 	struct disk_stats stat;
@@ -1507,8 +1456,8 @@ static int diskstats_show(struct seq_file *seqf, void *v)
 			   "%lu %lu %lu %u "
 			   "%lu %u"
 			   "\n",
-			   MAJOR(part_devt(hd)), MINOR(part_devt(hd)),
-			   disk_name(gp, hd->partno, buf),
+			   MAJOR(hd->bd_dev), MINOR(hd->bd_dev),
+			   disk_name(gp, hd->bd_partno, buf),
 			   stat.ios[STAT_READ],
 			   stat.merges[STAT_READ],
 			   stat.sectors[STAT_READ],
@@ -1564,9 +1513,9 @@ dev_t blk_lookup_devt(const char *name, int partno)
 	struct device *dev;
 
 	class_dev_iter_init(&iter, &block_class, NULL, &disk_type);
-	while ((dev = class_dev_iter_next(&iter))) {
+	while ((dev = class_dev_iter_next(&iter)) && !devt) {
 		struct gendisk *disk = dev_to_disk(dev);
-		struct hd_struct *part;
+		struct block_device *bdev;
 
 		if (strcmp(dev_name(dev), name))
 			continue;
@@ -1577,15 +1526,13 @@ dev_t blk_lookup_devt(const char *name, int partno)
 			 */
 			devt = MKDEV(MAJOR(dev->devt),
 				     MINOR(dev->devt) + partno);
-			break;
+		} else {
+			rcu_read_lock();
+			bdev = __bdget_disk(disk, partno);
+			if (bdev)
+				devt = bdev->bd_dev;
+			rcu_read_unlock();
 		}
-		part = disk_get_part(disk, partno);
-		if (part) {
-			devt = part_devt(part);
-			disk_put_part(part);
-			break;
-		}
-		disk_put_part(part);
 	}
 	class_dev_iter_exit(&iter);
 	return devt;
@@ -1607,27 +1554,18 @@ struct gendisk *__alloc_disk_node(int minors, int node_id)
 	if (!disk)
 		return NULL;
 
-	disk->part0.bdev = bdev_alloc(disk, 0);
-	if (!disk->part0.bdev)
+	disk->part0 = bdev_alloc(disk, 0);
+	if (!disk->part0)
 		goto out_free_disk;
 
-	disk->part0.dkstats = alloc_percpu(struct disk_stats);
-	if (!disk->part0.dkstats)
-		goto out_bdput;
-
 	mutex_init(&disk->mutex);
 	init_rwsem(&disk->lookup_sem);
 	disk->node_id = node_id;
-	if (disk_expand_part_tbl(disk, 0)) {
-		free_percpu(disk->part0.dkstats);
-		goto out_free_disk;
-	}
+	if (disk_expand_part_tbl(disk, 0))
+		goto out_bdput;
 
 	ptbl = rcu_dereference_protected(disk->part_tbl, 1);
-	rcu_assign_pointer(ptbl->part[0], &disk->part0);
-
-	if (hd_ref_init(&disk->part0))
-		goto out_free_part0;
+	rcu_assign_pointer(ptbl->part[0], disk->part0);
 
 	disk->minors = minors;
 	rand_initialize_disk(disk);
@@ -1636,10 +1574,8 @@ struct gendisk *__alloc_disk_node(int minors, int node_id)
 	device_initialize(disk_to_dev(disk));
 	return disk;
 
-out_free_part0:
-	hd_free_part(&disk->part0);
 out_bdput:
-	bdput(disk->part0.bdev);
+	bdput(disk->part0);
 out_free_disk:
 	kfree(disk);
 	return NULL;
@@ -1676,16 +1612,16 @@ static void set_disk_ro_uevent(struct gendisk *gd, int ro)
 void set_disk_ro(struct gendisk *disk, int flag)
 {
 	struct disk_part_iter piter;
-	struct hd_struct *part;
+	struct block_device *part;
 
-	if (disk->part0.policy != flag) {
+	if (disk->part0->bd_policy != flag) {
 		set_disk_ro_uevent(disk, flag);
-		disk->part0.policy = flag;
+		disk->part0->bd_policy = flag;
 	}
 
 	disk_part_iter_init(&piter, disk, DISK_PITER_INCL_EMPTY);
 	while ((part = disk_part_iter_next(&piter)))
-		part->policy = flag;
+		part->bd_policy = flag;
 	disk_part_iter_exit(&piter);
 }
 
@@ -1695,7 +1631,7 @@ int bdev_read_only(struct block_device *bdev)
 {
 	if (!bdev)
 		return 0;
-	return bdev->bd_part->policy;
+	return bdev->bd_policy;
 }
 
 EXPORT_SYMBOL(bdev_read_only);
diff --git a/block/ioctl.c b/block/ioctl.c
index 18adf9b16a30f6..7207b716b6c9a7 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -35,7 +35,7 @@ static int blkpg_do_ioctl(struct block_device *bdev,
 	start = p.start >> SECTOR_SHIFT;
 	length = p.length >> SECTOR_SHIFT;
 
-	/* check for fit in a hd_struct */
+	/* check for fit in a sector_t */
 	if (sizeof(sector_t) < sizeof(long long)) {
 		long pstart = start, plength = length;
 
@@ -355,7 +355,7 @@ static int blkdev_roset(struct block_device *bdev, fmode_t mode,
 			return ret;
 	}
 	if (bdev_is_partition(bdev))
-		bdev->bd_part->policy = n;
+		bdev->bd_policy = n;
 	else
 		set_disk_ro(bdev->bd_disk, n);
 	return 0;
diff --git a/block/partitions/core.c b/block/partitions/core.c
index e50b5ca17df550..e22f1b2d5c423d 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -182,44 +182,39 @@ static struct parsed_partitions *check_partition(struct gendisk *hd,
 static ssize_t part_partition_show(struct device *dev,
 				   struct device_attribute *attr, char *buf)
 {
-	struct hd_struct *p = dev_to_part(dev);
-
-	return sprintf(buf, "%d\n", p->partno);
+	return sprintf(buf, "%d\n", dev_to_bdev(dev)->bd_partno);
 }
 
 static ssize_t part_start_show(struct device *dev,
 			       struct device_attribute *attr, char *buf)
 {
-	struct hd_struct *p = dev_to_part(dev);
-
-	return sprintf(buf, "%llu\n",(unsigned long long)p->start_sect);
+	return sprintf(buf, "%llu\n", dev_to_bdev(dev)->bd_start_sect);
 }
 
 static ssize_t part_ro_show(struct device *dev,
 			    struct device_attribute *attr, char *buf)
 {
-	struct hd_struct *p = dev_to_part(dev);
-	return sprintf(buf, "%d\n", p->policy ? 1 : 0);
+	return sprintf(buf, "%d\n", dev_to_bdev(dev)->bd_policy ? 1 : 0);
 }
 
 static ssize_t part_alignment_offset_show(struct device *dev,
 					  struct device_attribute *attr, char *buf)
 {
-	struct hd_struct *p = dev_to_part(dev);
+	struct block_device *bdev = dev_to_bdev(dev);
 
 	return sprintf(buf, "%u\n",
-		queue_limit_alignment_offset(&part_to_disk(p)->queue->limits,
-				p->start_sect));
+		queue_limit_alignment_offset(&bdev->bd_disk->queue->limits,
+				bdev->bd_start_sect));
 }
 
 static ssize_t part_discard_alignment_show(struct device *dev,
 					   struct device_attribute *attr, char *buf)
 {
-	struct hd_struct *p = dev_to_part(dev);
+	struct block_device *bdev = dev_to_bdev(dev);
 
 	return sprintf(buf, "%u\n",
-		queue_limit_discard_alignment(&part_to_disk(p)->queue->limits,
-				p->start_sect));
+		queue_limit_discard_alignment(&bdev->bd_disk->queue->limits,
+				bdev->bd_start_sect));
 }
 
 static DEVICE_ATTR(partition, 0444, part_partition_show, NULL);
@@ -264,19 +259,19 @@ static const struct attribute_group *part_attr_groups[] = {
 
 static void part_release(struct device *dev)
 {
-	struct hd_struct *p = dev_to_part(dev);
+	struct block_device *p = dev_to_bdev(dev);
+
 	blk_free_devt(dev->devt);
-	hd_free_part(p);
-	kfree(p);
+	bdput(p);
 }
 
 static int part_uevent(struct device *dev, struct kobj_uevent_env *env)
 {
-	struct hd_struct *part = dev_to_part(dev);
+	struct block_device *part = dev_to_bdev(dev);
 
-	add_uevent_var(env, "PARTN=%u", part->partno);
-	if (part->info && part->info->volname[0])
-		add_uevent_var(env, "PARTNAME=%s", part->info->volname);
+	add_uevent_var(env, "PARTN=%u", part->bd_partno);
+	if (part->bd_meta_info && part->bd_meta_info->volname[0])
+		add_uevent_var(env, "PARTNAME=%s", part->bd_meta_info->volname);
 	return 0;
 }
 
@@ -287,72 +282,21 @@ struct device_type part_type = {
 	.uevent		= part_uevent,
 };
 
-static void hd_struct_free_work(struct work_struct *work)
-{
-	struct hd_struct *part =
-		container_of(to_rcu_work(work), struct hd_struct, rcu_work);
-	struct gendisk *disk = part_to_disk(part);
-
-	/*
-	 * Release the disk reference acquired in delete_partition here.
-	 * We can't release it in hd_struct_free because the final put_device
-	 * needs process context and thus can't be run directly from a
-	 * percpu_ref ->release handler.
-	 */
-	put_device(disk_to_dev(disk));
-
-	part->start_sect = 0;
-	bdev_set_nr_sectors(part->bdev, 0);
-	part_stat_set_all(part, 0);
-	put_device(part_to_dev(part));
-}
-
-static void hd_struct_free(struct percpu_ref *ref)
-{
-	struct hd_struct *part = container_of(ref, struct hd_struct, ref);
-	struct gendisk *disk = part_to_disk(part);
-	struct disk_part_tbl *ptbl =
-		rcu_dereference_protected(disk->part_tbl, 1);
-
-	rcu_assign_pointer(ptbl->last_lookup, NULL);
-
-	INIT_RCU_WORK(&part->rcu_work, hd_struct_free_work);
-	queue_rcu_work(system_wq, &part->rcu_work);
-}
-
-int hd_ref_init(struct hd_struct *part)
-{
-	if (percpu_ref_init(&part->ref, hd_struct_free, 0, GFP_KERNEL))
-		return -ENOMEM;
-	return 0;
-}
-
 /*
  * Must be called either with disk->mutex held, before a disk can be opened or
  * after all disk users are gone.
  */
-void delete_partition(struct hd_struct *part)
+void delete_partition(struct block_device *part)
 {
-	struct gendisk *disk = part_to_disk(part);
+	struct gendisk *disk = part->bd_disk;
 	struct disk_part_tbl *ptbl =
 		rcu_dereference_protected(disk->part_tbl, 1);
 
-	/*
-	 * ->part_tbl is referenced in this part's release handler, so
-	 *  we have to hold the disk device
-	 */
-	get_device(disk_to_dev(disk));
-	rcu_assign_pointer(ptbl->part[part->partno], NULL);
-	kobject_put(part->holder_dir);
+	rcu_assign_pointer(ptbl->part[part->bd_partno], NULL);
+	rcu_assign_pointer(ptbl->last_lookup, NULL);
+	kobject_put(part->bd_holder_dir);
 	device_del(part_to_dev(part));
-
-	/*
-	 * Remove the block device from the inode hash, so that it cannot be
-	 * looked up while waiting for the RCU grace period.
-	 */
-	bdput(part->bdev);
-
-	percpu_ref_kill(&part->ref);
+	put_device(part_to_dev(part));
 }
 
 static ssize_t whole_disk_show(struct device *dev,
@@ -366,11 +310,11 @@ static DEVICE_ATTR(whole_disk, 0444, whole_disk_show, NULL);
  * Must be called either with disk->mutex held, before a disk can be opened or
  * after all disk users are gone.
  */
-static struct hd_struct *add_partition(struct gendisk *disk, int partno,
+static struct block_device *add_partition(struct gendisk *disk, int partno,
 				sector_t start, sector_t len, int flags,
 				struct partition_meta_info *info)
 {
-	struct hd_struct *p;
+	struct block_device *p;
 	dev_t devt = MKDEV(0, 0);
 	struct device *ddev = disk_to_dev(disk);
 	struct device *pdev;
@@ -404,36 +348,22 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 	if (ptbl->part[partno])
 		return ERR_PTR(-EBUSY);
 
-	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	p = bdev_alloc(disk, partno);
 	if (!p)
-		return ERR_PTR(-EBUSY);
-
-	err = -ENOMEM;
-	p->dkstats = alloc_percpu(struct disk_stats);
-	if (!p->dkstats)
-		goto out_free;
-
-	p->bdev = bdev_alloc(disk, partno);
-	if (!p->bdev)
-		goto out_free_stats;
-
-	pdev = part_to_dev(p);
+		return ERR_PTR(-ENOMEM);
 
-	p->start_sect = start;
-	bdev_set_nr_sectors(p->bdev, len);
-	p->partno = partno;
-	p->policy = get_disk_ro(disk);
+	p->bd_start_sect = start;
+	bdev_set_nr_sectors(p, len);
+	p->bd_policy = get_disk_ro(disk);
 
 	if (info) {
-		struct partition_meta_info *pinfo;
-
-		pinfo = kzalloc_node(sizeof(*pinfo), GFP_KERNEL, disk->node_id);
-		if (!pinfo)
-			goto out_bdput;
-		memcpy(pinfo, info, sizeof(*info));
-		p->info = pinfo;
+		err = -ENOMEM;
+		p->bd_meta_info = kmemdup(info, sizeof(*info), GFP_KERNEL);
+		if (!p->bd_meta_info)
+			goto out_free_stats;
 	}
 
+	pdev = part_to_dev(p);
 	dname = dev_name(ddev);
 	if (isdigit(dname[strlen(dname) - 1]))
 		dev_set_name(pdev, "%sp%d", dname, partno);
@@ -457,8 +387,8 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 		goto out_put;
 
 	err = -ENOMEM;
-	p->holder_dir = kobject_create_and_add("holders", &pdev->kobj);
-	if (!p->holder_dir)
+	p->bd_holder_dir = kobject_create_and_add("holders", &pdev->kobj);
+	if (!p->bd_holder_dir)
 		goto out_del;
 
 	dev_set_uevent_suppress(pdev, 0);
@@ -468,15 +398,8 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 			goto out_del;
 	}
 
-	err = hd_ref_init(p);
-	if (err) {
-		if (flags & ADDPART_FLAG_WHOLEDISK)
-			goto out_remove_file;
-		goto out_del;
-	}
-
 	/* everything is up and running, commence */
-	bdev_add(p->bdev, devt);
+	bdev_add(p, devt);
 	rcu_assign_pointer(ptbl->part[partno], p);
 
 	/* suppress uevent if the disk suppresses it */
@@ -485,19 +408,13 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 	return p;
 
 out_free_info:
-	kfree(p->info);
-out_bdput:
-	bdput(p->bdev);
+	kfree(p->bd_meta_info);
 out_free_stats:
-	free_percpu(p->dkstats);
-out_free:
-	kfree(p);
+	bdput(p);
 	return ERR_PTR(err);
 
-out_remove_file:
-	device_remove_file(pdev, &dev_attr_whole_disk);
 out_del:
-	kobject_put(p->holder_dir);
+	kobject_put(p->bd_holder_dir);
 	device_del(pdev);
 out_put:
 	put_device(pdev);
@@ -508,14 +425,14 @@ static bool partition_overlaps(struct gendisk *disk, sector_t start,
 		sector_t length, int skip_partno)
 {
 	struct disk_part_iter piter;
-	struct hd_struct *part;
+	struct block_device *part;
 	bool overlap = false;
 
 	disk_part_iter_init(&piter, disk, DISK_PITER_INCL_EMPTY);
 	while ((part = disk_part_iter_next(&piter))) {
-		if (part->partno == skip_partno ||
-		    start >= part->start_sect + bdev_nr_sectors(part->bdev) ||
-		    start + length <= part->start_sect)
+		if (part->bd_partno == skip_partno ||
+		    start >= part->bd_start_sect + bdev_nr_sectors(part) ||
+		    start + length <= part->bd_start_sect)
 			continue;
 		overlap = true;
 		break;
@@ -528,7 +445,7 @@ static bool partition_overlaps(struct gendisk *disk, sector_t start,
 int bdev_add_partition(struct block_device *bdev, int partno,
 		sector_t start, sector_t length)
 {
-	struct hd_struct *part;
+	struct block_device *part;
 
 	mutex_lock(&bdev->bd_disk->mutex);
 	if (partition_overlaps(bdev->bd_disk, start, length, -1)) {
@@ -544,72 +461,54 @@ int bdev_add_partition(struct block_device *bdev, int partno,
 
 int bdev_del_partition(struct block_device *bdev, int partno)
 {
-	struct block_device *bdevp;
-	struct hd_struct *part = NULL;
+	struct block_device *part = NULL;
 	int ret;
 
-	bdevp = bdget_disk(bdev->bd_disk, partno);
-	if (!bdevp)
+	part = bdget_disk(bdev->bd_disk, partno);
+	if (!part)
 		return -ENXIO;
 
 	mutex_lock(&bdev->bd_disk->mutex);
-
-	ret = -ENXIO;
-	part = disk_get_part(bdev->bd_disk, partno);
-	if (!part)
-		goto out_unlock;
-
 	ret = -EBUSY;
-	if (bdevp->bd_openers)
+	if (part->bd_openers)
 		goto out_unlock;
 
-	sync_blockdev(bdevp);
-	invalidate_bdev(bdevp);
+	sync_blockdev(part);
+	invalidate_bdev(part);
 
 	delete_partition(part);
 	ret = 0;
 out_unlock:
 	mutex_unlock(&bdev->bd_disk->mutex);
-	bdput(bdevp);
-	if (part)
-		disk_put_part(part);
+	bdput(part);
 	return ret;
 }
 
 int bdev_resize_partition(struct block_device *bdev, int partno,
 		sector_t start, sector_t length)
 {
-	struct block_device *bdevp;
-	struct hd_struct *part;
+	struct block_device *part = NULL;
 	int ret = 0;
 
-	part = disk_get_part(bdev->bd_disk, partno);
+	part = bdget_disk(bdev->bd_disk, partno);
 	if (!part)
 		return -ENXIO;
 
-	ret = -ENOMEM;
-	bdevp = bdget_part(part);
-	if (!bdevp)
-		goto out_put_part;
-
 	mutex_lock(&bdev->bd_disk->mutex);
-
 	ret = -EINVAL;
-	if (start != part->start_sect)
+	if (start != part->bd_start_sect)
 		goto out_unlock;
 
 	ret = -EBUSY;
 	if (partition_overlaps(bdev->bd_disk, start, length, partno))
 		goto out_unlock;
 
-	bdev_set_nr_sectors(bdevp, length);
+	bdev_set_nr_sectors(part, length);
 
 	ret = 0;
 out_unlock:
 	mutex_unlock(&bdev->bd_disk->mutex);
-	bdput(bdevp);
-out_put_part:
-	disk_put_part(part);
+	bdput(part);
 	return ret;
 }
 
@@ -632,7 +531,7 @@ static bool disk_unlock_native_capacity(struct gendisk *disk)
 int blk_drop_partitions(struct block_device *bdev)
 {
 	struct disk_part_iter piter;
-	struct hd_struct *part;
+	struct block_device *part;
 
 	if (bdev->bd_part_count)
 		return -EBUSY;
@@ -657,7 +556,7 @@ static bool blk_add_partition(struct gendisk *disk, struct block_device *bdev,
 {
 	sector_t size = state->parts[p].size;
 	sector_t from = state->parts[p].from;
-	struct hd_struct *part;
+	struct block_device *part;
 
 	if (!size)
 		return true;
@@ -697,7 +596,7 @@ static bool blk_add_partition(struct gendisk *disk, struct block_device *bdev,
 
 	if (IS_BUILTIN(CONFIG_BLK_DEV_MD) &&
 	    (state->parts[p].flags & ADDPART_FLAG_RAID))
-		md_autodetect_dev(part_to_dev(part)->devt);
+		md_autodetect_dev(part->bd_dev);
 
 	return true;
 }
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index dc333dbe523281..09c86ef3f0fd93 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -2802,7 +2802,7 @@ bool drbd_rs_c_min_rate_throttle(struct drbd_device *device)
 	if (c_min_rate == 0)
 		return false;
 
-	curr_events = (int)part_stat_read_accum(&disk->part0, sectors) -
+	curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
 			atomic_read(&device->rs_sect_ev);
 
 	if (atomic_read(&device->ap_actlog_cnt)
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index ba56f3f05312f0..4537559829876e 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -1678,7 +1678,7 @@ void drbd_rs_controller_reset(struct drbd_device *device)
 	atomic_set(&device->rs_sect_in, 0);
 	atomic_set(&device->rs_sect_ev, 0);
 	device->rs_in_flight = 0;
-	device->rs_last_events = (int)part_stat_read_accum(&disk->part0, sectors);
+	device->rs_last_events = (int)part_stat_read_accum(disk->part0, sectors);
 
 	/* Updating the RCU protected object in place is necessary since
 	   this function gets called from atomic context.
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 0b156f09e208df..e765765263495f 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1687,7 +1687,7 @@ static void zram_reset_device(struct zram *zram)
 	zram->disksize = 0;
 
 	set_capacity_and_notify(zram->disk, 0);
-	part_stat_set_all(&zram->disk->part0, 0);
+	part_stat_set_all(zram->disk->part0, 0);
 
 	up_write(&zram->init_lock);
 	/* I/O operation under all of CPU are done so let's free */
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index afac8d07c1bd00..85b1f2a9b72d68 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -475,7 +475,7 @@ struct search {
 	unsigned int		read_dirty_data:1;
 	unsigned int		cache_missed:1;
 
-	struct hd_struct	*part;
+	struct block_device	*part;
 	unsigned long		start_time;
 
 	struct btree_op		op;
@@ -1073,7 +1073,7 @@ struct detached_dev_io_private {
 	unsigned long		start_time;
 	bio_end_io_t		*bi_end_io;
 	void			*bi_private;
-	struct hd_struct	*part;
+	struct block_device	*part;
 };
 
 static void detached_dev_end_io(struct bio *bio)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index c789ffea2badde..ac46f6e41279cc 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1607,7 +1607,7 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 				 * (by eliminating DM's splitting and just using bio_split)
 				 */
 				part_stat_lock();
-				__dm_part_stat_sub(&dm_disk(md)->part0,
+				__dm_part_stat_sub(dm_disk(md)->part0,
 						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
 				part_stat_unlock();
 
@@ -2242,12 +2242,12 @@ EXPORT_SYMBOL_GPL(dm_put);
 static bool md_in_flight_bios(struct mapped_device *md)
 {
 	int cpu;
-	struct hd_struct *part = &dm_disk(md)->part0;
+	struct block_device *bdev = dm_disk(md)->part0;
 	long sum = 0;
 
 	for_each_possible_cpu(cpu) {
-		sum += part_stat_local_read_cpu(part, in_flight[0], cpu);
-		sum += part_stat_local_read_cpu(part, in_flight[1], cpu);
+		sum += part_stat_local_read_cpu(bdev, in_flight[0], cpu);
+		sum += part_stat_local_read_cpu(bdev, in_flight[1], cpu);
 	}
 
 	return sum != 0;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7ce6047c856ea2..0065736f05b428 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -464,7 +464,7 @@ struct md_io {
 	bio_end_io_t *orig_bi_end_io;
 	void *orig_bi_private;
 	unsigned long start_time;
-	struct hd_struct *part;
+	struct block_device *part;
 };
 
 static void md_end_io(struct bio *bio)
@@ -8441,7 +8441,7 @@ static int is_mddev_idle(struct mddev *mddev, int init)
 	rcu_read_lock();
 	rdev_for_each_rcu(rdev, mddev) {
 		struct gendisk *disk = rdev->bdev->bd_disk;
-		curr_events = (int)part_stat_read_accum(&disk->part0, sectors) -
+		curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
 			      atomic_read(&disk->sync_io);
 		/* sync IO will cause sync_io to increase before the disk_stats
 		 * as sync_io is counted when a request starts, and
diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index dca34489a1dc9e..8d90235e4fcc5a 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -89,12 +89,12 @@ static u16 nvmet_get_smart_log_nsid(struct nvmet_req *req,
 	if (!ns->bdev)
 		goto out;
 
-	host_reads = part_stat_read(ns->bdev->bd_part, ios[READ]);
-	data_units_read = DIV_ROUND_UP(part_stat_read(ns->bdev->bd_part,
-		sectors[READ]), 1000);
-	host_writes = part_stat_read(ns->bdev->bd_part, ios[WRITE]);
-	data_units_written = DIV_ROUND_UP(part_stat_read(ns->bdev->bd_part,
-		sectors[WRITE]), 1000);
+	host_reads = part_stat_read(ns->bdev, ios[READ]);
+	data_units_read =
+		DIV_ROUND_UP(part_stat_read(ns->bdev, sectors[READ]), 1000);
+	host_writes = part_stat_read(ns->bdev, ios[WRITE]);
+	data_units_written =
+		DIV_ROUND_UP(part_stat_read(ns->bdev, sectors[WRITE]), 1000);
 
 	put_unaligned_le64(host_reads, &slog->host_reads[0]);
 	put_unaligned_le64(data_units_read, &slog->data_units_read[0]);
@@ -120,12 +120,12 @@ static u16 nvmet_get_smart_log_all(struct nvmet_req *req,
 		/* we don't have the right data for file backed ns */
 		if (!ns->bdev)
 			continue;
-		host_reads += part_stat_read(ns->bdev->bd_part, ios[READ]);
+		host_reads += part_stat_read(ns->bdev, ios[READ]);
 		data_units_read += DIV_ROUND_UP(
-			part_stat_read(ns->bdev->bd_part, sectors[READ]), 1000);
-		host_writes += part_stat_read(ns->bdev->bd_part, ios[WRITE]);
+			part_stat_read(ns->bdev, sectors[READ]), 1000);
+		host_writes += part_stat_read(ns->bdev, ios[WRITE]);
 		data_units_written += DIV_ROUND_UP(
-			part_stat_read(ns->bdev->bd_part, sectors[WRITE]), 1000);
+			part_stat_read(ns->bdev, sectors[WRITE]), 1000);
 	}
 
 	put_unaligned_le64(host_reads, &slog->host_reads[0]);
diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index db24e04ee9781e..1825fa8d05a780 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -432,7 +432,7 @@ dasd_state_ready_to_online(struct dasd_device * device)
 {
 	struct gendisk *disk;
 	struct disk_part_iter piter;
-	struct hd_struct *part;
+	struct block_device *part;
 
 	device->state = DASD_STATE_ONLINE;
 	if (device->block) {
@@ -445,7 +445,7 @@ dasd_state_ready_to_online(struct dasd_device * device)
 		disk = device->block->bdev->bd_disk;
 		disk_part_iter_init(&piter, disk, DISK_PITER_INCL_PART0);
 		while ((part = disk_part_iter_next(&piter)))
-			kobject_uevent(&part_to_dev(part)->kobj, KOBJ_CHANGE);
+			kobject_uevent(bdev_kobj(part), KOBJ_CHANGE);
 		disk_part_iter_exit(&piter);
 	}
 	return 0;
@@ -459,7 +459,7 @@ static int dasd_state_online_to_ready(struct dasd_device *device)
 	int rc;
 	struct gendisk *disk;
 	struct disk_part_iter piter;
-	struct hd_struct *part;
+	struct block_device *part;
 
 	if (device->discipline->online_to_ready) {
 		rc = device->discipline->online_to_ready(device);
@@ -472,7 +472,7 @@ static int dasd_state_online_to_ready(struct dasd_device *device)
 		disk = device->block->bdev->bd_disk;
 		disk_part_iter_init(&piter, disk, DISK_PITER_INCL_PART0);
 		while ((part = disk_part_iter_next(&piter)))
-			kobject_uevent(&part_to_dev(part)->kobj, KOBJ_CHANGE);
+			kobject_uevent(bdev_kobj(part), KOBJ_CHANGE);
 		disk_part_iter_exit(&piter);
 	}
 	return 0;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 4b59ace9632f65..e1457bf76c6f34 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -34,6 +34,7 @@
 #include <linux/falloc.h>
 #include <linux/uaccess.h>
 #include <linux/suspend.h>
+#include <linux/part_stat.h>
 #include "internal.h"
 
 struct bdev_inode {
@@ -788,28 +789,18 @@ static struct inode *bdev_alloc_inode(struct super_block *sb)
 
 static void bdev_free_inode(struct inode *inode)
 {
-	kmem_cache_free(bdev_cachep, BDEV_I(inode));
-}
+	struct block_device *bdev = I_BDEV(inode);
 
-static void bdev_destroy_inode(struct inode *inode)
-{
-	if (inode->i_rdev)
-		put_device(disk_to_dev(I_BDEV(inode)->bd_disk));
+	kfree(bdev->bd_meta_info);
+	free_percpu(bdev->bd_stats);
+	kmem_cache_free(bdev_cachep, BDEV_I(inode));
 }
 
 static void init_once(void *foo)
 {
 	struct bdev_inode *ei = (struct bdev_inode *) foo;
-	struct block_device *bdev = &ei->bdev;
 
-	memset(bdev, 0, sizeof(*bdev));
-#ifdef CONFIG_SYSFS
-	INIT_LIST_HEAD(&bdev->bd_holder_disks);
-#endif
-	bdev->bd_bdi = &noop_backing_dev_info;
 	inode_init_once(&ei->vfs_inode);
-	/* Initialize mutex for freeze. */
-	mutex_init(&bdev->bd_fsfreeze_mutex);
 }
 
 static void bdev_evict_inode(struct inode *inode)
@@ -830,7 +821,6 @@ static const struct super_operations bdev_sops = {
 	.statfs = simple_statfs,
 	.alloc_inode = bdev_alloc_inode,
 	.free_inode = bdev_free_inode,
-	.destroy_inode = bdev_destroy_inode,
 	.drop_inode = generic_delete_inode,
 	.evict_inode = bdev_evict_inode,
 };
@@ -882,12 +872,21 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 		return NULL;
 
 	bdev = I_BDEV(inode);
+	memset(bdev, 0, sizeof(*bdev));
 	spin_lock_init(&bdev->bd_size_lock);
+	mutex_init(&bdev->bd_fsfreeze_mutex);
+	bdev->bd_bdi = &noop_backing_dev_info;
 	bdev->bd_disk = disk;
 	bdev->bd_partno = partno;
-	bdev->bd_super = NULL;
 	bdev->bd_inode = inode;
-	bdev->bd_part_count = 0;
+	bdev->bd_stats = alloc_percpu(struct disk_stats);
+	if (!bdev->bd_stats) {
+		iput(inode);
+		return NULL;
+	}
+#ifdef CONFIG_SYSFS
+	INIT_LIST_HEAD(&bdev->bd_holder_disks);
+#endif
 
 	inode->i_mode = S_IFBLK;
 	inode->i_rdev = 0;
@@ -900,7 +899,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 void bdev_add(struct block_device *bdev, dev_t dev)
 {
 	bdev->bd_dev = dev;
-	get_device(disk_to_dev(bdev->bd_disk));
 	bdev->bd_inode->i_rdev = dev;
 	bdev->bd_inode->i_ino = dev;
 	insert_inode_hash(bdev->bd_inode);
@@ -927,11 +925,6 @@ struct block_device *bdgrab(struct block_device *bdev)
 }
 EXPORT_SYMBOL(bdgrab);
 
-struct block_device *bdget_part(struct hd_struct *part)
-{
-	return bdget(part_devt(part));
-}
-
 long nr_blockdev_pages(void)
 {
 	struct inode *inode;
@@ -1208,7 +1201,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	WARN_ON_ONCE(!bdev->bd_holder);
 
 	/* FIXME: remove the following once add_disk() handles errors */
-	if (WARN_ON(!disk->slave_dir || !bdev->bd_part->holder_dir))
+	if (WARN_ON(!disk->slave_dir || !bdev->bd_holder_dir))
 		goto out_unlock;
 
 	holder = bd_find_holder_disk(bdev, disk);
@@ -1227,24 +1220,24 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	holder->disk = disk;
 	holder->refcnt = 1;
 
-	ret = add_symlink(disk->slave_dir, &part_to_dev(bdev->bd_part)->kobj);
+	ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
 	if (ret)
 		goto out_free;
 
-	ret = add_symlink(bdev->bd_part->holder_dir, &disk_to_dev(disk)->kobj);
+	ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
 	if (ret)
 		goto out_del;
 	/*
 	 * bdev could be deleted beneath us which would implicitly destroy
 	 * the holder directory.  Hold on to it.
 	 */
-	kobject_get(bdev->bd_part->holder_dir);
+	kobject_get(bdev->bd_holder_dir);
 
 	list_add(&holder->list, &bdev->bd_holder_disks);
 	goto out_unlock;
 
 out_del:
-	del_symlink(disk->slave_dir, &part_to_dev(bdev->bd_part)->kobj);
+	del_symlink(disk->slave_dir, bdev_kobj(bdev));
 out_free:
 	kfree(holder);
 out_unlock:
@@ -1272,10 +1265,10 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	holder = bd_find_holder_disk(bdev, disk);
 
 	if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
-		del_symlink(disk->slave_dir, &part_to_dev(bdev->bd_part)->kobj);
-		del_symlink(bdev->bd_part->holder_dir,
+		del_symlink(disk->slave_dir, bdev_kobj(bdev));
+		del_symlink(bdev->bd_holder_dir,
 			    &disk_to_dev(disk)->kobj);
-		kobject_put(bdev->bd_part->holder_dir);
+		kobject_put(bdev->bd_holder_dir);
 		list_del_init(&holder->list);
 		kfree(holder);
 	}
@@ -1385,11 +1378,6 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 		first_open = true;
 
 		if (!bdev->bd_partno) {
-			ret = -ENXIO;
-			bdev->bd_part = disk_get_part(disk, 0);
-			if (!bdev->bd_part)
-				goto out_clear;
-
 			ret = 0;
 			if (disk->fops->open) {
 				ret = disk->fops->open(bdev, mode);
@@ -1422,9 +1410,8 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 			ret = __blkdev_get(bdev_whole(bdev), mode, NULL, 1);
 			if (ret)
 				goto out_clear;
-			bdev->bd_part = disk_get_part(disk, bdev->bd_partno);
 			if (!(disk->flags & GENHD_FL_UP) ||
-			    !bdev->bd_part || !bdev_nr_sectors(bdev)) {
+			    !bdev_nr_sectors(bdev)) {
 				ret = -ENXIO;
 				goto out_clear;
 			}
@@ -1480,8 +1467,6 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 	return 0;
 
  out_clear:
-	disk_put_part(bdev->bd_part);
-	bdev->bd_part = NULL;
 	if (bdev_is_partition(bdev))
 		__blkdev_put(bdev_whole(bdev), mode, 1);
  out_unlock_bdev:
@@ -1686,11 +1671,8 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 			disk->fops->release(disk, mode);
 	}
 	if (!bdev->bd_openers) {
-		disk_put_part(bdev->bd_part);
-		bdev->bd_part = NULL;
 		if (bdev_is_partition(bdev))
 			victim = bdev_whole(bdev);
-
 		module_put(disk->fops->owner);
 	}
 	if (!for_part)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 6633b20224d509..c303a0ff0b1701 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4048,9 +4048,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 	sbi->s_sb = sb;
 	sbi->s_inode_readahead_blks = EXT4_DEF_INODE_READAHEAD_BLKS;
 	sbi->s_sb_block = sb_block;
-	if (sb->s_bdev->bd_part)
-		sbi->s_sectors_written_start =
-			part_stat_read(sb->s_bdev->bd_part, sectors[STAT_WRITE]);
+	sbi->s_sectors_written_start =
+		part_stat_read(sb->s_bdev, sectors[STAT_WRITE]);
 
 	/* Cleanup superblock name */
 	strreplace(sb->s_id, '/', '!');
@@ -5509,15 +5508,10 @@ static int ext4_commit_super(struct super_block *sb, int sync)
 	 */
 	if (!(sb->s_flags & SB_RDONLY))
 		ext4_update_tstamp(es, s_wtime);
-	if (sb->s_bdev->bd_part)
-		es->s_kbytes_written =
-			cpu_to_le64(EXT4_SB(sb)->s_kbytes_written +
-			    ((part_stat_read(sb->s_bdev->bd_part,
-					     sectors[STAT_WRITE]) -
-			      EXT4_SB(sb)->s_sectors_written_start) >> 1));
-	else
-		es->s_kbytes_written =
-			cpu_to_le64(EXT4_SB(sb)->s_kbytes_written);
+	es->s_kbytes_written =
+		cpu_to_le64(EXT4_SB(sb)->s_kbytes_written +
+		    ((part_stat_read(sb->s_bdev, sectors[STAT_WRITE]) -
+		      EXT4_SB(sb)->s_sectors_written_start) >> 1));
 	if (percpu_counter_initialized(&EXT4_SB(sb)->s_freeclusters_counter))
 		ext4_free_blocks_count_set(es,
 			EXT4_C2B(EXT4_SB(sb), percpu_counter_sum_positive(
diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c
index 4e27fe6ed3ae6a..075aa3a19ff5f1 100644
--- a/fs/ext4/sysfs.c
+++ b/fs/ext4/sysfs.c
@@ -62,11 +62,8 @@ static ssize_t session_write_kbytes_show(struct ext4_sb_info *sbi, char *buf)
 {
 	struct super_block *sb = sbi->s_buddy_cache->i_sb;
 
-	if (!sb->s_bdev->bd_part)
-		return snprintf(buf, PAGE_SIZE, "0\n");
 	return snprintf(buf, PAGE_SIZE, "%lu\n",
-			(part_stat_read(sb->s_bdev->bd_part,
-					sectors[STAT_WRITE]) -
+			(part_stat_read(sb->s_bdev, sectors[STAT_WRITE]) -
 			 sbi->s_sectors_written_start) >> 1);
 }
 
@@ -74,12 +71,9 @@ static ssize_t lifetime_write_kbytes_show(struct ext4_sb_info *sbi, char *buf)
 {
 	struct super_block *sb = sbi->s_buddy_cache->i_sb;
 
-	if (!sb->s_bdev->bd_part)
-		return snprintf(buf, PAGE_SIZE, "0\n");
 	return snprintf(buf, PAGE_SIZE, "%llu\n",
 			(unsigned long long)(sbi->s_kbytes_written +
-			((part_stat_read(sb->s_bdev->bd_part,
-					 sectors[STAT_WRITE]) -
+			((part_stat_read(sb->s_bdev, sectors[STAT_WRITE]) -
 			  EXT4_SB(sb)->s_sectors_written_start) >> 1)));
 }
 
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 023462e80e58d5..54a1905af052cc 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1395,7 +1395,6 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	__u32 crc32 = 0;
 	int i;
 	int cp_payload_blks = __cp_payload(sbi);
-	struct super_block *sb = sbi->sb;
 	struct curseg_info *seg_i = CURSEG_I(sbi, CURSEG_HOT_NODE);
 	u64 kbytes_written;
 	int err;
@@ -1489,9 +1488,7 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	start_blk += data_sum_blocks;
 
 	/* Record write statistics in the hot node summary */
-	kbytes_written = sbi->kbytes_written;
-	if (sb->s_bdev->bd_part)
-		kbytes_written += BD_PART_WRITTEN(sbi);
+	kbytes_written = sbi->kbytes_written + BD_PART_WRITTEN(sbi);
 
 	seg_i->journal->info.kbytes_written = cpu_to_le64(kbytes_written);
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index cb700d79729680..5f9522d4c727fb 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1675,7 +1675,7 @@ static inline bool f2fs_is_multi_device(struct f2fs_sb_info *sbi)
  * and the return value is in kbytes. s is of struct f2fs_sb_info.
  */
 #define BD_PART_WRITTEN(s)						 \
-(((u64)part_stat_read((s)->sb->s_bdev->bd_part, sectors[STAT_WRITE]) -   \
+(((u64)part_stat_read((s)->sb->s_bdev, sectors[STAT_WRITE]) -   \
 		(s)->sectors_written_start) >> 1)
 
 static inline void f2fs_update_time(struct f2fs_sb_info *sbi, int type)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index d4e7fab352bacb..fae92285f561b4 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -3700,10 +3700,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 	}
 
 	/* For write statistics */
-	if (sb->s_bdev->bd_part)
-		sbi->sectors_written_start =
-			(u64)part_stat_read(sb->s_bdev->bd_part,
-					    sectors[STAT_WRITE]);
+	sbi->sectors_written_start =
+		part_stat_read(sb->s_bdev, sectors[STAT_WRITE]);
 
 	/* Read accumulated write IO statistics if exists */
 	seg_i = CURSEG_I(sbi, CURSEG_HOT_NODE);
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index ec77ccfea923dc..24e876e849c512 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -90,11 +90,6 @@ static ssize_t free_segments_show(struct f2fs_attr *a,
 static ssize_t lifetime_write_kbytes_show(struct f2fs_attr *a,
 		struct f2fs_sb_info *sbi, char *buf)
 {
-	struct super_block *sb = sbi->sb;
-
-	if (!sb->s_bdev->bd_part)
-		return sprintf(buf, "0\n");
-
 	return sprintf(buf, "%llu\n",
 			(unsigned long long)(sbi->kbytes_written +
 			BD_PART_WRITTEN(sbi)));
@@ -103,12 +98,8 @@ static ssize_t lifetime_write_kbytes_show(struct f2fs_attr *a,
 static ssize_t features_show(struct f2fs_attr *a,
 		struct f2fs_sb_info *sbi, char *buf)
 {
-	struct super_block *sb = sbi->sb;
 	int len = 0;
 
-	if (!sb->s_bdev->bd_part)
-		return sprintf(buf, "0\n");
-
 	if (f2fs_sb_has_encrypt(sbi))
 		len += scnprintf(buf, PAGE_SIZE - len, "%s",
 						"encryption");
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 5a5ccacb804cdb..c6d00732b1af52 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -8,6 +8,7 @@
 
 #include <linux/types.h>
 #include <linux/bvec.h>
+#include <linux/device.h>
 #include <linux/ktime.h>
 
 struct bio_set;
@@ -20,7 +21,13 @@ typedef void (bio_end_io_t) (struct bio *);
 struct bio_crypt_ctx;
 
 struct block_device {
+	sector_t		bd_start_sect;
+	unsigned long		bd_stamp;
+	struct disk_stats __percpu *bd_stats;
+	u8			bd_partno;
+	int			bd_policy;
 	dev_t			bd_dev;
+	struct device		bd_device;
 	int			bd_openers;
 	struct inode *		bd_inode;	/* will die */
 	struct super_block *	bd_super;
@@ -31,8 +38,7 @@ struct block_device {
 #ifdef CONFIG_SYSFS
 	struct list_head	bd_holder_disks;
 #endif
-	u8			bd_partno;
-	struct hd_struct *	bd_part;
+	struct kobject		*bd_holder_dir;
 	/* number of times partitions within this device have been opened. */
 	unsigned		bd_part_count;
 
@@ -44,13 +50,22 @@ struct block_device {
 	int			bd_fsfreeze_count;
 	/* Mutex for freeze */
 	struct mutex		bd_fsfreeze_mutex;
+
+	struct partition_meta_info *bd_meta_info;
+#ifdef CONFIG_FAIL_MAKE_REQUEST
+	int			bd_make_it_fail;
+#endif
 } __randomize_layout;
 
 #define bdev_whole(_bdev) \
-	((_bdev)->bd_disk->part0.bdev)
+	((_bdev)->bd_disk->part0)
+
+#define dev_to_bdev(device) \
+	container_of((device), struct block_device, bd_device)
+#define part_to_dev(part)	(&((part)->bd_device))
 
 #define bdev_kobj(_bdev) \
-	(&part_to_dev((_bdev)->bd_part)->kobj)
+	(&part_to_dev((_bdev))->kobj)
 
 /*
  * Block error status values.  See block/blk-core:blk_errors for the details.
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 696b2f9c5529d8..ed40144ab80339 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -191,7 +191,7 @@ struct request {
 	};
 
 	struct gendisk *rq_disk;
-	struct hd_struct *part;
+	struct block_device *bdev;
 #ifdef CONFIG_BLK_RQ_ALLOC_TIME
 	/* Time that the first bio started allocating this request. */
 	u64 alloc_time_ns;
@@ -1488,7 +1488,7 @@ static inline int bdev_alignment_offset(struct block_device *bdev)
 		return -1;
 	if (bdev_is_partition(bdev))
 		return queue_limit_alignment_offset(&q->limits,
-				bdev->bd_part->start_sect);
+				bdev->bd_start_sect);
 	return q->limits.alignment_offset;
 }
 
@@ -1529,7 +1529,7 @@ static inline int bdev_discard_alignment(struct block_device *bdev)
 
 	if (bdev_is_partition(bdev))
 		return queue_limit_discard_alignment(&q->limits,
-				bdev->bd_part->start_sect);
+				bdev->bd_start_sect);
 	return q->limits.discard_alignment;
 }
 
@@ -1943,9 +1943,9 @@ unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
 void disk_end_io_acct(struct gendisk *disk, unsigned int op,
 		unsigned long start_time);
 
-unsigned long part_start_io_acct(struct gendisk *disk, struct hd_struct **part,
-				 struct bio *bio);
-void part_end_io_acct(struct hd_struct *part, struct bio *bio,
+unsigned long part_start_io_acct(struct gendisk *disk,
+		struct block_device **part, struct bio *bio);
+void part_end_io_acct(struct block_device *part, struct bio *bio,
 		      unsigned long start_time);
 
 /**
@@ -1996,7 +1996,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno);
 void bdev_add(struct block_device *bdev, dev_t dev);
 struct block_device *bdget(dev_t dev);
 struct block_device *I_BDEV(struct inode *inode);
-struct block_device *bdget_part(struct hd_struct *part);
 struct block_device *bdgrab(struct block_device *bdev);
 void bdput(struct block_device *);
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index bc0469cc8fb0dc..98e1ce7d56a256 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -19,11 +19,6 @@
 #include <linux/blk_types.h>
 #include <asm/local.h>
 
-#define dev_to_disk(device)	container_of((device), struct gendisk, part0.__dev)
-#define dev_to_part(device)	container_of((device), struct hd_struct, __dev)
-#define disk_to_dev(disk)	(&(disk)->part0.__dev)
-#define part_to_dev(part)	(&((part)->__dev))
-
 extern const struct device_type disk_type;
 extern struct device_type part_type;
 extern struct class block_class;
@@ -50,23 +45,6 @@ struct partition_meta_info {
 	u8 volname[PARTITION_META_INFO_VOLNAMELTH];
 };
 
-struct hd_struct {
-	sector_t start_sect;
-	unsigned long stamp;
-	struct disk_stats __percpu *dkstats;
-	struct percpu_ref ref;
-
-	struct block_device *bdev;
-	struct device __dev;
-	struct kobject *holder_dir;
-	int policy, partno;
-	struct partition_meta_info *info;
-#ifdef CONFIG_FAIL_MAKE_REQUEST
-	int make_it_fail;
-#endif
-	struct rcu_work rcu_work;
-};
-
 /**
  * DOC: genhd capability flags
  *
@@ -141,8 +119,8 @@ enum {
 struct disk_part_tbl {
 	struct rcu_head rcu_head;
 	int len;
-	struct hd_struct __rcu *last_lookup;
-	struct hd_struct __rcu *part[];
+	struct block_device __rcu *last_lookup;
+	struct block_device __rcu *part[];
 };
 
 struct disk_events;
@@ -176,7 +154,7 @@ struct gendisk {
 	 * helpers.
 	 */
 	struct disk_part_tbl __rcu *part_tbl;
-	struct hd_struct part0;
+	struct block_device *part0;
 
 	const struct block_device_operations *fops;
 	struct request_queue *queue;
@@ -203,23 +181,17 @@ struct gendisk {
 	struct lockdep_map lockdep_map;
 };
 
+#define dev_to_disk(device) \
+	(dev_to_bdev(device)->bd_disk)
+#define disk_to_dev(disk) \
+	(part_to_dev((disk)->part0))
+
 #if IS_REACHABLE(CONFIG_CDROM)
 #define disk_to_cdi(disk)	((disk)->cdi)
 #else
 #define disk_to_cdi(disk)	NULL
 #endif
 
-static inline struct gendisk *part_to_disk(struct hd_struct *part)
-{
-	if (likely(part)) {
-		if (part->partno)
-			return dev_to_disk(part_to_dev(part)->parent);
-		else
-			return dev_to_disk(part_to_dev(part));
-	}
-	return NULL;
-}
-
 static inline int disk_max_parts(struct gendisk *disk)
 {
 	if (disk->flags & GENHD_FL_EXT_DEVT)
@@ -238,19 +210,6 @@ static inline dev_t disk_devt(struct gendisk *disk)
 	return MKDEV(disk->major, disk->first_minor);
 }
 
-static inline dev_t part_devt(struct hd_struct *part)
-{
-	return part_to_dev(part)->devt;
-}
-
-extern struct hd_struct *disk_get_part(struct gendisk *disk, int partno);
-
-static inline void disk_put_part(struct hd_struct *part)
-{
-	if (likely(part))
-		put_device(part_to_dev(part));
-}
-
 /*
  * Smarter partition iterator without context limits.
  */
@@ -261,14 +220,14 @@ static inline void disk_put_part(struct hd_struct *part)
 
 struct disk_part_iter {
 	struct gendisk		*disk;
-	struct hd_struct	*part;
+	struct block_device	*part;
 	int			idx;
 	unsigned int		flags;
 };
 
 extern void disk_part_iter_init(struct disk_part_iter *piter,
 				 struct gendisk *disk, unsigned int flags);
-extern struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter);
+struct block_device *disk_part_iter_next(struct disk_part_iter *piter);
 extern void disk_part_iter_exit(struct disk_part_iter *piter);
 extern bool disk_has_partitions(struct gendisk *disk);
 
@@ -292,7 +251,7 @@ extern void set_disk_ro(struct gendisk *disk, int flag);
 
 static inline int get_disk_ro(struct gendisk *disk)
 {
-	return disk->part0.policy;
+	return disk->part0->bd_policy;
 }
 
 extern void disk_block_events(struct gendisk *disk);
@@ -306,7 +265,7 @@ extern void rand_initialize_disk(struct gendisk *disk);
 
 static inline sector_t get_start_sect(struct block_device *bdev)
 {
-	return bdev->bd_part->start_sect;
+	return bdev->bd_start_sect;
 }
 	
 static inline sector_t bdev_nr_sectors(struct block_device *bdev)
@@ -316,7 +275,7 @@ static inline sector_t bdev_nr_sectors(struct block_device *bdev)
 	
 static inline sector_t get_capacity(struct gendisk *disk)
 {
-	return bdev_nr_sectors(disk->part0.bdev);
+	return bdev_nr_sectors(disk->part0);
 }
 
 int bdev_disk_changed(struct block_device *bdev, bool invalidate);
diff --git a/include/linux/part_stat.h b/include/linux/part_stat.h
index 24125778ef3ec7..3b3621b4983a58 100644
--- a/include/linux/part_stat.h
+++ b/include/linux/part_stat.h
@@ -25,26 +25,26 @@ struct disk_stats {
 #define part_stat_unlock()	preempt_enable()
 
 #define part_stat_get_cpu(part, field, cpu)				\
-	(per_cpu_ptr((part)->dkstats, (cpu))->field)
+	(per_cpu_ptr((part)->bd_stats, (cpu))->field)
 
 #define part_stat_get(part, field)					\
 	part_stat_get_cpu(part, field, smp_processor_id())
 
 #define part_stat_read(part, field)					\
 ({									\
-	typeof((part)->dkstats->field) res = 0;				\
+	typeof((part)->bd_stats->field) res = 0;				\
 	unsigned int _cpu;						\
 	for_each_possible_cpu(_cpu)					\
-		res += per_cpu_ptr((part)->dkstats, _cpu)->field;	\
+		res += per_cpu_ptr((part)->bd_stats, _cpu)->field;	\
 	res;								\
 })
 
-static inline void part_stat_set_all(struct hd_struct *part, int value)
+static inline void part_stat_set_all(struct block_device *bdev, int value)
 {
 	int i;
 
 	for_each_possible_cpu(i)
-		memset(per_cpu_ptr(part->dkstats, i), value,
+		memset(per_cpu_ptr(bdev->bd_stats, i), value,
 				sizeof(struct disk_stats));
 }
 
@@ -54,13 +54,12 @@ static inline void part_stat_set_all(struct hd_struct *part, int value)
 	 part_stat_read(part, field[STAT_DISCARD]))
 
 #define __part_stat_add(part, field, addnd)				\
-	__this_cpu_add((part)->dkstats->field, addnd)
+	__this_cpu_add((part)->bd_stats->field, addnd)
 
 #define part_stat_add(part, field, addnd)	do {			\
 	__part_stat_add((part), field, addnd);				\
-	if ((part)->partno)						\
-		__part_stat_add(&part_to_disk((part))->part0,		\
-				field, addnd);				\
+	if ((part)->bd_partno)						\
+		__part_stat_add((part)->bd_disk->part0, field, addnd);	\
 } while (0)
 
 #define part_stat_dec(gendiskp, field)					\
diff --git a/init/do_mounts.c b/init/do_mounts.c
index 5879edf083b318..a78e44ee6adb8d 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -76,11 +76,11 @@ struct uuidcmp {
  */
 static int match_dev_by_uuid(struct device *dev, const void *data)
 {
+	struct block_device *bdev = dev_to_bdev(dev);
 	const struct uuidcmp *cmp = data;
-	struct hd_struct *part = dev_to_part(dev);
 
-	if (!part->info ||
-	    strncasecmp(cmp->uuid, part->info->uuid, cmp->len))
+	if (!bdev->bd_meta_info ||
+	    strncasecmp(cmp->uuid, bdev->bd_meta_info->uuid, cmp->len))
 		return 0;
 	return 1;
 }
@@ -133,13 +133,13 @@ static dev_t devt_from_partuuid(const char *uuid_str)
 		 * Attempt to find the requested partition by adding an offset
 		 * to the partition number found by UUID.
 		 */
-		struct hd_struct *part;
+		struct block_device *part;
 
-		part = disk_get_part(dev_to_disk(dev),
-				     dev_to_part(dev)->partno + offset);
+		part = bdget_disk(dev_to_disk(dev),
+				  dev_to_bdev(dev)->bd_partno + offset);
 		if (part) {
-			devt = part_devt(part);
-			put_device(part_to_dev(part));
+			devt = part->bd_dev;
+			bdput(part);
 		}
 	} else {
 		devt = dev->devt;
@@ -166,10 +166,10 @@ static dev_t devt_from_partuuid(const char *uuid_str)
  */
 static int match_dev_by_label(struct device *dev, const void *data)
 {
+	struct block_device *bdev = dev_to_bdev(dev);
 	const char *label = data;
-	struct hd_struct *part = dev_to_part(dev);
 
-	if (!part->info || strcmp(label, part->info->volname))
+	if (!bdev->bd_meta_info || strcmp(label, bdev->bd_meta_info->volname))
 		return 0;
 	return 1;
 }
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 7076d588a50d69..a482a37848bff7 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -458,14 +458,9 @@ static struct rchan_callbacks blk_relay_callbacks = {
 static void blk_trace_setup_lba(struct blk_trace *bt,
 				struct block_device *bdev)
 {
-	struct hd_struct *part = NULL;
-
-	if (bdev)
-		part = bdev->bd_part;
-
-	if (part) {
-		bt->start_lba = part->start_sect;
-		bt->end_lba = part->start_sect + bdev_nr_sectors(bdev);
+	if (bdev) {
+		bt->start_lba = bdev->bd_start_sect;
+		bt->end_lba = bdev->bd_start_sect + bdev_nr_sectors(bdev);
 	} else {
 		bt->start_lba = 0;
 		bt->end_lba = -1ULL;
@@ -1815,30 +1810,15 @@ static ssize_t blk_trace_mask2str(char *buf, int mask)
 	return p - buf;
 }
 
-static struct request_queue *blk_trace_get_queue(struct block_device *bdev)
-{
-	if (bdev->bd_disk == NULL)
-		return NULL;
-
-	return bdev_get_queue(bdev);
-}
-
 static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
 					 struct device_attribute *attr,
 					 char *buf)
 {
-	struct block_device *bdev = bdget_part(dev_to_part(dev));
-	struct request_queue *q;
+	struct block_device *bdev = dev_to_bdev(dev);
+	struct request_queue *q = bdev_get_queue(bdev);
 	struct blk_trace *bt;
 	ssize_t ret = -ENXIO;
 
-	if (bdev == NULL)
-		goto out;
-
-	q = blk_trace_get_queue(bdev);
-	if (q == NULL)
-		goto out_bdput;
-
 	mutex_lock(&q->debugfs_mutex);
 
 	bt = rcu_dereference_protected(q->blk_trace,
@@ -1861,9 +1841,6 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
 
 out_unlock_bdev:
 	mutex_unlock(&q->debugfs_mutex);
-out_bdput:
-	bdput(bdev);
-out:
 	return ret;
 }
 
@@ -1871,8 +1848,8 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
 					  struct device_attribute *attr,
 					  const char *buf, size_t count)
 {
-	struct block_device *bdev;
-	struct request_queue *q;
+	struct block_device *bdev = dev_to_bdev(dev);
+	struct request_queue *q = bdev_get_queue(bdev);
 	struct blk_trace *bt;
 	u64 value;
 	ssize_t ret = -EINVAL;
@@ -1888,17 +1865,10 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
 				goto out;
 			value = ret;
 		}
-	} else if (kstrtoull(buf, 0, &value))
-		goto out;
-
-	ret = -ENXIO;
-	bdev = bdget_part(dev_to_part(dev));
-	if (bdev == NULL)
-		goto out;
-
-	q = blk_trace_get_queue(bdev);
-	if (q == NULL)
-		goto out_bdput;
+	} else {
+		if (kstrtoull(buf, 0, &value))
+			goto out;
+	}
 
 	mutex_lock(&q->debugfs_mutex);
 
@@ -1936,8 +1906,6 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
 
 out_unlock_bdev:
 	mutex_unlock(&q->debugfs_mutex);
-out_bdput:
-	bdput(bdev);
 out:
 	return ret ? ret : count;
 }
-- 
2.29.2



* [PATCH 75/78] block: stop using bdget_disk for partition 0
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (73 preceding siblings ...)
  2020-11-16 14:58 ` [PATCH 74/78] block: merge struct block_device and struct hd_struct Christoph Hellwig
@ 2020-11-16 14:58 ` Christoph Hellwig
  2020-11-16 14:58 ` [PATCH 76/78] filemap: use ->f_mapping over ->i_mapping consistently Christoph Hellwig
                   ` (4 subsequent siblings)
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

We can just dereference the part0 pointer in struct gendisk instead.
Also remove the now unused export of bdget_disk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
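Note (illustration only, not part of the patch): a minimal userspace sketch
of the pattern the conversions below follow.  The two structs are cut-down,
hypothetical stand-ins for the kernel ones, and disk_is_open() is a made-up
helper name.  It assumes disk->part0 stays valid for as long as the gendisk
itself, which is what lets callers drop the bdget_disk()/bdput() pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct block_device. */
struct block_device {
	int bd_openers;			/* number of current openers */
};

/* Hypothetical stand-in for the kernel's struct gendisk. */
struct gendisk {
	struct block_device *part0;	/* whole-disk block device */
};

/*
 * Hypothetical helper: read the opener count straight through the
 * part0 pointer, with no lookup and no reference to drop afterwards.
 */
static bool disk_is_open(const struct gendisk *disk)
{
	assert(disk != NULL && disk->part0 != NULL);
	return disk->part0->bd_openers != 0;
}
```

Where the block device has to outlive the lock it was looked up under, as in
blkfront_closing() and blkfront_remove(), the patch takes an explicit
reference with bdgrab(disk->part0) rather than a bare dereference.
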
 block/genhd.c                   |  1 -
 drivers/block/nbd.c             |  4 +---
 drivers/block/xen-blkfront.c    | 20 +++++---------------
 drivers/block/zram/zram_drv.c   | 25 ++-----------------------
 drivers/md/dm.c                 | 13 ++-----------
 drivers/s390/block/dasd_ioctl.c |  5 ++---
 6 files changed, 12 insertions(+), 56 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 5dcb8b8902daae..b2a4e68171519a 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -904,7 +904,6 @@ struct block_device *bdget_disk(struct gendisk *disk, int partno)
 
 	return bdev;
 }
-EXPORT_SYMBOL(bdget_disk);
 
 /*
  * print a full list of all partitions - intended for places where the root
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 014683968ce174..92f84ed0ba9eb6 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1488,12 +1488,10 @@ static int nbd_open(struct block_device *bdev, fmode_t mode)
 static void nbd_release(struct gendisk *disk, fmode_t mode)
 {
 	struct nbd_device *nbd = disk->private_data;
-	struct block_device *bdev = bdget_disk(disk, 0);
 
 	if (test_bit(NBD_RT_DISCONNECT_ON_CLOSE, &nbd->config->runtime_flags) &&
-			bdev->bd_openers == 0)
+			disk->part0->bd_openers == 0)
 		nbd_disconnect_and_put(nbd);
-	bdput(bdev);
 
 	nbd_config_put(nbd);
 	nbd_put(nbd);
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5b1f99ca77b734..c2721ec73d7291 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2153,7 +2153,7 @@ static void blkfront_closing(struct blkfront_info *info)
 	}
 
 	if (info->gd)
-		bdev = bdget_disk(info->gd, 0);
+		bdev = bdgrab(info->gd->part0);
 
 	mutex_unlock(&info->mutex);
 
@@ -2518,7 +2518,7 @@ static int blkfront_remove(struct xenbus_device *xbdev)
 
 	disk = info->gd;
 	if (disk)
-		bdev = bdget_disk(disk, 0);
+		bdev = bdgrab(disk->part0);
 
 	info->xbdev = NULL;
 	mutex_unlock(&info->mutex);
@@ -2595,19 +2595,11 @@ static int blkif_open(struct block_device *bdev, fmode_t mode)
 static void blkif_release(struct gendisk *disk, fmode_t mode)
 {
 	struct blkfront_info *info = disk->private_data;
-	struct block_device *bdev;
 	struct xenbus_device *xbdev;
 
 	mutex_lock(&blkfront_mutex);
-
-	bdev = bdget_disk(disk, 0);
-
-	if (!bdev) {
-		WARN(1, "Block device %s yanked out from us!\n", disk->disk_name);
+	if (disk->part0->bd_openers)
 		goto out_mutex;
-	}
-	if (bdev->bd_openers)
-		goto out;
 
 	/*
 	 * Check if we have been instructed to close. We will have
@@ -2619,7 +2611,7 @@ static void blkif_release(struct gendisk *disk, fmode_t mode)
 
 	if (xbdev && xbdev->state == XenbusStateClosing) {
 		/* pending switch to state closed */
-		dev_info(disk_to_dev(bdev->bd_disk), "releasing disk\n");
+		dev_info(disk_to_dev(disk), "releasing disk\n");
 		xlvbd_release_gendisk(info);
 		xenbus_frontend_closed(info->xbdev);
  	}
@@ -2628,14 +2620,12 @@ static void blkif_release(struct gendisk *disk, fmode_t mode)
 
 	if (!xbdev) {
 		/* sudden device removal */
-		dev_info(disk_to_dev(bdev->bd_disk), "releasing disk\n");
+		dev_info(disk_to_dev(disk), "releasing disk\n");
 		xlvbd_release_gendisk(info);
 		disk->private_data = NULL;
 		free_info(info);
 	}
 
-out:
-	bdput(bdev);
 out_mutex:
 	mutex_unlock(&blkfront_mutex);
 }
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index e765765263495f..e7a23638e2f181 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1748,7 +1748,6 @@ static ssize_t reset_store(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t len)
 {
 	struct zram *zram = dev_to_zram(dev);
-	struct block_device *bdev;
 	unsigned short do_reset;
 	int ret = 0;
 
@@ -1758,17 +1757,12 @@ static ssize_t reset_store(struct device *dev,
 	if (!do_reset)
 		return -EINVAL;
 
-	bdev = bdget_disk(zram->disk, 0);
-	if (!bdev)
-		return -ENOMEM;
-
 	mutex_lock(&zram->disk->mutex);
-	if (bdev->bd_openers)
+	if (zram->disk->part0->bd_openers)
 		ret = -EBUSY;
 	else
 		zram_reset_device(zram);
 	mutex_unlock(&zram->disk->mutex);
-	bdput(bdev);
 
 	return ret ? ret : len;
 }
@@ -1931,24 +1925,9 @@ static int zram_add(void)
 	return ret;
 }
 
-static bool zram_busy(struct zram *zram)
-{
-	struct block_device *bdev;
-	bool busy = false;
-
-	bdev = bdget_disk(zram->disk, 0);
-	if (bdev) {
-		if (bdev->bd_openers)
-			busy = true;
-		bdput(bdev);
-	}
-
-	return busy;
-}
-
 static int zram_remove(struct zram *zram)
 {
-	if (zram_busy(zram))
+	if (zram->disk->part0->bd_openers)
 		return -EBUSY;
 
 	del_gendisk(zram->disk);
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index ac46f6e41279cc..ec48ccae50dd53 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2375,17 +2375,12 @@ struct dm_table *dm_swap_table(struct mapped_device *md, struct dm_table *table)
  */
 static int lock_fs(struct mapped_device *md)
 {
-	struct block_device *bdev;
 	int r;
 
 	WARN_ON(md->frozen_sb);
 
-	bdev = bdget_disk(md->disk, 0);
-	if (!bdev)
-		return -ENOMEM;
-	md->frozen_sb = freeze_bdev(bdev);
+	md->frozen_sb = freeze_bdev(md->disk->part0);
 	if (IS_ERR(md->frozen_sb)) {
-		bdput(bdev);
 		r = PTR_ERR(md->frozen_sb);
 		md->frozen_sb = NULL;
 		return r;
@@ -2398,14 +2393,10 @@ static int lock_fs(struct mapped_device *md)
 
 static void unlock_fs(struct mapped_device *md)
 {
-	struct block_device *bdev;
-
 	if (!test_bit(DMF_FROZEN, &md->flags))
 		return;
 
-	bdev = md->frozen_sb->s_bdev;
-	thaw_bdev(bdev, md->frozen_sb);
-	bdput(bdev);
+	thaw_bdev(md->frozen_sb->s_bdev, md->frozen_sb);
 	md->frozen_sb = NULL;
 	clear_bit(DMF_FROZEN, &md->flags);
 }
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index 304eba1acf163c..9f642440894655 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -220,9 +220,8 @@ dasd_format(struct dasd_block *block, struct format_data_t *fdata)
 	 * enabling the device later.
 	 */
 	if (fdata->start_unit == 0) {
-		struct block_device *bdev = bdget_disk(block->gdp, 0);
-		bdev->bd_inode->i_blkbits = blksize_bits(fdata->blksize);
-		bdput(bdev);
+		block->gdp->part0->bd_inode->i_blkbits =
+			blksize_bits(fdata->blksize);
 	}
 
 	rc = base->discipline->format_device(base, fdata, 1);
-- 
2.29.2


* [PATCH 76/78] filemap: use ->f_mapping over ->i_mapping consistently
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Use file->f_mapping in all functions that have a struct file available
to properly handle the case where file->f_mapping !=
file_inode(file)->i_mapping.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
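
Reader's note, not part of the commit: a small userspace model of why the
two mappings can differ.  All types and helpers here are simplified,
hypothetical stand-ins; the model assumes a block device node whose page
cache lives in a separate bdev inode, as blkdev_open() arranges by setting
filp->f_mapping.

```c
#include <stddef.h>

struct inode;

struct address_space {
	struct inode *host;		/* inode that owns this page cache */
};

struct inode {
	struct address_space i_data;
	struct address_space *i_mapping;
};

struct file {
	struct inode *f_inode;		/* what file_inode() would return */
	struct address_space *f_mapping;
};

/* By default an inode's mapping is its own embedded i_data. */
static void inode_init(struct inode *inode)
{
	inode->i_data.host = inode;
	inode->i_mapping = &inode->i_data;
}

/* Model of opening a device node: I/O goes through the bdev inode. */
static void open_blockdev_node(struct file *file, struct inode *node,
			       struct inode *bdev_inode)
{
	file->f_inode = node;
	file->f_mapping = bdev_inode->i_mapping;
}

/* Going through the inode finds the device node itself. */
static struct inode *host_via_inode(const struct file *file)
{
	return file->f_inode->i_mapping->host;
}

/* Going through the file finds the inode that holds the page cache. */
static struct inode *host_via_file(const struct file *file)
{
	return file->f_mapping->host;
}
```

In this model host_via_inode() and host_via_file() return different inodes
for an opened device node, which is the case the f_mapping conversions
below are meant to get right.
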
 mm/filemap.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index d5e7c2029d16b4..3e3531a757f8db 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2887,13 +2887,13 @@ EXPORT_SYMBOL(filemap_map_pages);
 vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf)
 {
 	struct page *page = vmf->page;
-	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct inode *inode = vmf->vma->vm_file->f_mapping->host;
 	vm_fault_t ret = VM_FAULT_LOCKED;
 
 	sb_start_pagefault(inode->i_sb);
 	file_update_time(vmf->vma->vm_file);
 	lock_page(page);
-	if (page->mapping != inode->i_mapping) {
+	if (page->mapping != vmf->vma->vm_file->f_mapping) {
 		unlock_page(page);
 		ret = VM_FAULT_NOPAGE;
 		goto out;
@@ -3149,10 +3149,9 @@ void dio_warn_stale_pagecache(struct file *filp)
 {
 	static DEFINE_RATELIMIT_STATE(_rs, 86400 * HZ, DEFAULT_RATELIMIT_BURST);
 	char pathname[128];
-	struct inode *inode = file_inode(filp);
 	char *path;
 
-	errseq_set(&inode->i_mapping->wb_err, -EIO);
+	errseq_set(&filp->f_mapping->wb_err, -EIO);
 	if (__ratelimit(&_rs)) {
 		path = file_path(filp, pathname, sizeof(pathname));
 		if (IS_ERR(path))
@@ -3179,7 +3178,7 @@ generic_file_direct_write(struct kiocb *iocb, struct iov_iter *from)
 
 	if (iocb->ki_flags & IOCB_NOWAIT) {
 		/* If there are pages to writeback, return */
-		if (filemap_range_has_page(inode->i_mapping, pos,
+		if (filemap_range_has_page(file->f_mapping, pos,
 					   pos + write_len - 1))
 			return -EAGAIN;
 	} else {
-- 
2.29.2


* [PATCH 77/78] fs: simplify the get_super_thawed interface
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Merge get_super_thawed and get_super_exclusive_thawed into a single
function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
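
Reader's note, not part of the commit: a userspace sketch of folding the
shared and exclusive variants into one function with a bool argument.  The
struct, the enum and get_super_thawed_model() are hypothetical; sleeping
until the filesystem is thawed is modelled by counting down freeze_ticks.

```c
#include <stdbool.h>
#include <stddef.h>

enum umount_lock { UNLOCKED, LOCKED_SHARED, LOCKED_EXCLUSIVE };

struct super_block {
	int freeze_ticks;		/* > 0 while the model is "frozen" */
	enum umount_lock s_umount;	/* models the s_umount rwsem state */
};

/*
 * Single entry point: take s_umount shared or exclusive depending on
 * @excl, and retry until the superblock is no longer frozen.
 */
static struct super_block *get_super_thawed_model(struct super_block *sb,
						  bool excl)
{
	while (sb) {
		sb->s_umount = excl ? LOCKED_EXCLUSIVE : LOCKED_SHARED;
		if (sb->freeze_ticks == 0)
			return sb;	/* thawed: return with the lock held */
		sb->s_umount = UNLOCKED;	/* drop the lock before waiting */
		sb->freeze_ticks--;	/* stand-in for waiting on a thaw */
	}
	return NULL;			/* no superblock found */
}
```

A caller that previously picked between two function names now passes
true or false, as the quotactl_block() hunk below does.
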
 fs/quota/quota.c   |  4 ++--
 fs/super.c         | 42 +++++++++++-------------------------------
 include/linux/fs.h |  3 +--
 3 files changed, 14 insertions(+), 35 deletions(-)

diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 9af95c7a0bbe3c..21d43933213965 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -876,9 +876,9 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
 	if (IS_ERR(bdev))
 		return ERR_CAST(bdev);
 	if (quotactl_cmd_onoff(cmd))
-		sb = get_super_exclusive_thawed(bdev);
+		sb = get_super_thawed(bdev, true);
 	else if (quotactl_cmd_write(cmd))
-		sb = get_super_thawed(bdev);
+		sb = get_super_thawed(bdev, false);
 	else
 		sb = get_super(bdev);
 	bdput(bdev);
diff --git a/fs/super.c b/fs/super.c
index b327a82bc1946b..50995f8abd1bf1 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -789,8 +789,17 @@ struct super_block *get_super(struct block_device *bdev)
 }
 EXPORT_SYMBOL(get_super);
 
-static struct super_block *__get_super_thawed(struct block_device *bdev,
-					      bool excl)
+/**
+ * get_super_thawed - get thawed superblock of a device
+ * @bdev: device to get the superblock for
+ * @excl: lock s_umount exclusive if %true, else shared.
+ *
+ * Scans the superblock list and finds the superblock of the file system mounted
+ * on the device.  The superblock is returned with s_umount held once it is
+ * thawed (or immediately if it was not frozen), or %NULL if no superblock was
+ * found.
+ */
+struct super_block *get_super_thawed(struct block_device *bdev, bool excl)
 {
 	while (1) {
 		struct super_block *s = __get_super(bdev, excl);
@@ -805,37 +814,8 @@ static struct super_block *__get_super_thawed(struct block_device *bdev,
 		put_super(s);
 	}
 }
-
-/**
- *	get_super_thawed - get thawed superblock of a device
- *	@bdev: device to get the superblock for
- *
- *	Scans the superblock list and finds the superblock of the file system
- *	mounted on the device. The superblock is returned once it is thawed
- *	(or immediately if it was not frozen). %NULL is returned if no match
- *	is found.
- */
-struct super_block *get_super_thawed(struct block_device *bdev)
-{
-	return __get_super_thawed(bdev, false);
-}
 EXPORT_SYMBOL(get_super_thawed);
 
-/**
- *	get_super_exclusive_thawed - get thawed superblock of a device
- *	@bdev: device to get the superblock for
- *
- *	Scans the superblock list and finds the superblock of the file system
- *	mounted on the device. The superblock is returned once it is thawed
- *	(or immediately if it was not frozen) and s_umount semaphore is held
- *	in exclusive mode. %NULL is returned if no match is found.
- */
-struct super_block *get_super_exclusive_thawed(struct block_device *bdev)
-{
-	return __get_super_thawed(bdev, true);
-}
-EXPORT_SYMBOL(get_super_exclusive_thawed);
-
 /**
  * get_active_super - get an active reference to the superblock of a device
  * @bdev: device to get the superblock for
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8667d0cdc71e76..d026d177a526bf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3132,8 +3132,7 @@ extern struct file_system_type *get_filesystem(struct file_system_type *fs);
 extern void put_filesystem(struct file_system_type *fs);
 extern struct file_system_type *get_fs_type(const char *name);
 extern struct super_block *get_super(struct block_device *);
-extern struct super_block *get_super_thawed(struct block_device *);
-extern struct super_block *get_super_exclusive_thawed(struct block_device *bdev);
+struct super_block *get_super_thawed(struct block_device *bdev, bool excl);
 extern struct super_block *get_active_super(struct block_device *bdev);
 extern void drop_super(struct super_block *sb);
 extern void drop_super_exclusive(struct super_block *sb);
-- 
2.29.2


* [PATCH 78/78] block: remove i_bdev
From: Christoph Hellwig @ 2020-11-16 14:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

Switch the block device lookup interfaces to directly work with a dev_t
so that struct block_device references are only acquired by the
blkdev_get variants (and the blk-cgroup special case).  This means that
we no longer need an extra reference in the inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
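
Reader's note, not part of the commit: a rough userspace analogue of the
new lookup_bdev() calling convention, which returns an errno-style int and
hands the device number back through an out parameter instead of returning
a referenced struct block_device.  lookup_bdev_user() is a hypothetical
name; it uses stat(2), which follows symlinks, and it leaves out the
may_open_dev() check that the kernel function performs.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Resolve a path to a block device number; 0 on success, -errno on error. */
static int lookup_bdev_user(const char *pathname, dev_t *dev)
{
	struct stat st;

	if (!pathname || !*pathname)
		return -EINVAL;
	if (stat(pathname, &st) < 0)
		return -errno;
	if (!S_ISBLK(st.st_mode))
		return -ENOTBLK;	/* path exists but is not a block device */

	*dev = st.st_rdev;		/* only a number, no reference to drop */
	return 0;
}
```

Because the result is a plain dev_t, callers such as dm_get_dev_t() and
get_tree_mtd() below no longer need a matching bdput() on any path.
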
 block/ioctl.c                                |   3 +-
 drivers/block/loop.c                         |   8 +-
 drivers/md/dm-table.c                        |   9 +-
 drivers/mtd/mtdsuper.c                       |  17 +-
 drivers/target/target_core_file.c            |   6 +-
 drivers/usb/gadget/function/storage_common.c |   8 +-
 fs/block_dev.c                               | 206 +++++--------------
 fs/btrfs/volumes.c                           |  13 +-
 fs/inode.c                                   |   3 -
 fs/internal.h                                |   6 +-
 fs/io_uring.c                                |   2 +-
 fs/pipe.c                                    |   5 +-
 fs/quota/quota.c                             |  31 +--
 fs/statfs.c                                  |   2 +-
 fs/super.c                                   |  63 ++----
 include/linux/blkdev.h                       |   2 +-
 include/linux/fs.h                           |   4 +-
 17 files changed, 114 insertions(+), 274 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 7207b716b6c9a7..39341409927607 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -602,8 +602,7 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
 {
 	int ret;
 	void __user *argp = compat_ptr(arg);
-	struct inode *inode = file->f_mapping->host;
-	struct block_device *bdev = inode->i_bdev;
+	struct block_device *bdev = I_BDEV(file->f_mapping->host);
 	struct gendisk *disk = bdev->bd_disk;
 	fmode_t mode = file->f_mode;
 	loff_t size;
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 91e47c5b52f1cb..4a0037586f93b2 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -675,10 +675,10 @@ static int loop_validate_file(struct file *file, struct block_device *bdev)
 	while (is_loop_device(f)) {
 		struct loop_device *l;
 
-		if (f->f_mapping->host->i_bdev == bdev)
+		if (f->f_mapping->host->i_rdev == bdev->bd_dev)
 			return -EBADF;
 
-		l = f->f_mapping->host->i_bdev->bd_disk->private_data;
+		l = I_BDEV(f->f_mapping->host)->bd_disk->private_data;
 		if (l->lo_state != Lo_bound) {
 			return -EINVAL;
 		}
@@ -885,9 +885,7 @@ static void loop_config_discard(struct loop_device *lo)
 	 * file-backed loop devices: discarded regions read back as zero.
 	 */
 	if (S_ISBLK(inode->i_mode) && !lo->lo_encrypt_key_size) {
-		struct request_queue *backingq;
-
-		backingq = bdev_get_queue(inode->i_bdev);
+		struct request_queue *backingq = bdev_get_queue(I_BDEV(inode));
 
 		max_discard_sectors = backingq->limits.max_write_zeroes_sectors;
 		granularity = backingq->limits.discard_granularity ?:
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index ce543b761be7b2..dea67772171053 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -348,16 +348,9 @@ static int upgrade_mode(struct dm_dev_internal *dd, fmode_t new_mode,
 dev_t dm_get_dev_t(const char *path)
 {
 	dev_t dev;
-	struct block_device *bdev;
 
-	bdev = lookup_bdev(path);
-	if (IS_ERR(bdev))
+	if (lookup_bdev(path, &dev))
 		dev = name_to_dev_t(path);
-	else {
-		dev = bdev->bd_dev;
-		bdput(bdev);
-	}
-
 	return dev;
 }
 EXPORT_SYMBOL_GPL(dm_get_dev_t);
diff --git a/drivers/mtd/mtdsuper.c b/drivers/mtd/mtdsuper.c
index c3e2098372f2e5..38b6aa849c6383 100644
--- a/drivers/mtd/mtdsuper.c
+++ b/drivers/mtd/mtdsuper.c
@@ -120,8 +120,8 @@ int get_tree_mtd(struct fs_context *fc,
 				struct fs_context *fc))
 {
 #ifdef CONFIG_BLOCK
-	struct block_device *bdev;
-	int ret, major;
+	dev_t dev;
+	int ret;
 #endif
 	int mtdnr;
 
@@ -169,20 +169,15 @@ int get_tree_mtd(struct fs_context *fc,
 	/* try the old way - the hack where we allowed users to mount
 	 * /dev/mtdblock$(n) but didn't actually _use_ the blockdev
 	 */
-	bdev = lookup_bdev(fc->source);
-	if (IS_ERR(bdev)) {
-		ret = PTR_ERR(bdev);
+	ret = lookup_bdev(fc->source, &dev);
+	if (ret) {
 		errorf(fc, "MTD: Couldn't look up '%s': %d", fc->source, ret);
 		return ret;
 	}
 	pr_debug("MTDSB: lookup_bdev() returned 0\n");
 
-	major = MAJOR(bdev->bd_dev);
-	mtdnr = MINOR(bdev->bd_dev);
-	bdput(bdev);
-
-	if (major == MTD_BLOCK_MAJOR)
-		return mtd_get_sb_by_nr(fc, mtdnr, fill_super);
+	if (MAJOR(dev) == MTD_BLOCK_MAJOR)
+		return mtd_get_sb_by_nr(fc, MINOR(dev), fill_super);
 
 #endif /* CONFIG_BLOCK */
 
diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
index 7143d03f0e027e..b0cb5b95e892d3 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -133,10 +133,10 @@ static int fd_configure_device(struct se_device *dev)
 	 */
 	inode = file->f_mapping->host;
 	if (S_ISBLK(inode->i_mode)) {
-		struct request_queue *q = bdev_get_queue(inode->i_bdev);
+		struct request_queue *q = bdev_get_queue(I_BDEV(inode));
 		unsigned long long dev_size;
 
-		fd_dev->fd_block_size = bdev_logical_block_size(inode->i_bdev);
+		fd_dev->fd_block_size = bdev_logical_block_size(I_BDEV(inode));
 		/*
 		 * Determine the number of bytes from i_size_read() minus
 		 * one (1) logical sector from underlying struct block_device
@@ -559,7 +559,7 @@ fd_execute_unmap(struct se_cmd *cmd, sector_t lba, sector_t nolb)
 
 	if (S_ISBLK(inode->i_mode)) {
 		/* The backend is block device, use discard */
-		struct block_device *bdev = inode->i_bdev;
+		struct block_device *bdev = I_BDEV(inode);
 		struct se_device *dev = cmd->se_dev;
 
 		ret = blkdev_issue_discard(bdev,
diff --git a/drivers/usb/gadget/function/storage_common.c b/drivers/usb/gadget/function/storage_common.c
index f7e6c42558eb76..b859a158a4140e 100644
--- a/drivers/usb/gadget/function/storage_common.c
+++ b/drivers/usb/gadget/function/storage_common.c
@@ -204,7 +204,7 @@ int fsg_lun_open(struct fsg_lun *curlun, const char *filename)
 	if (!(filp->f_mode & FMODE_WRITE))
 		ro = 1;
 
-	inode = file_inode(filp);
+	inode = filp->f_mapping->host;
 	if ((!S_ISREG(inode->i_mode) && !S_ISBLK(inode->i_mode))) {
 		LINFO(curlun, "invalid file type: %s\n", filename);
 		goto out;
@@ -221,7 +221,7 @@ int fsg_lun_open(struct fsg_lun *curlun, const char *filename)
 	if (!(filp->f_mode & FMODE_CAN_WRITE))
 		ro = 1;
 
-	size = i_size_read(inode->i_mapping->host);
+	size = i_size_read(inode);
 	if (size < 0) {
 		LINFO(curlun, "unable to find file size: %s\n", filename);
 		rc = (int) size;
@@ -231,8 +231,8 @@ int fsg_lun_open(struct fsg_lun *curlun, const char *filename)
 	if (curlun->cdrom) {
 		blksize = 2048;
 		blkbits = 11;
-	} else if (inode->i_bdev) {
-		blksize = bdev_logical_block_size(inode->i_bdev);
+	} else if (S_ISBLK(inode->i_mode)) {
+		blksize = bdev_logical_block_size(I_BDEV(inode));
 		blkbits = blksize_bits(blksize);
 	} else {
 		blksize = 512;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index e1457bf76c6f34..6b43ee6ee571df 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -523,7 +523,7 @@ EXPORT_SYMBOL(sync_blockdev);
  */
 int fsync_bdev(struct block_device *bdev)
 {
-	struct super_block *sb = get_super(bdev);
+	struct super_block *sb = get_super(bdev->bd_dev, false);
 	if (sb) {
 		int res = sync_filesystem(sb);
 		drop_super(sb);
@@ -557,7 +557,7 @@ struct super_block *freeze_bdev(struct block_device *bdev)
 		 * to freeze_bdev grab an active reference and only the last
 		 * thaw_bdev drops it.
 		 */
-		sb = get_super(bdev);
+		sb = get_super(bdev->bd_dev, false);
 		if (sb)
 			drop_super(sb);
 		mutex_unlock(&bdev->bd_fsfreeze_mutex);
@@ -890,7 +890,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 
 	inode->i_mode = S_IFBLK;
 	inode->i_rdev = 0;
-	inode->i_bdev = bdev;
 	inode->i_data.a_ops = &def_blk_aops;
 
 	return bdev;
@@ -942,71 +941,8 @@ void bdput(struct block_device *bdev)
 {
 	iput(bdev->bd_inode);
 }
-
 EXPORT_SYMBOL(bdput);
  
-static struct block_device *bd_acquire(struct inode *inode)
-{
-	struct block_device *bdev;
-
-	spin_lock(&bdev_lock);
-	bdev = inode->i_bdev;
-	if (bdev && !inode_unhashed(bdev->bd_inode)) {
-		bdgrab(bdev);
-		spin_unlock(&bdev_lock);
-		return bdev;
-	}
-	spin_unlock(&bdev_lock);
-
-	/*
-	 * i_bdev references block device inode that was already shut down
-	 * (corresponding device got removed).  Remove the reference and look
-	 * up block device inode again just in case new device got
-	 * reestablished under the same device number.
-	 */
-	if (bdev)
-		bd_forget(inode);
-
-	bdev = bdget(inode->i_rdev);
-	if (!bdev) {
-		blk_request_module(inode->i_rdev);
-		bdev = bdget(inode->i_rdev);
-	}
-	if (bdev) {
-		spin_lock(&bdev_lock);
-		if (!inode->i_bdev) {
-			/*
-			 * We take an additional reference to bd_inode,
-			 * and it's released in clear_inode() of inode.
-			 * So, we can access it via ->i_mapping always
-			 * without igrab().
-			 */
-			bdgrab(bdev);
-			inode->i_bdev = bdev;
-			inode->i_mapping = bdev->bd_inode->i_mapping;
-		}
-		spin_unlock(&bdev_lock);
-	}
-	return bdev;
-}
-
-/* Call when you free inode */
-
-void bd_forget(struct inode *inode)
-{
-	struct block_device *bdev = NULL;
-
-	spin_lock(&bdev_lock);
-	if (!sb_is_blkdev_sb(inode->i_sb))
-		bdev = inode->i_bdev;
-	inode->i_bdev = NULL;
-	inode->i_mapping = &inode->i_data;
-	spin_unlock(&bdev_lock);
-
-	if (bdev)
-		bdput(bdev);
-}
-
 /**
  * bd_may_claim - test whether a block device can be claimed
  * @bdev: block device of interest
@@ -1485,32 +1421,44 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, void *holder,
 }
 
 /**
- * blkdev_get - open a block device
- * @bdev: block_device to open
+ * blkdev_get_by_dev - open a block device by device number
+ * @dev: device number of block device to open
  * @mode: FMODE_* mask
  * @holder: exclusive holder identifier
  *
- * Open @bdev with @mode.  If @mode includes %FMODE_EXCL, @bdev is
- * open with exclusive access.  Specifying %FMODE_EXCL with %NULL
- * @holder is invalid.  Exclusive opens may nest for the same @holder.
+ * Open the block device described by device number @dev.  If @mode includes
+ * %FMODE_EXCL, the block device is opened with exclusive access.  Specifying
+ * %FMODE_EXCL with a %NULL @holder is invalid.  Exclusive opens may nest for
+ * the same @holder.
  *
- * On success, the reference count of @bdev is unchanged.  On failure,
- * @bdev is put.
+ * Use this interface ONLY if you really do not have anything better - i.e. when
+ * you are behind a truly sucky interface and all you are given is a device
+ * number.  Everything else should use blkdev_get_by_path().
  *
  * CONTEXT:
  * Might sleep.
  *
  * RETURNS:
- * 0 on success, -errno on failure.
+ * Reference to the block_device on success, ERR_PTR(-errno) on failure.
  */
-static int blkdev_get(struct block_device *bdev, fmode_t mode, void *holder)
+struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
 {
+	struct block_device *bdev;
 	int ret, perm = 0;
 
 	if (mode & FMODE_READ)
 		perm |= MAY_READ;
 	if (mode & FMODE_WRITE)
 		perm |= MAY_WRITE;
+
+	bdev = bdget(dev);
+	if (!bdev) {
+		blk_request_module(dev);
+		bdev = bdget(dev);
+		if (!bdev)
+			return ERR_PTR(-ENOMEM);
+	}
+
 	ret = devcgroup_inode_permission(bdev->bd_inode, perm);
 	if (ret)
 		goto bdput;
@@ -1522,8 +1470,9 @@ static int blkdev_get(struct block_device *bdev, fmode_t mode, void *holder)
 
 bdput:
 	bdput(bdev);
-	return ret;
+	return ERR_PTR(ret);
 }
+EXPORT_SYMBOL(blkdev_get_by_dev);
 
 /**
  * blkdev_get_by_path - open a block device by name
@@ -1531,32 +1480,31 @@ static int blkdev_get(struct block_device *bdev, fmode_t mode, void *holder)
  * @mode: FMODE_* mask
  * @holder: exclusive holder identifier
  *
- * Open the blockdevice described by the device file at @path.  @mode
- * and @holder are identical to blkdev_get().
+ * Open the block device described by the device file at @path.
  *
- * On success, the returned block_device has reference count of one.
+ * If @mode includes %FMODE_EXCL, the block device is opened with exclusive
+ * access.  Specifying %FMODE_EXCL with a %NULL @holder is invalid.  Exclusive
+ * opens may nest for the same @holder.
  *
  * CONTEXT:
  * Might sleep.
  *
  * RETURNS:
- * Pointer to block_device on success, ERR_PTR(-errno) on failure.
+ * Reference to the block_device on success, ERR_PTR(-errno) on failure.
  */
 struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
 					void *holder)
 {
 	struct block_device *bdev;
-	int err;
-
-	bdev = lookup_bdev(path);
-	if (IS_ERR(bdev))
-		return bdev;
+	dev_t dev;
+	int error;
 
-	err = blkdev_get(bdev, mode, holder);
-	if (err)
-		return ERR_PTR(err);
+	error = lookup_bdev(path, &dev);
+	if (error)
+		return ERR_PTR(error);
 
-	if ((mode & FMODE_WRITE) && bdev_read_only(bdev)) {
+	bdev = blkdev_get_by_dev(dev, mode, holder);
+	if (!IS_ERR(bdev) && (mode & FMODE_WRITE) && bdev_read_only(bdev)) {
 		blkdev_put(bdev, mode);
 		return ERR_PTR(-EACCES);
 	}
@@ -1565,49 +1513,6 @@ struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
 }
 EXPORT_SYMBOL(blkdev_get_by_path);
 
-/**
- * blkdev_get_by_dev - open a block device by device number
- * @dev: device number of block device to open
- * @mode: FMODE_* mask
- * @holder: exclusive holder identifier
- *
- * Open the blockdevice described by device number @dev.  @mode and
- * @holder are identical to blkdev_get().
- *
- * Use it ONLY if you really do not have anything better - i.e. when
- * you are behind a truly sucky interface and all you are given is a
- * device number.  _Never_ to be used for internal purposes.  If you
- * ever need it - reconsider your API.
- *
- * On success, the returned block_device has reference count of one.
- *
- * CONTEXT:
- * Might sleep.
- *
- * RETURNS:
- * Pointer to block_device on success, ERR_PTR(-errno) on failure.
- */
-struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
-{
-	struct block_device *bdev;
-	int err;
-
-	bdev = bdget(dev);
-	if (!bdev) {
-		blk_request_module(dev);
-		bdev = bdget(dev);
-	}
-	if (!bdev)
-		return ERR_PTR(-ENOMEM);
-
-	err = blkdev_get(bdev, mode, holder);
-	if (err)
-		return ERR_PTR(err);
-
-	return bdev;
-}
-EXPORT_SYMBOL(blkdev_get_by_dev);
-
 static int blkdev_open(struct inode * inode, struct file * filp)
 {
 	struct block_device *bdev;
@@ -1629,14 +1534,12 @@ static int blkdev_open(struct inode * inode, struct file * filp)
 	if ((filp->f_flags & O_ACCMODE) == 3)
 		filp->f_mode |= FMODE_WRITE_IOCTL;
 
-	bdev = bd_acquire(inode);
-	if (bdev == NULL)
-		return -ENOMEM;
-
+	bdev = blkdev_get_by_dev(inode->i_rdev, filp->f_mode, filp);
+	if (IS_ERR(bdev))
+		return PTR_ERR(bdev);
 	filp->f_mapping = bdev->bd_inode->i_mapping;
 	filp->f_wb_err = filemap_sample_wb_err(filp->f_mapping);
-
-	return blkdev_get(bdev, filp->f_mode, filp);
+	return 0;
 }
 
 static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
@@ -1939,43 +1842,38 @@ const struct file_operations def_blk_fops = {
  * namespace if possible and return it.  Return ERR_PTR(error)
  * otherwise.
  */
-struct block_device *lookup_bdev(const char *pathname)
+int lookup_bdev(const char *pathname, dev_t *dev)
 {
-	struct block_device *bdev;
 	struct inode *inode;
 	struct path path;
 	int error;
 
 	if (!pathname || !*pathname)
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 
 	error = kern_path(pathname, LOOKUP_FOLLOW, &path);
 	if (error)
-		return ERR_PTR(error);
+		return error;
 
 	inode = d_backing_inode(path.dentry);
 	error = -ENOTBLK;
 	if (!S_ISBLK(inode->i_mode))
-		goto fail;
+		goto out_path_put;
 	error = -EACCES;
 	if (!may_open_dev(&path))
-		goto fail;
-	error = -ENOMEM;
-	bdev = bd_acquire(inode);
-	if (!bdev)
-		goto fail;
-out:
+		goto out_path_put;
+
+	*dev = inode->i_rdev;
+	error = 0;
+out_path_put:
 	path_put(&path);
-	return bdev;
-fail:
-	bdev = ERR_PTR(error);
-	goto out;
+	return error;
 }
 EXPORT_SYMBOL(lookup_bdev);
 
 int __invalidate_device(struct block_device *bdev, bool kill_dirty)
 {
-	struct super_block *sb = get_super(bdev);
+	struct super_block *sb = get_super(bdev->bd_dev, false);
 	int res = 0;
 
 	if (sb) {
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index ce43732f945f45..76dedfcbd03716 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -929,16 +929,16 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 		 * make sure it's the same device if the device is mounted
 		 */
 		if (device->bdev) {
-			struct block_device *path_bdev;
+			int error;
+			dev_t path_dev;
 
-			path_bdev = lookup_bdev(path);
-			if (IS_ERR(path_bdev)) {
+			error = lookup_bdev(path, &path_dev);
+			if (error) {
 				mutex_unlock(&fs_devices->device_list_mutex);
-				return ERR_CAST(path_bdev);
+				return ERR_PTR(error);
 			}
 
-			if (device->bdev != path_bdev) {
-				bdput(path_bdev);
+			if (device->bdev->bd_dev != path_dev) {
 				mutex_unlock(&fs_devices->device_list_mutex);
 				btrfs_warn_in_rcu(device->fs_info,
 	"duplicate device %s devid %llu generation %llu scanned by %s (%d)",
@@ -947,7 +947,6 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 						  task_pid_nr(current));
 				return ERR_PTR(-EEXIST);
 			}
-			bdput(path_bdev);
 			btrfs_info_in_rcu(device->fs_info,
 	"devid %llu device path %s changed to %s scanned by %s (%d)",
 					  devid, rcu_str_deref(device->name),
diff --git a/fs/inode.c b/fs/inode.c
index 9d78c37b00b817..cb008acf0efdb8 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -155,7 +155,6 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	inode->i_bytes = 0;
 	inode->i_generation = 0;
 	inode->i_pipe = NULL;
-	inode->i_bdev = NULL;
 	inode->i_cdev = NULL;
 	inode->i_link = NULL;
 	inode->i_dir_seq = 0;
@@ -580,8 +579,6 @@ static void evict(struct inode *inode)
 		truncate_inode_pages_final(&inode->i_data);
 		clear_inode(inode);
 	}
-	if (S_ISBLK(inode->i_mode) && inode->i_bdev)
-		bd_forget(inode);
 	if (S_ISCHR(inode->i_mode) && inode->i_cdev)
 		cd_forget(inode);
 
diff --git a/fs/internal.h b/fs/internal.h
index a7cd0f64faa4ab..36f87f0ac4f969 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -25,7 +25,6 @@ extern void __init bdev_cache_init(void);
 extern int __sync_blockdev(struct block_device *bdev, int wait);
 void iterate_bdevs(void (*)(struct block_device *, void *), void *);
 void emergency_thaw_bdev(struct super_block *sb);
-void bd_forget(struct inode *inode);
 #else
 static inline void bdev_cache_init(void)
 {
@@ -43,9 +42,6 @@ static inline int emergency_thaw_bdev(struct super_block *sb)
 {
 	return 0;
 }
-static inline void bd_forget(struct inode *inode)
-{
-}
 #endif /* CONFIG_BLOCK */
 
 /*
@@ -114,7 +110,7 @@ extern struct file *alloc_empty_file_noaccount(int, const struct cred *);
  */
 extern int reconfigure_super(struct fs_context *);
 extern bool trylock_super(struct super_block *sb);
-extern struct super_block *user_get_super(dev_t);
+struct super_block *get_super(dev_t, bool excl);
 extern bool mount_capable(struct fs_context *);
 
 /*
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4ead291b2976f3..84d2fae8518471 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2733,7 +2733,7 @@ static bool io_file_supports_async(struct file *file, int rw)
 	umode_t mode = file_inode(file)->i_mode;
 
 	if (S_ISBLK(mode)) {
-		if (io_bdev_nowait(file->f_inode->i_bdev))
+		if (io_bdev_nowait(I_BDEV(file->f_mapping->host)))
 			return true;
 		return false;
 	}
diff --git a/fs/pipe.c b/fs/pipe.c
index 0ac197658a2d6e..c5989cfd564d45 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -1342,9 +1342,8 @@ static long pipe_set_size(struct pipe_inode_info *pipe, unsigned long arg)
 }
 
 /*
- * After the inode slimming patch, i_pipe/i_bdev/i_cdev share the same
- * location, so checking ->i_pipe is not enough to verify that this is a
- * pipe.
+ * Note that i_pipe and i_cdev share the same location, so checking ->i_pipe is
+ * not enough to verify that this is a pipe.
  */
 struct pipe_inode_info *get_pipe_info(struct file *file, bool for_splice)
 {
diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 21d43933213965..3087225b90880c 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -20,6 +20,7 @@
 #include <linux/writeback.h>
 #include <linux/nospec.h>
 #include "compat.h"
+#include "../internal.h"
 
 static int check_quotactl_permission(struct super_block *sb, int type, int cmd,
 				     qid_t id)
@@ -864,31 +865,31 @@ static bool quotactl_cmd_onoff(int cmd)
  */
 static struct super_block *quotactl_block(const char __user *special, int cmd)
 {
-#ifdef CONFIG_BLOCK
-	struct block_device *bdev;
 	struct super_block *sb;
-	struct filename *tmp = getname(special);
+	struct filename *tmp;
+	int error;
+	dev_t dev;
+
+	if (!IS_ENABLED(CONFIG_BLOCK))
+		return ERR_PTR(-ENODEV);
 
+	tmp = getname(special);
 	if (IS_ERR(tmp))
 		return ERR_CAST(tmp);
-	bdev = lookup_bdev(tmp->name);
-	putname(tmp);
-	if (IS_ERR(bdev))
-		return ERR_CAST(bdev);
+	error = lookup_bdev(tmp->name, &dev);
+	if (error)
+		return ERR_PTR(error);
+
 	if (quotactl_cmd_onoff(cmd))
-		sb = get_super_thawed(bdev, true);
+		sb = get_super_thawed(dev, true);
 	else if (quotactl_cmd_write(cmd))
-		sb = get_super_thawed(bdev, false);
+		sb = get_super_thawed(dev, false);
 	else
-		sb = get_super(bdev);
-	bdput(bdev);
+		sb = get_super(dev, false);
+
 	if (!sb)
 		return ERR_PTR(-ENODEV);
-
 	return sb;
-#else
-	return ERR_PTR(-ENODEV);
-#endif
 }
 
 /*
diff --git a/fs/statfs.c b/fs/statfs.c
index 59f33752c1311f..52230a9814337a 100644
--- a/fs/statfs.c
+++ b/fs/statfs.c
@@ -235,7 +235,7 @@ SYSCALL_DEFINE3(fstatfs64, unsigned int, fd, size_t, sz, struct statfs64 __user
 
 static int vfs_ustat(dev_t dev, struct kstatfs *sbuf)
 {
-	struct super_block *s = user_get_super(dev);
+	struct super_block *s = get_super(dev, false);
 	int err;
 	if (!s)
 		return -EINVAL;
diff --git a/fs/super.c b/fs/super.c
index 50995f8abd1bf1..ffc16e4eee99c4 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -740,19 +740,25 @@ void iterate_supers_type(struct file_system_type *type,
 
 EXPORT_SYMBOL(iterate_supers_type);
 
-static struct super_block *__get_super(struct block_device *bdev, bool excl)
+/**
+ * get_super - get the superblock of a device
+ * @dev: device to get the superblock for
+ * @excl: lock s_umount exclusive if %true, else shared.
+ *
+ * Scans the superblock list and finds the superblock of the file system mounted
+ * on the device.  The superblock is returned with s_umount held, or %NULL if no
+ * superblock was found.
+ */
+struct super_block *get_super(dev_t dev, bool excl)
 {
 	struct super_block *sb;
 
-	if (!bdev)
-		return NULL;
-
 	spin_lock(&sb_lock);
 rescan:
 	list_for_each_entry(sb, &super_blocks, s_list) {
 		if (hlist_unhashed(&sb->s_instances))
 			continue;
-		if (sb->s_bdev == bdev) {
+		if (sb->s_dev == dev) {
 			sb->s_count++;
 			spin_unlock(&sb_lock);
 			if (!excl)
@@ -776,22 +782,9 @@ static struct super_block *__get_super(struct block_device *bdev, bool excl)
 	return NULL;
 }
 
-/**
- *	get_super - get the superblock of a device
- *	@bdev: device to get the superblock for
- *
- *	Scans the superblock list and finds the superblock of the file system
- *	mounted on the device given. %NULL is returned if no match is found.
- */
-struct super_block *get_super(struct block_device *bdev)
-{
-	return __get_super(bdev, false);
-}
-EXPORT_SYMBOL(get_super);
-
 /**
  * get_super_thawed - get thawed superblock of a device
- * @bdev: device to get the superblock for
+ * @dev: device to get the superblock for
  * @excl: lock s_umount exclusive if %true, else shared.
  *
  * Scans the superblock list and finds the superblock of the file system mounted
@@ -799,10 +792,11 @@ EXPORT_SYMBOL(get_super);
  * thawed (or immediately if it was not frozen), or %NULL if no superblock was
  * found.
  */
-struct super_block *get_super_thawed(struct block_device *bdev, bool excl)
+struct super_block *get_super_thawed(dev_t dev, bool excl)
 {
 	while (1) {
-		struct super_block *s = __get_super(bdev, excl);
+		struct super_block *s = get_super(dev, excl);
+
 		if (!s || s->s_writers.frozen == SB_UNFROZEN)
 			return s;
 		if (!excl)
@@ -847,33 +841,6 @@ struct super_block *get_active_super(struct block_device *bdev)
 	return NULL;
 }
 
-struct super_block *user_get_super(dev_t dev)
-{
-	struct super_block *sb;
-
-	spin_lock(&sb_lock);
-rescan:
-	list_for_each_entry(sb, &super_blocks, s_list) {
-		if (hlist_unhashed(&sb->s_instances))
-			continue;
-		if (sb->s_dev ==  dev) {
-			sb->s_count++;
-			spin_unlock(&sb_lock);
-			down_read(&sb->s_umount);
-			/* still alive? */
-			if (sb->s_root && (sb->s_flags & SB_BORN))
-				return sb;
-			up_read(&sb->s_umount);
-			/* nope, got unmounted */
-			spin_lock(&sb_lock);
-			__put_super(sb);
-			goto rescan;
-		}
-	}
-	spin_unlock(&sb_lock);
-	return NULL;
-}
-
 /**
  * reconfigure_super - asks filesystem to change superblock parameters
  * @fc: The superblock and configuration
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ed40144ab80339..9dc44f1ae22bb1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1973,7 +1973,7 @@ int bdev_read_only(struct block_device *bdev);
 int set_blocksize(struct block_device *bdev, int size);
 
 const char *bdevname(struct block_device *bdev, char *buffer);
-struct block_device *lookup_bdev(const char *);
+int lookup_bdev(const char *pathname, dev_t *dev);
 
 void blkdev_show(struct seq_file *seqf, off_t offset);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d026d177a526bf..bd16b8ad5dde32 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -696,7 +696,6 @@ struct inode {
 	struct list_head	i_devices;
 	union {
 		struct pipe_inode_info	*i_pipe;
-		struct block_device	*i_bdev;
 		struct cdev		*i_cdev;
 		char			*i_link;
 		unsigned		i_dir_seq;
@@ -3131,8 +3130,7 @@ extern int vfs_readlink(struct dentry *, char __user *, int);
 extern struct file_system_type *get_filesystem(struct file_system_type *fs);
 extern void put_filesystem(struct file_system_type *fs);
 extern struct file_system_type *get_fs_type(const char *name);
-extern struct super_block *get_super(struct block_device *);
-struct super_block *get_super_thawed(struct block_device *bdev, bool excl);
+struct super_block *get_super_thawed(dev_t dev, bool excl);
 extern struct super_block *get_active_super(struct block_device *bdev);
 extern void drop_super(struct super_block *sb);
 extern void drop_super_exclusive(struct super_block *sb);
-- 
2.29.2
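
The lookup_bdev() interface change in this patch (an int return value plus a
dev_t out-parameter, instead of an ERR_PTR-encoded struct block_device) can be
illustrated with a minimal userspace sketch.  This is not the kernel
implementation: lookup_bdev_sketch is a hypothetical name, and stat(2) stands
in for the kernel's path lookup.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical userspace analogue of the new calling convention:
 * return 0 and fill *dev on success, or a negative errno on failure.
 */
int lookup_bdev_sketch(const char *pathname, dev_t *dev)
{
	struct stat st;

	assert(pathname && dev);

	if (stat(pathname, &st) < 0)
		return -errno;
	if (!S_ISBLK(st.st_mode))
		return -ENOTBLK;	/* path exists but is not a block device */

	*dev = st.st_rdev;		/* device number, no reference held */
	return 0;
}
```

Because the caller only receives a device number, there is no reference to
drop afterwards, which is why the bdput() calls disappear from the callers
converted above.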



* Re: cleanup updating the size of block devices v3
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (77 preceding siblings ...)
  2020-11-16 14:58 ` [PATCH 78/78] block: remove i_bdev Christoph Hellwig
@ 2020-11-16 15:05 ` Christoph Hellwig
  2020-11-16 15:40 ` Jens Axboe
  79 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-16 15:05 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Justin Sanders, Mike Snitzer, Michael S. Tsirkin, Jason Wang,
	linux-nvme, Song Liu, dm-devel, drbd-dev, linux-scsi, xen-devel,
	Ilya Dryomov, Jack Wang, Konrad Rzeszutek Wilk, Josef Bacik, nbd,
	linux-raid, Stefan Hajnoczi, ceph-devel, linux-block,
	Martin K. Petersen, Minchan Kim, linux-fsdevel, Paolo Bonzini,
	Roger Pau Monné

Oops,

this is a bigger patch bomb than intended.  Only patches 1-23 are part of
this series, which should be ready to be applied once for-5.11/block pulls
in 5.10-rc4.

After that follow patches already in for-5.11/block and my current hot
off the press development branch.


* Re: cleanup updating the size of block devices v3
  2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
                   ` (78 preceding siblings ...)
  2020-11-16 15:05 ` cleanup updating the size of block devices v3 Christoph Hellwig
@ 2020-11-16 15:40 ` Jens Axboe
  79 siblings, 0 replies; 113+ messages in thread
From: Jens Axboe @ 2020-11-16 15:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 7:56 AM, Christoph Hellwig wrote:
> Hi Jens,
> 
> this series builds on top of the work that went into the last merge window,
> and makes sure we have a single coherent interface for updating the size of a
> block device.
> 
> Changes since v2:
>  - rebased to the set_capacity_revalidate_and_notify in mainline
>  - keep the loop_set_size function
>  - fix two mixed up acks

Applied 1-23 for 5.11, thanks.

-- 
Jens Axboe



* Re: [PATCH 28/78] md: implement ->set_read_only to hook into BLKROSET processing
  2020-11-16 14:57 ` [PATCH 28/78] md: " Christoph Hellwig
@ 2020-11-16 17:37   ` Song Liu
  0 siblings, 0 replies; 113+ messages in thread
From: Song Liu @ 2020-11-16 17:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Martin K. Petersen, dm-devel,
	linux-block, drbd-dev, nbd, ceph-devel, xen-devel, linux-raid,
	linux-nvme, linux-scsi, Linux-Fsdevel

On Mon, Nov 16, 2020 at 6:58 AM Christoph Hellwig <hch@lst.de> wrote:
>
> Implement the ->set_read_only method instead of parsing the actual
> ioctl command.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Acked-by: Song Liu <song@kernel.org>

[...]


* Re: [PATCH 30/78] block: don't call into the driver for BLKROSET
  2020-11-16 14:57 ` [PATCH 30/78] block: don't call into the driver for BLKROSET Christoph Hellwig
@ 2020-11-20  7:19   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:19 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Now that all drivers that want to hook into setting or clearing the
> read-only flag use the set_read_only method, this code can be removed.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/ioctl.c | 23 -----------------------
>   1 file changed, 23 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
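
As a rough illustration of the pattern reviewed here: once drivers expose an
optional set_read_only hook, the generic BLKROSET path can call that hook
directly instead of forwarding the raw ioctl command to the driver.  The sketch
below only models the control flow in userspace; every name with a _sketch
suffix is hypothetical and not taken from the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct bdev_sketch;

/* Hypothetical stand-in for the driver operations table. */
struct bdev_ops_sketch {
	/* Optional: lets the driver veto or act on a read-only change. */
	int (*set_read_only)(struct bdev_sketch *bdev, bool ro);
};

struct bdev_sketch {
	const struct bdev_ops_sketch *ops;
	bool read_only;
};

/* Generic handler: consult the hook if present, then record the flag. */
int roset_sketch(struct bdev_sketch *bdev, bool ro)
{
	assert(bdev && bdev->ops);

	if (bdev->ops->set_read_only) {
		int ret = bdev->ops->set_read_only(bdev, ro);

		if (ret)
			return ret;	/* driver refused, flag stays unchanged */
	}
	bdev->read_only = ro;
	return 0;
}

/* Example driver hook that refuses to make the device writable. */
int always_ro_hook_sketch(struct bdev_sketch *bdev, bool ro)
{
	(void)bdev;
	return ro ? 0 : -EACCES;
}
```

A driver without special needs simply leaves the hook NULL and gets the
generic behaviour.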


* Re: [PATCH 31/78] loop: use set_disk_ro
  2020-11-16 14:57 ` [PATCH 31/78] loop: use set_disk_ro Christoph Hellwig
@ 2020-11-20  7:20   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:20 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Use set_disk_ro instead of set_device_ro to match all other block
> drivers and to ensure all partitions mirror the read-only flag.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/block/loop.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 84a36c242e5550..41caf799df721f 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1134,7 +1134,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
>   	if (error)
>   		goto out_unlock;
>   
> -	set_device_ro(bdev, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
> +	set_disk_ro(lo->lo_disk, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
>   
>   	lo->use_dio = lo->lo_flags & LO_FLAGS_DIRECT_IO;
>   	lo->lo_device = bdev;
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 32/78] block: remove set_device_ro
  2020-11-16 14:57 ` [PATCH 32/78] block: remove set_device_ro Christoph Hellwig
@ 2020-11-20  7:20   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:20 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Fold set_device_ro into its only remaining caller.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/genhd.c         | 7 -------
>   block/ioctl.c         | 2 +-
>   include/linux/genhd.h | 1 -
>   3 files changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index 8c350fecfe8bfe..b0f0b0cac9aa7f 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -1843,13 +1843,6 @@ static void set_disk_ro_uevent(struct gendisk *gd, int ro)
>   	kobject_uevent_env(&disk_to_dev(gd)->kobj, KOBJ_CHANGE, envp);
>   }
>   
> -void set_device_ro(struct block_device *bdev, int flag)
> -{
> -	bdev->bd_part->policy = flag;
> -}
> -
> -EXPORT_SYMBOL(set_device_ro);
> -
>   void set_disk_ro(struct gendisk *disk, int flag)
>   {
>   	struct disk_part_iter piter;
> diff --git a/block/ioctl.c b/block/ioctl.c
> index 96cb4544736468..04255dc5f3bff3 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -371,7 +371,7 @@ static int blkdev_roset(struct block_device *bdev, fmode_t mode,
>   		if (ret)
>   			return ret;
>   	}
> -	set_device_ro(bdev, n);
> +	bdev->bd_part->policy = n;
>   	return 0;
>   }
>   
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 4b22bfd9336e1a..8427ad8bef520d 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -304,7 +304,6 @@ extern void del_gendisk(struct gendisk *gp);
>   extern struct gendisk *get_gendisk(dev_t dev, int *partno);
>   extern struct block_device *bdget_disk(struct gendisk *disk, int partno);
>   
> -extern void set_device_ro(struct block_device *bdev, int flag);
>   extern void set_disk_ro(struct gendisk *disk, int flag);
>   
>   static inline int get_disk_ro(struct gendisk *disk)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 33/78] block: remove __blkdev_driver_ioctl
  2020-11-16 14:57 ` [PATCH 33/78] block: remove __blkdev_driver_ioctl Christoph Hellwig
@ 2020-11-20  7:22   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:22 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Just open code it in the few callers.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/ioctl.c               | 25 +++++--------------------
>   drivers/block/pktcdvd.c     |  6 ++++--
>   drivers/md/bcache/request.c |  5 +++--
>   drivers/md/dm.c             |  5 ++++-
>   include/linux/blkdev.h      |  2 --
>   5 files changed, 16 insertions(+), 27 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 34/78] block: propagate BLKROSET to all partitions
  2020-11-16 14:57 ` [PATCH 34/78] block: propagate BLKROSET to all partitions Christoph Hellwig
@ 2020-11-20  7:23   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:23 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> When setting the whole device read-only (or clearing the read-only
> state), also update the policy for all partitions.  The s390 dasd
> driver has always been doing this and it makes a lot of sense.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/ioctl.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/block/ioctl.c b/block/ioctl.c
> index 6b785181344fe1..22f394d118c302 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -354,7 +354,10 @@ static int blkdev_roset(struct block_device *bdev, fmode_t mode,
>   		if (ret)
>   			return ret;
>   	}
> -	bdev->bd_part->policy = n;
> +	if (bdev_is_partition(bdev))
> +		bdev->bd_part->policy = n;
> +	else
> +		set_disk_ro(bdev->bd_disk, n);
>   	return 0;
>   }
>   
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
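
The behaviour change in the quoted hunk can be modelled in a few lines of
userspace C.  This is a sketch under simplifying assumptions: the disk is a
fixed array of per-partition policy flags with slot 0 standing for the whole
device, and all _sketch names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PARTS_SKETCH 4

/* Hypothetical model: slot 0 is the whole device, 1..N are partitions. */
struct disk_sketch {
	bool policy[MAX_PARTS_SKETCH + 1];
};

/* Models set_disk_ro(): apply the flag to the disk and every partition. */
void set_disk_ro_sketch(struct disk_sketch *disk, bool ro)
{
	assert(disk);
	for (size_t i = 0; i <= MAX_PARTS_SKETCH; i++)
		disk->policy[i] = ro;
}

/* Models the patched tail of blkdev_roset(); partno 0 is the whole device. */
void roset_policy_sketch(struct disk_sketch *disk, size_t partno, bool ro)
{
	assert(disk && partno <= MAX_PARTS_SKETCH);
	if (partno != 0)
		disk->policy[partno] = ro;	/* partition: only itself */
	else
		set_disk_ro_sketch(disk, ro);	/* whole disk: propagate */
}
```

Setting the flag on a single partition leaves its siblings alone, while setting
it on the whole device overwrites any earlier per-partition setting.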


* Re: [PATCH 53/78] blk-cgroup: fix a hd_struct leak in blkcg_fill_root_iostats
  2020-11-16 14:57 ` [PATCH 53/78] blk-cgroup: fix a hd_struct leak in blkcg_fill_root_iostats Christoph Hellwig
@ 2020-11-20  7:25   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:25 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> disk_get_part needs to be paired with a disk_put_part.
> 
> Fixes: ef45fe470e1 ("blk-cgroup: show global disk stats in root cgroup io.stat")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/blk-cgroup.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index c68bdf58c9a6e1..54fbe1e80cc41a 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -849,6 +849,7 @@ static void blkcg_fill_root_iostats(void)
>   			blkg_iostat_set(&blkg->iostat.cur, &tmp);
>   			u64_stats_update_end(&blkg->iostat.sync);
>   		}
> +		disk_put_part(part);
>   	}
>   }
>   
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
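
The pairing rule this fix restores (every successful get is matched by a put
before the loop moves on) is shown below as a minimal userspace sketch.  The
refcounted part_sketch structure and its helpers are hypothetical stand-ins,
not the kernel's hd_struct API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for a partition. */
struct part_sketch {
	int refcount;
};

struct part_sketch *part_get_sketch(struct part_sketch *part)
{
	assert(part);
	part->refcount++;
	return part;
}

void part_put_sketch(struct part_sketch *part)
{
	assert(part && part->refcount > 0);
	part->refcount--;
}

/* Visit every entry, dropping each reference before moving on. */
void visit_all_sketch(struct part_sketch *parts, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct part_sketch *part = part_get_sketch(&parts[i]);

		/* ... read statistics from 'part' here ... */

		part_put_sketch(part);	/* without this, one ref leaks per pass */
	}
}
```

With the put in place the reference counts are unchanged no matter how often
the loop runs; without it they would grow by one on every pass and the objects
could never be freed.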


* Re: [PATCH 54/78] block: remove a duplicate __disk_get_part prototype
  2020-11-16 14:57 ` [PATCH 54/78] block: remove a duplicate __disk_get_part prototype Christoph Hellwig
@ 2020-11-20  7:25   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:25 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   include/linux/genhd.h | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 46553d6d602563..22f5b9fd96f8bf 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -250,7 +250,6 @@ static inline dev_t part_devt(struct hd_struct *part)
>   	return part_to_dev(part)->devt;
>   }
>   
> -extern struct hd_struct *__disk_get_part(struct gendisk *disk, int partno);
>   extern struct hd_struct *disk_get_part(struct gendisk *disk, int partno);
>   
>   static inline void disk_put_part(struct hd_struct *part)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 55/78] block: change the hash used for looking up block devices
  2020-11-16 14:57 ` [PATCH 55/78] block: change the hash used for looking up block devices Christoph Hellwig
@ 2020-11-20  7:26   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:26 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Adding the minor to the major creates tons of pointless conflicts. Just
> use the dev_t itself, which is 32 bits wide and thus is guaranteed to fit
> into ino_t.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   fs/block_dev.c | 26 ++------------------------
>   1 file changed, 2 insertions(+), 24 deletions(-)
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index d8664f5c1ff669..29db12c3bb501c 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -870,35 +870,12 @@ void __init bdev_cache_init(void)
>   	blockdev_superblock = bd_mnt->mnt_sb;   /* For writeback */
>   }
>   
> -/*
> - * Most likely _very_ bad one - but then it's hardly critical for small
> - * /dev and can be fixed when somebody will need really large one.
> - * Keep in mind that it will be fed through icache hash function too.
> - */
> -static inline unsigned long hash(dev_t dev)
> -{
> -	return MAJOR(dev)+MINOR(dev);
> -}
> -
> -static int bdev_test(struct inode *inode, void *data)
> -{
> -	return BDEV_I(inode)->bdev.bd_dev == *(dev_t *)data;
> -}
> -
> -static int bdev_set(struct inode *inode, void *data)
> -{
> -	BDEV_I(inode)->bdev.bd_dev = *(dev_t *)data;
> -	return 0;
> -}
> -
>   static struct block_device *bdget(dev_t dev)
>   {
>   	struct block_device *bdev;
>   	struct inode *inode;
>   
> -	inode = iget5_locked(blockdev_superblock, hash(dev),
> -			bdev_test, bdev_set, &dev);
> -
> +	inode = iget_locked(blockdev_superblock, dev);
>   	if (!inode)
>   		return NULL;
>   
> @@ -910,6 +887,7 @@ static struct block_device *bdget(dev_t dev)
>   		bdev->bd_super = NULL;
>   		bdev->bd_inode = inode;
>   		bdev->bd_part_count = 0;
> +		bdev->bd_dev = dev;
>   		inode->i_mode = S_IFBLK;
>   		inode->i_rdev = dev;
>   		inode->i_bdev = bdev;
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
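
A small worked example of the conflicts mentioned in the commit message: with
the removed hash, any two devices whose major and minor numbers add up to the
same value land in the same bucket.  The macros below mirror the kernel's
dev_t layout (20 minor bits) but carry a _SKETCH suffix to make clear that this
is a standalone userspace illustration, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the kernel's dev_t layout: 12-bit major above a 20-bit minor. */
#define MINORBITS_SKETCH 20
#define MKDEV_SKETCH(ma, mi) \
	(((uint32_t)(ma) << MINORBITS_SKETCH) | (uint32_t)(mi))
#define MAJOR_SKETCH(dev) ((uint32_t)(dev) >> MINORBITS_SKETCH)
#define MINOR_SKETCH(dev) ((uint32_t)(dev) & ((1U << MINORBITS_SKETCH) - 1))

static_assert(MINORBITS_SKETCH < 32, "minor bits must fit in a 32-bit dev_t");

/* The removed hash: distinct devices collide whenever major+minor match. */
unsigned long old_hash_sketch(uint32_t dev)
{
	return MAJOR_SKETCH(dev) + MINOR_SKETCH(dev);
}

/* The replacement: the 32-bit dev_t itself is used directly as the key. */
unsigned long new_hash_sketch(uint32_t dev)
{
	return dev;
}
```

For instance, devices 8:1 and 1:8 both hash to 9 under the old scheme, while
their dev_t values, and therefore their new keys, differ.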


* Re: [PATCH 56/78] init: refactor name_to_dev_t
  2020-11-16 14:57 ` [PATCH 56/78] init: refactor name_to_dev_t Christoph Hellwig
@ 2020-11-20  7:31   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:31 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Split each case into a self-contained helper.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   include/linux/genhd.h |   7 +-
>   init/do_mounts.c      | 183 +++++++++++++++++++++---------------------
>   2 files changed, 91 insertions(+), 99 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 57/78] init: refactor devt_from_partuuid
  2020-11-16 14:57 ` [PATCH 57/78] init: refactor devt_from_partuuid Christoph Hellwig
@ 2020-11-20  7:33   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:33 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> The code in devt_from_partuuid is very convoluted.  Refactor a bit by
> sanitizing the goto and variable name usage.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   init/do_mounts.c | 68 ++++++++++++++++++++++--------------------------
>   1 file changed, 31 insertions(+), 37 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 58/78] init: cleanup match_dev_by_uuid and match_dev_by_label
  2020-11-16 14:57 ` [PATCH 58/78] init: cleanup match_dev_by_uuid and match_dev_by_label Christoph Hellwig
@ 2020-11-20  7:34   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:34 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Avoid a totally pointless goto label, and use the same style of
> comparison for both helpers.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   init/do_mounts.c | 18 ++++++------------
>   1 file changed, 6 insertions(+), 12 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 59/78] mtip32xx: remove the call to fsync_bdev on removal
  2020-11-16 14:57 ` [PATCH 59/78] mtip32xx: remove the call to fsync_bdev on removal Christoph Hellwig
@ 2020-11-20  7:35   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:35 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> del_gendisk already calls fsync_bdev for every partition, no need
> to do this twice.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/block/mtip32xx/mtip32xx.c | 15 ---------------
>   drivers/block/mtip32xx/mtip32xx.h |  2 --
>   2 files changed, 17 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 60/78] zram: remove the claim mechanism
  2020-11-16 14:57 ` [PATCH 60/78] zram: remove the claim mechanism Christoph Hellwig
@ 2020-11-20  7:37   ` Hannes Reinecke
  2020-11-26  1:11   ` Minchan Kim
  1 sibling, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:37 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> The zram claim mechanism was added to ensure no new opens come in
> during teardown.  But the proper way to achieve that is to call
> del_gendisk first, which takes care of all that.  Once del_gendisk
> is called in the right place, the reset side can also be simplified
> as no I/O can be outstanding on a block device that is not open.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/block/zram/zram_drv.c | 76 ++++++++++-------------------------
>   1 file changed, 21 insertions(+), 55 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
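
For readers following the thread, the idea in the commit message above can be
sketched in userspace C.  This is a toy model, not the zram driver: struct
toy_zram and the toy_*() helpers are hypothetical names, and the mutex that the
real driver takes around the opener check is left out.  It only illustrates
that once the disk is unregistered first, new opens fail by themselves, so a
plain opener count is enough and no separate claim flag is needed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a zram device; not the kernel structure. */
struct toy_zram {
	bool registered;	/* false once the disk has been deleted */
	int openers;		/* number of current opens */
	size_t disksize;	/* state that a reset clears */
};

/* New opens fail once the disk is gone, so no claim flag is required. */
static int toy_open(struct toy_zram *z)
{
	if (!z->registered)
		return -ENXIO;
	z->openers++;
	return 0;
}

static void toy_release(struct toy_zram *z)
{
	assert(z->openers > 0);
	z->openers--;
}

/* Models calling del_gendisk first during teardown. */
static void toy_del_disk(struct toy_zram *z)
{
	z->registered = false;
}

/* Reset only an unopened device; return len on success like a sysfs store. */
static long toy_reset_store(struct toy_zram *z, unsigned short do_reset,
			    size_t len)
{
	int ret = 0;

	if (!do_reset)
		return -EINVAL;
	if (z->openers)
		ret = -EBUSY;
	else
		z->disksize = 0;
	return ret ? ret : (long)len;
}
```

A caller would open the toy device, see the reset refused with -EBUSY, release
it, and then see the reset succeed; after toy_del_disk() every further
toy_open() fails.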

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 61/78] zram: do not call set_blocksize
  2020-11-16 14:57 ` [PATCH 61/78] zram: do not call set_blocksize Christoph Hellwig
@ 2020-11-20  7:38   ` Hannes Reinecke
  2020-11-26  1:16   ` Minchan Kim
  1 sibling, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:38 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> set_blocksize is used by file systems to use their preferred buffer cache
> block size.  Block drivers should not set it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/block/zram/zram_drv.c | 11 +----------
>   drivers/block/zram/zram_drv.h |  1 -
>   2 files changed, 1 insertion(+), 11 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 62/78] loop: do not call set_blocksize
  2020-11-16 14:57 ` [PATCH 62/78] loop: " Christoph Hellwig
@ 2020-11-20  7:38   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:38 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> set_blocksize is used by file systems to use their preferred buffer cache
> block size.  Block drivers should not set it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/block/loop.c | 3 ---
>   1 file changed, 3 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 64/78] dm: simplify flush_bio initialization in __send_empty_flush
  2020-11-16 14:57 ` [PATCH 64/78] dm: simplify flush_bio initialization in __send_empty_flush Christoph Hellwig
@ 2020-11-20  7:41   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:41 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> We don't really need the struct block_device to initialize a bio.  So
> switch from using bio_set_dev to manually setting up bi_disk (bi_partno
> will always be zero and has been cleared by bio_init already).
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/md/dm.c | 12 +++---------
>   1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 54739f1b579bc8..6d7eb72d41f9ea 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1422,18 +1422,12 @@ static int __send_empty_flush(struct clone_info *ci)
>   	 */
>   	bio_init(&flush_bio, NULL, 0);
>   	flush_bio.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC;
> +	flush_bio.bi_disk = ci->io->md->disk;
> +	bio_associate_blkg(&flush_bio);
> +
>   	ci->bio = &flush_bio;
>   	ci->sector_count = 0;
>   
> -	/*
> -	 * Empty flush uses a statically initialized bio, as the base for
> -	 * cloning.  However, blkg association requires that a bdev is
> -	 * associated with a gendisk, which doesn't happen until the bdev is
> -	 * opened.  So, blkg association is done at issue time of the flush
> -	 * rather than when the device is created in alloc_dev().
> -	 */
> -	bio_set_dev(ci->bio, ci->io->md->bdev);
> -
>   	BUG_ON(bio_has_data(ci->bio));
>   	while ((ti = dm_table_get_target(ci->map, target_nr++)))
>   		__send_duplicate_bios(ci, ti, ti->num_flush_bios, NULL);
> 
Ah, thought as much. I've stumbled across this while debugging 
blk-interposer.

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 65/78] dm: remove the block_device reference in struct mapped_device
  2020-11-16 14:57 ` [PATCH 65/78] dm: remove the block_device reference in struct mapped_device Christoph Hellwig
@ 2020-11-20  7:43   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:43 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Get rid of the long-lasting struct block_device reference in
> struct mapped_device.  The only remaining user is the freeze code,
> where we can trivially look up the block device at freeze time
> and release the reference at thaw time.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/md/dm-core.h |  2 --
>   drivers/md/dm.c      | 22 +++++++++++-----------
>   2 files changed, 11 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
> index d522093cb39dda..b1b400ed76fe90 100644
> --- a/drivers/md/dm-core.h
> +++ b/drivers/md/dm-core.h
> @@ -107,8 +107,6 @@ struct mapped_device {
>   	/* kobject and completion */
>   	struct dm_kobject_holder kobj_holder;
>   
> -	struct block_device *bdev;
> -
>   	struct dm_stats stats;
>   
>   	/* for blk-mq request-based DM support */
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 6d7eb72d41f9ea..c789ffea2badde 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1744,11 +1744,6 @@ static void cleanup_mapped_device(struct mapped_device *md)
>   
>   	cleanup_srcu_struct(&md->io_barrier);
>   
> -	if (md->bdev) {
> -		bdput(md->bdev);
> -		md->bdev = NULL;
> -	}
> -
>   	mutex_destroy(&md->suspend_lock);
>   	mutex_destroy(&md->type_lock);
>   	mutex_destroy(&md->table_devices_lock);
> @@ -1840,10 +1835,6 @@ static struct mapped_device *alloc_dev(int minor)
>   	if (!md->wq)
>   		goto bad;
>   
> -	md->bdev = bdget_disk(md->disk, 0);
> -	if (!md->bdev)
> -		goto bad;
> -
>   	dm_stats_init(&md->stats);
>   
>   	/* Populate the mapping, nobody knows we exist yet */
> @@ -2384,12 +2375,17 @@ struct dm_table *dm_swap_table(struct mapped_device *md, struct dm_table *table)
>    */
>   static int lock_fs(struct mapped_device *md)
>   {
> +	struct block_device *bdev;
>   	int r;
>   
>   	WARN_ON(md->frozen_sb);
>   
> -	md->frozen_sb = freeze_bdev(md->bdev);
> +	bdev = bdget_disk(md->disk, 0);
> +	if (!bdev)
> +		return -ENOMEM;
> +	md->frozen_sb = freeze_bdev(bdev);
>   	if (IS_ERR(md->frozen_sb)) {
> +		bdput(bdev);
>   		r = PTR_ERR(md->frozen_sb);
>   		md->frozen_sb = NULL;
>   		return r;
> @@ -2402,10 +2398,14 @@ static int lock_fs(struct mapped_device *md)
>   
>   static void unlock_fs(struct mapped_device *md)
>   {
> +	struct block_device *bdev;
> +
>   	if (!test_bit(DMF_FROZEN, &md->flags))
>   		return;
>   
> -	thaw_bdev(md->bdev, md->frozen_sb);
> +	bdev = md->frozen_sb->s_bdev;
> +	thaw_bdev(bdev, md->frozen_sb);
> +	bdput(bdev);
>   	md->frozen_sb = NULL;
>   	clear_bit(DMF_FROZEN, &md->flags);
>   }
> 
Yay. Just what I need for the blk-interposer code, where the ->bdev
pointer is really getting in the way.

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
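
The reference lifetime described in this patch (take the block device
reference at freeze time, drop it at thaw time) can be modelled in a few lines
of userspace C.  This is only a sketch under stated assumptions: the toy_*
types and functions are hypothetical stand-ins for bdget_disk(), bdput(),
lock_fs() and unlock_fs(), and no real freezing takes place.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel objects. */
struct toy_bdev { int refcount; };
struct toy_sb { struct toy_bdev *s_bdev; };

struct toy_md {
	struct toy_bdev whole;		/* block device of the disk */
	struct toy_sb sb;		/* superblock "mounted" on it */
	struct toy_sb *frozen_sb;	/* non-NULL only while frozen */
};

static struct toy_bdev *toy_bdget(struct toy_md *md)
{
	md->whole.refcount++;
	return &md->whole;
}

static void toy_bdput(struct toy_bdev *bdev)
{
	assert(bdev->refcount > 0);
	bdev->refcount--;
}

/* Freeze: look the block device up now and keep the reference. */
static int toy_lock_fs(struct toy_md *md)
{
	struct toy_bdev *bdev;

	if (md->frozen_sb)
		return -EBUSY;
	bdev = toy_bdget(md);
	md->sb.s_bdev = bdev;
	md->frozen_sb = &md->sb;
	return 0;
}

/* Thaw: find the device through the frozen superblock, then drop it. */
static void toy_unlock_fs(struct toy_md *md)
{
	struct toy_bdev *bdev;

	if (!md->frozen_sb)
		return;
	bdev = md->frozen_sb->s_bdev;
	md->frozen_sb = NULL;
	toy_bdput(bdev);
}
```

In this model the reference count is nonzero only between toy_lock_fs() and
toy_unlock_fs(), which is the point of dropping the long-lived pointer.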

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 66/78] block: keep a block_device reference for each hd_struct
  2020-11-16 14:57 ` [PATCH 66/78] block: keep a block_device reference for each hd_struct Christoph Hellwig
@ 2020-11-20  7:50   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:50 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> To simplify block device lookup and a few other upcoming areas, make sure
> that we always have a struct block_device available for each disk and
> each partition.  The only downside of this is that each device and
> partition uses a little more memory.  The upside will be that a lot of
> code can be simplified.
> 
> With that, all we need to do to look up the block device is look up the
> inode and do a few sanity checks on the gendisk, instead of the separate
> lookup for the gendisk.
> 
> As part of the change switch bdget() to only find existing block devices,
> given that we know that the block_device structure must be allocated at
> probe / partition scan time.
> 
> blk-cgroup needed a bit of special treatment as the only place that
> wanted to look up a gendisk outside of the normal blkdev_get path.  It is
> switched to look up using the block device hash now that this is the
> primary lookup path.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/blk-cgroup.c         |  42 ++++-----
>   block/blk-iocost.c         |  36 +++----
>   block/blk.h                |   1 -
>   block/genhd.c              | 188 +++----------------------------------
>   block/partitions/core.c    |  28 +++---
>   fs/block_dev.c             | 133 +++++++++++++++-----------
>   include/linux/blk-cgroup.h |   4 +-
>   include/linux/blkdev.h     |   3 +
>   include/linux/genhd.h      |   4 +-
>   9 files changed, 153 insertions(+), 286 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 67/78] block: simplify the block device claiming interface
  2020-11-16 14:57 ` [PATCH 67/78] block: simplify the block device claiming interface Christoph Hellwig
@ 2020-11-20  7:51   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:51 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Stop passing the whole device as a separate argument given that it
> can be trivially deduced.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/block/loop.c   | 12 +++-----
>   fs/block_dev.c         | 69 +++++++++++++++++++-----------------------
>   include/linux/blkdev.h |  6 ++--
>   3 files changed, 38 insertions(+), 49 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 68/78] block: remove ->bd_contains
  2020-11-16 14:57 ` [PATCH 68/78] block: remove ->bd_contains Christoph Hellwig
@ 2020-11-20  7:52   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:52 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:57 PM, Christoph Hellwig wrote:
> Now that each gendisk has a reference to the block_device referencing
> it, we can just use that everywhere and get rid of ->bd_contains.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/scsi/scsicam.c    |  2 +-
>   fs/block_dev.c            | 50 +++++++++++++--------------------------
>   include/linux/blk_types.h |  4 +++-
>   3 files changed, 20 insertions(+), 36 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
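
As an illustration of why a per-partition "containing device" pointer becomes
redundant, here is a small userspace sketch.  The struct layout and the names
toy_bdev_whole() and toy_bdev_is_partition() are assumptions made for this
example and are not taken from the patch; the only idea borrowed from it is
that the disk itself knows its whole-device block device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_bdev;

/* Hypothetical disk object that points at its whole-device bdev. */
struct toy_disk {
	struct toy_bdev *whole;
};

/* Hypothetical block device; partitions and the whole device share it. */
struct toy_bdev {
	struct toy_disk *disk;
	int partno;		/* 0 for the whole device */
};

/* Derive the whole device from any bdev via the disk. */
static struct toy_bdev *toy_bdev_whole(const struct toy_bdev *bdev)
{
	assert(bdev->disk && bdev->disk->whole);
	return bdev->disk->whole;
}

static bool toy_bdev_is_partition(const struct toy_bdev *bdev)
{
	return toy_bdev_whole(bdev) != bdev;
}
```

With this shape, nothing has to be stored per partition to answer "which
device contains me"; the answer is one pointer hop through the disk.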

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 69/78] block: remove the nr_sects field in struct hd_struct
  2020-11-16 14:58 ` [PATCH 69/78] block: remove the nr_sects field in struct hd_struct Christoph Hellwig
@ 2020-11-20  7:55   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:55 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:58 PM, Christoph Hellwig wrote:
> Now that the hd_struct always has a block device attached to it, there is
> no need for having two size fields that can just get out of sync.
> 
> Additionally, the field in hd_struct did not use proper serialization,
> possibly allowing for torn writes.  By only using the block_device field
> this problem also gets fixed.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/bio.c                        |  2 +-
>   block/blk-core.c                   |  2 +-
>   block/blk.h                        | 53 ----------------------
>   block/genhd.c                      | 34 +++++++-------
>   block/partitions/core.c            | 17 ++++---
>   drivers/block/loop.c               |  1 -
>   drivers/block/nbd.c                |  2 +-
>   drivers/block/xen-blkback/common.h |  4 +-
>   drivers/md/bcache/super.c          |  2 +-
>   drivers/s390/block/dasd_ioctl.c    |  4 +-
>   drivers/target/target_core_pscsi.c |  7 +--
>   fs/block_dev.c                     | 73 +-----------------------------
>   fs/f2fs/super.c                    |  2 +-
>   fs/pstore/blk.c                    |  2 +-
>   include/linux/genhd.h              | 29 +++---------
>   kernel/trace/blktrace.c            |  2 +-
>   16 files changed, 47 insertions(+), 189 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
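
A torn write, as mentioned in the commit message, can happen when a 64-bit
value is stored as two 32-bit halves and a reader observes one new half and
one old half.  One common way to serialize such accesses is a sequence
counter.  The sketch below shows that technique in userspace C11; it is not
the kernel's implementation, struct toy_size and its helpers are hypothetical,
and it assumes a single writer (or writers serialized by some other lock).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* A 64-bit size kept as two halves, guarded by a sequence counter. */
struct toy_size {
	_Atomic unsigned int seq;	/* odd while a write is in progress */
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

/* Single writer assumed: bump seq to odd, store both halves, bump to even. */
static void toy_size_write(struct toy_size *s, uint64_t v)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	assert((seq & 1) == 0);
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&s->hi, (uint32_t)(v >> 32), memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Readers retry until they see the same even sequence before and after. */
static uint64_t toy_size_read(struct toy_size *s)
{
	unsigned int s1, s2;
	uint32_t lo, hi;

	do {
		s1 = atomic_load_explicit(&s->seq, memory_order_acquire);
		lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((s1 & 1) || s1 != s2);

	return ((uint64_t)hi << 32) | lo;
}
```

A reader that races with the writer simply loops once more instead of
returning a mixed value, which is the property an unserialized two-half store
lacks.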

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 70/78] block: replace bd_mutex with a per-gendisk mutex
  2020-11-16 14:58 ` [PATCH 70/78] block: replace bd_mutex with a per-gendisk mutex Christoph Hellwig
@ 2020-11-20  7:58   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:58 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:58 PM, Christoph Hellwig wrote:
> bd_mutex is primarily used for synchronizing the block device open and
> release path, which recurses from partitions to the whole disk device.
> The fact that we have two locks makes life unnecessarily complex due
> to lock order constraints.  Replace the two levels of locking with a
> single mutex in the gendisk structure.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/genhd.c                   |  7 ++--
>   block/ioctl.c                   |  4 +-
>   block/partitions/core.c         | 22 +++++-----
>   drivers/block/loop.c            | 14 +++----
>   drivers/block/xen-blkfront.c    |  8 ++--
>   drivers/block/zram/zram_drv.c   |  4 +-
>   drivers/block/zram/zram_drv.h   |  2 +-
>   drivers/md/md.h                 |  7 +---
>   drivers/s390/block/dasd_genhd.c |  8 ++--
>   drivers/scsi/sd.c               |  4 +-
>   fs/block_dev.c                  | 71 +++++++++++++++++----------------
>   fs/btrfs/volumes.c              |  2 +-
>   fs/super.c                      |  8 ++--
>   include/linux/blk_types.h       |  1 -
>   include/linux/genhd.h           |  1 +
>   15 files changed, 80 insertions(+), 83 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 71/78] block: add a bdev_kobj helper
  2020-11-16 14:58 ` [PATCH 71/78] block: add a bdev_kobj helper Christoph Hellwig
@ 2020-11-20  7:59   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:59 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:58 PM, Christoph Hellwig wrote:
> Add a little helper to find the kobject for a struct block_device.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/md/bcache/super.c |  7 ++-----
>   drivers/md/md.c           |  4 +---
>   fs/btrfs/sysfs.c          | 15 +++------------
>   include/linux/blk_types.h |  3 +++
>   4 files changed, 9 insertions(+), 20 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 72/78] block: use disk_part_iter_exit in disk_part_iter_next
  2020-11-16 14:58 ` [PATCH 72/78] block: use disk_part_iter_exit in disk_part_iter_next Christoph Hellwig
@ 2020-11-20  7:59   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  7:59 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:58 PM, Christoph Hellwig wrote:
> Call disk_part_iter_exit in disk_part_iter_next instead of duplicating
> the functionality.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/genhd.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index 999f7142b04e7d..56bc37e98ed852 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -230,8 +230,7 @@ struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter)
>   	int inc, end;
>   
>   	/* put the last partition */
> -	disk_put_part(piter->part);
> -	piter->part = NULL;
> +	disk_part_iter_exit(piter);
>   
>   	/* get part_tbl */
>   	rcu_read_lock();
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
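
The change quoted above replaces two open-coded lines with a call to the
existing exit helper.  A userspace model of that pattern might look like the
following; the toy_* names are hypothetical, reference counting is reduced to
a plain integer, and the RCU-protected partition table of the real iterator is
not modelled.

```c
#include <assert.h>
#include <stddef.h>

struct toy_part { int refs; };

struct toy_part_iter {
	struct toy_part *parts;
	size_t nr, idx;
	struct toy_part *part;	/* currently held partition, or NULL */
};

static void toy_part_iter_init(struct toy_part_iter *it,
			       struct toy_part *parts, size_t nr)
{
	it->parts = parts;
	it->nr = nr;
	it->idx = 0;
	it->part = NULL;
}

/* Drop the reference on the current partition, if any. */
static void toy_part_iter_exit(struct toy_part_iter *it)
{
	if (it->part) {
		assert(it->part->refs > 0);
		it->part->refs--;
	}
	it->part = NULL;
}

/* Advance: put the last partition via the exit helper, then take the next. */
static struct toy_part *toy_part_iter_next(struct toy_part_iter *it)
{
	toy_part_iter_exit(it);
	if (it->idx >= it->nr)
		return NULL;
	it->part = &it->parts[it->idx++];
	it->part->refs++;
	return it->part;
}
```

Because the "put and clear" logic lives in one place, at most one partition is
referenced at any time and the end-of-iteration case needs no special cleanup.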

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 73/78] block: use put_device in put_disk
  2020-11-16 14:58 ` [PATCH 73/78] block: use put_device in put_disk Christoph Hellwig
@ 2020-11-20  8:02   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  8:02 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:58 PM, Christoph Hellwig wrote:
> Use put_device to put the device instead of poking into the internals
> and using kobject_put.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/genhd.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index 56bc37e98ed852..f1e20ec1b62887 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -1659,7 +1659,7 @@ EXPORT_SYMBOL(__alloc_disk_node);
>   void put_disk(struct gendisk *disk)
>   {
>   	if (disk)
> -		kobject_put(&disk_to_dev(disk)->kobj);
> +		put_device(disk_to_dev(disk));
>   }
>   EXPORT_SYMBOL(put_disk);
>   
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
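
The layering argument in this patch (drop the device reference through the
device-level helper instead of reaching into the embedded kobject) can be
shown with a toy userspace model.  All toy_* names below are hypothetical
stand-ins, and the release callbacks that run when a real reference count
reaches zero are omitted.

```c
#include <assert.h>
#include <stddef.h>

struct toy_kobject { int refcount; };
struct toy_device { struct toy_kobject kobj; };
struct toy_gendisk { struct toy_device dev; };

static void toy_kobject_put(struct toy_kobject *kobj)
{
	assert(kobj->refcount > 0);
	kobj->refcount--;
}

/* Device-level helper: callers need not know about the embedded kobject. */
static void toy_put_device(struct toy_device *dev)
{
	if (dev)
		toy_kobject_put(&dev->kobj);
}

static struct toy_device *toy_disk_to_dev(struct toy_gendisk *disk)
{
	return &disk->dev;
}

/* Same effect as poking at ->kobj directly, but one layer up. */
static void toy_put_disk(struct toy_gendisk *disk)
{
	if (disk)
		toy_put_device(toy_disk_to_dev(disk));
}
```

The behaviour is unchanged; the benefit is that toy_put_disk() no longer
depends on how the device stores its reference count.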

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 74/78] block: merge struct block_device and struct hd_struct
  2020-11-16 14:58 ` [PATCH 74/78] block: merge struct block_device and struct hd_struct Christoph Hellwig
@ 2020-11-20  8:58   ` Hannes Reinecke
  0 siblings, 0 replies; 113+ messages in thread
From: Hannes Reinecke @ 2020-11-20  8:58 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Mike Snitzer, Song Liu, Martin K. Petersen,
	dm-devel, linux-block, drbd-dev, nbd, ceph-devel, xen-devel,
	linux-raid, linux-nvme, linux-scsi, linux-fsdevel

On 11/16/20 3:58 PM, Christoph Hellwig wrote:
> Instead of having two structures that represent each block device with
> different lifetime rules, merge them into a single one.  This also
> greatly simplifies the reference counting rules, as we can use the inode
> reference count as the main reference count for the new struct
> block_device, with the device model reference front ending it for device
> model interaction.  The percpu refcount in struct hd_struct is entirely
> gone given that struct block_device must be opened and thus valid for
> the duration of the I/O.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/bio.c                        |   6 +-
>   block/blk-cgroup.c                 |   9 +-
>   block/blk-core.c                   |  85 +++++-----
>   block/blk-flush.c                  |   2 +-
>   block/blk-lib.c                    |   2 +-
>   block/blk-merge.c                  |   6 +-
>   block/blk-mq.c                     |  11 +-
>   block/blk-mq.h                     |   5 +-
>   block/blk.h                        |  38 ++---
>   block/genhd.c                      | 242 +++++++++++------------------
>   block/ioctl.c                      |   4 +-
>   block/partitions/core.c            | 221 +++++++-------------------
>   drivers/block/drbd/drbd_receiver.c |   2 +-
>   drivers/block/drbd/drbd_worker.c   |   2 +-
>   drivers/block/zram/zram_drv.c      |   2 +-
>   drivers/md/bcache/request.c        |   4 +-
>   drivers/md/dm.c                    |   8 +-
>   drivers/md/md.c                    |   4 +-
>   drivers/nvme/target/admin-cmd.c    |  20 +--
>   drivers/s390/block/dasd.c          |   8 +-
>   fs/block_dev.c                     |  68 +++-----
>   fs/ext4/super.c                    |  18 +--
>   fs/ext4/sysfs.c                    |  10 +-
>   fs/f2fs/checkpoint.c               |   5 +-
>   fs/f2fs/f2fs.h                     |   2 +-
>   fs/f2fs/super.c                    |   6 +-
>   fs/f2fs/sysfs.c                    |   9 --
>   include/linux/blk_types.h          |  23 ++-
>   include/linux/blkdev.h             |  13 +-
>   include/linux/genhd.h              |  67 ++------
>   include/linux/part_stat.h          |  17 +-
>   init/do_mounts.c                   |  20 +--
>   kernel/trace/blktrace.c            |  54 ++-----
>   33 files changed, 351 insertions(+), 642 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 60/78] zram: remove the claim mechanism
  2020-11-16 14:57 ` [PATCH 60/78] zram: remove the claim mechanism Christoph Hellwig
  2020-11-20  7:37   ` Hannes Reinecke
@ 2020-11-26  1:11   ` Minchan Kim
  2020-11-26  9:59     ` Christoph Hellwig
  1 sibling, 1 reply; 113+ messages in thread
From: Minchan Kim @ 2020-11-26  1:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Mike Snitzer, Song Liu, Martin K. Petersen, dm-devel,
	linux-block, drbd-dev, nbd, ceph-devel, xen-devel, linux-raid,
	linux-nvme, linux-scsi, linux-fsdevel, Sergey Senozhatsky

On Mon, Nov 16, 2020 at 03:57:51PM +0100, Christoph Hellwig wrote:
> The zram claim mechanism was added to ensure no new opens come in
> during teardown.  But the proper way to achieve that is to call
> del_gendisk first, which takes care of all that.  Once del_gendisk
> is called in the right place, the reset side can also be simplified
> as no I/O can be outstanding on a block device that is not open.

It would be great if this simplifies the mess. I have a question;
please see below.

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/block/zram/zram_drv.c | 76 ++++++++++-------------------------
>  1 file changed, 21 insertions(+), 55 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 6d15d51cee2b7e..3641434a9b154d 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -1756,64 +1756,33 @@ static ssize_t disksize_store(struct device *dev,
>  static ssize_t reset_store(struct device *dev,
>  		struct device_attribute *attr, const char *buf, size_t len)
>  {
> -	int ret;
> -	unsigned short do_reset;
> -	struct zram *zram;
> +	struct zram *zram = dev_to_zram(dev);
>  	struct block_device *bdev;
> +	unsigned short do_reset;
> +	int ret = 0;
>  
>  	ret = kstrtou16(buf, 10, &do_reset);
>  	if (ret)
>  		return ret;
> -
>  	if (!do_reset)
>  		return -EINVAL;
>  
> -	zram = dev_to_zram(dev);
>  	bdev = bdget_disk(zram->disk, 0);
>  	if (!bdev)
>  		return -ENOMEM;
>  
>  	mutex_lock(&bdev->bd_mutex);
> -	/* Do not reset an active device or claimed device */
> -	if (bdev->bd_openers || zram->claim) {
> -		mutex_unlock(&bdev->bd_mutex);
> -		bdput(bdev);
> -		return -EBUSY;
> -	}
> -
> -	/* From now on, anyone can't open /dev/zram[0-9] */
> -	zram->claim = true;
> +	if (bdev->bd_openers)
> +		ret = -EBUSY;
> +	else
> +		zram_reset_device(zram);
>  	mutex_unlock(&bdev->bd_mutex);
> -
> -	/* Make sure all the pending I/O are finished */
> -	fsync_bdev(bdev);
> -	zram_reset_device(zram);
>  	bdput(bdev);
>  
> -	mutex_lock(&bdev->bd_mutex);
> -	zram->claim = false;
> -	mutex_unlock(&bdev->bd_mutex);
> -
> -	return len;
> -}
> -
> -static int zram_open(struct block_device *bdev, fmode_t mode)
> -{
> -	int ret = 0;
> -	struct zram *zram;
> -
> -	WARN_ON(!mutex_is_locked(&bdev->bd_mutex));
> -
> -	zram = bdev->bd_disk->private_data;
> -	/* zram was claimed to reset so open request fails */
> -	if (zram->claim)
> -		ret = -EBUSY;
> -
> -	return ret;
> +	return ret ? ret : len;
>  }
>  
>  static const struct block_device_operations zram_devops = {
> -	.open = zram_open,
>  	.submit_bio = zram_submit_bio,
>  	.swap_slot_free_notify = zram_slot_free_notify,
>  	.rw_page = zram_rw_page,
> @@ -1821,7 +1790,6 @@ static const struct block_device_operations zram_devops = {
>  };
>  
>  static const struct block_device_operations zram_wb_devops = {
> -	.open = zram_open,
>  	.submit_bio = zram_submit_bio,
>  	.swap_slot_free_notify = zram_slot_free_notify,
>  	.owner = THIS_MODULE
> @@ -1972,34 +1940,32 @@ static int zram_add(void)
>  	return ret;
>  }
>  
> -static int zram_remove(struct zram *zram)
> +static bool zram_busy(struct zram *zram)
>  {
>  	struct block_device *bdev;
> +	bool busy = false;
>  
>  	bdev = bdget_disk(zram->disk, 0);
> -	if (!bdev)
> -		return -ENOMEM;
> -
> -	mutex_lock(&bdev->bd_mutex);
> -	if (bdev->bd_openers || zram->claim) {
> -		mutex_unlock(&bdev->bd_mutex);
> +	if (bdev) {
> +		if (bdev->bd_openers)
> +			busy = true;
>  		bdput(bdev);
> -		return -EBUSY;
>  	}
>  
> -	zram->claim = true;
> -	mutex_unlock(&bdev->bd_mutex);
> +	return busy;
> +}
>  
> -	zram_debugfs_unregister(zram);
> +static int zram_remove(struct zram *zram)
> +{
> +	if (zram_busy(zram))
> +		return -EBUSY;
>  
> -	/* Make sure all the pending I/O are finished */
> -	fsync_bdev(bdev);
> +	del_gendisk(zram->disk);
> +	zram_debugfs_unregister(zram);
>  	zram_reset_device(zram);
> -	bdput(bdev);
>  
>  	pr_info("Removed device: %s\n", zram->disk->disk_name);
>  
> -	del_gendisk(zram->disk);
>  	blk_cleanup_queue(zram->disk->queue);
>  	put_disk(zram->disk);
>  	kfree(zram);
> -- 
> 2.29.2
> 

With this patch, how do we deal with the following race?

CPU 1                                     CPU 2

hot_remove_store
  zram_remove
    zram_busy
      return -EBUSY
                                         open /dev/zram0
    del_gendisk
    zram_reset and destroy
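
For illustration only, the interleaving above can be replayed in a small userspace model. This is a sketch, not kernel code: struct model_dev, model_open(), model_busy(), model_claim(), model_destroy() and replay_race() are hypothetical names, the two CPUs are replayed sequentially in the order shown in the diagram, and it assumes the busy check finds no openers before the open arrives. In the driver itself the check-and-claim step ran under bdev->bd_mutex; the model only shows why a check with no claim leaves a window.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a zram device; not kernel code. */
struct model_dev {
	int openers;     /* plays the role of bdev->bd_openers */
	bool claim;      /* plays the role of the removed zram->claim flag */
	bool destroyed;  /* set once teardown has run */
};

/* Open path: refused once the device has been claimed for teardown. */
static int model_open(struct model_dev *d)
{
	assert(d != NULL);
	if (d->claim)
		return -EBUSY;
	d->openers++;
	return 0;
}

/* Removal step 1 without a claim: only inspects the opener count. */
static bool model_busy(const struct model_dev *d)
{
	return d->openers > 0;
}

/* Removal step 1 with a claim: check and block new opens together. */
static int model_claim(struct model_dev *d)
{
	if (d->openers > 0 || d->claim)
		return -EBUSY;
	d->claim = true;
	return 0;
}

/* Removal step 2: tear the device down. */
static void model_destroy(struct model_dev *d)
{
	d->destroyed = true;
}

/*
 * Replay "check, then open on the other CPU, then destroy".
 * Returns true if an opener is left holding a destroyed device.
 */
bool replay_race(bool use_claim)
{
	struct model_dev d = { 0 };

	if (use_claim) {
		if (model_claim(&d))
			return false;   /* removal refused */
	} else if (model_busy(&d)) {
		return false;           /* removal refused */
	}

	model_open(&d);     /* CPU 2: open /dev/zram0 */
	model_destroy(&d);  /* CPU 1: del_gendisk, reset and destroy */

	return d.destroyed && d.openers > 0;
}
```

In this model replay_race(false) ends with an opener on a destroyed device, while replay_race(true) does not, because the late open is turned away with -EBUSY.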


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 61/78] zram: do not call set_blocksize
  2020-11-16 14:57 ` [PATCH 61/78] zram: do not call set_blocksize Christoph Hellwig
  2020-11-20  7:38   ` Hannes Reinecke
@ 2020-11-26  1:16   ` Minchan Kim
  1 sibling, 0 replies; 113+ messages in thread
From: Minchan Kim @ 2020-11-26  1:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Mike Snitzer, Song Liu, Martin K. Petersen, dm-devel,
	linux-block, drbd-dev, nbd, ceph-devel, xen-devel, linux-raid,
	linux-nvme, linux-scsi, linux-fsdevel, sergey.senozhatsky.work

On Mon, Nov 16, 2020 at 03:57:52PM +0100, Christoph Hellwig wrote:
> set_blocksize is used by file systems to use their preferred buffer cache
> block size.  Block drivers should not set it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Minchan Kim <minchan@kernel.org>

Thanks.

> ---
>  drivers/block/zram/zram_drv.c | 11 +----------
>  drivers/block/zram/zram_drv.h |  1 -
>  2 files changed, 1 insertion(+), 11 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 3641434a9b154d..d00b5761ec0b21 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -403,13 +403,10 @@ static void reset_bdev(struct zram *zram)
>  		return;
>  
>  	bdev = zram->bdev;
> -	if (zram->old_block_size)
> -		set_blocksize(bdev, zram->old_block_size);
>  	blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
>  	/* hope filp_close flush all of IO */
>  	filp_close(zram->backing_dev, NULL);
>  	zram->backing_dev = NULL;
> -	zram->old_block_size = 0;
>  	zram->bdev = NULL;
>  	zram->disk->fops = &zram_devops;
>  	kvfree(zram->bitmap);
> @@ -454,7 +451,7 @@ static ssize_t backing_dev_store(struct device *dev,
>  	struct file *backing_dev = NULL;
>  	struct inode *inode;
>  	struct address_space *mapping;
> -	unsigned int bitmap_sz, old_block_size = 0;
> +	unsigned int bitmap_sz;
>  	unsigned long nr_pages, *bitmap = NULL;
>  	struct block_device *bdev = NULL;
>  	int err;
> @@ -509,14 +506,8 @@ static ssize_t backing_dev_store(struct device *dev,
>  		goto out;
>  	}
>  
> -	old_block_size = block_size(bdev);
> -	err = set_blocksize(bdev, PAGE_SIZE);
> -	if (err)
> -		goto out;
> -
>  	reset_bdev(zram);
>  
> -	zram->old_block_size = old_block_size;
>  	zram->bdev = bdev;
>  	zram->backing_dev = backing_dev;
>  	zram->bitmap = bitmap;
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index f2fd46daa76045..712354a4207c77 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -118,7 +118,6 @@ struct zram {
>  	bool wb_limit_enable;
>  	u64 bd_wb_limit;
>  	struct block_device *bdev;
> -	unsigned int old_block_size;
>  	unsigned long *bitmap;
>  	unsigned long nr_pages;
>  #endif
> -- 
> 2.29.2
> 

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 60/78] zram: remove the claim mechanism
  2020-11-26  1:11   ` Minchan Kim
@ 2020-11-26  9:59     ` Christoph Hellwig
  0 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2020-11-26  9:59 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Christoph Hellwig, Jens Axboe, Justin Sanders, Josef Bacik,
	Ilya Dryomov, Jack Wang, Michael S. Tsirkin, Jason Wang,
	Paolo Bonzini, Stefan Hajnoczi, Konrad Rzeszutek Wilk,
	Roger Pau Monné,
	Mike Snitzer, Song Liu, Martin K. Petersen, dm-devel,
	linux-block, drbd-dev, nbd, ceph-devel, xen-devel, linux-raid,
	linux-nvme, linux-scsi, linux-fsdevel, Sergey Senozhatsky

On Wed, Nov 25, 2020 at 05:11:07PM -0800, Minchan Kim wrote:
> With this patch, how do we deal with the following race?
> 
> CPU 1                                     CPU 2
> 
> hot_remove_store
>   zram_remove
>     zram_busy
>       return -EBUSY
>                                          open /dev/zram0
>     del_gendisk
>     zram_reset and destroy

Yeah, it looks like zram does not really handle hot unplugging, unlike
other drivers.  So I've dropped this one for now.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 12/78] dm: use set_capacity_and_notify
  2020-11-16 14:57 ` [PATCH 12/78] dm: use set_capacity_and_notify Christoph Hellwig
@ 2021-02-12 15:45   ` Mike Snitzer
  2021-02-16 11:46     ` Christoph Hellwig
  0 siblings, 1 reply; 113+ messages in thread
From: Mike Snitzer @ 2021-02-12 15:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Justin Sanders, Josef Bacik, Ilya Dryomov, Jack Wang,
	Michael S. Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Minchan Kim, Song Liu, Martin K. Petersen,
	device-mapper development, linux-block, drbd-dev, nbd,
	ceph-devel, xen-devel, linux-raid, linux-nvme, linux-scsi,
	linux-fsdevel, Hannes Reinecke

On Mon, Nov 16, 2020 at 10:05 AM Christoph Hellwig <hch@lst.de> wrote:
>
> Use set_capacity_and_notify to set the size of both the disk and block
> device.  This also gets the uevent notifications for the resize for free.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/md/dm.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index c18fc25485186d..62ad44925e73ec 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1971,8 +1971,7 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
>         if (size != dm_get_size(md))
>                 memset(&md->geometry, 0, sizeof(md->geometry));
>
> -       set_capacity(md->disk, size);
> -       bd_set_nr_sectors(md->bdev, size);
> +       set_capacity_and_notify(md->disk, size);
>
>         dm_table_event_callback(t, event_callback, md);
>

I have not yet pinned down _why_ DM is calling set_capacity_and_notify() with
a size of 0 but, when running various DM regression tests, I'm seeing
a lot of noise like:

[  689.240037] dm-2: detected capacity change from 2097152 to 0

Is this pr_info really useful?  Should it be moved below the
if (!capacity || !size) check so that it only prints if a uevent is sent?

Mike
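
To make the two placements of the message concrete, here is a userspace sketch, not the kernel function. struct model_disk, struct notify_result and model_set_capacity_and_notify() are hypothetical names, and the model assumes only what the message above describes: the log line currently precedes the (!capacity || !size) check, and no uevent is sent when either the old or the new capacity is zero. Any other conditions the real set_capacity_and_notify() applies are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a gendisk; capacity is in sectors. */
struct model_disk {
	const char *name;
	long long capacity;
};

/* What one call did: whether it logged and whether it "sent" a uevent. */
struct notify_result {
	bool logged;
	bool uevent;
};

static void model_log(const struct model_disk *disk, long long from,
		      long long to)
{
	printf("%s: detected capacity change from %lld to %lld\n",
	       disk->name, from, to);
}

/*
 * log_after_zero_check == false: log before the zero check (the behaviour
 * that produces the noise quoted above).
 * log_after_zero_check == true: log only on the path that sends a uevent
 * (the placement suggested above).
 */
struct notify_result model_set_capacity_and_notify(struct model_disk *disk,
		long long size, bool log_after_zero_check)
{
	struct notify_result res = { false, false };
	long long capacity;

	assert(disk != NULL);
	capacity = disk->capacity;
	disk->capacity = size;

	if (size == capacity)
		return res;             /* nothing changed */

	if (!log_after_zero_check) {
		model_log(disk, capacity, size);
		res.logged = true;
	}

	if (!capacity || !size)
		return res;             /* no uevent to or from an empty device */

	if (log_after_zero_check) {
		model_log(disk, capacity, size);
		res.logged = true;
	}
	res.uevent = true;
	return res;
}
```

With a disk named "dm-2" going from 2097152 to 0 sectors, the first placement logs without a uevent and the second stays silent; a resize between two non-zero sizes logs and notifies under either placement.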

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 12/78] dm: use set_capacity_and_notify
  2021-02-12 15:45   ` Mike Snitzer
@ 2021-02-16 11:46     ` Christoph Hellwig
  0 siblings, 0 replies; 113+ messages in thread
From: Christoph Hellwig @ 2021-02-16 11:46 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Christoph Hellwig, Jens Axboe, Justin Sanders, Josef Bacik,
	Ilya Dryomov, Jack Wang, Michael S. Tsirkin, Jason Wang,
	Paolo Bonzini, Stefan Hajnoczi, Konrad Rzeszutek Wilk,
	Roger Pau Monné,
	Minchan Kim, Song Liu, Martin K. Petersen,
	device-mapper development, linux-block, drbd-dev, nbd,
	ceph-devel, xen-devel, linux-raid, linux-nvme, linux-scsi,
	linux-fsdevel, Hannes Reinecke

On Fri, Feb 12, 2021 at 10:45:32AM -0500, Mike Snitzer wrote:
> On Mon, Nov 16, 2020 at 10:05 AM Christoph Hellwig <hch@lst.de> wrote:
> >
> > Use set_capacity_and_notify to set the size of both the disk and block
> > device.  This also gets the uevent notifications for the resize for free.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > Reviewed-by: Hannes Reinecke <hare@suse.de>
> > ---
> >  drivers/md/dm.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index c18fc25485186d..62ad44925e73ec 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1971,8 +1971,7 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
> >         if (size != dm_get_size(md))
> >                 memset(&md->geometry, 0, sizeof(md->geometry));
> >
> > -       set_capacity(md->disk, size);
> > -       bd_set_nr_sectors(md->bdev, size);
> > +       set_capacity_and_notify(md->disk, size);
> >
> >         dm_table_event_callback(t, event_callback, md);
> >
> 
> I have not yet pinned down _why_ DM is calling set_capacity_and_notify() with
> a size of 0 but, when running various DM regression tests, I'm seeing
> a lot of noise like:
> 
> [  689.240037] dm-2: detected capacity change from 2097152 to 0
> 
> Is this pr_info really useful?  Should it be moved below the
> if (!capacity || !size) check so that it only prints if a uevent is sent?

In general I suspect such a size change might be interesting to users
if it e.g. comes from a remote event.  So I'd be curious why this happens
with DM, and if we can detect some higher level gendisk state to suppress
it if it is indeed spurious.

^ permalink raw reply	[flat|nested] 113+ messages in thread


Thread overview: 113+ messages (download: mbox.gz / follow: Atom feed)
2020-11-16 14:56 cleanup updating the size of block devices v3 Christoph Hellwig
2020-11-16 14:56 ` [PATCH 01/78] block: remove the call to __invalidate_device in check_disk_size_change Christoph Hellwig
2020-11-16 14:56 ` [PATCH 02/78] loop: let set_capacity_revalidate_and_notify update the bdev size Christoph Hellwig
2020-11-16 14:56 ` [PATCH 03/78] nvme: " Christoph Hellwig
2020-11-16 14:56 ` [PATCH 04/78] sd: update the bdev size in sd_revalidate_disk Christoph Hellwig
2020-11-16 14:56 ` [PATCH 05/78] block: remove the update_bdev parameter to set_capacity_revalidate_and_notify Christoph Hellwig
2020-11-16 14:56 ` [PATCH 06/78] nbd: remove the call to set_blocksize Christoph Hellwig
2020-11-16 14:56 ` [PATCH 07/78] nbd: move the task_recv check into nbd_size_update Christoph Hellwig
2020-11-16 14:56 ` [PATCH 08/78] nbd: refactor size updates Christoph Hellwig
2020-11-16 14:57 ` [PATCH 09/78] nbd: validate the block size in nbd_set_size Christoph Hellwig
2020-11-16 14:57 ` [PATCH 10/78] nbd: use set_capacity_and_notify Christoph Hellwig
2020-11-16 14:57 ` [PATCH 11/78] aoe: don't call set_capacity from irq context Christoph Hellwig
2020-11-16 14:57 ` [PATCH 12/78] dm: use set_capacity_and_notify Christoph Hellwig
2021-02-12 15:45   ` Mike Snitzer
2021-02-16 11:46     ` Christoph Hellwig
2020-11-16 14:57 ` [PATCH 13/78] pktcdvd: " Christoph Hellwig
2020-11-16 14:57 ` [PATCH 14/78] nvme: use set_capacity_and_notify in nvme_set_queue_dying Christoph Hellwig
2020-11-16 14:57 ` [PATCH 15/78] drbd: use set_capacity_and_notify Christoph Hellwig
2020-11-16 14:57 ` [PATCH 16/78] rbd: " Christoph Hellwig
2020-11-16 14:57 ` [PATCH 17/78] rnbd: " Christoph Hellwig
2020-11-16 14:57 ` [PATCH 18/78] zram: " Christoph Hellwig
2020-11-16 14:57 ` [PATCH 19/78] dm-raid: " Christoph Hellwig
2020-11-16 14:57 ` [PATCH 20/78] md: " Christoph Hellwig
2020-11-16 14:57 ` [PATCH 21/78] md: remove a spurious call to revalidate_disk_size in update_size Christoph Hellwig
2020-11-16 14:57 ` [PATCH 22/78] virtio-blk: remove a spurious call to revalidate_disk_size Christoph Hellwig
2020-11-16 14:57 ` [PATCH 23/78] block: unexport revalidate_disk_size Christoph Hellwig
2020-11-16 14:57 ` [PATCH 24/78] mtd_blkdevs: don't override BLKFLSBUF Christoph Hellwig
2020-11-16 14:57 ` [PATCH 25/78] block: don't call into the driver for BLKFLSBUF Christoph Hellwig
2020-11-16 14:57 ` [PATCH 26/78] block: add a new set_read_only method Christoph Hellwig
2020-11-16 14:57 ` [PATCH 27/78] rbd: implement ->set_read_only to hook into BLKROSET processing Christoph Hellwig
2020-11-16 14:57 ` [PATCH 28/78] md: " Christoph Hellwig
2020-11-16 17:37   ` Song Liu
2020-11-16 14:57 ` [PATCH 29/78] dasd: " Christoph Hellwig
2020-11-16 14:57 ` [PATCH 30/78] block: don't call into the driver for BLKROSET Christoph Hellwig
2020-11-20  7:19   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 31/78] loop: use set_disk_ro Christoph Hellwig
2020-11-20  7:20   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 32/78] block: remove set_device_ro Christoph Hellwig
2020-11-20  7:20   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 33/78] block: remove __blkdev_driver_ioctl Christoph Hellwig
2020-11-20  7:22   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 34/78] block: propagate BLKROSET to all partitions Christoph Hellwig
2020-11-20  7:23   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 35/78] block: cleanup del_gendisk a bit Christoph Hellwig
2020-11-16 14:57 ` [PATCH 36/78] block: open code kobj_map into in block/genhd.c Christoph Hellwig
2020-11-16 14:57 ` [PATCH 37/78] block: split block_class_lock Christoph Hellwig
2020-11-16 14:57 ` [PATCH 38/78] block: rework requesting modules for unclaimed devices Christoph Hellwig
2020-11-16 14:57 ` [PATCH 39/78] block: add an optional probe callback to major_names Christoph Hellwig
2020-11-16 14:57 ` [PATCH 40/78] ide: remove ide_{,un}register_region Christoph Hellwig
2020-11-16 14:57 ` [PATCH 41/78] swim: don't call blk_register_region Christoph Hellwig
2020-11-16 14:57 ` [PATCH 42/78] sd: use __register_blkdev to avoid a modprobe for an unregistered dev_t Christoph Hellwig
2020-11-16 14:57 ` [PATCH 43/78] brd: use __register_blkdev to allocate devices on demand Christoph Hellwig
2020-11-16 14:57 ` [PATCH 44/78] loop: " Christoph Hellwig
2020-11-16 14:57 ` [PATCH 45/78] md: " Christoph Hellwig
2020-11-16 14:57 ` [PATCH 46/78] ide: switch to __register_blkdev for command set probing Christoph Hellwig
2020-11-16 14:57 ` [PATCH 47/78] floppy: use a separate gendisk for each media format Christoph Hellwig
2020-11-16 14:57 ` [PATCH 48/78] amiflop: use separate gendisks for Amiga vs MS-DOS mode Christoph Hellwig
2020-11-16 14:57 ` [PATCH 49/78] ataflop: use a separate gendisk for each media format Christoph Hellwig
2020-11-16 14:57 ` [PATCH 50/78] z2ram: reindent Christoph Hellwig
2020-11-16 14:57 ` [PATCH 51/78] z2ram: use separate gendisk for the different modes Christoph Hellwig
2020-11-16 14:57 ` [PATCH 52/78] block: switch gendisk lookup to a simple xarray Christoph Hellwig
2020-11-16 14:57 ` [PATCH 53/78] blk-cgroup: fix a hd_struct leak in blkcg_fill_root_iostats Christoph Hellwig
2020-11-20  7:25   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 54/78] block: remove a duplicate __disk_get_part prototype Christoph Hellwig
2020-11-20  7:25   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 55/78] block: change the hash used for looking up block devices Christoph Hellwig
2020-11-20  7:26   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 56/78] init: refactor name_to_dev_t Christoph Hellwig
2020-11-20  7:31   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 57/78] init: refactor devt_from_partuuid Christoph Hellwig
2020-11-20  7:33   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 58/78] init: cleanup match_dev_by_uuid and match_dev_by_label Christoph Hellwig
2020-11-20  7:34   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 59/78] mtip32xx: remove the call to fsync_bdev on removal Christoph Hellwig
2020-11-20  7:35   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 60/78] zram: remove the claim mechanism Christoph Hellwig
2020-11-20  7:37   ` Hannes Reinecke
2020-11-26  1:11   ` Minchan Kim
2020-11-26  9:59     ` Christoph Hellwig
2020-11-16 14:57 ` [PATCH 61/78] zram: do not call set_blocksize Christoph Hellwig
2020-11-20  7:38   ` Hannes Reinecke
2020-11-26  1:16   ` Minchan Kim
2020-11-16 14:57 ` [PATCH 62/78] loop: " Christoph Hellwig
2020-11-20  7:38   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 63/78] bcache: remove a superflous lookup_bdev all Christoph Hellwig
2020-11-16 14:57 ` [PATCH 64/78] dm: simplify flush_bio initialization in __send_empty_flush Christoph Hellwig
2020-11-20  7:41   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 65/78] dm: remove the block_device reference in struct mapped_device Christoph Hellwig
2020-11-20  7:43   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 66/78] block: keep a block_device reference for each hd_struct Christoph Hellwig
2020-11-20  7:50   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 67/78] block: simplify the block device claiming interface Christoph Hellwig
2020-11-20  7:51   ` Hannes Reinecke
2020-11-16 14:57 ` [PATCH 68/78] block: remove ->bd_contains Christoph Hellwig
2020-11-20  7:52   ` Hannes Reinecke
2020-11-16 14:58 ` [PATCH 69/78] block: remove the nr_sects field in struct hd_struct Christoph Hellwig
2020-11-20  7:55   ` Hannes Reinecke
2020-11-16 14:58 ` [PATCH 70/78] block: replace bd_mutex with a per-gendisk mutex Christoph Hellwig
2020-11-20  7:58   ` Hannes Reinecke
2020-11-16 14:58 ` [PATCH 71/78] block: add a bdev_kobj helper Christoph Hellwig
2020-11-20  7:59   ` Hannes Reinecke
2020-11-16 14:58 ` [PATCH 72/78] block: use disk_part_iter_exit in disk_part_iter_next Christoph Hellwig
2020-11-20  7:59   ` Hannes Reinecke
2020-11-16 14:58 ` [PATCH 73/78] block: use put_device in put_disk Christoph Hellwig
2020-11-20  8:02   ` Hannes Reinecke
2020-11-16 14:58 ` [PATCH 74/78] block: merge struct block_device and struct hd_struct Christoph Hellwig
2020-11-20  8:58   ` Hannes Reinecke
2020-11-16 14:58 ` [PATCH 75/78] block: stop using bdget_disk for partition 0 Christoph Hellwig
2020-11-16 14:58 ` [PATCH 76/78] filemap: use ->f_mapping over ->i_mapping consistently Christoph Hellwig
2020-11-16 14:58 ` [PATCH 77/78] fs: simplify the get_super_thawed interface Christoph Hellwig
2020-11-16 14:58 ` [PATCH 78/78] block: remove i_bdev Christoph Hellwig
2020-11-16 15:05 ` cleanup updating the size of block devices v3 Christoph Hellwig
2020-11-16 15:40 ` Jens Axboe
