linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* remove get_super
@ 2023-08-11 10:08 Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 01/17] FOLD: reverts part of "fs: use the super_block as holder when mounting file systems" Christoph Hellwig
                   ` (19 more replies)
  0 siblings, 20 replies; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

Hi all,

this series against the VFS vfs.super branch finishes off the work to remove
get_super and move (almost) all upcalls to use the holder ops.

The first part is the missing btrfs bits so that all file systems use the
super_block as holder.

The second part is various block driver cleanups so that we use proper
interfaces instead of raw calls to __invalidate_device and fsync_bdev.

The last part than replaces __invalidate_device and fsync_bdev with upcalls
to the file system through the holder ops, and finally removes get_super.

It leaves user_get_super and get_active_super around.  The former is not
used for upcalls in the traditional sense, but for legacy UAPI that for
some weird reason take a dev_t argument (ustat) or a block device path
(quotactl).  get_active_super is only used for calling into the file system
on freeze and should get a similar treatment, but given that Darrick has
changes to that code queued up already this will be handled in the next
merge window.

A git tree is available here:

    git://git.infradead.org/users/hch/misc.git remove-get_super

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/remove-get_super

Diffstat:
 block/bdev.c              |   61 ++++++++++++++++++++-------------------------
 block/disk-events.c       |   23 ++++-------------
 block/genhd.c             |   45 +++++++++++++++++----------------
 block/ioctl.c             |    9 +++++-
 block/partitions/core.c   |    5 ---
 drivers/block/amiflop.c   |    1 
 drivers/block/floppy.c    |    2 -
 drivers/block/loop.c      |    6 ++--
 drivers/block/nbd.c       |    8 ++---
 drivers/s390/block/dasd.c |    7 +----
 fs/btrfs/disk-io.c        |    4 +-
 fs/btrfs/super.c          |   59 ++++++++++++++++++++++---------------------
 fs/btrfs/volumes.c        |   58 ++++++++++++++++++++++---------------------
 fs/btrfs/volumes.h        |    8 +++--
 fs/inode.c                |   16 +----------
 fs/internal.h             |    2 -
 fs/super.c                |   62 +++++++++++++++-------------------------------
 include/linux/blkdev.h    |   13 +++++----
 include/linux/fs.h        |    1 
 19 files changed, 175 insertions(+), 215 deletions(-)

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 01/17] FOLD: reverts part of "fs: use the super_block as holder when mounting file systems"
  2023-08-11 10:08 remove get_super Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 10:44   ` Christian Brauner
  2023-08-11 10:08 ` [PATCH 02/17] btrfs: always open the device read-only in btrfs_scan_one_device Christoph Hellwig
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

The btrfs hunk should be dropped because the prerequisite btrfs changes were
dropped from the branch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/super.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d58ace4c1d2962..f1dd172d8d5bd7 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -69,6 +69,8 @@ static const struct super_operations btrfs_super_ops;
  * requested by subvol=/path. That way the callchain is straightforward and we
  * don't have to play tricks with the mount options and recursive calls to
  * btrfs_mount.
+ *
+ * The new btrfs_root_fs_type also servers as a tag for the bdev_holder.
  */
 static struct file_system_type btrfs_fs_type;
 static struct file_system_type btrfs_root_fs_type;
@@ -1513,7 +1515,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 		snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
 		shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", fs_type->name,
 					s->s_id);
-		fs_info->bdev_holder = s;
+		btrfs_sb(s)->bdev_holder = fs_type;
 		error = btrfs_fill_super(s, fs_devices, data);
 	}
 	if (!error)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 02/17] btrfs: always open the device read-only in btrfs_scan_one_device
  2023-08-11 10:08 remove get_super Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 01/17] FOLD: reverts part of "fs: use the super_block as holder when mounting file systems" Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 12:00   ` Christian Brauner
  2023-08-11 10:08 ` [PATCH 03/17] btrfs: call btrfs_close_devices from ->kill_sb Christoph Hellwig
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

btrfs_scan_one_device opens the block device only to read the super
block.  Instead of passing a blk_mode_t argument to sometimes open
it for writing, just hard code BLK_OPEN_READ as it will never write
to the device or hand the block_device out to someone else.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/super.c   | 15 +++++++--------
 fs/btrfs/volumes.c |  4 ++--
 fs/btrfs/volumes.h |  2 +-
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f1dd172d8d5bd7..b318bddefd5236 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -849,7 +849,7 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
  * All other options will be parsed on much later in the mount process and
  * only when we need to allocate a new super block.
  */
-static int btrfs_parse_device_options(const char *options, blk_mode_t flags)
+static int btrfs_parse_device_options(const char *options)
 {
 	substring_t args[MAX_OPT_ARGS];
 	char *device_name, *opts, *orig, *p;
@@ -883,7 +883,7 @@ static int btrfs_parse_device_options(const char *options, blk_mode_t flags)
 				error = -ENOMEM;
 				goto out;
 			}
-			device = btrfs_scan_one_device(device_name, flags);
+			device = btrfs_scan_one_device(device_name);
 			kfree(device_name);
 			if (IS_ERR(device)) {
 				error = PTR_ERR(device);
@@ -1440,7 +1440,6 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 	struct btrfs_fs_devices *fs_devices = NULL;
 	struct btrfs_fs_info *fs_info = NULL;
 	void *new_sec_opts = NULL;
-	blk_mode_t mode = sb_open_mode(flags);
 	int error = 0;
 
 	if (data) {
@@ -1472,13 +1471,13 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 	}
 
 	mutex_lock(&uuid_mutex);
-	error = btrfs_parse_device_options(data, mode);
+	error = btrfs_parse_device_options(data);
 	if (error) {
 		mutex_unlock(&uuid_mutex);
 		goto error_fs_info;
 	}
 
-	device = btrfs_scan_one_device(device_name, mode);
+	device = btrfs_scan_one_device(device_name);
 	if (IS_ERR(device)) {
 		mutex_unlock(&uuid_mutex);
 		error = PTR_ERR(device);
@@ -1488,7 +1487,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 	fs_devices = device->fs_devices;
 	fs_info->fs_devices = fs_devices;
 
-	error = btrfs_open_devices(fs_devices, mode, fs_type);
+	error = btrfs_open_devices(fs_devices, sb_open_mode(flags), fs_type);
 	mutex_unlock(&uuid_mutex);
 	if (error)
 		goto error_fs_info;
@@ -2190,7 +2189,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 	switch (cmd) {
 	case BTRFS_IOC_SCAN_DEV:
 		mutex_lock(&uuid_mutex);
-		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
+		device = btrfs_scan_one_device(vol->name);
 		ret = PTR_ERR_OR_ZERO(device);
 		mutex_unlock(&uuid_mutex);
 		break;
@@ -2204,7 +2203,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 		break;
 	case BTRFS_IOC_DEVICES_READY:
 		mutex_lock(&uuid_mutex);
-		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
+		device = btrfs_scan_one_device(vol->name);
 		if (IS_ERR(device)) {
 			mutex_unlock(&uuid_mutex);
 			ret = PTR_ERR(device);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 73f9ea7672dbda..8246578c70f55b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1356,7 +1356,7 @@ int btrfs_forget_devices(dev_t devt)
  * and we are not allowed to call set_blocksize during the scan. The superblock
  * is read via pagecache
  */
-struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
+struct btrfs_device *btrfs_scan_one_device(const char *path)
 {
 	struct btrfs_super_block *disk_super;
 	bool new_device_added = false;
@@ -1384,7 +1384,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
 	 * values temporarily, as the device paths of the fsid are the only
 	 * required information for assembling the volume.
 	 */
-	bdev = blkdev_get_by_path(path, flags, NULL, NULL);
+	bdev = blkdev_get_by_path(path, BLK_OPEN_READ, NULL, NULL);
 	if (IS_ERR(bdev))
 		return ERR_CAST(bdev);
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index b8c51f16ba867f..824161c6dd063e 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -611,7 +611,7 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
 void btrfs_mapping_tree_free(struct extent_map_tree *tree);
 int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 		       blk_mode_t flags, void *holder);
-struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags);
+struct btrfs_device *btrfs_scan_one_device(const char *path);
 int btrfs_forget_devices(dev_t devt);
 void btrfs_close_devices(struct btrfs_fs_devices *fs_devices);
 void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 03/17] btrfs: call btrfs_close_devices from ->kill_sb
  2023-08-11 10:08 remove get_super Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 01/17] FOLD: reverts part of "fs: use the super_block as holder when mounting file systems" Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 02/17] btrfs: always open the device read-only in btrfs_scan_one_device Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 12:03   ` Christian Brauner
  2023-08-11 10:08 ` [PATCH 04/17] btrfs: split btrfs_fs_devices.opened Christoph Hellwig
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

blkdev_put must not be called under sb->s_umount to avoid a lock order
reversal with disk->open_mutex once call backs from block devices to
the file system using the holder ops are supported.  Move the call
to btrfs_close_devices into btrfs_free_fs_info so that it is closed
from ->kill_sb (which is also called from the mount failure handling
path unlike ->put_super) as well as when an fs_info is freed because
an existing superblock already exists.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/disk-io.c | 4 ++--
 fs/btrfs/super.c   | 7 ++-----
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7513388b0567be..3788b55638392e 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1243,6 +1243,8 @@ static void free_global_roots(struct btrfs_fs_info *fs_info)
 
 void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
 {
+	if (fs_info->fs_devices)
+		btrfs_close_devices(fs_info->fs_devices);
 	percpu_counter_destroy(&fs_info->dirty_metadata_bytes);
 	percpu_counter_destroy(&fs_info->delalloc_bytes);
 	percpu_counter_destroy(&fs_info->ordered_bytes);
@@ -3554,7 +3556,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 
 	iput(fs_info->btree_inode);
 fail:
-	btrfs_close_devices(fs_info->fs_devices);
 	ASSERT(ret < 0);
 	return ret;
 }
@@ -4408,7 +4409,6 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
 #endif
 
 	btrfs_mapping_tree_free(&fs_info->mapping_tree);
-	btrfs_close_devices(fs_info->fs_devices);
 }
 
 void btrfs_mark_buffer_dirty(struct extent_buffer *buf)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b318bddefd5236..2bbc041ac2e2c5 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1494,7 +1494,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 
 	if (!(flags & SB_RDONLY) && fs_devices->rw_devices == 0) {
 		error = -EACCES;
-		goto error_close_devices;
+		goto error_fs_info;
 	}
 
 	bdev = fs_devices->latest_dev->bdev;
@@ -1502,11 +1502,10 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 		 fs_info);
 	if (IS_ERR(s)) {
 		error = PTR_ERR(s);
-		goto error_close_devices;
+		goto error_fs_info;
 	}
 
 	if (s->s_root) {
-		btrfs_close_devices(fs_devices);
 		btrfs_free_fs_info(fs_info);
 		if ((flags ^ s->s_flags) & SB_RDONLY)
 			error = -EBUSY;
@@ -1527,8 +1526,6 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 
 	return dget(s->s_root);
 
-error_close_devices:
-	btrfs_close_devices(fs_devices);
 error_fs_info:
 	btrfs_free_fs_info(fs_info);
 error_sec_opts:
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 04/17] btrfs: split btrfs_fs_devices.opened
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (2 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 03/17] btrfs: call btrfs_close_devices from ->kill_sb Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 12:40   ` Christian Brauner
  2023-08-11 10:08 ` [PATCH 05/17] btrfs: open block devices after superblock creation Christoph Hellwig
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

The btrfs_fs_devices.opened member mixes an in use counter for the
fs_devices structure that prevents it from being garbage collected with
a flag if the underlying devices were actually opened.  This not only
makes the code hard to follow, but also prevents btrfs from switching
to opening the block device only after super block creation.  Split it
into an in_use counter and an is_open boolean flag instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/volumes.c | 49 ++++++++++++++++++++++++++--------------------
 fs/btrfs/volumes.h |  6 ++++--
 2 files changed, 32 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 8246578c70f55b..3ac1a3aa8939bc 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -405,7 +405,8 @@ static void free_fs_devices(struct btrfs_fs_devices *fs_devices)
 {
 	struct btrfs_device *device;
 
-	WARN_ON(fs_devices->opened);
+	WARN_ON_ONCE(fs_devices->in_use);
+	WARN_ON_ONCE(fs_devices->is_open);
 	while (!list_empty(&fs_devices->devices)) {
 		device = list_entry(fs_devices->devices.next,
 				    struct btrfs_device, dev_list);
@@ -578,7 +579,7 @@ static int btrfs_free_stale_devices(dev_t devt, struct btrfs_device *skip_device
 				continue;
 			if (devt && devt != device->devt)
 				continue;
-			if (fs_devices->opened) {
+			if (fs_devices->in_use) {
 				/* for an already deleted device return 0 */
 				if (devt && ret != 0)
 					ret = -EBUSY;
@@ -849,7 +850,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 	if (!device) {
 		unsigned int nofs_flag;
 
-		if (fs_devices->opened) {
+		if (fs_devices->in_use) {
 			btrfs_err(NULL,
 		"device %s belongs to fsid %pU, and the fs is already mounted",
 				  path, fs_devices->fsid);
@@ -913,7 +914,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 		 * tracking a problem where systems fail mount by subvolume id
 		 * when we reject replacement on a mounted FS.
 		 */
-		if (!fs_devices->opened && found_transid < device->generation) {
+		if (!fs_devices->in_use && found_transid < device->generation) {
 			/*
 			 * That is if the FS is _not_ mounted and if you
 			 * are here, that means there is more than one
@@ -974,7 +975,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 	 * it back. We need it to pick the disk with largest generation
 	 * (as above).
 	 */
-	if (!fs_devices->opened) {
+	if (!fs_devices->in_use) {
 		device->generation = found_transid;
 		fs_devices->latest_generation = max_t(u64, found_transid,
 						fs_devices->latest_generation);
@@ -1172,15 +1173,19 @@ static void close_fs_devices(struct btrfs_fs_devices *fs_devices)
 
 	lockdep_assert_held(&uuid_mutex);
 
-	if (--fs_devices->opened > 0)
+	if (--fs_devices->in_use > 0)
 		return;
 
+	if (!fs_devices->is_open)
+		goto done;
+
 	list_for_each_entry_safe(device, tmp, &fs_devices->devices, dev_list)
 		btrfs_close_one_device(device);
 
 	WARN_ON(fs_devices->open_devices);
 	WARN_ON(fs_devices->rw_devices);
-	fs_devices->opened = 0;
+	fs_devices->is_open = false;
+done:
 	fs_devices->seeding = false;
 	fs_devices->fs_info = NULL;
 }
@@ -1192,7 +1197,7 @@ void btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 
 	mutex_lock(&uuid_mutex);
 	close_fs_devices(fs_devices);
-	if (!fs_devices->opened) {
+	if (!fs_devices->in_use) {
 		list_splice_init(&fs_devices->seed_list, &list);
 
 		/*
@@ -1240,7 +1245,7 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 	if (fs_devices->open_devices == 0)
 		return -EINVAL;
 
-	fs_devices->opened = 1;
+	fs_devices->is_open = true;
 	fs_devices->latest_dev = latest_dev;
 	fs_devices->total_rw_bytes = 0;
 	fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
@@ -1277,16 +1282,14 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 	 * We also don't need the lock here as this is called during mount and
 	 * exclusion is provided by uuid_mutex
 	 */
-
-	if (fs_devices->opened) {
-		fs_devices->opened++;
-		ret = 0;
-	} else {
+	if (!fs_devices->is_open) {
 		list_sort(NULL, &fs_devices->devices, devid_cmp);
 		ret = open_fs_devices(fs_devices, flags, holder);
+		if (ret)
+			return ret;
 	}
-
-	return ret;
+	fs_devices->in_use++;
+	return 0;
 }
 
 void btrfs_release_disk_super(struct btrfs_super_block *super)
@@ -2242,13 +2245,14 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
 	 * This can happen if cur_devices is the private seed devices list.  We
 	 * cannot call close_fs_devices() here because it expects the uuid_mutex
 	 * to be held, but in fact we don't need that for the private
-	 * seed_devices, we can simply decrement cur_devices->opened and then
+	 * seed_devices, we can simply decrement cur_devices->in_use and then
 	 * remove it from our list and free the fs_devices.
 	 */
 	if (cur_devices->num_devices == 0) {
 		list_del_init(&cur_devices->seed_list);
-		ASSERT(cur_devices->opened == 1);
-		cur_devices->opened--;
+		ASSERT(cur_devices->in_use == 1);
+		cur_devices->in_use--;
+		cur_devices->is_open = false;
 		free_fs_devices(cur_devices);
 	}
 
@@ -2478,7 +2482,8 @@ static struct btrfs_fs_devices *btrfs_init_sprout(struct btrfs_fs_info *fs_info)
 	list_add(&old_devices->fs_list, &fs_uuids);
 
 	memcpy(seed_devices, fs_devices, sizeof(*seed_devices));
-	seed_devices->opened = 1;
+	seed_devices->in_use = 1;
+	seed_devices->is_open = true;
 	INIT_LIST_HEAD(&seed_devices->devices);
 	INIT_LIST_HEAD(&seed_devices->alloc_list);
 	mutex_init(&seed_devices->device_list_mutex);
@@ -6883,7 +6888,8 @@ static struct btrfs_fs_devices *open_seed_devices(struct btrfs_fs_info *fs_info,
 			return fs_devices;
 
 		fs_devices->seeding = true;
-		fs_devices->opened = 1;
+		fs_devices->in_use = 1;
+		fs_devices->is_open = true;
 		return fs_devices;
 	}
 
@@ -6900,6 +6906,7 @@ static struct btrfs_fs_devices *open_seed_devices(struct btrfs_fs_info *fs_info,
 		free_fs_devices(fs_devices);
 		return ERR_PTR(ret);
 	}
+	fs_devices->in_use = 1;
 
 	if (!fs_devices->seeding) {
 		close_fs_devices(fs_devices);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 824161c6dd063e..f472d646715e72 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -346,8 +346,10 @@ struct btrfs_fs_devices {
 
 	struct list_head seed_list;
 
-	/* Count fs-devices opened. */
-	int opened;
+	/* Count if fs_device is in used. */
+	unsigned int in_use;
+	/* True if the devices were opened. */
+	bool is_open;
 
 	/* Set when we find or add a device that doesn't have the nonrot flag set. */
 	bool rotating;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 05/17] btrfs: open block devices after superblock creation
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (3 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 04/17] btrfs: split btrfs_fs_devices.opened Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 12:44   ` Christian Brauner
  2023-08-11 10:08 ` [PATCH 06/17] btrfs: use the super_block as holder when mounting file systems Christoph Hellwig
                   ` (14 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

Currently btrfs_mount_root opens the block devices before committing to
allocating a super block. That creates problems for restricting the
number of writers to a device, and also leads to a unusual and not very
helpful holder (the fs_type).

Reorganize the code to first check whether the superblock for a
particular fsid does already exist and open the block devices only if it
doesn't, mirroring the recent changes to the VFS mount helpers.  To do
this the increment of the in_use counter moves out of btrfs_open_devices
and into the only caller in btrfs_mount_root so that it happens before
dropping uuid_mutex around the call to sget.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/super.c   | 40 +++++++++++++++++++++++-----------------
 fs/btrfs/volumes.c | 15 +++++----------
 2 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 2bbc041ac2e2c5..1079a0f541790d 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1434,7 +1434,6 @@ static struct dentry *mount_subvol(const char *subvol_name, u64 subvol_objectid,
 static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 		int flags, const char *device_name, void *data)
 {
-	struct block_device *bdev = NULL;
 	struct super_block *s;
 	struct btrfs_device *device = NULL;
 	struct btrfs_fs_devices *fs_devices = NULL;
@@ -1486,18 +1485,9 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 
 	fs_devices = device->fs_devices;
 	fs_info->fs_devices = fs_devices;
-
-	error = btrfs_open_devices(fs_devices, sb_open_mode(flags), fs_type);
+	fs_devices->in_use++;
 	mutex_unlock(&uuid_mutex);
-	if (error)
-		goto error_fs_info;
-
-	if (!(flags & SB_RDONLY) && fs_devices->rw_devices == 0) {
-		error = -EACCES;
-		goto error_fs_info;
-	}
 
-	bdev = fs_devices->latest_dev->bdev;
 	s = sget(fs_type, btrfs_test_super, btrfs_set_super, flags | SB_NOSEC,
 		 fs_info);
 	if (IS_ERR(s)) {
@@ -1510,7 +1500,22 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 		if ((flags ^ s->s_flags) & SB_RDONLY)
 			error = -EBUSY;
 	} else {
-		snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
+		struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+
+		mutex_lock(&uuid_mutex);
+		error = btrfs_open_devices(fs_devices, sb_open_mode(flags),
+					   fs_type);
+		mutex_unlock(&uuid_mutex);
+		if (error)
+			goto error_deactivate;
+
+		if (!(flags & SB_RDONLY) && fs_devices->rw_devices == 0) {
+			error = -EACCES;
+			goto error_deactivate;
+		}
+
+		snprintf(s->s_id, sizeof(s->s_id), "%pg",
+			 fs_devices->latest_dev->bdev);
 		shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", fs_type->name,
 					s->s_id);
 		btrfs_sb(s)->bdev_holder = fs_type;
@@ -1518,12 +1523,9 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 	}
 	if (!error)
 		error = security_sb_set_mnt_opts(s, new_sec_opts, 0, NULL);
+	if (error)
+		goto error_deactivate;
 	security_free_mnt_opts(&new_sec_opts);
-	if (error) {
-		deactivate_locked_super(s);
-		return ERR_PTR(error);
-	}
-
 	return dget(s->s_root);
 
 error_fs_info:
@@ -1531,6 +1533,10 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 error_sec_opts:
 	security_free_mnt_opts(&new_sec_opts);
 	return ERR_PTR(error);
+
+error_deactivate:
+	deactivate_locked_super(s);
+	goto error_sec_opts;
 }
 
 /*
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 3ac1a3aa8939bc..b909e593c0f1bc 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1272,8 +1272,6 @@ static int devid_cmp(void *priv, const struct list_head *a,
 int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 		       blk_mode_t flags, void *holder)
 {
-	int ret;
-
 	lockdep_assert_held(&uuid_mutex);
 	/*
 	 * The device_list_mutex cannot be taken here in case opening the
@@ -1282,14 +1280,11 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 	 * We also don't need the lock here as this is called during mount and
 	 * exclusion is provided by uuid_mutex
 	 */
-	if (!fs_devices->is_open) {
-		list_sort(NULL, &fs_devices->devices, devid_cmp);
-		ret = open_fs_devices(fs_devices, flags, holder);
-		if (ret)
-			return ret;
-	}
-	fs_devices->in_use++;
-	return 0;
+	ASSERT(fs_devices->in_use);
+	if (fs_devices->is_open)
+		return 0;
+	list_sort(NULL, &fs_devices->devices, devid_cmp);
+	return open_fs_devices(fs_devices, flags, holder);
 }
 
 void btrfs_release_disk_super(struct btrfs_super_block *super)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 06/17] btrfs: use the super_block as holder when mounting file systems
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (4 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 05/17] btrfs: open block devices after superblock creation Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 12:45   ` Christian Brauner
  2023-08-11 10:08 ` [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl Christoph Hellwig
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

The file system type is not a very useful holder as it doesn't allow us
to go back to the actual file system instance.  Pass the super_block
instead which is useful when passed back to the file system driver.

This matches what is done for all other block device based file systems.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/super.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 1079a0f541790d..0f7c402fb40349 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -69,8 +69,6 @@ static const struct super_operations btrfs_super_ops;
  * requested by subvol=/path. That way the callchain is straightforward and we
  * don't have to play tricks with the mount options and recursive calls to
  * btrfs_mount.
- *
- * The new btrfs_root_fs_type also servers as a tag for the bdev_holder.
  */
 static struct file_system_type btrfs_fs_type;
 static struct file_system_type btrfs_root_fs_type;
@@ -1503,8 +1501,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 		struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
 
 		mutex_lock(&uuid_mutex);
-		error = btrfs_open_devices(fs_devices, sb_open_mode(flags),
-					   fs_type);
+		error = btrfs_open_devices(fs_devices, sb_open_mode(flags), s);
 		mutex_unlock(&uuid_mutex);
 		if (error)
 			goto error_deactivate;
@@ -1518,7 +1515,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 			 fs_devices->latest_dev->bdev);
 		shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", fs_type->name,
 					s->s_id);
-		btrfs_sb(s)->bdev_holder = fs_type;
+		btrfs_sb(s)->bdev_holder = s;
 		error = btrfs_fill_super(s, fs_devices, data);
 	}
 	if (!error)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (5 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 06/17] btrfs: use the super_block as holder when mounting file systems Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-09-20 20:41   ` Samuel Holland
  2023-08-11 10:08 ` [PATCH 08/17] block: simplify the disk_force_media_change interface Christoph Hellwig
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

nbd_clear_sock_ioctl kills the socket and with that the block
device.  Instead of just invalidating file system buffers,
mark the device as dead, which will also invalidate the buffers
as part of the proper shutdown sequence.  This also includes
invalidating partitions if there are any.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/nbd.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 8576d696c7a221..42e0159bb258fa 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1434,12 +1434,10 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd)
 	return ret;
 }
 
-static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
-				 struct block_device *bdev)
+static void nbd_clear_sock_ioctl(struct nbd_device *nbd)
 {
+	blk_mark_disk_dead(nbd->disk);
 	nbd_clear_sock(nbd);
-	__invalidate_device(bdev, true);
-	nbd_bdev_reset(nbd);
 	if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF,
 			       &nbd->config->runtime_flags))
 		nbd_config_put(nbd);
@@ -1465,7 +1463,7 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
 	case NBD_DISCONNECT:
 		return nbd_disconnect(nbd);
 	case NBD_CLEAR_SOCK:
-		nbd_clear_sock_ioctl(nbd, bdev);
+		nbd_clear_sock_ioctl(nbd);
 		return 0;
 	case NBD_SET_SOCK:
 		return nbd_add_socket(nbd, arg, false);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 08/17] block: simplify the disk_force_media_change interface
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (6 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 09/17] floppy: call disk_force_media_change when changing the format Christoph Hellwig
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

Hard code the events to DISK_EVENT_MEDIA_CHANGE as that is the only
useful use case, and drop the superfluous return value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/disk-events.c    | 15 ++++-----------
 drivers/block/loop.c   |  6 +++---
 include/linux/blkdev.h |  2 +-
 3 files changed, 8 insertions(+), 15 deletions(-)

diff --git a/block/disk-events.c b/block/disk-events.c
index 0cfac464e6d120..6189b819b2352e 100644
--- a/block/disk-events.c
+++ b/block/disk-events.c
@@ -294,25 +294,18 @@ EXPORT_SYMBOL(disk_check_media_change);
  * @disk: the disk which will raise the event
  * @events: the events to raise
  *
- * Generate uevents for the disk. If DISK_EVENT_MEDIA_CHANGE is present,
- * attempt to free all dentries and inodes and invalidates all block
+ * Should be called when the media changes for @disk.  Generates a uevent
+ * and attempts to free all dentries and inodes and invalidates all block
  * device page cache entries in that case.
- *
- * Returns %true if DISK_EVENT_MEDIA_CHANGE was raised, or %false if not.
  */
-bool disk_force_media_change(struct gendisk *disk, unsigned int events)
+void disk_force_media_change(struct gendisk *disk)
 {
-	disk_event_uevent(disk, events);
-
-	if (!(events & DISK_EVENT_MEDIA_CHANGE))
-		return false;
-
+	disk_event_uevent(disk, DISK_EVENT_MEDIA_CHANGE);
 	inc_diskseq(disk);
 	if (__invalidate_device(disk->part0, true))
 		pr_warn("VFS: busy inodes on changed media %s\n",
 			disk->disk_name);
 	set_bit(GD_NEED_PART_SCAN, &disk->state);
-	return true;
 }
 EXPORT_SYMBOL_GPL(disk_force_media_change);
 
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 37511d2b2caf7d..705a0effa7d890 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -603,7 +603,7 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
 		goto out_err;
 
 	/* and ... switch */
-	disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE);
+	disk_force_media_change(lo->lo_disk);
 	blk_mq_freeze_queue(lo->lo_queue);
 	mapping_set_gfp_mask(old_file->f_mapping, lo->old_gfp_mask);
 	lo->lo_backing_file = file;
@@ -1067,7 +1067,7 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode,
 	/* suppress uevents while reconfiguring the device */
 	dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 1);
 
-	disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE);
+	disk_force_media_change(lo->lo_disk);
 	set_disk_ro(lo->lo_disk, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
 
 	lo->use_dio = lo->lo_flags & LO_FLAGS_DIRECT_IO;
@@ -1171,7 +1171,7 @@ static void __loop_clr_fd(struct loop_device *lo, bool release)
 	if (!release)
 		blk_mq_unfreeze_queue(lo->lo_queue);
 
-	disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE);
+	disk_force_media_change(lo->lo_disk);
 
 	if (lo->lo_flags & LO_FLAGS_PARTSCAN) {
 		int err;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 83262702eea71a..c8eab6effc2267 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -750,7 +750,7 @@ static inline int bdev_read_only(struct block_device *bdev)
 }
 
 bool set_capacity_and_notify(struct gendisk *disk, sector_t size);
-bool disk_force_media_change(struct gendisk *disk, unsigned int events);
+void disk_force_media_change(struct gendisk *disk);
 
 void add_disk_randomness(struct gendisk *disk) __latent_entropy;
 void rand_initialize_disk(struct gendisk *disk);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 09/17] floppy: call disk_force_media_change when changing the format
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (7 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 08/17] block: simplify the disk_force_media_change interface Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 10/17] amiflop: don't call fsync_bdev in FDFMTBEG Christoph Hellwig
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

While changing the format of a floppy isn't strictly speaking a media
change, the effects are the same in that the content of the media
changes and the diskseq should be increased and uevent should be
sent.  Switch from calling __invalidate_device to
disk_force_media_change to do so.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/floppy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 2db9b186b977ac..ea4eb88a2e45f4 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3255,7 +3255,7 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g,
 
 			if (!disk || ITYPE(drive_state[cnt].fd_device) != type)
 				continue;
-			__invalidate_device(disk->part0, true);
+			disk_force_media_change(disk);
 		}
 		mutex_unlock(&open_lock);
 	} else {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 10/17] amiflop: don't call fsync_bdev in FDFMTBEG
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (8 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 09/17] floppy: call disk_force_media_change when changing the format Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 11/17] dasd: also call __invalidate_device when setting the device offline Christoph Hellwig
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

FDFMTBEG is used by fdformat to calibrate before formatting a disk.
Neither the atari nor PC floppy driver sync data, which also seems
a bit pointless for a disk hat is about to get formatted.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/amiflop.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/amiflop.c b/drivers/block/amiflop.c
index e460c9799d9f35..2b98114a9fe092 100644
--- a/drivers/block/amiflop.c
+++ b/drivers/block/amiflop.c
@@ -1547,7 +1547,6 @@ static int fd_locked_ioctl(struct block_device *bdev, blk_mode_t mode,
 			rel_fdc();
 			return -EBUSY;
 		}
-		fsync_bdev(bdev);
 		if (fd_motor_on(drive) == 0) {
 			rel_fdc();
 			return -ENODEV;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 11/17] dasd: also call __invalidate_device when setting the device offline
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (9 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 10/17] amiflop: don't call fsync_bdev in FDFMTBEG Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 12/17] block: drop the "busy inodes on changed media" log message Christoph Hellwig
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

Don't just write out the data, but also invalidate all caches when setting
the device offline.  Stop canceling the offlining when writeback fails
as there is no way to recover from that anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/s390/block/dasd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index edcbf77852c31f..675b38ad00dc9e 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -3627,9 +3627,8 @@ int dasd_generic_set_offline(struct ccw_device *cdev)
 		 * empty
 		 */
 		if (device->block) {
-			rc = fsync_bdev(device->block->bdev);
-			if (rc != 0)
-				goto interrupted;
+			fsync_bdev(device->block->bdev);
+			__invalidate_device(device->block->bdev, true);
 		}
 		dasd_schedule_device_bh(device);
 		rc = wait_event_interruptible(shutdown_waitq,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 12/17] block: drop the "busy inodes on changed media" log message
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (10 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 11/17] dasd: also call __invalidate_device when setting the device offline Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 13/17] block: consolidate __invalidate_device and fsync_bdev Christoph Hellwig
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

This message isn't exactly helpful, and file systems already print way more
useful messages when shut down while active.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/disk-events.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/block/disk-events.c b/block/disk-events.c
index 6189b819b2352e..6b858d3504772c 100644
--- a/block/disk-events.c
+++ b/block/disk-events.c
@@ -281,9 +281,7 @@ bool disk_check_media_change(struct gendisk *disk)
 	if (!(events & DISK_EVENT_MEDIA_CHANGE))
 		return false;
 
-	if (__invalidate_device(disk->part0, true))
-		pr_warn("VFS: busy inodes on changed media %s\n",
-			disk->disk_name);
+	__invalidate_device(disk->part0, true);
 	set_bit(GD_NEED_PART_SCAN, &disk->state);
 	return true;
 }
@@ -302,9 +300,7 @@ void disk_force_media_change(struct gendisk *disk)
 {
 	disk_event_uevent(disk, DISK_EVENT_MEDIA_CHANGE);
 	inc_diskseq(disk);
-	if (__invalidate_device(disk->part0, true))
-		pr_warn("VFS: busy inodes on changed media %s\n",
-			disk->disk_name);
+	__invalidate_device(disk->part0, true);
 	set_bit(GD_NEED_PART_SCAN, &disk->state);
 }
 EXPORT_SYMBOL_GPL(disk_force_media_change);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 13/17] block: consolidate __invalidate_device and fsync_bdev
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (11 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 12/17] block: drop the "busy inodes on changed media" log message Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-12 10:51   ` Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 14/17] block: call into the file system for bdev_mark_dead Christoph Hellwig
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

We currently have two interfaces that take a block_devices and the find
a mounted file systems to flush or invaldidate data on it.  Both are a
bit problematic because they only work for the "main" block devices
that is used as s_dev for the super_block, and because they don't call
into the file system at all.

Merge the two into a new bdev_mark_dead helper that does both the
syncing and invalidation and which is properly documented.  This is
in preparation of merging the functionality into the ->mark_dead
holder operation so that it will work on additional block devices
used by a file systems and give us a single entry point for invalidation
of dead devices or media.

Note that a single standalone fsync_bdev call for an obscure ioctl
remains for now, but that one will also be deal with in a bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bdev.c              | 33 ++++++++++++++++++++++++++++-----
 block/disk-events.c       |  4 ++--
 block/genhd.c             |  3 +--
 block/partitions/core.c   |  5 +----
 drivers/s390/block/dasd.c |  6 ++----
 fs/super.c                |  4 ++--
 include/linux/blkdev.h    |  2 +-
 7 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 979e28a46b988e..b9ca947bd5e405 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -221,7 +221,6 @@ int fsync_bdev(struct block_device *bdev)
 	}
 	return sync_blockdev(bdev);
 }
-EXPORT_SYMBOL(fsync_bdev);
 
 /**
  * freeze_bdev - lock a filesystem and force it into a consistent state
@@ -960,12 +959,27 @@ int lookup_bdev(const char *pathname, dev_t *dev)
 }
 EXPORT_SYMBOL(lookup_bdev);
 
-int __invalidate_device(struct block_device *bdev, bool kill_dirty)
+/**
+ * bdev_mark_dead - mark a block device as dead
+ * @bdev: block device to operate on
+ * @surprise: indicate a surprise removal
+ *
+ * Tell the file system that this devices or media is dead.  If @surprise is set
+ * to %true the device or media is already gone, if not we are preparing for an
+ * orderly removal.
+ *
+ * This syncs out all dirty data and writes back inodes and then invalidates any
+ * cached data in the inodes on the file system, the inodes themselves and the
+ * block device mapping.
+ */
+void bdev_mark_dead(struct block_device *bdev, bool surprise)
 {
 	struct super_block *sb = get_super(bdev);
 	int res = 0;
 
 	if (sb) {
+		if (!surprise)
+			sync_filesystem(sb);
 		/*
 		 * no need to lock the super, get_super holds the
 		 * read mutex so the filesystem cannot go away
@@ -973,13 +987,22 @@ int __invalidate_device(struct block_device *bdev, bool kill_dirty)
 		 * hold).
 		 */
 		shrink_dcache_sb(sb);
-		res = invalidate_inodes(sb, kill_dirty);
+		res = invalidate_inodes(sb, true);
 		drop_super(sb);
+	} else {
+		if (!surprise)
+			sync_blockdev(bdev);
 	}
 	invalidate_bdev(bdev);
-	return res;
 }
-EXPORT_SYMBOL(__invalidate_device);
+#ifdef CONFIG_DASD
+/*
+ * Drivers should not use this directly, but the DASD driver has historically
+ * had a shutdown to offline mode that doesn't actually remove the gendisk
+ * that otherwise looks a lot like a safe device removal.
+ */
+EXPORT_SYMBOL_GPL(bdev_mark_dead);
+#endif
 
 void sync_bdevs(bool wait)
 {
diff --git a/block/disk-events.c b/block/disk-events.c
index 6b858d3504772c..422db8292d0997 100644
--- a/block/disk-events.c
+++ b/block/disk-events.c
@@ -281,7 +281,7 @@ bool disk_check_media_change(struct gendisk *disk)
 	if (!(events & DISK_EVENT_MEDIA_CHANGE))
 		return false;
 
-	__invalidate_device(disk->part0, true);
+	bdev_mark_dead(disk->part0, true);
 	set_bit(GD_NEED_PART_SCAN, &disk->state);
 	return true;
 }
@@ -300,7 +300,7 @@ void disk_force_media_change(struct gendisk *disk)
 {
 	disk_event_uevent(disk, DISK_EVENT_MEDIA_CHANGE);
 	inc_diskseq(disk);
-	__invalidate_device(disk->part0, true);
+	bdev_mark_dead(disk->part0, true);
 	set_bit(GD_NEED_PART_SCAN, &disk->state);
 }
 EXPORT_SYMBOL_GPL(disk_force_media_change);
diff --git a/block/genhd.c b/block/genhd.c
index 3d287b32d50dfd..afc2cb09eb94b9 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -647,8 +647,7 @@ void del_gendisk(struct gendisk *disk)
 	mutex_lock(&disk->open_mutex);
 	xa_for_each(&disk->part_tbl, idx, part) {
 		remove_inode_hash(part->bd_inode);
-		fsync_bdev(part);
-		__invalidate_device(part, true);
+		bdev_mark_dead(part, false);
 	}
 	mutex_unlock(&disk->open_mutex);
 
diff --git a/block/partitions/core.c b/block/partitions/core.c
index 13a7341299a913..e137a87f4db0d3 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -281,10 +281,7 @@ static void delete_partition(struct block_device *part)
 	 * looked up any more even when openers still hold references.
 	 */
 	remove_inode_hash(part->bd_inode);
-
-	fsync_bdev(part);
-	__invalidate_device(part, true);
-
+	bdev_mark_dead(part, false);
 	drop_partition(part);
 }
 
diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index 675b38ad00dc9e..1f642be840c3ef 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -3626,10 +3626,8 @@ int dasd_generic_set_offline(struct ccw_device *cdev)
 		 * so sync bdev first and then wait for our queues to become
 		 * empty
 		 */
-		if (device->block) {
-			fsync_bdev(device->block->bdev);
-			__invalidate_device(device->block->bdev, true);
-		}
+		if (device->block)
+			bdev_mark_dead(device->block->bdev, false);
 		dasd_schedule_device_bh(device);
 		rc = wait_event_interruptible(shutdown_waitq,
 					      _wait_for_empty_queues(device));
diff --git a/fs/super.c b/fs/super.c
index 71fe297a7e90a9..bbce0fdebf7e52 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1359,7 +1359,7 @@ int get_tree_bdev(struct fs_context *fc,
 		/*
 		 * We drop s_umount here because we need to open the bdev and
 		 * bdev->open_mutex ranks above s_umount (blkdev_put() ->
-		 * __invalidate_device()). It is safe because we have active sb
+		 * bdev_mark_dead()). It is safe because we have active sb
 		 * reference and SB_BORN is not set yet.
 		 */
 		up_write(&s->s_umount);
@@ -1411,7 +1411,7 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
 		/*
 		 * We drop s_umount here because we need to open the bdev and
 		 * bdev->open_mutex ranks above s_umount (blkdev_put() ->
-		 * __invalidate_device()). It is safe because we have active sb
+		 * bdev_mark_dead()). It is safe because we have active sb
 		 * reference and SB_BORN is not set yet.
 		 */
 		up_write(&s->s_umount);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c8eab6effc2267..6721595b9f9741 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -751,6 +751,7 @@ static inline int bdev_read_only(struct block_device *bdev)
 
 bool set_capacity_and_notify(struct gendisk *disk, sector_t size);
 void disk_force_media_change(struct gendisk *disk);
+void bdev_mark_dead(struct block_device *bdev, bool surprise);
 
 void add_disk_randomness(struct gendisk *disk) __latent_entropy;
 void rand_initialize_disk(struct gendisk *disk);
@@ -809,7 +810,6 @@ int __register_blkdev(unsigned int major, const char *name,
 void unregister_blkdev(unsigned int major, const char *name);
 
 bool disk_check_media_change(struct gendisk *disk);
-int __invalidate_device(struct block_device *bdev, bool kill_dirty);
 void set_capacity(struct gendisk *disk, sector_t size);
 
 #ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 14/17] block: call into the file system for bdev_mark_dead
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (12 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 13/17] block: consolidate __invalidate_device and fsync_bdev Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 10:08 ` [PATCH 15/17] block: call into the file system for ioctl BLKFLSBUF Christoph Hellwig
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

Combine the newly merged bdev_mark_dead helper with the existing
mark_dead holder operation so that all operations that invalidate
a device that is dead or being removed now go through the holder
ops.  This allows file systems to explicitly shutdown either ASAP
(for a surprise removal) or after writing back data (for an orderly
removal), and do so not only for the main device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bdev.c           | 30 +++++++++-------------------
 block/genhd.c          | 44 +++++++++++++++++++++++-------------------
 fs/super.c             |  8 ++++++--
 include/linux/blkdev.h |  2 +-
 4 files changed, 40 insertions(+), 44 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index b9ca947bd5e405..658d5dd62cac0a 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -968,31 +968,19 @@ EXPORT_SYMBOL(lookup_bdev);
  * to %true the device or media is already gone, if not we are preparing for an
  * orderly removal.
  *
- * This syncs out all dirty data and writes back inodes and then invalidates any
- * cached data in the inodes on the file system, the inodes themselves and the
- * block device mapping.
+ * This calls into the file system, which then typicall syncs out all dirty data
+ * and writes back inodes and then invalidates any cached data in the inodes on
+ * the file system.  In addition we also invalidate the block device mapping.
  */
 void bdev_mark_dead(struct block_device *bdev, bool surprise)
 {
-	struct super_block *sb = get_super(bdev);
-	int res = 0;
+	mutex_lock(&bdev->bd_holder_lock);
+	if (bdev->bd_holder_ops && bdev->bd_holder_ops->mark_dead)
+		bdev->bd_holder_ops->mark_dead(bdev, surprise);
+	else
+		sync_blockdev(bdev);
+	mutex_unlock(&bdev->bd_holder_lock);
 
-	if (sb) {
-		if (!surprise)
-			sync_filesystem(sb);
-		/*
-		 * no need to lock the super, get_super holds the
-		 * read mutex so the filesystem cannot go away
-		 * under us (->put_super runs with the write lock
-		 * hold).
-		 */
-		shrink_dcache_sb(sb);
-		res = invalidate_inodes(sb, true);
-		drop_super(sb);
-	} else {
-		if (!surprise)
-			sync_blockdev(bdev);
-	}
 	invalidate_bdev(bdev);
 }
 #ifdef CONFIG_DASD
diff --git a/block/genhd.c b/block/genhd.c
index afc2cb09eb94b9..cc32a0c704eb84 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -554,7 +554,7 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
 }
 EXPORT_SYMBOL(device_add_disk);
 
-static void blk_report_disk_dead(struct gendisk *disk)
+static void blk_report_disk_dead(struct gendisk *disk, bool surprise)
 {
 	struct block_device *bdev;
 	unsigned long idx;
@@ -565,10 +565,7 @@ static void blk_report_disk_dead(struct gendisk *disk)
 			continue;
 		rcu_read_unlock();
 
-		mutex_lock(&bdev->bd_holder_lock);
-		if (bdev->bd_holder_ops && bdev->bd_holder_ops->mark_dead)
-			bdev->bd_holder_ops->mark_dead(bdev);
-		mutex_unlock(&bdev->bd_holder_lock);
+		bdev_mark_dead(bdev, surprise);
 
 		put_device(&bdev->bd_device);
 		rcu_read_lock();
@@ -576,14 +573,7 @@ static void blk_report_disk_dead(struct gendisk *disk)
 	rcu_read_unlock();
 }
 
-/**
- * blk_mark_disk_dead - mark a disk as dead
- * @disk: disk to mark as dead
- *
- * Mark as disk as dead (e.g. surprise removed) and don't accept any new I/O
- * to this disk.
- */
-void blk_mark_disk_dead(struct gendisk *disk)
+static void __blk_mark_disk_dead(struct gendisk *disk)
 {
 	/*
 	 * Fail any new I/O.
@@ -603,8 +593,19 @@ void blk_mark_disk_dead(struct gendisk *disk)
 	 * Prevent new I/O from crossing bio_queue_enter().
 	 */
 	blk_queue_start_drain(disk->queue);
+}
 
-	blk_report_disk_dead(disk);
+/**
+ * blk_mark_disk_dead - mark a disk as dead
+ * @disk: disk to mark as dead
+ *
+ * Mark as disk as dead (e.g. surprise removed) and don't accept any new I/O
+ * to this disk.
+ */
+void blk_mark_disk_dead(struct gendisk *disk)
+{
+	__blk_mark_disk_dead(disk);
+	blk_report_disk_dead(disk, true);
 }
 EXPORT_SYMBOL_GPL(blk_mark_disk_dead);
 
@@ -641,17 +642,20 @@ void del_gendisk(struct gendisk *disk)
 	disk_del_events(disk);
 
 	/*
-	 * Prevent new openers by unlinked the bdev inode, and write out
-	 * dirty data before marking the disk dead and stopping all I/O.
+	 * Prevent new openers by unlinked the bdev inode.
 	 */
 	mutex_lock(&disk->open_mutex);
-	xa_for_each(&disk->part_tbl, idx, part) {
+	xa_for_each(&disk->part_tbl, idx, part)
 		remove_inode_hash(part->bd_inode);
-		bdev_mark_dead(part, false);
-	}
 	mutex_unlock(&disk->open_mutex);
 
-	blk_mark_disk_dead(disk);
+	/*
+	 * Tell the file system to write back all dirty data and shut down if
+	 * it hasn't been notified earlier.
+	 */
+	if (!test_bit(GD_DEAD, &disk->state))
+		blk_report_disk_dead(disk, false);
+	__blk_mark_disk_dead(disk);
 
 	/*
 	 * Drop all partitions now that the disk is marked dead.
diff --git a/fs/super.c b/fs/super.c
index bbce0fdebf7e52..94d41040584f7b 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1228,7 +1228,7 @@ static bool lock_active_super(struct super_block *sb)
 	return true;
 }
 
-static void fs_mark_dead(struct block_device *bdev)
+static void fs_bdev_mark_dead(struct block_device *bdev, bool surprise)
 {
 	struct super_block *sb = bdev->bd_holder;
 
@@ -1238,6 +1238,10 @@ static void fs_mark_dead(struct block_device *bdev)
 	if (!lock_active_super(sb))
 		return;
 
+	if (!surprise)
+		sync_filesystem(sb);
+	shrink_dcache_sb(sb);
+	invalidate_inodes(sb, true);
 	if (sb->s_op->shutdown)
 		sb->s_op->shutdown(sb);
 
@@ -1245,7 +1249,7 @@ static void fs_mark_dead(struct block_device *bdev)
 }
 
 const struct blk_holder_ops fs_holder_ops = {
-	.mark_dead		= fs_mark_dead,
+	.mark_dead		= fs_bdev_mark_dead,
 };
 EXPORT_SYMBOL_GPL(fs_holder_ops);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6721595b9f9741..cdd03c612d3957 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1461,7 +1461,7 @@ void blkdev_show(struct seq_file *seqf, off_t offset);
 #endif
 
 struct blk_holder_ops {
-	void (*mark_dead)(struct block_device *bdev);
+	void (*mark_dead)(struct block_device *bdev, bool surprise);
 };
 
 extern const struct blk_holder_ops fs_holder_ops;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 15/17] block: call into the file system for ioctl BLKFLSBUF
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (13 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 14/17] block: call into the file system for bdev_mark_dead Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 14:06   ` Josef Bacik
  2023-08-11 10:08 ` [PATCH 16/17] fs: remove get_super Christoph Hellwig
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

BLKFLSBUF is a historic ioctl that is called on a file handle to a
block device and syncs either the file system mounted on that block
device if there is one, or otherwise the just the data on the block
device.

Replace the get_super based syncing with a holder operation to remove
the last usage of get_super, and to also support syncing the file system
if the block device is not the main block device stored in s_dev.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bdev.c           | 16 ----------------
 block/ioctl.c          |  9 ++++++++-
 fs/super.c             | 13 +++++++++++++
 include/linux/blkdev.h |  7 +++++--
 4 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 658d5dd62cac0a..2a035be7f3ee90 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -206,22 +206,6 @@ int sync_blockdev_range(struct block_device *bdev, loff_t lstart, loff_t lend)
 }
 EXPORT_SYMBOL(sync_blockdev_range);
 
-/*
- * Write out and wait upon all dirty data associated with this
- * device.   Filesystem data as well as the underlying block
- * device.  Takes the superblock lock.
- */
-int fsync_bdev(struct block_device *bdev)
-{
-	struct super_block *sb = get_super(bdev);
-	if (sb) {
-		int res = sync_filesystem(sb);
-		drop_super(sb);
-		return res;
-	}
-	return sync_blockdev(bdev);
-}
-
 /**
  * freeze_bdev - lock a filesystem and force it into a consistent state
  * @bdev:	blockdevice to lock
diff --git a/block/ioctl.c b/block/ioctl.c
index 3be11941fb2ddc..648670ddb164a0 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -364,7 +364,14 @@ static int blkdev_flushbuf(struct block_device *bdev, unsigned cmd,
 {
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
-	fsync_bdev(bdev);
+
+	mutex_lock(&bdev->bd_holder_lock);
+	if (bdev->bd_holder_ops && bdev->bd_holder_ops->sync)
+		bdev->bd_holder_ops->sync(bdev);
+	else
+		sync_blockdev(bdev);
+	mutex_unlock(&bdev->bd_holder_lock);
+
 	invalidate_bdev(bdev);
 	return 0;
 }
diff --git a/fs/super.c b/fs/super.c
index 94d41040584f7b..714dbae58b5e8e 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1248,8 +1248,21 @@ static void fs_bdev_mark_dead(struct block_device *bdev, bool surprise)
 	up_read(&sb->s_umount);
 }
 
+static void fs_bdev_sync(struct block_device *bdev)
+{
+	struct super_block *sb = bdev->bd_holder;
+
+	lockdep_assert_held(&bdev->bd_holder_lock);
+
+	if (!lock_active_super(sb))
+		return;
+	sync_filesystem(sb);
+	up_read(&sb->s_umount);
+}
+ 
 const struct blk_holder_ops fs_holder_ops = {
 	.mark_dead		= fs_bdev_mark_dead,
+	.sync			= fs_bdev_sync,
 };
 EXPORT_SYMBOL_GPL(fs_holder_ops);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index cdd03c612d3957..fa462d48ee9640 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1462,6 +1462,11 @@ void blkdev_show(struct seq_file *seqf, off_t offset);
 
 struct blk_holder_ops {
 	void (*mark_dead)(struct block_device *bdev, bool surprise);
+
+	/*
+	 * Sync the file system mounted on the block device.
+	 */
+	void (*sync)(struct block_device *bdev);
 };
 
 extern const struct blk_holder_ops fs_holder_ops;
@@ -1524,8 +1529,6 @@ static inline int early_lookup_bdev(const char *pathname, dev_t *dev)
 }
 #endif /* CONFIG_BLOCK */
 
-int fsync_bdev(struct block_device *bdev);
-
 int freeze_bdev(struct block_device *bdev);
 int thaw_bdev(struct block_device *bdev);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 16/17] fs: remove get_super
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (14 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 15/17] block: call into the file system for ioctl BLKFLSBUF Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 12:46   ` Christian Brauner
  2023-08-11 10:08 ` [PATCH 17/17] fs: simplify invalidate_inodes Christoph Hellwig
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

get_super is unused now, remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/super.c         | 37 -------------------------------------
 include/linux/fs.h |  1 -
 2 files changed, 38 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 714dbae58b5e8e..d3d27ff009d5f6 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -808,43 +808,6 @@ void iterate_supers_type(struct file_system_type *type,
 
 EXPORT_SYMBOL(iterate_supers_type);
 
-/**
- * get_super - get the superblock of a device
- * @bdev: device to get the superblock for
- *
- * Scans the superblock list and finds the superblock of the file system
- * mounted on the device given. %NULL is returned if no match is found.
- */
-struct super_block *get_super(struct block_device *bdev)
-{
-	struct super_block *sb;
-
-	if (!bdev)
-		return NULL;
-
-	spin_lock(&sb_lock);
-rescan:
-	list_for_each_entry(sb, &super_blocks, s_list) {
-		if (hlist_unhashed(&sb->s_instances))
-			continue;
-		if (sb->s_bdev == bdev) {
-			sb->s_count++;
-			spin_unlock(&sb_lock);
-			down_read(&sb->s_umount);
-			/* still alive? */
-			if (sb->s_root && (sb->s_flags & SB_BORN))
-				return sb;
-			up_read(&sb->s_umount);
-			/* nope, got unmounted */
-			spin_lock(&sb_lock);
-			__put_super(sb);
-			goto rescan;
-		}
-	}
-	spin_unlock(&sb_lock);
-	return NULL;
-}
-
 /**
  * get_active_super - get an active reference to the superblock of a device
  * @bdev: device to get the superblock for
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6867512907d6e5..14b5777a24a0b2 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2913,7 +2913,6 @@ extern int vfs_readlink(struct dentry *, char __user *, int);
 extern struct file_system_type *get_filesystem(struct file_system_type *fs);
 extern void put_filesystem(struct file_system_type *fs);
 extern struct file_system_type *get_fs_type(const char *name);
-extern struct super_block *get_super(struct block_device *);
 extern struct super_block *get_active_super(struct block_device *bdev);
 extern void drop_super(struct super_block *sb);
 extern void drop_super_exclusive(struct super_block *sb);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 17/17] fs: simplify invalidate_inodes
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (15 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 16/17] fs: remove get_super Christoph Hellwig
@ 2023-08-11 10:08 ` Christoph Hellwig
  2023-08-11 12:48   ` Christian Brauner
  2023-08-11 13:58 ` remove get_super Josef Bacik
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-11 10:08 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

kill_dirty has always been true for a long time, so hard code it and
remove the unused return value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/inode.c    | 16 ++--------------
 fs/internal.h |  2 +-
 fs/super.c    |  2 +-
 3 files changed, 4 insertions(+), 16 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 8fefb69e1f84a9..fd67ca46415d50 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -754,14 +754,10 @@ EXPORT_SYMBOL_GPL(evict_inodes);
  * @sb:		superblock to operate on
  * @kill_dirty: flag to guide handling of dirty inodes
  *
- * Attempts to free all inodes for a given superblock.  If there were any
- * busy inodes return a non-zero value, else zero.
- * If @kill_dirty is set, discard dirty inodes too, otherwise treat
- * them as busy.
+ * Attempts to free all inodes (including dirty inodes) for a given superblock.
  */
-int invalidate_inodes(struct super_block *sb, bool kill_dirty)
+void invalidate_inodes(struct super_block *sb)
 {
-	int busy = 0;
 	struct inode *inode, *next;
 	LIST_HEAD(dispose);
 
@@ -773,14 +769,8 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 			spin_unlock(&inode->i_lock);
 			continue;
 		}
-		if (inode->i_state & I_DIRTY_ALL && !kill_dirty) {
-			spin_unlock(&inode->i_lock);
-			busy = 1;
-			continue;
-		}
 		if (atomic_read(&inode->i_count)) {
 			spin_unlock(&inode->i_lock);
-			busy = 1;
 			continue;
 		}
 
@@ -798,8 +788,6 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	spin_unlock(&sb->s_inode_list_lock);
 
 	dispose_list(&dispose);
-
-	return busy;
 }
 
 /*
diff --git a/fs/internal.h b/fs/internal.h
index f7a3dc11102647..b94290f61714a0 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -201,7 +201,7 @@ void lock_two_inodes(struct inode *inode1, struct inode *inode2,
  * fs-writeback.c
  */
 extern long get_nr_dirty_inodes(void);
-extern int invalidate_inodes(struct super_block *, bool);
+void invalidate_inodes(struct super_block *sb);
 
 /*
  * dcache.c
diff --git a/fs/super.c b/fs/super.c
index d3d27ff009d5f6..500c5308f2c8a0 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1204,7 +1204,7 @@ static void fs_bdev_mark_dead(struct block_device *bdev, bool surprise)
 	if (!surprise)
 		sync_filesystem(sb);
 	shrink_dcache_sb(sb);
-	invalidate_inodes(sb, true);
+	invalidate_inodes(sb);
 	if (sb->s_op->shutdown)
 		sb->s_op->shutdown(sb);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 01/17] FOLD: reverts part of "fs: use the super_block as holder when mounting file systems"
  2023-08-11 10:08 ` [PATCH 01/17] FOLD: reverts part of "fs: use the super_block as holder when mounting file systems" Christoph Hellwig
@ 2023-08-11 10:44   ` Christian Brauner
  0 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2023-08-11 10:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Jens Axboe, Denis Efremov, Josef Bacik,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:12PM +0200, Christoph Hellwig wrote:
> The btrfs hunk should be dropped because the prerequisite btrfs changes were
> dropped from the branch.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

I've already got that dropped in vfs.super.
So no need to for this.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 02/17] btrfs: always open the device read-only in btrfs_scan_one_device
  2023-08-11 10:08 ` [PATCH 02/17] btrfs: always open the device read-only in btrfs_scan_one_device Christoph Hellwig
@ 2023-08-11 12:00   ` Christian Brauner
  0 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2023-08-11 12:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Jens Axboe, Denis Efremov, Josef Bacik,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:13PM +0200, Christoph Hellwig wrote:
> btrfs_scan_one_device opens the block device only to read the super
> block.  Instead of passing a blk_mode_t argument to sometimes open
> it for writing, just hard code BLK_OPEN_READ as it will never write
> to the device or hand the block_device out to someone else.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Looks good to me,
Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/17] btrfs: call btrfs_close_devices from ->kill_sb
  2023-08-11 10:08 ` [PATCH 03/17] btrfs: call btrfs_close_devices from ->kill_sb Christoph Hellwig
@ 2023-08-11 12:03   ` Christian Brauner
  0 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2023-08-11 12:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Jens Axboe, Denis Efremov, Josef Bacik,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:14PM +0200, Christoph Hellwig wrote:
> blkdev_put must not be called under sb->s_umount to avoid a lock order
> reversal with disk->open_mutex once call backs from block devices to
> the file system using the holder ops are supported.  Move the call
> to btrfs_close_devices into btrfs_free_fs_info so that it is closed
> from ->kill_sb (which is also called from the mount failure handling
> path unlike ->put_super) as well as when an fs_info is freed because
> an existing superblock already exists.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Looks good to me,
Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/17] btrfs: split btrfs_fs_devices.opened
  2023-08-11 10:08 ` [PATCH 04/17] btrfs: split btrfs_fs_devices.opened Christoph Hellwig
@ 2023-08-11 12:40   ` Christian Brauner
  0 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2023-08-11 12:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Jens Axboe, Denis Efremov, Josef Bacik,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:15PM +0200, Christoph Hellwig wrote:
> The btrfs_fs_devices.opened member mixes an in use counter for the
> fs_devices structure that prevents it from being garbage collected with
> a flag if the underlying devices were actually opened.  This not only
> makes the code hard to follow, but also prevents btrfs from switching
> to opening the block device only after super block creation.  Split it
> into an in_use counter and an is_open boolean flag instead.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

That looks like it will fix the issue we've seen with the first version.
So Acked-by: Christian Brauner <brauner@kernel.org>
but it'd be excellent to hear from btrfs maintainers.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 05/17] btrfs: open block devices after superblock creation
  2023-08-11 10:08 ` [PATCH 05/17] btrfs: open block devices after superblock creation Christoph Hellwig
@ 2023-08-11 12:44   ` Christian Brauner
  2023-08-11 13:11     ` David Sterba
  0 siblings, 1 reply; 46+ messages in thread
From: Christian Brauner @ 2023-08-11 12:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Jens Axboe, Denis Efremov, Josef Bacik,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:16PM +0200, Christoph Hellwig wrote:
> Currently btrfs_mount_root opens the block devices before committing to
> allocating a super block. That creates problems for restricting the
> number of writers to a device, and also leads to a unusual and not very
> helpful holder (the fs_type).
> 
> Reorganize the code to first check whether the superblock for a
> particular fsid does already exist and open the block devices only if it
> doesn't, mirroring the recent changes to the VFS mount helpers.  To do
> this the increment of the in_use counter moves out of btrfs_open_devices
> and into the only caller in btrfs_mount_root so that it happens before
> dropping uuid_mutex around the call to sget.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Looks good to me,
Acked-by: Christian Brauner <brauner@kernel.org>

And ofc, would be great to get btrfs reviews.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 06/17] btrfs: use the super_block as holder when mounting file systems
  2023-08-11 10:08 ` [PATCH 06/17] btrfs: use the super_block as holder when mounting file systems Christoph Hellwig
@ 2023-08-11 12:45   ` Christian Brauner
  0 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2023-08-11 12:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Jens Axboe, Denis Efremov, Josef Bacik,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:17PM +0200, Christoph Hellwig wrote:
> The file system type is not a very useful holder as it doesn't allow us
> to go back to the actual file system instance.  Pass the super_block
> instead which is useful when passed back to the file system driver.
> 
> This matches what is done for all other block device based file systems.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Looks good to me,
Acked-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 16/17] fs: remove get_super
  2023-08-11 10:08 ` [PATCH 16/17] fs: remove get_super Christoph Hellwig
@ 2023-08-11 12:46   ` Christian Brauner
  0 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2023-08-11 12:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Jens Axboe, Denis Efremov, Josef Bacik,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:27PM +0200, Christoph Hellwig wrote:
> get_super is unused now, remove it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Looks good to me,
Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 17/17] fs: simplify invalidate_inodes
  2023-08-11 10:08 ` [PATCH 17/17] fs: simplify invalidate_inodes Christoph Hellwig
@ 2023-08-11 12:48   ` Christian Brauner
  0 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2023-08-11 12:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Jens Axboe, Denis Efremov, Josef Bacik,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:28PM +0200, Christoph Hellwig wrote:
> kill_dirty has always been true for a long time, so hard code it and
> remove the unused return value.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Looks good to me,
Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 05/17] btrfs: open block devices after superblock creation
  2023-08-11 12:44   ` Christian Brauner
@ 2023-08-11 13:11     ` David Sterba
  2023-08-17 13:24       ` David Sterba
  0 siblings, 1 reply; 46+ messages in thread
From: David Sterba @ 2023-08-11 13:11 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christoph Hellwig, Al Viro, Jens Axboe, Denis Efremov,
	Josef Bacik, Stefan Haberland, Jan Hoeppner, Heiko Carstens,
	Vasily Gorbik, Alexander Gordeev, Darrick J . Wong, Chris Mason,
	David Sterba, linux-block, nbd, linux-s390, linux-btrfs,
	linux-fsdevel

On Fri, Aug 11, 2023 at 02:44:50PM +0200, Christian Brauner wrote:
> On Fri, Aug 11, 2023 at 12:08:16PM +0200, Christoph Hellwig wrote:
> > Currently btrfs_mount_root opens the block devices before committing to
> > allocating a super block. That creates problems for restricting the
> > number of writers to a device, and also leads to a unusual and not very
> > helpful holder (the fs_type).
> > 
> > Reorganize the code to first check whether the superblock for a
> > particular fsid does already exist and open the block devices only if it
> > doesn't, mirroring the recent changes to the VFS mount helpers.  To do
> > this the increment of the in_use counter moves out of btrfs_open_devices
> > and into the only caller in btrfs_mount_root so that it happens before
> > dropping uuid_mutex around the call to sget.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> 
> Looks good to me,
> Acked-by: Christian Brauner <brauner@kernel.org>
> 
> And ofc, would be great to get btrfs reviews.

I'll take a look but there are some performance regressions to deal with
and pre-merge window freeze so it won't be soon.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: remove get_super
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (16 preceding siblings ...)
  2023-08-11 10:08 ` [PATCH 17/17] fs: simplify invalidate_inodes Christoph Hellwig
@ 2023-08-11 13:58 ` Josef Bacik
  2023-08-11 19:05 ` Josef Bacik
  2023-09-12 17:42 ` David Sterba
  19 siblings, 0 replies; 46+ messages in thread
From: Josef Bacik @ 2023-08-11 13:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Christian Brauner, Jens Axboe, Denis Efremov,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:11PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series against the VFS vfs.super branch finishes off the work to remove
> get_super and move (almost) all upcalls to use the holder ops.
> 
> The first part is the missing btrfs bits so that all file systems use the
> super_block as holder.
> 
> The second part is various block driver cleanups so that we use proper
> interfaces instead of raw calls to __invalidate_device and fsync_bdev.
> 
> The last part than replaces __invalidate_device and fsync_bdev with upcalls
> to the file system through the holder ops, and finally removes get_super.
> 
> It leaves user_get_super and get_active_super around.  The former is not
> used for upcalls in the traditional sense, but for legacy UAPI that for
> some weird reason take a dev_t argument (ustat) or a block device path
> (quotactl).  get_active_super is only used for calling into the file system
> on freeze and should get a similar treatment, but given that Darrick has
> changes to that code queued up already this will be handled in the next
> merge window.
> 
> A git tree is available here:
> 
>     git://git.infradead.org/users/hch/misc.git remove-get_super
> 
> Gitweb:
> 
>     http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/remove-get_super
> 

I rebased this onto misc-next and put in a PR to get it running through the GH
CI, you can follow it here

https://github.com/btrfs/linux/actions/runs/5833422266

In the meantime I'll start reviewing the patches.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/17] block: call into the file system for ioctl BLKFLSBUF
  2023-08-11 10:08 ` [PATCH 15/17] block: call into the file system for ioctl BLKFLSBUF Christoph Hellwig
@ 2023-08-11 14:06   ` Josef Bacik
  0 siblings, 0 replies; 46+ messages in thread
From: Josef Bacik @ 2023-08-11 14:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Christian Brauner, Jens Axboe, Denis Efremov,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:26PM +0200, Christoph Hellwig wrote:
> BLKFLSBUF is a historic ioctl that is called on a file handle to a
> block device and syncs either the file system mounted on that block
> device if there is one, or otherwise the just the data on the block
> device.
> 
> Replace the get_super based syncing with a holder operation to remove
> the last usage of get_super, and to also support syncing the file system
> if the block device is not the main block device stored in s_dev.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/bdev.c           | 16 ----------------
>  block/ioctl.c          |  9 ++++++++-
>  fs/super.c             | 13 +++++++++++++
>  include/linux/blkdev.h |  7 +++++--
>  4 files changed, 26 insertions(+), 19 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index 658d5dd62cac0a..2a035be7f3ee90 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -206,22 +206,6 @@ int sync_blockdev_range(struct block_device *bdev, loff_t lstart, loff_t lend)
>  }
>  EXPORT_SYMBOL(sync_blockdev_range);
>  
> -/*
> - * Write out and wait upon all dirty data associated with this
> - * device.   Filesystem data as well as the underlying block
> - * device.  Takes the superblock lock.
> - */
> -int fsync_bdev(struct block_device *bdev)
> -{
> -	struct super_block *sb = get_super(bdev);
> -	if (sb) {
> -		int res = sync_filesystem(sb);
> -		drop_super(sb);
> -		return res;
> -	}
> -	return sync_blockdev(bdev);
> -}
> -
>  /**
>   * freeze_bdev - lock a filesystem and force it into a consistent state
>   * @bdev:	blockdevice to lock
> diff --git a/block/ioctl.c b/block/ioctl.c
> index 3be11941fb2ddc..648670ddb164a0 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -364,7 +364,14 @@ static int blkdev_flushbuf(struct block_device *bdev, unsigned cmd,
>  {
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EACCES;
> -	fsync_bdev(bdev);
> +
> +	mutex_lock(&bdev->bd_holder_lock);
> +	if (bdev->bd_holder_ops && bdev->bd_holder_ops->sync)
> +		bdev->bd_holder_ops->sync(bdev);
> +	else
> +		sync_blockdev(bdev);
> +	mutex_unlock(&bdev->bd_holder_lock);
> +
>  	invalidate_bdev(bdev);
>  	return 0;
>  }
> diff --git a/fs/super.c b/fs/super.c
> index 94d41040584f7b..714dbae58b5e8e 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1248,8 +1248,21 @@ static void fs_bdev_mark_dead(struct block_device *bdev, bool surprise)
>  	up_read(&sb->s_umount);
>  }
>  
> +static void fs_bdev_sync(struct block_device *bdev)
> +{
> +	struct super_block *sb = bdev->bd_holder;
> +
> +	lockdep_assert_held(&bdev->bd_holder_lock);
> +
> +	if (!lock_active_super(sb))
> +		return;
> +	sync_filesystem(sb);
> +	up_read(&sb->s_umount);
> +}
> + 

Whitespace error.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: remove get_super
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (17 preceding siblings ...)
  2023-08-11 13:58 ` remove get_super Josef Bacik
@ 2023-08-11 19:05 ` Josef Bacik
  2023-08-14 19:19   ` David Sterba
  2023-09-12 17:42 ` David Sterba
  19 siblings, 1 reply; 46+ messages in thread
From: Josef Bacik @ 2023-08-11 19:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Christian Brauner, Jens Axboe, Denis Efremov,
	Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:11PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series against the VFS vfs.super branch finishes off the work to remove
> get_super and move (almost) all upcalls to use the holder ops.
> 
> The first part is the missing btrfs bits so that all file systems use the
> super_block as holder.
> 
> The second part is various block driver cleanups so that we use proper
> interfaces instead of raw calls to __invalidate_device and fsync_bdev.
> 
> The last part than replaces __invalidate_device and fsync_bdev with upcalls
> to the file system through the holder ops, and finally removes get_super.
> 
> It leaves user_get_super and get_active_super around.  The former is not
> used for upcalls in the traditional sense, but for legacy UAPI that for
> some weird reason take a dev_t argument (ustat) or a block device path
> (quotactl).  get_active_super is only used for calling into the file system
> on freeze and should get a similar treatment, but given that Darrick has
> changes to that code queued up already this will be handled in the next
> merge window.
> 
> A git tree is available here:
> 
>     git://git.infradead.org/users/hch/misc.git remove-get_super
> 
> Gitweb:
> 
>     http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/remove-get_super

The CI testing didn't show any regressions or lockdep errors related to moving
the device opening around (which is what I was worried about).  If you look at
the status you'll see the subpage compress timed out, that's not you, that's an
existing bug I haven't devoted time to figuring out yet.  You can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

to the series, thanks,

Josef

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 13/17] block: consolidate __invalidate_device and fsync_bdev
  2023-08-11 10:08 ` [PATCH 13/17] block: consolidate __invalidate_device and fsync_bdev Christoph Hellwig
@ 2023-08-12 10:51   ` Christoph Hellwig
  2023-08-12 17:04     ` Heiko Carstens
  0 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-08-12 10:51 UTC (permalink / raw)
  To: Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

The buildbot pointed out correctly (but rather late), that the special
s390/dasd export needs a _MODULE postfix, so this will have to be
folded in:

diff --git a/block/bdev.c b/block/bdev.c
index 2a035be7f3ee90..a20263fa27a462 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -967,7 +967,7 @@ void bdev_mark_dead(struct block_device *bdev, bool surprise)
 
 	invalidate_bdev(bdev);
 }
-#ifdef CONFIG_DASD
+#ifdef CONFIG_DASD_MODULE
 /*
  * Drivers should not use this directly, but the DASD driver has historically
  * had a shutdown to offline mode that doesn't actually remove the gendisk

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 13/17] block: consolidate __invalidate_device and fsync_bdev
  2023-08-12 10:51   ` Christoph Hellwig
@ 2023-08-12 17:04     ` Heiko Carstens
  2023-08-12 17:28       ` Heiko Carstens
  2023-08-12 20:43       ` Matthew Wilcox
  0 siblings, 2 replies; 46+ messages in thread
From: Heiko Carstens @ 2023-08-12 17:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Christian Brauner, Jens Axboe, Denis Efremov,
	Josef Bacik, Stefan Haberland, Jan Hoeppner, Vasily Gorbik,
	Alexander Gordeev, Darrick J . Wong, Chris Mason, David Sterba,
	linux-block, nbd, linux-s390, linux-btrfs, linux-fsdevel

On Sat, Aug 12, 2023 at 12:51:33PM +0200, Christoph Hellwig wrote:
> The buildbot pointed out correctly (but rather late), that the special
> s390/dasd export needs a _MODULE postfix, so this will have to be
> folded in:
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index 2a035be7f3ee90..a20263fa27a462 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -967,7 +967,7 @@ void bdev_mark_dead(struct block_device *bdev, bool surprise)
>  
>  	invalidate_bdev(bdev);
>  }
> -#ifdef CONFIG_DASD
> +#ifdef CONFIG_DASD_MODULE

This needs to be

#if IS_ENABLED(CONFIG_DASD)

to cover both CONFIG_DASD=y and CONFIG_DASD=m.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 13/17] block: consolidate __invalidate_device and fsync_bdev
  2023-08-12 17:04     ` Heiko Carstens
@ 2023-08-12 17:28       ` Heiko Carstens
  2023-08-12 20:43       ` Matthew Wilcox
  1 sibling, 0 replies; 46+ messages in thread
From: Heiko Carstens @ 2023-08-12 17:28 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Christoph Hellwig, Al Viro, Christian Brauner, Jens Axboe,
	Denis Efremov, Josef Bacik, Stefan Haberland, Jan Hoeppner,
	Vasily Gorbik, Alexander Gordeev, Darrick J . Wong, Chris Mason,
	David Sterba, linux-block, nbd, linux-s390, linux-btrfs,
	linux-fsdevel

On Sat, Aug 12, 2023 at 07:04:00PM +0200, Heiko Carstens wrote:
> On Sat, Aug 12, 2023 at 12:51:33PM +0200, Christoph Hellwig wrote:
> > The buildbot pointed out correctly (but rather late), that the special
> > s390/dasd export needs a _MODULE postfix, so this will have to be
> > folded in:
> > 
> > diff --git a/block/bdev.c b/block/bdev.c
> > index 2a035be7f3ee90..a20263fa27a462 100644
> > --- a/block/bdev.c
> > +++ b/block/bdev.c
> > @@ -967,7 +967,7 @@ void bdev_mark_dead(struct block_device *bdev, bool surprise)
> >  
> >  	invalidate_bdev(bdev);
> >  }
> > -#ifdef CONFIG_DASD
> > +#ifdef CONFIG_DASD_MODULE
> 
> This needs to be
> 
> #if IS_ENABLED(CONFIG_DASD)
> 
> to cover both CONFIG_DASD=y and CONFIG_DASD=m.

..but of course you only want the export for CONFIG_DASD=m.
So just ignore my comment, please.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 13/17] block: consolidate __invalidate_device and fsync_bdev
  2023-08-12 17:04     ` Heiko Carstens
  2023-08-12 17:28       ` Heiko Carstens
@ 2023-08-12 20:43       ` Matthew Wilcox
  1 sibling, 0 replies; 46+ messages in thread
From: Matthew Wilcox @ 2023-08-12 20:43 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Christoph Hellwig, Al Viro, Christian Brauner, Jens Axboe,
	Denis Efremov, Josef Bacik, Stefan Haberland, Jan Hoeppner,
	Vasily Gorbik, Alexander Gordeev, Darrick J . Wong, Chris Mason,
	David Sterba, linux-block, nbd, linux-s390, linux-btrfs,
	linux-fsdevel

On Sat, Aug 12, 2023 at 07:04:00PM +0200, Heiko Carstens wrote:
> On Sat, Aug 12, 2023 at 12:51:33PM +0200, Christoph Hellwig wrote:
> > The buildbot pointed out correctly (but rather late), that the special
> > s390/dasd export needs a _MODULE postfix, so this will have to be
> > folded in:
> > 
> > diff --git a/block/bdev.c b/block/bdev.c
> > index 2a035be7f3ee90..a20263fa27a462 100644
> > --- a/block/bdev.c
> > +++ b/block/bdev.c
> > @@ -967,7 +967,7 @@ void bdev_mark_dead(struct block_device *bdev, bool surprise)
> >  
> >  	invalidate_bdev(bdev);
> >  }
> > -#ifdef CONFIG_DASD
> > +#ifdef CONFIG_DASD_MODULE
> 
> This needs to be
> 
> #if IS_ENABLED(CONFIG_DASD)
> 
> to cover both CONFIG_DASD=y and CONFIG_DASD=m.

Sure, but if DASD is built-in, the symbol doesn't need to be exported

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: remove get_super
  2023-08-11 19:05 ` Josef Bacik
@ 2023-08-14 19:19   ` David Sterba
  0 siblings, 0 replies; 46+ messages in thread
From: David Sterba @ 2023-08-14 19:19 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Christoph Hellwig, Al Viro, Christian Brauner, Jens Axboe,
	Denis Efremov, Stefan Haberland, Jan Hoeppner, Heiko Carstens,
	Vasily Gorbik, Alexander Gordeev, Darrick J . Wong, Chris Mason,
	David Sterba, linux-block, nbd, linux-s390, linux-btrfs,
	linux-fsdevel

On Fri, Aug 11, 2023 at 03:05:07PM -0400, Josef Bacik wrote:
> The CI testing didn't show any regressions or lockdep errors related to moving
> the device opening around (which is what I was worried about). 

The problems with device opening and scanning we fixed in the past were
not hit by regular testing, syzbot had reproducers that were relatively
reliable so we'd need this kind of verification.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 05/17] btrfs: open block devices after superblock creation
  2023-08-11 13:11     ` David Sterba
@ 2023-08-17 13:24       ` David Sterba
  0 siblings, 0 replies; 46+ messages in thread
From: David Sterba @ 2023-08-17 13:24 UTC (permalink / raw)
  To: David Sterba
  Cc: Christian Brauner, Christoph Hellwig, Al Viro, Jens Axboe,
	Denis Efremov, Josef Bacik, Stefan Haberland, Jan Hoeppner,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

On Fri, Aug 11, 2023 at 03:11:31PM +0200, David Sterba wrote:
> On Fri, Aug 11, 2023 at 02:44:50PM +0200, Christian Brauner wrote:
> > On Fri, Aug 11, 2023 at 12:08:16PM +0200, Christoph Hellwig wrote:
> > > Currently btrfs_mount_root opens the block devices before committing to
> > > allocating a super block. That creates problems for restricting the
> > > number of writers to a device, and also leads to a unusual and not very
> > > helpful holder (the fs_type).
> > > 
> > > Reorganize the code to first check whether the superblock for a
> > > particular fsid does already exist and open the block devices only if it
> > > doesn't, mirroring the recent changes to the VFS mount helpers.  To do
> > > this the increment of the in_use counter moves out of btrfs_open_devices
> > > and into the only caller in btrfs_mount_root so that it happens before
> > > dropping uuid_mutex around the call to sget.
> > > 
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > ---
> > 
> > Looks good to me,
> > Acked-by: Christian Brauner <brauner@kernel.org>
> > 
> > And ofc, would be great to get btrfs reviews.
> 
> I'll take a look but there are some performance regressions to deal with
> and pre-merge window freeze so it won't be soon.

I'd rather take the btrfs patches via my tree and get them tested for a
longer time.  This patch in particular changes locking, mount, device
management, that's beyond what I'd consider safe to get merged outside
of btrfs.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: remove get_super
  2023-08-11 10:08 remove get_super Christoph Hellwig
                   ` (18 preceding siblings ...)
  2023-08-11 19:05 ` Josef Bacik
@ 2023-09-12 17:42 ` David Sterba
  2023-09-14  8:48   ` Jan Kara
  19 siblings, 1 reply; 46+ messages in thread
From: David Sterba @ 2023-09-12 17:42 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Christian Brauner, Jens Axboe, Denis Efremov,
	Josef Bacik, Stefan Haberland, Jan Hoeppner, Heiko Carstens,
	Vasily Gorbik, Alexander Gordeev, Darrick J . Wong, Chris Mason,
	David Sterba, linux-block, nbd, linux-s390, linux-btrfs,
	linux-fsdevel

On Fri, Aug 11, 2023 at 12:08:11PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series against the VFS vfs.super branch finishes off the work to remove
> get_super and move (almost) all upcalls to use the holder ops.
> 
> The first part is the missing btrfs bits so that all file systems use the
> super_block as holder.
> 
> The second part is various block driver cleanups so that we use proper
> interfaces instead of raw calls to __invalidate_device and fsync_bdev.
> 
> The last part than replaces __invalidate_device and fsync_bdev with upcalls
> to the file system through the holder ops, and finally removes get_super.
> 
> It leaves user_get_super and get_active_super around.  The former is not
> used for upcalls in the traditional sense, but for legacy UAPI that for
> some weird reason take a dev_t argument (ustat) or a block device path
> (quotactl).  get_active_super is only used for calling into the file system
> on freeze and should get a similar treatment, but given that Darrick has
> changes to that code queued up already this will be handled in the next
> merge window.
> 
> A git tree is available here:
> 
>     git://git.infradead.org/users/hch/misc.git remove-get_super

FYI, I've added patches 2-5 as a topic branch to btrfs for-next.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: remove get_super
  2023-09-12 17:42 ` David Sterba
@ 2023-09-14  8:48   ` Jan Kara
  2023-09-14 12:03     ` David Sterba
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Kara @ 2023-09-14  8:48 UTC (permalink / raw)
  To: David Sterba
  Cc: Christoph Hellwig, Al Viro, Christian Brauner, Jens Axboe,
	Denis Efremov, Josef Bacik, Stefan Haberland, Jan Hoeppner,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

On Tue 12-09-23 19:42:45, David Sterba wrote:
> On Fri, Aug 11, 2023 at 12:08:11PM +0200, Christoph Hellwig wrote:
> > Hi all,
> > 
> > this series against the VFS vfs.super branch finishes off the work to remove
> > get_super and move (almost) all upcalls to use the holder ops.
> > 
> > The first part is the missing btrfs bits so that all file systems use the
> > super_block as holder.
> > 
> > The second part is various block driver cleanups so that we use proper
> > interfaces instead of raw calls to __invalidate_device and fsync_bdev.
> > 
> > The last part than replaces __invalidate_device and fsync_bdev with upcalls
> > to the file system through the holder ops, and finally removes get_super.
> > 
> > It leaves user_get_super and get_active_super around.  The former is not
> > used for upcalls in the traditional sense, but for legacy UAPI that for
> > some weird reason take a dev_t argument (ustat) or a block device path
> > (quotactl).  get_active_super is only used for calling into the file system
> > on freeze and should get a similar treatment, but given that Darrick has
> > changes to that code queued up already this will be handled in the next
> > merge window.
> > 
> > A git tree is available here:
> > 
> >     git://git.infradead.org/users/hch/misc.git remove-get_super
> 
> FYI, I've added patches 2-5 as a topic branch to btrfs for-next.

Hum, I don't see them there. Some glitch somewhere?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: remove get_super
  2023-09-14  8:48   ` Jan Kara
@ 2023-09-14 12:03     ` David Sterba
  2023-09-14 12:54       ` Jan Kara
  2023-09-15 17:28       ` Jan Kara
  0 siblings, 2 replies; 46+ messages in thread
From: David Sterba @ 2023-09-14 12:03 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Al Viro, Christian Brauner, Jens Axboe,
	Denis Efremov, Josef Bacik, Stefan Haberland, Jan Hoeppner,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

On Thu, Sep 14, 2023 at 10:48:09AM +0200, Jan Kara wrote:
> On Tue 12-09-23 19:42:45, David Sterba wrote:
> > On Fri, Aug 11, 2023 at 12:08:11PM +0200, Christoph Hellwig wrote:
> > > Hi all,
> > > 
> > > this series against the VFS vfs.super branch finishes off the work to remove
> > > get_super and move (almost) all upcalls to use the holder ops.
> > > 
> > > The first part is the missing btrfs bits so that all file systems use the
> > > super_block as holder.
> > > 
> > > The second part is various block driver cleanups so that we use proper
> > > interfaces instead of raw calls to __invalidate_device and fsync_bdev.
> > > 
> > > The last part than replaces __invalidate_device and fsync_bdev with upcalls
> > > to the file system through the holder ops, and finally removes get_super.
> > > 
> > > It leaves user_get_super and get_active_super around.  The former is not
> > > used for upcalls in the traditional sense, but for legacy UAPI that for
> > > some weird reason take a dev_t argument (ustat) or a block device path
> > > (quotactl).  get_active_super is only used for calling into the file system
> > > on freeze and should get a similar treatment, but given that Darrick has
> > > changes to that code queued up already this will be handled in the next
> > > merge window.
> > > 
> > > A git tree is available here:
> > > 
> > >     git://git.infradead.org/users/hch/misc.git remove-get_super
> > 
> > FYI, I've added patches 2-5 as a topic branch to btrfs for-next.
> 
> Hum, I don't see them there. Some glitch somewhere?

There will be a delay before the patches show up in the pushed for-next
branch, some tests failed (maybe not related to this series) and there
are other merge conflicts that I need to resolve first.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: remove get_super
  2023-09-14 12:03     ` David Sterba
@ 2023-09-14 12:54       ` Jan Kara
  2023-09-15 17:28       ` Jan Kara
  1 sibling, 0 replies; 46+ messages in thread
From: Jan Kara @ 2023-09-14 12:54 UTC (permalink / raw)
  To: David Sterba
  Cc: Jan Kara, Christoph Hellwig, Al Viro, Christian Brauner,
	Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

On Thu 14-09-23 14:03:20, David Sterba wrote:
> On Thu, Sep 14, 2023 at 10:48:09AM +0200, Jan Kara wrote:
> > On Tue 12-09-23 19:42:45, David Sterba wrote:
> > > On Fri, Aug 11, 2023 at 12:08:11PM +0200, Christoph Hellwig wrote:
> > > > Hi all,
> > > > 
> > > > this series against the VFS vfs.super branch finishes off the work to remove
> > > > get_super and move (almost) all upcalls to use the holder ops.
> > > > 
> > > > The first part is the missing btrfs bits so that all file systems use the
> > > > super_block as holder.
> > > > 
> > > > The second part is various block driver cleanups so that we use proper
> > > > interfaces instead of raw calls to __invalidate_device and fsync_bdev.
> > > > 
> > > > The last part than replaces __invalidate_device and fsync_bdev with upcalls
> > > > to the file system through the holder ops, and finally removes get_super.
> > > > 
> > > > It leaves user_get_super and get_active_super around.  The former is not
> > > > used for upcalls in the traditional sense, but for legacy UAPI that for
> > > > some weird reason take a dev_t argument (ustat) or a block device path
> > > > (quotactl).  get_active_super is only used for calling into the file system
> > > > on freeze and should get a similar treatment, but given that Darrick has
> > > > changes to that code queued up already this will be handled in the next
> > > > merge window.
> > > > 
> > > > A git tree is available here:
> > > > 
> > > >     git://git.infradead.org/users/hch/misc.git remove-get_super
> > > 
> > > FYI, I've added patches 2-5 as a topic branch to btrfs for-next.
> > 
> > Hum, I don't see them there. Some glitch somewhere?
> 
> There will be a delay before the patches show up in the pushed for-next
> branch, some tests failed (maybe not related to this series) and there
> are other merge conflicts that I need to resolve first.

Ah, I see. Thanks for explanation!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: remove get_super
  2023-09-14 12:03     ` David Sterba
  2023-09-14 12:54       ` Jan Kara
@ 2023-09-15 17:28       ` Jan Kara
  1 sibling, 0 replies; 46+ messages in thread
From: Jan Kara @ 2023-09-15 17:28 UTC (permalink / raw)
  To: David Sterba
  Cc: Jan Kara, Christoph Hellwig, Al Viro, Christian Brauner,
	Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

On Thu 14-09-23 14:03:20, David Sterba wrote:
> On Thu, Sep 14, 2023 at 10:48:09AM +0200, Jan Kara wrote:
> > On Tue 12-09-23 19:42:45, David Sterba wrote:
> > > On Fri, Aug 11, 2023 at 12:08:11PM +0200, Christoph Hellwig wrote:
> > > > Hi all,
> > > > 
> > > > this series against the VFS vfs.super branch finishes off the work to remove
> > > > get_super and move (almost) all upcalls to use the holder ops.
> > > > 
> > > > The first part is the missing btrfs bits so that all file systems use the
> > > > super_block as holder.
> > > > 
> > > > The second part is various block driver cleanups so that we use proper
> > > > interfaces instead of raw calls to __invalidate_device and fsync_bdev.
> > > > 
> > > > The last part than replaces __invalidate_device and fsync_bdev with upcalls
> > > > to the file system through the holder ops, and finally removes get_super.
> > > > 
> > > > It leaves user_get_super and get_active_super around.  The former is not
> > > > used for upcalls in the traditional sense, but for legacy UAPI that for
> > > > some weird reason take a dev_t argument (ustat) or a block device path
> > > > (quotactl).  get_active_super is only used for calling into the file system
> > > > on freeze and should get a similar treatment, but given that Darrick has
> > > > changes to that code queued up already this will be handled in the next
> > > > merge window.
> > > > 
> > > > A git tree is available here:
> > > > 
> > > >     git://git.infradead.org/users/hch/misc.git remove-get_super
> > > 
> > > FYI, I've added patches 2-5 as a topic branch to btrfs for-next.
> > 
> > Hum, I don't see them there. Some glitch somewhere?
> 
> There will be a delay before the patches show up in the pushed for-next
> branch, some tests failed (maybe not related to this series) and there
> are other merge conflicts that I need to resolve first.

Thanks for picking up the patches, I can see them in your tree now. But
I've also noticed (by comparing my local branch with your tree), that in
this series is also a patch 6/17 "btrfs: use the super_block as holder when
mounting file systems" which you didn't pick up. It actually fixes block
device freezing for btrfs as a sideeffect as Christian found out [1]. Can
you please pick it up as well? Thanks!

								Honza

[1] https://lore.kernel.org/all/20230908-merklich-bebauen-11914a630db4@brauner

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl
  2023-08-11 10:08 ` [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl Christoph Hellwig
@ 2023-09-20 20:41   ` Samuel Holland
  2023-09-25  7:48     ` Christoph Hellwig
  0 siblings, 1 reply; 46+ messages in thread
From: Samuel Holland @ 2023-09-20 20:41 UTC (permalink / raw)
  To: Christoph Hellwig, Al Viro, Christian Brauner
  Cc: Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel

On 2023-08-11 5:08 AM, Christoph Hellwig wrote:
> nbd_clear_sock_ioctl kills the socket and with that the block
> device.  Instead of just invalidating file system buffers,
> mark the device as dead, which will also invalidate the buffers
> as part of the proper shutdown sequence.  This also includes
> invalidating partitions if there are any.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/block/nbd.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 8576d696c7a221..42e0159bb258fa 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -1434,12 +1434,10 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd)
>  	return ret;
>  }
>  
> -static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
> -				 struct block_device *bdev)
> +static void nbd_clear_sock_ioctl(struct nbd_device *nbd)
>  {
> +	blk_mark_disk_dead(nbd->disk);
>  	nbd_clear_sock(nbd);
> -	__invalidate_device(bdev, true);
> -	nbd_bdev_reset(nbd);

This change breaks nbd-client, which calls the NBD_CLEAR_SOCK ioctl during
device setup and socket reconnection. After merging this series (bisected to
511fb5bafed1), all NBD devices are immediately dead on arrival:

[   14.605849] nbd0: detected capacity change from 0 to 4194304

[   14.606211] Buffer I/O error on dev nbd0, logical block 0, async page read
[   14.619101] Buffer I/O error on dev nbd0, logical block 0, async page read

[   14.630490]  nbd0: unable to read partition table

I wonder if disk_force_media_change() is the right thing to call here instead.

Regards,
Samuel

>  	if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF,
>  			       &nbd->config->runtime_flags))
>  		nbd_config_put(nbd);
> @@ -1465,7 +1463,7 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
>  	case NBD_DISCONNECT:
>  		return nbd_disconnect(nbd);
>  	case NBD_CLEAR_SOCK:
> -		nbd_clear_sock_ioctl(nbd, bdev);
> +		nbd_clear_sock_ioctl(nbd);
>  		return 0;
>  	case NBD_SET_SOCK:
>  		return nbd_add_socket(nbd, arg, false);


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl
  2023-09-20 20:41   ` Samuel Holland
@ 2023-09-25  7:48     ` Christoph Hellwig
  2023-10-01 17:10       ` Wouter Verhelst
  0 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-09-25  7:48 UTC (permalink / raw)
  To: Samuel Holland
  Cc: Christoph Hellwig, Al Viro, Christian Brauner, Jens Axboe,
	Denis Efremov, Josef Bacik, Stefan Haberland, Jan Hoeppner,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel, Shinichiro Kawasaki

On Wed, Sep 20, 2023 at 03:41:11PM -0500, Samuel Holland wrote:
> [   14.619101] Buffer I/O error on dev nbd0, logical block 0, async page read
> 
> [   14.630490]  nbd0: unable to read partition table
> 
> I wonder if disk_force_media_change() is the right thing to call here instead.

So what are the semantics of clearing the socket?

The <= 6.5 behavior of invalidating fs caches, but not actually marking
the fs shutdown is pretty broken, especially if this expects to resurrect
the device and thus the file system later on.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl
  2023-09-25  7:48     ` Christoph Hellwig
@ 2023-10-01 17:10       ` Wouter Verhelst
  2023-10-02  6:21         ` Christoph Hellwig
  0 siblings, 1 reply; 46+ messages in thread
From: Wouter Verhelst @ 2023-10-01 17:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Samuel Holland, Al Viro, Christian Brauner, Jens Axboe,
	Denis Efremov, Josef Bacik, Stefan Haberland, Jan Hoeppner,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel, Shinichiro Kawasaki

On Mon, Sep 25, 2023 at 09:48:38AM +0200, Christoph Hellwig wrote:
> On Wed, Sep 20, 2023 at 03:41:11PM -0500, Samuel Holland wrote:
> > [   14.619101] Buffer I/O error on dev nbd0, logical block 0, async page read
> > 
> > [   14.630490]  nbd0: unable to read partition table
> > 
> > I wonder if disk_force_media_change() is the right thing to call here instead.
> 
> So what are the semantics of clearing the socket?
> 
> The <= 6.5 behavior of invalidating fs caches, but not actually marking
> the fs shutdown is pretty broken, especially if this expects to resurrect
> the device and thus the file system later on.

nbd-client -d calls

ioctl(nbd, NBD_DISCONNECT);
ioctl(nbd, NBD_CLEAR_SOCK);

(error handling removed for clarity)

where "nbd" is the file handle to the nbd device. This expects that the
device is cleared and that then the device can be reused for a different
connection, much like "losetup -d". Expecting that the next connection
would talk to the same file system is wrong.

In netlink mode, it obviously doesn't use the ioctl()s, but instead
sends an NBD_CMD_DISCONNECT command, without any NBD_CLEAR_SOCK, for
which no equivalent message exists. At this point, obviously the same
result is expected in userspace, i.e., the device should now be
available for the next connection that may or may not be the same one.

nbd-client also has "-persist" option that used to work. This does
expect to resurrect the device and file system. It depends on semantics
where the kernel would block IO to the device until the nbd-client
process that initiated the connection exits, thus allowing it to
re-establish the connection if possible. When doing this, we don't issue
a DISCONNECT or CLEAR_SOCK message and obviously the client is expected
to re-establish a connection to the same device, thus some state should
be retained.

These semantics have however been broken at some point over the past decade
or so, but I didn't notice that at the time, so I didn't complain, and
it's therefore probably not relevant anymore. We should perhaps rethink
whether this is still a good idea given the way the netlink mode does
not have a client waiting for a return of the ioctl() call, and if so
how to implement a replacement.

Kind regards,

-- 
     w@uter.{be,co.za}
wouter@{grep.be,fosdem.org,debian.org}

I will have a Tin-Actinium-Potassium mixture, thanks.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl
  2023-10-01 17:10       ` Wouter Verhelst
@ 2023-10-02  6:21         ` Christoph Hellwig
  2023-10-02 19:15           ` Samuel Holland
  0 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2023-10-02  6:21 UTC (permalink / raw)
  To: Wouter Verhelst
  Cc: Christoph Hellwig, Samuel Holland, Al Viro, Christian Brauner,
	Jens Axboe, Denis Efremov, Josef Bacik, Stefan Haberland,
	Jan Hoeppner, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Darrick J . Wong, Chris Mason, David Sterba, linux-block, nbd,
	linux-s390, linux-btrfs, linux-fsdevel, Shinichiro Kawasaki

On Sun, Oct 01, 2023 at 07:10:53PM +0200, Wouter Verhelst wrote:
> > So what are the semantics of clearing the socket?
> > 
> > The <= 6.5 behavior of invalidating fs caches, but not actually marking
> > the fs shutdown is pretty broken, especially if this expects to resurrect
> > the device and thus the file system later on.
> 
> nbd-client -d calls
> 
> ioctl(nbd, NBD_DISCONNECT);
> ioctl(nbd, NBD_CLEAR_SOCK);
> 
> (error handling removed for clarity)
> 
> where "nbd" is the file handle to the nbd device. This expects that the
> device is cleared and that then the device can be reused for a different
> connection, much like "losetup -d". Expecting that the next connection
> would talk to the same file system is wrong.

So a fs shutdown seems like a the right thing.  So I'm a little confused
on what actualy broke here.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl
  2023-10-02  6:21         ` Christoph Hellwig
@ 2023-10-02 19:15           ` Samuel Holland
  0 siblings, 0 replies; 46+ messages in thread
From: Samuel Holland @ 2023-10-02 19:15 UTC (permalink / raw)
  To: Christoph Hellwig, Wouter Verhelst
  Cc: Al Viro, Christian Brauner, Jens Axboe, Denis Efremov,
	Josef Bacik, Stefan Haberland, Jan Hoeppner, Heiko Carstens,
	Vasily Gorbik, Alexander Gordeev, Darrick J . Wong, Chris Mason,
	David Sterba, linux-block, nbd, linux-s390, linux-btrfs,
	linux-fsdevel, Shinichiro Kawasaki

On 2023-10-02 1:21 AM, Christoph Hellwig wrote:
> On Sun, Oct 01, 2023 at 07:10:53PM +0200, Wouter Verhelst wrote:
>>> So what are the semantics of clearing the socket?
>>>
>>> The <= 6.5 behavior of invalidating fs caches, but not actually marking
>>> the fs shutdown is pretty broken, especially if this expects to resurrect
>>> the device and thus the file system later on.
>>
>> nbd-client -d calls
>>
>> ioctl(nbd, NBD_DISCONNECT);
>> ioctl(nbd, NBD_CLEAR_SOCK);
>>
>> (error handling removed for clarity)
>>
>> where "nbd" is the file handle to the nbd device. This expects that the
>> device is cleared and that then the device can be reused for a different
>> connection, much like "losetup -d". Expecting that the next connection
>> would talk to the same file system is wrong.
> 
> So a fs shutdown seems like a the right thing.  So I'm a little confused
> on what actualy broke here.

I'm not too familiar with the block subsystem, but my understanding is that
blk_mark_disk_dead(nbd->disk) is permanent -- there is no way to un-mark a disk
as dead. So this makes the device (e.g. /dev/nbd0) permanently unusable after
the call to ioctl(nbd, NBD_CLEAR_SOCK).

Like Wouter said, the semantics should be similar to a loop device, where the
same block device can be reused after being disconnected. That was why I
suggested disk_force_media_change() as called from __loop_clr_fd(). The loop
driver doesn't call blk_mark_disk_dead() anywhere.

Regards,
Samuel


^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2023-10-02 19:15 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-11 10:08 remove get_super Christoph Hellwig
2023-08-11 10:08 ` [PATCH 01/17] FOLD: reverts part of "fs: use the super_block as holder when mounting file systems" Christoph Hellwig
2023-08-11 10:44   ` Christian Brauner
2023-08-11 10:08 ` [PATCH 02/17] btrfs: always open the device read-only in btrfs_scan_one_device Christoph Hellwig
2023-08-11 12:00   ` Christian Brauner
2023-08-11 10:08 ` [PATCH 03/17] btrfs: call btrfs_close_devices from ->kill_sb Christoph Hellwig
2023-08-11 12:03   ` Christian Brauner
2023-08-11 10:08 ` [PATCH 04/17] btrfs: split btrfs_fs_devices.opened Christoph Hellwig
2023-08-11 12:40   ` Christian Brauner
2023-08-11 10:08 ` [PATCH 05/17] btrfs: open block devices after superblock creation Christoph Hellwig
2023-08-11 12:44   ` Christian Brauner
2023-08-11 13:11     ` David Sterba
2023-08-17 13:24       ` David Sterba
2023-08-11 10:08 ` [PATCH 06/17] btrfs: use the super_block as holder when mounting file systems Christoph Hellwig
2023-08-11 12:45   ` Christian Brauner
2023-08-11 10:08 ` [PATCH 07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl Christoph Hellwig
2023-09-20 20:41   ` Samuel Holland
2023-09-25  7:48     ` Christoph Hellwig
2023-10-01 17:10       ` Wouter Verhelst
2023-10-02  6:21         ` Christoph Hellwig
2023-10-02 19:15           ` Samuel Holland
2023-08-11 10:08 ` [PATCH 08/17] block: simplify the disk_force_media_change interface Christoph Hellwig
2023-08-11 10:08 ` [PATCH 09/17] floppy: call disk_force_media_change when changing the format Christoph Hellwig
2023-08-11 10:08 ` [PATCH 10/17] amiflop: don't call fsync_bdev in FDFMTBEG Christoph Hellwig
2023-08-11 10:08 ` [PATCH 11/17] dasd: also call __invalidate_device when setting the device offline Christoph Hellwig
2023-08-11 10:08 ` [PATCH 12/17] block: drop the "busy inodes on changed media" log message Christoph Hellwig
2023-08-11 10:08 ` [PATCH 13/17] block: consolidate __invalidate_device and fsync_bdev Christoph Hellwig
2023-08-12 10:51   ` Christoph Hellwig
2023-08-12 17:04     ` Heiko Carstens
2023-08-12 17:28       ` Heiko Carstens
2023-08-12 20:43       ` Matthew Wilcox
2023-08-11 10:08 ` [PATCH 14/17] block: call into the file system for bdev_mark_dead Christoph Hellwig
2023-08-11 10:08 ` [PATCH 15/17] block: call into the file system for ioctl BLKFLSBUF Christoph Hellwig
2023-08-11 14:06   ` Josef Bacik
2023-08-11 10:08 ` [PATCH 16/17] fs: remove get_super Christoph Hellwig
2023-08-11 12:46   ` Christian Brauner
2023-08-11 10:08 ` [PATCH 17/17] fs: simplify invalidate_inodes Christoph Hellwig
2023-08-11 12:48   ` Christian Brauner
2023-08-11 13:58 ` remove get_super Josef Bacik
2023-08-11 19:05 ` Josef Bacik
2023-08-14 19:19   ` David Sterba
2023-09-12 17:42 ` David Sterba
2023-09-14  8:48   ` Jan Kara
2023-09-14 12:03     ` David Sterba
2023-09-14 12:54       ` Jan Kara
2023-09-15 17:28       ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).