All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature
@ 2023-08-03 15:43 Guilherme G. Piccoli
  2023-08-03 15:43 ` [PATCH 1/3] btrfs-progs: Add the single-dev feature (to both mkfs/tune) Guilherme G. Piccoli
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-03 15:43 UTC (permalink / raw)
  To: linux-btrfs
  Cc: clm, josef, dsterba, linux-fsdevel, kernel, gpiccoli, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

Hi all, this is the 2nd attempt of supporting same fsid mounting
on btrfs. V1 is here:
https://lore.kernel.org/linux-btrfs/20230504170708.787361-1-gpiccoli@igalia.com/

The mechanism used to achieve that in V2 was a mix between the suggestion
from JohnS (spoofed fsid) and Qu (a single-dev compat_ro flag) - it is
still based in the metadata_uuid feature, leveraging that infrastructure
since it prevents lots of corner cases, like sysfs same-fsid crashes.

The patches are based on kernel v6.5-rc3 with Anand's metadata_uuid refactor
part 2 on top of it [0]; the btrfs-progs patch is based on "v6.3.3".

Comments/suggestions and overall feedback is much appreciated - tnx in advance!
Cheers,

Guilherme


[0] https://lore.kernel.org/linux-btrfs/cover.1690792823.git.anand.jain@oracle.com/


Guilherme G. Piccoli (3):
  btrfs-progs: Add the single-dev feature (to both mkfs/tune)
  btrfs: Introduce the single-dev feature
  btrfs: Add parameter to force devices behave as single-dev ones

btrfs-progs:
 common/fsfeatures.c        |  7 ++++
 kernel-shared/ctree.h      |  3 +-
 kernel-shared/uapi/btrfs.h |  7 ++++
 mkfs/main.c                |  4 ++-
 tune/main.c                | 72 +++++++++++++++++++++++---------------
 5 files changed, 63 insertions(+), 30 deletions(-)

kernel:
 fs/btrfs/disk-io.c         |  19 +++++-
 fs/btrfs/fs.h              |   3 +-
 fs/btrfs/ioctl.c           |  18 +++++
 fs/btrfs/super.c           |  13 ++--
 fs/btrfs/super.h           |   2 +
 fs/btrfs/volumes.c         | 136 +++++++++++++++++++++++++++++++------
 fs/btrfs/volumes.h         |   5 +-
 include/uapi/linux/btrfs.h |   7 ++
 8 files changed, 175 insertions(+), 28 deletions(-)

-- 
2.41.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/3] btrfs-progs: Add the single-dev feature (to both mkfs/tune)
  2023-08-03 15:43 [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature Guilherme G. Piccoli
@ 2023-08-03 15:43 ` Guilherme G. Piccoli
  2023-08-17 15:46   ` Josef Bacik
  2023-08-03 15:43 ` [PATCH 2/3] btrfs: Introduce the single-dev feature Guilherme G. Piccoli
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-03 15:43 UTC (permalink / raw)
  To: linux-btrfs
  Cc: clm, josef, dsterba, linux-fsdevel, kernel, gpiccoli, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

The single-dev feature allows a device to be mounted regardless of
its fsid already being present in another device - in other words,
this feature disables RAID modes / metadata_uuid, allowing a single
device per filesystem. Its goal is mainly to allow mounting the
same fsid at the same time in the system.

Introduce hereby the feature to both mkfs (-O single-dev) and
btrfstune (-s), syncing the kernel-shared headers as well. The
feature is a compat_ro, its kernel version was set to v6.5.

Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
---

Hi folks, thanks in advance for reviews! Notice that I've added
the feature to btrfstune as well, but I found docs online saying
this tool is deprecated..so not sure if that was the proper approach.

Also, a design decision: I've skipped the btrfs_register_one_device()
call when mkfs was just used with the single-dev tuning, or else
it shows a (harmless) error and succeeds, since of course scanning
fails for such devices, as per the feature implementation.
So, I thought it was more straightforward to just skip the call itself.

Cheers,

Guilherme

 common/fsfeatures.c        |  7 ++++
 kernel-shared/ctree.h      |  3 +-
 kernel-shared/uapi/btrfs.h |  7 ++++
 mkfs/main.c                |  4 ++-
 tune/main.c                | 72 +++++++++++++++++++++++---------------
 5 files changed, 63 insertions(+), 30 deletions(-)

diff --git a/common/fsfeatures.c b/common/fsfeatures.c
index 00658fa5159f..a320b7062b8c 100644
--- a/common/fsfeatures.c
+++ b/common/fsfeatures.c
@@ -160,6 +160,13 @@ static const struct btrfs_feature mkfs_features[] = {
 		VERSION_NULL(default),
 		.desc		= "RAID1 with 3 or 4 copies"
 	},
+	{
+		.name		= "single-dev",
+		.compat_ro_flag	= BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV,
+		.sysfs_name	= "single_dev",
+		VERSION_TO_STRING2(compat, 6,5),
+		.desc		= "single device (allows same fsid mounting)"
+	},
 #ifdef BTRFS_ZONED
 	{
 		.name		= "zoned",
diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h
index 59533879b939..e3fd834aa6dd 100644
--- a/kernel-shared/ctree.h
+++ b/kernel-shared/ctree.h
@@ -86,7 +86,8 @@ static inline u32 __BTRFS_LEAF_DATA_SIZE(u32 nodesize)
 	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |	\
 	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID | \
 	 BTRFS_FEATURE_COMPAT_RO_VERITY |		\
-	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
+	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE |	\
+	 BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)
 
 #if EXPERIMENTAL
 #define BTRFS_FEATURE_INCOMPAT_SUPP			\
diff --git a/kernel-shared/uapi/btrfs.h b/kernel-shared/uapi/btrfs.h
index 85b04f89a2a9..2e0ee6ef6446 100644
--- a/kernel-shared/uapi/btrfs.h
+++ b/kernel-shared/uapi/btrfs.h
@@ -336,6 +336,13 @@ _static_assert(sizeof(struct btrfs_ioctl_fs_info_args) == 1024);
  */
 #define BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE	(1ULL << 3)
 
+/*
+ * Single devices (as flagged by the corresponding compat_ro flag) only
+ * gets scanned during mount time; also, a random fsid is generated for
+ * them, in order to cope with same-fsid filesystem mounts.
+ */
+#define BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV		(1ULL << 4)
+
 #define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF	(1ULL << 0)
 #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL	(1ULL << 1)
 #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS	(1ULL << 2)
diff --git a/mkfs/main.c b/mkfs/main.c
index 972ed1112ea6..429799932224 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1025,6 +1025,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	char *label = NULL;
 	int nr_global_roots = sysconf(_SC_NPROCESSORS_ONLN);
 	char *source_dir = NULL;
+	bool single_dev;
 
 	cpu_detect_flags();
 	hash_init_accel();
@@ -1218,6 +1219,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		usage(&mkfs_cmd, 1);
 
 	opt_zoned = !!(features.incompat_flags & BTRFS_FEATURE_INCOMPAT_ZONED);
+	single_dev = !!(features.compat_ro_flags & BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV);
 
 	if (source_dir && device_count > 1) {
 		error("the option -r is limited to a single device");
@@ -1815,7 +1817,7 @@ out:
 		device_count = argc - optind;
 		while (device_count-- > 0) {
 			file = argv[optind++];
-			if (path_is_block_device(file) == 1)
+			if (path_is_block_device(file) == 1 && !single_dev)
 				btrfs_register_one_device(file);
 		}
 	}
diff --git a/tune/main.c b/tune/main.c
index 0ca1e01282c9..95e55fcda44f 100644
--- a/tune/main.c
+++ b/tune/main.c
@@ -42,27 +42,31 @@
 #include "tune/tune.h"
 #include "check/clear-cache.h"
 
+#define SET_SUPER_FLAGS(type) \
+static int set_super_##type##_flags(struct btrfs_root *root, u64 flags) \
+{									\
+	struct btrfs_trans_handle *trans;				\
+	struct btrfs_super_block *disk_super;				\
+	u64 super_flags;						\
+	int ret;							\
+									\
+	disk_super = root->fs_info->super_copy;				\
+	super_flags = btrfs_super_##type##_flags(disk_super);		\
+	super_flags |= flags;						\
+	trans = btrfs_start_transaction(root, 1);			\
+	BUG_ON(IS_ERR(trans));						\
+	btrfs_set_super_##type##_flags(disk_super, super_flags);	\
+	ret = btrfs_commit_transaction(trans, root);			\
+									\
+	return ret;							\
+}
+
+SET_SUPER_FLAGS(incompat)
+SET_SUPER_FLAGS(compat_ro)
+
 static char *device;
 static int force = 0;
 
-static int set_super_incompat_flags(struct btrfs_root *root, u64 flags)
-{
-	struct btrfs_trans_handle *trans;
-	struct btrfs_super_block *disk_super;
-	u64 super_flags;
-	int ret;
-
-	disk_super = root->fs_info->super_copy;
-	super_flags = btrfs_super_incompat_flags(disk_super);
-	super_flags |= flags;
-	trans = btrfs_start_transaction(root, 1);
-	BUG_ON(IS_ERR(trans));
-	btrfs_set_super_incompat_flags(disk_super, super_flags);
-	ret = btrfs_commit_transaction(trans, root);
-
-	return ret;
-}
-
 static int convert_to_fst(struct btrfs_fs_info *fs_info)
 {
 	int ret;
@@ -102,6 +106,7 @@ static const char * const tune_usage[] = {
 	OPTLINE("-r", "enable extended inode refs (mkfs: extref, for hardlink limits)"),
 	OPTLINE("-x", "enable skinny metadata extent refs (mkfs: skinny-metadata)"),
 	OPTLINE("-n", "enable no-holes feature (mkfs: no-holes, more efficient sparse file representation)"),
+	OPTLINE("-s", "enable single device feature (mkfs: single-dev, allows same fsid mounting)"),
 	OPTLINE("-S <0|1>", "set/unset seeding status of a device"),
 	OPTLINE("--convert-to-block-group-tree", "convert filesystem to track block groups in "
 			"the separate block-group-tree instead of extent tree (sets the incompat bit)"),
@@ -146,7 +151,8 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 	int csum_type = -1;
 	char *new_fsid_str = NULL;
 	int ret;
-	u64 super_flags = 0;
+	u64 compat_ro_flags = 0;
+	u64 incompat_flags = 0;
 	int fd = -1;
 
 	btrfs_config_init();
@@ -169,7 +175,7 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 #endif
 			{ NULL, 0, NULL, 0 }
 		};
-		int c = getopt_long(argc, argv, "S:rxfuU:nmM:", long_options, NULL);
+		int c = getopt_long(argc, argv, "S:rxfuU:nsmM:", long_options, NULL);
 
 		if (c < 0)
 			break;
@@ -179,13 +185,16 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 			seeding_value = arg_strtou64(optarg);
 			break;
 		case 'r':
-			super_flags |= BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF;
+			incompat_flags |= BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF;
 			break;
 		case 'x':
-			super_flags |= BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA;
+			incompat_flags |= BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA;
 			break;
 		case 'n':
-			super_flags |= BTRFS_FEATURE_INCOMPAT_NO_HOLES;
+			incompat_flags |= BTRFS_FEATURE_INCOMPAT_NO_HOLES;
+			break;
+		case 's':
+			compat_ro_flags |= BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV;
 			break;
 		case 'f':
 			force = 1;
@@ -239,9 +248,9 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 		error("random fsid can't be used with specified fsid");
 		return 1;
 	}
-	if (!super_flags && !seeding_flag && !(random_fsid || new_fsid_str) &&
-	    !change_metadata_uuid && csum_type == -1 && !to_bg_tree &&
-	    !to_extent_tree && !to_fst) {
+	if (!compat_ro_flags && !incompat_flags && !seeding_flag &&
+	    !(random_fsid || new_fsid_str) && !change_metadata_uuid &&
+	    csum_type == -1 && !to_bg_tree && !to_extent_tree && !to_fst) {
 		error("at least one option should be specified");
 		usage(&tune_cmd, 1);
 		return 1;
@@ -363,8 +372,15 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 		total++;
 	}
 
-	if (super_flags) {
-		ret = set_super_incompat_flags(root, super_flags);
+	if (incompat_flags) {
+		ret = set_super_incompat_flags(root, incompat_flags);
+		if (!ret)
+			success++;
+		total++;
+	}
+
+	if (compat_ro_flags) {
+		ret = set_super_compat_ro_flags(root, compat_ro_flags);
 		if (!ret)
 			success++;
 		total++;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-03 15:43 [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature Guilherme G. Piccoli
  2023-08-03 15:43 ` [PATCH 1/3] btrfs-progs: Add the single-dev feature (to both mkfs/tune) Guilherme G. Piccoli
@ 2023-08-03 15:43 ` Guilherme G. Piccoli
  2023-08-04  8:27   ` Qu Wenruo
                     ` (3 more replies)
  2023-08-03 15:43 ` [PATCH 3/3] btrfs: Add parameter to force devices behave as single-dev ones Guilherme G. Piccoli
                   ` (2 subsequent siblings)
  4 siblings, 4 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-03 15:43 UTC (permalink / raw)
  To: linux-btrfs
  Cc: clm, josef, dsterba, linux-fsdevel, kernel, gpiccoli, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

Btrfs doesn't currently support to mount 2 different devices holding the
same filesystem - the fsid is used as a unique identifier in the driver.
This case is supported though in some other common filesystems, like
ext4; one of the reasons for which is not trivial supporting this case
on btrfs is due to its multi-device filesystem nature, native RAID, etc.

Supporting the same-fsid mounts has the advantage of allowing btrfs to
be used in A/B partitioned devices, like mobile phones or the Steam Deck
for example. Without this support, it's not safe for users to keep the
same "image version" in both A and B partitions, a setup that is quite
common for development, for example. Also, as a big bonus, it allows fs
integrity check based on block devices for RO devices (whereas currently
it is required that both have different fsid, breaking the block device
hash comparison).

Such same-fsid mounting is hereby added through the usage of the
filesystem feature "single-dev" - when such feature is used, btrfs
generates a random fsid for the filesystem and leverages the long-term
present metadata_uuid infrastructure to enable the usage of this
secondary virtual fsid, effectively requiring few non-invasive changes
to the code and no new potential corner cases.

In order to prevent more code complexity and corner cases, given
the nature of this mechanism (single-devices), the single-dev feature
is not allowed when the metadata_uuid flag is already present on the
fs, or if the device is on fsid-change state. Device removal/replace
is also disabled for devices presenting the single-dev feature.

Suggested-by: John Schoenick <johns@valvesoftware.com>
Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
---
 fs/btrfs/disk-io.c         | 19 +++++++-
 fs/btrfs/fs.h              |  3 +-
 fs/btrfs/ioctl.c           | 18 +++++++
 fs/btrfs/super.c           |  8 ++--
 fs/btrfs/volumes.c         | 97 ++++++++++++++++++++++++++++++--------
 fs/btrfs/volumes.h         |  3 +-
 include/uapi/linux/btrfs.h |  7 +++
 7 files changed, 127 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 669b10355091..455fa4949c98 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -320,7 +320,7 @@ static bool check_tree_block_fsid(struct extent_buffer *eb)
 	/*
 	 * alloc_fs_devices() copies the fsid into metadata_uuid if the
 	 * metadata_uuid is unset in the superblock, including for a seed device.
-	 * So, we can use fs_devices->metadata_uuid.
+	 * So, we can use fs_devices->metadata_uuid; same for SINGLE_DEV devices.
 	 */
 	if (!memcmp(fsid, fs_info->fs_devices->metadata_uuid, BTRFS_FSID_SIZE))
 		return false;
@@ -2288,6 +2288,7 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
 {
 	u64 nodesize = btrfs_super_nodesize(sb);
 	u64 sectorsize = btrfs_super_sectorsize(sb);
+	u8 *fsid;
 	int ret = 0;
 
 	if (btrfs_super_magic(sb) != BTRFS_MAGIC) {
@@ -2368,7 +2369,21 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
 		ret = -EINVAL;
 	}
 
-	if (memcmp(fs_info->fs_devices->fsid, sb->fsid, BTRFS_FSID_SIZE)) {
+	/*
+	 * For SINGLE_DEV devices, btrfs creates a random fsid and makes
+	 * use of the metadata_uuid infrastructure in order to allow, for
+	 * example, two devices with same fsid getting mounted at the same
+	 * time. But notice no changes happen at the disk level, so the
+	 * random generated fsid is a driver abstraction, not to be written
+	 * in the disk. That's the reason we're required here to compare the
+	 * fsid with the metadata_uuid for such devices.
+	 */
+	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV))
+		fsid = fs_info->fs_devices->metadata_uuid;
+	else
+		fsid = fs_info->fs_devices->fsid;
+
+	if (memcmp(fsid, sb->fsid, BTRFS_FSID_SIZE)) {
 		btrfs_err(fs_info,
 		"superblock fsid doesn't match fsid of fs_devices: %pU != %pU",
 			  sb->fsid, fs_info->fs_devices->fsid);
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 203d2a267828..c6d124973361 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -200,7 +200,8 @@ enum {
 	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |	\
 	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID | \
 	 BTRFS_FEATURE_COMPAT_RO_VERITY |		\
-	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
+	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE |	\
+	 BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)
 
 #define BTRFS_FEATURE_COMPAT_RO_SAFE_SET	0ULL
 #define BTRFS_FEATURE_COMPAT_RO_SAFE_CLEAR	0ULL
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a895d105464b..56703d87def9 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2678,6 +2678,12 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
+	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
+		btrfs_err(fs_info,
+			  "device removal is unsupported on SINGLE_DEV devices\n");
+		return -EINVAL;
+	}
+
 	vol_args = memdup_user(arg, sizeof(*vol_args));
 	if (IS_ERR(vol_args))
 		return PTR_ERR(vol_args);
@@ -2744,6 +2750,12 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
+	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
+		btrfs_err(fs_info,
+			  "device removal is unsupported on SINGLE_DEV devices\n");
+		return -EINVAL;
+	}
+
 	vol_args = memdup_user(arg, sizeof(*vol_args));
 	if (IS_ERR(vol_args))
 		return PTR_ERR(vol_args);
@@ -3268,6 +3280,12 @@ static long btrfs_ioctl_dev_replace(struct btrfs_fs_info *fs_info,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
+	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
+		btrfs_err(fs_info,
+			  "device removal is unsupported on SINGLE_DEV devices\n");
+		return -EINVAL;
+	}
+
 	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
 		btrfs_err(fs_info, "device replace not supported on extent tree v2 yet");
 		return -EINVAL;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f1dd172d8d5b..ee87189b1ccd 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -883,7 +883,7 @@ static int btrfs_parse_device_options(const char *options, blk_mode_t flags)
 				error = -ENOMEM;
 				goto out;
 			}
-			device = btrfs_scan_one_device(device_name, flags);
+			device = btrfs_scan_one_device(device_name, flags, true);
 			kfree(device_name);
 			if (IS_ERR(device)) {
 				error = PTR_ERR(device);
@@ -1478,7 +1478,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 		goto error_fs_info;
 	}
 
-	device = btrfs_scan_one_device(device_name, mode);
+	device = btrfs_scan_one_device(device_name, mode, true);
 	if (IS_ERR(device)) {
 		mutex_unlock(&uuid_mutex);
 		error = PTR_ERR(device);
@@ -2190,7 +2190,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 	switch (cmd) {
 	case BTRFS_IOC_SCAN_DEV:
 		mutex_lock(&uuid_mutex);
-		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
+		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
 		ret = PTR_ERR_OR_ZERO(device);
 		mutex_unlock(&uuid_mutex);
 		break;
@@ -2204,7 +2204,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 		break;
 	case BTRFS_IOC_DEVICES_READY:
 		mutex_lock(&uuid_mutex);
-		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
+		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
 		if (IS_ERR(device)) {
 			mutex_unlock(&uuid_mutex);
 			ret = PTR_ERR(device);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 73753dae111a..433a490f2de8 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -681,12 +681,14 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 	return -EINVAL;
 }
 
-static u8 *btrfs_sb_metadata_uuid_or_null(struct btrfs_super_block *sb)
+static u8 *btrfs_sb_metadata_uuid_single_dev(struct btrfs_super_block *sb,
+					     bool has_metadata_uuid,
+					     bool single_dev)
 {
-	bool has_metadata_uuid = (btrfs_super_incompat_flags(sb) &
-				  BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
+	if (has_metadata_uuid || single_dev)
+		return sb->metadata_uuid;
 
-	return has_metadata_uuid ? sb->metadata_uuid : NULL;
+	return NULL;
 }
 
 u8 *btrfs_sb_fsid_ptr(struct btrfs_super_block *sb)
@@ -775,8 +777,36 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
 
 	return NULL;
 }
+
+static void prepare_virtual_fsid(struct btrfs_super_block *disk_super,
+				 const char *path)
+{
+	struct btrfs_fs_devices *fs_devices;
+	u8 vfsid[BTRFS_FSID_SIZE];
+	bool dup_fsid = true;
+
+	while (dup_fsid) {
+		dup_fsid = false;
+		generate_random_uuid(vfsid);
+
+		list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
+			if (!memcmp(vfsid, fs_devices->fsid, BTRFS_FSID_SIZE) ||
+			    !memcmp(vfsid, fs_devices->metadata_uuid,
+				    BTRFS_FSID_SIZE))
+				dup_fsid = true;
+		}
+	}
+
+	memcpy(disk_super->metadata_uuid, disk_super->fsid, BTRFS_FSID_SIZE);
+	memcpy(disk_super->fsid, vfsid, BTRFS_FSID_SIZE);
+
+	pr_info("BTRFS: virtual fsid (%pU) set for SINGLE_DEV device %s (real fsid %pU)\n",
+		disk_super->fsid, path, disk_super->metadata_uuid);
+}
+
 /*
- * Add new device to list of registered devices
+ * Add new device to list of registered devices, or in case of a SINGLE_DEV
+ * device, also creates a virtual fsid to cope with same-fsid cases.
  *
  * Returns:
  * device pointer which was just added or updated when successful
@@ -784,7 +814,7 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
  */
 static noinline struct btrfs_device *device_list_add(const char *path,
 			   struct btrfs_super_block *disk_super,
-			   bool *new_device_added)
+			   bool *new_device_added, bool single_dev)
 {
 	struct btrfs_device *device;
 	struct btrfs_fs_devices *fs_devices = NULL;
@@ -805,23 +835,32 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 		return ERR_PTR(error);
 	}
 
-	if (fsid_change_in_progress) {
-		if (!has_metadata_uuid)
-			fs_devices = find_fsid_inprogress(disk_super);
-		else
-			fs_devices = find_fsid_changed(disk_super);
-	} else if (has_metadata_uuid) {
-		fs_devices = find_fsid_with_metadata_uuid(disk_super);
+	if (single_dev) {
+		if (has_metadata_uuid || fsid_change_in_progress) {
+			btrfs_err(NULL,
+		"SINGLE_DEV devices don't support the metadata_uuid feature\n");
+			return ERR_PTR(-EINVAL);
+		}
+		prepare_virtual_fsid(disk_super, path);
 	} else {
-		fs_devices = find_fsid_reverted_metadata(disk_super);
-		if (!fs_devices)
-			fs_devices = find_fsid(disk_super->fsid, NULL);
+		if (fsid_change_in_progress) {
+			if (!has_metadata_uuid)
+				fs_devices = find_fsid_inprogress(disk_super);
+			else
+				fs_devices = find_fsid_changed(disk_super);
+		} else if (has_metadata_uuid) {
+			fs_devices = find_fsid_with_metadata_uuid(disk_super);
+		} else {
+			fs_devices = find_fsid_reverted_metadata(disk_super);
+			if (!fs_devices)
+				fs_devices = find_fsid(disk_super->fsid, NULL);
+		}
 	}
 
-
 	if (!fs_devices) {
 		fs_devices = alloc_fs_devices(disk_super->fsid,
-				btrfs_sb_metadata_uuid_or_null(disk_super));
+				btrfs_sb_metadata_uuid_single_dev(disk_super,
+							has_metadata_uuid, single_dev));
 		if (IS_ERR(fs_devices))
 			return ERR_CAST(fs_devices);
 
@@ -1365,13 +1404,15 @@ int btrfs_forget_devices(dev_t devt)
  * and we are not allowed to call set_blocksize during the scan. The superblock
  * is read via pagecache
  */
-struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
+struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
+					   bool mounting)
 {
 	struct btrfs_super_block *disk_super;
 	bool new_device_added = false;
 	struct btrfs_device *device = NULL;
 	struct block_device *bdev;
 	u64 bytenr, bytenr_orig;
+	bool single_dev;
 	int ret;
 
 	lockdep_assert_held(&uuid_mutex);
@@ -1410,7 +1451,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
 		goto error_bdev_put;
 	}
 
-	device = device_list_add(path, disk_super, &new_device_added);
+	single_dev = btrfs_super_compat_ro_flags(disk_super) &
+			BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV;
+
+	if (!mounting && single_dev) {
+		pr_info("BTRFS: skipped non-mount scan on SINGLE_DEV device %s\n",
+			path);
+		btrfs_release_disk_super(disk_super);
+		return ERR_PTR(-EINVAL);
+	}
+
+	device = device_list_add(path, disk_super, &new_device_added, single_dev);
 	if (!IS_ERR(device) && new_device_added)
 		btrfs_free_stale_devices(device->devt, device);
 
@@ -2406,6 +2457,12 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
 
 	args->devid = btrfs_stack_device_id(&disk_super->dev_item);
 	memcpy(args->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE);
+
+	/*
+	 * Note that SINGLE_DEV devices are not handled in a special way here;
+	 * device removal/replace is instead forbidden when such feature is
+	 * present, this note is for future users/readers of this function.
+	 */
 	if (btrfs_fs_incompat(fs_info, METADATA_UUID))
 		memcpy(args->fsid, disk_super->metadata_uuid, BTRFS_FSID_SIZE);
 	else
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 0f87057bb575..b9856c801567 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -611,7 +611,8 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
 void btrfs_mapping_tree_free(struct extent_map_tree *tree);
 int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 		       blk_mode_t flags, void *holder);
-struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags);
+struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
+					   bool mounting);
 int btrfs_forget_devices(dev_t devt);
 void btrfs_close_devices(struct btrfs_fs_devices *fs_devices);
 void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices);
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index dbb8b96da50d..cb7a7cfe1ea9 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -313,6 +313,13 @@ struct btrfs_ioctl_fs_info_args {
  */
 #define BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE	(1ULL << 3)
 
+/*
+ * Single devices (as flagged by the corresponding compat_ro flag) only
+ * gets scanned during mount time; also, a random fsid is generated for
+ * them, in order to cope with same-fsid filesystem mounts.
+ */
+#define BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV		(1ULL << 4)
+
 #define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF	(1ULL << 0)
 #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL	(1ULL << 1)
 #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS	(1ULL << 2)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/3] btrfs: Add parameter to force devices behave as single-dev ones
  2023-08-03 15:43 [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature Guilherme G. Piccoli
  2023-08-03 15:43 ` [PATCH 1/3] btrfs-progs: Add the single-dev feature (to both mkfs/tune) Guilherme G. Piccoli
  2023-08-03 15:43 ` [PATCH 2/3] btrfs: Introduce the single-dev feature Guilherme G. Piccoli
@ 2023-08-03 15:43 ` Guilherme G. Piccoli
  2023-08-17 15:44   ` Josef Bacik
  2023-08-17 13:56 ` [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature Guilherme G. Piccoli
  2023-08-17 14:19 ` Josef Bacik
  4 siblings, 1 reply; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-03 15:43 UTC (permalink / raw)
  To: linux-btrfs
  Cc: clm, josef, dsterba, linux-fsdevel, kernel, gpiccoli, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

Devices with the single-dev feature enabled in their superblock are
allowed to be mounted regardless of their fsid being already present
in the system - the goal of such feature is to have the device in a
single mode with no advanced features, like RAID; it is a compat_ro
feature present since kernel v6.5.

The thing is that such feature comes in the form of a superblock flag,
so devices that doesn't have it set, can't use the feature of course.
The Steam Deck console aims to have block-based updates in its
RO rootfs, and given its A/B partition nature, both block devices are
required to be the same for their hash to match, so it's not possible
to compare two images if one has this feature set in the superblock,
while the other has not. So if we end-up having two old images, we
couldn't make use of the single-dev feature to mount both at same time,
or if we set the flag in one of them to enable the feature, we break
the block-based hash comparison.

We propose here a module parameter approach to allow forcing any given
path (to a device holding a btrfs filesystem) behaving as a single-dev
device. That would useful for cases like the Steam Deck one, or for
debug purposes. If the filesystem already has the compat_ro flag set
in its superblock, the parameter is no-op.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
---
 fs/btrfs/disk-io.c |  2 +-
 fs/btrfs/ioctl.c   |  6 +++---
 fs/btrfs/super.c   |  5 +++++
 fs/btrfs/super.h   |  2 ++
 fs/btrfs/volumes.c | 45 ++++++++++++++++++++++++++++++++++++++++++---
 fs/btrfs/volumes.h |  2 ++
 6 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 455fa4949c98..8df1defa1ede 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2378,7 +2378,7 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
 	 * in the disk. That's the reason we're required here to compare the
 	 * fsid with the metadata_uuid for such devices.
 	 */
-	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV))
+	if (fs_info->fs_devices->single_dev)
 		fsid = fs_info->fs_devices->metadata_uuid;
 	else
 		fsid = fs_info->fs_devices->fsid;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 56703d87def9..4fc63e802b08 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2678,7 +2678,7 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
-	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
+	if (fs_info->fs_devices->single_dev) {
 		btrfs_err(fs_info,
 			  "device removal is unsupported on SINGLE_DEV devices\n");
 		return -EINVAL;
@@ -2750,7 +2750,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
-	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
+	if (fs_info->fs_devices->single_dev) {
 		btrfs_err(fs_info,
 			  "device removal is unsupported on SINGLE_DEV devices\n");
 		return -EINVAL;
@@ -3280,7 +3280,7 @@ static long btrfs_ioctl_dev_replace(struct btrfs_fs_info *fs_info,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
-	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
+	if (fs_info->fs_devices->single_dev) {
 		btrfs_err(fs_info,
 			  "device removal is unsupported on SINGLE_DEV devices\n");
 		return -EINVAL;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index ee87189b1ccd..3cfc9c63360f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -62,6 +62,11 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/btrfs.h>
 
+char *force_single_dev;
+module_param(force_single_dev, charp, 0444);
+MODULE_PARM_DESC(force_single_dev,
+	"User list of devices to force acting as single-dev (comma separated)");
+
 static const struct super_operations btrfs_super_ops;
 
 /*
diff --git a/fs/btrfs/super.h b/fs/btrfs/super.h
index 8dbb909b364f..c855127600c8 100644
--- a/fs/btrfs/super.h
+++ b/fs/btrfs/super.h
@@ -3,6 +3,8 @@
 #ifndef BTRFS_SUPER_H
 #define BTRFS_SUPER_H
 
+extern char *force_single_dev;
+
 int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 			unsigned long new_flags);
 int btrfs_sync_fs(struct super_block *sb, int wait);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 433a490f2de8..06c5bad77bdf 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -12,6 +12,7 @@
 #include <linux/uuid.h>
 #include <linux/list_sort.h>
 #include <linux/namei.h>
+#include <linux/string.h>
 #include "misc.h"
 #include "ctree.h"
 #include "extent_map.h"
@@ -865,6 +866,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 			return ERR_CAST(fs_devices);
 
 		fs_devices->fsid_change = fsid_change_in_progress;
+		fs_devices->single_dev = single_dev;
 
 		mutex_lock(&fs_devices->device_list_mutex);
 		list_add(&fs_devices->fs_list, &fs_uuids);
@@ -1399,6 +1401,45 @@ int btrfs_forget_devices(dev_t devt)
 	return ret;
 }
 
+/*
+ * SINGLE_DEV is a compat_ro feature, but we also have the force_single_dev
+ * module parameter in order to allow forcing a device to behave as single-dev,
+ * so old filesystems could also get mounted in a same-fsid mounting way.
+ */
+
+static bool is_single_dev(const char *path, struct btrfs_super_block *sb)
+{
+
+	if (btrfs_super_compat_ro_flags(sb) & BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)
+		return true;
+
+	if (force_single_dev) {
+		char *p, *skip_devs, *orig;
+
+		skip_devs = kstrdup(force_single_dev, GFP_KERNEL);
+		if (!skip_devs) {
+			pr_err("BTRFS: couldn't parse force_single_dev parameter\n");
+			return false;
+		}
+
+		orig = skip_devs;
+		while ((p = strsep(&skip_devs, ",")) != NULL) {
+			if (!*p)
+				continue;
+
+			if (!strcmp(p, path)) {
+				pr_info(
+			"BTRFS: forcing device %s to be single-dev\n", path);
+				kfree(orig);
+				return true;
+			}
+		}
+		kfree(orig);
+	}
+
+	return false;
+}
+
 /*
  * Look for a btrfs signature on a device. This may be called out of the mount path
  * and we are not allowed to call set_blocksize during the scan. The superblock
@@ -1451,9 +1492,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 		goto error_bdev_put;
 	}
 
-	single_dev = btrfs_super_compat_ro_flags(disk_super) &
-			BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV;
-
+	single_dev = is_single_dev(path, disk_super);
 	if (!mounting && single_dev) {
 		pr_info("BTRFS: skipped non-mount scan on SINGLE_DEV device %s\n",
 			path);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index b9856c801567..57a3969f101c 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -293,6 +293,8 @@ struct btrfs_fs_devices {
 	 */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
 
+	bool single_dev;
+
 	struct list_head fs_list;
 
 	/*
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-03 15:43 ` [PATCH 2/3] btrfs: Introduce the single-dev feature Guilherme G. Piccoli
@ 2023-08-04  8:27   ` Qu Wenruo
  2023-08-04 11:38     ` Guilherme G. Piccoli
  2023-08-17 15:41   ` Josef Bacik
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2023-08-04  8:27 UTC (permalink / raw)
  To: Guilherme G. Piccoli, linux-btrfs
  Cc: clm, josef, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis, wqu,
	vivek



On 2023/8/3 23:43, Guilherme G. Piccoli wrote:
> Btrfs doesn't currently support to mount 2 different devices holding the
> same filesystem - the fsid is used as a unique identifier in the driver.
> This case is supported though in some other common filesystems, like
> ext4; one of the reasons for which is not trivial supporting this case
> on btrfs is due to its multi-device filesystem nature, native RAID, etc.
>
> Supporting the same-fsid mounts has the advantage of allowing btrfs to
> be used in A/B partitioned devices, like mobile phones or the Steam Deck
> for example. Without this support, it's not safe for users to keep the
> same "image version" in both A and B partitions, a setup that is quite
> common for development, for example. Also, as a big bonus, it allows fs
> integrity check based on block devices for RO devices (whereas currently
> it is required that both have different fsid, breaking the block device
> hash comparison).
>
> Such same-fsid mounting is hereby added through the usage of the
> filesystem feature "single-dev" - when such feature is used, btrfs
> generates a random fsid for the filesystem and leverages the long-term
> present metadata_uuid infrastructure to enable the usage of this
> secondary virtual fsid, effectively requiring few non-invasive changes
> to the code and no new potential corner cases.

My concern is still about the "virtual" fsid part.

If we go virtual fsid, there can be some unexpected problems.

E.g. the /sys/fs/btrfs/<uuid>/ entry would be the new virtual one.

And there may be some other problems like user space UUID detection of
mounted fs, thus I'm not 100% sure if this is a good idea.

However I don't have any better solution either, so this may be the
least worst solution for now.

Thanks,
Qu
>
> In order to prevent more code complexity and corner cases, given
> the nature of this mechanism (single-devices), the single-dev feature
> is not allowed when the metadata_uuid flag is already present on the
> fs, or if the device is on fsid-change state. Device removal/replace
> is also disabled for devices presenting the single-dev feature.
>
> Suggested-by: John Schoenick <johns@valvesoftware.com>
> Suggested-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> ---
>   fs/btrfs/disk-io.c         | 19 +++++++-
>   fs/btrfs/fs.h              |  3 +-
>   fs/btrfs/ioctl.c           | 18 +++++++
>   fs/btrfs/super.c           |  8 ++--
>   fs/btrfs/volumes.c         | 97 ++++++++++++++++++++++++++++++--------
>   fs/btrfs/volumes.h         |  3 +-
>   include/uapi/linux/btrfs.h |  7 +++
>   7 files changed, 127 insertions(+), 28 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 669b10355091..455fa4949c98 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -320,7 +320,7 @@ static bool check_tree_block_fsid(struct extent_buffer *eb)
>   	/*
>   	 * alloc_fs_devices() copies the fsid into metadata_uuid if the
>   	 * metadata_uuid is unset in the superblock, including for a seed device.
> -	 * So, we can use fs_devices->metadata_uuid.
> +	 * So, we can use fs_devices->metadata_uuid; same for SINGLE_DEV devices.
>   	 */
>   	if (!memcmp(fsid, fs_info->fs_devices->metadata_uuid, BTRFS_FSID_SIZE))
>   		return false;
> @@ -2288,6 +2288,7 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
>   {
>   	u64 nodesize = btrfs_super_nodesize(sb);
>   	u64 sectorsize = btrfs_super_sectorsize(sb);
> +	u8 *fsid;
>   	int ret = 0;
>
>   	if (btrfs_super_magic(sb) != BTRFS_MAGIC) {
> @@ -2368,7 +2369,21 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
>   		ret = -EINVAL;
>   	}
>
> -	if (memcmp(fs_info->fs_devices->fsid, sb->fsid, BTRFS_FSID_SIZE)) {
> +	/*
> +	 * For SINGLE_DEV devices, btrfs creates a random fsid and makes
> +	 * use of the metadata_uuid infrastructure in order to allow, for
> +	 * example, two devices with same fsid getting mounted at the same
> +	 * time. But notice no changes happen at the disk level, so the
> +	 * random generated fsid is a driver abstraction, not to be written
> +	 * in the disk. That's the reason we're required here to compare the
> +	 * fsid with the metadata_uuid for such devices.
> +	 */
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV))
> +		fsid = fs_info->fs_devices->metadata_uuid;
> +	else
> +		fsid = fs_info->fs_devices->fsid;
> +
> +	if (memcmp(fsid, sb->fsid, BTRFS_FSID_SIZE)) {
>   		btrfs_err(fs_info,
>   		"superblock fsid doesn't match fsid of fs_devices: %pU != %pU",
>   			  sb->fsid, fs_info->fs_devices->fsid);
> diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
> index 203d2a267828..c6d124973361 100644
> --- a/fs/btrfs/fs.h
> +++ b/fs/btrfs/fs.h
> @@ -200,7 +200,8 @@ enum {
>   	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |	\
>   	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID | \
>   	 BTRFS_FEATURE_COMPAT_RO_VERITY |		\
> -	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
> +	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE |	\
> +	 BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)
>
>   #define BTRFS_FEATURE_COMPAT_RO_SAFE_SET	0ULL
>   #define BTRFS_FEATURE_COMPAT_RO_SAFE_CLEAR	0ULL
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index a895d105464b..56703d87def9 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -2678,6 +2678,12 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
>   	if (!capable(CAP_SYS_ADMIN))
>   		return -EPERM;
>
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
> +		btrfs_err(fs_info,
> +			  "device removal is unsupported on SINGLE_DEV devices\n");
> +		return -EINVAL;
> +	}
> +
>   	vol_args = memdup_user(arg, sizeof(*vol_args));
>   	if (IS_ERR(vol_args))
>   		return PTR_ERR(vol_args);
> @@ -2744,6 +2750,12 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
>   	if (!capable(CAP_SYS_ADMIN))
>   		return -EPERM;
>
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
> +		btrfs_err(fs_info,
> +			  "device removal is unsupported on SINGLE_DEV devices\n");
> +		return -EINVAL;
> +	}
> +
>   	vol_args = memdup_user(arg, sizeof(*vol_args));
>   	if (IS_ERR(vol_args))
>   		return PTR_ERR(vol_args);
> @@ -3268,6 +3280,12 @@ static long btrfs_ioctl_dev_replace(struct btrfs_fs_info *fs_info,
>   	if (!capable(CAP_SYS_ADMIN))
>   		return -EPERM;
>
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
> +		btrfs_err(fs_info,
> +			  "device removal is unsupported on SINGLE_DEV devices\n");
> +		return -EINVAL;
> +	}
> +
>   	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
>   		btrfs_err(fs_info, "device replace not supported on extent tree v2 yet");
>   		return -EINVAL;
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index f1dd172d8d5b..ee87189b1ccd 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -883,7 +883,7 @@ static int btrfs_parse_device_options(const char *options, blk_mode_t flags)
>   				error = -ENOMEM;
>   				goto out;
>   			}
> -			device = btrfs_scan_one_device(device_name, flags);
> +			device = btrfs_scan_one_device(device_name, flags, true);
>   			kfree(device_name);
>   			if (IS_ERR(device)) {
>   				error = PTR_ERR(device);
> @@ -1478,7 +1478,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
>   		goto error_fs_info;
>   	}
>
> -	device = btrfs_scan_one_device(device_name, mode);
> +	device = btrfs_scan_one_device(device_name, mode, true);
>   	if (IS_ERR(device)) {
>   		mutex_unlock(&uuid_mutex);
>   		error = PTR_ERR(device);
> @@ -2190,7 +2190,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
>   	switch (cmd) {
>   	case BTRFS_IOC_SCAN_DEV:
>   		mutex_lock(&uuid_mutex);
> -		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
> +		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
>   		ret = PTR_ERR_OR_ZERO(device);
>   		mutex_unlock(&uuid_mutex);
>   		break;
> @@ -2204,7 +2204,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
>   		break;
>   	case BTRFS_IOC_DEVICES_READY:
>   		mutex_lock(&uuid_mutex);
> -		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
> +		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
>   		if (IS_ERR(device)) {
>   			mutex_unlock(&uuid_mutex);
>   			ret = PTR_ERR(device);
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 73753dae111a..433a490f2de8 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -681,12 +681,14 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>   	return -EINVAL;
>   }
>
> -static u8 *btrfs_sb_metadata_uuid_or_null(struct btrfs_super_block *sb)
> +static u8 *btrfs_sb_metadata_uuid_single_dev(struct btrfs_super_block *sb,
> +					     bool has_metadata_uuid,
> +					     bool single_dev)
>   {
> -	bool has_metadata_uuid = (btrfs_super_incompat_flags(sb) &
> -				  BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
> +	if (has_metadata_uuid || single_dev)
> +		return sb->metadata_uuid;
>
> -	return has_metadata_uuid ? sb->metadata_uuid : NULL;
> +	return NULL;
>   }
>
>   u8 *btrfs_sb_fsid_ptr(struct btrfs_super_block *sb)
> @@ -775,8 +777,36 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
>
>   	return NULL;
>   }
> +
> +static void prepare_virtual_fsid(struct btrfs_super_block *disk_super,
> +				 const char *path)
> +{
> +	struct btrfs_fs_devices *fs_devices;
> +	u8 vfsid[BTRFS_FSID_SIZE];
> +	bool dup_fsid = true;
> +
> +	while (dup_fsid) {
> +		dup_fsid = false;
> +		generate_random_uuid(vfsid);
> +
> +		list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
> +			if (!memcmp(vfsid, fs_devices->fsid, BTRFS_FSID_SIZE) ||
> +			    !memcmp(vfsid, fs_devices->metadata_uuid,
> +				    BTRFS_FSID_SIZE))
> +				dup_fsid = true;
> +		}
> +	}
> +
> +	memcpy(disk_super->metadata_uuid, disk_super->fsid, BTRFS_FSID_SIZE);
> +	memcpy(disk_super->fsid, vfsid, BTRFS_FSID_SIZE);
> +
> +	pr_info("BTRFS: virtual fsid (%pU) set for SINGLE_DEV device %s (real fsid %pU)\n",
> +		disk_super->fsid, path, disk_super->metadata_uuid);
> +}
> +
>   /*
> - * Add new device to list of registered devices
> + * Add new device to list of registered devices, or in case of a SINGLE_DEV
> + * device, also creates a virtual fsid to cope with same-fsid cases.
>    *
>    * Returns:
>    * device pointer which was just added or updated when successful
> @@ -784,7 +814,7 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
>    */
>   static noinline struct btrfs_device *device_list_add(const char *path,
>   			   struct btrfs_super_block *disk_super,
> -			   bool *new_device_added)
> +			   bool *new_device_added, bool single_dev)
>   {
>   	struct btrfs_device *device;
>   	struct btrfs_fs_devices *fs_devices = NULL;
> @@ -805,23 +835,32 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>   		return ERR_PTR(error);
>   	}
>
> -	if (fsid_change_in_progress) {
> -		if (!has_metadata_uuid)
> -			fs_devices = find_fsid_inprogress(disk_super);
> -		else
> -			fs_devices = find_fsid_changed(disk_super);
> -	} else if (has_metadata_uuid) {
> -		fs_devices = find_fsid_with_metadata_uuid(disk_super);
> +	if (single_dev) {
> +		if (has_metadata_uuid || fsid_change_in_progress) {
> +			btrfs_err(NULL,
> +		"SINGLE_DEV devices don't support the metadata_uuid feature\n");
> +			return ERR_PTR(-EINVAL);
> +		}
> +		prepare_virtual_fsid(disk_super, path);
>   	} else {
> -		fs_devices = find_fsid_reverted_metadata(disk_super);
> -		if (!fs_devices)
> -			fs_devices = find_fsid(disk_super->fsid, NULL);
> +		if (fsid_change_in_progress) {
> +			if (!has_metadata_uuid)
> +				fs_devices = find_fsid_inprogress(disk_super);
> +			else
> +				fs_devices = find_fsid_changed(disk_super);
> +		} else if (has_metadata_uuid) {
> +			fs_devices = find_fsid_with_metadata_uuid(disk_super);
> +		} else {
> +			fs_devices = find_fsid_reverted_metadata(disk_super);
> +			if (!fs_devices)
> +				fs_devices = find_fsid(disk_super->fsid, NULL);
> +		}
>   	}
>
> -
>   	if (!fs_devices) {
>   		fs_devices = alloc_fs_devices(disk_super->fsid,
> -				btrfs_sb_metadata_uuid_or_null(disk_super));
> +				btrfs_sb_metadata_uuid_single_dev(disk_super,
> +							has_metadata_uuid, single_dev));
>   		if (IS_ERR(fs_devices))
>   			return ERR_CAST(fs_devices);
>
> @@ -1365,13 +1404,15 @@ int btrfs_forget_devices(dev_t devt)
>    * and we are not allowed to call set_blocksize during the scan. The superblock
>    * is read via pagecache
>    */
> -struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
> +struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
> +					   bool mounting)
>   {
>   	struct btrfs_super_block *disk_super;
>   	bool new_device_added = false;
>   	struct btrfs_device *device = NULL;
>   	struct block_device *bdev;
>   	u64 bytenr, bytenr_orig;
> +	bool single_dev;
>   	int ret;
>
>   	lockdep_assert_held(&uuid_mutex);
> @@ -1410,7 +1451,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
>   		goto error_bdev_put;
>   	}
>
> -	device = device_list_add(path, disk_super, &new_device_added);
> +	single_dev = btrfs_super_compat_ro_flags(disk_super) &
> +			BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV;
> +
> +	if (!mounting && single_dev) {
> +		pr_info("BTRFS: skipped non-mount scan on SINGLE_DEV device %s\n",
> +			path);
> +		btrfs_release_disk_super(disk_super);
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	device = device_list_add(path, disk_super, &new_device_added, single_dev);
>   	if (!IS_ERR(device) && new_device_added)
>   		btrfs_free_stale_devices(device->devt, device);
>
> @@ -2406,6 +2457,12 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
>
>   	args->devid = btrfs_stack_device_id(&disk_super->dev_item);
>   	memcpy(args->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE);
> +
> +	/*
> +	 * Note that SINGLE_DEV devices are not handled in a special way here;
> +	 * device removal/replace is instead forbidden when such feature is
> +	 * present, this note is for future users/readers of this function.
> +	 */
>   	if (btrfs_fs_incompat(fs_info, METADATA_UUID))
>   		memcpy(args->fsid, disk_super->metadata_uuid, BTRFS_FSID_SIZE);
>   	else
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 0f87057bb575..b9856c801567 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -611,7 +611,8 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
>   void btrfs_mapping_tree_free(struct extent_map_tree *tree);
>   int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
>   		       blk_mode_t flags, void *holder);
> -struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags);
> +struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
> +					   bool mounting);
>   int btrfs_forget_devices(dev_t devt);
>   void btrfs_close_devices(struct btrfs_fs_devices *fs_devices);
>   void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices);
> diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
> index dbb8b96da50d..cb7a7cfe1ea9 100644
> --- a/include/uapi/linux/btrfs.h
> +++ b/include/uapi/linux/btrfs.h
> @@ -313,6 +313,13 @@ struct btrfs_ioctl_fs_info_args {
>    */
>   #define BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE	(1ULL << 3)
>
> +/*
> + * Single devices (as flagged by the corresponding compat_ro flag) only
> + * gets scanned during mount time; also, a random fsid is generated for
> + * them, in order to cope with same-fsid filesystem mounts.
> + */
> +#define BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV		(1ULL << 4)
> +
>   #define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF	(1ULL << 0)
>   #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL	(1ULL << 1)
>   #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS	(1ULL << 2)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-04  8:27   ` Qu Wenruo
@ 2023-08-04 11:38     ` Guilherme G. Piccoli
  0 siblings, 0 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-04 11:38 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs
  Cc: clm, josef, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis, wqu,
	vivek

On 04/08/2023 05:27, Qu Wenruo wrote:
> [...] 
> My concern is still about the "virtual" fsid part.
> 
> If we go virtual fsid, there can be some unexpected problems.
> 
> E.g. the /sys/fs/btrfs/<uuid>/ entry would be the new virtual one.
> 
> And there may be some other problems like user space UUID detection of
> mounted fs, thus I'm not 100% sure if this is a good idea.
> 
> However I don't have any better solution either, so this may be the
> least worst solution for now.
> 
> Thanks,
> Qu

Hi Qu, thanks for your analysis!

I think the virtual/spoofed fsid part is not without problems but I
consider it to be less prone to unexpected issues than not.

It's based on the metadata_uuid code, which is stable and present in
btrfs for like 5 years. Also, we don't need to "corner-case" a lot of
stuff to use that, which would be needed if we went to the pure dup fsid
route. I tried that and it breaks in a lot of places, to which we
require a lot of if conditionals (I even discussed that briefly in the
last thread with you, about sysfs, remember?).

So, despite not perfect, I agree with you that seems to be the least
worse solution :)

Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature
  2023-08-03 15:43 [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature Guilherme G. Piccoli
                   ` (2 preceding siblings ...)
  2023-08-03 15:43 ` [PATCH 3/3] btrfs: Add parameter to force devices behave as single-dev ones Guilherme G. Piccoli
@ 2023-08-17 13:56 ` Guilherme G. Piccoli
  2023-08-17 14:19 ` Josef Bacik
  4 siblings, 0 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-17 13:56 UTC (permalink / raw)
  To: dsterba
  Cc: clm, linux-btrfs, josef, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On 03/08/2023 12:43, Guilherme G. Piccoli wrote:
> Hi all, this is the 2nd attempt of supporting same fsid mounting
> on btrfs. V1 is here:
> https://lore.kernel.org/linux-btrfs/20230504170708.787361-1-gpiccoli@igalia.com/
> 
> The mechanism used to achieve that in V2 was a mix between the suggestion
> from JohnS (spoofed fsid) and Qu (a single-dev compat_ro flag) - it is
> still based in the metadata_uuid feature, leveraging that infrastructure
> since it prevents lots of corner cases, like sysfs same-fsid crashes.
> 
> The patches are based on kernel v6.5-rc3 with Anand's metadata_uuid refactor
> part 2 on top of it [0]; the btrfs-progs patch is based on "v6.3.3".
> 
> Comments/suggestions and overall feedback is much appreciated - tnx in advance!
> Cheers,
> 
> Guilherme
> 
> 
> [0] https://lore.kernel.org/linux-btrfs/cover.1690792823.git.anand.jain@oracle.com/
> 
> 
> Guilherme G. Piccoli (3):
>   btrfs-progs: Add the single-dev feature (to both mkfs/tune)
>   btrfs: Introduce the single-dev feature
>   btrfs: Add parameter to force devices behave as single-dev ones


Hi David, friendly ping about this one - do you think we could have it
merged, or do you have suggestions to improve it maybe?

Thanks in advance and apologies for the ping!
Cheers,


Guilherme



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature
  2023-08-03 15:43 [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature Guilherme G. Piccoli
                   ` (3 preceding siblings ...)
  2023-08-17 13:56 ` [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature Guilherme G. Piccoli
@ 2023-08-17 14:19 ` Josef Bacik
  2023-08-17 14:23   ` Guilherme G. Piccoli
  4 siblings, 1 reply; 22+ messages in thread
From: Josef Bacik @ 2023-08-17 14:19 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On Thu, Aug 03, 2023 at 12:43:38PM -0300, Guilherme G. Piccoli wrote:
> Hi all, this is the 2nd attempt of supporting same fsid mounting
> on btrfs. V1 is here:
> https://lore.kernel.org/linux-btrfs/20230504170708.787361-1-gpiccoli@igalia.com/
> 
> The mechanism used to achieve that in V2 was a mix between the suggestion
> from JohnS (spoofed fsid) and Qu (a single-dev compat_ro flag) - it is
> still based in the metadata_uuid feature, leveraging that infrastructure
> since it prevents lots of corner cases, like sysfs same-fsid crashes.
> 
> The patches are based on kernel v6.5-rc3 with Anand's metadata_uuid refactor
> part 2 on top of it [0]; the btrfs-progs patch is based on "v6.3.3".
> 
> Comments/suggestions and overall feedback is much appreciated - tnx in advance!
> Cheers,
> 
> Guilherme
> 

In general the concept is fine with me, and the implementation seems reasonable.

With new features we want fstests to accompany them so we know they work
correctly, and we don't accidentally break them in the future.

I'd like to see tests that validate all the behaviors you're trying to
accomplish work as advertised, and that all the failure cases do in fact fail
properly.

Ideally a test that creates a single device fs image and mounts it in multiple
places as would be used in the Steam Deck.

Then a test that tries to add a device to it, replace, etc.  All the cases that
you expect to fail, and validate that they actually fail.

Then any other corner cases you can think of that I haven't thought of.

Make sure these new tests skip appropriately if the btrfs-progs support doesn't
exist, I'd likely throw the fstests into our CI before the code is merged to
make sure it's ready to be tested if/when it is merged.

Thanks,

Josef

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature
  2023-08-17 14:19 ` Josef Bacik
@ 2023-08-17 14:23   ` Guilherme G. Piccoli
  0 siblings, 0 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-17 14:23 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On 17/08/2023 11:19, Josef Bacik wrote:
> [...]
> In general the concept is fine with me, and the implementation seems reasonable.
> 
> With new features we want fstests to accompany them so we know they work
> correctly, and we don't accidentally break them in the future.
> 
> I'd like to see tests that validate all the behaviors you're trying to
> accomplish work as advertised, and that all the failure cases do in fact fail
> properly.
> 
> Ideally a test that creates a single device fs image and mounts it in multiple
> places as would be used in the Steam Deck.
> 
> Then a test that tries to add a device to it, replace, etc.  All the cases that
> you expect to fail, and validate that they actually fail.
> 
> Then any other corner cases you can think of that I haven't thought of.
> 
> Make sure these new tests skip appropriately if the btrfs-progs support doesn't
> exist, I'd likely throw the fstests into our CI before the code is merged to
> make sure it's ready to be tested if/when it is merged.
> 
> Thanks,
> 
> Josef
> 

Hi Josef, thanks a lot for your comprehensive response, it was pretty
helpful for me.

I agree with you, test cases are important indeed and I'll work them,
re-submitting a V3 with tests included.
Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-03 15:43 ` [PATCH 2/3] btrfs: Introduce the single-dev feature Guilherme G. Piccoli
  2023-08-04  8:27   ` Qu Wenruo
@ 2023-08-17 15:41   ` Josef Bacik
  2023-08-17 16:20     ` Guilherme G. Piccoli
  2023-08-23 16:31   ` Anand Jain
  2023-08-29 20:28   ` Guilherme G. Piccoli
  3 siblings, 1 reply; 22+ messages in thread
From: Josef Bacik @ 2023-08-17 15:41 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On Thu, Aug 03, 2023 at 12:43:40PM -0300, Guilherme G. Piccoli wrote:
> Btrfs doesn't currently support to mount 2 different devices holding the
> same filesystem - the fsid is used as a unique identifier in the driver.
> This case is supported though in some other common filesystems, like
> ext4; one of the reasons for which is not trivial supporting this case
> on btrfs is due to its multi-device filesystem nature, native RAID, etc.
> 
> Supporting the same-fsid mounts has the advantage of allowing btrfs to
> be used in A/B partitioned devices, like mobile phones or the Steam Deck
> for example. Without this support, it's not safe for users to keep the
> same "image version" in both A and B partitions, a setup that is quite
> common for development, for example. Also, as a big bonus, it allows fs
> integrity check based on block devices for RO devices (whereas currently
> it is required that both have different fsid, breaking the block device
> hash comparison).
> 
> Such same-fsid mounting is hereby added through the usage of the
> filesystem feature "single-dev" - when such feature is used, btrfs
> generates a random fsid for the filesystem and leverages the long-term
> present metadata_uuid infrastructure to enable the usage of this
> secondary virtual fsid, effectively requiring few non-invasive changes
> to the code and no new potential corner cases.
> 
> In order to prevent more code complexity and corner cases, given
> the nature of this mechanism (single-devices), the single-dev feature
> is not allowed when the metadata_uuid flag is already present on the
> fs, or if the device is on fsid-change state. Device removal/replace
> is also disabled for devices presenting the single-dev feature.
> 
> Suggested-by: John Schoenick <johns@valvesoftware.com>
> Suggested-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> ---
>  fs/btrfs/disk-io.c         | 19 +++++++-
>  fs/btrfs/fs.h              |  3 +-
>  fs/btrfs/ioctl.c           | 18 +++++++
>  fs/btrfs/super.c           |  8 ++--
>  fs/btrfs/volumes.c         | 97 ++++++++++++++++++++++++++++++--------
>  fs/btrfs/volumes.h         |  3 +-
>  include/uapi/linux/btrfs.h |  7 +++
>  7 files changed, 127 insertions(+), 28 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 669b10355091..455fa4949c98 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -320,7 +320,7 @@ static bool check_tree_block_fsid(struct extent_buffer *eb)
>  	/*
>  	 * alloc_fs_devices() copies the fsid into metadata_uuid if the
>  	 * metadata_uuid is unset in the superblock, including for a seed device.
> -	 * So, we can use fs_devices->metadata_uuid.
> +	 * So, we can use fs_devices->metadata_uuid; same for SINGLE_DEV devices.
>  	 */
>  	if (!memcmp(fsid, fs_info->fs_devices->metadata_uuid, BTRFS_FSID_SIZE))
>  		return false;
> @@ -2288,6 +2288,7 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
>  {
>  	u64 nodesize = btrfs_super_nodesize(sb);
>  	u64 sectorsize = btrfs_super_sectorsize(sb);
> +	u8 *fsid;
>  	int ret = 0;
>  
>  	if (btrfs_super_magic(sb) != BTRFS_MAGIC) {
> @@ -2368,7 +2369,21 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
>  		ret = -EINVAL;
>  	}
>  
> -	if (memcmp(fs_info->fs_devices->fsid, sb->fsid, BTRFS_FSID_SIZE)) {
> +	/*
> +	 * For SINGLE_DEV devices, btrfs creates a random fsid and makes
> +	 * use of the metadata_uuid infrastructure in order to allow, for
> +	 * example, two devices with same fsid getting mounted at the same
> +	 * time. But notice no changes happen at the disk level, so the
> +	 * random generated fsid is a driver abstraction, not to be written
> +	 * in the disk. That's the reason we're required here to compare the
> +	 * fsid with the metadata_uuid for such devices.
> +	 */
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV))
> +		fsid = fs_info->fs_devices->metadata_uuid;
> +	else
> +		fsid = fs_info->fs_devices->fsid;
> +
> +	if (memcmp(fsid, sb->fsid, BTRFS_FSID_SIZE)) {
>  		btrfs_err(fs_info,
>  		"superblock fsid doesn't match fsid of fs_devices: %pU != %pU",
>  			  sb->fsid, fs_info->fs_devices->fsid);
> diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
> index 203d2a267828..c6d124973361 100644
> --- a/fs/btrfs/fs.h
> +++ b/fs/btrfs/fs.h
> @@ -200,7 +200,8 @@ enum {
>  	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |	\
>  	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID | \
>  	 BTRFS_FEATURE_COMPAT_RO_VERITY |		\
> -	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
> +	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE |	\
> +	 BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)
>  
>  #define BTRFS_FEATURE_COMPAT_RO_SAFE_SET	0ULL
>  #define BTRFS_FEATURE_COMPAT_RO_SAFE_CLEAR	0ULL
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index a895d105464b..56703d87def9 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -2678,6 +2678,12 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EPERM;
>  
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
> +		btrfs_err(fs_info,
> +			  "device removal is unsupported on SINGLE_DEV devices\n");
> +		return -EINVAL;
> +	}
> +
>  	vol_args = memdup_user(arg, sizeof(*vol_args));
>  	if (IS_ERR(vol_args))
>  		return PTR_ERR(vol_args);
> @@ -2744,6 +2750,12 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EPERM;
>  
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
> +		btrfs_err(fs_info,
> +			  "device removal is unsupported on SINGLE_DEV devices\n");
> +		return -EINVAL;
> +	}
> +
>  	vol_args = memdup_user(arg, sizeof(*vol_args));
>  	if (IS_ERR(vol_args))
>  		return PTR_ERR(vol_args);
> @@ -3268,6 +3280,12 @@ static long btrfs_ioctl_dev_replace(struct btrfs_fs_info *fs_info,
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EPERM;
>  
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
> +		btrfs_err(fs_info,
> +			  "device removal is unsupported on SINGLE_DEV devices\n");
> +		return -EINVAL;
> +	}
> +
>  	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
>  		btrfs_err(fs_info, "device replace not supported on extent tree v2 yet");
>  		return -EINVAL;
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index f1dd172d8d5b..ee87189b1ccd 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -883,7 +883,7 @@ static int btrfs_parse_device_options(const char *options, blk_mode_t flags)
>  				error = -ENOMEM;
>  				goto out;
>  			}
> -			device = btrfs_scan_one_device(device_name, flags);
> +			device = btrfs_scan_one_device(device_name, flags, true);
>  			kfree(device_name);
>  			if (IS_ERR(device)) {
>  				error = PTR_ERR(device);
> @@ -1478,7 +1478,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
>  		goto error_fs_info;
>  	}
>  
> -	device = btrfs_scan_one_device(device_name, mode);
> +	device = btrfs_scan_one_device(device_name, mode, true);
>  	if (IS_ERR(device)) {
>  		mutex_unlock(&uuid_mutex);
>  		error = PTR_ERR(device);
> @@ -2190,7 +2190,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
>  	switch (cmd) {
>  	case BTRFS_IOC_SCAN_DEV:
>  		mutex_lock(&uuid_mutex);
> -		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
> +		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
>  		ret = PTR_ERR_OR_ZERO(device);
>  		mutex_unlock(&uuid_mutex);
>  		break;
> @@ -2204,7 +2204,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
>  		break;
>  	case BTRFS_IOC_DEVICES_READY:
>  		mutex_lock(&uuid_mutex);
> -		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
> +		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
>  		if (IS_ERR(device)) {
>  			mutex_unlock(&uuid_mutex);
>  			ret = PTR_ERR(device);
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 73753dae111a..433a490f2de8 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -681,12 +681,14 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>  	return -EINVAL;
>  }
>  
> -static u8 *btrfs_sb_metadata_uuid_or_null(struct btrfs_super_block *sb)
> +static u8 *btrfs_sb_metadata_uuid_single_dev(struct btrfs_super_block *sb,
> +					     bool has_metadata_uuid,
> +					     bool single_dev)

Why pass as an argument? Just do the same thing as the code currently does and
check for the single device ro compat flag.

>  {
> -	bool has_metadata_uuid = (btrfs_super_incompat_flags(sb) &
> -				  BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
> +	if (has_metadata_uuid || single_dev)
> +		return sb->metadata_uuid;
>  
> -	return has_metadata_uuid ? sb->metadata_uuid : NULL;
> +	return NULL;
>  }
>  
>  u8 *btrfs_sb_fsid_ptr(struct btrfs_super_block *sb)
> @@ -775,8 +777,36 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
>  
>  	return NULL;
>  }
> +
> +static void prepare_virtual_fsid(struct btrfs_super_block *disk_super,
> +				 const char *path)
> +{
> +	struct btrfs_fs_devices *fs_devices;
> +	u8 vfsid[BTRFS_FSID_SIZE];
> +	bool dup_fsid = true;
> +
> +	while (dup_fsid) {
> +		dup_fsid = false;
> +		generate_random_uuid(vfsid);
> +
> +		list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
> +			if (!memcmp(vfsid, fs_devices->fsid, BTRFS_FSID_SIZE) ||
> +			    !memcmp(vfsid, fs_devices->metadata_uuid,
> +				    BTRFS_FSID_SIZE))
> +				dup_fsid = true;
> +		}
> +	}
> +
> +	memcpy(disk_super->metadata_uuid, disk_super->fsid, BTRFS_FSID_SIZE);
> +	memcpy(disk_super->fsid, vfsid, BTRFS_FSID_SIZE);
> +
> +	pr_info("BTRFS: virtual fsid (%pU) set for SINGLE_DEV device %s (real fsid %pU)\n",
> +		disk_super->fsid, path, disk_super->metadata_uuid);

I think just

btrfs_info(NULL, "virtual fsid....")

is fine here.

> +}
> +
>  /*
> - * Add new device to list of registered devices
> + * Add new device to list of registered devices, or in case of a SINGLE_DEV
> + * device, also creates a virtual fsid to cope with same-fsid cases.
>   *
>   * Returns:
>   * device pointer which was just added or updated when successful
> @@ -784,7 +814,7 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
>   */
>  static noinline struct btrfs_device *device_list_add(const char *path,
>  			   struct btrfs_super_block *disk_super,
> -			   bool *new_device_added)
> +			   bool *new_device_added, bool single_dev)

Same as the comment above.  Generally speaking for stuff like this where we can
derive the value local to the function we want to do that instead of growing the
argument list.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] btrfs: Add parameter to force devices behave as single-dev ones
  2023-08-03 15:43 ` [PATCH 3/3] btrfs: Add parameter to force devices behave as single-dev ones Guilherme G. Piccoli
@ 2023-08-17 15:44   ` Josef Bacik
  2023-08-20 18:16     ` Guilherme G. Piccoli
  0 siblings, 1 reply; 22+ messages in thread
From: Josef Bacik @ 2023-08-17 15:44 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On Thu, Aug 03, 2023 at 12:43:41PM -0300, Guilherme G. Piccoli wrote:
> Devices with the single-dev feature enabled in their superblock are
> allowed to be mounted regardless of their fsid being already present
> in the system - the goal of such feature is to have the device in a
> single mode with no advanced features, like RAID; it is a compat_ro
> feature present since kernel v6.5.
> 
> The thing is that such feature comes in the form of a superblock flag,
> so devices that doesn't have it set, can't use the feature of course.
> The Steam Deck console aims to have block-based updates in its
> RO rootfs, and given its A/B partition nature, both block devices are
> required to be the same for their hash to match, so it's not possible
> to compare two images if one has this feature set in the superblock,
> while the other has not. So if we end-up having two old images, we
> couldn't make use of the single-dev feature to mount both at same time,
> or if we set the flag in one of them to enable the feature, we break
> the block-based hash comparison.
> 
> We propose here a module parameter approach to allow forcing any given
> path (to a device holding a btrfs filesystem) behaving as a single-dev
> device. That would useful for cases like the Steam Deck one, or for
> debug purposes. If the filesystem already has the compat_ro flag set
> in its superblock, the parameter is no-op.
> 

Now this one I'm not a fan of.  For old file systems you can simply btrfstune
them to have your new flag.  Is there a reason why that wouldn't be an option?

If it is indeed required, which is a huge if, I'd rather this be accomplished a
mount option.  I have a strong dislike for new mount options, but I think that's
a cleaner way to accomplish this than a module option.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] btrfs-progs: Add the single-dev feature (to both mkfs/tune)
  2023-08-03 15:43 ` [PATCH 1/3] btrfs-progs: Add the single-dev feature (to both mkfs/tune) Guilherme G. Piccoli
@ 2023-08-17 15:46   ` Josef Bacik
  2023-08-17 16:16     ` Guilherme G. Piccoli
  0 siblings, 1 reply; 22+ messages in thread
From: Josef Bacik @ 2023-08-17 15:46 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On Thu, Aug 03, 2023 at 12:43:39PM -0300, Guilherme G. Piccoli wrote:
> The single-dev feature allows a device to be mounted regardless of
> its fsid already being present in another device - in other words,
> this feature disables RAID modes / metadata_uuid, allowing a single
> device per filesystem. Its goal is mainly to allow mounting the
> same fsid at the same time in the system.
> 
> Introduce hereby the feature to both mkfs (-O single-dev) and
> btrfstune (-s), syncing the kernel-shared headers as well. The
> feature is a compat_ro, its kernel version was set to v6.5.
> 
> Suggested-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> ---
> 
> Hi folks, thanks in advance for reviews! Notice that I've added
> the feature to btrfstune as well, but I found docs online saying
> this tool is deprecated..so not sure if that was the proper approach.
> 
> Also, a design decision: I've skipped the btrfs_register_one_device()
> call when mkfs was just used with the single-dev tuning, or else
> it shows a (harmless) error and succeeds, since of course scanning
> fails for such devices, as per the feature implementation.
> So, I thought it was more straightforward to just skip the call itself.
> 

This is a reasonable approach.

> Cheers,
> 
> Guilherme
> 
>  common/fsfeatures.c        |  7 ++++
>  kernel-shared/ctree.h      |  3 +-
>  kernel-shared/uapi/btrfs.h |  7 ++++
>  mkfs/main.c                |  4 ++-
>  tune/main.c                | 72 +++++++++++++++++++++++---------------
>  5 files changed, 63 insertions(+), 30 deletions(-)
> 
> diff --git a/common/fsfeatures.c b/common/fsfeatures.c
> index 00658fa5159f..a320b7062b8c 100644
> --- a/common/fsfeatures.c
> +++ b/common/fsfeatures.c
> @@ -160,6 +160,13 @@ static const struct btrfs_feature mkfs_features[] = {
>  		VERSION_NULL(default),
>  		.desc		= "RAID1 with 3 or 4 copies"
>  	},
> +	{
> +		.name		= "single-dev",
> +		.compat_ro_flag	= BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV,
> +		.sysfs_name	= "single_dev",
> +		VERSION_TO_STRING2(compat, 6,5),
> +		.desc		= "single device (allows same fsid mounting)"
> +	},
>  #ifdef BTRFS_ZONED
>  	{
>  		.name		= "zoned",
> diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h
> index 59533879b939..e3fd834aa6dd 100644
> --- a/kernel-shared/ctree.h
> +++ b/kernel-shared/ctree.h
> @@ -86,7 +86,8 @@ static inline u32 __BTRFS_LEAF_DATA_SIZE(u32 nodesize)
>  	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |	\
>  	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID | \
>  	 BTRFS_FEATURE_COMPAT_RO_VERITY |		\
> -	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
> +	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE |	\
> +	 BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)
>  
>  #if EXPERIMENTAL
>  #define BTRFS_FEATURE_INCOMPAT_SUPP			\
> diff --git a/kernel-shared/uapi/btrfs.h b/kernel-shared/uapi/btrfs.h
> index 85b04f89a2a9..2e0ee6ef6446 100644
> --- a/kernel-shared/uapi/btrfs.h
> +++ b/kernel-shared/uapi/btrfs.h
> @@ -336,6 +336,13 @@ _static_assert(sizeof(struct btrfs_ioctl_fs_info_args) == 1024);
>   */
>  #define BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE	(1ULL << 3)
>  
> +/*
> + * Single devices (as flagged by the corresponding compat_ro flag) only
> + * gets scanned during mount time; also, a random fsid is generated for
> + * them, in order to cope with same-fsid filesystem mounts.
> + */
> +#define BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV		(1ULL << 4)
> +
>  #define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF	(1ULL << 0)
>  #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL	(1ULL << 1)
>  #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS	(1ULL << 2)
> diff --git a/mkfs/main.c b/mkfs/main.c
> index 972ed1112ea6..429799932224 100644
> --- a/mkfs/main.c
> +++ b/mkfs/main.c
> @@ -1025,6 +1025,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
>  	char *label = NULL;
>  	int nr_global_roots = sysconf(_SC_NPROCESSORS_ONLN);
>  	char *source_dir = NULL;
> +	bool single_dev;
>  
>  	cpu_detect_flags();
>  	hash_init_accel();
> @@ -1218,6 +1219,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
>  		usage(&mkfs_cmd, 1);
>  
>  	opt_zoned = !!(features.incompat_flags & BTRFS_FEATURE_INCOMPAT_ZONED);
> +	single_dev = !!(features.compat_ro_flags & BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV);
>  
>  	if (source_dir && device_count > 1) {
>  		error("the option -r is limited to a single device");
> @@ -1815,7 +1817,7 @@ out:
>  		device_count = argc - optind;
>  		while (device_count-- > 0) {
>  			file = argv[optind++];
> -			if (path_is_block_device(file) == 1)
> +			if (path_is_block_device(file) == 1 && !single_dev)
>  				btrfs_register_one_device(file);
>  		}
>  	}
> diff --git a/tune/main.c b/tune/main.c
> index 0ca1e01282c9..95e55fcda44f 100644
> --- a/tune/main.c
> +++ b/tune/main.c
> @@ -42,27 +42,31 @@
>  #include "tune/tune.h"
>  #include "check/clear-cache.h"
>  
> +#define SET_SUPER_FLAGS(type) \
> +static int set_super_##type##_flags(struct btrfs_root *root, u64 flags) \
> +{									\
> +	struct btrfs_trans_handle *trans;				\
> +	struct btrfs_super_block *disk_super;				\
> +	u64 super_flags;						\
> +	int ret;							\
> +									\
> +	disk_super = root->fs_info->super_copy;				\
> +	super_flags = btrfs_super_##type##_flags(disk_super);		\
> +	super_flags |= flags;						\
> +	trans = btrfs_start_transaction(root, 1);			\
> +	BUG_ON(IS_ERR(trans));						\
> +	btrfs_set_super_##type##_flags(disk_super, super_flags);	\
> +	ret = btrfs_commit_transaction(trans, root);			\
> +									\
> +	return ret;							\
> +}
> +
> +SET_SUPER_FLAGS(incompat)
> +SET_SUPER_FLAGS(compat_ro)
> +
>  static char *device;
>  static int force = 0;
>  
> -static int set_super_incompat_flags(struct btrfs_root *root, u64 flags)
> -{
> -	struct btrfs_trans_handle *trans;
> -	struct btrfs_super_block *disk_super;
> -	u64 super_flags;
> -	int ret;
> -
> -	disk_super = root->fs_info->super_copy;
> -	super_flags = btrfs_super_incompat_flags(disk_super);
> -	super_flags |= flags;
> -	trans = btrfs_start_transaction(root, 1);
> -	BUG_ON(IS_ERR(trans));
> -	btrfs_set_super_incompat_flags(disk_super, super_flags);
> -	ret = btrfs_commit_transaction(trans, root);
> -
> -	return ret;
> -}
> -
>  static int convert_to_fst(struct btrfs_fs_info *fs_info)
>  {
>  	int ret;
> @@ -102,6 +106,7 @@ static const char * const tune_usage[] = {
>  	OPTLINE("-r", "enable extended inode refs (mkfs: extref, for hardlink limits)"),
>  	OPTLINE("-x", "enable skinny metadata extent refs (mkfs: skinny-metadata)"),
>  	OPTLINE("-n", "enable no-holes feature (mkfs: no-holes, more efficient sparse file representation)"),
> +	OPTLINE("-s", "enable single device feature (mkfs: single-dev, allows same fsid mounting)"),

btrfstune is going to be integrated into an actual btrfs command, so we're no
longer using short options for new btrfstune commands.  Figure out a long name
instead and use that only.  Something like

--convert-to-single-device

as you would be using this on an existing file system.  The rest of the code is
generally fine.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] btrfs-progs: Add the single-dev feature (to both mkfs/tune)
  2023-08-17 15:46   ` Josef Bacik
@ 2023-08-17 16:16     ` Guilherme G. Piccoli
  0 siblings, 0 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-17 16:16 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On 17/08/2023 12:46, Josef Bacik wrote:
> [...]
>> Also, a design decision: I've skipped the btrfs_register_one_device()
>> call when mkfs was just used with the single-dev tuning, or else
>> it shows a (harmless) error and succeeds, since of course scanning
>> fails for such devices, as per the feature implementation.
>> So, I thought it was more straightforward to just skip the call itself.
>>
> 
> This is a reasonable approach.
> [...]

Thanks for the review =)


>> [...]
>>  static int convert_to_fst(struct btrfs_fs_info *fs_info)
>>  {
>>  	int ret;
>> @@ -102,6 +106,7 @@ static const char * const tune_usage[] = {
>>  	OPTLINE("-r", "enable extended inode refs (mkfs: extref, for hardlink limits)"),
>>  	OPTLINE("-x", "enable skinny metadata extent refs (mkfs: skinny-metadata)"),
>>  	OPTLINE("-n", "enable no-holes feature (mkfs: no-holes, more efficient sparse file representation)"),
>> +	OPTLINE("-s", "enable single device feature (mkfs: single-dev, allows same fsid mounting)"),
> 
> btrfstune is going to be integrated into an actual btrfs command, so we're no
> longer using short options for new btrfstune commands.  Figure out a long name
> instead and use that only.  Something like
> 
> --convert-to-single-device
> 
> as you would be using this on an existing file system.  The rest of the code is
> generally fine.  Thanks,
> 
> Josef
> 

Perfect, will do! Thanks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-17 15:41   ` Josef Bacik
@ 2023-08-17 16:20     ` Guilherme G. Piccoli
  2023-08-17 16:58       ` Josef Bacik
  0 siblings, 1 reply; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-17 16:20 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On 17/08/2023 12:41, Josef Bacik wrote:
>> [...]
>> +	pr_info("BTRFS: virtual fsid (%pU) set for SINGLE_DEV device %s (real fsid %pU)\n",
>> +		disk_super->fsid, path, disk_super->metadata_uuid);
> 
> I think just
> 
> btrfs_info(NULL, "virtual fsid....")
> 
> is fine here.
> 

So just for my full understanding, do you think we shouldn't show the
real fsid here, but keep showing the virtual one, right? Or you prefer
we literally show "virtual fsid...."?


>> +}
>> +
>>  /*
>> - * Add new device to list of registered devices
>> + * Add new device to list of registered devices, or in case of a SINGLE_DEV
>> + * device, also creates a virtual fsid to cope with same-fsid cases.
>>   *
>>   * Returns:
>>   * device pointer which was just added or updated when successful
>> @@ -784,7 +814,7 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
>>   */
>>  static noinline struct btrfs_device *device_list_add(const char *path,
>>  			   struct btrfs_super_block *disk_super,
>> -			   bool *new_device_added)
>> +			   bool *new_device_added, bool single_dev)
> 
> Same as the comment above.  Generally speaking for stuff like this where we can
> derive the value local to the function we want to do that instead of growing the
> argument list.  Thanks,
> 
> Josef
> 

OK, will do it both here and above as you suggested. Thanks for the review!
Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-17 16:20     ` Guilherme G. Piccoli
@ 2023-08-17 16:58       ` Josef Bacik
  2023-08-17 17:09         ` Guilherme G. Piccoli
  0 siblings, 1 reply; 22+ messages in thread
From: Josef Bacik @ 2023-08-17 16:58 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On Thu, Aug 17, 2023 at 01:20:55PM -0300, Guilherme G. Piccoli wrote:
> On 17/08/2023 12:41, Josef Bacik wrote:
> >> [...]
> >> +	pr_info("BTRFS: virtual fsid (%pU) set for SINGLE_DEV device %s (real fsid %pU)\n",
> >> +		disk_super->fsid, path, disk_super->metadata_uuid);
> > 
> > I think just
> > 
> > btrfs_info(NULL, "virtual fsid....")
> > 
> > is fine here.
> > 
> 
> So just for my full understanding, do you think we shouldn't show the
> real fsid here, but keep showing the virtual one, right? Or you prefer
> we literally show "virtual fsid...."?

Oh no sorry, just swap pr_info for btrfs_info, and keep the rest the same, so

	btrfs_info("virtual fsid (%pU) set for SINGLE_DEV device %s (real fsid %pU)\n",
		   disk_super->fsid, path, disk_super->metadata_uuid);

thanks,

Josef

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-17 16:58       ` Josef Bacik
@ 2023-08-17 17:09         ` Guilherme G. Piccoli
  0 siblings, 0 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-17 17:09 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On 17/08/2023 13:58, Josef Bacik wrote:
> [...]
> Oh no sorry, just swap pr_info for btrfs_info, and keep the rest the same, so
> 
> 	btrfs_info("virtual fsid (%pU) set for SINGLE_DEV device %s (real fsid %pU)\n",
> 		   disk_super->fsid, path, disk_super->metadata_uuid);
> 
> thanks,
> 
> Josef
> 

Oh OK, thanks! My apologies, I confused the stuff heh
Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] btrfs: Add parameter to force devices behave as single-dev ones
  2023-08-17 15:44   ` Josef Bacik
@ 2023-08-20 18:16     ` Guilherme G. Piccoli
  0 siblings, 0 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-20 18:16 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, clm, dsterba, linux-fsdevel, kernel, kernel-dev,
	anand.jain, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

On 17/08/2023 12:44, Josef Bacik wrote:
> [...]
> Now this one I'm not a fan of.  For old file systems you can simply btrfstune
> them to have your new flag.  Is there a reason why that wouldn't be an option?
> 
> If it is indeed required, which is a huge if, I'd rather this be accomplished a
> mount option.  I have a strong dislike for new mount options, but I think that's
> a cleaner way to accomplish this than a module option.  Thanks,
> 

Thanks Josef, for your feedback. I'm also not a fan of this module
parameter, and agree that a mount option would be more interesting if
that's indeed a requirement.

But I've discussed internally and it seems we can live without this one,
I'll drop it for next version.

Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-03 15:43 ` [PATCH 2/3] btrfs: Introduce the single-dev feature Guilherme G. Piccoli
  2023-08-04  8:27   ` Qu Wenruo
  2023-08-17 15:41   ` Josef Bacik
@ 2023-08-23 16:31   ` Anand Jain
  2023-08-24 20:55     ` Guilherme G. Piccoli
  2023-08-29 20:28   ` Guilherme G. Piccoli
  3 siblings, 1 reply; 22+ messages in thread
From: Anand Jain @ 2023-08-23 16:31 UTC (permalink / raw)
  To: Guilherme G. Piccoli, linux-btrfs
  Cc: clm, josef, dsterba, linux-fsdevel, kernel, kernel-dev, david,
	kreijack, johns, ludovico.denittis, quwenruo.btrfs, wqu, vivek



On 8/3/23 23:43, Guilherme G. Piccoli wrote:
> Btrfs doesn't currently support to mount 2 different devices holding the
> same filesystem - the fsid is used as a unique identifier in the driver.

fsid is for the external frontend, systemd, and udev stuff;
metadata_uuid pertains to the actual btrfs on-disk.

> This case is supported though in some other common filesystems, like
> ext4;

> one of the reasons for which is not trivial supporting this case
> on btrfs is due to its multi-device filesystem nature, native RAID, etc.

How is it related to the multi-device aspect? The main limitation is
that a disk image can be cloned without maintaining a unique device-
uuid.

> Supporting the same-fsid mounts has the advantage of allowing btrfs to
> be used in A/B partitioned devices, like mobile phones or the Steam Deck
> for example.

> Without this support, it's not safe for users to keep the
> same "image version" in both A and B partitions, a setup that is quite
> common for development, for example. Also, as a big bonus, it allows fs
> integrity check based on block devices for RO devices (whereas currently
> it is required that both have different fsid, breaking the block device
> hash comparison).

Does it apply to smaller disk images? Otherwise, it will be very 
time-consuming. Just curious, how is the checksum verified for the entire
disk? (Btrfs might provide a checksum tree-based solution at
some point.)

> Such same-fsid mounting is hereby added through the usage of the
> filesystem feature "single-dev" - when such feature is used, btrfs
> generates a random fsid for the filesystem and leverages the long-term
> present metadata_uuid infrastructure to enable the usage of this
> secondary virtual fsid, effectively requiring few non-invasive changes
> to the code and no new potential corner cases.
> 
> In order to prevent more code complexity and corner cases, given
> the nature of this mechanism (single-devices), the single-dev feature
> is not allowed when the metadata_uuid flag is already present on the
> fs, or if the device is on fsid-change state. Device removal/replace
> is also disabled for devices presenting the single-dev feature.
> 
> Suggested-by: John Schoenick <johns@valvesoftware.com>
> Suggested-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> ---
>   fs/btrfs/disk-io.c         | 19 +++++++-
>   fs/btrfs/fs.h              |  3 +-
>   fs/btrfs/ioctl.c           | 18 +++++++
>   fs/btrfs/super.c           |  8 ++--
>   fs/btrfs/volumes.c         | 97 ++++++++++++++++++++++++++++++--------
>   fs/btrfs/volumes.h         |  3 +-
>   include/uapi/linux/btrfs.h |  7 +++
>   7 files changed, 127 insertions(+), 28 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 669b10355091..455fa4949c98 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -320,7 +320,7 @@ static bool check_tree_block_fsid(struct extent_buffer *eb)
>   	/*
>   	 * alloc_fs_devices() copies the fsid into metadata_uuid if the
>   	 * metadata_uuid is unset in the superblock, including for a seed device.
> -	 * So, we can use fs_devices->metadata_uuid.
> +	 * So, we can use fs_devices->metadata_uuid; same for SINGLE_DEV devices.
>   	 */
>   	if (!memcmp(fsid, fs_info->fs_devices->metadata_uuid, BTRFS_FSID_SIZE))
>   		return false;
> @@ -2288,6 +2288,7 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
>   {
>   	u64 nodesize = btrfs_super_nodesize(sb);
>   	u64 sectorsize = btrfs_super_sectorsize(sb);
> +	u8 *fsid;
>   	int ret = 0;
>   
>   	if (btrfs_super_magic(sb) != BTRFS_MAGIC) {
> @@ -2368,7 +2369,21 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
>   		ret = -EINVAL;
>   	}
>   
> -	if (memcmp(fs_info->fs_devices->fsid, sb->fsid, BTRFS_FSID_SIZE)) {
> +	/*
> +	 * For SINGLE_DEV devices, btrfs creates a random fsid and makes
> +	 * use of the metadata_uuid infrastructure in order to allow, for
> +	 * example, two devices with same fsid getting mounted at the same
> +	 * time. But notice no changes happen at the disk level, so the
> +	 * random generated fsid is a driver abstraction, not to be written
> +	 * in the disk. That's the reason we're required here to compare the
> +	 * fsid with the metadata_uuid for such devices.
> +	 */
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV))
> +		fsid = fs_info->fs_devices->metadata_uuid;
> +	else
> +		fsid = fs_info->fs_devices->fsid;

Below alloc_fs_device(), fsid is still being kept equal to metadata_uuid
in memory for single_dev. So, this distinction is unnecessary.


> +
> +	if (memcmp(fsid, sb->fsid, BTRFS_FSID_SIZE)) {

David prefers memcmp to be either compared to == or != to 0
depending on the requirement.


>   		btrfs_err(fs_info,
>   		"superblock fsid doesn't match fsid of fs_devices: %pU != %pU",
>   			  sb->fsid, fs_info->fs_devices->fsid);
> diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
> index 203d2a267828..c6d124973361 100644
> --- a/fs/btrfs/fs.h
> +++ b/fs/btrfs/fs.h
> @@ -200,7 +200,8 @@ enum {
>   	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |	\
>   	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID | \
>   	 BTRFS_FEATURE_COMPAT_RO_VERITY |		\
> -	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
> +	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE |	\
> +	 BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)
>   
>   #define BTRFS_FEATURE_COMPAT_RO_SAFE_SET	0ULL
>   #define BTRFS_FEATURE_COMPAT_RO_SAFE_CLEAR	0ULL
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index a895d105464b..56703d87def9 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -2678,6 +2678,12 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
>   	if (!capable(CAP_SYS_ADMIN))
>   		return -EPERM;
>   
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
> +		btrfs_err(fs_info,
> +			  "device removal is unsupported on SINGLE_DEV devices\n");
> +		return -EINVAL;
> +	}
> +
>   	vol_args = memdup_user(arg, sizeof(*vol_args));
>   	if (IS_ERR(vol_args))
>   		return PTR_ERR(vol_args);
> @@ -2744,6 +2750,12 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
>   	if (!capable(CAP_SYS_ADMIN))
>   		return -EPERM;
>   
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
> +		btrfs_err(fs_info,
> +			  "device removal is unsupported on SINGLE_DEV devices\n");
> +		return -EINVAL;
> +	}
> +
>   	vol_args = memdup_user(arg, sizeof(*vol_args));
>   	if (IS_ERR(vol_args))
>   		return PTR_ERR(vol_args);
> @@ -3268,6 +3280,12 @@ static long btrfs_ioctl_dev_replace(struct btrfs_fs_info *fs_info,
>   	if (!capable(CAP_SYS_ADMIN))
>   		return -EPERM;
>   
> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
> +		btrfs_err(fs_info,
> +			  "device removal is unsupported on SINGLE_DEV devices\n");
> +		return -EINVAL;
> +	}
> +
>   	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
>   		btrfs_err(fs_info, "device replace not supported on extent tree v2 yet");
>   		return -EINVAL;
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index f1dd172d8d5b..ee87189b1ccd 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -883,7 +883,7 @@ static int btrfs_parse_device_options(const char *options, blk_mode_t flags)
>   				error = -ENOMEM;
>   				goto out;
>   			}
> -			device = btrfs_scan_one_device(device_name, flags);
> +			device = btrfs_scan_one_device(device_name, flags, true);
>   			kfree(device_name);
>   			if (IS_ERR(device)) {
>   				error = PTR_ERR(device);
> @@ -1478,7 +1478,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
>   		goto error_fs_info;
>   	}
>   
> -	device = btrfs_scan_one_device(device_name, mode);
> +	device = btrfs_scan_one_device(device_name, mode, true);
>   	if (IS_ERR(device)) {
>   		mutex_unlock(&uuid_mutex);
>   		error = PTR_ERR(device);
> @@ -2190,7 +2190,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
>   	switch (cmd) {
>   	case BTRFS_IOC_SCAN_DEV:
>   		mutex_lock(&uuid_mutex);
> -		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
> +		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
>   		ret = PTR_ERR_OR_ZERO(device);
>   		mutex_unlock(&uuid_mutex);
>   		break;
> @@ -2204,7 +2204,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
>   		break;
>   	case BTRFS_IOC_DEVICES_READY:
>   		mutex_lock(&uuid_mutex);
> -		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
> +		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
>   		if (IS_ERR(device)) {
>   			mutex_unlock(&uuid_mutex);
>   			ret = PTR_ERR(device);
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 73753dae111a..433a490f2de8 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -681,12 +681,14 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>   	return -EINVAL;
>   }
>   


> -static u8 *btrfs_sb_metadata_uuid_or_null(struct btrfs_super_block *sb)
> +static u8 *btrfs_sb_metadata_uuid_single_dev(struct btrfs_super_block *sb,
> +					     bool has_metadata_uuid,
> +					     bool single_dev)
>   {
> -	bool has_metadata_uuid = (btrfs_super_incompat_flags(sb) &
> -				  BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
> +	if (has_metadata_uuid || single_dev)
> +		return sb->metadata_uuid;
>   
> -	return has_metadata_uuid ? sb->metadata_uuid : NULL;
> +	return NULL;
>   }
>   
>   u8 *btrfs_sb_fsid_ptr(struct btrfs_super_block *sb)

You can rebase the code onto the latest misc-next branch.
This is because we have dropped the function
btrfs_sb_metadata_uuid_or_null() in the final integration.


> @@ -775,8 +777,36 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
>   
>   	return NULL;
>   }
> +
> +static void prepare_virtual_fsid(struct btrfs_super_block *disk_super,
> +				 const char *path)
> +{
> +	struct btrfs_fs_devices *fs_devices;
> +	u8 vfsid[BTRFS_FSID_SIZE];
> +	bool dup_fsid = true;
> +
> +	while (dup_fsid) {
> +		dup_fsid = false;
> +		generate_random_uuid(vfsid);
> +
> +		list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
> +			if (!memcmp(vfsid, fs_devices->fsid, BTRFS_FSID_SIZE) ||
> +			    !memcmp(vfsid, fs_devices->metadata_uuid,
> +				    BTRFS_FSID_SIZE))
> +				dup_fsid = true;
> +		}
> +	}
> +
> +	memcpy(disk_super->metadata_uuid, disk_super->fsid, BTRFS_FSID_SIZE);
> +	memcpy(disk_super->fsid, vfsid, BTRFS_FSID_SIZE);
> +
> +	pr_info("BTRFS: virtual fsid (%pU) set for SINGLE_DEV device %s (real fsid %pU)\n",
> +		disk_super->fsid, path, disk_super->metadata_uuid);
> +}
> +
>   /*
> - * Add new device to list of registered devices
> + * Add new device to list of registered devices, or in case of a SINGLE_DEV
> + * device, also creates a virtual fsid to cope with same-fsid cases.
>    *
>    * Returns:
>    * device pointer which was just added or updated when successful
> @@ -784,7 +814,7 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
>    */
>   static noinline struct btrfs_device *device_list_add(const char *path,
>   			   struct btrfs_super_block *disk_super,
> -			   bool *new_device_added)
> +			   bool *new_device_added, bool single_dev)
>   {
>   	struct btrfs_device *device;
>   	struct btrfs_fs_devices *fs_devices = NULL;
> @@ -805,23 +835,32 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>   		return ERR_PTR(error);
>   	}
>   
> -	if (fsid_change_in_progress) {
> -		if (!has_metadata_uuid)
> -			fs_devices = find_fsid_inprogress(disk_super);
> -		else
> -			fs_devices = find_fsid_changed(disk_super);
> -	} else if (has_metadata_uuid) {
> -		fs_devices = find_fsid_with_metadata_uuid(disk_super);
> +	if (single_dev) {
> +		if (has_metadata_uuid || fsid_change_in_progress) {
> +			btrfs_err(NULL,
> +		"SINGLE_DEV devices don't support the metadata_uuid feature\n");
> +			return ERR_PTR(-EINVAL);
> +		}
> +		prepare_virtual_fsid(disk_super, path);
>   	} else {
> -		fs_devices = find_fsid_reverted_metadata(disk_super);
> -		if (!fs_devices)
> -			fs_devices = find_fsid(disk_super->fsid, NULL);
> +		if (fsid_change_in_progress) {
> +			if (!has_metadata_uuid)
> +				fs_devices = find_fsid_inprogress(disk_super);
> +			else
> +				fs_devices = find_fsid_changed(disk_super);
> +		} else if (has_metadata_uuid) {
> +			fs_devices = find_fsid_with_metadata_uuid(disk_super);
> +		} else {
> +			fs_devices = find_fsid_reverted_metadata(disk_super);
> +			if (!fs_devices)
> +				fs_devices = find_fsid(disk_super->fsid, NULL);
> +		}
>   	}
>   
> -
>   	if (!fs_devices) {
>   		fs_devices = alloc_fs_devices(disk_super->fsid,
> -				btrfs_sb_metadata_uuid_or_null(disk_super));
> +				btrfs_sb_metadata_uuid_single_dev(disk_super,
> +							has_metadata_uuid, single_dev));

I think it is a good idea to rebase on latest misc-next and add the
below patch, as the arguments of alloc_fs_device() have been simplified.

    [PATCH resend] btrfs: simplify alloc_fs_devices() remove arg2


Thanks, Anand

>   		if (IS_ERR(fs_devices))
>   			return ERR_CAST(fs_devices);
>   
> @@ -1365,13 +1404,15 @@ int btrfs_forget_devices(dev_t devt)
>    * and we are not allowed to call set_blocksize during the scan. The superblock
>    * is read via pagecache
>    */
> -struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
> +struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
> +					   bool mounting)
>   {
>   	struct btrfs_super_block *disk_super;
>   	bool new_device_added = false;
>   	struct btrfs_device *device = NULL;
>   	struct block_device *bdev;
>   	u64 bytenr, bytenr_orig;
> +	bool single_dev;
>   	int ret;
>   
>   	lockdep_assert_held(&uuid_mutex);
> @@ -1410,7 +1451,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
>   		goto error_bdev_put;
>   	}
>   
> -	device = device_list_add(path, disk_super, &new_device_added);
> +	single_dev = btrfs_super_compat_ro_flags(disk_super) &
> +			BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV;
> +
> +	if (!mounting && single_dev) {
> +		pr_info("BTRFS: skipped non-mount scan on SINGLE_DEV device %s\n",
> +			path);
> +		btrfs_release_disk_super(disk_super);
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	device = device_list_add(path, disk_super, &new_device_added, single_dev);
>   	if (!IS_ERR(device) && new_device_added)
>   		btrfs_free_stale_devices(device->devt, device);
>   
> @@ -2406,6 +2457,12 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
>   
>   	args->devid = btrfs_stack_device_id(&disk_super->dev_item);
>   	memcpy(args->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE);
> +
> +	/*
> +	 * Note that SINGLE_DEV devices are not handled in a special way here;
> +	 * device removal/replace is instead forbidden when such feature is
> +	 * present, this note is for future users/readers of this function.
> +	 */
>   	if (btrfs_fs_incompat(fs_info, METADATA_UUID))
>   		memcpy(args->fsid, disk_super->metadata_uuid, BTRFS_FSID_SIZE);
>   	else
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 0f87057bb575..b9856c801567 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -611,7 +611,8 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
>   void btrfs_mapping_tree_free(struct extent_map_tree *tree);
>   int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
>   		       blk_mode_t flags, void *holder);
> -struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags);
> +struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
> +					   bool mounting);
>   int btrfs_forget_devices(dev_t devt);
>   void btrfs_close_devices(struct btrfs_fs_devices *fs_devices);
>   void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices);
> diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
> index dbb8b96da50d..cb7a7cfe1ea9 100644
> --- a/include/uapi/linux/btrfs.h
> +++ b/include/uapi/linux/btrfs.h
> @@ -313,6 +313,13 @@ struct btrfs_ioctl_fs_info_args {
>    */
>   #define BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE	(1ULL << 3)
>   
> +/*
> + * Single devices (as flagged by the corresponding compat_ro flag) only
> + * gets scanned during mount time; also, a random fsid is generated for
> + * them, in order to cope with same-fsid filesystem mounts.
> + */
> +#define BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV		(1ULL << 4)
> +
>   #define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF	(1ULL << 0)
>   #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL	(1ULL << 1)
>   #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS	(1ULL << 2)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-23 16:31   ` Anand Jain
@ 2023-08-24 20:55     ` Guilherme G. Piccoli
  0 siblings, 0 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-24 20:55 UTC (permalink / raw)
  To: Anand Jain
  Cc: linux-btrfs, clm, josef, dsterba, linux-fsdevel, kernel,
	kernel-dev, david, kreijack, johns, ludovico.denittis,
	quwenruo.btrfs, wqu, vivek

Hi Anand, first of all, thanks for the review! More comments inline below.


On 23/08/2023 13:31, Anand Jain wrote:
> [...]
> On 8/3/23 23:43, Guilherme G. Piccoli wrote:
>> Btrfs doesn't currently support to mount 2 different devices holding the
>> same filesystem - the fsid is used as a unique identifier in the driver.
> 
> fsid is for the external frontend, systemd, and udev stuff;
> metadata_uuid pertains to the actual btrfs on-disk.
>

True, agreed - but I guess my phrase is not wrong per se, right? It's
more like an "abstraction" of the concept...I'm in fact talking about
the fsid as the exposed entity to the system, like for udev, etc.

I guess I'll keep it on V3 if you don't oppose.


> [...] 
>> one of the reasons for which is not trivial supporting this case
>> on btrfs is due to its multi-device filesystem nature, native RAID, etc.
> 
> How is it related to the multi-device aspect? The main limitation is
> that a disk image can be cloned without maintaining a unique device-
> uuid.
> 

Oh okay, I'll rephrase / drop this part, you're right - thanks.


> [...]
>> Without this support, it's not safe for users to keep the
>> same "image version" in both A and B partitions, a setup that is quite
>> common for development, for example. Also, as a big bonus, it allows fs
>> integrity check based on block devices for RO devices (whereas currently
>> it is required that both have different fsid, breaking the block device
>> hash comparison).
> 
> Does it apply to smaller disk images? Otherwise, it will be very 
> time-consuming. Just curious, how is the checksum verified for the entire
> disk? (Btrfs might provide a checksum tree-based solution at
> some point.)
>

The disk image is currently 5G, NVMe device - it's usually fast in our case.

After discussing internally with the folks knowledgeable about the
update process, it seems the Desync tool is responsible for that, taking
an index chunk-comparison during the update. Based in my quick check,
seems the codes related to that are here:

https://github.com/folbricht/desync/blob/master/verifyindex.go
https://github.com/folbricht/desync/blob/master/index.go


>> +	 */
>> +	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV))
>> +		fsid = fs_info->fs_devices->metadata_uuid;
>> +	else
>> +		fsid = fs_info->fs_devices->fsid;
> 
> Below alloc_fs_device(), fsid is still being kept equal to metadata_uuid
> in memory for single_dev. So, this distinction is unnecessary.
> 
> 
>> +
>> +	if (memcmp(fsid, sb->fsid, BTRFS_FSID_SIZE)) {
> 
> David prefers memcmp to be either compared to == or != to 0
> depending on the requirement.
> 

OK, thanks - I'll experiment dropping this part, and I see you have code
present in for-next, changing this specific memcmp. So hopefully don't
need my code anymore but if we do, I'll keep the convention =)


> [...]
>> -static u8 *btrfs_sb_metadata_uuid_or_null(struct btrfs_super_block *sb)
>> +static u8 *btrfs_sb_metadata_uuid_single_dev(struct btrfs_super_block *sb,
>> +					     bool has_metadata_uuid,
>> +					     bool single_dev)
>>   {
>> -	bool has_metadata_uuid = (btrfs_super_incompat_flags(sb) &
>> -				  BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
>> +	if (has_metadata_uuid || single_dev)
>> +		return sb->metadata_uuid;
>>   
>> -	return has_metadata_uuid ? sb->metadata_uuid : NULL;
>> +	return NULL;
>>   }
>>   
>>   u8 *btrfs_sb_fsid_ptr(struct btrfs_super_block *sb)
> 
> You can rebase the code onto the latest misc-next branch.
> This is because we have dropped the function
> btrfs_sb_metadata_uuid_or_null() in the final integration.
> [...]
>> -				btrfs_sb_metadata_uuid_or_null(disk_super));
>> +				btrfs_sb_metadata_uuid_single_dev(disk_super,
>> +							has_metadata_uuid, single_dev));
> 
> I think it is a good idea to rebase on latest misc-next and add the
> below patch, as the arguments of alloc_fs_device() have been simplified.
> 
>     [PATCH resend] btrfs: simplify alloc_fs_devices() remove arg2
> 
> 

Very good idea, thanks for pointing that, will do it for V3!
Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-03 15:43 ` [PATCH 2/3] btrfs: Introduce the single-dev feature Guilherme G. Piccoli
                     ` (2 preceding siblings ...)
  2023-08-23 16:31   ` Anand Jain
@ 2023-08-29 20:28   ` Guilherme G. Piccoli
  2023-08-30  7:11     ` Anand Jain
  3 siblings, 1 reply; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-29 20:28 UTC (permalink / raw)
  To: anand.jain, dsterba, josef
  Cc: clm, linux-btrfs, linux-fsdevel, kernel, kernel-dev, david,
	kreijack, johns, ludovico.denittis, quwenruo.btrfs, wqu, vivek

On 03/08/2023 12:43, Guilherme G. Piccoli wrote:
> [...]
>  fs/btrfs/disk-io.c         | 19 +++++++-
>  fs/btrfs/fs.h              |  3 +-
>  fs/btrfs/ioctl.c           | 18 +++++++
>  fs/btrfs/super.c           |  8 ++--
>  fs/btrfs/volumes.c         | 97 ++++++++++++++++++++++++++++++--------
>  fs/btrfs/volumes.h         |  3 +-
>  include/uapi/linux/btrfs.h |  7 +++
>  7 files changed, 127 insertions(+), 28 deletions(-)
> 

Hi folks, while working the xfstests for this case, I've noticed the
single-dev feature is not exposed in sysfs! Should we make it available
there?

A quick change here made me see it there, but it sticks to value 0 ...
maybe I'm not really aware of how the sysfs/features directory should
work, hence I appreciate if you could enlighten me if makes sense to
have this feature there (and if it's OK showing zero or should flip in
case a device makes use of the feature, maybe?).

Thanks in advance,


Guilherme

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-29 20:28   ` Guilherme G. Piccoli
@ 2023-08-30  7:11     ` Anand Jain
  2023-08-30 12:00       ` Guilherme G. Piccoli
  0 siblings, 1 reply; 22+ messages in thread
From: Anand Jain @ 2023-08-30  7:11 UTC (permalink / raw)
  To: Guilherme G. Piccoli, dsterba, josef
  Cc: clm, linux-btrfs, linux-fsdevel, kernel, kernel-dev, david,
	kreijack, johns, ludovico.denittis, quwenruo.btrfs, wqu, vivek

On 30/08/2023 04:28, Guilherme G. Piccoli wrote:
> On 03/08/2023 12:43, Guilherme G. Piccoli wrote:
>> [...]
>>   fs/btrfs/disk-io.c         | 19 +++++++-
>>   fs/btrfs/fs.h              |  3 +-
>>   fs/btrfs/ioctl.c           | 18 +++++++
>>   fs/btrfs/super.c           |  8 ++--
>>   fs/btrfs/volumes.c         | 97 ++++++++++++++++++++++++++++++--------
>>   fs/btrfs/volumes.h         |  3 +-
>>   include/uapi/linux/btrfs.h |  7 +++
>>   7 files changed, 127 insertions(+), 28 deletions(-)
>>
> 
> Hi folks, while working the xfstests for this case, I've noticed the
> single-dev feature is not exposed in sysfs! Should we make it available
> there?
>  > A quick change here made me see it there, but it sticks to value 0 ...
> maybe I'm not really aware of how the sysfs/features directory should
> work, hence I appreciate if you could enlighten me if makes sense to
> have this feature there (and if it's OK showing zero or should flip in
> case a device makes use of the feature, maybe?).
> 

Yeah, we need sysfs entries for the new feature and test cases that 
generally rely on it to determine whether to skip or run the test case.

The paths would be:
- /sys/fs/btrfs/features/..
- /sys/fs/btrfs/<uuid>/features/..

These paths emit only output 0. The presence of the sysfs file indicates 
that the feature is supported.

Thanks,
Anand


> Thanks in advance,
> 
> 
> Guilherme


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] btrfs: Introduce the single-dev feature
  2023-08-30  7:11     ` Anand Jain
@ 2023-08-30 12:00       ` Guilherme G. Piccoli
  0 siblings, 0 replies; 22+ messages in thread
From: Guilherme G. Piccoli @ 2023-08-30 12:00 UTC (permalink / raw)
  To: Anand Jain
  Cc: clm, linux-btrfs, linux-fsdevel, kernel, kernel-dev, david,
	kreijack, johns, ludovico.denittis, quwenruo.btrfs, wqu, vivek,
	dsterba, josef

On 30/08/2023 04:11, Anand Jain wrote:
> [...]
> Yeah, we need sysfs entries for the new feature and test cases that 
> generally rely on it to determine whether to skip or run the test case.
> 
> The paths would be:
> - /sys/fs/btrfs/features/..
> - /sys/fs/btrfs/<uuid>/features/..
> 
> These paths emit only output 0. The presence of the sysfs file indicates 
> that the feature is supported.
> 
> Thanks,
> Anand
> 

Thanks a lot Anand!

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-08-30 18:36 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-03 15:43 [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature Guilherme G. Piccoli
2023-08-03 15:43 ` [PATCH 1/3] btrfs-progs: Add the single-dev feature (to both mkfs/tune) Guilherme G. Piccoli
2023-08-17 15:46   ` Josef Bacik
2023-08-17 16:16     ` Guilherme G. Piccoli
2023-08-03 15:43 ` [PATCH 2/3] btrfs: Introduce the single-dev feature Guilherme G. Piccoli
2023-08-04  8:27   ` Qu Wenruo
2023-08-04 11:38     ` Guilherme G. Piccoli
2023-08-17 15:41   ` Josef Bacik
2023-08-17 16:20     ` Guilherme G. Piccoli
2023-08-17 16:58       ` Josef Bacik
2023-08-17 17:09         ` Guilherme G. Piccoli
2023-08-23 16:31   ` Anand Jain
2023-08-24 20:55     ` Guilherme G. Piccoli
2023-08-29 20:28   ` Guilherme G. Piccoli
2023-08-30  7:11     ` Anand Jain
2023-08-30 12:00       ` Guilherme G. Piccoli
2023-08-03 15:43 ` [PATCH 3/3] btrfs: Add parameter to force devices behave as single-dev ones Guilherme G. Piccoli
2023-08-17 15:44   ` Josef Bacik
2023-08-20 18:16     ` Guilherme G. Piccoli
2023-08-17 13:56 ` [PATCH V2 0/3] Supporting same fsid mounting through a compat_ro feature Guilherme G. Piccoli
2023-08-17 14:19 ` Josef Bacik
2023-08-17 14:23   ` Guilherme G. Piccoli

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.