All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] btrfs, sysfs cleanup and add dev_state
@ 2019-12-05 11:27 Anand Jain
  2019-12-05 11:27 ` [PATCH 1/4] btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid Anand Jain
                   ` (6 more replies)
  0 siblings, 7 replies; 33+ messages in thread
From: Anand Jain @ 2019-12-05 11:27 UTC (permalink / raw)
  To: linux-btrfs

Patch 1/4 is a cleanup patch.
Patch 2/4 adds the kobject devinfo which is a preparatory to patch 4/4.
Patch 3/4 was sent before, no functional changes, change log is updated.
Patch 4/4 creates the attribute dev_state.

Anand Jain (4):
  btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid
  btrfs: sysfs, add UUID/devinfo kobject
  btrfs: sysfs, rename device_link add,remove functions
  btrfs: sysfs, add devid/dev_state kobject and attribute

 fs/btrfs/dev-replace.c |   4 +-
 fs/btrfs/sysfs.c       | 130 +++++++++++++++++++++++++++++++----------
 fs/btrfs/sysfs.h       |   4 +-
 fs/btrfs/volumes.c     |   8 +--
 fs/btrfs/volumes.h     |   3 +
 5 files changed, 111 insertions(+), 38 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/4] btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid
  2019-12-05 11:27 [PATCH 0/4] btrfs, sysfs cleanup and add dev_state Anand Jain
@ 2019-12-05 11:27 ` Anand Jain
  2019-12-05 11:27 ` [PATCH 2/4] btrfs: sysfs, add UUID/devinfo kobject Anand Jain
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 33+ messages in thread
From: Anand Jain @ 2019-12-05 11:27 UTC (permalink / raw)
  To: linux-btrfs

We have one simple function btrfs_sysfs_remove_fsid() to cleanup
kobjects added by btrfs_sysfs_add_fsid() and calls kobject_put() and
kobject_delete() only if the kobject is initialized or not null.
So use this function while retreating in the function
btrfs_sysfs_add_fsid().

One difference though, earlier we did not call kobject_del() during
retreat in btrfs_sysfs_add_fsid() and I did experiment to figureout
if that's an error or warning, however I didn't notice any such issues
with or without kobject_del() not being called.

So this patch is just for cleanup.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 16379f491ca1..4cda410f0e6b 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1076,7 +1076,7 @@ int btrfs_sysfs_add_fsid(struct btrfs_fs_devices *fs_devs)
 	if (!fs_devs->devices_kobj) {
 		btrfs_err(fs_devs->fs_info,
 			  "failed to init sysfs device interface");
-		kobject_put(&fs_devs->fsid_kobj);
+		btrfs_sysfs_remove_fsid(fs_devs);
 		return -ENOMEM;
 	}
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/4] btrfs: sysfs, add UUID/devinfo kobject
  2019-12-05 11:27 [PATCH 0/4] btrfs, sysfs cleanup and add dev_state Anand Jain
  2019-12-05 11:27 ` [PATCH 1/4] btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid Anand Jain
@ 2019-12-05 11:27 ` Anand Jain
  2020-01-14 18:32   ` David Sterba
  2019-12-05 11:27 ` [PATCH v2 3/4] btrfs: sysfs, rename device_link add,remove functions Anand Jain
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 33+ messages in thread
From: Anand Jain @ 2019-12-05 11:27 UTC (permalink / raw)
  To: linux-btrfs

Preparatory patch creates kobject /sys/fs/btrfs/UUID/devinfo to hold
btrfs_fs_devices::dev_state attribute.

This is being added in the mount context, that means we don't see
UUID/devinfo before the mount (yet).

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c   | 15 +++++++++++++++
 fs/btrfs/volumes.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 4cda410f0e6b..af169970e818 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -734,6 +734,12 @@ static int addrm_unknown_feature_attrs(struct btrfs_fs_info *fs_info, bool add)
 
 static void __btrfs_sysfs_remove_fsid(struct btrfs_fs_devices *fs_devs)
 {
+	if (fs_devs->devinfo_kobj) {
+		kobject_del(fs_devs->devinfo_kobj);
+		kobject_put(fs_devs->devinfo_kobj);
+		fs_devs->devinfo_kobj = NULL;
+	}
+
 	if (fs_devs->devices_kobj) {
 		kobject_del(fs_devs->devices_kobj);
 		kobject_put(fs_devs->devices_kobj);
@@ -1080,6 +1086,15 @@ int btrfs_sysfs_add_fsid(struct btrfs_fs_devices *fs_devs)
 		return -ENOMEM;
 	}
 
+	fs_devs->devinfo_kobj = kobject_create_and_add("devinfo",
+						       &fs_devs->fsid_kobj);
+	if (!fs_devs->devinfo_kobj) {
+		btrfs_err(fs_devs->fs_info,
+			  "failed to init sysfs devinfo kobject");
+		btrfs_sysfs_remove_fsid(fs_devs);
+		return -ENOMEM;
+	}
+
 	return 0;
 }
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 3c56ef571b00..38f2e8437b68 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -256,6 +256,7 @@ struct btrfs_fs_devices {
 	/* sysfs kobjects */
 	struct kobject fsid_kobj;
 	struct kobject *devices_kobj;
+	struct kobject *devinfo_kobj;
 	struct completion kobj_unregister;
 };
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 3/4] btrfs: sysfs, rename device_link add,remove functions
  2019-12-05 11:27 [PATCH 0/4] btrfs, sysfs cleanup and add dev_state Anand Jain
  2019-12-05 11:27 ` [PATCH 1/4] btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid Anand Jain
  2019-12-05 11:27 ` [PATCH 2/4] btrfs: sysfs, add UUID/devinfo kobject Anand Jain
@ 2019-12-05 11:27 ` Anand Jain
  2019-12-05 11:27 ` [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute Anand Jain
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 33+ messages in thread
From: Anand Jain @ 2019-12-05 11:27 UTC (permalink / raw)
  To: linux-btrfs

In preparation to add btrfs_device::dev_state attribute in
  /sys/fs/btrfs/UUID/devinfo

Rename btrfs_sysfs_add_device_link() and btrfs_sysfs_rm_device_link() to
btrfs_sysfs_add_devices_attr() and btrfs_sysfs_remove_devices_attr() as
these functions is going to create dev_state attribute rather than just
the link to the disk. No functional changes.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v2: Update change log. Change ..UUID/devices to ..UUID/devinfo

 fs/btrfs/dev-replace.c |  4 ++--
 fs/btrfs/sysfs.c       | 12 ++++++------
 fs/btrfs/sysfs.h       |  4 ++--
 fs/btrfs/volumes.c     |  8 ++++----
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index f639dde2a679..9a29d6de9017 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -472,7 +472,7 @@ static int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info,
 	atomic64_set(&dev_replace->num_uncorrectable_read_errors, 0);
 	up_write(&dev_replace->rwsem);
 
-	ret = btrfs_sysfs_add_device_link(tgt_device->fs_devices, tgt_device);
+	ret = btrfs_sysfs_add_devices_attr(tgt_device->fs_devices, tgt_device);
 	if (ret)
 		btrfs_err(fs_info, "kobj add dev failed %d", ret);
 
@@ -706,7 +706,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 
 	/* replace the sysfs entry */
-	btrfs_sysfs_rm_device_link(fs_info->fs_devices, src_device);
+	btrfs_sysfs_remove_devices_attr(fs_info->fs_devices, src_device);
 	btrfs_rm_dev_replace_free_srcdev(src_device);
 
 	/* write back the superblocks */
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index af169970e818..834f712ed60c 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -780,7 +780,7 @@ void btrfs_sysfs_remove_mounted(struct btrfs_fs_info *fs_info)
 	addrm_unknown_feature_attrs(fs_info, false);
 	sysfs_remove_group(&fs_info->fs_devices->fsid_kobj, &btrfs_feature_attr_group);
 	sysfs_remove_files(&fs_info->fs_devices->fsid_kobj, btrfs_attrs);
-	btrfs_sysfs_rm_device_link(fs_info->fs_devices, NULL);
+	btrfs_sysfs_remove_devices_attr(fs_info->fs_devices, NULL);
 }
 
 static const char * const btrfs_feature_set_names[FEAT_MAX] = {
@@ -969,7 +969,7 @@ int btrfs_sysfs_add_space_info_type(struct btrfs_fs_info *fs_info,
 
 /* when one_device is NULL, it removes all device links */
 
-int btrfs_sysfs_rm_device_link(struct btrfs_fs_devices *fs_devices,
+int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
 		struct btrfs_device *one_device)
 {
 	struct hd_struct *disk;
@@ -1001,8 +1001,8 @@ int btrfs_sysfs_rm_device_link(struct btrfs_fs_devices *fs_devices,
 	return 0;
 }
 
-int btrfs_sysfs_add_device_link(struct btrfs_fs_devices *fs_devices,
-				struct btrfs_device *one_device)
+int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
+				 struct btrfs_device *one_device)
 {
 	int error = 0;
 	struct btrfs_device *dev;
@@ -1106,13 +1106,13 @@ int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info)
 
 	btrfs_set_fs_info_ptr(fs_info);
 
-	error = btrfs_sysfs_add_device_link(fs_devs, NULL);
+	error = btrfs_sysfs_add_devices_attr(fs_devs, NULL);
 	if (error)
 		return error;
 
 	error = sysfs_create_files(fsid_kobj, btrfs_attrs);
 	if (error) {
-		btrfs_sysfs_rm_device_link(fs_devs, NULL);
+		btrfs_sysfs_remove_devices_attr(fs_devs, NULL);
 		return error;
 	}
 
diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h
index 3d27b39eaf94..9d97b3c8db4e 100644
--- a/fs/btrfs/sysfs.h
+++ b/fs/btrfs/sysfs.h
@@ -14,9 +14,9 @@ enum btrfs_feature_set {
 
 char *btrfs_printable_features(enum btrfs_feature_set set, u64 flags);
 const char * const btrfs_feature_set_name(enum btrfs_feature_set set);
-int btrfs_sysfs_add_device_link(struct btrfs_fs_devices *fs_devices,
+int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 		struct btrfs_device *one_device);
-int btrfs_sysfs_rm_device_link(struct btrfs_fs_devices *fs_devices,
+int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
                 struct btrfs_device *one_device);
 int btrfs_sysfs_add_fsid(struct btrfs_fs_devices *fs_devs);
 void btrfs_sysfs_remove_fsid(struct btrfs_fs_devices *fs_devs);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c565650639ee..1d80ac164fe3 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2014,7 +2014,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 	if (device->bdev) {
 		cur_devices->open_devices--;
 		/* remove sysfs entry */
-		btrfs_sysfs_rm_device_link(fs_devices, device);
+		btrfs_sysfs_remove_devices_attr(fs_devices, device);
 	}
 
 	num_devices = btrfs_super_num_devices(fs_info->super_copy) - 1;
@@ -2135,7 +2135,7 @@ void btrfs_destroy_dev_replace_tgtdev(struct btrfs_device *tgtdev)
 	WARN_ON(!tgtdev);
 	mutex_lock(&fs_devices->device_list_mutex);
 
-	btrfs_sysfs_rm_device_link(fs_devices, tgtdev);
+	btrfs_sysfs_remove_devices_attr(fs_devices, tgtdev);
 
 	if (tgtdev->bdev)
 		fs_devices->open_devices--;
@@ -2483,7 +2483,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 				    orig_super_num_devices + 1);
 
 	/* add sysfs device entry */
-	btrfs_sysfs_add_device_link(fs_devices, device);
+	btrfs_sysfs_add_devices_attr(fs_devices, device);
 
 	/*
 	 * we've got more storage, clear any full flags on the space
@@ -2551,7 +2551,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	return ret;
 
 error_sysfs:
-	btrfs_sysfs_rm_device_link(fs_devices, device);
+	btrfs_sysfs_remove_devices_attr(fs_devices, device);
 	mutex_lock(&fs_info->fs_devices->device_list_mutex);
 	mutex_lock(&fs_info->chunk_mutex);
 	list_del_rcu(&device->dev_list);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-05 11:27 [PATCH 0/4] btrfs, sysfs cleanup and add dev_state Anand Jain
                   ` (2 preceding siblings ...)
  2019-12-05 11:27 ` [PATCH v2 3/4] btrfs: sysfs, rename device_link add,remove functions Anand Jain
@ 2019-12-05 11:27 ` Anand Jain
  2019-12-05 14:10   ` David Sterba
                     ` (3 more replies)
  2019-12-05 15:00 ` [PATCH 0/4] btrfs, sysfs cleanup and add dev_state David Sterba
                   ` (2 subsequent siblings)
  6 siblings, 4 replies; 33+ messages in thread
From: Anand Jain @ 2019-12-05 11:27 UTC (permalink / raw)
  To: linux-btrfs

A new sysfs RW attribute
  UUID/devinfo/<devid>/dev_state
is added here. The dev_state here reflects the state of the device from
the kernel parameter btrfs_device::dev_state and this attribute is born
after the device is mounted and goes along with the dynamic nature of the
device add and delete. This attribute gets deleted at unmount.

For example:
pwd
 /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo
cat 1/dev_state
 IN_FS_METADATA MISSING
cat 2/dev_state
 WRITABLE IN_FS_METADATA

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c   | 101 ++++++++++++++++++++++++++++++++++-----------
 fs/btrfs/volumes.h |   2 +
 2 files changed, 80 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 834f712ed60c..99c7269e0524 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -978,29 +978,76 @@ int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
 	if (!fs_devices->devices_kobj)
 		return -EINVAL;
 
-	if (one_device && one_device->bdev) {
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	if (one_device) {
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
-	}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
 
-	if (one_device)
 		return 0;
+	}
 
-	list_for_each_entry(one_device,
-			&fs_devices->devices, dev_list) {
-		if (!one_device->bdev)
-			continue;
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	list_for_each_entry(one_device, &fs_devices->devices, dev_list) {
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
 	}
 
 	return 0;
 }
 
+static void btrfs_dev_state_to_str(struct btrfs_device *device, char *dev_state_str, size_t n)
+{
+	int state;
+	const char *btrfs_dev_states[] = { "WRITEABLE", "IN_FS_METADATA",
+					   "MISSING", "REPLACE_TGT", "FLUSHING"
+					 };
+
+	n = n - strlen(dev_state_str) - 1;
+
+	for (state = 0; state < ARRAY_SIZE(btrfs_dev_states); state++) {
+		if (test_bit(state, &device->dev_state)) {
+			if (strlen(dev_state_str))
+				strncat(dev_state_str, " ", n);
+			strncat(dev_state_str, btrfs_dev_states[state], n);
+		}
+	}
+	strncat(dev_state_str, "\n", n);
+}
+
+static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
+					  struct kobj_attribute *a, char *buf)
+{
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
+	return strlen(buf);
+}
+
+BTRFS_ATTR(, dev_state, btrfs_sysfs_dev_state_show);
+
+static struct attribute *devinfo_attrs[] = {
+	BTRFS_ATTR_PTR(, dev_state),
+	NULL
+};
+
+ATTRIBUTE_GROUPS(devinfo);
+
+static struct kobj_type devinfo_ktype = {
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_groups = devinfo_groups,
+};
+
 int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 				 struct btrfs_device *one_device)
 {
@@ -1008,22 +1055,30 @@ int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 	struct btrfs_device *dev;
 
 	list_for_each_entry(dev, &fs_devices->devices, dev_list) {
-		struct hd_struct *disk;
-		struct kobject *disk_kobj;
-
-		if (!dev->bdev)
-			continue;
 
 		if (one_device && one_device != dev)
 			continue;
 
-		disk = dev->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+		if (dev->bdev) {
+			struct hd_struct *disk;
+			struct kobject *disk_kobj;
+
+			disk = dev->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
 
-		error = sysfs_create_link(fs_devices->devices_kobj,
-					  disk_kobj, disk_kobj->name);
-		if (error)
+			error = sysfs_create_link(fs_devices->devices_kobj,
+						  disk_kobj, disk_kobj->name);
+			if (error)
+				break;
+		}
+
+		error = kobject_init_and_add(&dev->devid_kobj, &devinfo_ktype,
+					     fs_devices->devinfo_kobj, "%llu",
+					     dev->devid);
+		if (error) {
+			kobject_put(&dev->devid_kobj);
 			break;
+		}
 	}
 
 	return error;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 38f2e8437b68..68021d1ee216 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -138,6 +138,8 @@ struct btrfs_device {
 	atomic_t dev_stat_values[BTRFS_DEV_STAT_VALUES_MAX];
 
 	struct extent_io_tree alloc_state;
+
+	struct kobject devid_kobj;
 };
 
 /*
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-05 11:27 ` [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute Anand Jain
@ 2019-12-05 14:10   ` David Sterba
  2019-12-05 14:38     ` Anand Jain
  2019-12-05 14:21   ` David Sterba
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 33+ messages in thread
From: David Sterba @ 2019-12-05 14:10 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Thu, Dec 05, 2019 at 07:27:06PM +0800, Anand Jain wrote:
> +static void btrfs_dev_state_to_str(struct btrfs_device *device, char *dev_state_str, size_t n)
> +{
> +	int state;
> +	const char *btrfs_dev_states[] = { "WRITEABLE", "IN_FS_METADATA",
> +					   "MISSING", "REPLACE_TGT", "FLUSHING"
> +					 };
> +
> +	n = n - strlen(dev_state_str) - 1;
> +
> +	for (state = 0; state < ARRAY_SIZE(btrfs_dev_states); state++) {
> +		if (test_bit(state, &device->dev_state)) {
> +			if (strlen(dev_state_str))
> +				strncat(dev_state_str, " ", n);
> +			strncat(dev_state_str, btrfs_dev_states[state], n);
> +		}
> +	}
> +	strncat(dev_state_str, "\n", n);

Please use the snprintf way as in supported_checksums_show and not the
str* functions.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-05 11:27 ` [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute Anand Jain
  2019-12-05 14:10   ` David Sterba
@ 2019-12-05 14:21   ` David Sterba
  2019-12-05 14:38     ` Anand Jain
  2019-12-09 14:06   ` [PATCH v2 " Anand Jain
  2020-01-06 11:38   ` [PATCH v4] " Anand Jain
  3 siblings, 1 reply; 33+ messages in thread
From: David Sterba @ 2019-12-05 14:21 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Thu, Dec 05, 2019 at 07:27:06PM +0800, Anand Jain wrote:
> A new sysfs RW attribute

Why RW? Most attributes/files in sysfs are supposed to be read-only, but
you actually define the attribute as read-only with the macro
BTRFS_ATTR.

>   UUID/devinfo/<devid>/dev_state
> is added here. The dev_state here reflects the state of the device from
> the kernel parameter btrfs_device::dev_state and this attribute is born
> after the device is mounted and goes along with the dynamic nature of the
> device add and delete. This attribute gets deleted at unmount.
> 
> For example:
> pwd
>  /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo
> cat 1/dev_state
>  IN_FS_METADATA MISSING
> cat 2/dev_state
>  WRITABLE IN_FS_METADATA

So the values copy the device state macros, that's probably ok.

> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
> +					  struct kobj_attribute *a, char *buf)
> +{
> +	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
> +						   devid_kobj);
> +
> +	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);

The device access is unprotected, you need at least RCU but that still
does not prevent from the device being freed by deletion. The
device_list_mutex is quite heavy and allowing a DoS by repeatedly
reading the file contents is not something we want to allow.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-05 14:21   ` David Sterba
@ 2019-12-05 14:38     ` Anand Jain
  2019-12-05 15:14       ` David Sterba
  0 siblings, 1 reply; 33+ messages in thread
From: Anand Jain @ 2019-12-05 14:38 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 12/5/19 10:21 PM, David Sterba wrote:
> On Thu, Dec 05, 2019 at 07:27:06PM +0800, Anand Jain wrote:
>> A new sysfs RW attribute
> 
> Why RW? Most attributes/files in sysfs are supposed to be read-only, but
> you actually define the attribute as read-only with the macro
> BTRFS_ATTR.
> 

  Oh. Change log has to be fixed here.

  But I am also working on a new patch to make this RW, to set the
  new read_preferred state on the device.

>>    UUID/devinfo/<devid>/dev_state
>> is added here. The dev_state here reflects the state of the device from
>> the kernel parameter btrfs_device::dev_state and this attribute is born
>> after the device is mounted and goes along with the dynamic nature of the
>> device add and delete. This attribute gets deleted at unmount.
>>
>> For example:
>> pwd
>>   /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo
>> cat 1/dev_state
>>   IN_FS_METADATA MISSING
>> cat 2/dev_state
>>   WRITABLE IN_FS_METADATA
> 
> So the values copy the device state macros, that's probably ok.
> 
  Yep.

>> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
>> +					  struct kobj_attribute *a, char *buf)
>> +{
>> +	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
>> +						   devid_kobj);
>> +
>> +	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
> 
> The device access is unprotected, you need at least RCU but that still
> does not prevent from the device being freed by deletion.

  We need RCU let me fix. Device being deleted is fine, there
  is nothing to lose, another directory lookup will show that
  UUID/devinfo/<devid> is gone is the device is deleted.

> The
> device_list_mutex is quite heavy and allowing a DoS by repeatedly
> reading the file contents is not something we want to allow.
> 

   Yes we don't have to use device_list_mutex here, as its read,
   a refresh/re-read will refresh the dev_state.

Thanks, Anand



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-05 14:10   ` David Sterba
@ 2019-12-05 14:38     ` Anand Jain
  0 siblings, 0 replies; 33+ messages in thread
From: Anand Jain @ 2019-12-05 14:38 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 12/5/19 10:10 PM, David Sterba wrote:
> On Thu, Dec 05, 2019 at 07:27:06PM +0800, Anand Jain wrote:
>> +static void btrfs_dev_state_to_str(struct btrfs_device *device, char *dev_state_str, size_t n)
>> +{
>> +	int state;
>> +	const char *btrfs_dev_states[] = { "WRITEABLE", "IN_FS_METADATA",
>> +					   "MISSING", "REPLACE_TGT", "FLUSHING"
>> +					 };
>> +
>> +	n = n - strlen(dev_state_str) - 1;
>> +
>> +	for (state = 0; state < ARRAY_SIZE(btrfs_dev_states); state++) {
>> +		if (test_bit(state, &device->dev_state)) {
>> +			if (strlen(dev_state_str))
>> +				strncat(dev_state_str, " ", n);
>> +			strncat(dev_state_str, btrfs_dev_states[state], n);
>> +		}
>> +	}
>> +	strncat(dev_state_str, "\n", n);
> 
> Please use the snprintf way as in supported_checksums_show and not the
> str* functions.
> 

Ok. Will fix.

Thanks, Anand

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/4] btrfs, sysfs cleanup and add dev_state
  2019-12-05 11:27 [PATCH 0/4] btrfs, sysfs cleanup and add dev_state Anand Jain
                   ` (3 preceding siblings ...)
  2019-12-05 11:27 ` [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute Anand Jain
@ 2019-12-05 15:00 ` David Sterba
  2019-12-06 13:46   ` Anand Jain
  2019-12-09 22:48 ` Anand Jain
  2020-01-14 18:39 ` David Sterba
  6 siblings, 1 reply; 33+ messages in thread
From: David Sterba @ 2019-12-05 15:00 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Thu, Dec 05, 2019 at 07:27:02PM +0800, Anand Jain wrote:
> Anand Jain (4):
>   btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid
>   btrfs: sysfs, add UUID/devinfo kobject
>   btrfs: sysfs, rename device_link add,remove functions
>   btrfs: sysfs, add devid/dev_state kobject and attribute

So we can start adding things to devinfo, I did a quick test how the
sysfs directory looks like, the base structure seems ok.

Unlike other sources for sysfs file data (like superblock), the devices
can appear and disappear during the lifetime of the filesystem and as
pointed out in the patches, some synchronization is needed.

But it could be more tricky. Reading from the sysfs files should not
block normal operation (no device_list_mutex) but also must not lead to
use-after-free in case the device gets deleted.

I haven't found a simple locking scheme that would avoid accessing a
freed device structure, the sysfs callback can happen at any time and
the structure can be freed already.

So this means that btrfs_sysfs_dev_state_show cannot access it directly
(using offsetof(kobj, ...)). The safe (but not necessarily the best) way
I have so far is to track the device kobjects in the superblock and add
own lock for accessing this structure.

This avoids increasing contention of device_list_mutex, each sysfs
callback needs to take the lock first, lookup the device and print the
value if it's found. Otherwise we know the device is gone.

The lock is rwlock_t, sysfs callbacks take read-side, device deletion
takes write possibly outside of the device_list_mutex before the device
is actually going to be deleted. This relies on fairness of the lock so
the write will happen eventually (even if there are many readers).

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-05 14:38     ` Anand Jain
@ 2019-12-05 15:14       ` David Sterba
  2019-12-06 13:49         ` Anand Jain
  0 siblings, 1 reply; 33+ messages in thread
From: David Sterba @ 2019-12-05 15:14 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs

On Thu, Dec 05, 2019 at 10:38:15PM +0800, Anand Jain wrote:
> > So the values copy the device state macros, that's probably ok.
>   Yep.

Although, sysfs files should print one value per file, which makes sense
in many cases, so eg. missing should exist separately too for quick
checks for the most common device states. The dev_state reflects the
internal state and is likely useful only for debugging.

> >> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
> >> +					  struct kobj_attribute *a, char *buf)
> >> +{
> >> +	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
> >> +						   devid_kobj);
> >> +
> >> +	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
> > 
> > The device access is unprotected, you need at least RCU but that still
> > does not prevent from the device being freed by deletion.
> 
>   We need RCU let me fix. Device being deleted is fine, there
>   is nothing to lose, another directory lookup will show that
>   UUID/devinfo/<devid> is gone is the device is deleted.

The device can be gone from the list but the sysfs files can still
exist.

    CPU1                                   CPU2

btrfs_rm_device
                                        open file
  btrfs_sysfs_rm_device_link
    btrfs_free_device
      kfree(device)
                                        call read, access freed device

> > The
> > device_list_mutex is quite heavy and allowing a DoS by repeatedly
> > reading the file contents is not something we want to allow.
> > 
> 
>    Yes we don't have to use device_list_mutex here, as its read,
>    a refresh/re-read will refresh the dev_state.

The point is not to synchronize the device state values but the device
object itself.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/4] btrfs, sysfs cleanup and add dev_state
  2019-12-05 15:00 ` [PATCH 0/4] btrfs, sysfs cleanup and add dev_state David Sterba
@ 2019-12-06 13:46   ` Anand Jain
  2019-12-09 14:09     ` Anand Jain
  0 siblings, 1 reply; 33+ messages in thread
From: Anand Jain @ 2019-12-06 13:46 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 5/12/19 11:00 PM, David Sterba wrote:
> On Thu, Dec 05, 2019 at 07:27:02PM +0800, Anand Jain wrote:
>> Anand Jain (4):
>>    btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid
>>    btrfs: sysfs, add UUID/devinfo kobject
>>    btrfs: sysfs, rename device_link add,remove functions
>>    btrfs: sysfs, add devid/dev_state kobject and attribute
> 
> So we can start adding things to devinfo, I did a quick test how the
> sysfs directory looks like, the base structure seems ok.
> 
> Unlike other sources for sysfs file data (like superblock), the devices
> can appear and disappear during the lifetime of the filesystem and as
> pointed out in the patches, some synchronization is needed.
> 
> But it could be more tricky. Reading from the sysfs files should not
> block normal operation (no device_list_mutex) but also must not lead to
> use-after-free in case the device gets deleted.
> 
> I haven't found a simple locking scheme that would avoid accessing a
> freed device structure, the sysfs callback can happen at any time and
> the structure can be freed already.
> 
> So this means that btrfs_sysfs_dev_state_show cannot access it directly
> (using offsetof(kobj, ...)). The safe (but not necessarily the best) way
> I have so far is to track the device kobjects in the superblock and add
> own lock for accessing this structure.
> 
> This avoids increasing contention of device_list_mutex, each sysfs
> callback needs to take the lock first, lookup the device and print the
> value if it's found. Otherwise we know the device is gone.



> The lock is rwlock_t, sysfs callbacks take read-side, device deletion
> takes write possibly outside of the device_list_mutex before the device
> is actually going to be deleted. This relies on fairness of the lock so
> the write will happen eventually (even if there are many readers).
> 

  Yeah this makes sense to me. I completely forgot about the %device
  getting deleted while sysfs is reading. Let me fix in the patch 4/4.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-05 15:14       ` David Sterba
@ 2019-12-06 13:49         ` Anand Jain
  2019-12-09 14:05           ` Anand Jain
  0 siblings, 1 reply; 33+ messages in thread
From: Anand Jain @ 2019-12-06 13:49 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 5/12/19 11:14 PM, David Sterba wrote:
> On Thu, Dec 05, 2019 at 10:38:15PM +0800, Anand Jain wrote:
>>> So the values copy the device state macros, that's probably ok.
>>    Yep.
> 
> Although, sysfs files should print one value per file, which makes sense
> in many cases, so eg. missing should exist separately too for quick
> checks for the most common device states. The dev_state reflects the
> internal state and is likely useful only for debugging.
> 

  I agree. Its better to create an individual attribute for each of the
  device states. For instance..

    under the 'UUID/devinfo/<devid>' kobject
    attributes will be:
      missing
      in_fs_metadata
      replace_target

   cat missing
    0
   cat in_fs_metadata
    1

   ..etc

  which seems to be more or less standard for block devices.

  Will fix it in v2.

>>>> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
>>>> +					  struct kobj_attribute *a, char *buf)
>>>> +{
>>>> +	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
>>>> +						   devid_kobj);
>>>> +
>>>> +	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
>>>
>>> The device access is unprotected, you need at least RCU but that still
>>> does not prevent from the device being freed by deletion.
>>
>>    We need RCU let me fix. Device being deleted is fine, there
>>    is nothing to lose, another directory lookup will show that
>>    UUID/devinfo/<devid> is gone is the device is deleted.
> 
> The device can be gone from the list but the sysfs files can still
> exist.
> 
>      CPU1                                   CPU2
> 
> btrfs_rm_device
>                                          open file
>    btrfs_sysfs_rm_device_link
>      btrfs_free_device
>        kfree(device)
>                                          call read, access freed device
> 

   I completely missed the sysfs synchronization with device delete.
   As in the other email discussion, I think a new rwlock shall suffice.
   And as its lock is only between device delete and sysfs so it shall
   be light weight without affecting the other device_list_mutex holders.

Thanks,
Anand

>>> The
>>> device_list_mutex is quite heavy and allowing a DoS by repeatedly
>>> reading the file contents is not something we want to allow.
>>>
>>
>>     Yes we don't have to use device_list_mutex here, as its read,
>>     a refresh/re-read will refresh the dev_state.
> 
> The point is not to synchronize the device state values but the device
> object itself.




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-06 13:49         ` Anand Jain
@ 2019-12-09 14:05           ` Anand Jain
  2019-12-09 18:31             ` David Sterba
  2019-12-13 16:43             ` David Sterba
  0 siblings, 2 replies; 33+ messages in thread
From: Anand Jain @ 2019-12-09 14:05 UTC (permalink / raw)
  To: dsterba, linux-btrfs

On 12/6/19 9:49 PM, Anand Jain wrote:
> 
> 
> On 5/12/19 11:14 PM, David Sterba wrote:
>> On Thu, Dec 05, 2019 at 10:38:15PM +0800, Anand Jain wrote:
>>>> So the values copy the device state macros, that's probably ok.
>>>    Yep.
>>
>> Although, sysfs files should print one value per file, which makes sense
>> in many cases, so eg. missing should exist separately too for quick
>> checks for the most common device states. The dev_state reflects the
>> internal state and is likely useful only for debugging.
>>
> 
>   I agree. Its better to create an individual attribute for each of the
>   device states. For instance..
> 
>     under the 'UUID/devinfo/<devid>' kobject
>     attributes will be:
>       missing
>       in_fs_metadata
>       replace_target
> 
>    cat missing
>     0
>    cat in_fs_metadata
>     1
> 
>    ..etc
> 
>   which seems to be more or less standard for block devices.
> 
>   Will fix it in v2.

This is fixed in v2.


> 
>>>>> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
>>>>> +                      struct kobj_attribute *a, char *buf)
>>>>> +{
>>>>> +    struct btrfs_device *device = container_of(kobj, struct 
>>>>> btrfs_device,
>>>>> +                           devid_kobj);
>>>>> +
>>>>> +    btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
>>>>
>>>> The device access is unprotected, you need at least RCU but that still
>>>> does not prevent from the device being freed by deletion.
>>>
>>>    We need RCU let me fix. Device being deleted is fine, there
>>>    is nothing to lose, another directory lookup will show that
>>>    UUID/devinfo/<devid> is gone is the device is deleted.
>>
>> The device can be gone from the list but the sysfs files can still
>> exist.
>>
>>      CPU1                                   CPU2
>>
>> btrfs_rm_device
>>                                          open file
>>    btrfs_sysfs_rm_device_link
>>      btrfs_free_device
>>        kfree(device)
>>                                          call read, access freed device
>>
> 
>    I completely missed the sysfs synchronization with device delete.
>    As in the other email discussion, I think a new rwlock shall suffice.
>    And as its lock is only between device delete and sysfs so it shall
>    be light weight without affecting the other device_list_mutex holders.

  Looked into this further, actually we don't need any lock here
  the device delete thread which calls kobject_put() makes sure
  sysfs read is closed. So an existing sysfs read thread will have
  to complete before device free.


       CPU1                                   CPU2

  btrfs_rm_device
                                           open file
     btrfs_sysfs_rm_device_link
                                           call read, access freed device
       sysfs waits for the open file
       to close.
       btrfs_free_device
         kfree(device)


Thanks
Anand


> Thanks,
> Anand
> 
>>>> The
>>>> device_list_mutex is quite heavy and allowing a DoS by repeatedly
>>>> reading the file contents is not something we want to allow.
>>>>
>>>
>>>     Yes we don't have to use device_list_mutex here, as its read,
>>>     a refresh/re-read will refresh the dev_state.
>>
>> The point is not to synchronize the device state values but the device
>> object itself.
> 
> 
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-05 11:27 ` [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute Anand Jain
  2019-12-05 14:10   ` David Sterba
  2019-12-05 14:21   ` David Sterba
@ 2019-12-09 14:06   ` Anand Jain
  2019-12-19 10:41     ` [PATCH v3 " Anand Jain
  2020-01-06 11:38   ` [PATCH v4] " Anand Jain
  3 siblings, 1 reply; 33+ messages in thread
From: Anand Jain @ 2019-12-09 14:06 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba

New sysfs attributes
  in_fs_metadata  missing  replace_target  writeable
are added under a new kobject
  UUID/devinfo/<devid>

These attributes reflects the state of the device from the kernel
fed by %btrfs_device::dev_state.
These attributes are born during mount and goes along with the dynamic
nature of the device add and delete, otherwise these attribute and kobject
gets deleted at unmount.

Sample output:
pwd
 /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo/1/
ls
  in_fs_metadata  missing  replace_target  writeable
cat missing
  0

The output from these attributes are 0 or 1. 0 indicates unset and 1
indicates set.

As of now these attributes are readonly.

It is observed that the device delete thread and sysfs read thread will
not race because the delete thread calls sysfs kobject_put() which in turn
waits for existing sysfs read to complete.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
V2:
  Make the devinfo attribute to carry one parameter, so now
  instead of dev_state attribute, we create in_fs_metadata,
  writeable, missing and replace_target attributes.

 fs/btrfs/sysfs.c   | 127 +++++++++++++++++++++++++++++++++++++++++++----------
 fs/btrfs/volumes.h |   2 +
 2 files changed, 106 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 834f712ed60c..bace9cda433a 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -978,29 +978,102 @@ int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
 	if (!fs_devices->devices_kobj)
 		return -EINVAL;
 
-	if (one_device && one_device->bdev) {
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	if (one_device) {
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
-	}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
 
-	if (one_device)
 		return 0;
+	}
 
-	list_for_each_entry(one_device,
-			&fs_devices->devices, dev_list) {
-		if (!one_device->bdev)
-			continue;
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	list_for_each_entry(one_device, &fs_devices->devices, dev_list) {
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
 	}
 
 	return 0;
 }
 
+static ssize_t btrfs_sysfs_writeable_show(struct kobject *kobj,
+					  struct kobj_attribute *a, char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(, writeable, btrfs_sysfs_writeable_show);
+
+static ssize_t btrfs_sysfs_in_fs_metadata_show(struct kobject *kobj,
+					       struct kobj_attribute *a,
+					       char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+		       &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(, in_fs_metadata, btrfs_sysfs_in_fs_metadata_show);
+
+static ssize_t btrfs_sysfs_missing_show(struct kobject *kobj,
+					struct kobj_attribute *a, char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(, missing, btrfs_sysfs_missing_show);
+
+static ssize_t btrfs_sysfs_replace_target_show(struct kobject *kobj,
+					       struct kobj_attribute *a,
+					       char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(, replace_target, btrfs_sysfs_replace_target_show);
+
+static struct attribute *devinfo_attrs[] = {
+	BTRFS_ATTR_PTR(, writeable),
+	BTRFS_ATTR_PTR(, in_fs_metadata),
+	BTRFS_ATTR_PTR(, missing),
+	BTRFS_ATTR_PTR(, replace_target),
+	NULL
+};
+ATTRIBUTE_GROUPS(devinfo);
+
+static struct kobj_type devinfo_ktype = {
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_groups = devinfo_groups,
+};
+
 int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 				 struct btrfs_device *one_device)
 {
@@ -1008,22 +1081,30 @@ int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 	struct btrfs_device *dev;
 
 	list_for_each_entry(dev, &fs_devices->devices, dev_list) {
-		struct hd_struct *disk;
-		struct kobject *disk_kobj;
-
-		if (!dev->bdev)
-			continue;
 
 		if (one_device && one_device != dev)
 			continue;
 
-		disk = dev->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+		if (dev->bdev) {
+			struct hd_struct *disk;
+			struct kobject *disk_kobj;
+
+			disk = dev->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
 
-		error = sysfs_create_link(fs_devices->devices_kobj,
-					  disk_kobj, disk_kobj->name);
-		if (error)
+			error = sysfs_create_link(fs_devices->devices_kobj,
+						  disk_kobj, disk_kobj->name);
+			if (error)
+				break;
+		}
+
+		error = kobject_init_and_add(&dev->devid_kobj, &devinfo_ktype,
+					     fs_devices->devinfo_kobj, "%llu",
+					     dev->devid);
+		if (error) {
+			kobject_put(&dev->devid_kobj);
 			break;
+		}
 	}
 
 	return error;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 38f2e8437b68..68021d1ee216 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -138,6 +138,8 @@ struct btrfs_device {
 	atomic_t dev_stat_values[BTRFS_DEV_STAT_VALUES_MAX];
 
 	struct extent_io_tree alloc_state;
+
+	struct kobject devid_kobj;
 };
 
 /*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/4] btrfs, sysfs cleanup and add dev_state
  2019-12-06 13:46   ` Anand Jain
@ 2019-12-09 14:09     ` Anand Jain
  0 siblings, 0 replies; 33+ messages in thread
From: Anand Jain @ 2019-12-09 14:09 UTC (permalink / raw)
  To: dsterba, linux-btrfs

On 12/6/19 9:46 PM, Anand Jain wrote:
> 
> 
> On 5/12/19 11:00 PM, David Sterba wrote:
>> On Thu, Dec 05, 2019 at 07:27:02PM +0800, Anand Jain wrote:
>>> Anand Jain (4):
>>>    btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid
>>>    btrfs: sysfs, add UUID/devinfo kobject
>>>    btrfs: sysfs, rename device_link add,remove functions
>>>    btrfs: sysfs, add devid/dev_state kobject and attribute
>>
>> So we can start adding things to devinfo, I did a quick test how the
>> sysfs directory looks like, the base structure seems ok.
>>
>> Unlike other sources for sysfs file data (like superblock), the devices
>> can appear and disappear during the lifetime of the filesystem and as
>> pointed out in the patches, some synchronization is needed.
>>
>> But it could be more tricky. Reading from the sysfs files should not
>> block normal operation (no device_list_mutex) but also must not lead to
>> use-after-free in case the device gets deleted.
>>
>> I haven't found a simple locking scheme that would avoid accessing a
>> freed device structure, the sysfs callback can happen at any time and
>> the structure can be freed already.
>>
>> So this means that btrfs_sysfs_dev_state_show cannot access it directly
>> (using offsetof(kobj, ...)). The safe (but not necessarily the best) way
>> I have so far is to track the device kobjects in the superblock and add
>> own lock for accessing this structure.
>>
>> This avoids increasing contention of device_list_mutex, each sysfs
>> callback needs to take the lock first, lookup the device and print the
>> value if it's found. Otherwise we know the device is gone.
> 
> 
> 
>> The lock is rwlock_t, sysfs callbacks take read-side, device deletion
>> takes write possibly outside of the device_list_mutex before the device
>> is actually going to be deleted. This relies on fairness of the lock so
>> the write will happen eventually (even if there are many readers).
>>
> 
>   Yeah this makes sense to me. I completely forgot about the %device
>   getting deleted while sysfs is reading. Let me fix in the patch 4/4.
> 

  Ah. No we don't need the lock. Sysfs kobject delete/put called by the
  delete thread waits for the sysfs read open to close. And after this
  operation we are freeing the %device. So its synchronized there is no
  race.

Thanks, Anand


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-09 14:05           ` Anand Jain
@ 2019-12-09 18:31             ` David Sterba
  2019-12-13 16:43             ` David Sterba
  1 sibling, 0 replies; 33+ messages in thread
From: David Sterba @ 2019-12-09 18:31 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs

On Mon, Dec 09, 2019 at 10:05:39PM +0800, Anand Jain wrote:
>   Looked into this further, actually we don't need any lock here
>   the device delete thread which calls kobject_put() makes sure
>   sysfs read is closed. So an existing sysfs read thread will have
>   to complete before device free.
> 
> 
>        CPU1                                   CPU2
> 
>   btrfs_rm_device
>                                            open file
>      btrfs_sysfs_rm_device_link
>                                            call read, access freed device
>        sysfs waits for the open file
>        to close.
>        btrfs_free_device
>          kfree(device)

Ok, as long as this holds and sysfs file is removed before the device is
freed, the locking is not necessary.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/4] btrfs, sysfs cleanup and add dev_state
  2019-12-05 11:27 [PATCH 0/4] btrfs, sysfs cleanup and add dev_state Anand Jain
                   ` (4 preceding siblings ...)
  2019-12-05 15:00 ` [PATCH 0/4] btrfs, sysfs cleanup and add dev_state David Sterba
@ 2019-12-09 22:48 ` Anand Jain
  2019-12-10 16:41   ` David Sterba
  2020-01-14 18:39 ` David Sterba
  6 siblings, 1 reply; 33+ messages in thread
From: Anand Jain @ 2019-12-09 22:48 UTC (permalink / raw)
  To: David Sterba; +Cc: linux-btrfs

On 12/5/19 7:27 PM, Anand Jain wrote:
> Patch 1/4 is a cleanup patch.
> Patch 2/4 adds the kobject devinfo which is a preparatory to patch 4/4.
> Patch 3/4 was sent before, no functional changes, change log is updated.
> Patch 4/4 creates the attribute dev_state.
> 
> Anand Jain (4):
>    btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid
>    btrfs: sysfs, add UUID/devinfo kobject
>    btrfs: sysfs, rename device_link add,remove functions
>    btrfs: sysfs, add devid/dev_state kobject and attribute

David,

  Is there a chance that partly/fully this could get into 5.5-rc3?
  1,3 are cleanups and 2,4 are feature add.

Thanks, Anand

>   fs/btrfs/dev-replace.c |   4 +-
>   fs/btrfs/sysfs.c       | 130 +++++++++++++++++++++++++++++++----------
>   fs/btrfs/sysfs.h       |   4 +-
>   fs/btrfs/volumes.c     |   8 +--
>   fs/btrfs/volumes.h     |   3 +
>   5 files changed, 111 insertions(+), 38 deletions(-)
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/4] btrfs, sysfs cleanup and add dev_state
  2019-12-09 22:48 ` Anand Jain
@ 2019-12-10 16:41   ` David Sterba
  0 siblings, 0 replies; 33+ messages in thread
From: David Sterba @ 2019-12-10 16:41 UTC (permalink / raw)
  To: Anand Jain; +Cc: David Sterba, linux-btrfs

On Tue, Dec 10, 2019 at 06:48:56AM +0800, Anand Jain wrote:
> On 12/5/19 7:27 PM, Anand Jain wrote:
> > Patch 1/4 is a cleanup patch.
> > Patch 2/4 adds the kobject devinfo which is a preparatory to patch 4/4.
> > Patch 3/4 was sent before, no functional changes, change log is updated.
> > Patch 4/4 creates the attribute dev_state.
> > 
> > Anand Jain (4):
> >    btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid
> >    btrfs: sysfs, add UUID/devinfo kobject
> >    btrfs: sysfs, rename device_link add,remove functions
> >    btrfs: sysfs, add devid/dev_state kobject and attribute
> 
>   Is there a chance that partly/fully this could get into 5.5-rc3?
>   1,3 are cleanups and 2,4 are feature add.

... which is exactly why this cannot be added to 5.5-rc3. The merge
target 5.5 is now only for regressions introduced in the new code or
reasonable fixes (eg. for stable trees).

https://www.kernel.org/doc/html/latest/process/2.Process.html#the-big-picture

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-09 14:05           ` Anand Jain
  2019-12-09 18:31             ` David Sterba
@ 2019-12-13 16:43             ` David Sterba
  2019-12-13 17:02               ` David Sterba
  1 sibling, 1 reply; 33+ messages in thread
From: David Sterba @ 2019-12-13 16:43 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs

On Mon, Dec 09, 2019 at 10:05:39PM +0800, Anand Jain wrote:
> On 12/6/19 9:49 PM, Anand Jain wrote:
> > 
> > 
> > On 5/12/19 11:14 PM, David Sterba wrote:
> >> On Thu, Dec 05, 2019 at 10:38:15PM +0800, Anand Jain wrote:
> >>>> So the values copy the device state macros, that's probably ok.
> >>>    Yep.
> >>
> >> Although, sysfs files should print one value per file, which makes sense
> >> in many cases, so eg. missing should exist separately too for quick
> >> checks for the most common device states. The dev_state reflects the
> >> internal state and is likely useful only for debugging.
> >>
> > 
> >   I agree. Its better to create an individual attribute for each of the
> >   device states. For instance..
> > 
> >     under the 'UUID/devinfo/<devid>' kobject
> >     attributes will be:
> >       missing
> >       in_fs_metadata
> >       replace_target
> > 
> >    cat missing
> >     0
> >    cat in_fs_metadata
> >     1
> > 
> >    ..etc
> > 
> >   which seems to be more or less standard for block devices.
> > 
> >   Will fix it in v2.
> 
> This is fixed in v2.
> 
> 
> > 
> >>>>> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
> >>>>> +                      struct kobj_attribute *a, char *buf)
> >>>>> +{
> >>>>> +    struct btrfs_device *device = container_of(kobj, struct 
> >>>>> btrfs_device,
> >>>>> +                           devid_kobj);
> >>>>> +
> >>>>> +    btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
> >>>>
> >>>> The device access is unprotected, you need at least RCU but that still
> >>>> does not prevent from the device being freed by deletion.
> >>>
> >>>    We need RCU let me fix. Device being deleted is fine, there
> >>>    is nothing to lose, another directory lookup will show that
> >>>    UUID/devinfo/<devid> is gone is the device is deleted.
> >>
> >> The device can be gone from the list but the sysfs files can still
> >> exist.
> >>
> >>      CPU1                                   CPU2
> >>
> >> btrfs_rm_device
> >>                                          open file
> >>    btrfs_sysfs_rm_device_link
> >>      btrfs_free_device
> >>        kfree(device)
> >>                                          call read, access freed device
> >>
> > 
> >    I completely missed the sysfs synchronization with device delete.
> >    As in the other email discussion, I think a new rwlock shall suffice.
> >    And as its lock is only between device delete and sysfs so it shall
> >    be light weight without affecting the other device_list_mutex holders.
> 
>   Looked into this further, actually we don't need any lock here
>   the device delete thread which calls kobject_put() makes sure
>   sysfs read is closed. So an existing sysfs read thread will have
>   to complete before device free.
> 
> 
>        CPU1                                   CPU2
> 
>   btrfs_rm_device
>                                            open file
>      btrfs_sysfs_rm_device_link
>                                            call read, access freed device
>        sysfs waits for the open file
>        to close.

How exactly does sysfs wait for the device? Is it eg wait_event checking
number of references? If the file stays open by an evil process is it
going to block the device removal indefinitelly?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-13 16:43             ` David Sterba
@ 2019-12-13 17:02               ` David Sterba
  2019-12-14  0:26                 ` Anand Jain
  0 siblings, 1 reply; 33+ messages in thread
From: David Sterba @ 2019-12-13 17:02 UTC (permalink / raw)
  To: dsterba, Anand Jain, linux-btrfs

On Fri, Dec 13, 2019 at 05:43:32PM +0100, David Sterba wrote:
> >   Looked into this further, actually we don't need any lock here
> >   the device delete thread which calls kobject_put() makes sure
> >   sysfs read is closed. So an existing sysfs read thread will have
> >   to complete before device free.
> > 
> > 
> >        CPU1                                   CPU2
> > 
> >   btrfs_rm_device
> >                                            open file
> >      btrfs_sysfs_rm_device_link
> >                                            call read, access freed device
> >        sysfs waits for the open file
> >        to close.
> 
> How exactly does sysfs wait for the device? Is it eg wait_event checking
> number of references? If the file stays open by an evil process is it
> going to block the device removal indefinitelly?

Yeah, sysfs waits until the file is closed. Eg. umount can be stalled
that way too.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-13 17:02               ` David Sterba
@ 2019-12-14  0:26                 ` Anand Jain
  2020-01-06 16:00                   ` David Sterba
  0 siblings, 1 reply; 33+ messages in thread
From: Anand Jain @ 2019-12-14  0:26 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 14/12/19 1:02 AM, David Sterba wrote:
> On Fri, Dec 13, 2019 at 05:43:32PM +0100, David Sterba wrote:
>>>    Looked into this further, actually we don't need any lock here
>>>    the device delete thread which calls kobject_put() makes sure
>>>    sysfs read is closed. So an existing sysfs read thread will have
>>>    to complete before device free.
>>>
>>>
>>>         CPU1                                   CPU2
>>>
>>>    btrfs_rm_device
>>>                                             open file
>>>       btrfs_sysfs_rm_device_link
>>>                                             call read, access freed device
>>>         sysfs waits for the open file
>>>         to close.
>>
>> How exactly does sysfs wait for the device? Is it eg wait_event checking
>> number of references? If the file stays open by an evil process is it
>> going to block the device removal indefinitelly?
> 
> Yeah, sysfs waits until the file is closed. Eg. umount can be stalled
> that way too.
> 

  And similar to umount, I don't think we should return EBUSY
  for btrfs_rm_device if the device sysfs attribute is opened,
  as sysfs show attributes are non blocking and would be completed
  in the timely manner.

regards, Anand

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v3 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-09 14:06   ` [PATCH v2 " Anand Jain
@ 2019-12-19 10:41     ` Anand Jain
  0 siblings, 0 replies; 33+ messages in thread
From: Anand Jain @ 2019-12-19 10:41 UTC (permalink / raw)
  To: linux-btrfs

New sysfs attributes
  in_fs_metadata  missing  replace_target  writeable
are added under a new kobject
  UUID/devinfo/<devid>

These attributes reflects the state of the device from the kernel
fed by %btrfs_device::dev_state.
These attributes are born during mount and goes along with the dynamic
nature of the device add and delete, otherwise these attribute and kobject
gets deleted at unmount.

Sample output:
pwd
 /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo/1/
ls
  in_fs_metadata  missing  replace_target  writeable
cat missing
  0

The output from these attributes are 0 or 1. 0 indicates unset and 1
indicates set.

As of now these attributes are readonly.

It is observed that the device delete thread and sysfs read thread will
not race because the delete thread calls sysfs kobject_put() which in turn
waits for existing sysfs read to complete.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v3:
  Use optional groupid devid in BTRFS_ATTR(), it was blank in v2.

V2:
  Make the devinfo attribute to carry one parameter, so now
  instead of dev_state attribute, we create in_fs_metadata,
  writeable, missing and replace_target attributes.

 fs/btrfs/sysfs.c   | 127 +++++++++++++++++++++++++++++++++++++++++++----------
 fs/btrfs/volumes.h |   2 +
 2 files changed, 106 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 834f712ed60c..d414b98fb27f 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -978,29 +978,102 @@ int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
 	if (!fs_devices->devices_kobj)
 		return -EINVAL;
 
-	if (one_device && one_device->bdev) {
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	if (one_device) {
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
-	}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
 
-	if (one_device)
 		return 0;
+	}
 
-	list_for_each_entry(one_device,
-			&fs_devices->devices, dev_list) {
-		if (!one_device->bdev)
-			continue;
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	list_for_each_entry(one_device, &fs_devices->devices, dev_list) {
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
 	}
 
 	return 0;
 }
 
+static ssize_t btrfs_sysfs_writeable_show(struct kobject *kobj,
+					  struct kobj_attribute *a, char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(devid, writeable, btrfs_sysfs_writeable_show);
+
+static ssize_t btrfs_sysfs_in_fs_metadata_show(struct kobject *kobj,
+					       struct kobj_attribute *a,
+					       char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+		       &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(devid, in_fs_metadata, btrfs_sysfs_in_fs_metadata_show);
+
+static ssize_t btrfs_sysfs_missing_show(struct kobject *kobj,
+					struct kobj_attribute *a, char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(devid, missing, btrfs_sysfs_missing_show);
+
+static ssize_t btrfs_sysfs_replace_target_show(struct kobject *kobj,
+					       struct kobj_attribute *a,
+					       char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(devid, replace_target, btrfs_sysfs_replace_target_show);
+
+static struct attribute *devid_attrs[] = {
+	BTRFS_ATTR_PTR(devid, writeable),
+	BTRFS_ATTR_PTR(devid, in_fs_metadata),
+	BTRFS_ATTR_PTR(devid, missing),
+	BTRFS_ATTR_PTR(devid, replace_target),
+	NULL
+};
+ATTRIBUTE_GROUPS(devid);
+
+static struct kobj_type devid_ktype = {
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_groups = devid_groups,
+};
+
 int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 				 struct btrfs_device *one_device)
 {
@@ -1008,22 +1081,30 @@ int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 	struct btrfs_device *dev;
 
 	list_for_each_entry(dev, &fs_devices->devices, dev_list) {
-		struct hd_struct *disk;
-		struct kobject *disk_kobj;
-
-		if (!dev->bdev)
-			continue;
 
 		if (one_device && one_device != dev)
 			continue;
 
-		disk = dev->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+		if (dev->bdev) {
+			struct hd_struct *disk;
+			struct kobject *disk_kobj;
+
+			disk = dev->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
 
-		error = sysfs_create_link(fs_devices->devices_kobj,
-					  disk_kobj, disk_kobj->name);
-		if (error)
+			error = sysfs_create_link(fs_devices->devices_kobj,
+						  disk_kobj, disk_kobj->name);
+			if (error)
+				break;
+		}
+
+		error = kobject_init_and_add(&dev->devid_kobj, &devid_ktype,
+					     fs_devices->devinfo_kobj, "%llu",
+					     dev->devid);
+		if (error) {
+			kobject_put(&dev->devid_kobj);
 			break;
+		}
 	}
 
 	return error;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 38f2e8437b68..68021d1ee216 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -138,6 +138,8 @@ struct btrfs_device {
 	atomic_t dev_stat_values[BTRFS_DEV_STAT_VALUES_MAX];
 
 	struct extent_io_tree alloc_state;
+
+	struct kobject devid_kobj;
 };
 
 /*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-05 11:27 ` [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute Anand Jain
                     ` (2 preceding siblings ...)
  2019-12-09 14:06   ` [PATCH v2 " Anand Jain
@ 2020-01-06 11:38   ` Anand Jain
  2020-01-09 15:20     ` [PATCH v4 4/4] " David Sterba
  3 siblings, 1 reply; 33+ messages in thread
From: Anand Jain @ 2020-01-06 11:38 UTC (permalink / raw)
  To: linux-btrfs

New sysfs attributes
  in_fs_metadata  missing  replace_target  writeable
are added under a new kobject
  UUID/devinfo/<devid>

These attributes reflects the state of the device from the kernel
fed by %btrfs_device::dev_state.
These attributes are born during mount and goes along with the dynamic
nature of the device add and delete, otherwise these attribute and kobject
gets deleted at unmount.

Sample output:
pwd
 /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo/1/
ls
  in_fs_metadata  missing  replace_target  writeable
cat missing
  0

The output from these attributes are 0 or 1. 0 indicates unset and 1
indicates set.

As of now these attributes are readonly.

It is observed that the device delete thread and sysfs read thread will
not race because the delete thread calls sysfs kobject_put() which in turn
waits for existing sysfs read to complete.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v4:
   after patch
   [PATCH v5 2/2] btrfs: reset device back to allocation state when removing
   in misc-next, the device::devid_kobj remains stale, fix it by using
   release.

v3:
  Use optional groupid devid in BTRFS_ATTR(), it was blank in v2.

V2:
  Make the devinfo attribute to carry one parameter, so now
  instead of dev_state attribute, we create in_fs_metadata,
  writeable, missing and replace_target attributes.

 fs/btrfs/sysfs.c   | 142 ++++++++++++++++++++++++++++++++++++++++++++---------
 fs/btrfs/volumes.h |   3 ++
 2 files changed, 122 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 834f712ed60c..18dac99188ce 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -978,29 +978,116 @@ int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
 	if (!fs_devices->devices_kobj)
 		return -EINVAL;
 
-	if (one_device && one_device->bdev) {
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	if (one_device) {
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
-	}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
+
+		wait_for_completion(&one_device->kobj_unregister);
 
-	if (one_device)
 		return 0;
+	}
 
-	list_for_each_entry(one_device,
-			&fs_devices->devices, dev_list) {
-		if (!one_device->bdev)
-			continue;
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	list_for_each_entry(one_device, &fs_devices->devices, dev_list) {
+
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		wait_for_completion(&one_device->kobj_unregister);
 	}
 
 	return 0;
 }
 
+static ssize_t btrfs_sysfs_writeable_show(struct kobject *kobj,
+					  struct kobj_attribute *a, char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(devid, writeable, btrfs_sysfs_writeable_show);
+
+static ssize_t btrfs_sysfs_in_fs_metadata_show(struct kobject *kobj,
+					       struct kobj_attribute *a,
+					       char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+		       &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(devid, in_fs_metadata, btrfs_sysfs_in_fs_metadata_show);
+
+static ssize_t btrfs_sysfs_missing_show(struct kobject *kobj,
+					struct kobj_attribute *a, char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(devid, missing, btrfs_sysfs_missing_show);
+
+static ssize_t btrfs_sysfs_replace_target_show(struct kobject *kobj,
+					       struct kobj_attribute *a,
+					       char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state) ? 1 : 0;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+BTRFS_ATTR(devid, replace_target, btrfs_sysfs_replace_target_show);
+
+static struct attribute *devid_attrs[] = {
+	BTRFS_ATTR_PTR(devid, writeable),
+	BTRFS_ATTR_PTR(devid, in_fs_metadata),
+	BTRFS_ATTR_PTR(devid, missing),
+	BTRFS_ATTR_PTR(devid, replace_target),
+	NULL
+};
+ATTRIBUTE_GROUPS(devid);
+
+static void btrfs_release_devid_kobj(struct kobject *kobj)
+{
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	memset(&device->devid_kobj, 0, sizeof(struct kobject));
+	complete(&device->kobj_unregister);
+}
+
+static struct kobj_type devid_ktype = {
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_groups = devid_groups,
+	.release = btrfs_release_devid_kobj,
+};
+
 int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 				 struct btrfs_device *one_device)
 {
@@ -1008,22 +1095,31 @@ int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 	struct btrfs_device *dev;
 
 	list_for_each_entry(dev, &fs_devices->devices, dev_list) {
-		struct hd_struct *disk;
-		struct kobject *disk_kobj;
-
-		if (!dev->bdev)
-			continue;
 
 		if (one_device && one_device != dev)
 			continue;
 
-		disk = dev->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+		if (dev->bdev) {
+			struct hd_struct *disk;
+			struct kobject *disk_kobj;
+
+			disk = dev->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
 
-		error = sysfs_create_link(fs_devices->devices_kobj,
-					  disk_kobj, disk_kobj->name);
-		if (error)
+			error = sysfs_create_link(fs_devices->devices_kobj,
+						  disk_kobj, disk_kobj->name);
+			if (error)
+				break;
+		}
+
+		init_completion(&dev->kobj_unregister);
+		error = kobject_init_and_add(&dev->devid_kobj, &devid_ktype,
+					     fs_devices->devinfo_kobj, "%llu",
+					     dev->devid);
+		if (error) {
+			kobject_put(&dev->devid_kobj);
 			break;
+		}
 	}
 
 	return error;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 38f2e8437b68..da9c6e1b843a 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -138,6 +138,9 @@ struct btrfs_device {
 	atomic_t dev_stat_values[BTRFS_DEV_STAT_VALUES_MAX];
 
 	struct extent_io_tree alloc_state;
+
+	struct completion kobj_unregister;
+	struct kobject devid_kobj;
 };
 
 /*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2019-12-14  0:26                 ` Anand Jain
@ 2020-01-06 16:00                   ` David Sterba
  0 siblings, 0 replies; 33+ messages in thread
From: David Sterba @ 2020-01-06 16:00 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs

On Sat, Dec 14, 2019 at 08:26:24AM +0800, Anand Jain wrote:
> 
> 
> On 14/12/19 1:02 AM, David Sterba wrote:
> > On Fri, Dec 13, 2019 at 05:43:32PM +0100, David Sterba wrote:
> >>>    Looked into this further, actually we don't need any lock here
> >>>    the device delete thread which calls kobject_put() makes sure
> >>>    sysfs read is closed. So an existing sysfs read thread will have
> >>>    to complete before device free.
> >>>
> >>>
> >>>         CPU1                                   CPU2
> >>>
> >>>    btrfs_rm_device
> >>>                                             open file
> >>>       btrfs_sysfs_rm_device_link
> >>>                                             call read, access freed device
> >>>         sysfs waits for the open file
> >>>         to close.
> >>
> >> How exactly does sysfs wait for the device? Is it eg wait_event checking
> >> number of references? If the file stays open by an evil process is it
> >> going to block the device removal indefinitelly?
> > 
> > Yeah, sysfs waits until the file is closed. Eg. umount can be stalled
> > that way too.
> 
>   And similar to umount, I don't think we should return EBUSY
>   for btrfs_rm_device if the device sysfs attribute is opened,
>   as sysfs show attributes are non blocking and would be completed
>   in the timely manner.

While I don't think the blocking behaviour is totally OK, returning
EBUSY could be confusing without any other explanation. Also leaving
sysfs files open but not read is kind of strange on itself.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v4 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2020-01-06 11:38   ` [PATCH v4] " Anand Jain
@ 2020-01-09 15:20     ` David Sterba
  2020-01-10  1:03       ` Anand Jain
  0 siblings, 1 reply; 33+ messages in thread
From: David Sterba @ 2020-01-09 15:20 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Mon, Jan 06, 2020 at 07:38:31PM +0800, Anand Jain wrote:
> New sysfs attributes
>   in_fs_metadata  missing  replace_target  writeable
> are added under a new kobject
>   UUID/devinfo/<devid>
> 
> These attributes reflects the state of the device from the kernel
> fed by %btrfs_device::dev_state.
> These attributes are born during mount and goes along with the dynamic
> nature of the device add and delete, otherwise these attribute and kobject
> gets deleted at unmount.
> 
> Sample output:
> pwd
>  /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo/1/
> ls
>   in_fs_metadata  missing  replace_target  writeable
> cat missing
>   0
> 
> The output from these attributes are 0 or 1. 0 indicates unset and 1
> indicates set.
> 
> As of now these attributes are readonly.
> 
> It is observed that the device delete thread and sysfs read thread will
> not race because the delete thread calls sysfs kobject_put() which in turn
> waits for existing sysfs read to complete.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> v4:
>    after patch
>    [PATCH v5 2/2] btrfs: reset device back to allocation state when removing
>    in misc-next, the device::devid_kobj remains stale, fix it by using
>    release.
> 
> v3:
>   Use optional groupid devid in BTRFS_ATTR(), it was blank in v2.
> 
> V2:
>   Make the devinfo attribute to carry one parameter, so now
>   instead of dev_state attribute, we create in_fs_metadata,
>   writeable, missing and replace_target attributes.
> 
>  fs/btrfs/sysfs.c   | 142 ++++++++++++++++++++++++++++++++++++++++++++---------
>  fs/btrfs/volumes.h |   3 ++
>  2 files changed, 122 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index 834f712ed60c..18dac99188ce 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -978,29 +978,116 @@ int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
>  	if (!fs_devices->devices_kobj)
>  		return -EINVAL;
>  
> -	if (one_device && one_device->bdev) {
> -		disk = one_device->bdev->bd_part;
> -		disk_kobj = &part_to_dev(disk)->kobj;
> +	if (one_device) {
> +		if (one_device->bdev) {
> +			disk = one_device->bdev->bd_part;
> +			disk_kobj = &part_to_dev(disk)->kobj;
> +			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
> +		}
>  
> -		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
> -	}
> +		kobject_del(&one_device->devid_kobj);
> +		kobject_put(&one_device->devid_kobj);
> +
> +		wait_for_completion(&one_device->kobj_unregister);
>  
> -	if (one_device)
>  		return 0;
> +	}
>  
> -	list_for_each_entry(one_device,
> -			&fs_devices->devices, dev_list) {
> -		if (!one_device->bdev)
> -			continue;
> -		disk = one_device->bdev->bd_part;
> -		disk_kobj = &part_to_dev(disk)->kobj;
> +	list_for_each_entry(one_device, &fs_devices->devices, dev_list) {
> +
> +		if (one_device->bdev) {
> +			disk = one_device->bdev->bd_part;
> +			disk_kobj = &part_to_dev(disk)->kobj;
> +			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
> +		}
> +		kobject_del(&one_device->devid_kobj);
> +		kobject_put(&one_device->devid_kobj);
>  
> -		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
> +		wait_for_completion(&one_device->kobj_unregister);
>  	}
>  
>  	return 0;
>  }
>  
> +static ssize_t btrfs_sysfs_writeable_show(struct kobject *kobj,

This could be btrfs_devinfo_writeable_show as the _sysfs_ part means
it's something generic and possibly exported for other parts to use.
Same for the other callbacks.

> +static struct attribute *devid_attrs[] = {
> +	BTRFS_ATTR_PTR(devid, writeable),
> +	BTRFS_ATTR_PTR(devid, in_fs_metadata),
> +	BTRFS_ATTR_PTR(devid, missing),
> +	BTRFS_ATTR_PTR(devid, replace_target),

Sorted alphabetically.

All will be fixed at commit time. The device state bits could be
interestig to user in some cases and they're probably going to stay so
we have some future guarantee of the sort-of-ABI for sysfs.

The first 3 patches are deep in misc-next so I'll reorder them so the
whole series is grouped together.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v4 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
  2020-01-09 15:20     ` [PATCH v4 4/4] " David Sterba
@ 2020-01-10  1:03       ` Anand Jain
  0 siblings, 0 replies; 33+ messages in thread
From: Anand Jain @ 2020-01-10  1:03 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 9/1/20 11:20 PM, David Sterba wrote:
> On Mon, Jan 06, 2020 at 07:38:31PM +0800, Anand Jain wrote:
>> New sysfs attributes
>>    in_fs_metadata  missing  replace_target  writeable
>> are added under a new kobject
>>    UUID/devinfo/<devid>
>>
>> These attributes reflects the state of the device from the kernel
>> fed by %btrfs_device::dev_state.
>> These attributes are born during mount and goes along with the dynamic
>> nature of the device add and delete, otherwise these attribute and kobject
>> gets deleted at unmount.
>>
>> Sample output:
>> pwd
>>   /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo/1/
>> ls
>>    in_fs_metadata  missing  replace_target  writeable
>> cat missing
>>    0
>>
>> The output from these attributes are 0 or 1. 0 indicates unset and 1
>> indicates set.
>>
>> As of now these attributes are readonly.
>>
>> It is observed that the device delete thread and sysfs read thread will
>> not race because the delete thread calls sysfs kobject_put() which in turn
>> waits for existing sysfs read to complete.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> v4:
>>     after patch
>>     [PATCH v5 2/2] btrfs: reset device back to allocation state when removing
>>     in misc-next, the device::devid_kobj remains stale, fix it by using
>>     release.
>>
>> v3:
>>    Use optional groupid devid in BTRFS_ATTR(), it was blank in v2.
>>
>> V2:
>>    Make the devinfo attribute to carry one parameter, so now
>>    instead of dev_state attribute, we create in_fs_metadata,
>>    writeable, missing and replace_target attributes.
>>
>>   fs/btrfs/sysfs.c   | 142 ++++++++++++++++++++++++++++++++++++++++++++---------
>>   fs/btrfs/volumes.h |   3 ++
>>   2 files changed, 122 insertions(+), 23 deletions(-)
>>
>> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
>> index 834f712ed60c..18dac99188ce 100644
>> --- a/fs/btrfs/sysfs.c
>> +++ b/fs/btrfs/sysfs.c
>> @@ -978,29 +978,116 @@ int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
>>   	if (!fs_devices->devices_kobj)
>>   		return -EINVAL;
>>   
>> -	if (one_device && one_device->bdev) {
>> -		disk = one_device->bdev->bd_part;
>> -		disk_kobj = &part_to_dev(disk)->kobj;
>> +	if (one_device) {
>> +		if (one_device->bdev) {
>> +			disk = one_device->bdev->bd_part;
>> +			disk_kobj = &part_to_dev(disk)->kobj;
>> +			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
>> +		}
>>   
>> -		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
>> -	}
>> +		kobject_del(&one_device->devid_kobj);
>> +		kobject_put(&one_device->devid_kobj);
>> +
>> +		wait_for_completion(&one_device->kobj_unregister);
>>   
>> -	if (one_device)
>>   		return 0;
>> +	}
>>   
>> -	list_for_each_entry(one_device,
>> -			&fs_devices->devices, dev_list) {
>> -		if (!one_device->bdev)
>> -			continue;
>> -		disk = one_device->bdev->bd_part;
>> -		disk_kobj = &part_to_dev(disk)->kobj;
>> +	list_for_each_entry(one_device, &fs_devices->devices, dev_list) {
>> +
>> +		if (one_device->bdev) {
>> +			disk = one_device->bdev->bd_part;
>> +			disk_kobj = &part_to_dev(disk)->kobj;
>> +			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
>> +		}
>> +		kobject_del(&one_device->devid_kobj);
>> +		kobject_put(&one_device->devid_kobj);
>>   
>> -		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
>> +		wait_for_completion(&one_device->kobj_unregister);
>>   	}
>>   
>>   	return 0;
>>   }
>>   
>> +static ssize_t btrfs_sysfs_writeable_show(struct kobject *kobj,
> 
> This could be btrfs_devinfo_writeable_show as the _sysfs_ part means
> it's something generic and possibly exported for other parts to use.
> Same for the other callbacks.
> 
>> +static struct attribute *devid_attrs[] = {
>> +	BTRFS_ATTR_PTR(devid, writeable),
>> +	BTRFS_ATTR_PTR(devid, in_fs_metadata),
>> +	BTRFS_ATTR_PTR(devid, missing),
>> +	BTRFS_ATTR_PTR(devid, replace_target),
> 
> Sorted alphabetically.
> 
> All will be fixed at commit time. The device state bits could be
> interestig to user in some cases and they're probably going to stay so
> we have some future guarantee of the sort-of-ABI for sysfs.
> 
> The first 3 patches are deep in misc-next so I'll reorder them so the
> whole series is grouped together.
> 

  I agree with your suggestions, thanks for correcting them before commit.
Anand

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] btrfs: sysfs, add UUID/devinfo kobject
  2019-12-05 11:27 ` [PATCH 2/4] btrfs: sysfs, add UUID/devinfo kobject Anand Jain
@ 2020-01-14 18:32   ` David Sterba
  2020-02-03 10:40     ` Anand Jain
  0 siblings, 1 reply; 33+ messages in thread
From: David Sterba @ 2020-01-14 18:32 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Thu, Dec 05, 2019 at 07:27:04PM +0800, Anand Jain wrote:
> Preparatory patch creates kobject /sys/fs/btrfs/UUID/devinfo to hold
> btrfs_fs_devices::dev_state attribute.
> 
> This is being added in the mount context, that means we don't see
> UUID/devinfo before the mount (yet).
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>

There's a report about duplicate directory created:

btrfs/064

[ 1782.480622] sysfs: cannot create duplicate filename '/fs/btrfs/3f26615b-afcd-4008-8aed-44e714be257c/devices/0'                                                                                               │··
[ 1782.486942] CPU: 1 PID: 1983 Comm: btrfs Not tainted 5.5.0-rc5-default+ #932                                                                                                                                 │··
[ 1782.490425] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014                                                                               │··
[ 1782.494584] Call Trace:                                                                                                                                                                                      │··
[ 1782.495588]  dump_stack+0x71/0xa0                                                                                                                                                                            │··
[ 1782.497531]  sysfs_warn_dup.cold+0x17/0x2d                                                                                                                                                                   │··
[ 1782.499626]  sysfs_create_dir_ns+0xb6/0xd0                                                                                                                                                                   │··
[ 1782.501732]  kobject_add_internal+0xbb/0x2d0                                                                                                                                                                 │··
[ 1782.504346]  kobject_init_and_add+0x71/0xa0                                                                                                                                                                  │··
[ 1782.505694]  ? lockdep_init_map+0x43/0x1d0                                                                                                                                                                   │··
[ 1782.506800]  btrfs_sysfs_add_device_link+0xb5/0x100 [btrfs]                                                                                                                                                  │··
[ 1782.508356]  ? rwsem_wake.isra.0+0x6c/0x90                                                                                                                                                                   │··
[ 1782.510070]  btrfs_dev_replace_start+0x2cf/0x380 [btrfs]                                                                                                                                                     │··
[ 1782.511663]  btrfs_dev_replace_by_ioctl+0x35/0x60 [btrfs]                                                                                                                                                    │··
[ 1782.513177]  btrfs_ioctl+0x1d72/0x2560 [btrfs]                                                                                                                                                               │··
[ 1782.514375]  ? do_sigaction+0x5f/0x240                                                                                                                                                                       │··
[ 1782.515867]  ? do_vfs_ioctl+0x56e/0x770                                                                                                                                                                      │··
[ 1782.517276]  do_vfs_ioctl+0x56e/0x770                                                                                                                                                                        │··
[ 1782.518577]  ? do_sigaction+0xf8/0x240                                                                                                                                                                       │··
[ 1782.519898]  ksys_ioctl+0x3a/0x70                                                                                                                                                                            │··
[ 1782.521171]  ? trace_hardirqs_off_thunk+0x1a/0x1c                                                                                                                                                            │··
[ 1782.522783]  __x64_sys_ioctl+0x16/0x20                                                                                                                                                                       │··
[ 1782.524064]  do_syscall_64+0x50/0x210                                                                                                                                                                        │··
[ 1782.525313]  entry_SYSCALL_64_after_hwframe+0x49/0xbe                                                                                                                                                        │··
[ 1782.526818] RIP: 0033:0x7feafcbd5387                                                                                                                                                                         │··
[ 1782.528154] Code: 00 00 90 48 8b 05 f9 9a 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 9a 0c 00 f7 d8 64 89│··
 01 48                                                                                                                                                                                                          │··
[ 1782.533927] RSP: 002b:00007ffcb14eea88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010                                                                                                                           │··
[ 1782.536483] RAX: ffffffffffffffda RBX: 00007ffcb14f11b0 RCX: 00007feafcbd5387                                                                                                                                │··
[ 1782.538983] RDX: 00007ffcb14ef8f0 RSI: 00000000ca289435 RDI: 0000000000000003                                                                                                                                │··
[ 1782.541339] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000                                                                                                                                │··
[ 1782.543809] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000003                                                                                                                                │··
[ 1782.545906] R13: 0000000000000001 R14: 0000000000000000 R15: 00005641a71d02e0                                                                                                                                │··
[ 1782.548347] kobject_add_internal failed for 0 with -EEXIST, don't try to register things with the same name in the same directory.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/4] btrfs, sysfs cleanup and add dev_state
  2019-12-05 11:27 [PATCH 0/4] btrfs, sysfs cleanup and add dev_state Anand Jain
                   ` (5 preceding siblings ...)
  2019-12-09 22:48 ` Anand Jain
@ 2020-01-14 18:39 ` David Sterba
  2020-01-15  8:22   ` [PATCH] btrfs: update devid after replace Anand Jain
  6 siblings, 1 reply; 33+ messages in thread
From: David Sterba @ 2020-01-14 18:39 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Thu, Dec 05, 2019 at 07:27:02PM +0800, Anand Jain wrote:
> Patch 1/4 is a cleanup patch.
> Patch 2/4 adds the kobject devinfo which is a preparatory to patch 4/4.
> Patch 3/4 was sent before, no functional changes, change log is updated.
> Patch 4/4 creates the attribute dev_state.
> 
> Anand Jain (4):
>   btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid
>   btrfs: sysfs, add UUID/devinfo kobject
>   btrfs: sysfs, rename device_link add,remove functions
>   btrfs: sysfs, add devid/dev_state kobject and attribute

I'm not sure which patch causes that but there are more problems, from test
btrfs/064. The log is full of it, the create part for device id 0 and
when the filesystem is unmounted, refcount for the object is wrong and
sysfs complains.

[ 1839.514008] kobject: '(null)' (00000000167bb824): is not initialized, yet kobject_put() is being called.                                                                                                     │··
[ 1839.517437] WARNING: CPU: 3 PID: 2487 at lib/kobject.c:736 kobject_put+0x40/0xb0                                                                                                                             │··
[ 1839.520253] Modules linked in: dm_flakey dm_mod dax xxhash_generic btrfs blake2b_generic libcrc32c crc32c_intel xor zstd_decompress zstd_compress xxhash lzo_compress lzo_decompress raid6_pq loop           │··
[ 1839.524566] CPU: 3 PID: 2487 Comm: umount Tainted: G        W         5.5.0-rc5-default+ #932                                                                                                                │··
[ 1839.527075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014                                                                               │··
[ 1839.529898] RIP: 0010:kobject_put+0x40/0xb0                                                                                                                                                                  │··
 0c 80                                                                                                                                                                                                          │··
[ 1839.535471] RSP: 0018:ffffac040741fdc0 EFLAGS: 00010282                                                                                                                                                      │··
[ 1839.537134] RAX: 0000000000000000 RBX: ffff93422bc0f000 RCX: 0000000000000001                                                                                                                                │··
[ 1839.539236] RDX: 0000000000000000 RSI: ffffffff9f100899 RDI: ffffffff9f100971                                                                                                                                │··
[ 1839.541299] RBP: ffff93422bc0f2f8 R08: 0000000000000000 R09: 0000000000000000                                                                                                                                │··
[ 1839.543132] R10: 0000000000000000 R11: 0000000000000000 R12: ffff934239959a00                                                                                                                                │··
[ 1839.545114] R13: ffff934239959b18 R14: 0000000000000000 R15: ffff934235b23fc8                                                                                                                                │··
[ 1839.547131] FS:  00007f9d2c233800(0000) GS:ffff93423dc00000(0000) knlGS:0000000000000000                                                                                                                     │··
[ 1839.549219] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                                                                                                                                                │··
[ 1839.550380] CR2: 00007fff85c8cff8 CR3: 000000001ff3f006 CR4: 0000000000160ee0                                                                                                                                │··
[ 1839.551841] Call Trace:                                                                                                                                                                                      │··
[ 1839.553050]  btrfs_sysfs_rm_device_link+0xb8/0xe0 [btrfs]                                                                                                                                                    │··
[ 1839.554862]  close_ctree+0x23a/0x334 [btrfs]                                                                                                                                                                 │··
[ 1839.556429]  generic_shutdown_super+0x69/0x100                                                                                                                                                               │··
[ 1839.558030]  kill_anon_super+0x14/0x30                                                                                                                                                                       │··
[ 1839.559461]  btrfs_kill_super+0x12/0xa0 [btrfs]                                                                                                                                                              │··
[ 1839.560894]  deactivate_locked_super+0x2c/0x70                                                                                                                                                               │··
[ 1839.562375]  cleanup_mnt+0x100/0x160                                                                                                                                                                         │··
[ 1839.563682]  task_work_run+0x90/0xc0                                                                                                                                                                         │··
[ 1839.564991]  exit_to_usermode_loop+0x96/0xa0                                                                                                                                                                 │··
[ 1839.566438]  do_syscall_64+0x1df/0x210                                                                                                                                                                       │··
[ 1839.567792]  entry_SYSCALL_64_after_hwframe+0x49/0xbe                                                                                                                                                        │··
[ 1839.569416] RIP: 0033:0x7f9d2c4774e7                                                                                                                                                                         │··
 01 48                                                                                                                                                                                                          │··
[ 1839.576283] RSP: 002b:00007fff85c8e388 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6                                                                                                                           │··
[ 1839.578849] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f9d2c4774e7                                                                                                                                │··
[ 1839.581006] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055e08e9dab80                                                                                                                                │··
[ 1839.583132] RBP: 000055e08e9da970 R08: 0000000000000000 R09: 00007fff85c8d100                                                                                                                                │··
[ 1839.585271] R10: 000055e08e9daba0 R11: 0000000000000246 R12: 000055e08e9dab80                                                                                                                                │··
[ 1839.587420] R13: 0000000000000000 R14: 000055e08e9daa68 R15: 0000000000000000                                                                                                                                │··
[ 1839.589528] irq event stamp: 0                                                                                                                                                                               │··
[ 1839.590731] hardirqs last  enabled at (0): [<0000000000000000>] 0x0                                                                                                                                          │··
[ 1839.592667] hardirqs last disabled at (0): [<ffffffff9f07efb0>] copy_process+0x680/0x1b70                                                                                                                    │··
[ 1839.595447] softirqs last  enabled at (0): [<ffffffff9f07efb0>] copy_process+0x680/0x1b70                                                                                                                    │··
[ 1839.598216] softirqs last disabled at (0): [<0000000000000000>] 0x0                                                                                                                                          │··
[ 1839.600178] ---[ end trace d7fb083174aa2eaf ]---

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] btrfs: update devid after replace
  2020-01-14 18:39 ` David Sterba
@ 2020-01-15  8:22   ` Anand Jain
  2020-01-15 16:17     ` David Sterba
  2020-01-20 19:10     ` David Sterba
  0 siblings, 2 replies; 33+ messages in thread
From: Anand Jain @ 2020-01-15  8:22 UTC (permalink / raw)
  To: linux-btrfs

During the replace the target device temporarily assumes devid 0 before
assigning the devid of the soruce device.

In btrfs_dev_replace_finishing() we remove source sysfs devid using
the function btrfs_sysfs_remove_devices_attr(), so after that call
kobject_rename() to update the devid in the sysfs.
This adds and calls btrfs_sysfs_update_devid() helper function to update
the device id.

This patch must be squashed with the patch
 [PATCH v4] btrfs: sysfs, add devid/dev_state kobject and attribute
or its variant.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
David,
 I couldn't find the patch-series..
   [PATCH 0/4] btrfs, sysfs cleanup and add dev_state
 in your misc-next. And I believe there were changes like
 function rename and attribute list reorder in your workspace. So I am
 sending a fix-patch which must be squashed to the patch v4 4/4.

 With this patch, the test case btrfs/064 runs fine. And volume group
 tests are still running.
 
 fs/btrfs/dev-replace.c |  1 +
 fs/btrfs/sysfs.c       | 11 +++++++++++
 fs/btrfs/sysfs.h       |  1 +
 3 files changed, 13 insertions(+)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 9a29d6de9017..ccdb486bd4c3 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -707,6 +707,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 
 	/* replace the sysfs entry */
 	btrfs_sysfs_remove_devices_attr(fs_info->fs_devices, src_device);
+	btrfs_sysfs_update_devid(tgt_device);
 	btrfs_rm_dev_replace_free_srcdev(src_device);
 
 	/* write back the superblocks */
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 0b615d99cedd..22971268e5b6 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1189,6 +1189,17 @@ int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
 	return 0;
 }
 
+void btrfs_sysfs_update_devid(struct btrfs_device *device)
+{
+	char tmp[64];
+
+	snprintf(tmp, sizeof(tmp), "%llu", device->devid);
+
+	if (kobject_rename(&device->devid_kobj, tmp))
+		btrfs_warn(device->fs_devices->fs_info,
+			   "sysfs: failed to update devid");
+}
+
 static ssize_t btrfs_sysfs_writeable_show(struct kobject *kobj,
 					  struct kobj_attribute *a, char *buf)
 {
diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h
index 9d97b3c8db4e..ccf33eaf9e59 100644
--- a/fs/btrfs/sysfs.h
+++ b/fs/btrfs/sysfs.h
@@ -34,5 +34,6 @@ void btrfs_sysfs_add_block_group_type(struct btrfs_block_group *cache);
 int btrfs_sysfs_add_space_info_type(struct btrfs_fs_info *fs_info,
 				    struct btrfs_space_info *space_info);
 void btrfs_sysfs_remove_space_info(struct btrfs_space_info *space_info);
+void btrfs_sysfs_update_devid(struct btrfs_device *device);
 
 #endif
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] btrfs: update devid after replace
  2020-01-15  8:22   ` [PATCH] btrfs: update devid after replace Anand Jain
@ 2020-01-15 16:17     ` David Sterba
  2020-01-20 19:10     ` David Sterba
  1 sibling, 0 replies; 33+ messages in thread
From: David Sterba @ 2020-01-15 16:17 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Wed, Jan 15, 2020 at 04:22:50PM +0800, Anand Jain wrote:
> During the replace the target device temporarily assumes devid 0 before
> assigning the devid of the soruce device.
> 
> In btrfs_dev_replace_finishing() we remove source sysfs devid using
> the function btrfs_sysfs_remove_devices_attr(), so after that call
> kobject_rename() to update the devid in the sysfs.
> This adds and calls btrfs_sysfs_update_devid() helper function to update
> the device id.
> 
> This patch must be squashed with the patch
>  [PATCH v4] btrfs: sysfs, add devid/dev_state kobject and attribute
> or its variant.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> David,
>  I couldn't find the patch-series..
>    [PATCH 0/4] btrfs, sysfs cleanup and add dev_state
>  in your misc-next. And I believe there were changes like
>  function rename and attribute list reorder in your workspace. So I am
>  sending a fix-patch which must be squashed to the patch v4 4/4.

I had to remove the patch from misc-next as it prevented further
testing. The whole patchset can be found in branch
ext/anand/sysfs-devinfo on github repo. Once I test this patch I'll add
it back.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] btrfs: update devid after replace
  2020-01-15  8:22   ` [PATCH] btrfs: update devid after replace Anand Jain
  2020-01-15 16:17     ` David Sterba
@ 2020-01-20 19:10     ` David Sterba
  1 sibling, 0 replies; 33+ messages in thread
From: David Sterba @ 2020-01-20 19:10 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Wed, Jan 15, 2020 at 04:22:50PM +0800, Anand Jain wrote:
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -1189,6 +1189,17 @@ int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
>  	return 0;
>  }
>  
> +void btrfs_sysfs_update_devid(struct btrfs_device *device)
> +{
> +	char tmp[64];

24 is enough

> +
> +	snprintf(tmp, sizeof(tmp), "%llu", device->devid);
> +
> +	if (kobject_rename(&device->devid_kobj, tmp))
> +		btrfs_warn(device->fs_devices->fs_info,
> +			   "sysfs: failed to update devid");

And printing for which devid the rename failed should be here

> +}
> +

Once the tests finish I'll add it to misc-next so the patchset is
complete.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] btrfs: sysfs, add UUID/devinfo kobject
  2020-01-14 18:32   ` David Sterba
@ 2020-02-03 10:40     ` Anand Jain
  0 siblings, 0 replies; 33+ messages in thread
From: Anand Jain @ 2020-02-03 10:40 UTC (permalink / raw)
  To: dsterba, linux-btrfs




On 15/1/20 2:32 AM, David Sterba wrote:
> On Thu, Dec 05, 2019 at 07:27:04PM +0800, Anand Jain wrote:
>> Preparatory patch creates kobject /sys/fs/btrfs/UUID/devinfo to hold
>> btrfs_fs_devices::dev_state attribute.
>>
>> This is being added in the mount context, that means we don't see
>> UUID/devinfo before the mount (yet).
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> 


> There's a report about duplicate directory created:
> 
> btrfs/064
>
> 
> [ 1782.480622] sysfs: cannot create duplicate filename '/fs/btrfs/3f26615b-afcd-4008-8aed-44e714be257c/devices/0'                                                                                               

  Just for the clarity this patch did not cause this issue.

David,

  The patches 1/4,2/4,3/4 in this series are cleanups and preparatory
  which is missing in misc-next.

  So the devid is being created under <UUID>/devices instead of
  <UUID>/devinfo as planned.

  I am sending the whole series again with the patch 4/4 taken from
  misc-next, please help to integrate.

Thanks, Anand

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2020-02-03 10:40 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-05 11:27 [PATCH 0/4] btrfs, sysfs cleanup and add dev_state Anand Jain
2019-12-05 11:27 ` [PATCH 1/4] btrfs: sysfs, use btrfs_sysfs_remove_fsid in fail return in add_fsid Anand Jain
2019-12-05 11:27 ` [PATCH 2/4] btrfs: sysfs, add UUID/devinfo kobject Anand Jain
2020-01-14 18:32   ` David Sterba
2020-02-03 10:40     ` Anand Jain
2019-12-05 11:27 ` [PATCH v2 3/4] btrfs: sysfs, rename device_link add,remove functions Anand Jain
2019-12-05 11:27 ` [PATCH 4/4] btrfs: sysfs, add devid/dev_state kobject and attribute Anand Jain
2019-12-05 14:10   ` David Sterba
2019-12-05 14:38     ` Anand Jain
2019-12-05 14:21   ` David Sterba
2019-12-05 14:38     ` Anand Jain
2019-12-05 15:14       ` David Sterba
2019-12-06 13:49         ` Anand Jain
2019-12-09 14:05           ` Anand Jain
2019-12-09 18:31             ` David Sterba
2019-12-13 16:43             ` David Sterba
2019-12-13 17:02               ` David Sterba
2019-12-14  0:26                 ` Anand Jain
2020-01-06 16:00                   ` David Sterba
2019-12-09 14:06   ` [PATCH v2 " Anand Jain
2019-12-19 10:41     ` [PATCH v3 " Anand Jain
2020-01-06 11:38   ` [PATCH v4] " Anand Jain
2020-01-09 15:20     ` [PATCH v4 4/4] " David Sterba
2020-01-10  1:03       ` Anand Jain
2019-12-05 15:00 ` [PATCH 0/4] btrfs, sysfs cleanup and add dev_state David Sterba
2019-12-06 13:46   ` Anand Jain
2019-12-09 14:09     ` Anand Jain
2019-12-09 22:48 ` Anand Jain
2019-12-10 16:41   ` David Sterba
2020-01-14 18:39 ` David Sterba
2020-01-15  8:22   ` [PATCH] btrfs: update devid after replace Anand Jain
2020-01-15 16:17     ` David Sterba
2020-01-20 19:10     ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.