All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] btrfs: match device by dev_t
@ 2021-12-10 18:15 Anand Jain
  2021-12-10 18:15 ` [PATCH v2 1/2] btrfs: harden identification of the stale device Anand Jain
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Anand Jain @ 2021-12-10 18:15 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef

Patch 1 is the actual bug fix and should go to stable 5.4 as well.
On 5.4 patch1 conflicts (outside of the changes in the patch),
so not yet marked for the stable.

Patch 2 simplifies calling lookup_bdev() in the device_matched()
by moving the same to the parent function two levels up.

Patch 2 is not merged with 1 because to keep the patch 1 changes local
to a function so that it can be easily backported to 5.4 and 5.10.

We should save the dev_t in struct btrfs_device with that may be
we could clean up a few more things, including fixing the below sparse
warning.

  sparse: sparse: incorrect type in argument 1 (different address spaces)

For using without rcu:

  error = lookup_bdev(device->name->str, &dev_old); 

Anand Jain (2):
  btrfs: harden identification of the stale device
  btrfs: redeclare btrfs_stale_devices arg1 to dev_t

 fs/btrfs/super.c   |  8 +++++-
 fs/btrfs/volumes.c | 72 +++++++++++++++++++++++++++++++++-------------
 fs/btrfs/volumes.h |  2 +-
 3 files changed, 60 insertions(+), 22 deletions(-)

-- 
2.33.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/2] btrfs: harden identification of the stale device
  2021-12-10 18:15 [PATCH v2 0/2] btrfs: match device by dev_t Anand Jain
@ 2021-12-10 18:15 ` Anand Jain
  2021-12-13 15:04   ` Nikolay Borisov
  2022-01-04 18:56   ` David Sterba
  2021-12-10 18:15 ` [PATCH v2 2/2] btrfs: redeclare btrfs_stale_devices arg1 to dev_t Anand Jain
  2022-01-04  8:15 ` [PATCH v2 0/2] btrfs: match device by dev_t Anand Jain
  2 siblings, 2 replies; 11+ messages in thread
From: Anand Jain @ 2021-12-10 18:15 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef

Identifying and removing the stale device from the fs_uuids list is done
by the function btrfs_free_stale_devices().
btrfs_free_stale_devices() in turn depends on the function
device_path_matched() to check if the device repeats in more than one
btrfs_device structure.

The matching of the device happens by its path, the device path. However,
when dm mapper is in use, the dm device paths are nothing but a link to
the actual block device, which leads to the device_path_matched() failing
to match.

Fix this by matching the dev_t as provided by lookup_bdev() instead of
plain strcmp() the device paths.

Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---

v2: Fix 
     sparse: warning: incorrect type in argument 1 (different address spaces)
     For using device->name->str

    Fix Josef suggestion to pass dev_t instead of device-path in the
     patch 2/2.

 fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1b02c03a882c..559fdb0c4a0e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
 	return ret;
 }
 
-static bool device_path_matched(const char *path, struct btrfs_device *device)
+/*
+ * Check if the device in the 'path' matches with the device in the given
+ * struct btrfs_device '*device'.
+ * Returns:
+ *	0	If it is the same device.
+ *	1	If it is not the same device.
+ *	-errno	For error.
+ */
+static int device_matched(struct btrfs_device *device, const char *path)
 {
-	int found;
+	char *device_name;
+	dev_t dev_old;
+	dev_t dev_new;
+	int ret;
+
+	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
+	if (!device_name)
+		return -ENOMEM;
 
 	rcu_read_lock();
-	found = strcmp(rcu_str_deref(device->name), path);
+	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
 	rcu_read_unlock();
+	if (!ret) {
+		kfree(device_name);
+		return -EINVAL;
+	}
 
-	return found == 0;
+	ret = lookup_bdev(device_name, &dev_old);
+	kfree(device_name);
+	if (ret)
+		return ret;
+
+	ret = lookup_bdev(path, &dev_new);
+	if (ret)
+		return ret;
+
+	if (dev_old == dev_new)
+		return 0;
+
+	return 1;
 }
 
 /*
@@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
 				continue;
 			if (path && !device->name)
 				continue;
-			if (path && !device_path_matched(path, device))
+			if (path && device_matched(device, path) != 0)
 				continue;
 			if (fs_devices->opened) {
 				/* for an already deleted device return 0 */
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 2/2] btrfs: redeclare btrfs_stale_devices arg1 to dev_t
  2021-12-10 18:15 [PATCH v2 0/2] btrfs: match device by dev_t Anand Jain
  2021-12-10 18:15 ` [PATCH v2 1/2] btrfs: harden identification of the stale device Anand Jain
@ 2021-12-10 18:15 ` Anand Jain
  2022-01-04  8:15 ` [PATCH v2 0/2] btrfs: match device by dev_t Anand Jain
  2 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2021-12-10 18:15 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef

After the commit cb57afa39796 ("btrfs: harden identification of the stale
device"), we don't have to match the device path anymore. Instead, we
match the dev_t. So pass in the dev_t instead of the device-path, in the call
chain btrfs_forget_devices()->btrfs_free_stale_devices().

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/super.c   |  8 +++++++-
 fs/btrfs/volumes.c | 45 +++++++++++++++++++++++----------------------
 fs/btrfs/volumes.h |  2 +-
 3 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index a1c54a2c787c..985395085886 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2386,6 +2386,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 {
 	struct btrfs_ioctl_vol_args *vol;
 	struct btrfs_device *device = NULL;
+	dev_t devt = 0;
 	int ret = -ENOTTY;
 
 	if (!capable(CAP_SYS_ADMIN))
@@ -2405,7 +2406,12 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 		mutex_unlock(&uuid_mutex);
 		break;
 	case BTRFS_IOC_FORGET_DEV:
-		ret = btrfs_forget_devices(vol->name);
+		if (strlen(vol->name)) {
+			ret = lookup_bdev(vol->name, &devt);
+			if (ret)
+				break;
+		}
+		ret = btrfs_forget_devices(devt);
 		break;
 	case BTRFS_IOC_DEVICES_READY:
 		mutex_lock(&uuid_mutex);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 559fdb0c4a0e..fdf35dc561ab 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -542,11 +542,10 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
  *	1	If it is not the same device.
  *	-errno	For error.
  */
-static int device_matched(struct btrfs_device *device, const char *path)
+static int device_matched(struct btrfs_device *device, dev_t dev_new)
 {
 	char *device_name;
 	dev_t dev_old;
-	dev_t dev_new;
 	int ret;
 
 	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
@@ -566,10 +565,6 @@ static int device_matched(struct btrfs_device *device, const char *path)
 	if (ret)
 		return ret;
 
-	ret = lookup_bdev(path, &dev_new);
-	if (ret)
-		return ret;
-
 	if (dev_old == dev_new)
 		return 0;
 
@@ -579,16 +574,16 @@ static int device_matched(struct btrfs_device *device, const char *path)
 /*
  *  Search and remove all stale (devices which are not mounted) devices.
  *  When both inputs are NULL, it will search and release all stale devices.
- *  path:	Optional. When provided will it release all unmounted devices
- *		matching this path only.
+ *  devt:	Optional. When provided will it release all unmounted devices
+ *		matching this devt only.
  *  skip_dev:	Optional. Will skip this device when searching for the stale
  *		devices.
- *  Return:	0 for success or if @path is NULL.
- * 		-EBUSY if @path is a mounted device.
- * 		-ENOENT if @path does not match any device in the list.
+ *  Return:	0 for success or if @devt is 0.
+ *		-EBUSY if @devt is a mounted device.
+ *		-ENOENT if @devt does not match any device in the list.
  */
-static int btrfs_free_stale_devices(const char *path,
-				     struct btrfs_device *skip_device)
+static int btrfs_free_stale_devices(dev_t devt,
+				    struct btrfs_device *skip_device)
 {
 	struct btrfs_fs_devices *fs_devices, *tmp_fs_devices;
 	struct btrfs_device *device, *tmp_device;
@@ -596,7 +591,7 @@ static int btrfs_free_stale_devices(const char *path,
 
 	lockdep_assert_held(&uuid_mutex);
 
-	if (path)
+	if (devt)
 		ret = -ENOENT;
 
 	list_for_each_entry_safe(fs_devices, tmp_fs_devices, &fs_uuids, fs_list) {
@@ -606,13 +601,13 @@ static int btrfs_free_stale_devices(const char *path,
 					 &fs_devices->devices, dev_list) {
 			if (skip_device && skip_device == device)
 				continue;
-			if (path && !device->name)
+			if (devt && !device->name)
 				continue;
-			if (path && device_matched(device, path) != 0)
+			if (devt && device_matched(device, devt) != 0)
 				continue;
 			if (fs_devices->opened) {
 				/* for an already deleted device return 0 */
-				if (path && ret != 0)
+				if (devt && ret != 0)
 					ret = -EBUSY;
 				break;
 			}
@@ -1361,12 +1356,12 @@ static struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev
 	return disk_super;
 }
 
-int btrfs_forget_devices(const char *path)
+int btrfs_forget_devices(dev_t devt)
 {
 	int ret;
 
 	mutex_lock(&uuid_mutex);
-	ret = btrfs_free_stale_devices(strlen(path) ? path : NULL, NULL);
+	ret = btrfs_free_stale_devices(devt, NULL);
 	mutex_unlock(&uuid_mutex);
 
 	return ret;
@@ -1414,8 +1409,12 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
 
 	device = device_list_add(path, disk_super, &new_device_added);
 	if (!IS_ERR(device)) {
-		if (new_device_added)
-			btrfs_free_stale_devices(path, device);
+		if (new_device_added) {
+			dev_t devt;
+
+			if (!lookup_bdev(path, &devt))
+				btrfs_free_stale_devices(devt, device);
+		}
 	}
 
 	btrfs_release_disk_super(disk_super);
@@ -2649,6 +2648,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	int ret = 0;
 	bool seeding_dev = false;
 	bool locked = false;
+	dev_t devt;
 
 	if (sb_rdonly(sb) && !fs_devices->seeding)
 		return -EROFS;
@@ -2843,7 +2843,8 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	 * We can ignore the return value as it typically returns -EINVAL and
 	 * only succeeds if the device was an alien.
 	 */
-	btrfs_forget_devices(device_path);
+	if (!lookup_bdev(device_path, &devt))
+		btrfs_forget_devices(devt);
 
 	/* Update ctime/mtime for blkid or udev */
 	update_dev_time(device_path);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 9cf1d93a3d66..1b644ee60d22 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -512,7 +512,7 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 		       fmode_t flags, void *holder);
 struct btrfs_device *btrfs_scan_one_device(const char *path,
 					   fmode_t flags, void *holder);
-int btrfs_forget_devices(const char *path);
+int btrfs_forget_devices(dev_t devt);
 void btrfs_close_devices(struct btrfs_fs_devices *fs_devices);
 void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices);
 void btrfs_assign_next_active_device(struct btrfs_device *device,
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] btrfs: harden identification of the stale device
  2021-12-10 18:15 ` [PATCH v2 1/2] btrfs: harden identification of the stale device Anand Jain
@ 2021-12-13 15:04   ` Nikolay Borisov
  2021-12-13 15:14     ` Nikolay Borisov
  2022-01-04 18:56   ` David Sterba
  1 sibling, 1 reply; 11+ messages in thread
From: Nikolay Borisov @ 2021-12-13 15:04 UTC (permalink / raw)
  To: Anand Jain, linux-btrfs; +Cc: josef



On 10.12.21 г. 20:15, Anand Jain wrote:
> Identifying and removing the stale device from the fs_uuids list is done
> by the function btrfs_free_stale_devices().
> btrfs_free_stale_devices() in turn depends on the function
> device_path_matched() to check if the device repeats in more than one
> btrfs_device structure.
> 
> The matching of the device happens by its path, the device path. However,
> when dm mapper is in use, the dm device paths are nothing but a link to
> the actual block device, which leads to the device_path_matched() failing
> to match.
> 
> Fix this by matching the dev_t as provided by lookup_bdev() instead of
> plain strcmp() the device paths.
> 
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> 
> v2: Fix 
>      sparse: warning: incorrect type in argument 1 (different address spaces)
>      For using device->name->str
> 
>     Fix Josef suggestion to pass dev_t instead of device-path in the
>      patch 2/2.
> 
>  fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 36 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 1b02c03a882c..559fdb0c4a0e 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>  	return ret;
>  }
>  
> -static bool device_path_matched(const char *path, struct btrfs_device *device)
> +/*
> + * Check if the device in the 'path' matches with the device in the given
> + * struct btrfs_device '*device'.
> + * Returns:
> + *	0	If it is the same device.
> + *	1	If it is not the same device.
> + *	-errno	For error.

This convention is somewhat confusing. This function returns a boolean
meaniing if a device matched or not, yet the retval follows strcmp
convention of return values. That is make 1 mean "device matched" and
"0" mean device not matched. Because ultimately that's what we care for.

Furthermore you give it the ability to return an error which not
consumed at all. Simply make the function boolean and return false if an
error is encountered by some of the internal calls.

> + */
> +static int device_matched(struct btrfs_device *device, const char *path)
>  {
> -	int found;
> +	char *device_name;
> +	dev_t dev_old;
> +	dev_t dev_new;
> +	int ret;
> +
> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
> +	if (!device_name)
> +		return -ENOMEM;
>  
>  	rcu_read_lock();
> -	found = strcmp(rcu_str_deref(device->name), path);
> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
>  	rcu_read_unlock();
> +	if (!ret) {
> +		kfree(device_name);
> +		return -EINVAL;
> +	}
>  
> -	return found == 0;
> +	ret = lookup_bdev(device_name, &dev_old);

Instead of allocating memory for storing device->name and freeing it,
AFAICS lookup_bdev can be called under rcu read section so you can
simply call lookup_bdev under rcu_read_lock which simplifies memory
management.


In the end this function really boils down to making 2 calls to
lookup_bdev and comparing their values for equality, no need for
anything more fancy than that.


> +	kfree(device_name);
> +	if (ret)
> +		return ret;
> +
> +	ret = lookup_bdev(path, &dev_new);
> +	if (ret)
> +		return ret;
> +
> +	if (dev_old == dev_new)
> +		return 0;
> +
> +	return 1;
>  }
>  
>  /*
> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,

What's more lookinng at the body of free_stale_device I find the name of
the function somewhat confusing. What it does is really delete all
devices from all fs_devices which match a particular criterion for a
device path i.e the function's body doesn't deal with the concept of
"stale" at all. As such I think it should be renamed and given a more
generic name like btrfs_free_specific_device or something along those
lines.

>  				continue;
>  			if (path && !device->name)
>  				continue;
> -			if (path && !device_path_matched(path, device))
> +			if (path && device_matched(device, path) != 0)
>  				continue;
>  			if (fs_devices->opened) {
>  				/* for an already deleted device return 0 */
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] btrfs: harden identification of the stale device
  2021-12-13 15:04   ` Nikolay Borisov
@ 2021-12-13 15:14     ` Nikolay Borisov
  2021-12-14 14:27       ` Anand Jain
  0 siblings, 1 reply; 11+ messages in thread
From: Nikolay Borisov @ 2021-12-13 15:14 UTC (permalink / raw)
  To: Anand Jain, linux-btrfs; +Cc: josef



On 13.12.21 г. 17:04, Nikolay Borisov wrote:
> 
> 
> On 10.12.21 г. 20:15, Anand Jain wrote:
>> Identifying and removing the stale device from the fs_uuids list is done
>> by the function btrfs_free_stale_devices().
>> btrfs_free_stale_devices() in turn depends on the function
>> device_path_matched() to check if the device repeats in more than one
>> btrfs_device structure.
>>
>> The matching of the device happens by its path, the device path. However,
>> when dm mapper is in use, the dm device paths are nothing but a link to
>> the actual block device, which leads to the device_path_matched() failing
>> to match.
>>
>> Fix this by matching the dev_t as provided by lookup_bdev() instead of
>> plain strcmp() the device paths.
>>
>> Reported-by: Josef Bacik <josef@toxicpanda.com>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>
>> v2: Fix 
>>      sparse: warning: incorrect type in argument 1 (different address spaces)
>>      For using device->name->str
>>
>>     Fix Josef suggestion to pass dev_t instead of device-path in the
>>      patch 2/2.
>>
>>  fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 36 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 1b02c03a882c..559fdb0c4a0e 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>>  	return ret;
>>  }
>>  
>> -static bool device_path_matched(const char *path, struct btrfs_device *device)
>> +/*
>> + * Check if the device in the 'path' matches with the device in the given
>> + * struct btrfs_device '*device'.
>> + * Returns:
>> + *	0	If it is the same device.
>> + *	1	If it is not the same device.
>> + *	-errno	For error.
> 
> This convention is somewhat confusing. This function returns a boolean
> meaniing if a device matched or not, yet the retval follows strcmp
> convention of return values. That is make 1 mean "device matched" and
> "0" mean device not matched. Because ultimately that's what we care for.
> 
> Furthermore you give it the ability to return an error which not
> consumed at all. Simply make the function boolean and return false if an
> error is encountered by some of the internal calls.
> 
>> + */
>> +static int device_matched(struct btrfs_device *device, const char *path)
>>  {
>> -	int found;
>> +	char *device_name;
>> +	dev_t dev_old;
>> +	dev_t dev_new;
>> +	int ret;
>> +
>> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
>> +	if (!device_name)
>> +		return -ENOMEM;
>>  
>>  	rcu_read_lock();
>> -	found = strcmp(rcu_str_deref(device->name), path);
>> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
>>  	rcu_read_unlock();
>> +	if (!ret) {
>> +		kfree(device_name);
>> +		return -EINVAL;
>> +	}
>>  
>> -	return found == 0;
>> +	ret = lookup_bdev(device_name, &dev_old);
> 
> Instead of allocating memory for storing device->name and freeing it,
> AFAICS lookup_bdev can be called under rcu read section so you can
> simply call lookup_bdev under rcu_read_lock which simplifies memory
> management.

lookup_bdev calls kern_path->filejame_lookup which does an initial try
to lookup the name via an RCU but if it gets a ESTALE/ECHILD it will
fallback to a full path resolution and that *might* sleep so actually
doing the dynamic memory allocation is necessary... Bummer.

> 
> 
> In the end this function really boils down to making 2 calls to
> lookup_bdev and comparing their values for equality, no need for
> anything more fancy than that.
> 
> 
>> +	kfree(device_name);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = lookup_bdev(path, &dev_new);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (dev_old == dev_new)
>> +		return 0;
>> +
>> +	return 1;
>>  }
>>  
>>  /*
>> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
> 
> What's more lookinng at the body of free_stale_device I find the name of
> the function somewhat confusing. What it does is really delete all
> devices from all fs_devices which match a particular criterion for a
> device path i.e the function's body doesn't deal with the concept of
> "stale" at all. As such I think it should be renamed and given a more
> generic name like btrfs_free_specific_device or something along those
> lines.
> 
>>  				continue;
>>  			if (path && !device->name)
>>  				continue;
>> -			if (path && !device_path_matched(path, device))
>> +			if (path && device_matched(device, path) != 0)
>>  				continue;
>>  			if (fs_devices->opened) {
>>  				/* for an already deleted device return 0 */
>>
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] btrfs: harden identification of the stale device
  2021-12-13 15:14     ` Nikolay Borisov
@ 2021-12-14 14:27       ` Anand Jain
  0 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2021-12-14 14:27 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs; +Cc: josef



On 13/12/2021 23:14, Nikolay Borisov wrote:
> 
> 
> On 13.12.21 г. 17:04, Nikolay Borisov wrote:
>>
>>
>> On 10.12.21 г. 20:15, Anand Jain wrote:
>>> Identifying and removing the stale device from the fs_uuids list is done
>>> by the function btrfs_free_stale_devices().
>>> btrfs_free_stale_devices() in turn depends on the function
>>> device_path_matched() to check if the device repeats in more than one
>>> btrfs_device structure.
>>>
>>> The matching of the device happens by its path, the device path. However,
>>> when dm mapper is in use, the dm device paths are nothing but a link to
>>> the actual block device, which leads to the device_path_matched() failing
>>> to match.
>>>
>>> Fix this by matching the dev_t as provided by lookup_bdev() instead of
>>> plain strcmp() the device paths.
>>>
>>> Reported-by: Josef Bacik <josef@toxicpanda.com>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>> ---
>>>
>>> v2: Fix
>>>       sparse: warning: incorrect type in argument 1 (different address spaces)
>>>       For using device->name->str
>>>
>>>      Fix Josef suggestion to pass dev_t instead of device-path in the
>>>       patch 2/2.
>>>
>>>   fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>>>   1 file changed, 36 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 1b02c03a882c..559fdb0c4a0e 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>>>   	return ret;
>>>   }
>>>   
>>> -static bool device_path_matched(const char *path, struct btrfs_device *device)
>>> +/*
>>> + * Check if the device in the 'path' matches with the device in the given
>>> + * struct btrfs_device '*device'.
>>> + * Returns:
>>> + *	0	If it is the same device.
>>> + *	1	If it is not the same device.
>>> + *	-errno	For error.
>>
>> This convention is somewhat confusing. This function returns a boolean
>> meaniing if a device matched or not, yet the retval follows strcmp
>> convention of return values. That is make 1 mean "device matched" and
>> "0" mean device not matched. Because ultimately that's what we care for.
>>
>> Furthermore you give it the ability to return an error which not
>> consumed at all. Simply make the function boolean and return false if an
>> error is encountered by some of the internal calls.
>>
>>> + */
>>> +static int device_matched(struct btrfs_device *device, const char *path)
>>>   {
>>> -	int found;
>>> +	char *device_name;
>>> +	dev_t dev_old;
>>> +	dev_t dev_new;
>>> +	int ret;
>>> +
>>> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
>>> +	if (!device_name)
>>> +		return -ENOMEM;
>>>   
>>>   	rcu_read_lock();
>>> -	found = strcmp(rcu_str_deref(device->name), path);
>>> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
>>>   	rcu_read_unlock();
>>> +	if (!ret) {
>>> +		kfree(device_name);
>>> +		return -EINVAL;
>>> +	}
>>>   
>>> -	return found == 0;
>>> +	ret = lookup_bdev(device_name, &dev_old);
>>
>> Instead of allocating memory for storing device->name and freeing it,
>> AFAICS lookup_bdev can be called under rcu read section so you can
>> simply call lookup_bdev under rcu_read_lock which simplifies memory
>> management.
> 
> lookup_bdev calls kern_path->filejame_lookup which does an initial try
> to lookup the name via an RCU but if it gets a ESTALE/ECHILD it will
> fallback to a full path resolution and that *might* sleep so actually
> doing the dynamic memory allocation is necessary... Bummer.
> 

Yep.
Also, the device_matched() function might go away in the long run, as I 
found it is a good idea to keep the dev_t in the struct btrfs_device 
when we open it.

Thanks, Anand

>>
>>
>> In the end this function really boils down to making 2 calls to
>> lookup_bdev and comparing their values for equality, no need for
>> anything more fancy than that.
>>
>>
>>> +	kfree(device_name);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	ret = lookup_bdev(path, &dev_new);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	if (dev_old == dev_new)
>>> +		return 0;
>>> +
>>> +	return 1;
>>>   }
>>>   
>>>   /*
>>> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
>>
>> What's more lookinng at the body of free_stale_device I find the name of
>> the function somewhat confusing. What it does is really delete all
>> devices from all fs_devices which match a particular criterion for a
>> device path i.e the function's body doesn't deal with the concept of
>> "stale" at all. As such I think it should be renamed and given a more
>> generic name like btrfs_free_specific_device or something along those
>> lines.
>>
>>>   				continue;
>>>   			if (path && !device->name)
>>>   				continue;
>>> -			if (path && !device_path_matched(path, device))
>>> +			if (path && device_matched(device, path) != 0)
>>>   				continue;
>>>   			if (fs_devices->opened) {
>>>   				/* for an already deleted device return 0 */
>>>
>>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/2] btrfs: match device by dev_t
  2021-12-10 18:15 [PATCH v2 0/2] btrfs: match device by dev_t Anand Jain
  2021-12-10 18:15 ` [PATCH v2 1/2] btrfs: harden identification of the stale device Anand Jain
  2021-12-10 18:15 ` [PATCH v2 2/2] btrfs: redeclare btrfs_stale_devices arg1 to dev_t Anand Jain
@ 2022-01-04  8:15 ` Anand Jain
  2 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2022-01-04  8:15 UTC (permalink / raw)
  To: David Sterba; +Cc: josef, linux-btrfs


Gentle ping?

Thanks, Anand


On 11/12/2021 02:15, Anand Jain wrote:
> Patch 1 is the actual bug fix and should go to stable 5.4 as well.
> On 5.4 patch1 conflicts (outside of the changes in the patch),
> so not yet marked for the stable.
> 
> Patch 2 simplifies calling lookup_bdev() in the device_matched()
> by moving the same to the parent function two levels up.
> 
> Patch 2 is not merged with 1 because to keep the patch 1 changes local
> to a function so that it can be easily backported to 5.4 and 5.10.
> 
> We should save the dev_t in struct btrfs_device with that may be
> we could clean up a few more things, including fixing the below sparse
> warning.
> 
>    sparse: sparse: incorrect type in argument 1 (different address spaces)
> 
> For using without rcu:
> 
>    error = lookup_bdev(device->name->str, &dev_old);
> 
> Anand Jain (2):
>    btrfs: harden identification of the stale device
>    btrfs: redeclare btrfs_stale_devices arg1 to dev_t
> 
>   fs/btrfs/super.c   |  8 +++++-
>   fs/btrfs/volumes.c | 72 +++++++++++++++++++++++++++++++++-------------
>   fs/btrfs/volumes.h |  2 +-
>   3 files changed, 60 insertions(+), 22 deletions(-)
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] btrfs: harden identification of the stale device
  2021-12-10 18:15 ` [PATCH v2 1/2] btrfs: harden identification of the stale device Anand Jain
  2021-12-13 15:04   ` Nikolay Borisov
@ 2022-01-04 18:56   ` David Sterba
  2022-01-05 11:31     ` Anand Jain
  1 sibling, 1 reply; 11+ messages in thread
From: David Sterba @ 2022-01-04 18:56 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, josef

On Sat, Dec 11, 2021 at 02:15:29AM +0800, Anand Jain wrote:
> Identifying and removing the stale device from the fs_uuids list is done
> by the function btrfs_free_stale_devices().
> btrfs_free_stale_devices() in turn depends on the function
> device_path_matched() to check if the device repeats in more than one
> btrfs_device structure.
> 
> The matching of the device happens by its path, the device path. However,
> when dm mapper is in use, the dm device paths are nothing but a link to
> the actual block device, which leads to the device_path_matched() failing
> to match.
> 
> Fix this by matching the dev_t as provided by lookup_bdev() instead of
> plain strcmp() the device paths.
> 
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> 
> v2: Fix 
>      sparse: warning: incorrect type in argument 1 (different address spaces)
>      For using device->name->str
> 
>     Fix Josef suggestion to pass dev_t instead of device-path in the
>      patch 2/2.
> 
>  fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 36 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 1b02c03a882c..559fdb0c4a0e 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>  	return ret;
>  }
>  
> -static bool device_path_matched(const char *path, struct btrfs_device *device)
> +/*
> + * Check if the device in the 'path' matches with the device in the given
> + * struct btrfs_device '*device'.
> + * Returns:
> + *	0	If it is the same device.
> + *	1	If it is not the same device.
> + *	-errno	For error.
> + */
> +static int device_matched(struct btrfs_device *device, const char *path)
>  {
> -	int found;
> +	char *device_name;
> +	dev_t dev_old;
> +	dev_t dev_new;
> +	int ret;
> +
> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
> +	if (!device_name)
> +		return -ENOMEM;
>  
>  	rcu_read_lock();
> -	found = strcmp(rcu_str_deref(device->name), path);
> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));

I wonder if the temporary allocation could be avoided, as it's the
exactly same path of the device, so it could be passed to lookup_bdev.

>  	rcu_read_unlock();
> +	if (!ret) {
> +		kfree(device_name);
> +		return -EINVAL;
> +	}
>  
> -	return found == 0;
> +	ret = lookup_bdev(device_name, &dev_old);
> +	kfree(device_name);
> +	if (ret)
> +		return ret;
> +
> +	ret = lookup_bdev(path, &dev_new);
> +	if (ret)
> +		return ret;
> +
> +	if (dev_old == dev_new)
> +		return 0;
> +
> +	return 1;
>  }
>  
>  /*
> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
>  				continue;
>  			if (path && !device->name)
>  				continue;
> -			if (path && !device_path_matched(path, device))
> +			if (path && device_matched(device, path) != 0)

You've changed the fuction to return errors but it's not checked,
besides the memory allocation failure, the lookup functions could fail
for various reasons so I don't think it's safe to ignore the errors.

>  				continue;
>  			if (fs_devices->opened) {
>  				/* for an already deleted device return 0 */
> -- 
> 2.33.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] btrfs: harden identification of the stale device
  2022-01-04 18:56   ` David Sterba
@ 2022-01-05 11:31     ` Anand Jain
  2022-01-06  6:05       ` Su Yue
  2022-01-07 14:48       ` David Sterba
  0 siblings, 2 replies; 11+ messages in thread
From: Anand Jain @ 2022-01-05 11:31 UTC (permalink / raw)
  To: dsterba, linux-btrfs, josef



On 05/01/2022 02:56, David Sterba wrote:
> On Sat, Dec 11, 2021 at 02:15:29AM +0800, Anand Jain wrote:
>> Identifying and removing the stale device from the fs_uuids list is done
>> by the function btrfs_free_stale_devices().
>> btrfs_free_stale_devices() in turn depends on the function
>> device_path_matched() to check if the device repeats in more than one
>> btrfs_device structure.
>>
>> The matching of the device happens by its path, the device path. However,
>> when dm mapper is in use, the dm device paths are nothing but a link to
>> the actual block device, which leads to the device_path_matched() failing
>> to match.
>>
>> Fix this by matching the dev_t as provided by lookup_bdev() instead of
>> plain strcmp() the device paths.
>>
>> Reported-by: Josef Bacik <josef@toxicpanda.com>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>
>> v2: Fix
>>       sparse: warning: incorrect type in argument 1 (different address spaces)
>>       For using device->name->str
>>
>>      Fix Josef suggestion to pass dev_t instead of device-path in the
>>       patch 2/2.
>>
>>   fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>>   1 file changed, 36 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 1b02c03a882c..559fdb0c4a0e 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>>   	return ret;
>>   }
>>   
>> -static bool device_path_matched(const char *path, struct btrfs_device *device)
>> +/*
>> + * Check if the device in the 'path' matches with the device in the given
>> + * struct btrfs_device '*device'.
>> + * Returns:
>> + *	0	If it is the same device.
>> + *	1	If it is not the same device.
>> + *	-errno	For error.
>> + */
>> +static int device_matched(struct btrfs_device *device, const char *path)
>>   {
>> -	int found;
>> +	char *device_name;
>> +	dev_t dev_old;
>> +	dev_t dev_new;
>> +	int ret;
>> +
>> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
>> +	if (!device_name)
>> +		return -ENOMEM;
>>   
>>   	rcu_read_lock();
>> -	found = strcmp(rcu_str_deref(device->name), path);
>> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
> 
> I wonder if the temporary allocation could be avoided, as it's the
> exactly same path of the device, so it could be passed to lookup_bdev.

  Yeah, I tried but to no avail. Unless I drop the rcu read lock and
  use the str directly as below.

  lookup_bdev(device->name->str, &dev_old);

  We do skip rcu access for device->name at a few locations.

  Also, pls note lookup_bdev() can't be called within rcu_read_lock(),
  (calling sleep function warning).


>>   	rcu_read_unlock();
>> +	if (!ret) {
>> +		kfree(device_name);
>> +		return -EINVAL;
>> +	}
>>   
>> -	return found == 0;
>> +	ret = lookup_bdev(device_name, &dev_old);
>> +	kfree(device_name);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = lookup_bdev(path, &dev_new);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (dev_old == dev_new)
>> +		return 0;
>> +
>> +	return 1;
>>   }
>>   
>>   /*
>> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
>>   				continue;
>>   			if (path && !device->name)
>>   				continue;
>> -			if (path && !device_path_matched(path, device))
>> +			if (path && device_matched(device, path) != 0)
> 
> You've changed the fuction to return errors but it's not checked,
> besides the memory allocation failure, the lookup functions could fail
> for various reasons so I don't think it's safe to ignore the errors.

IMO there isn't much that the parent function should do even if the
device_matched() returns an error for the reasons you stated.

Mainly because btrfs_free_stale_devices() OR btrfs_forget_devices()
is used as a cleanup ops in the primary task functions such as
btrfs_scan_one_device() and btrfs_init_new_device(). Even if we don't
remove the stale there is no harm.

Further btrfs_forget_devices() is called from btrfs_control_ioctl()
which is a userland call for forget devices. So as we traverse the
device list, if a device lookup fails IMO, it is ok to skip to the next
device in the list and return the status of the device match.

Even more, IMO we should save the dev_t in the struct btrfs_device,
upon which the device_matched() will go away altogether. This change
is outside of the bug that we intended to fix here. I will clean that
up separately.

Thanks, Anand

>>   				continue;
>>   			if (fs_devices->opened) {
>>   				/* for an already deleted device return 0 */
>> -- 
>> 2.33.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] btrfs: harden identification of the stale device
  2022-01-05 11:31     ` Anand Jain
@ 2022-01-06  6:05       ` Su Yue
  2022-01-07 14:48       ` David Sterba
  1 sibling, 0 replies; 11+ messages in thread
From: Su Yue @ 2022-01-06  6:05 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs, josef


On Wed 05 Jan 2022 at 19:31, Anand Jain <anand.jain@oracle.com> 
wrote:

> On 05/01/2022 02:56, David Sterba wrote:
>> On Sat, Dec 11, 2021 at 02:15:29AM +0800, Anand Jain wrote:
>>> Identifying and removing the stale device from the fs_uuids 
>>> list is done
>>> by the function btrfs_free_stale_devices().
>>> btrfs_free_stale_devices() in turn depends on the function
>>> device_path_matched() to check if the device repeats in more 
>>> than one
>>> btrfs_device structure.
>>>
>>> The matching of the device happens by its path, the device 
>>> path. However,
>>> when dm mapper is in use, the dm device paths are nothing but 
>>> a link to
>>> the actual block device, which leads to the 
>>> device_path_matched() failing
>>> to match.
>>>
>>> Fix this by matching the dev_t as provided by lookup_bdev() 
>>> instead of
>>> plain strcmp() the device paths.
>>>
>>> Reported-by: Josef Bacik <josef@toxicpanda.com>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>> ---
>>>
>>> v2: Fix
>>>       sparse: warning: incorrect type in argument 1 (different 
>>>       address spaces)
>>>       For using device->name->str
>>>
>>>      Fix Josef suggestion to pass dev_t instead of device-path 
>>>      in the
>>>       patch 2/2.
>>>
>>>   fs/btrfs/volumes.c | 41 
>>>   ++++++++++++++++++++++++++++++++++++-----
>>>   1 file changed, 36 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 1b02c03a882c..559fdb0c4a0e 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char 
>>> *device_path, fmode_t flags, void *holder,
>>>   	return ret;
>>>   }
>>>   -static bool device_path_matched(const char *path, struct 
>>>   btrfs_device
>>> *device)
>>> +/*
>>> + * Check if the device in the 'path' matches with the device 
>>> in the given
>>> + * struct btrfs_device '*device'.
>>> + * Returns:
>>> + *	0	If it is the same device.
>>> + *	1	If it is not the same device.
>>> + *	-errno	For error.
>>> + */
>>> +static int device_matched(struct btrfs_device *device, const 
>>> char *path)
>>>   {
>>> -	int found;
>>> +	char *device_name;
>>> +	dev_t dev_old;
>>> +	dev_t dev_new;
>>> +	int ret;
>>> +
>>> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
>>> +	if (!device_name)
>>> +		return -ENOMEM;
>>>     	rcu_read_lock();
>>> -	found = strcmp(rcu_str_deref(device->name), path);
>>> +	ret = sprintf(device_name, "%s", 
>>> rcu_str_deref(device->name));
>> I wonder if the temporary allocation could be avoided, as it's 
>> the
>> exactly same path of the device, so it could be passed to 
>> lookup_bdev.
>
>  Yeah, I tried but to no avail. Unless I drop the rcu read lock 
>  and
>  use the str directly as below.
>
>  lookup_bdev(device->name->str, &dev_old);
>
>  We do skip rcu access for device->name at a few locations.
>
>  Also, pls note lookup_bdev() can't be called within 
>  rcu_read_lock(),
>  (calling sleep function warning).
>

Got it. And another evil and dirty solution is that open code the 
logic in
the only user btrfs_free_stale_devices(). Do the memory allocation 
and
lookup_bdev(path) before the big big loop so we won't be disturbed
error handling and save some times of lookup_bdev ;)

--
Su
>
>>>   	rcu_read_unlock();
>>> +	if (!ret) {
>>> +		kfree(device_name);
>>> +		return -EINVAL;
>>> +	}
>>>   -	return found == 0;
>>> +	ret = lookup_bdev(device_name, &dev_old);
>>> +	kfree(device_name);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	ret = lookup_bdev(path, &dev_new);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	if (dev_old == dev_new)
>>> +		return 0;
>>> +
>>> +	return 1;
>>>   }
>>>     /*
>>> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const 
>>> char *path,
>>>   				continue;
>>>   			if (path && !device->name)
>>>   				continue;
>>> -			if (path && !device_path_matched(path, 
>>> device))
>>> +			if (path && device_matched(device, path) 
>>> != 0)
>> You've changed the fuction to return errors but it's not 
>> checked,
>> besides the memory allocation failure, the lookup functions 
>> could fail
>> for various reasons so I don't think it's safe to ignore the 
>> errors.
>
> IMO there isn't much that the parent function should do even if 
> the
> device_matched() returns an error for the reasons you stated.
>
> Mainly because btrfs_free_stale_devices() OR 
> btrfs_forget_devices()
> is used as a cleanup ops in the primary task functions such as
> btrfs_scan_one_device() and btrfs_init_new_device(). Even if we 
> don't
> remove the stale there is no harm.
>
> Further btrfs_forget_devices() is called from 
> btrfs_control_ioctl()
> which is a userland call for forget devices. So as we traverse 
> the
> device list, if a device lookup fails IMO, it is ok to skip to 
> the next
> device in the list and return the status of the device match.
>
> Even more, IMO we should save the dev_t in the struct 
> btrfs_device,
> upon which the device_matched() will go away altogether. This 
> change
> is outside of the bug that we intended to fix here. I will clean 
> that
> up separately.
>
> Thanks, Anand
>
>>>   				continue;
>>>   			if (fs_devices->opened) {
>>>   				/* for an already deleted device 
>>>   return 0 */
>>> -- 2.33.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] btrfs: harden identification of the stale device
  2022-01-05 11:31     ` Anand Jain
  2022-01-06  6:05       ` Su Yue
@ 2022-01-07 14:48       ` David Sterba
  1 sibling, 0 replies; 11+ messages in thread
From: David Sterba @ 2022-01-07 14:48 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs, josef

On Wed, Jan 05, 2022 at 07:31:31PM +0800, Anand Jain wrote:
> 
> 
> On 05/01/2022 02:56, David Sterba wrote:
> > On Sat, Dec 11, 2021 at 02:15:29AM +0800, Anand Jain wrote:
> >> Identifying and removing the stale device from the fs_uuids list is done
> >> by the function btrfs_free_stale_devices().
> >> btrfs_free_stale_devices() in turn depends on the function
> >> device_path_matched() to check if the device repeats in more than one
> >> btrfs_device structure.
> >>
> >> The matching of the device happens by its path, the device path. However,
> >> when dm mapper is in use, the dm device paths are nothing but a link to
> >> the actual block device, which leads to the device_path_matched() failing
> >> to match.
> >>
> >> Fix this by matching the dev_t as provided by lookup_bdev() instead of
> >> plain strcmp() the device paths.
> >>
> >> Reported-by: Josef Bacik <josef@toxicpanda.com>
> >> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> >> ---
> >>
> >> v2: Fix
> >>       sparse: warning: incorrect type in argument 1 (different address spaces)
> >>       For using device->name->str
> >>
> >>      Fix Josef suggestion to pass dev_t instead of device-path in the
> >>       patch 2/2.
> >>
> >>   fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
> >>   1 file changed, 36 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >> index 1b02c03a882c..559fdb0c4a0e 100644
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
> >>   	return ret;
> >>   }
> >>   
> >> -static bool device_path_matched(const char *path, struct btrfs_device *device)
> >> +/*
> >> + * Check if the device in the 'path' matches with the device in the given
> >> + * struct btrfs_device '*device'.
> >> + * Returns:
> >> + *	0	If it is the same device.
> >> + *	1	If it is not the same device.
> >> + *	-errno	For error.
> >> + */
> >> +static int device_matched(struct btrfs_device *device, const char *path)
> >>   {
> >> -	int found;
> >> +	char *device_name;
> >> +	dev_t dev_old;
> >> +	dev_t dev_new;
> >> +	int ret;
> >> +
> >> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
> >> +	if (!device_name)
> >> +		return -ENOMEM;
> >>   
> >>   	rcu_read_lock();
> >> -	found = strcmp(rcu_str_deref(device->name), path);
> >> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
> > 
> > I wonder if the temporary allocation could be avoided, as it's the
> > exactly same path of the device, so it could be passed to lookup_bdev.
> 
>   Yeah, I tried but to no avail. Unless I drop the rcu read lock and
>   use the str directly as below.
> 
>   lookup_bdev(device->name->str, &dev_old);
> 
>   We do skip rcu access for device->name at a few locations.
> 
>   Also, pls note lookup_bdev() can't be called within rcu_read_lock(),
>   (calling sleep function warning).

I see, thank for checking it.

> 
> 
> >>   	rcu_read_unlock();
> >> +	if (!ret) {
> >> +		kfree(device_name);
> >> +		return -EINVAL;
> >> +	}
> >>   
> >> -	return found == 0;
> >> +	ret = lookup_bdev(device_name, &dev_old);
> >> +	kfree(device_name);
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	ret = lookup_bdev(path, &dev_new);
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	if (dev_old == dev_new)
> >> +		return 0;
> >> +
> >> +	return 1;
> >>   }
> >>   
> >>   /*
> >> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
> >>   				continue;
> >>   			if (path && !device->name)
> >>   				continue;
> >> -			if (path && !device_path_matched(path, device))
> >> +			if (path && device_matched(device, path) != 0)
> > 
> > You've changed the fuction to return errors but it's not checked,
> > besides the memory allocation failure, the lookup functions could fail
> > for various reasons so I don't think it's safe to ignore the errors.
> 
> IMO there isn't much that the parent function should do even if the
> device_matched() returns an error for the reasons you stated.
> 
> Mainly because btrfs_free_stale_devices() OR btrfs_forget_devices()
> is used as a cleanup ops in the primary task functions such as
> btrfs_scan_one_device() and btrfs_init_new_device(). Even if we don't
> remove the stale there is no harm.

Right, so a comment explaining why it's ok to ignore errors should be
sufficient.

> Further btrfs_forget_devices() is called from btrfs_control_ioctl()
> which is a userland call for forget devices. So as we traverse the
> device list, if a device lookup fails IMO, it is ok to skip to the next
> device in the list and return the status of the device match.
> 
> Even more, IMO we should save the dev_t in the struct btrfs_device,
> upon which the device_matched() will go away altogether. This change
> is outside of the bug that we intended to fix here. I will clean that
> up separately.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-01-07 14:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-10 18:15 [PATCH v2 0/2] btrfs: match device by dev_t Anand Jain
2021-12-10 18:15 ` [PATCH v2 1/2] btrfs: harden identification of the stale device Anand Jain
2021-12-13 15:04   ` Nikolay Borisov
2021-12-13 15:14     ` Nikolay Borisov
2021-12-14 14:27       ` Anand Jain
2022-01-04 18:56   ` David Sterba
2022-01-05 11:31     ` Anand Jain
2022-01-06  6:05       ` Su Yue
2022-01-07 14:48       ` David Sterba
2021-12-10 18:15 ` [PATCH v2 2/2] btrfs: redeclare btrfs_stale_devices arg1 to dev_t Anand Jain
2022-01-04  8:15 ` [PATCH v2 0/2] btrfs: match device by dev_t Anand Jain

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.