* [PATCH 1/4] btrfs: drop useless goto in open_fs_devices
2019-10-04 7:49 [PATCH 0/4] btrfs: fix issues due to alien device Anand Jain
@ 2019-10-04 7:50 ` Anand Jain
2019-10-04 8:12 ` Nikolay Borisov
2019-10-04 7:50 ` [PATCH 2/4] btrfs: delete identified alien device " Anand Jain
` (3 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2019-10-04 7:50 UTC (permalink / raw)
To: linux-btrfs
There is no need of goto out in open_fs_devices() as there is nothing
special done at %out:. So refactor it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
fs/btrfs/volumes.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e176346afed1..06ec3577c6b4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1330,7 +1330,6 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
{
struct btrfs_device *device;
struct btrfs_device *latest_dev = NULL;
- int ret = 0;
flags |= FMODE_EXCL;
@@ -1343,15 +1342,14 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
device->generation > latest_dev->generation)
latest_dev = device;
}
- if (fs_devices->open_devices == 0) {
- ret = -EINVAL;
- goto out;
- }
+ if (fs_devices->open_devices == 0)
+ return -EINVAL;
+
fs_devices->opened = 1;
fs_devices->latest_bdev = latest_dev->bdev;
fs_devices->total_rw_bytes = 0;
-out:
- return ret;
+
+ return 0;
}
static int devid_cmp(void *priv, struct list_head *a, struct list_head *b)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] btrfs: drop useless goto in open_fs_devices
2019-10-04 7:50 ` [PATCH 1/4] btrfs: drop useless goto in open_fs_devices Anand Jain
@ 2019-10-04 8:12 ` Nikolay Borisov
0 siblings, 0 replies; 15+ messages in thread
From: Nikolay Borisov @ 2019-10-04 8:12 UTC (permalink / raw)
To: Anand Jain, linux-btrfs
On 4.10.19 г. 10:50 ч., Anand Jain wrote:
> There is no need of goto out in open_fs_devices() as there is nothing
> special done at %out:. So refactor it.
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
> fs/btrfs/volumes.c | 12 +++++-------
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index e176346afed1..06ec3577c6b4 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1330,7 +1330,6 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
> {
> struct btrfs_device *device;
> struct btrfs_device *latest_dev = NULL;
> - int ret = 0;
>
> flags |= FMODE_EXCL;
>
> @@ -1343,15 +1342,14 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
> device->generation > latest_dev->generation)
> latest_dev = device;
> }
> - if (fs_devices->open_devices == 0) {
> - ret = -EINVAL;
> - goto out;
> - }
> + if (fs_devices->open_devices == 0)
> + return -EINVAL;
> +
> fs_devices->opened = 1;
> fs_devices->latest_bdev = latest_dev->bdev;
> fs_devices->total_rw_bytes = 0;
> -out:
> - return ret;
> +
> + return 0;
> }
>
> static int devid_cmp(void *priv, struct list_head *a, struct list_head *b)
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/4] btrfs: delete identified alien device in open_fs_devices
2019-10-04 7:49 [PATCH 0/4] btrfs: fix issues due to alien device Anand Jain
2019-10-04 7:50 ` [PATCH 1/4] btrfs: drop useless goto in open_fs_devices Anand Jain
@ 2019-10-04 7:50 ` Anand Jain
2019-10-04 8:18 ` Nikolay Borisov
2019-10-04 7:50 ` [PATCH 3/4] btrfs: include non-missing as a qualifier for the latest_bdev Anand Jain
` (2 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2019-10-04 7:50 UTC (permalink / raw)
To: linux-btrfs
In open_fs_devices() we identify alien device but we don't reset its
the device::name. So progs device list does not show the device missing
as shown in the script below.
mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs
mkfs.btrfs -fq -draid1 -mraid1 /dev/sdc /dev/sdb
sleep 3 # avoid racing with udev's useless scans if needed
btrfs dev add -f /dev/sdb /btrfs
mount -o degraded /dev/sdc /btrfs1
No missing device:
btrfs fi show -m /btrfs1
Label: none uuid: 3eb7cd50-4594-458f-9d68-c243cc49954d
Total devices 2 FS bytes used 128.00KiB
devid 1 size 12.00GiB used 1.26GiB path /dev/sdc
devid 2 size 12.00GiB used 1.26GiB path /dev/sdb
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
PS: Fundamentally its wrong approach that btrfs-progs deduces the device
missing state in the userland instead of obtaining it from the kernel.
I objected on the patch, but still those patches got merged, this bug is
one of its side effects. Ironically I wrote patches to read device_state
from the kernel using ioctl, procfs and sysfs but didn't get the due
attention till a merger.
fs/btrfs/volumes.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 06ec3577c6b4..05ade8c7342b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -803,10 +803,10 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
disk_super = (struct btrfs_super_block *)bh->b_data;
devid = btrfs_stack_device_id(&disk_super->dev_item);
if (devid != device->devid)
- goto error_brelse;
+ goto free_alien;
if (memcmp(device->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE))
- goto error_brelse;
+ goto free_alien;
device->generation = btrfs_super_generation(disk_super);
@@ -845,6 +845,11 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
return 0;
+free_alien:
+ fs_devices->num_devices--;
+ list_del(&device->dev_list);
+ btrfs_free_device(device);
+
error_brelse:
brelse(bh);
blkdev_put(bdev, flags);
@@ -1329,11 +1334,13 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
fmode_t flags, void *holder)
{
struct btrfs_device *device;
+ struct btrfs_device *tmp_device;
struct btrfs_device *latest_dev = NULL;
flags |= FMODE_EXCL;
- list_for_each_entry(device, &fs_devices->devices, dev_list) {
+ list_for_each_entry_safe(device, tmp_device, &fs_devices->devices,
+ dev_list) {
/* Just open everything we can; ignore failures here */
if (btrfs_open_one_device(fs_devices, device, flags, holder))
continue;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] btrfs: delete identified alien device in open_fs_devices
2019-10-04 7:50 ` [PATCH 2/4] btrfs: delete identified alien device " Anand Jain
@ 2019-10-04 8:18 ` Nikolay Borisov
2019-10-06 2:47 ` [PATCH v2 " Anand Jain
2019-10-06 2:51 ` [PATCH " Anand Jain
0 siblings, 2 replies; 15+ messages in thread
From: Nikolay Borisov @ 2019-10-04 8:18 UTC (permalink / raw)
To: Anand Jain, linux-btrfs
On 4.10.19 г. 10:50 ч., Anand Jain wrote:
> In open_fs_devices() we identify alien device but we don't reset its
> the device::name. So progs device list does not show the device missing
> as shown in the script below.
>
> mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs
> mkfs.btrfs -fq -draid1 -mraid1 /dev/sdc /dev/sdb
> sleep 3 # avoid racing with udev's useless scans if needed
> btrfs dev add -f /dev/sdb /btrfs
> mount -o degraded /dev/sdc /btrfs1
>
> No missing device:
> btrfs fi show -m /btrfs1
> Label: none uuid: 3eb7cd50-4594-458f-9d68-c243cc49954d
> Total devices 2 FS bytes used 128.00KiB
> devid 1 size 12.00GiB used 1.26GiB path /dev/sdc
> devid 2 size 12.00GiB used 1.26GiB path /dev/sdb
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> PS: Fundamentally its wrong approach that btrfs-progs deduces the device
> missing state in the userland instead of obtaining it from the kernel.
> I objected on the patch, but still those patches got merged, this bug is
> one of its side effects. Ironically I wrote patches to read device_state
> from the kernel using ioctl, procfs and sysfs but didn't get the due
> attention till a merger.
>
> fs/btrfs/volumes.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 06ec3577c6b4..05ade8c7342b 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -803,10 +803,10 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
> disk_super = (struct btrfs_super_block *)bh->b_data;
> devid = btrfs_stack_device_id(&disk_super->dev_item);
> if (devid != device->devid)
> - goto error_brelse;
> + goto free_alien;
>
> if (memcmp(device->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE))
> - goto error_brelse;
> + goto free_alien;
>
Imo a better approach is to return a particular error code and do the
deletion in open_fs_devices. Otherwise it's not apparent why you use
list_for_each_entry_safe in one function to delete something in a
different one (whose name by the way doesn't suggest a deletion is going
on). Looking at the error I think enodev/enxio is appropriate.
> device->generation = btrfs_super_generation(disk_super);
>
> @@ -845,6 +845,11 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>
> return 0;
>
> +free_alien:
> + fs_devices->num_devices--;
> + list_del(&device->dev_list);
> + btrfs_free_device(device);
> +
> error_brelse:
> brelse(bh);
> blkdev_put(bdev, flags);
> @@ -1329,11 +1334,13 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
> fmode_t flags, void *holder)
> {
> struct btrfs_device *device;
> + struct btrfs_device *tmp_device;
> struct btrfs_device *latest_dev = NULL;
>
> flags |= FMODE_EXCL;
>
> - list_for_each_entry(device, &fs_devices->devices, dev_list) {
> + list_for_each_entry_safe(device, tmp_device, &fs_devices->devices,
> + dev_list) {
> /* Just open everything we can; ignore failures here */
> if (btrfs_open_one_device(fs_devices, device, flags, holder))
> continue;
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 2/4] btrfs: delete identified alien device in open_fs_devices
2019-10-04 8:18 ` Nikolay Borisov
@ 2019-10-06 2:47 ` Anand Jain
2019-10-06 11:44 ` Nikolay Borisov
2019-10-06 2:51 ` [PATCH " Anand Jain
1 sibling, 1 reply; 15+ messages in thread
From: Anand Jain @ 2019-10-06 2:47 UTC (permalink / raw)
To: linux-btrfs; +Cc: nborisov
In open_fs_devices() we identify alien device but we don't reset its
the device::name. So progs device list does not show the device missing
as shown in the script below.
mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs
mkfs.btrfs -fq -draid1 -mraid1 /dev/sdc /dev/sdb
sleep 3 # avoid racing with udev's useless scans if needed
btrfs dev add -f /dev/sdb /btrfs
mount -o degraded /dev/sdc /btrfs1
No missing device:
btrfs fi show -m /btrfs1
Label: none uuid: 3eb7cd50-4594-458f-9d68-c243cc49954d
Total devices 2 FS bytes used 128.00KiB
devid 1 size 12.00GiB used 1.26GiB path /dev/sdc
devid 2 size 12.00GiB used 1.26GiB path /dev/sdb
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v2: Move free alien part to its parent function btrfs_open_one_device.
Thanks Nikolay.
PS: Fundamentally its wrong approach that btrfs-progs deduces the device
missing state in the userland instead of obtaining it from the kernel.
I objected on the patch, but still those patches got merged, this bug is
one of its side effects. Ironically I wrote patches to read device_state
from the kernel using ioctl, procfs and sysfs but didn't get the due
attention till a merger.
fs/btrfs/volumes.c | 32 +++++++++++++++++++++++---------
1 file changed, 23 insertions(+), 9 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c223a8147bfd..21aaf64c59b2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -591,13 +591,18 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
if (ret)
return ret;
+ ret = -EINVAL;
disk_super = (struct btrfs_super_block *)bh->b_data;
devid = btrfs_stack_device_id(&disk_super->dev_item);
- if (devid != device->devid)
+ if (devid != device->devid) {
+ ret = -EUCLEAN;
goto error_brelse;
+ }
- if (memcmp(device->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE))
+ if (memcmp(device->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE)) {
+ ret = -EUCLEAN;
goto error_brelse;
+ }
device->generation = btrfs_super_generation(disk_super);
@@ -640,7 +645,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
brelse(bh);
blkdev_put(bdev, flags);
- return -EINVAL;
+ return ret;
}
/*
@@ -1121,19 +1126,28 @@ int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
fmode_t flags, void *holder)
{
+ int ret;
struct btrfs_device *device;
+ struct btrfs_device *tmp_device;
struct btrfs_device *latest_dev = NULL;
flags |= FMODE_EXCL;
- list_for_each_entry(device, &fs_devices->devices, dev_list) {
+ list_for_each_entry_safe(device, tmp_device, &fs_devices->devices,
+ dev_list) {
/* Just open everything we can; ignore failures here */
- if (btrfs_open_one_device(fs_devices, device, flags, holder))
- continue;
-
- if (!latest_dev ||
- device->generation > latest_dev->generation)
+ ret = btrfs_open_one_device(fs_devices, device, flags, holder);
+ if (ret == 0 && (!latest_dev ||
+ device->generation > latest_dev->generation)) {
latest_dev = device;
+ continue;
+ }
+ if (ret == -EUCLEAN) {
+ /* An alien device. Clean it up */
+ fs_devices->num_devices--;
+ list_del(&device->dev_list);
+ btrfs_free_device(device);
+ }
}
if (fs_devices->open_devices == 0)
return -EINVAL;
--
2.23.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/4] btrfs: delete identified alien device in open_fs_devices
2019-10-06 2:47 ` [PATCH v2 " Anand Jain
@ 2019-10-06 11:44 ` Nikolay Borisov
0 siblings, 0 replies; 15+ messages in thread
From: Nikolay Borisov @ 2019-10-06 11:44 UTC (permalink / raw)
To: Anand Jain, linux-btrfs
On 6.10.19 г. 5:47 ч., Anand Jain wrote:
> In open_fs_devices() we identify alien device but we don't reset its
> the device::name. So progs device list does not show the device missing
> as shown in the script below.
>
> mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs
> mkfs.btrfs -fq -draid1 -mraid1 /dev/sdc /dev/sdb
> sleep 3 # avoid racing with udev's useless scans if needed
> btrfs dev add -f /dev/sdb /btrfs
> mount -o degraded /dev/sdc /btrfs1
>
> No missing device:
> btrfs fi show -m /btrfs1
> Label: none uuid: 3eb7cd50-4594-458f-9d68-c243cc49954d
> Total devices 2 FS bytes used 128.00KiB
> devid 1 size 12.00GiB used 1.26GiB path /dev/sdc
> devid 2 size 12.00GiB used 1.26GiB path /dev/sdb
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> v2: Move free alien part to its parent function btrfs_open_one_device.
> Thanks Nikolay.
>
> PS: Fundamentally its wrong approach that btrfs-progs deduces the device
> missing state in the userland instead of obtaining it from the kernel.
> I objected on the patch, but still those patches got merged, this bug is
> one of its side effects. Ironically I wrote patches to read device_state
> from the kernel using ioctl, procfs and sysfs but didn't get the due
> attention till a merger.
>
> fs/btrfs/volumes.c | 32 +++++++++++++++++++++++---------
> 1 file changed, 23 insertions(+), 9 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index c223a8147bfd..21aaf64c59b2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -591,13 +591,18 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
> if (ret)
> return ret;
>
> + ret = -EINVAL;
> disk_super = (struct btrfs_super_block *)bh->b_data;
> devid = btrfs_stack_device_id(&disk_super->dev_item);
> - if (devid != device->devid)
> + if (devid != device->devid) {
> + ret = -EUCLEAN;
> goto error_brelse;
> + }
>
> - if (memcmp(device->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE))
> + if (memcmp(device->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE)) {
> + ret = -EUCLEAN;
> goto error_brelse;
> + }
>
> device->generation = btrfs_super_generation(disk_super);
>
> @@ -640,7 +645,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
> brelse(bh);
> blkdev_put(bdev, flags);
>
> - return -EINVAL;
> + return ret;
> }
>
> /*
> @@ -1121,19 +1126,28 @@ int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
> static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
> fmode_t flags, void *holder)
> {
> + int ret;
> struct btrfs_device *device;
> + struct btrfs_device *tmp_device;
> struct btrfs_device *latest_dev = NULL;
>
> flags |= FMODE_EXCL;
>
> - list_for_each_entry(device, &fs_devices->devices, dev_list) {
> + list_for_each_entry_safe(device, tmp_device, &fs_devices->devices,
> + dev_list) {
> /* Just open everything we can; ignore failures here */
> - if (btrfs_open_one_device(fs_devices, device, flags, holder))
> - continue;
> -
> - if (!latest_dev ||
> - device->generation > latest_dev->generation)
> + ret = btrfs_open_one_device(fs_devices, device, flags, holder);
> + if (ret == 0 && (!latest_dev ||
> + device->generation > latest_dev->generation)) {
> latest_dev = device;
> + continue;
> + }
nit: Had you used if () {} else if {} you could have done away with the
continue.
> + if (ret == -EUCLEAN) {
> + /* An alien device. Clean it up */
> + fs_devices->num_devices--;
> + list_del(&device->dev_list);
> + btrfs_free_device(device);
> + }
> }
> if (fs_devices->open_devices == 0)
> return -EINVAL;
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] btrfs: delete identified alien device in open_fs_devices
2019-10-04 8:18 ` Nikolay Borisov
2019-10-06 2:47 ` [PATCH v2 " Anand Jain
@ 2019-10-06 2:51 ` Anand Jain
1 sibling, 0 replies; 15+ messages in thread
From: Anand Jain @ 2019-10-06 2:51 UTC (permalink / raw)
To: Nikolay Borisov; +Cc: linux-btrfs
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 06ec3577c6b4..05ade8c7342b 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -803,10 +803,10 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>> disk_super = (struct btrfs_super_block *)bh->b_data;
>> devid = btrfs_stack_device_id(&disk_super->dev_item);
>> if (devid != device->devid)
>> - goto error_brelse;
>> + goto free_alien;
>>
>> if (memcmp(device->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE))
>> - goto error_brelse;
>> + goto free_alien;
>>
>
> Imo a better approach is to return a particular error code and do the
> deletion in open_fs_devices. Otherwise it's not apparent why you use
> list_for_each_entry_safe in one function to delete something in a
> different one (whose name by the way doesn't suggest a deletion is going
> on). Looking at the error I think enodev/enxio is appropriate.
Agreed. And used -EUCLEAN I think that's most appropriate.
Thanks, Anand
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/4] btrfs: include non-missing as a qualifier for the latest_bdev
2019-10-04 7:49 [PATCH 0/4] btrfs: fix issues due to alien device Anand Jain
2019-10-04 7:50 ` [PATCH 1/4] btrfs: drop useless goto in open_fs_devices Anand Jain
2019-10-04 7:50 ` [PATCH 2/4] btrfs: delete identified alien device " Anand Jain
@ 2019-10-04 7:50 ` Anand Jain
2019-10-04 8:11 ` Nikolay Borisov
2019-10-04 7:50 ` [PATCH 4/4] btrfs: free alien device due to device add Anand Jain
2019-10-04 8:19 ` [PATCH 0/4] btrfs: fix issues due to alien device Nikolay Borisov
4 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2019-10-04 7:50 UTC (permalink / raw)
To: linux-btrfs
btrfs_free_extra_devids() reorgs fs_devices::latest_bdev
to point to the bdev with greatest device::generation number.
For a typical-missing device the generation number is zero so
fs_devices::latest_bdev will never point to it.
But if the missing device is due to alienating [1], then
device::generation is not-zero and if it is >= to rest of
device::generation in the list, then fs_devices::latest_bdev
ends up pointing to the missing device and reports the error
like this [2]
[1]
mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs
mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc
sleep 3 # avoid racing with udev's useless scans if needed
btrfs dev add -f /dev/sdb /btrfs
mount -o degraded /dev/sdc /btrfs1
[2]
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
kernel: BTRFS warning (device sdc): devid 1 uuid 072a0192-675b-4d5a-8640-a5cf2b2c704d is missing
kernel: BTRFS error (device sdc): failed to read devices
kernel: BTRFS error (device sdc): open_ctree failed
Fix the root of the issue, by checking if the the device is not
missing before it can be a contender for the fs_devices::latest_bdev
title.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
fs/btrfs/volumes.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 05ade8c7342b..6c42048ec099 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1186,6 +1186,8 @@ void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices, int step)
&device->dev_state)) {
if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
&device->dev_state) &&
+ !test_bit(BTRFS_DEV_STATE_MISSING,
+ &device->dev_state) &&
(!latest_dev ||
device->generation > latest_dev->generation)) {
latest_dev = device;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/4] btrfs: include non-missing as a qualifier for the latest_bdev
2019-10-04 7:50 ` [PATCH 3/4] btrfs: include non-missing as a qualifier for the latest_bdev Anand Jain
@ 2019-10-04 8:11 ` Nikolay Borisov
2019-10-04 9:08 ` [Not TLS] " Graham Cobb
0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2019-10-04 8:11 UTC (permalink / raw)
To: Anand Jain, linux-btrfs
On 4.10.19 г. 10:50 ч., Anand Jain wrote:
> btrfs_free_extra_devids() reorgs fs_devices::latest_bdev
> to point to the bdev with greatest device::generation number.
> For a typical-missing device the generation number is zero so
> fs_devices::latest_bdev will never point to it.
>
> But if the missing device is due to alienating [1], then
> device::generation is not-zero and if it is >= to rest of
> device::generation in the list, then fs_devices::latest_bdev
> ends up pointing to the missing device and reports the error
> like this [2]
>
> [1]
> mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs
> mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc
> sleep 3 # avoid racing with udev's useless scans if needed
> btrfs dev add -f /dev/sdb /btrfs
Hm, here I think the correct way is to refuse adding /dev/sdb to an
existing fs if it's detected to be part of a different one. I.e it
should require wipefs to be done.
>
> mount -o degraded /dev/sdc /btrfs1
>
> [2]
> mount: wrong fs type, bad option, bad superblock on /dev/sdc,
> missing codepage or helper program, or other error
>
> In some cases useful info is found in syslog - try
> dmesg | tail or so.
>
> kernel: BTRFS warning (device sdc): devid 1 uuid 072a0192-675b-4d5a-8640-a5cf2b2c704d is missing
> kernel: BTRFS error (device sdc): failed to read devices
> kernel: BTRFS error (device sdc): open_ctree failed
>
> Fix the root of the issue, by checking if the the device is not
> missing before it can be a contender for the fs_devices::latest_bdev
> title.
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> fs/btrfs/volumes.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 05ade8c7342b..6c42048ec099 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1186,6 +1186,8 @@ void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices, int step)
> &device->dev_state)) {
> if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
> &device->dev_state) &&
> + !test_bit(BTRFS_DEV_STATE_MISSING,
> + &device->dev_state) &&
> (!latest_dev ||
> device->generation > latest_dev->generation)) {
> latest_dev = device;
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Not TLS] Re: [PATCH 3/4] btrfs: include non-missing as a qualifier for the latest_bdev
2019-10-04 8:11 ` Nikolay Borisov
@ 2019-10-04 9:08 ` Graham Cobb
2019-10-04 9:20 ` Nikolay Borisov
0 siblings, 1 reply; 15+ messages in thread
From: Graham Cobb @ 2019-10-04 9:08 UTC (permalink / raw)
To: Nikolay Borisov, Anand Jain, linux-btrfs
On 04/10/2019 09:11, Nikolay Borisov wrote:
>
>
> On 4.10.19 г. 10:50 ч., Anand Jain wrote:
>> btrfs_free_extra_devids() reorgs fs_devices::latest_bdev
>> to point to the bdev with greatest device::generation number.
>> For a typical-missing device the generation number is zero so
>> fs_devices::latest_bdev will never point to it.
>>
>> But if the missing device is due to alienating [1], then
>> device::generation is not-zero and if it is >= to rest of
>> device::generation in the list, then fs_devices::latest_bdev
>> ends up pointing to the missing device and reports the error
>> like this [2]
>>
>> [1]
>> mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs
>> mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc
>> sleep 3 # avoid racing with udev's useless scans if needed
>> btrfs dev add -f /dev/sdb /btrfs
>
> Hm, here I think the correct way is to refuse adding /dev/sdb to an
> existing fs if it's detected to be part of a different one. I.e it
> should require wipefs to be done.
I disagree. -f means "force overwrite of existing filesystem on the
given disk(s)". It shouldn't be any different whether the existing fs is
btrfs or something else.
Graham
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Not TLS] Re: [PATCH 3/4] btrfs: include non-missing as a qualifier for the latest_bdev
2019-10-04 9:08 ` [Not TLS] " Graham Cobb
@ 2019-10-04 9:20 ` Nikolay Borisov
0 siblings, 0 replies; 15+ messages in thread
From: Nikolay Borisov @ 2019-10-04 9:20 UTC (permalink / raw)
To: Graham Cobb, Anand Jain, linux-btrfs
On 4.10.19 г. 12:08 ч., Graham Cobb wrote:
> On 04/10/2019 09:11, Nikolay Borisov wrote:
>>
>>
>> On 4.10.19 г. 10:50 ч., Anand Jain wrote:
>>> btrfs_free_extra_devids() reorgs fs_devices::latest_bdev
>>> to point to the bdev with greatest device::generation number.
>>> For a typical-missing device the generation number is zero so
>>> fs_devices::latest_bdev will never point to it.
>>>
>>> But if the missing device is due to alienating [1], then
>>> device::generation is not-zero and if it is >= to rest of
>>> device::generation in the list, then fs_devices::latest_bdev
>>> ends up pointing to the missing device and reports the error
>>> like this [2]
>>>
>>> [1]
>>> mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs
>>> mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc
>>> sleep 3 # avoid racing with udev's useless scans if needed
>>> btrfs dev add -f /dev/sdb /btrfs
>>
>> Hm, here I think the correct way is to refuse adding /dev/sdb to an
>> existing fs if it's detected to be part of a different one. I.e it
>> should require wipefs to be done.
>
> I disagree. -f means "force overwrite of existing filesystem on the
> given disk(s)". It shouldn't be any different whether the existing fs is
> btrfs or something else.
You are right, looking at btrfs device add implementation though it
seems that wipefs is being done on the device to be added:
cmd_device_add
btrfs_prepare_device
btrfs_wipe_existing_sb
So if the device to be added is formatted and then it goes through
btrfs_init_new_device which commits the transaction which in turns
rewrites the sb then why is the error happening in the first place?
>
> Graham
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 4/4] btrfs: free alien device due to device add
2019-10-04 7:49 [PATCH 0/4] btrfs: fix issues due to alien device Anand Jain
` (2 preceding siblings ...)
2019-10-04 7:50 ` [PATCH 3/4] btrfs: include non-missing as a qualifier for the latest_bdev Anand Jain
@ 2019-10-04 7:50 ` Anand Jain
2019-10-04 8:19 ` [PATCH 0/4] btrfs: fix issues due to alien device Nikolay Borisov
4 siblings, 0 replies; 15+ messages in thread
From: Anand Jain @ 2019-10-04 7:50 UTC (permalink / raw)
To: linux-btrfs
When the old device has new fsid through btrfs device add -f <dev> our
fs_devices list has an alien device in one of the fs_devices.
By having an alien device in fs_devices, we have two issues so far
1. missing device is not shows as missing in the userland
Which is due to cracks in the function btrfs_open_one_device() and
hardened by patch
btrfs: delete identified alien device in open_fs_devices
2. mount of a degraded fs_devices fails
Which is due to cracks in the function btrfs_free_extra_devids() and
hardened by patch
btrfs: include non-missing as a qualifier for the latest_bdev
Now the reason for both of this issue is that there is an alien (does not
contain the intended fsid) device in the fs_devices.
We know a device can be scanned/added through
btrfs-control::BTRFS_IOC_SCAN_DEV|BTRFS_IOC_DEVICES_READY
or by
ioctl::BTRFS_IOC_ADD_DEV
And device coming through btrfs-control is checked against the all other
devices in btrfs kernel but not coming through BTRFS_IOC_ADD_DEV.
This patch checks if the device add is alienating any other scanned
device and deletes it.
In fact, this patch fixes both the issues 1 and 2 (above) by eliminating
the source of the issue, but still they have their own patch as well
because its the right way to harden the functions and fill the cracks.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
fs/btrfs/volumes.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6c42048ec099..f8b21e82df2a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2736,6 +2736,19 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
/* Update ctime/mtime for libblkid */
update_dev_time(device_path);
+
+ /*
+ * Now that we have written a new sb into this device, check all other
+ * fs_devices list if it alienates any scanned device.
+ */
+ mutex_lock(&uuid_mutex);
+ /*
+ * Ignore the return as we are successfull in the core task - to added
+ * the device
+ */
+ btrfs_free_stale_devices(device_path, NULL);
+ mutex_unlock(&uuid_mutex);
+
return ret;
error_sysfs:
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 0/4] btrfs: fix issues due to alien device
2019-10-04 7:49 [PATCH 0/4] btrfs: fix issues due to alien device Anand Jain
` (3 preceding siblings ...)
2019-10-04 7:50 ` [PATCH 4/4] btrfs: free alien device due to device add Anand Jain
@ 2019-10-04 8:19 ` Nikolay Borisov
2019-10-07 9:42 ` Anand Jain
4 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2019-10-04 8:19 UTC (permalink / raw)
To: Anand Jain, linux-btrfs
On 4.10.19 г. 10:49 ч., Anand Jain wrote:
> Alien device is a device in fs_devices list having a different fsid than
> the expected fsid. This patch set fixes issues found due to the same.
Are you going to submit an fstests patch for this bug ?
>
> Patch1: is a cleanup patch, not related.
> Patch2: fixes the missing device not missing in the userland, by
> hardening the function btrfs_open_one_device().
> Patch3: fixes failing to mount a degraded RAID1 (but it can apply
> to RAID5/6/10 as well), by hardening the function
> btrfs_free_extra_devids().
> Patch4: eliminates the source of the alien device in the fs_devices.
>
> Anand Jain (4):
> btrfs: drop useless goto in open_fs_devices
> btrfs: delete identified alien device in open_fs_devices
> btrfs: include non-missing as a qualifier for the latest_bdev
> btrfs: free alien device due to device add
>
> fs/btrfs/volumes.c | 40 ++++++++++++++++++++++++++++++----------
> 1 file changed, 30 insertions(+), 10 deletions(-)
>
^ permalink raw reply [flat|nested] 15+ messages in thread