* [PATCH] btrfs: handle dynamically reappearing missing device
@ 2017-11-12 10:56 Anand Jain
2017-11-15 7:38 ` kbuild test robot
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Anand Jain @ 2017-11-12 10:56 UTC (permalink / raw)
To: linux-btrfs
If the device is not present at the time of (-o degrade) mount
the mount context will create a dummy missing struct btrfs_device.
Later this device may reappear after the FS is mounted. So this
patch handles that case by going through the open_device steps
which this device missed and finally adds to the device alloc list.
So now with this patch, to bring back the missing device user can run,
btrfs dev scan <path-of-missing-device>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
This patch needs:
[PATCH 0/4] factor __btrfs_open_devices()
fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 41 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d24e966ee29f..e7dd996831f2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
rcu_string_free(device->name);
rcu_assign_pointer(device->name, name);
if (device->missing) {
- fs_devices->missing_devices--;
- device->missing = 0;
+ int ret;
+ struct btrfs_fs_info *fs_info = fs_devices->fs_info;
+ fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
+
+ if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
+ fmode &= ~FMODE_WRITE;
+
+ /*
+ * Missing can be set only when FS is mounted.
+ * So here its always fs_devices->opened > 0 and most
+ * of the struct device members are already updated by
+ * the mount process even if this device was missing, so
+ * now follow the normal open device procedure for this
+ * device. The scrub will take care of filling the
+ * missing stripes for raid56 and balance for raid1 and
+ * raid10.
+ */
+ ASSERT(fs_devices->opened);
+ mutex_lock(&fs_devices->device_list_mutex);
+ mutex_lock(&fs_info->chunk_mutex);
+ ret = btrfs_open_one_device(fs_devices, device, fmode,
+ fs_info->bdev_holder);
+ if (!ret) {
+ fs_devices->missing_devices--;
+ device->missing = 0;
+ btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
+ btrfs_warn(fs_info,
+ "BTRFS: device %s devid %llu uuid %pU joined\n",
+ path, devid, device->uuid);
+ }
+
+ if (device->writeable &&
+ !device->is_tgtdev_for_dev_replace) {
+ fs_devices->total_rw_bytes += device->total_bytes;
+ atomic64_add(device->total_bytes -
+ device->bytes_used,
+ &fs_info->free_chunk_space);
+ }
+ device->in_fs_metadata = 1;
+ mutex_unlock(&fs_devices->fs_info->chunk_mutex);
+ mutex_unlock(&fs_devices->device_list_mutex);
}
}
--
2.13.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] btrfs: handle dynamically reappearing missing device
2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
@ 2017-11-15 7:38 ` kbuild test robot
2017-11-15 10:18 ` Anand Jain
2017-11-16 19:08 ` Nikolay Borisov
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: kbuild test robot @ 2017-11-15 7:38 UTC (permalink / raw)
To: Anand Jain; +Cc: kbuild-all, linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 7598 bytes --]
Hi Anand,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on btrfs/next]
[also build test ERROR on v4.14 next-20171114]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Anand-Jain/btrfs-handle-dynamically-reappearing-missing-device/20171115-143047
base: https://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git next
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=sparc64
All errors (new ones prefixed by >>):
fs/btrfs/volumes.c: In function 'device_list_add':
>> fs/btrfs/volumes.c:732:10: error: implicit declaration of function 'btrfs_open_one_device'; did you mean 'btrfs_scan_one_device'? [-Werror=implicit-function-declaration]
ret = btrfs_open_one_device(fs_devices, device, fmode,
^~~~~~~~~~~~~~~~~~~~~
btrfs_scan_one_device
cc1: some warnings being treated as errors
vim +732 fs/btrfs/volumes.c
610
611 /*
612 * Add new device to list of registered devices
613 *
614 * Returns:
615 * 1 - first time device is seen
616 * 0 - device already known
617 * < 0 - error
618 */
619 static noinline int device_list_add(const char *path,
620 struct btrfs_super_block *disk_super,
621 u64 devid, struct btrfs_fs_devices **fs_devices_ret)
622 {
623 struct btrfs_device *device;
624 struct btrfs_fs_devices *fs_devices;
625 struct rcu_string *name;
626 int ret = 0;
627 u64 found_transid = btrfs_super_generation(disk_super);
628
629 fs_devices = find_fsid(disk_super->fsid);
630 if (!fs_devices) {
631 fs_devices = alloc_fs_devices(disk_super->fsid);
632 if (IS_ERR(fs_devices))
633 return PTR_ERR(fs_devices);
634
635 list_add(&fs_devices->list, &fs_uuids);
636
637 device = NULL;
638 } else {
639 device = __find_device(&fs_devices->devices, devid,
640 disk_super->dev_item.uuid);
641 }
642
643 if (!device) {
644 if (fs_devices->opened)
645 return -EBUSY;
646
647 device = btrfs_alloc_device(NULL, &devid,
648 disk_super->dev_item.uuid);
649 if (IS_ERR(device)) {
650 /* we can safely leave the fs_devices entry around */
651 return PTR_ERR(device);
652 }
653
654 name = rcu_string_strdup(path, GFP_NOFS);
655 if (!name) {
656 kfree(device);
657 return -ENOMEM;
658 }
659 rcu_assign_pointer(device->name, name);
660
661 mutex_lock(&fs_devices->device_list_mutex);
662 list_add_rcu(&device->dev_list, &fs_devices->devices);
663 fs_devices->num_devices++;
664 mutex_unlock(&fs_devices->device_list_mutex);
665
666 ret = 1;
667 device->fs_devices = fs_devices;
668 } else if (!device->name || strcmp(device->name->str, path)) {
669 /*
670 * When FS is already mounted.
671 * 1. If you are here and if the device->name is NULL that
672 * means this device was missing at time of FS mount.
673 * 2. If you are here and if the device->name is different
674 * from 'path' that means either
675 * a. The same device disappeared and reappeared with
676 * different name. or
677 * b. The missing-disk-which-was-replaced, has
678 * reappeared now.
679 *
680 * We must allow 1 and 2a above. But 2b would be a spurious
681 * and unintentional.
682 *
683 * Further in case of 1 and 2a above, the disk at 'path'
684 * would have missed some transaction when it was away and
685 * in case of 2a the stale bdev has to be updated as well.
686 * 2b must not be allowed at all time.
687 */
688
689 /*
690 * For now, we do allow update to btrfs_fs_device through the
691 * btrfs dev scan cli after FS has been mounted. We're still
692 * tracking a problem where systems fail mount by subvolume id
693 * when we reject replacement on a mounted FS.
694 */
695 if (!fs_devices->opened && found_transid < device->generation) {
696 /*
697 * That is if the FS is _not_ mounted and if you
698 * are here, that means there is more than one
699 * disk with same uuid and devid.We keep the one
700 * with larger generation number or the last-in if
701 * generation are equal.
702 */
703 return -EEXIST;
704 }
705
706 name = rcu_string_strdup(path, GFP_NOFS);
707 if (!name)
708 return -ENOMEM;
709 rcu_string_free(device->name);
710 rcu_assign_pointer(device->name, name);
711 if (device->missing) {
712 int ret;
713 struct btrfs_fs_info *fs_info = fs_devices->fs_info;
714 fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
715
716 if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
717 fmode &= ~FMODE_WRITE;
718
719 /*
720 * Missing can be set only when FS is mounted.
721 * So here its always fs_devices->opened > 0 and most
722 * of the struct device members are already updated by
723 * the mount process even if this device was missing, so
724 * now follow the normal open device procedure for this
725 * device. The scrub will take care of filling the
726 * missing stripes for raid56 and balance for raid1 and
727 * raid10.
728 */
729 ASSERT(fs_devices->opened);
730 mutex_lock(&fs_devices->device_list_mutex);
731 mutex_lock(&fs_info->chunk_mutex);
> 732 ret = btrfs_open_one_device(fs_devices, device, fmode,
733 fs_info->bdev_holder);
734 if (!ret) {
735 fs_devices->missing_devices--;
736 device->missing = 0;
737 btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
738 btrfs_warn(fs_info,
739 "BTRFS: device %s devid %llu uuid %pU joined\n",
740 path, devid, device->uuid);
741 }
742
743 if (device->writeable &&
744 !device->is_tgtdev_for_dev_replace) {
745 fs_devices->total_rw_bytes += device->total_bytes;
746 atomic64_add(device->total_bytes -
747 device->bytes_used,
748 &fs_info->free_chunk_space);
749 }
750 device->in_fs_metadata = 1;
751 mutex_unlock(&fs_devices->fs_info->chunk_mutex);
752 mutex_unlock(&fs_devices->device_list_mutex);
753 }
754 }
755
756 /*
757 * Unmount does not free the btrfs_device struct but would zero
758 * generation along with most of the other members. So just update
759 * it back. We need it to pick the disk with largest generation
760 * (as above).
761 */
762 if (!fs_devices->opened)
763 device->generation = found_transid;
764
765 /*
766 * if there is new btrfs on an already registered device,
767 * then remove the stale device entry.
768 */
769 if (ret > 0)
770 btrfs_free_stale_device(device);
771
772 *fs_devices_ret = fs_devices;
773
774 return ret;
775 }
776
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 51512 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] btrfs: handle dynamically reappearing missing device
2017-11-15 7:38 ` kbuild test robot
@ 2017-11-15 10:18 ` Anand Jain
0 siblings, 0 replies; 14+ messages in thread
From: Anand Jain @ 2017-11-15 10:18 UTC (permalink / raw)
To: kbuild test robot; +Cc: kbuild-all, linux-btrfs
On 11/15/2017 03:38 PM, kbuild test robot wrote:
> Hi Anand,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on btrfs/next]
> [also build test ERROR on v4.14 next-20171114]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>
> url: https://github.com/0day-ci/linux/commits/Anand-Jain/btrfs-handle-dynamically-reappearing-missing-device/20171115-143047
> base: https://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git next
> config: sparc64-allyesconfig (attached as .config)
> compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> reproduce:
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> make.cross ARCH=sparc64
>
> All errors (new ones prefixed by >>):
>
> fs/btrfs/volumes.c: In function 'device_list_add':
>>> fs/btrfs/volumes.c:732:10: error: implicit declaration of function 'btrfs_open_one_device'; did you mean 'btrfs_scan_one_device'? [-Werror=implicit-function-declaration]
> ret = btrfs_open_one_device(fs_devices, device, fmode,
> ^~~~~~~~~~~~~~~~~~~~~
> btrfs_scan_one_device
> cc1: some warnings being treated as errors
Its missing this patch set in the ML which was sent separately.
[PATCH 0/4] factor __btrfs_open_devices()
Thanks, Anand
> vim +732 fs/btrfs/volumes.c
>
> 610
> 611 /*
> 612 * Add new device to list of registered devices
> 613 *
> 614 * Returns:
> 615 * 1 - first time device is seen
> 616 * 0 - device already known
> 617 * < 0 - error
> 618 */
> 619 static noinline int device_list_add(const char *path,
> 620 struct btrfs_super_block *disk_super,
> 621 u64 devid, struct btrfs_fs_devices **fs_devices_ret)
> 622 {
> 623 struct btrfs_device *device;
> 624 struct btrfs_fs_devices *fs_devices;
> 625 struct rcu_string *name;
> 626 int ret = 0;
> 627 u64 found_transid = btrfs_super_generation(disk_super);
> 628
> 629 fs_devices = find_fsid(disk_super->fsid);
> 630 if (!fs_devices) {
> 631 fs_devices = alloc_fs_devices(disk_super->fsid);
> 632 if (IS_ERR(fs_devices))
> 633 return PTR_ERR(fs_devices);
> 634
> 635 list_add(&fs_devices->list, &fs_uuids);
> 636
> 637 device = NULL;
> 638 } else {
> 639 device = __find_device(&fs_devices->devices, devid,
> 640 disk_super->dev_item.uuid);
> 641 }
> 642
> 643 if (!device) {
> 644 if (fs_devices->opened)
> 645 return -EBUSY;
> 646
> 647 device = btrfs_alloc_device(NULL, &devid,
> 648 disk_super->dev_item.uuid);
> 649 if (IS_ERR(device)) {
> 650 /* we can safely leave the fs_devices entry around */
> 651 return PTR_ERR(device);
> 652 }
> 653
> 654 name = rcu_string_strdup(path, GFP_NOFS);
> 655 if (!name) {
> 656 kfree(device);
> 657 return -ENOMEM;
> 658 }
> 659 rcu_assign_pointer(device->name, name);
> 660
> 661 mutex_lock(&fs_devices->device_list_mutex);
> 662 list_add_rcu(&device->dev_list, &fs_devices->devices);
> 663 fs_devices->num_devices++;
> 664 mutex_unlock(&fs_devices->device_list_mutex);
> 665
> 666 ret = 1;
> 667 device->fs_devices = fs_devices;
> 668 } else if (!device->name || strcmp(device->name->str, path)) {
> 669 /*
> 670 * When FS is already mounted.
> 671 * 1. If you are here and if the device->name is NULL that
> 672 * means this device was missing at time of FS mount.
> 673 * 2. If you are here and if the device->name is different
> 674 * from 'path' that means either
> 675 * a. The same device disappeared and reappeared with
> 676 * different name. or
> 677 * b. The missing-disk-which-was-replaced, has
> 678 * reappeared now.
> 679 *
> 680 * We must allow 1 and 2a above. But 2b would be a spurious
> 681 * and unintentional.
> 682 *
> 683 * Further in case of 1 and 2a above, the disk at 'path'
> 684 * would have missed some transaction when it was away and
> 685 * in case of 2a the stale bdev has to be updated as well.
> 686 * 2b must not be allowed at all time.
> 687 */
> 688
> 689 /*
> 690 * For now, we do allow update to btrfs_fs_device through the
> 691 * btrfs dev scan cli after FS has been mounted. We're still
> 692 * tracking a problem where systems fail mount by subvolume id
> 693 * when we reject replacement on a mounted FS.
> 694 */
> 695 if (!fs_devices->opened && found_transid < device->generation) {
> 696 /*
> 697 * That is if the FS is _not_ mounted and if you
> 698 * are here, that means there is more than one
> 699 * disk with same uuid and devid.We keep the one
> 700 * with larger generation number or the last-in if
> 701 * generation are equal.
> 702 */
> 703 return -EEXIST;
> 704 }
> 705
> 706 name = rcu_string_strdup(path, GFP_NOFS);
> 707 if (!name)
> 708 return -ENOMEM;
> 709 rcu_string_free(device->name);
> 710 rcu_assign_pointer(device->name, name);
> 711 if (device->missing) {
> 712 int ret;
> 713 struct btrfs_fs_info *fs_info = fs_devices->fs_info;
> 714 fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
> 715
> 716 if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
> 717 fmode &= ~FMODE_WRITE;
> 718
> 719 /*
> 720 * Missing can be set only when FS is mounted.
> 721 * So here its always fs_devices->opened > 0 and most
> 722 * of the struct device members are already updated by
> 723 * the mount process even if this device was missing, so
> 724 * now follow the normal open device procedure for this
> 725 * device. The scrub will take care of filling the
> 726 * missing stripes for raid56 and balance for raid1 and
> 727 * raid10.
> 728 */
> 729 ASSERT(fs_devices->opened);
> 730 mutex_lock(&fs_devices->device_list_mutex);
> 731 mutex_lock(&fs_info->chunk_mutex);
> > 732 ret = btrfs_open_one_device(fs_devices, device, fmode,
> 733 fs_info->bdev_holder);
> 734 if (!ret) {
> 735 fs_devices->missing_devices--;
> 736 device->missing = 0;
> 737 btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
> 738 btrfs_warn(fs_info,
> 739 "BTRFS: device %s devid %llu uuid %pU joined\n",
> 740 path, devid, device->uuid);
> 741 }
> 742
> 743 if (device->writeable &&
> 744 !device->is_tgtdev_for_dev_replace) {
> 745 fs_devices->total_rw_bytes += device->total_bytes;
> 746 atomic64_add(device->total_bytes -
> 747 device->bytes_used,
> 748 &fs_info->free_chunk_space);
> 749 }
> 750 device->in_fs_metadata = 1;
> 751 mutex_unlock(&fs_devices->fs_info->chunk_mutex);
> 752 mutex_unlock(&fs_devices->device_list_mutex);
> 753 }
> 754 }
> 755
> 756 /*
> 757 * Unmount does not free the btrfs_device struct but would zero
> 758 * generation along with most of the other members. So just update
> 759 * it back. We need it to pick the disk with largest generation
> 760 * (as above).
> 761 */
> 762 if (!fs_devices->opened)
> 763 device->generation = found_transid;
> 764
> 765 /*
> 766 * if there is new btrfs on an already registered device,
> 767 * then remove the stale device entry.
> 768 */
> 769 if (ret > 0)
> 770 btrfs_free_stale_device(device);
> 771
> 772 *fs_devices_ret = fs_devices;
> 773
> 774 return ret;
> 775 }
> 776
>
> ---
> 0-DAY kernel test infrastructure Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all Intel Corporation
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] btrfs: handle dynamically reappearing missing device
2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
2017-11-15 7:38 ` kbuild test robot
@ 2017-11-16 19:08 ` Nikolay Borisov
2017-11-17 7:46 ` Anand Jain
2017-11-16 23:28 ` Liu Bo
2017-11-17 12:36 ` [PATCH v2] " Anand Jain
3 siblings, 1 reply; 14+ messages in thread
From: Nikolay Borisov @ 2017-11-16 19:08 UTC (permalink / raw)
To: Anand Jain, linux-btrfs
On 12.11.2017 12:56, Anand Jain wrote:
> If the device is not present at the time of (-o degrade) mount
> the mount context will create a dummy missing struct btrfs_device.
> Later this device may reappear after the FS is mounted. So this
> patch handles that case by going through the open_device steps
> which this device missed and finally adds to the device alloc list.
>
> So now with this patch, to bring back the missing device user can run,
>
> btrfs dev scan <path-of-missing-device>
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> This patch needs:
> [PATCH 0/4] factor __btrfs_open_devices()
>
> fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 41 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index d24e966ee29f..e7dd996831f2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
> rcu_string_free(device->name);
> rcu_assign_pointer(device->name, name);
> if (device->missing) {
> - fs_devices->missing_devices--;
> - device->missing = 0;
> + int ret;
> + struct btrfs_fs_info *fs_info = fs_devices->fs_info;
> + fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
> +
> + if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
> + fmode &= ~FMODE_WRITE;
> +
> + /*
> + * Missing can be set only when FS is mounted.
> + * So here its always fs_devices->opened > 0 and most
> + * of the struct device members are already updated by
> + * the mount process even if this device was missing, so
> + * now follow the normal open device procedure for this
> + * device. The scrub will take care of filling the
> + * missing stripes for raid56 and balance for raid1 and
> + * raid10.
> + */
> + ASSERT(fs_devices->opened);
> + mutex_lock(&fs_devices->device_list_mutex);
> + mutex_lock(&fs_info->chunk_mutex);
> + ret = btrfs_open_one_device(fs_devices, device, fmode,
> + fs_info->bdev_holder);
> + if (!ret) {
> + fs_devices->missing_devices--;
> + device->missing = 0;
> + btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
> + btrfs_warn(fs_info,
> + "BTRFS: device %s devid %llu uuid %pU joined\n",
> + path, devid, device->uuid);
> + }
> +
> + if (device->writeable &&
> + !device->is_tgtdev_for_dev_replace) {
> + fs_devices->total_rw_bytes += device->total_bytes;
> + atomic64_add(device->total_bytes -
> + device->bytes_used,
> + &fs_info->free_chunk_space);
> + }
> + device->in_fs_metadata = 1;
> + mutex_unlock(&fs_devices->fs_info->chunk_mutex);
> + mutex_unlock(&fs_devices->device_list_mutex);
nit: You did add the fs_info local var, so for consistency's sake use that
> }
> }
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] btrfs: handle dynamically reappearing missing device
2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
2017-11-15 7:38 ` kbuild test robot
2017-11-16 19:08 ` Nikolay Borisov
@ 2017-11-16 23:28 ` Liu Bo
2017-11-17 11:53 ` Anand Jain
2017-11-17 12:36 ` [PATCH v2] " Anand Jain
3 siblings, 1 reply; 14+ messages in thread
From: Liu Bo @ 2017-11-16 23:28 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs
On Sun, Nov 12, 2017 at 06:56:50PM +0800, Anand Jain wrote:
> If the device is not present at the time of (-o degrade) mount
> the mount context will create a dummy missing struct btrfs_device.
> Later this device may reappear after the FS is mounted.
This commit log doesn't explain what would happen when the missing
device re-appears.
> So this
> patch handles that case by going through the open_device steps
> which this device missed and finally adds to the device alloc list.
>
> So now with this patch, to bring back the missing device user can run,
>
> btrfs dev scan <path-of-missing-device>
Most common distros already have some udev rules of btrfs to process a
reappeared disk.
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> This patch needs:
> [PATCH 0/4] factor __btrfs_open_devices()
>
> fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 41 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index d24e966ee29f..e7dd996831f2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
> rcu_string_free(device->name);
> rcu_assign_pointer(device->name, name);
> if (device->missing) {
> - fs_devices->missing_devices--;
> - device->missing = 0;
> + int ret;
> + struct btrfs_fs_info *fs_info = fs_devices->fs_info;
> + fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
> +
> + if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
> + fmode &= ~FMODE_WRITE;
> +
> + /*
> + * Missing can be set only when FS is mounted.
> + * So here its always fs_devices->opened > 0 and most
> + * of the struct device members are already updated by
> + * the mount process even if this device was missing, so
> + * now follow the normal open device procedure for this
> + * device. The scrub will take care of filling the
> + * missing stripes for raid56 and balance for raid1 and
> + * raid10.
> + */
The idea looks good to me.
What if users skip balance/scrub given both could take a while?
Then btrfs takes the disks back and reads content from it, and it's OK
for raid1/10 as they may have a good copy, but in case of raid56, it
might hit the reconstruction bug if two disks are missing and added
back, which I tried to address recently.
At least ->writable should not be set until balance/scrub completes
the re-sync job.
Thanks,
-liubo
> + ASSERT(fs_devices->opened);
> + mutex_lock(&fs_devices->device_list_mutex);
> + mutex_lock(&fs_info->chunk_mutex);
> + ret = btrfs_open_one_device(fs_devices, device, fmode,
> + fs_info->bdev_holder);
> + if (!ret) {
> + fs_devices->missing_devices--;
> + device->missing = 0;
> + btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
> + btrfs_warn(fs_info,
> + "BTRFS: device %s devid %llu uuid %pU joined\n",
> + path, devid, device->uuid);
> + }
> +
> + if (device->writeable &&
> + !device->is_tgtdev_for_dev_replace) {
> + fs_devices->total_rw_bytes += device->total_bytes;
> + atomic64_add(device->total_bytes -
> + device->bytes_used,
> + &fs_info->free_chunk_space);
> + }
> + device->in_fs_metadata = 1;
> + mutex_unlock(&fs_devices->fs_info->chunk_mutex);
> + mutex_unlock(&fs_devices->device_list_mutex);
> }
> }
>
> --
> 2.13.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] btrfs: handle dynamically reappearing missing device
2017-11-16 19:08 ` Nikolay Borisov
@ 2017-11-17 7:46 ` Anand Jain
0 siblings, 0 replies; 14+ messages in thread
From: Anand Jain @ 2017-11-17 7:46 UTC (permalink / raw)
To: Nikolay Borisov, linux-btrfs
On 11/17/2017 03:08 AM, Nikolay Borisov wrote:
>
>
> On 12.11.2017 12:56, Anand Jain wrote:
>> If the device is not present at the time of (-o degrade) mount
>> the mount context will create a dummy missing struct btrfs_device.
>> Later this device may reappear after the FS is mounted. So this
>> patch handles that case by going through the open_device steps
>> which this device missed and finally adds to the device alloc list.
>>
>> So now with this patch, to bring back the missing device user can run,
>>
>> btrfs dev scan <path-of-missing-device>
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> This patch needs:
>> [PATCH 0/4] factor __btrfs_open_devices()
>>
>> fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>> 1 file changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index d24e966ee29f..e7dd996831f2 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
>> rcu_string_free(device->name);
>> rcu_assign_pointer(device->name, name);
>> if (device->missing) {
>> - fs_devices->missing_devices--;
>> - device->missing = 0;
>> + int ret;
>> + struct btrfs_fs_info *fs_info = fs_devices->fs_info;
>> + fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
>> +
>> + if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
>> + fmode &= ~FMODE_WRITE;
>> +
>> + /*
>> + * Missing can be set only when FS is mounted.
>> + * So here its always fs_devices->opened > 0 and most
>> + * of the struct device members are already updated by
>> + * the mount process even if this device was missing, so
>> + * now follow the normal open device procedure for this
>> + * device. The scrub will take care of filling the
>> + * missing stripes for raid56 and balance for raid1 and
>> + * raid10.
>> + */
>> + ASSERT(fs_devices->opened);
>> + mutex_lock(&fs_devices->device_list_mutex);
>> + mutex_lock(&fs_info->chunk_mutex);
>> + ret = btrfs_open_one_device(fs_devices, device, fmode,
>> + fs_info->bdev_holder);
>> + if (!ret) {
>> + fs_devices->missing_devices--;
>> + device->missing = 0;
>> + btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
>> + btrfs_warn(fs_info,
>> + "BTRFS: device %s devid %llu uuid %pU joined\n",
>> + path, devid, device->uuid);
>> + }
>> +
>> + if (device->writeable &&
>> + !device->is_tgtdev_for_dev_replace) {
>> + fs_devices->total_rw_bytes += device->total_bytes;
>> + atomic64_add(device->total_bytes -
>> + device->bytes_used,
>> + &fs_info->free_chunk_space);
>> + }
>> + device->in_fs_metadata = 1;
>> + mutex_unlock(&fs_devices->fs_info->chunk_mutex);
>> + mutex_unlock(&fs_devices->device_list_mutex);
>
> nit: You did add the fs_info local var, so for consistency's sake use that
Thanks. Will fix it.
-Anand
>> }
>> }
>>
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] btrfs: handle dynamically reappearing missing device
2017-11-16 23:28 ` Liu Bo
@ 2017-11-17 11:53 ` Anand Jain
2017-11-28 22:52 ` Liu Bo
0 siblings, 1 reply; 14+ messages in thread
From: Anand Jain @ 2017-11-17 11:53 UTC (permalink / raw)
To: bo.li.liu; +Cc: linux-btrfs
On 11/17/2017 07:28 AM, Liu Bo wrote:
> On Sun, Nov 12, 2017 at 06:56:50PM +0800, Anand Jain wrote:
>> If the device is not present at the time of (-o degrade) mount
>> the mount context will create a dummy missing struct btrfs_device.
>> Later this device may reappear after the FS is mounted.
> This commit log doesn't explain what would happen when the missing
> device re-appears.
Hm. You mean in the current design without this patch.?
Just nothing it gives a false impression to the user by
'btrfs dev ready' and 'btrfs fi show' that missing device
has been included. Will update change log.
>> So this
>> patch handles that case by going through the open_device steps
>> which this device missed and finally adds to the device alloc list.
>>
>> So now with this patch, to bring back the missing device user can run,
>>
>> btrfs dev scan <path-of-missing-device>
>
> Most common distros already have some udev rules of btrfs to process a
> reappeared disk.
Right. Do you see anything that will break ?
Planning to add comments [1] in v2 for more clarity around this.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> This patch needs:
>> [PATCH 0/4] factor __btrfs_open_devices()
>>
>> fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>> 1 file changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index d24e966ee29f..e7dd996831f2 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
>> rcu_string_free(device->name);
>> rcu_assign_pointer(device->name, name);
>> if (device->missing) {
>> - fs_devices->missing_devices--;
>> - device->missing = 0;
>> + int ret;
>> + struct btrfs_fs_info *fs_info = fs_devices->fs_info;
>> + fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
>> +
>> + if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
>> + fmode &= ~FMODE_WRITE;
>> +
>> + /*
>> + * Missing can be set only when FS is mounted.
>> + * So here its always fs_devices->opened > 0 and most
>> + * of the struct device members are already updated by
>> + * the mount process even if this device was missing, so
>> + * now follow the normal open device procedure for this
>> + * device. The scrub will take care of filling the
>> + * missing stripes for raid56 and balance for raid1 and
>> + * raid10.
>> + */
>
>
> The idea looks good to me.
>
> What if users skip balance/scrub given both could take a while?
>
> Then btrfs takes the disks back and reads content from it, and it's OK
> for raid1/10 as they may have a good copy,
Yes, except for the split brain situation for which patches are
in the ML, review comments appreciated.
> but in case of raid56, it
> might hit the reconstruction bug if two disks are missing and added
> back, which I tried to address recently.
> At least ->writable should not be set until balance/scrub completes
> the re-sync job.
Hm. Why ?
> Thanks,
>
> -liubo
>
>> + ASSERT(fs_devices->opened);
>> + mutex_lock(&fs_devices->device_list_mutex);
>> + mutex_lock(&fs_info->chunk_mutex);
[1]
/*
* By not failing for the
* reason that btrfs_open_one_device() could
* fail, it will keep the original behaviors as
* it is for now. That's fix me for later.
*/
>> + ret = btrfs_open_one_device(fs_devices, device, fmode,
>> + fs_info->bdev_holder);
>> + if (!ret) {
/*
* Making sure missing is set to 0
* only when bdev != NULL is as expected
* at most of the places in the code.
* Further, what if user fixes the
* dev and reruns dev scan, then again
* we need to come here.
*/
>> + fs_devices->missing_devices--;
>> + device->missing = 0;
>> + btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
>> + btrfs_warn(fs_info,
>> + "BTRFS: device %s devid %llu uuid %pU joined\n",
>> + path, devid, device->uuid);
>> + }
Also added check for missing as below in v2.
--------------
@@ -725,7 +725,9 @@ static noinline int device_list_add(const char *path,
ret = 1;
device->fs_devices = fs_devices;
- } else if (!device->name || strcmp(device->name->str, path)) {
+ } else if (!device->name ||
+ strcmp(device->name->str, path) ||
+ device->missing) {
/*
* When FS is already mounted.
* 1. If you are here and if the device->name is NULL that
------------
Thanks, Anand
>> +
>> + if (device->writeable &&
>> + !device->is_tgtdev_for_dev_replace) {
>> + fs_devices->total_rw_bytes += device->total_bytes;
>> + atomic64_add(device->total_bytes -
>> + device->bytes_used,
>> + &fs_info->free_chunk_space);
>> + }
>> + device->in_fs_metadata = 1;
>> + mutex_unlock(&fs_devices->fs_info->chunk_mutex);
>> + mutex_unlock(&fs_devices->device_list_mutex);
>> }
>> }
>>
>> --
>> 2.13.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2] btrfs: handle dynamically reappearing missing device
2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
` (2 preceding siblings ...)
2017-11-16 23:28 ` Liu Bo
@ 2017-11-17 12:36 ` Anand Jain
2017-11-18 13:52 ` Goffredo Baroncelli
3 siblings, 1 reply; 14+ messages in thread
From: Anand Jain @ 2017-11-17 12:36 UTC (permalink / raw)
To: linux-btrfs; +Cc: Anand Jain
If the device is not present at the time of (-o degrade) mount,
the mount context will create a dummy missing struct btrfs_device.
Later this device may reappear after the FS is mounted and
then device is included in the device list but it missed the
open_device part. So this patch handles that case by going
through the open_device steps which this device missed and finally
adds to the device alloc list.
So now with this patch, to bring back the missing device user can run,
btrfs dev scan <path-of-missing-device>
Without this kernel patch, even though 'btrfs fi show' and 'btrfs
dev ready' would tell you that missing device has reappeared
successfully but actually in kernel FS layer it didn't.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
This patch needs:
[PATCH 0/4] factor __btrfs_open_devices()
v2:
Add more comments.
Add more change log.
Add to check if device missing is set, to handle the case
dev open fail and user will rerun the dev scan.
fs/btrfs/volumes.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 60 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5bd73edc2602..b3224baa6db4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -725,7 +725,9 @@ static noinline int device_list_add(const char *path,
ret = 1;
device->fs_devices = fs_devices;
- } else if (!device->name || strcmp(device->name->str, path)) {
+ } else if (!device->name ||
+ strcmp(device->name->str, path) ||
+ device->missing) {
/*
* When FS is already mounted.
* 1. If you are here and if the device->name is NULL that
@@ -769,8 +771,63 @@ static noinline int device_list_add(const char *path,
rcu_string_free(device->name);
rcu_assign_pointer(device->name, name);
if (device->missing) {
- fs_devices->missing_devices--;
- device->missing = 0;
+ int ret;
+ struct btrfs_fs_info *fs_info = fs_devices->fs_info;
+ fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
+
+ if (btrfs_super_flags(disk_super) &
+ BTRFS_SUPER_FLAG_SEEDING)
+ fmode &= ~FMODE_WRITE;
+
+ /*
+ * Missing can be set only when FS is mounted.
+ * So here its always fs_devices->opened > 0 and most
+ * of the struct device members are already updated by
+ * the mount process even if this device was missing, so
+ * now follow the normal open device procedure for this
+ * device. The scrub will take care of filling the
+ * missing stripes for raid56 and balance for raid1 and
+ * raid10.
+ */
+ ASSERT(fs_devices->opened);
+ mutex_lock(&fs_devices->device_list_mutex);
+ mutex_lock(&fs_info->chunk_mutex);
+ /*
+ * By not failing for the reason that
+ * btrfs_open_one_device() could fail, it will
+ * keep the original behaviors as it is for now.
+ * That's fix me for later.
+ */
+ ret = btrfs_open_one_device(fs_devices, device, fmode,
+ fs_info->bdev_holder);
+ if (!ret) {
+ /*
+ * Making sure missing is set to 0
+ * only when bdev != NULL is as expected
+ * at most of the places in the code.
+ * Further, what if user fixes the
+ * dev and reruns the dev scan, then again
+ * we need to come here to reset missing.
+ */
+ fs_devices->missing_devices--;
+ device->missing = 0;
+ btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
+ btrfs_warn(fs_info,
+ "BTRFS: device %s devid %llu uuid %pU joined\n",
+ path, devid, device->uuid);
+ }
+
+ if (device->writeable &&
+ !device->is_tgtdev_for_dev_replace) {
+ fs_devices->total_rw_bytes +=
+ device->total_bytes;
+ atomic64_add(device->total_bytes -
+ device->bytes_used,
+ &fs_info->free_chunk_space);
+ }
+ device->in_fs_metadata = 1;
+ mutex_unlock(&fs_info->chunk_mutex);
+ mutex_unlock(&fs_devices->device_list_mutex);
}
}
--
2.7.0
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
2017-11-17 12:36 ` [PATCH v2] " Anand Jain
@ 2017-11-18 13:52 ` Goffredo Baroncelli
2017-11-20 8:19 ` Anand Jain
0 siblings, 1 reply; 14+ messages in thread
From: Goffredo Baroncelli @ 2017-11-18 13:52 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, Liu Bo
On 11/17/2017 01:36 PM, Anand Jain wrote:
> If the device is not present at the time of (-o degrade) mount,
> the mount context will create a dummy missing struct btrfs_device.
> Later this device may reappear after the FS is mounted and
> then device is included in the device list but it missed the
> open_device part. So this patch handles that case by going
> through the open_device steps which this device missed and finally
> adds to the device alloc list.
What happens if the first devices got writes before the "last device" is joined ?
Supposing to have a raid1 pair of devices: sda, sdb
- sda is dissappeared (usb detached ?)
- the filesystem is mounted as
mount -o degraded /dev/sdb
- some writes happens on /dev/sdb
- the user reattach /dev/sda
- udev run "btrfs dev scan"
- the system joins /dev/sda to the filesystem disks pool
Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data) or /dev/sda (old data), or worse read something from the former and something from the later (metadata from sda and data from sdb ) ???
Am I missing something ?
BR
G.Baroncelli
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
2017-11-18 13:52 ` Goffredo Baroncelli
@ 2017-11-20 8:19 ` Anand Jain
2017-11-20 19:28 ` Goffredo Baroncelli
0 siblings, 1 reply; 14+ messages in thread
From: Anand Jain @ 2017-11-20 8:19 UTC (permalink / raw)
To: kreijack; +Cc: linux-btrfs, Liu Bo
On 11/18/2017 09:52 PM, Goffredo Baroncelli wrote:
> On 11/17/2017 01:36 PM, Anand Jain wrote:
>> If the device is not present at the time of (-o degrade) mount,
>> the mount context will create a dummy missing struct btrfs_device.
>> Later this device may reappear after the FS is mounted and
>> then device is included in the device list but it missed the
>> open_device part. So this patch handles that case by going
>> through the open_device steps which this device missed and finally
>> adds to the device alloc list.
>
> What happens if the first devices got writes before the "last device" is joined ?
>
> Supposing to have a raid1 pair of devices: sda, sdb
>
> - sda is dissappeared (usb detached ?)
> - the filesystem is mounted as
> mount -o degraded /dev/sdb
> - some writes happens on /dev/sdb
> - the user reattach /dev/sda
> - udev run "btrfs dev scan"
> - the system joins /dev/sda to the filesystem disks pool
>
> Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data) or /dev/sda (old data), or worse read something from the former and something from the later (metadata from sda and data from sdb ) ???
Thanks for the test scenario, this case is fine as its read from
the disk having highest generation number, so we pick sdb in the
case above.
Thanks, Anand
> Am I missing something ?
>
> BR
> G.Baroncelli
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
2017-11-20 8:19 ` Anand Jain
@ 2017-11-20 19:28 ` Goffredo Baroncelli
2017-11-20 21:14 ` Anand Jain
2017-12-04 7:09 ` Anand Jain
0 siblings, 2 replies; 14+ messages in thread
From: Goffredo Baroncelli @ 2017-11-20 19:28 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, Liu Bo
On 11/20/2017 09:19 AM, Anand Jain wrote:
>
>
> On 11/18/2017 09:52 PM, Goffredo Baroncelli wrote:
>> On 11/17/2017 01:36 PM, Anand Jain wrote:
>>> If the device is not present at the time of (-o degrade) mount,
>>> the mount context will create a dummy missing struct btrfs_device.
>>> Later this device may reappear after the FS is mounted and
>>> then device is included in the device list but it missed the
>>> open_device part. So this patch handles that case by going
>>> through the open_device steps which this device missed and finally
>>> adds to the device alloc list.
>>
>> What happens if the first devices got writes before the "last device" is joined ?
>>
>> Supposing to have a raid1 pair of devices: sda, sdb
>>
>> - sda is dissappeared (usb detached ?)
>> - the filesystem is mounted as
>> mount -o degraded /dev/sdb
>> - some writes happens on /dev/sdb
>> - the user reattach /dev/sda
>> - udev run "btrfs dev scan"
>> - the system joins /dev/sda to the filesystem disks pool
>>
>> Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data) or /dev/sda (old data), or worse read something from the former and something from the later (metadata from sda and data from sdb ) ???
>
> Thanks for the test scenario, this case is fine as its read from
> the disk having highest generation number, so we pick sdb in the
> case above.
Ok, so in this case which is the benefit to add a disk ? With a lover number generation, the added will be used at all ?
In this case, would be better an explicit user intervention instead of an automatic one ?
>
> Thanks, Anand
>
>
>> Am I missing something ?
>>
>> BR
>> G.Baroncelli
>>
>
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
2017-11-20 19:28 ` Goffredo Baroncelli
@ 2017-11-20 21:14 ` Anand Jain
2017-12-04 7:09 ` Anand Jain
1 sibling, 0 replies; 14+ messages in thread
From: Anand Jain @ 2017-11-20 21:14 UTC (permalink / raw)
To: kreijack; +Cc: linux-btrfs, Liu Bo
On 11/21/2017 03:28 AM, Goffredo Baroncelli wrote:
> On 11/20/2017 09:19 AM, Anand Jain wrote:
>>
>>
>> On 11/18/2017 09:52 PM, Goffredo Baroncelli wrote:
>>> On 11/17/2017 01:36 PM, Anand Jain wrote:
>>>> If the device is not present at the time of (-o degrade) mount,
>>>> the mount context will create a dummy missing struct btrfs_device.
>>>> Later this device may reappear after the FS is mounted and
>>>> then device is included in the device list but it missed the
>>>> open_device part. So this patch handles that case by going
>>>> through the open_device steps which this device missed and finally
>>>> adds to the device alloc list.
>>>
>>> What happens if the first devices got writes before the "last device" is joined ?
>>>
>>> Supposing to have a raid1 pair of devices: sda, sdb
>>>
>>> - sda is dissappeared (usb detached ?)
>>> - the filesystem is mounted as
>>> mount -o degraded /dev/sdb
>>> - some writes happens on /dev/sdb
>>> - the user reattach /dev/sda
>>> - udev run "btrfs dev scan"
>>> - the system joins /dev/sda to the filesystem disks pool
>>>
>>> Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data) or /dev/sda (old data), or worse read something from the former and something from the later (metadata from sda and data from sdb ) ???
>>
>> Thanks for the test scenario, this case is fine as its read from
>> the disk having highest generation number, so we pick sdb in the
>> case above.
>
> Ok, so in this case which is the benefit to add a disk ?
> With a lover number generation, the added will be used at all ?
It will be used for new writes, it will stripe across both the disks.
Balance is only for the older single chunks, which in the long term
would go away when we are able to create a degraded raid1 chunks.
> In this case, would be better an explicit user intervention instead of an automatic one ?
Without this patch the disk already joins the raid group.
But it had a bug that it missed joining the alloc group.
Thanks, Anand
>
>>
>> Thanks, Anand
>>
>>
>>> Am I missing something ?
>>>
>>> BR
>>> G.Baroncelli
>>>
>>
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] btrfs: handle dynamically reappearing missing device
2017-11-17 11:53 ` Anand Jain
@ 2017-11-28 22:52 ` Liu Bo
0 siblings, 0 replies; 14+ messages in thread
From: Liu Bo @ 2017-11-28 22:52 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs
On Fri, Nov 17, 2017 at 07:53:29PM +0800, Anand Jain wrote:
>
>
> On 11/17/2017 07:28 AM, Liu Bo wrote:
> > On Sun, Nov 12, 2017 at 06:56:50PM +0800, Anand Jain wrote:
> > > If the device is not present at the time of (-o degrade) mount
> > > the mount context will create a dummy missing struct btrfs_device.
> > > Later this device may reappear after the FS is mounted.
>
> > This commit log doesn't explain what would happen when the missing
> > device re-appears.
>
> Hm. You mean in the current design without this patch.?
> Just nothing it gives a false impression to the user by
> 'btrfs dev ready' and 'btrfs fi show' that missing device
> has been included. Will update change log.
>
> > > So this
> > > patch handles that case by going through the open_device steps
>
> > > which this device missed and finally adds to the device alloc list.
> > >
> > > So now with this patch, to bring back the missing device user can run,
> > >
> > > btrfs dev scan <path-of-missing-device>
> >
> > Most common distros already have some udev rules of btrfs to process a
> > reappeared disk.
>
> Right. Do you see anything that will break ?
> Planning to add comments [1] in v2 for more clarity around this.
>
> > >
> > > Signed-off-by: Anand Jain <anand.jain@oracle.com>
> > > ---
> > > This patch needs:
> > > [PATCH 0/4] factor __btrfs_open_devices()
> > >
> > > fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> > > 1 file changed, 41 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > > index d24e966ee29f..e7dd996831f2 100644
> > > --- a/fs/btrfs/volumes.c
> > > +++ b/fs/btrfs/volumes.c
> > > @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
> > > rcu_string_free(device->name);
> > > rcu_assign_pointer(device->name, name);
> > > if (device->missing) {
> > > - fs_devices->missing_devices--;
> > > - device->missing = 0;
> > > + int ret;
> > > + struct btrfs_fs_info *fs_info = fs_devices->fs_info;
> > > + fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
> > > +
> > > + if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
> > > + fmode &= ~FMODE_WRITE;
> > > +
> > > + /*
> > > + * Missing can be set only when FS is mounted.
> > > + * So here its always fs_devices->opened > 0 and most
> > > + * of the struct device members are already updated by
> > > + * the mount process even if this device was missing, so
> > > + * now follow the normal open device procedure for this
> > > + * device. The scrub will take care of filling the
> > > + * missing stripes for raid56 and balance for raid1 and
> > > + * raid10.
> > > + */
> >
> >
> > The idea looks good to me.
> >
> > What if users skip balance/scrub given both could take a while?
> >
> > Then btrfs takes the disks back and reads content from it, and it's OK
> > for raid1/10 as they may have a good copy,
>
> Yes, except for the split brain situation for which patches are
> in the ML, review comments appreciated.
>
> > but in case of raid56, it
> > might hit the reconstruction bug if two disks are missing and added
> > back, which I tried to address recently.
>
> > At least ->writable should not be set until balance/scrub completes
> > the re-sync job.
>
> Hm. Why ?
>
(Sorry for the late reply.)
On second thought, write is OK since parity will be calculated along
writes.
Will check the v2.
Thanks,
-liubo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
2017-11-20 19:28 ` Goffredo Baroncelli
2017-11-20 21:14 ` Anand Jain
@ 2017-12-04 7:09 ` Anand Jain
1 sibling, 0 replies; 14+ messages in thread
From: Anand Jain @ 2017-12-04 7:09 UTC (permalink / raw)
To: kreijack; +Cc: linux-btrfs, Liu Bo
> [..] would be better an explicit user intervention instead of an automatic one ?
What is the user intervention method steps that you have in mind ?
Just curious. Pls remember downtime is not a choice of recovery
from this context which means FS should be available for the
applications to perform read/write.
-o remount was one this which I had skipped for sometime as
theoretically it can't solve the problem (to bring back missing
device) as well, but now I have verified it, it won't.
There is no record of what is the original idea to perform this
basic volume manager step.
Thanks, Anand
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2017-12-04 7:09 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
2017-11-15 7:38 ` kbuild test robot
2017-11-15 10:18 ` Anand Jain
2017-11-16 19:08 ` Nikolay Borisov
2017-11-17 7:46 ` Anand Jain
2017-11-16 23:28 ` Liu Bo
2017-11-17 11:53 ` Anand Jain
2017-11-28 22:52 ` Liu Bo
2017-11-17 12:36 ` [PATCH v2] " Anand Jain
2017-11-18 13:52 ` Goffredo Baroncelli
2017-11-20 8:19 ` Anand Jain
2017-11-20 19:28 ` Goffredo Baroncelli
2017-11-20 21:14 ` Anand Jain
2017-12-04 7:09 ` Anand Jain
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).