linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] btrfs: handle dynamically reappearing missing device
@ 2017-11-12 10:56 Anand Jain
  2017-11-15  7:38 ` kbuild test robot
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Anand Jain @ 2017-11-12 10:56 UTC (permalink / raw)
  To: linux-btrfs

If the device is not present at the time of (-o degrade) mount
the mount context will create a dummy missing struct btrfs_device.
Later this device may reappear after the FS is mounted. So this
patch handles that case by going through the open_device steps
which this device missed and finally adds to the device alloc list.

So now with this patch, to bring back the missing device user can run,

   btrfs dev scan <path-of-missing-device>

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
This patch needs:
 [PATCH 0/4]  factor __btrfs_open_devices()

 fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d24e966ee29f..e7dd996831f2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
 		rcu_string_free(device->name);
 		rcu_assign_pointer(device->name, name);
 		if (device->missing) {
-			fs_devices->missing_devices--;
-			device->missing = 0;
+			int ret;
+			struct btrfs_fs_info *fs_info = fs_devices->fs_info;
+			fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
+
+			if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
+				fmode &= ~FMODE_WRITE;
+
+			/*
+			 * Missing can be set only when FS is mounted.
+			 * So here its always fs_devices->opened > 0 and most
+			 * of the struct device members are already updated by
+			 * the mount process even if this device was missing, so
+			 * now follow the normal open device procedure for this
+			 * device. The scrub will take care of filling the
+			 * missing stripes for raid56 and balance for raid1 and
+			 * raid10.
+			 */
+			ASSERT(fs_devices->opened);
+			mutex_lock(&fs_devices->device_list_mutex);
+			mutex_lock(&fs_info->chunk_mutex);
+			ret = btrfs_open_one_device(fs_devices, device, fmode,
+							fs_info->bdev_holder);
+			if (!ret) {
+				fs_devices->missing_devices--;
+				device->missing = 0;
+				btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
+				btrfs_warn(fs_info,
+					"BTRFS: device %s devid %llu uuid %pU joined\n",
+					path, devid, device->uuid);
+			}
+
+			if (device->writeable &&
+					!device->is_tgtdev_for_dev_replace) {
+				fs_devices->total_rw_bytes += device->total_bytes;
+				atomic64_add(device->total_bytes -
+						device->bytes_used,
+						&fs_info->free_chunk_space);
+			}
+			device->in_fs_metadata = 1;
+			mutex_unlock(&fs_devices->fs_info->chunk_mutex);
+			mutex_unlock(&fs_devices->device_list_mutex);
 		}
 	}
 
-- 
2.13.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] btrfs: handle dynamically reappearing missing device
  2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
@ 2017-11-15  7:38 ` kbuild test robot
  2017-11-15 10:18   ` Anand Jain
  2017-11-16 19:08 ` Nikolay Borisov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: kbuild test robot @ 2017-11-15  7:38 UTC (permalink / raw)
  To: Anand Jain; +Cc: kbuild-all, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 7598 bytes --]

Hi Anand,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on btrfs/next]
[also build test ERROR on v4.14 next-20171114]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Anand-Jain/btrfs-handle-dynamically-reappearing-missing-device/20171115-143047
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git next
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64 

All errors (new ones prefixed by >>):

   fs/btrfs/volumes.c: In function 'device_list_add':
>> fs/btrfs/volumes.c:732:10: error: implicit declaration of function 'btrfs_open_one_device'; did you mean 'btrfs_scan_one_device'? [-Werror=implicit-function-declaration]
       ret = btrfs_open_one_device(fs_devices, device, fmode,
             ^~~~~~~~~~~~~~~~~~~~~
             btrfs_scan_one_device
   cc1: some warnings being treated as errors

vim +732 fs/btrfs/volumes.c

   610	
   611	/*
   612	 * Add new device to list of registered devices
   613	 *
   614	 * Returns:
   615	 * 1   - first time device is seen
   616	 * 0   - device already known
   617	 * < 0 - error
   618	 */
   619	static noinline int device_list_add(const char *path,
   620				   struct btrfs_super_block *disk_super,
   621				   u64 devid, struct btrfs_fs_devices **fs_devices_ret)
   622	{
   623		struct btrfs_device *device;
   624		struct btrfs_fs_devices *fs_devices;
   625		struct rcu_string *name;
   626		int ret = 0;
   627		u64 found_transid = btrfs_super_generation(disk_super);
   628	
   629		fs_devices = find_fsid(disk_super->fsid);
   630		if (!fs_devices) {
   631			fs_devices = alloc_fs_devices(disk_super->fsid);
   632			if (IS_ERR(fs_devices))
   633				return PTR_ERR(fs_devices);
   634	
   635			list_add(&fs_devices->list, &fs_uuids);
   636	
   637			device = NULL;
   638		} else {
   639			device = __find_device(&fs_devices->devices, devid,
   640					       disk_super->dev_item.uuid);
   641		}
   642	
   643		if (!device) {
   644			if (fs_devices->opened)
   645				return -EBUSY;
   646	
   647			device = btrfs_alloc_device(NULL, &devid,
   648						    disk_super->dev_item.uuid);
   649			if (IS_ERR(device)) {
   650				/* we can safely leave the fs_devices entry around */
   651				return PTR_ERR(device);
   652			}
   653	
   654			name = rcu_string_strdup(path, GFP_NOFS);
   655			if (!name) {
   656				kfree(device);
   657				return -ENOMEM;
   658			}
   659			rcu_assign_pointer(device->name, name);
   660	
   661			mutex_lock(&fs_devices->device_list_mutex);
   662			list_add_rcu(&device->dev_list, &fs_devices->devices);
   663			fs_devices->num_devices++;
   664			mutex_unlock(&fs_devices->device_list_mutex);
   665	
   666			ret = 1;
   667			device->fs_devices = fs_devices;
   668		} else if (!device->name || strcmp(device->name->str, path)) {
   669			/*
   670			 * When FS is already mounted.
   671			 * 1. If you are here and if the device->name is NULL that
   672			 *    means this device was missing at time of FS mount.
   673			 * 2. If you are here and if the device->name is different
   674			 *    from 'path' that means either
   675			 *      a. The same device disappeared and reappeared with
   676			 *         different name. or
   677			 *      b. The missing-disk-which-was-replaced, has
   678			 *         reappeared now.
   679			 *
   680			 * We must allow 1 and 2a above. But 2b would be a spurious
   681			 * and unintentional.
   682			 *
   683			 * Further in case of 1 and 2a above, the disk at 'path'
   684			 * would have missed some transaction when it was away and
   685			 * in case of 2a the stale bdev has to be updated as well.
   686			 * 2b must not be allowed at all time.
   687			 */
   688	
   689			/*
   690			 * For now, we do allow update to btrfs_fs_device through the
   691			 * btrfs dev scan cli after FS has been mounted.  We're still
   692			 * tracking a problem where systems fail mount by subvolume id
   693			 * when we reject replacement on a mounted FS.
   694			 */
   695			if (!fs_devices->opened && found_transid < device->generation) {
   696				/*
   697				 * That is if the FS is _not_ mounted and if you
   698				 * are here, that means there is more than one
   699				 * disk with same uuid and devid.We keep the one
   700				 * with larger generation number or the last-in if
   701				 * generation are equal.
   702				 */
   703				return -EEXIST;
   704			}
   705	
   706			name = rcu_string_strdup(path, GFP_NOFS);
   707			if (!name)
   708				return -ENOMEM;
   709			rcu_string_free(device->name);
   710			rcu_assign_pointer(device->name, name);
   711			if (device->missing) {
   712				int ret;
   713				struct btrfs_fs_info *fs_info = fs_devices->fs_info;
   714				fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
   715	
   716				if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
   717					fmode &= ~FMODE_WRITE;
   718	
   719				/*
   720				 * Missing can be set only when FS is mounted.
   721				 * So here its always fs_devices->opened > 0 and most
   722				 * of the struct device members are already updated by
   723				 * the mount process even if this device was missing, so
   724				 * now follow the normal open device procedure for this
   725				 * device. The scrub will take care of filling the
   726				 * missing stripes for raid56 and balance for raid1 and
   727				 * raid10.
   728				 */
   729				ASSERT(fs_devices->opened);
   730				mutex_lock(&fs_devices->device_list_mutex);
   731				mutex_lock(&fs_info->chunk_mutex);
 > 732				ret = btrfs_open_one_device(fs_devices, device, fmode,
   733								fs_info->bdev_holder);
   734				if (!ret) {
   735					fs_devices->missing_devices--;
   736					device->missing = 0;
   737					btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
   738					btrfs_warn(fs_info,
   739						"BTRFS: device %s devid %llu uuid %pU joined\n",
   740						path, devid, device->uuid);
   741				}
   742	
   743				if (device->writeable &&
   744						!device->is_tgtdev_for_dev_replace) {
   745					fs_devices->total_rw_bytes += device->total_bytes;
   746					atomic64_add(device->total_bytes -
   747							device->bytes_used,
   748							&fs_info->free_chunk_space);
   749				}
   750				device->in_fs_metadata = 1;
   751				mutex_unlock(&fs_devices->fs_info->chunk_mutex);
   752				mutex_unlock(&fs_devices->device_list_mutex);
   753			}
   754		}
   755	
   756		/*
   757		 * Unmount does not free the btrfs_device struct but would zero
   758		 * generation along with most of the other members. So just update
   759		 * it back. We need it to pick the disk with largest generation
   760		 * (as above).
   761		 */
   762		if (!fs_devices->opened)
   763			device->generation = found_transid;
   764	
   765		/*
   766		 * if there is new btrfs on an already registered device,
   767		 * then remove the stale device entry.
   768		 */
   769		if (ret > 0)
   770			btrfs_free_stale_device(device);
   771	
   772		*fs_devices_ret = fs_devices;
   773	
   774		return ret;
   775	}
   776	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 51512 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] btrfs: handle dynamically reappearing missing device
  2017-11-15  7:38 ` kbuild test robot
@ 2017-11-15 10:18   ` Anand Jain
  0 siblings, 0 replies; 14+ messages in thread
From: Anand Jain @ 2017-11-15 10:18 UTC (permalink / raw)
  To: kbuild test robot; +Cc: kbuild-all, linux-btrfs



On 11/15/2017 03:38 PM, kbuild test robot wrote:
> Hi Anand,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on btrfs/next]
> [also build test ERROR on v4.14 next-20171114]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Anand-Jain/btrfs-handle-dynamically-reappearing-missing-device/20171115-143047
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git next
> config: sparc64-allyesconfig (attached as .config)
> compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> reproduce:
>          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>          chmod +x ~/bin/make.cross
>          # save the attached .config to linux build tree
>          make.cross ARCH=sparc64
> 
> All errors (new ones prefixed by >>):
> 
>     fs/btrfs/volumes.c: In function 'device_list_add':
>>> fs/btrfs/volumes.c:732:10: error: implicit declaration of function 'btrfs_open_one_device'; did you mean 'btrfs_scan_one_device'? [-Werror=implicit-function-declaration]
>         ret = btrfs_open_one_device(fs_devices, device, fmode,
>               ^~~~~~~~~~~~~~~~~~~~~
>               btrfs_scan_one_device
>     cc1: some warnings being treated as errors


Its missing this patch set in the ML which was sent separately.
  [PATCH 0/4] factor __btrfs_open_devices()

Thanks, Anand


> vim +732 fs/btrfs/volumes.c
> 
>     610	
>     611	/*
>     612	 * Add new device to list of registered devices
>     613	 *
>     614	 * Returns:
>     615	 * 1   - first time device is seen
>     616	 * 0   - device already known
>     617	 * < 0 - error
>     618	 */
>     619	static noinline int device_list_add(const char *path,
>     620				   struct btrfs_super_block *disk_super,
>     621				   u64 devid, struct btrfs_fs_devices **fs_devices_ret)
>     622	{
>     623		struct btrfs_device *device;
>     624		struct btrfs_fs_devices *fs_devices;
>     625		struct rcu_string *name;
>     626		int ret = 0;
>     627		u64 found_transid = btrfs_super_generation(disk_super);
>     628	
>     629		fs_devices = find_fsid(disk_super->fsid);
>     630		if (!fs_devices) {
>     631			fs_devices = alloc_fs_devices(disk_super->fsid);
>     632			if (IS_ERR(fs_devices))
>     633				return PTR_ERR(fs_devices);
>     634	
>     635			list_add(&fs_devices->list, &fs_uuids);
>     636	
>     637			device = NULL;
>     638		} else {
>     639			device = __find_device(&fs_devices->devices, devid,
>     640					       disk_super->dev_item.uuid);
>     641		}
>     642	
>     643		if (!device) {
>     644			if (fs_devices->opened)
>     645				return -EBUSY;
>     646	
>     647			device = btrfs_alloc_device(NULL, &devid,
>     648						    disk_super->dev_item.uuid);
>     649			if (IS_ERR(device)) {
>     650				/* we can safely leave the fs_devices entry around */
>     651				return PTR_ERR(device);
>     652			}
>     653	
>     654			name = rcu_string_strdup(path, GFP_NOFS);
>     655			if (!name) {
>     656				kfree(device);
>     657				return -ENOMEM;
>     658			}
>     659			rcu_assign_pointer(device->name, name);
>     660	
>     661			mutex_lock(&fs_devices->device_list_mutex);
>     662			list_add_rcu(&device->dev_list, &fs_devices->devices);
>     663			fs_devices->num_devices++;
>     664			mutex_unlock(&fs_devices->device_list_mutex);
>     665	
>     666			ret = 1;
>     667			device->fs_devices = fs_devices;
>     668		} else if (!device->name || strcmp(device->name->str, path)) {
>     669			/*
>     670			 * When FS is already mounted.
>     671			 * 1. If you are here and if the device->name is NULL that
>     672			 *    means this device was missing at time of FS mount.
>     673			 * 2. If you are here and if the device->name is different
>     674			 *    from 'path' that means either
>     675			 *      a. The same device disappeared and reappeared with
>     676			 *         different name. or
>     677			 *      b. The missing-disk-which-was-replaced, has
>     678			 *         reappeared now.
>     679			 *
>     680			 * We must allow 1 and 2a above. But 2b would be a spurious
>     681			 * and unintentional.
>     682			 *
>     683			 * Further in case of 1 and 2a above, the disk at 'path'
>     684			 * would have missed some transaction when it was away and
>     685			 * in case of 2a the stale bdev has to be updated as well.
>     686			 * 2b must not be allowed at all time.
>     687			 */
>     688	
>     689			/*
>     690			 * For now, we do allow update to btrfs_fs_device through the
>     691			 * btrfs dev scan cli after FS has been mounted.  We're still
>     692			 * tracking a problem where systems fail mount by subvolume id
>     693			 * when we reject replacement on a mounted FS.
>     694			 */
>     695			if (!fs_devices->opened && found_transid < device->generation) {
>     696				/*
>     697				 * That is if the FS is _not_ mounted and if you
>     698				 * are here, that means there is more than one
>     699				 * disk with same uuid and devid.We keep the one
>     700				 * with larger generation number or the last-in if
>     701				 * generation are equal.
>     702				 */
>     703				return -EEXIST;
>     704			}
>     705	
>     706			name = rcu_string_strdup(path, GFP_NOFS);
>     707			if (!name)
>     708				return -ENOMEM;
>     709			rcu_string_free(device->name);
>     710			rcu_assign_pointer(device->name, name);
>     711			if (device->missing) {
>     712				int ret;
>     713				struct btrfs_fs_info *fs_info = fs_devices->fs_info;
>     714				fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
>     715	
>     716				if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
>     717					fmode &= ~FMODE_WRITE;
>     718	
>     719				/*
>     720				 * Missing can be set only when FS is mounted.
>     721				 * So here its always fs_devices->opened > 0 and most
>     722				 * of the struct device members are already updated by
>     723				 * the mount process even if this device was missing, so
>     724				 * now follow the normal open device procedure for this
>     725				 * device. The scrub will take care of filling the
>     726				 * missing stripes for raid56 and balance for raid1 and
>     727				 * raid10.
>     728				 */
>     729				ASSERT(fs_devices->opened);
>     730				mutex_lock(&fs_devices->device_list_mutex);
>     731				mutex_lock(&fs_info->chunk_mutex);
>   > 732				ret = btrfs_open_one_device(fs_devices, device, fmode,
>     733								fs_info->bdev_holder);
>     734				if (!ret) {
>     735					fs_devices->missing_devices--;
>     736					device->missing = 0;
>     737					btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
>     738					btrfs_warn(fs_info,
>     739						"BTRFS: device %s devid %llu uuid %pU joined\n",
>     740						path, devid, device->uuid);
>     741				}
>     742	
>     743				if (device->writeable &&
>     744						!device->is_tgtdev_for_dev_replace) {
>     745					fs_devices->total_rw_bytes += device->total_bytes;
>     746					atomic64_add(device->total_bytes -
>     747							device->bytes_used,
>     748							&fs_info->free_chunk_space);
>     749				}
>     750				device->in_fs_metadata = 1;
>     751				mutex_unlock(&fs_devices->fs_info->chunk_mutex);
>     752				mutex_unlock(&fs_devices->device_list_mutex);
>     753			}
>     754		}
>     755	
>     756		/*
>     757		 * Unmount does not free the btrfs_device struct but would zero
>     758		 * generation along with most of the other members. So just update
>     759		 * it back. We need it to pick the disk with largest generation
>     760		 * (as above).
>     761		 */
>     762		if (!fs_devices->opened)
>     763			device->generation = found_transid;
>     764	
>     765		/*
>     766		 * if there is new btrfs on an already registered device,
>     767		 * then remove the stale device entry.
>     768		 */
>     769		if (ret > 0)
>     770			btrfs_free_stale_device(device);
>     771	
>     772		*fs_devices_ret = fs_devices;
>     773	
>     774		return ret;
>     775	}
>     776	
> 
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] btrfs: handle dynamically reappearing missing device
  2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
  2017-11-15  7:38 ` kbuild test robot
@ 2017-11-16 19:08 ` Nikolay Borisov
  2017-11-17  7:46   ` Anand Jain
  2017-11-16 23:28 ` Liu Bo
  2017-11-17 12:36 ` [PATCH v2] " Anand Jain
  3 siblings, 1 reply; 14+ messages in thread
From: Nikolay Borisov @ 2017-11-16 19:08 UTC (permalink / raw)
  To: Anand Jain, linux-btrfs



On 12.11.2017 12:56, Anand Jain wrote:
> If the device is not present at the time of (-o degrade) mount
> the mount context will create a dummy missing struct btrfs_device.
> Later this device may reappear after the FS is mounted. So this
> patch handles that case by going through the open_device steps
> which this device missed and finally adds to the device alloc list.
> 
> So now with this patch, to bring back the missing device user can run,
> 
>    btrfs dev scan <path-of-missing-device>
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> This patch needs:
>  [PATCH 0/4]  factor __btrfs_open_devices()
> 
>  fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index d24e966ee29f..e7dd996831f2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
>  		rcu_string_free(device->name);
>  		rcu_assign_pointer(device->name, name);
>  		if (device->missing) {
> -			fs_devices->missing_devices--;
> -			device->missing = 0;
> +			int ret;
> +			struct btrfs_fs_info *fs_info = fs_devices->fs_info;
> +			fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
> +
> +			if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
> +				fmode &= ~FMODE_WRITE;
> +
> +			/*
> +			 * Missing can be set only when FS is mounted.
> +			 * So here its always fs_devices->opened > 0 and most
> +			 * of the struct device members are already updated by
> +			 * the mount process even if this device was missing, so
> +			 * now follow the normal open device procedure for this
> +			 * device. The scrub will take care of filling the
> +			 * missing stripes for raid56 and balance for raid1 and
> +			 * raid10.
> +			 */
> +			ASSERT(fs_devices->opened);
> +			mutex_lock(&fs_devices->device_list_mutex);
> +			mutex_lock(&fs_info->chunk_mutex);
> +			ret = btrfs_open_one_device(fs_devices, device, fmode,
> +							fs_info->bdev_holder);
> +			if (!ret) {
> +				fs_devices->missing_devices--;
> +				device->missing = 0;
> +				btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
> +				btrfs_warn(fs_info,
> +					"BTRFS: device %s devid %llu uuid %pU joined\n",
> +					path, devid, device->uuid);
> +			}
> +
> +			if (device->writeable &&
> +					!device->is_tgtdev_for_dev_replace) {
> +				fs_devices->total_rw_bytes += device->total_bytes;
> +				atomic64_add(device->total_bytes -
> +						device->bytes_used,
> +						&fs_info->free_chunk_space);
> +			}
> +			device->in_fs_metadata = 1;
> +			mutex_unlock(&fs_devices->fs_info->chunk_mutex);
> +			mutex_unlock(&fs_devices->device_list_mutex);

nit: You did add the fs_info local var, so for consistency's sake use that
>  		}
>  	}
>  
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] btrfs: handle dynamically reappearing missing device
  2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
  2017-11-15  7:38 ` kbuild test robot
  2017-11-16 19:08 ` Nikolay Borisov
@ 2017-11-16 23:28 ` Liu Bo
  2017-11-17 11:53   ` Anand Jain
  2017-11-17 12:36 ` [PATCH v2] " Anand Jain
  3 siblings, 1 reply; 14+ messages in thread
From: Liu Bo @ 2017-11-16 23:28 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Sun, Nov 12, 2017 at 06:56:50PM +0800, Anand Jain wrote:
> If the device is not present at the time of (-o degrade) mount
> the mount context will create a dummy missing struct btrfs_device.
> Later this device may reappear after the FS is mounted.

This commit log doesn't explain what would happen when the missing
device re-appears.

> So this
> patch handles that case by going through the open_device steps
> which this device missed and finally adds to the device alloc list.
> 
> So now with this patch, to bring back the missing device user can run,
> 
>    btrfs dev scan <path-of-missing-device>

Most common distros already have some udev rules of btrfs to process a
reappeared disk.

> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> This patch needs:
>  [PATCH 0/4]  factor __btrfs_open_devices()
> 
>  fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index d24e966ee29f..e7dd996831f2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
>  		rcu_string_free(device->name);
>  		rcu_assign_pointer(device->name, name);
>  		if (device->missing) {
> -			fs_devices->missing_devices--;
> -			device->missing = 0;
> +			int ret;
> +			struct btrfs_fs_info *fs_info = fs_devices->fs_info;
> +			fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
> +
> +			if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
> +				fmode &= ~FMODE_WRITE;
> +
> +			/*
> +			 * Missing can be set only when FS is mounted.
> +			 * So here its always fs_devices->opened > 0 and most
> +			 * of the struct device members are already updated by
> +			 * the mount process even if this device was missing, so
> +			 * now follow the normal open device procedure for this
> +			 * device. The scrub will take care of filling the
> +			 * missing stripes for raid56 and balance for raid1 and
> +			 * raid10.
> +			 */


The idea looks good to me.

What if users skip balance/scrub given both could take a while?

Then btrfs takes the disks back and reads content from it, and it's OK
for raid1/10 as they may have a good copy, but in case of raid56, it
might hit the reconstruction bug if two disks are missing and added
back, which I tried to address recently.

At least ->writable should not be set until balance/scrub completes
the re-sync job.

Thanks,

-liubo

> +			ASSERT(fs_devices->opened);
> +			mutex_lock(&fs_devices->device_list_mutex);
> +			mutex_lock(&fs_info->chunk_mutex);
> +			ret = btrfs_open_one_device(fs_devices, device, fmode,
> +							fs_info->bdev_holder);
> +			if (!ret) {
> +				fs_devices->missing_devices--;
> +				device->missing = 0;
> +				btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
> +				btrfs_warn(fs_info,
> +					"BTRFS: device %s devid %llu uuid %pU joined\n",
> +					path, devid, device->uuid);
> +			}
> +
> +			if (device->writeable &&
> +					!device->is_tgtdev_for_dev_replace) {
> +				fs_devices->total_rw_bytes += device->total_bytes;
> +				atomic64_add(device->total_bytes -
> +						device->bytes_used,
> +						&fs_info->free_chunk_space);
> +			}
> +			device->in_fs_metadata = 1;
> +			mutex_unlock(&fs_devices->fs_info->chunk_mutex);
> +			mutex_unlock(&fs_devices->device_list_mutex);
>  		}
>  	}
>  
> -- 
> 2.13.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] btrfs: handle dynamically reappearing missing device
  2017-11-16 19:08 ` Nikolay Borisov
@ 2017-11-17  7:46   ` Anand Jain
  0 siblings, 0 replies; 14+ messages in thread
From: Anand Jain @ 2017-11-17  7:46 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs



On 11/17/2017 03:08 AM, Nikolay Borisov wrote:
> 
> 
> On 12.11.2017 12:56, Anand Jain wrote:
>> If the device is not present at the time of (-o degrade) mount
>> the mount context will create a dummy missing struct btrfs_device.
>> Later this device may reappear after the FS is mounted. So this
>> patch handles that case by going through the open_device steps
>> which this device missed and finally adds to the device alloc list.
>>
>> So now with this patch, to bring back the missing device user can run,
>>
>>     btrfs dev scan <path-of-missing-device>
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> This patch needs:
>>   [PATCH 0/4]  factor __btrfs_open_devices()
>>
>>   fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index d24e966ee29f..e7dd996831f2 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
>>   		rcu_string_free(device->name);
>>   		rcu_assign_pointer(device->name, name);
>>   		if (device->missing) {
>> -			fs_devices->missing_devices--;
>> -			device->missing = 0;
>> +			int ret;
>> +			struct btrfs_fs_info *fs_info = fs_devices->fs_info;
>> +			fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
>> +
>> +			if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
>> +				fmode &= ~FMODE_WRITE;
>> +
>> +			/*
>> +			 * Missing can be set only when FS is mounted.
>> +			 * So here its always fs_devices->opened > 0 and most
>> +			 * of the struct device members are already updated by
>> +			 * the mount process even if this device was missing, so
>> +			 * now follow the normal open device procedure for this
>> +			 * device. The scrub will take care of filling the
>> +			 * missing stripes for raid56 and balance for raid1 and
>> +			 * raid10.
>> +			 */
>> +			ASSERT(fs_devices->opened);
>> +			mutex_lock(&fs_devices->device_list_mutex);
>> +			mutex_lock(&fs_info->chunk_mutex);
>> +			ret = btrfs_open_one_device(fs_devices, device, fmode,
>> +							fs_info->bdev_holder);
>> +			if (!ret) {
>> +				fs_devices->missing_devices--;
>> +				device->missing = 0;
>> +				btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
>> +				btrfs_warn(fs_info,
>> +					"BTRFS: device %s devid %llu uuid %pU joined\n",
>> +					path, devid, device->uuid);
>> +			}
>> +
>> +			if (device->writeable &&
>> +					!device->is_tgtdev_for_dev_replace) {
>> +				fs_devices->total_rw_bytes += device->total_bytes;
>> +				atomic64_add(device->total_bytes -
>> +						device->bytes_used,
>> +						&fs_info->free_chunk_space);
>> +			}
>> +			device->in_fs_metadata = 1;
>> +			mutex_unlock(&fs_devices->fs_info->chunk_mutex);
>> +			mutex_unlock(&fs_devices->device_list_mutex);
> 
> nit: You did add the fs_info local var, so for consistency's sake use that

  Thanks. Will fix it.

-Anand

>>   		}
>>   	}
>>   
>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] btrfs: handle dynamically reappearing missing device
  2017-11-16 23:28 ` Liu Bo
@ 2017-11-17 11:53   ` Anand Jain
  2017-11-28 22:52     ` Liu Bo
  0 siblings, 1 reply; 14+ messages in thread
From: Anand Jain @ 2017-11-17 11:53 UTC (permalink / raw)
  To: bo.li.liu; +Cc: linux-btrfs



On 11/17/2017 07:28 AM, Liu Bo wrote:
> On Sun, Nov 12, 2017 at 06:56:50PM +0800, Anand Jain wrote:
>> If the device is not present at the time of (-o degrade) mount
>> the mount context will create a dummy missing struct btrfs_device.
>> Later this device may reappear after the FS is mounted.

> This commit log doesn't explain what would happen when the missing
> device re-appears.

  Hm. You mean in the current design without this patch.?
  Just nothing it gives a false impression to the user by
'btrfs dev ready' and 'btrfs fi show' that missing device
  has been included. Will update change log.

>> So this
>> patch handles that case by going through the open_device steps

>> which this device missed and finally adds to the device alloc list.
>>
>> So now with this patch, to bring back the missing device user can run,
>>
>>     btrfs dev scan <path-of-missing-device>
> 
> Most common distros already have some udev rules of btrfs to process a
> reappeared disk.

  Right. Do you see anything that will break ?
  Planning to add comments [1] in v2 for more clarity around this.

>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> This patch needs:
>>   [PATCH 0/4]  factor __btrfs_open_devices()
>>
>>   fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index d24e966ee29f..e7dd996831f2 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
>>   		rcu_string_free(device->name);
>>   		rcu_assign_pointer(device->name, name);
>>   		if (device->missing) {
>> -			fs_devices->missing_devices--;
>> -			device->missing = 0;
>> +			int ret;
>> +			struct btrfs_fs_info *fs_info = fs_devices->fs_info;
>> +			fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
>> +
>> +			if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
>> +				fmode &= ~FMODE_WRITE;
>> +
>> +			/*
>> +			 * Missing can be set only when FS is mounted.
>> +			 * So here its always fs_devices->opened > 0 and most
>> +			 * of the struct device members are already updated by
>> +			 * the mount process even if this device was missing, so
>> +			 * now follow the normal open device procedure for this
>> +			 * device. The scrub will take care of filling the
>> +			 * missing stripes for raid56 and balance for raid1 and
>> +			 * raid10.
>> +			 */
> 
> 
> The idea looks good to me.
> 
> What if users skip balance/scrub given both could take a while?
> 
> Then btrfs takes the disks back and reads content from it, and it's OK
> for raid1/10 as they may have a good copy, 

  Yes, except for the split brain situation for which patches are
  in the ML, review comments appreciated.

> but in case of raid56, it
> might hit the reconstruction bug if two disks are missing and added
> back, which I tried to address recently.

> At least ->writable should not be set until balance/scrub completes
> the re-sync job.

  Hm. Why ?

> Thanks,
> 
> -liubo
> 
>> +			ASSERT(fs_devices->opened);
>> +			mutex_lock(&fs_devices->device_list_mutex);
>> +			mutex_lock(&fs_info->chunk_mutex);


[1]
			/*
			 * By not failing for the
			 * reason that btrfs_open_one_device() could
			 * fail, it will keep the original behaviors as
			 * it is for now. That's fix me for later.
			 */
>> +			ret = btrfs_open_one_device(fs_devices, device, fmode,
>> +							fs_info->bdev_holder);
>> +			if (!ret) {

				/*
				 * Making sure missing is set to 0
				 * only when bdev != NULL is as expected
				 * at most of the places in the code.
				 * Further, what if user fixes the
				 * dev and reruns dev scan, then again
				 * we need to come here.
				 */

>> +				fs_devices->missing_devices--;
>> +				device->missing = 0;
>> +				btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
>> +				btrfs_warn(fs_info,
>> +					"BTRFS: device %s devid %llu uuid %pU joined\n",
>> +					path, devid, device->uuid);
>> +			}

  Also added check for missing as below in v2.

--------------
@@ -725,7 +725,9 @@ static noinline int device_list_add(const char *path,

                 ret = 1;
                 device->fs_devices = fs_devices;
-       } else if (!device->name || strcmp(device->name->str, path)) {
+       } else if (!device->name ||
+                       strcmp(device->name->str, path) ||
+                       device->missing) {
                 /*
                  * When FS is already mounted.
                  * 1. If you are here and if the device->name is NULL that
------------

Thanks, Anand

>> +
>> +			if (device->writeable &&
>> +					!device->is_tgtdev_for_dev_replace) {
>> +				fs_devices->total_rw_bytes += device->total_bytes;
>> +				atomic64_add(device->total_bytes -
>> +						device->bytes_used,
>> +						&fs_info->free_chunk_space);
>> +			}
>> +			device->in_fs_metadata = 1;
>> +			mutex_unlock(&fs_devices->fs_info->chunk_mutex);
>> +			mutex_unlock(&fs_devices->device_list_mutex);
>>   		}
>>   	}
>>   
>> -- 
>> 2.13.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2] btrfs: handle dynamically reappearing missing device
  2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
                   ` (2 preceding siblings ...)
  2017-11-16 23:28 ` Liu Bo
@ 2017-11-17 12:36 ` Anand Jain
  2017-11-18 13:52   ` Goffredo Baroncelli
  3 siblings, 1 reply; 14+ messages in thread
From: Anand Jain @ 2017-11-17 12:36 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Anand Jain

If the device is not present at the time of (-o degrade) mount,
the mount context will create a dummy missing struct btrfs_device.
Later this device may reappear after the FS is mounted and
then device is included in the device list but it missed the
open_device part. So this patch handles that case by going
through the open_device steps which this device missed and finally
adds to the device alloc list.

So now with this patch, to bring back the missing device user can run,

   btrfs dev scan <path-of-missing-device>

Without this kernel patch, even though 'btrfs fi show' and 'btrfs
dev ready' would tell you that missing device has reappeared
successfully but actually in kernel FS layer it didn't.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
This patch needs:
 [PATCH 0/4]  factor __btrfs_open_devices()

v2:
Add more comments.
Add more change log.
Add to check if device missing is set, to handle the case
dev open fail and user will rerun the dev scan.

 fs/btrfs/volumes.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 60 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5bd73edc2602..b3224baa6db4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -725,7 +725,9 @@ static noinline int device_list_add(const char *path,
 
 		ret = 1;
 		device->fs_devices = fs_devices;
-	} else if (!device->name || strcmp(device->name->str, path)) {
+	} else if (!device->name ||
+			strcmp(device->name->str, path) ||
+			device->missing) {
 		/*
 		 * When FS is already mounted.
 		 * 1. If you are here and if the device->name is NULL that
@@ -769,8 +771,63 @@ static noinline int device_list_add(const char *path,
 		rcu_string_free(device->name);
 		rcu_assign_pointer(device->name, name);
 		if (device->missing) {
-			fs_devices->missing_devices--;
-			device->missing = 0;
+			int ret;
+			struct btrfs_fs_info *fs_info = fs_devices->fs_info;
+			fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
+
+			if (btrfs_super_flags(disk_super) &
+					BTRFS_SUPER_FLAG_SEEDING)
+				fmode &= ~FMODE_WRITE;
+
+			/*
+			 * Missing can be set only when FS is mounted.
+			 * So here its always fs_devices->opened > 0 and most
+			 * of the struct device members are already updated by
+			 * the mount process even if this device was missing, so
+			 * now follow the normal open device procedure for this
+			 * device. The scrub will take care of filling the
+			 * missing stripes for raid56 and balance for raid1 and
+			 * raid10.
+			 */
+			ASSERT(fs_devices->opened);
+			mutex_lock(&fs_devices->device_list_mutex);
+			mutex_lock(&fs_info->chunk_mutex);
+			/*
+			 * By not failing for the reason that
+			 * btrfs_open_one_device() could fail, it will
+			 * keep the original behaviors as it is for now.
+			 * That's fix me for later.
+			 */
+			ret = btrfs_open_one_device(fs_devices, device, fmode,
+							fs_info->bdev_holder);
+			if (!ret) {
+				/*
+				 * Making sure missing is set to 0
+				 * only when bdev != NULL is as expected
+				 * at most of the places in the code.
+				 * Further, what if user fixes the
+				 * dev and reruns the dev scan, then again
+				 * we need to come here to reset missing.
+				 */
+				fs_devices->missing_devices--;
+				device->missing = 0;
+				btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
+				btrfs_warn(fs_info,
+					"BTRFS: device %s devid %llu uuid %pU joined\n",
+					path, devid, device->uuid);
+			}
+
+			if (device->writeable &&
+					!device->is_tgtdev_for_dev_replace) {
+				fs_devices->total_rw_bytes +=
+							device->total_bytes;
+				atomic64_add(device->total_bytes -
+						device->bytes_used,
+						&fs_info->free_chunk_space);
+			}
+			device->in_fs_metadata = 1;
+			mutex_unlock(&fs_info->chunk_mutex);
+			mutex_unlock(&fs_devices->device_list_mutex);
 		}
 	}
 
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
  2017-11-17 12:36 ` [PATCH v2] " Anand Jain
@ 2017-11-18 13:52   ` Goffredo Baroncelli
  2017-11-20  8:19     ` Anand Jain
  0 siblings, 1 reply; 14+ messages in thread
From: Goffredo Baroncelli @ 2017-11-18 13:52 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, Liu Bo

On 11/17/2017 01:36 PM, Anand Jain wrote:
> If the device is not present at the time of (-o degrade) mount,
> the mount context will create a dummy missing struct btrfs_device.
> Later this device may reappear after the FS is mounted and
> then device is included in the device list but it missed the
> open_device part. So this patch handles that case by going
> through the open_device steps which this device missed and finally
> adds to the device alloc list.

What happens if the first devices got writes before the "last device" is joined ?

Supposing to have a raid1 pair of devices: sda, sdb

- sda is dissappeared (usb detached ?)
- the filesystem is mounted as
	mount -o degraded /dev/sdb
- some writes happens on /dev/sdb
- the user reattach /dev/sda
- udev run "btrfs dev scan"
- the system joins /dev/sda to the filesystem disks pool

Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data) or /dev/sda (old data), or worse read something from the former and something from the later (metadata from sda and data from sdb ) ???

Am I missing something ?

BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
  2017-11-18 13:52   ` Goffredo Baroncelli
@ 2017-11-20  8:19     ` Anand Jain
  2017-11-20 19:28       ` Goffredo Baroncelli
  0 siblings, 1 reply; 14+ messages in thread
From: Anand Jain @ 2017-11-20  8:19 UTC (permalink / raw)
  To: kreijack; +Cc: linux-btrfs, Liu Bo



On 11/18/2017 09:52 PM, Goffredo Baroncelli wrote:
> On 11/17/2017 01:36 PM, Anand Jain wrote:
>> If the device is not present at the time of (-o degrade) mount,
>> the mount context will create a dummy missing struct btrfs_device.
>> Later this device may reappear after the FS is mounted and
>> then device is included in the device list but it missed the
>> open_device part. So this patch handles that case by going
>> through the open_device steps which this device missed and finally
>> adds to the device alloc list.
> 
> What happens if the first devices got writes before the "last device" is joined ?
> 
> Supposing to have a raid1 pair of devices: sda, sdb
> 
> - sda is dissappeared (usb detached ?)
> - the filesystem is mounted as
> 	mount -o degraded /dev/sdb
> - some writes happens on /dev/sdb
> - the user reattach /dev/sda
> - udev run "btrfs dev scan"
> - the system joins /dev/sda to the filesystem disks pool
> 
> Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data) or /dev/sda (old data), or worse read something from the former and something from the later (metadata from sda and data from sdb ) ???

  Thanks for the test scenario, this case is fine as its read from
  the disk having highest generation number, so we pick sdb in the
  case above.

Thanks, Anand


> Am I missing something ?
> 
> BR
> G.Baroncelli
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
  2017-11-20  8:19     ` Anand Jain
@ 2017-11-20 19:28       ` Goffredo Baroncelli
  2017-11-20 21:14         ` Anand Jain
  2017-12-04  7:09         ` Anand Jain
  0 siblings, 2 replies; 14+ messages in thread
From: Goffredo Baroncelli @ 2017-11-20 19:28 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, Liu Bo

On 11/20/2017 09:19 AM, Anand Jain wrote:
> 
> 
> On 11/18/2017 09:52 PM, Goffredo Baroncelli wrote:
>> On 11/17/2017 01:36 PM, Anand Jain wrote:
>>> If the device is not present at the time of (-o degrade) mount,
>>> the mount context will create a dummy missing struct btrfs_device.
>>> Later this device may reappear after the FS is mounted and
>>> then device is included in the device list but it missed the
>>> open_device part. So this patch handles that case by going
>>> through the open_device steps which this device missed and finally
>>> adds to the device alloc list.
>>
>> What happens if the first devices got writes before the "last device" is joined ?
>>
>> Supposing to have a raid1 pair of devices: sda, sdb
>>
>> - sda is dissappeared (usb detached ?)
>> - the filesystem is mounted as
>>     mount -o degraded /dev/sdb
>> - some writes happens on /dev/sdb
>> - the user reattach /dev/sda
>> - udev run "btrfs dev scan"
>> - the system joins /dev/sda to the filesystem disks pool
>>
>> Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data) or /dev/sda (old data), or worse read something from the former and something from the later (metadata from sda and data from sdb ) ???
> 
>  Thanks for the test scenario, this case is fine as its read from
>  the disk having highest generation number, so we pick sdb in the
>  case above.

Ok, so in this case which is the benefit to add a disk ? With a lover number generation, the added will be used at all ?
In this case, would be better an explicit user intervention instead of an automatic one ?


> 
> Thanks, Anand
> 
> 
>> Am I missing something ?
>>
>> BR
>> G.Baroncelli
>>
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
  2017-11-20 19:28       ` Goffredo Baroncelli
@ 2017-11-20 21:14         ` Anand Jain
  2017-12-04  7:09         ` Anand Jain
  1 sibling, 0 replies; 14+ messages in thread
From: Anand Jain @ 2017-11-20 21:14 UTC (permalink / raw)
  To: kreijack; +Cc: linux-btrfs, Liu Bo



On 11/21/2017 03:28 AM, Goffredo Baroncelli wrote:
> On 11/20/2017 09:19 AM, Anand Jain wrote:
>>
>>
>> On 11/18/2017 09:52 PM, Goffredo Baroncelli wrote:
>>> On 11/17/2017 01:36 PM, Anand Jain wrote:
>>>> If the device is not present at the time of (-o degrade) mount,
>>>> the mount context will create a dummy missing struct btrfs_device.
>>>> Later this device may reappear after the FS is mounted and
>>>> then device is included in the device list but it missed the
>>>> open_device part. So this patch handles that case by going
>>>> through the open_device steps which this device missed and finally
>>>> adds to the device alloc list.
>>>
>>> What happens if the first devices got writes before the "last device" is joined ?
>>>
>>> Supposing to have a raid1 pair of devices: sda, sdb
>>>
>>> - sda is dissappeared (usb detached ?)
>>> - the filesystem is mounted as
>>>      mount -o degraded /dev/sdb
>>> - some writes happens on /dev/sdb
>>> - the user reattach /dev/sda
>>> - udev run "btrfs dev scan"
>>> - the system joins /dev/sda to the filesystem disks pool
>>>
>>> Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data) or /dev/sda (old data), or worse read something from the former and something from the later (metadata from sda and data from sdb ) ???
>>
>>   Thanks for the test scenario, this case is fine as its read from
>>   the disk having highest generation number, so we pick sdb in the
>>   case above.
> 
> Ok, so in this case which is the benefit to add a disk ?
> With a lover number generation, the added will be used at all ?

  It will be used for new writes, it will stripe across both the disks.
  Balance is only for the older single chunks, which in the long term
  would go away when we are able to create a degraded raid1 chunks.

> In this case, would be better an explicit user intervention instead of an automatic one ?

   Without this patch the disk already joins the raid group.

   But it had a bug that it missed joining the alloc group.

Thanks, Anand
> 
>>
>> Thanks, Anand
>>
>>
>>> Am I missing something ?
>>>
>>> BR
>>> G.Baroncelli
>>>
>>
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] btrfs: handle dynamically reappearing missing device
  2017-11-17 11:53   ` Anand Jain
@ 2017-11-28 22:52     ` Liu Bo
  0 siblings, 0 replies; 14+ messages in thread
From: Liu Bo @ 2017-11-28 22:52 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Fri, Nov 17, 2017 at 07:53:29PM +0800, Anand Jain wrote:
> 
> 
> On 11/17/2017 07:28 AM, Liu Bo wrote:
> > On Sun, Nov 12, 2017 at 06:56:50PM +0800, Anand Jain wrote:
> > > If the device is not present at the time of (-o degrade) mount
> > > the mount context will create a dummy missing struct btrfs_device.
> > > Later this device may reappear after the FS is mounted.
> 
> > This commit log doesn't explain what would happen when the missing
> > device re-appears.
> 
>  Hm. You mean in the current design without this patch.?
>  Just nothing it gives a false impression to the user by
> 'btrfs dev ready' and 'btrfs fi show' that missing device
>  has been included. Will update change log.
> 
> > > So this
> > > patch handles that case by going through the open_device steps
> 
> > > which this device missed and finally adds to the device alloc list.
> > > 
> > > So now with this patch, to bring back the missing device user can run,
> > > 
> > >     btrfs dev scan <path-of-missing-device>
> > 
> > Most common distros already have some udev rules of btrfs to process a
> > reappeared disk.
> 
>  Right. Do you see anything that will break ?
>  Planning to add comments [1] in v2 for more clarity around this.
> 
> > > 
> > > Signed-off-by: Anand Jain <anand.jain@oracle.com>
> > > ---
> > > This patch needs:
> > >   [PATCH 0/4]  factor __btrfs_open_devices()
> > > 
> > >   fs/btrfs/volumes.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> > >   1 file changed, 41 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > > index d24e966ee29f..e7dd996831f2 100644
> > > --- a/fs/btrfs/volumes.c
> > > +++ b/fs/btrfs/volumes.c
> > > @@ -769,8 +769,47 @@ static noinline int device_list_add(const char *path,
> > >   		rcu_string_free(device->name);
> > >   		rcu_assign_pointer(device->name, name);
> > >   		if (device->missing) {
> > > -			fs_devices->missing_devices--;
> > > -			device->missing = 0;
> > > +			int ret;
> > > +			struct btrfs_fs_info *fs_info = fs_devices->fs_info;
> > > +			fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
> > > +
> > > +			if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)
> > > +				fmode &= ~FMODE_WRITE;
> > > +
> > > +			/*
> > > +			 * Missing can be set only when FS is mounted.
> > > +			 * So here its always fs_devices->opened > 0 and most
> > > +			 * of the struct device members are already updated by
> > > +			 * the mount process even if this device was missing, so
> > > +			 * now follow the normal open device procedure for this
> > > +			 * device. The scrub will take care of filling the
> > > +			 * missing stripes for raid56 and balance for raid1 and
> > > +			 * raid10.
> > > +			 */
> > 
> > 
> > The idea looks good to me.
> > 
> > What if users skip balance/scrub given both could take a while?
> > 
> > Then btrfs takes the disks back and reads content from it, and it's OK
> > for raid1/10 as they may have a good copy,
> 
>  Yes, except for the split brain situation for which patches are
>  in the ML, review comments appreciated.
> 
> > but in case of raid56, it
> > might hit the reconstruction bug if two disks are missing and added
> > back, which I tried to address recently.
> 
> > At least ->writable should not be set until balance/scrub completes
> > the re-sync job.
> 
>  Hm. Why ?
>

(Sorry for the late reply.)

On second thought, write is OK since parity will be calculated along
writes.

Will check the v2.


Thanks,

-liubo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] btrfs: handle dynamically reappearing missing device
  2017-11-20 19:28       ` Goffredo Baroncelli
  2017-11-20 21:14         ` Anand Jain
@ 2017-12-04  7:09         ` Anand Jain
  1 sibling, 0 replies; 14+ messages in thread
From: Anand Jain @ 2017-12-04  7:09 UTC (permalink / raw)
  To: kreijack; +Cc: linux-btrfs, Liu Bo



> [..]  would be better an explicit user intervention instead of an automatic one ?

  What is the user intervention method steps that you have in mind ?
  Just curious. Pls remember downtime is not a choice of recovery
  from this context which means FS should be available for the
  applications to perform read/write.

  -o remount was one this which I had skipped for sometime as
  theoretically it can't solve the problem (to bring back missing
  device) as well, but now I have verified it, it won't.

  There is no record of what is the original idea to perform this
  basic volume manager step.

Thanks, Anand


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-12-04  7:09 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-12 10:56 [PATCH] btrfs: handle dynamically reappearing missing device Anand Jain
2017-11-15  7:38 ` kbuild test robot
2017-11-15 10:18   ` Anand Jain
2017-11-16 19:08 ` Nikolay Borisov
2017-11-17  7:46   ` Anand Jain
2017-11-16 23:28 ` Liu Bo
2017-11-17 11:53   ` Anand Jain
2017-11-28 22:52     ` Liu Bo
2017-11-17 12:36 ` [PATCH v2] " Anand Jain
2017-11-18 13:52   ` Goffredo Baroncelli
2017-11-20  8:19     ` Anand Jain
2017-11-20 19:28       ` Goffredo Baroncelli
2017-11-20 21:14         ` Anand Jain
2017-12-04  7:09         ` Anand Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).