All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Btrfs: fix wrong write offset when replacing a device
@ 2013-07-04 10:37 Miao Xie
  2013-07-04 12:20 ` Stefan Behrens
  0 siblings, 1 reply; 3+ messages in thread
From: Miao Xie @ 2013-07-04 10:37 UTC (permalink / raw)
  To: linux-btrfs

The filesystem was corrupted after we did a device replace.

Steps to reproduce:
 # mkfs.btrfs -f -m single -d raid10 <device0>..<device3>
 # mount <device0> <mnt>
 # btrfs replace start -rfB 1 <device4> <mnt>
 # umount <mnt>
 # btrfsck <device4>

The reason is that we changed the write offset by mistake. When we
replace the a device, we read the data from the source device at first,
and then write the data into the corresponding place of the new device.
But when we read the data, we will try to spread the read request among
the mirrors. It is a good idea that can speed up the read request, but
should not change the write offset because it is corresponding to the source
device, not the mirror device, if we change it by the offset of the mirror
device, we would get the wrong offset. Fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/scrub.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 929093d..092bec4 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -228,7 +228,6 @@ static void scrub_bio_end_io_worker(struct btrfs_work *work);
 static void scrub_block_complete(struct scrub_block *sblock);
 static void scrub_remap_extent(struct btrfs_fs_info *fs_info,
 			       u64 extent_logical, u64 extent_len,
-			       u64 *extent_physical,
 			       struct btrfs_device **extent_dev,
 			       int *extent_mirror_num);
 static int scrub_setup_wr_ctx(struct scrub_ctx *sctx,
@@ -2482,8 +2481,7 @@ again:
 			extent_mirror_num = mirror_num;
 			if (is_dev_replace)
 				scrub_remap_extent(fs_info, extent_logical,
-						   extent_len, &extent_physical,
-						   &extent_dev,
+						   extent_len, &extent_dev,
 						   &extent_mirror_num);
 
 			ret = btrfs_lookup_csums_range(csum_root, logical,
@@ -3057,7 +3055,6 @@ int btrfs_scrub_progress(struct btrfs_root *root, u64 devid,
 
 static void scrub_remap_extent(struct btrfs_fs_info *fs_info,
 			       u64 extent_logical, u64 extent_len,
-			       u64 *extent_physical,
 			       struct btrfs_device **extent_dev,
 			       int *extent_mirror_num)
 {
@@ -3074,7 +3071,6 @@ static void scrub_remap_extent(struct btrfs_fs_info *fs_info,
 		return;
 	}
 
-	*extent_physical = bbio->stripes[0].physical;
 	*extent_mirror_num = bbio->mirror_num;
 	*extent_dev = bbio->stripes[0].dev;
 	kfree(bbio);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] Btrfs: fix wrong write offset when replacing a device
  2013-07-04 10:37 [PATCH] Btrfs: fix wrong write offset when replacing a device Miao Xie
@ 2013-07-04 12:20 ` Stefan Behrens
  0 siblings, 0 replies; 3+ messages in thread
From: Stefan Behrens @ 2013-07-04 12:20 UTC (permalink / raw)
  To: Miao Xie; +Cc: linux-btrfs

On Thu, 4 Jul 2013 18:37:21 +0800, Miao Xie wrote:
> The filesystem was corrupted after we did a device replace.
> 
> Steps to reproduce:
>  # mkfs.btrfs -f -m single -d raid10 <device0>..<device3>
>  # mount <device0> <mnt>
>  # btrfs replace start -rfB 1 <device4> <mnt>
>  # umount <mnt>
>  # btrfsck <device4>
> 
> The reason is that we changed the write offset by mistake. When we
> replace the a device, we read the data from the source device at first,
> and then write the data into the corresponding place of the new device.
> But when we read the data, we will try to spread the read request among
> the mirrors. It is a good idea that can speed up the read request, but
> should not change the write offset because it is corresponding to the source
> device, not the mirror device, if we change it by the offset of the mirror
> device, we would get the wrong offset. Fix it.

The main reason for the remapping is not to try to spread the read request, it is done in order to implement the "-r" option ("-r  only read from srcdev if no other zero-defect mirror exists", this handles the case where the srcdev is very slow due to lots of read errors).

Commit 625f1c8dc changed the write address. It was the right address before, the unmapped one.

The physical disk address that is used to read from disk needs to be the mapped one (refer to scrub_add_page_to_rd_bio()), the address to write to disk needs to be the unmapped one. With the current patch, you use the right one for writing but the wrong one for reading.

The following code should fix the wrong write address, it reverts the change to the write address from the aforementioned commit. I'll prepare and send a proper commit (after testing all RAID/single/dup cases).

--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2495,7 +2495,8 @@ again:
        ret = scrub_extent(sctx, extent_logical, extent_len,
                           extent_physical, extent_dev, flags,
                           generation, extent_mirror_num,
-                          extent_physical);
+                          extent_logical - logical +
+                           physical);
        if (ret)
                goto out;

> 
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>  fs/btrfs/scrub.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index 929093d..092bec4 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -228,7 +228,6 @@ static void scrub_bio_end_io_worker(struct btrfs_work *work);
>  static void scrub_block_complete(struct scrub_block *sblock);
>  static void scrub_remap_extent(struct btrfs_fs_info *fs_info,
>  			       u64 extent_logical, u64 extent_len,
> -			       u64 *extent_physical,
>  			       struct btrfs_device **extent_dev,
>  			       int *extent_mirror_num);
>  static int scrub_setup_wr_ctx(struct scrub_ctx *sctx,
> @@ -2482,8 +2481,7 @@ again:
>  			extent_mirror_num = mirror_num;
>  			if (is_dev_replace)
>  				scrub_remap_extent(fs_info, extent_logical,
> -						   extent_len, &extent_physical,
> -						   &extent_dev,
> +						   extent_len, &extent_dev,
>  						   &extent_mirror_num);
>  
>  			ret = btrfs_lookup_csums_range(csum_root, logical,
> @@ -3057,7 +3055,6 @@ int btrfs_scrub_progress(struct btrfs_root *root, u64 devid,
>  
>  static void scrub_remap_extent(struct btrfs_fs_info *fs_info,
>  			       u64 extent_logical, u64 extent_len,
> -			       u64 *extent_physical,
>  			       struct btrfs_device **extent_dev,
>  			       int *extent_mirror_num)
>  {
> @@ -3074,7 +3071,6 @@ static void scrub_remap_extent(struct btrfs_fs_info *fs_info,
>  		return;
>  	}
>  
> -	*extent_physical = bbio->stripes[0].physical;
>  	*extent_mirror_num = bbio->mirror_num;
>  	*extent_dev = bbio->stripes[0].dev;
>  	kfree(bbio);
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH] Btrfs: fix wrong write offset when replacing a device
@ 2013-07-04 13:50 Stefan Behrens
  0 siblings, 0 replies; 3+ messages in thread
From: Stefan Behrens @ 2013-07-04 13:50 UTC (permalink / raw)
  To: linux-btrfs

Miao Xie reported the following issue:

The filesystem was corrupted after we did a device replace.

Steps to reproduce:
 # mkfs.btrfs -f -m single -d raid10 <device0>..<device3>
 # mount <device0> <mnt>
 # btrfs replace start -rfB 1 <device4> <mnt>
 # umount <mnt>
 # btrfsck <device4>

The reason for the issue is that we changed the write offset by mistake,
introduced by commit 625f1c8dc.

We read the data from the source device at first, and then write the
data into the corresponding place of the new device. In order to
implement the "-r" option, the source location is remapped using
btrfs_map_block(). The read takes place on the mapped location, and
the write needs to take place on the unmapped location. Currently
the write is using the mapped location, and this commit changes it
back by undoing the change to the write address that the aforementioned
commit added by mistake.

Reported-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
---
 fs/btrfs/scrub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 4ba2a69..64a157b 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2495,7 +2495,7 @@ again:
 			ret = scrub_extent(sctx, extent_logical, extent_len,
 					   extent_physical, extent_dev, flags,
 					   generation, extent_mirror_num,
-					   extent_physical);
+					   extent_logical - logical + physical);
 			if (ret)
 				goto out;
 
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2013-07-04 13:51 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-04 10:37 [PATCH] Btrfs: fix wrong write offset when replacing a device Miao Xie
2013-07-04 12:20 ` Stefan Behrens
2013-07-04 13:50 Stefan Behrens

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.