* [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption"
@ 2018-06-15  2:39 Sasha Levin
  2018-06-15  2:39 ` [PATCH 4.4 2/2] Btrfs: make raid6 rebuild retry more Sasha Levin
  2018-06-18  4:20 ` [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption" gregkh
  0 siblings, 2 replies; 7+ messages in thread
From: Sasha Levin @ 2018-06-15  2:39 UTC (permalink / raw)
  To: gregkh; +Cc: ben.hutchings, stable, Sasha Levin

This reverts commit 95b286daf7ba784191023ad110122703eb2ebabc.

This commit used an incorrect log message.

Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
---
 fs/btrfs/raid56.c  | 18 ++++--------------
 fs/btrfs/volumes.c |  9 +--------
 2 files changed, 5 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index b9fa99577bf7..1a33d3eb36de 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -2160,21 +2160,11 @@ int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
 	}
 
 	/*
-	 * Loop retry:
-	 * for 'mirror == 2', reconstruct from all other stripes.
-	 * for 'mirror_num > 2', select a stripe to fail on every retry.
+	 * reconstruct from the q stripe if they are
+	 * asking for mirror 3
 	 */
-	if (mirror_num > 2) {
-		/*
-		 * 'mirror == 3' is to fail the p stripe and
-		 * reconstruct from the q stripe.  'mirror > 3' is to
-		 * fail a data stripe and reconstruct from p+q stripe.
-		 */
-		rbio->failb = rbio->real_stripes - (mirror_num - 1);
-		ASSERT(rbio->failb > 0);
-		if (rbio->failb <= rbio->faila)
-			rbio->failb--;
-	}
+	if (mirror_num == 3)
+		rbio->failb = rbio->real_stripes - 2;
 
 	ret = lock_stripe_add(rbio);
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b4d63a9842fa..ed75d70b4bc2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5056,14 +5056,7 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID5)
 		ret = 2;
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID6)
-		/*
-		 * There could be two corrupted data stripes, we need
-		 * to loop retry in order to rebuild the correct data.
-		 *
-		 * Fail a stripe at a time on every retry except the
-		 * stripe under reconstruction.
-		 */
-		ret = map->num_stripes;
+		ret = 3;
 	else
 		ret = 1;
 	free_extent_map(em);
-- 
2.17.1

* [PATCH 4.4 2/2] Btrfs: make raid6 rebuild retry more
  2018-06-15  2:39 [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption" Sasha Levin
@ 2018-06-15  2:39 ` Sasha Levin
  2018-06-18  4:20 ` [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption" gregkh
  1 sibling, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2018-06-15  2:39 UTC (permalink / raw)
  To: gregkh; +Cc: ben.hutchings, stable, Liu Bo, David Sterba, Sasha Levin

From: Liu Bo <bo.li.liu@oracle.com>

[ Upstream commit 8810f7517a3bc4ca2d41d022446d3f5fd6b77c09 ]

There is a scenario in which the rebuild process fails to return good
content. Suppose that all disks can be read without problems, but the
content that was read does not match its checksum. Currently, for raid6,
btrfs retries at most twice:

- the 1st retry rebuilds from all other stripes, which ends up being a
  raid5 xor rebuild,
- if the 1st fails, the 2nd retry deliberately fails parity p so that a
  raid6 style rebuild is done.

However, chances are that the content of another non-parity stripe is
also corrupted, so the above retries cannot return correct content, and
users will see this as data loss. More seriously, if the loss hits
important internal btree roots, the filesystem could refuse to mount.

This extends btrfs to do more retries, with each retry failing only one
stripe.  Since raid6 can tolerate 2 disk failures, if there is one
more failure besides the one we're recovering from, this can
always work.

The worst case is to retry as many times as the number of raid6 disks,
but given that such a scenario is really rare in practice,
it's still acceptable.
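
The retry selection described above can be sketched in isolation. This is
a minimal userspace sketch, not the kernel code: pick_failb() and
raid6_num_copies() are hypothetical helper names, the -1 return value is
a convention of this sketch, and the stripe layout (data stripes first,
then P, then Q) is an assumption taken from the comments in the diff
below.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the retry selection added to
 * raid56_parity_recover().  Assumed layout: data stripes at
 * 0..real_stripes-3, P at real_stripes-2, Q at real_stripes-1.
 *
 * faila is the stripe being recovered.  Returns the extra stripe to
 * treat as failed on this retry, or -1 (a convention of this sketch)
 * when no extra stripe is failed, i.e. mirror_num <= 2.
 */
static int pick_failb(int real_stripes, int mirror_num, int faila)
{
	int failb;

	if (mirror_num <= 2)
		return -1;

	/* mirror 3 fails P; each later mirror steps one stripe lower. */
	failb = real_stripes - (mirror_num - 1);
	assert(failb > 0);

	/* Skip over the stripe that is already being recovered. */
	if (failb <= faila)
		failb--;

	return failb;
}

/*
 * Hypothetical stand-in for the RAID6 branch of btrfs_num_copies():
 * the retry budget becomes the stripe count instead of a fixed 3.
 */
static int raid6_num_copies(int num_stripes)
{
	return num_stripes;
}
```

For a 6-stripe raid6 layout with faila = 1, mirrors 3 through 6 select
stripes 4 (P), 3, 2 and 0, so every stripe other than Q and the one
being recovered is failed exactly once.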

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
---
 fs/btrfs/raid56.c  | 18 ++++++++++++++----
 fs/btrfs/volumes.c |  9 ++++++++-
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 1a33d3eb36de..b9fa99577bf7 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -2160,11 +2160,21 @@ int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
 	}
 
 	/*
-	 * reconstruct from the q stripe if they are
-	 * asking for mirror 3
+	 * Loop retry:
+	 * for 'mirror == 2', reconstruct from all other stripes.
+	 * for 'mirror_num > 2', select a stripe to fail on every retry.
 	 */
-	if (mirror_num == 3)
-		rbio->failb = rbio->real_stripes - 2;
+	if (mirror_num > 2) {
+		/*
+		 * 'mirror == 3' is to fail the p stripe and
+		 * reconstruct from the q stripe.  'mirror > 3' is to
+		 * fail a data stripe and reconstruct from p+q stripe.
+		 */
+		rbio->failb = rbio->real_stripes - (mirror_num - 1);
+		ASSERT(rbio->failb > 0);
+		if (rbio->failb <= rbio->faila)
+			rbio->failb--;
+	}
 
 	ret = lock_stripe_add(rbio);
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index ed75d70b4bc2..ef19567453bb 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5056,7 +5056,14 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID5)
 		ret = 2;
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID6)
-		ret = 3;
+		/*
+		 * There could be two corrupted data stripes, we need
+		 * to loop retry in order to rebuild the correct data.
+		 * 
+		 * Fail a stripe at a time on every retry except the
+		 * stripe under reconstruction.
+		 */
+		ret = map->num_stripes;
 	else
 		ret = 1;
 	free_extent_map(em);
-- 
2.17.1

* Re: [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption"
  2018-06-15  2:39 [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption" Sasha Levin
  2018-06-15  2:39 ` [PATCH 4.4 2/2] Btrfs: make raid6 rebuild retry more Sasha Levin
@ 2018-06-18  4:20 ` gregkh
  2018-06-19  4:33   ` Sasha Levin
  1 sibling, 1 reply; 7+ messages in thread
From: gregkh @ 2018-06-18  4:20 UTC (permalink / raw)
  To: Sasha Levin; +Cc: ben.hutchings, stable

On Fri, Jun 15, 2018 at 02:39:22AM +0000, Sasha Levin wrote:
> This reverts commit 95b286daf7ba784191023ad110122703eb2ebabc.
> 
> This commit used an incorrect log message.
> 
> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
> ---
>  fs/btrfs/raid56.c  | 18 ++++--------------
>  fs/btrfs/volumes.c |  9 +--------
>  2 files changed, 5 insertions(+), 22 deletions(-)

You forgot a reported-by: tag :(

* Re: [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption"
  2018-06-18  4:20 ` [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption" gregkh
@ 2018-06-19  4:33   ` Sasha Levin
  2018-06-19 20:10     ` gregkh
  0 siblings, 1 reply; 7+ messages in thread
From: Sasha Levin @ 2018-06-19  4:33 UTC (permalink / raw)
  To: gregkh; +Cc: ben.hutchings, stable

On Mon, Jun 18, 2018 at 06:20:30AM +0200, gregkh@linuxfoundation.org wrote:
>On Fri, Jun 15, 2018 at 02:39:22AM +0000, Sasha Levin wrote:
>> This reverts commit 95b286daf7ba784191023ad110122703eb2ebabc.
>>
>> This commit used an incorrect log message.
>>
>> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
>> ---
>>  fs/btrfs/raid56.c  | 18 ++++--------------
>>  fs/btrfs/volumes.c |  9 +--------
>>  2 files changed, 5 insertions(+), 22 deletions(-)
>
>You forgot a reported-by: tag :(

Sorry, I'll resend!

* Re: [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption"
  2018-06-19  4:33   ` Sasha Levin
@ 2018-06-19 20:10     ` gregkh
  2018-06-19 20:15       ` Sasha Levin
  0 siblings, 1 reply; 7+ messages in thread
From: gregkh @ 2018-06-19 20:10 UTC (permalink / raw)
  To: Sasha Levin; +Cc: ben.hutchings, stable

On Tue, Jun 19, 2018 at 04:33:49AM +0000, Sasha Levin wrote:
> On Mon, Jun 18, 2018 at 06:20:30AM +0200, gregkh@linuxfoundation.org wrote:
> >On Fri, Jun 15, 2018 at 02:39:22AM +0000, Sasha Levin wrote:
> >> This reverts commit 95b286daf7ba784191023ad110122703eb2ebabc.
> >>
> >> This commit used an incorrect log message.
> >>
> >> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
> >> ---
> >>  fs/btrfs/raid56.c  | 18 ++++--------------
> >>  fs/btrfs/volumes.c |  9 +--------
> >>  2 files changed, 5 insertions(+), 22 deletions(-)
> >
> >You forgot a reported-by: tag :(
> 
> Sorry, I'll resend!

No need, I fixed it up already :)

* Re: [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption"
  2018-06-19 20:10     ` gregkh
@ 2018-06-19 20:15       ` Sasha Levin
  0 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2018-06-19 20:15 UTC (permalink / raw)
  To: gregkh; +Cc: ben.hutchings, stable

On Wed, Jun 20, 2018 at 05:10:02AM +0900, gregkh@linuxfoundation.org wrote:
>On Tue, Jun 19, 2018 at 04:33:49AM +0000, Sasha Levin wrote:
>> On Mon, Jun 18, 2018 at 06:20:30AM +0200, gregkh@linuxfoundation.org wrote:
>> >On Fri, Jun 15, 2018 at 02:39:22AM +0000, Sasha Levin wrote:
>> >> This reverts commit 95b286daf7ba784191023ad110122703eb2ebabc.
>> >>
>> >> This commit used an incorrect log message.
>> >>
>> >> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
>> >> ---
>> >>  fs/btrfs/raid56.c  | 18 ++++--------------
>> >>  fs/btrfs/volumes.c |  9 +--------
>> >>  2 files changed, 5 insertions(+), 22 deletions(-)
>> >
>> >You forgot a reported-by: tag :(
>>
>> Sorry, I'll resend!
>
>No need, I fixed it up already :)

Too late :)

Nothing like some jetlag to catch up on work...

* [PATCH 4.4 1/2] Revert "Btrfs: fix scrub to repair raid6 corruption"
@ 2018-06-19 20:09 Sasha Levin
  0 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2018-06-19 20:09 UTC (permalink / raw)
  To: gregkh; +Cc: ben.hutchings, stable, Sasha Levin

This reverts commit 95b286daf7ba784191023ad110122703eb2ebabc.

This commit used an incorrect log message.

Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
---
 fs/btrfs/raid56.c  | 18 ++++--------------
 fs/btrfs/volumes.c |  9 +--------
 2 files changed, 5 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index b9fa99577bf7..1a33d3eb36de 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -2160,21 +2160,11 @@ int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
 	}
 
 	/*
-	 * Loop retry:
-	 * for 'mirror == 2', reconstruct from all other stripes.
-	 * for 'mirror_num > 2', select a stripe to fail on every retry.
+	 * reconstruct from the q stripe if they are
+	 * asking for mirror 3
 	 */
-	if (mirror_num > 2) {
-		/*
-		 * 'mirror == 3' is to fail the p stripe and
-		 * reconstruct from the q stripe.  'mirror > 3' is to
-		 * fail a data stripe and reconstruct from p+q stripe.
-		 */
-		rbio->failb = rbio->real_stripes - (mirror_num - 1);
-		ASSERT(rbio->failb > 0);
-		if (rbio->failb <= rbio->faila)
-			rbio->failb--;
-	}
+	if (mirror_num == 3)
+		rbio->failb = rbio->real_stripes - 2;
 
 	ret = lock_stripe_add(rbio);
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b4d63a9842fa..ed75d70b4bc2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5056,14 +5056,7 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID5)
 		ret = 2;
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID6)
-		/*
-		 * There could be two corrupted data stripes, we need
-		 * to loop retry in order to rebuild the correct data.
-		 *
-		 * Fail a stripe at a time on every retry except the
-		 * stripe under reconstruction.
-		 */
-		ret = map->num_stripes;
+		ret = 3;
 	else
 		ret = 1;
 	free_extent_map(em);
-- 
2.17.1
