All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nikolay Borisov <nborisov@suse.com>
To: Christoph Hellwig <hch@lst.de>,
	clm@fb.com, josef@toxicpanda.com, dsterba@suse.com
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: repair all bad mirrors
Date: Tue, 21 Jun 2022 18:19:19 +0300	[thread overview]
Message-ID: <b633dedf-b322-2d8c-adfb-8ab88af5652e@suse.com> (raw)
In-Reply-To: <20220619082821.2151052-1-hch@lst.de>



On 19.06.22 г. 11:28 ч., Christoph Hellwig wrote:
> When there is more than a single level of redundancy there can also be
> mutiple bad mirrors, and the current read repair code only repairs the
> last bad one.
> 
> Restructure btrfs_repair_one_sector so that it records the original bad
> bad mirror, and the number of copies, and then repair all bad copies
> until we reach the original bad copy in clean_io_failure.  Note that
> this also means the read repair reads will always start from the next
> bad mirror and not mirror 0.
> 
> This fixes btrfs/265 in xfstests.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   fs/btrfs/extent_io.c | 127 +++++++++++++++++++++----------------------
>   fs/btrfs/extent_io.h |   3 +-
>   2 files changed, 63 insertions(+), 67 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 854999c2e139b..9775ac5d28a9e 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2432,6 +2432,20 @@ int btrfs_repair_eb_io_failure(const struct extent_buffer *eb, int mirror_num)
>   	return ret;
>   }
>   
> +static int next_mirror(struct io_failure_record *failrec, int cur_mirror)
> +{
> +	if (cur_mirror == failrec->num_copies)
> +		return cur_mirror + 1 - failrec->num_copies;
> +	return cur_mirror + 1;
> +}
> +
> +static int prev_mirror(struct io_failure_record *failrec, int cur_mirror)
> +{
> +	if (cur_mirror == 1)
> +		return failrec->num_copies;
> +	return cur_mirror - 1;
> +}
> +
>   /*
>    * each time an IO finishes, we do a fast check in the IO failure tree
>    * to see if we need to process or clean up an io_failure_record
> @@ -2444,7 +2458,7 @@ int clean_io_failure(struct btrfs_fs_info *fs_info,
>   	u64 private;
>   	struct io_failure_record *failrec;
>   	struct extent_state *state;
> -	int num_copies;
> +	int mirror;
>   	int ret;
>   
>   	private = 0;
> @@ -2468,20 +2482,19 @@ int clean_io_failure(struct btrfs_fs_info *fs_info,
>   					    EXTENT_LOCKED);
>   	spin_unlock(&io_tree->lock);
>   
> -	if (state && state->start <= failrec->start &&
> -	    state->end >= failrec->start + failrec->len - 1) {
> -		num_copies = btrfs_num_copies(fs_info, failrec->logical,
> -					      failrec->len);
> -		if (num_copies > 1)  {
> -			repair_io_failure(fs_info, ino, start, failrec->len,
> -					  failrec->logical, page, pg_offset,
> -					  failrec->failed_mirror);
> -		}
> -	}
> +	if (!state || state->start > failrec->start ||
> +	    state->end < failrec->start + failrec->len - 1)
> +		goto out;
> +
> +	mirror = failrec->this_mirror;
> +	do {
> +		mirror = prev_mirror(failrec, mirror);
> +		repair_io_failure(fs_info, ino, start, failrec->len,
> +				  failrec->logical, page, pg_offset, mirror);
> +	} while (mirror != failrec->orig_mirror);

But does this work as intended? Say we have a raid1c4 and we read from 
mirror 3 which is bad, in this case failrec->orig_mirror = 3 and 
->this_mirror = 4. The read from mirror 4 returns good data and 
clean_io_failure is called with mirror= 3 in which case only mirror 3 is 
repaired (assume 1/2 were also bad we don't know it yet, because the 
original bio request didn't submit to them based on the PID policy).

<snip>

> diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
> index c0f1fb63eeae7..5bd23bb239dae 100644
> --- a/fs/btrfs/extent_io.h
> +++ b/fs/btrfs/extent_io.h
> @@ -262,8 +262,9 @@ struct io_failure_record {
>   	u64 len;
>   	u64 logical;
>   	enum btrfs_compression_type compress_type;
> +	int orig_mirror;
nit: how about we name this failed_mirror
>   	int this_mirror;
nit: this name is also somewhat uninformative and one has to go and see 
how it's used in order to figure out the usage.
> -	int failed_mirror;
> +	int num_copies;
>   };
>   
>   int btrfs_repair_one_sector(struct inode *inode,

  reply	other threads:[~2022-06-21 15:19 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-19  8:28 [PATCH] btrfs: repair all bad mirrors Christoph Hellwig
2022-06-21 15:19 ` Nikolay Borisov [this message]
2022-06-21 15:46   ` Christoph Hellwig
2022-06-21 17:49     ` Nikolay Borisov
2022-06-22  4:23       ` Christoph Hellwig
2022-06-22  4:32 ` Qu Wenruo
2022-06-22  5:06   ` Christoph Hellwig
2022-06-22  5:14     ` Qu Wenruo
2022-06-22  7:47       ` Christoph Hellwig
2022-06-22  8:46         ` Qu Wenruo
2022-06-22 11:02           ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b633dedf-b322-2d8c-adfb-8ab88af5652e@suse.com \
    --to=nborisov@suse.com \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=hch@lst.de \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.