From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 09/13] btrfs: handle RAID56 read repair differently
Date: Tue,  3 May 2022 14:49:53 +0800
Message-ID: <80006286af884f8aec7e53f6a0c87b9f968ef920.1651559986.git.wqu@suse.com>
In-Reply-To: <cover.1651559986.git.wqu@suse.com>

Our current read repair facility relies entirely on the mirror number.
For each repaired sector, it writes the correct data back to the bad
mirror.

This works great for mirror based profiles, but for RAID56 it's a
different story.

A partial write in btrfs RAID56 unconditionally triggers RMW, completely
ignoring the mirror number (which, for RAID56, indicates the corrupted
data stripe number).

As a result, the RMW reads the corrupted data back from disk, leading
to further corruption.

To address this, introduce btrfs_read_repair_ctrl::is_raid56, and for
RAID56 read repair, fall back to the tried-and-true
btrfs_repair_io_failure().

That function handles RAID56 by using MAP_READ for btrfs_map_block() and
writing the correct data directly back to disk, avoiding the RMW problem.

Unfortunately we lose the asynchronous bio assembly/submission, but that
should be acceptable considering RAID56 is really an oddball here.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
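Illustration only, not part of the patch: a minimal userspace sketch of
why routing the repair write through RMW is harmful. It assumes a toy
RAID5 full stripe with two data stripes and one XOR parity stripe, two
sectors each, and assumes the RMW path regenerates parity for the whole
full stripe from whatever is on disk. All toy_* names are hypothetical
and do not exist in btrfs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_SECTORS 2	/* sectors per device stripe (toy value) */

/* Toy full stripe: two data stripes plus one XOR parity stripe. */
struct toy_stripe {
	uint8_t d0[TOY_NR_SECTORS];
	uint8_t d1[TOY_NR_SECTORS];
	uint8_t p[TOY_NR_SECTORS];
};

/* The data that d0 is supposed to hold. */
static const uint8_t toy_good_d0[TOY_NR_SECTORS] = { 0x11, 0x22 };

/* Rebuild one sector of d0 from the other data stripe and parity. */
static uint8_t toy_rebuild_d0(const struct toy_stripe *s, int nr)
{
	return s->d1[nr] ^ s->p[nr];
}

/* Direct write: only the target sector is touched, parity is kept. */
static void toy_write_direct(struct toy_stripe *s, int nr, uint8_t data)
{
	s->d0[nr] = data;
}

/*
 * Write through RMW (assumed behavior): update the target sector, then
 * regenerate parity for every sector from the on-disk data, including
 * sectors of d0 that are still corrupted.
 */
static void toy_write_rmw(struct toy_stripe *s, int nr, uint8_t data)
{
	int i;

	s->d0[nr] = data;
	for (i = 0; i < TOY_NR_SECTORS; i++)
		s->p[i] = s->d0[i] ^ s->d1[i];
}

/*
 * Start with both sectors of d0 corrupted and a still-valid parity,
 * repair them one by one, and return how many sectors end up correct.
 */
static int toy_repair_d0(bool use_rmw)
{
	struct toy_stripe s = {
		.d0 = { 0xff, 0xee },		/* corrupted on disk */
		.d1 = { 0x33, 0x44 },
		.p  = { 0x11 ^ 0x33, 0x22 ^ 0x44 },
	};
	int nr, good = 0;

	for (nr = 0; nr < TOY_NR_SECTORS; nr++) {
		uint8_t fixed = toy_rebuild_d0(&s, nr);

		if (use_rmw)
			toy_write_rmw(&s, nr, fixed);
		else
			toy_write_direct(&s, nr, fixed);
	}
	for (nr = 0; nr < TOY_NR_SECTORS; nr++)
		good += (s.d0[nr] == toy_good_d0[nr]);
	return good;
}
```

In this model toy_repair_d0(false) recovers both sectors, while
toy_repair_d0(true) recovers only the first one: repairing sector 0
through RMW rewrites the parity of sector 1 from the corrupted data, so
the later rebuild of sector 1 just reproduces the corruption.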
 fs/btrfs/read-repair.c | 26 ++++++++++++++++++++++++++
 fs/btrfs/read-repair.h |  2 ++
 2 files changed, 28 insertions(+)

diff --git a/fs/btrfs/read-repair.c b/fs/btrfs/read-repair.c
index aecdc4ee54ba..e1b11990a480 100644
--- a/fs/btrfs/read-repair.c
+++ b/fs/btrfs/read-repair.c
@@ -55,6 +55,8 @@ void btrfs_read_repair_add_sector(struct inode *inode,
 		ASSERT(ctrl->init_mirror);
 		ctrl->num_copies = btrfs_num_copies(fs_info, ctrl->logical,
 						    sectorsize);
+		ctrl->is_raid56 = btrfs_is_parity_mirror(fs_info,
+						ctrl->logical, sectorsize);
 		init_waitqueue_head(&ctrl->io_wait);
 		atomic_set(&ctrl->io_bytes, 0);
 		/*
@@ -153,6 +155,30 @@ static void read_repair_bio_add_sector(struct btrfs_read_repair_ctrl *ctrl,
 	if (opf == REQ_OP_WRITE) {
 		if (btrfs_repair_one_zone(fs_info, ctrl->logical))
 			return;
+
+		/*
+		 * For RAID56, we can not just write the bad data back, as
+		 * any write will trigger RMW and read back the corrupted
+		 * on-disk stripe, causing further damage.
+		 * So here we do special repair for raid56.
+		 *
+		 * And unfortunately, this repair is very low level and not
+		 * compatible with the rest of the mirror based repair.
+		 * So it's still done in synchronous mode using
+		 * btrfs_repair_io_failure().
+		 */
+		if (ctrl->is_raid56) {
+			const u64 logical = ctrl->logical +
+					(sector_nr << fs_info->sectorsize_bits);
+			const u64 file_offset = ctrl->file_offset +
+					(sector_nr << fs_info->sectorsize_bits);
+
+			btrfs_repair_io_failure(fs_info,
+					btrfs_ino(BTRFS_I(ctrl->inode)),
+					file_offset, fs_info->sectorsize,
+					logical, page, pgoff, mirror);
+			return;
+		}
 	}
 
 	/* Check if the sector can be added to the last bio */
diff --git a/fs/btrfs/read-repair.h b/fs/btrfs/read-repair.h
index 3e1430489f89..6cc816e2ce4a 100644
--- a/fs/btrfs/read-repair.h
+++ b/fs/btrfs/read-repair.h
@@ -42,6 +42,8 @@ struct btrfs_read_repair_ctrl {
 	 * at bio allocation time.
 	 */
 	bool error;
+
+	bool is_raid56;
 };
 
 int btrfs_read_repair_alloc_bitmaps(struct btrfs_fs_info *fs_info,
-- 
2.36.0

