From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 4/7] btrfs: add new read repair infrastructure
Date: Mon, 23 May 2022 09:48:28 +0800
Message-ID: <c79f35aea568ff3c1aa9b68b1bd6ea923d44e72a.1653270322.git.wqu@suse.com>
In-Reply-To: <cover.1653270322.git.wqu@suse.com>

The new infrastructure has only one function,
btrfs_read_repair_sector(), which tries to get the correct content of
the failed sector.

The idea of the function is very straightforward:

1) Try to read the next mirror (if there is one)
2) Verify the csum (if the sector has one)
3) Go back to 1) if the csum mismatches or the read failed
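
The three steps above can be sketched in user-space C. This is only an
illustration under stated assumptions: struct mirror_copy, next_mirror()
and find_good_mirror() are hypothetical names that do not exist in btrfs,
the checksum is a toy 32-bit value, and neither bio submission nor the
repair write-back is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one on-disk copy of a sector. */
struct mirror_copy {
	bool read_ok;	/* false simulates a failed read */
	uint32_t csum;	/* toy checksum of the data this copy returns */
};

/* Mirror numbers are 1-based; step to the next one, wrapping back to 1. */
static int next_mirror(int cur_mirror, int num_copies)
{
	assert(cur_mirror >= 1 && cur_mirror <= num_copies);
	return cur_mirror == num_copies ? 1 : cur_mirror + 1;
}

/*
 * Try every mirror other than failed_mirror, in order.
 * copies[] is indexed by mirror number minus one.
 * expected_csum may be NULL, meaning "no csum, accept any successful read".
 * Returns the mirror number that supplied good data, or -1 if none did.
 */
static int find_good_mirror(const struct mirror_copy *copies, int num_copies,
			    int failed_mirror, const uint32_t *expected_csum)
{
	int i;

	/* No other mirror to retry. */
	if (num_copies <= 1)
		return -1;

	for (i = next_mirror(failed_mirror, num_copies); i != failed_mirror;
	     i = next_mirror(i, num_copies)) {
		const struct mirror_copy *copy = &copies[i - 1];

		/* Step 1: read the next mirror; on failure, move on. */
		if (!copy->read_ok)
			continue;
		/* Step 2: verify the csum if there is one. */
		if (expected_csum && copy->csum != *expected_csum)
			continue;
		/* Good copy found, no need for step 3. */
		return i;
	}
	return -1;
}
```

With three copies where mirror 1 has a bad csum, mirror 2 fails to read
and mirror 3 is good, a call with failed_mirror = 1 skips mirror 2 and
settles on mirror 3. If every other mirror is bad, the loop wraps back to
failed_mirror and the function reports failure.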

All bio submission is synchronous, meaning we wait for the
submitted bio to finish before continuing.

This can be a performance bottleneck, but consider that:

- Read-repair is already a cold path
- More than one corruption in one read bio is even rarer

So I don't think we should spend tons of code on a very cold path, not
to mention that complex code itself can be bug-prone and harder to maintain.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/Makefile      |  2 +-
 fs/btrfs/read-repair.c | 74 ++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/read-repair.h | 13 ++++++++
 3 files changed, 88 insertions(+), 1 deletion(-)
 create mode 100644 fs/btrfs/read-repair.c
 create mode 100644 fs/btrfs/read-repair.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 99f9995670ea..0b2605c750ca 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -31,7 +31,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
 	   block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \
-	   subpage.o tree-mod-log.o
+	   subpage.o tree-mod-log.o read-repair.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/read-repair.c b/fs/btrfs/read-repair.c
new file mode 100644
index 000000000000..e3175e27bcbb
--- /dev/null
+++ b/fs/btrfs/read-repair.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bio.h>
+#include "ctree.h"
+#include "volumes.h"
+#include "read-repair.h"
+#include "btrfs_inode.h"
+
+static int get_next_mirror(int cur_mirror, int num_copies)
+{
+	/* In the context of read-repair, we never use 0 as mirror_num. */
+	ASSERT(cur_mirror);
+	return (cur_mirror + 1 > num_copies) ? (cur_mirror + 1 - num_copies) :
+		cur_mirror + 1;
+}
+
+static int get_prev_mirror(int cur_mirror, int num_copies)
+{
+	/* In the context of read-repair, we never use 0 as mirror_num. */
+	ASSERT(cur_mirror);
+	return (cur_mirror - 1 <= 0) ? (num_copies) : cur_mirror - 1;
+}
+
+int btrfs_read_repair_sector(struct inode *inode,
+			     struct page *page, unsigned int pgoff,
+			     u64 logical, u64 file_off, int failed_mirror,
+			     int num_copies, u8 *expected_csum)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	bool uptodate = false;
+	int i;
+
+	/* No more mirrors to retry. */
+	if (num_copies <= 1)
+		return -EIO;
+
+	for (i = get_next_mirror(failed_mirror, num_copies); i != failed_mirror;
+	     i = get_next_mirror(i, num_copies)) {
+		u8 csum[BTRFS_CSUM_SIZE];
+		struct bio *read_bio;
+		int ret;
+
+		read_bio = bio_alloc(NULL, 1, REQ_OP_READ | REQ_SYNC, GFP_NOFS);
+		if (!read_bio)
+			return -EIO;
+		__bio_add_page(read_bio, page, fs_info->sectorsize, pgoff);
+		read_bio->bi_iter.bi_sector = logical >> SECTOR_SHIFT;
+
+		ret = btrfs_map_bio_wait(fs_info, read_bio, i);
+		/* Submit failed, try next mirror. */
+		if (ret < 0)
+			continue;
+
+		if (expected_csum) {
+			ret = btrfs_check_sector_csum(fs_info, page, pgoff,
+						      csum, expected_csum);
+			if (!ret)
+				uptodate = true;
+		} else {
+			uptodate = true;
+		}
+
+		if (uptodate) {
+			btrfs_repair_io_failure(fs_info,
+					btrfs_ino(BTRFS_I(inode)), file_off,
+					fs_info->sectorsize, logical, page,
+					pgoff, get_prev_mirror(i, num_copies));
+			break;
+		}
+	}
+	if (!uptodate)
+		return -EIO;
+	return 0;
+}
diff --git a/fs/btrfs/read-repair.h b/fs/btrfs/read-repair.h
new file mode 100644
index 000000000000..e984ab0b5b18
--- /dev/null
+++ b/fs/btrfs/read-repair.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef BTRFS_READ_REPAIR_H
+#define BTRFS_READ_REPAIR_H
+
+#include <linux/blk_types.h>
+#include <linux/fs.h>
+
+int btrfs_read_repair_sector(struct inode *inode,
+			     struct page *page, unsigned int pgoff,
+			     u64 logical, u64 file_off, int failed_mirror,
+			     int num_copies, u8 *expected_csum);
+#endif
-- 
2.36.1

