All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] New rescue mount options
@ 2020-09-24 15:32 Josef Bacik
  2020-09-24 15:32 ` [PATCH 1/5] btrfs: unify the ro checking for " Josef Bacik
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Josef Bacik @ 2020-09-24 15:32 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Hello,

This is the next version of my rescue=all patches, this time broken up in to a
few discrete patches, with some cleanups as well.  I have a PR for the xfstest
that exercises these options, it can be found here

  https://github.com/btrfs/fstests/pull/35

This is the same idea as the previous versions, except I've made a mechanical
change.  Instead we have rescue=ignorebadroots, which ignores any global roots
that we may not be able to read.  We will still fail to mount if thinks like the
chunk root or the tree root are corrupt, but this will allow us to mount read
only if the extent root or csum root are completely hosed.

I've added a new patch this go around, rescue=ignoredatacsums.  Somebody had
indicated that they would prefer that the original rescue=all allowed us to
continue to read csums if the root was still intact.  In order to handle that
usecase that's what you get with rescue=ignorebadroots, however if your csum
tree is corrupt in the middle of the tree you could still end up with problems.
Thus we have rescue=ignoredatacsums which will completely disable the csum tree
as well.

And finally we have rescue=all, which simply enables ignoredatacsums,
ignorebadroots, and notreelogreplay.  We need an easy catch-all option for
distros to fallback on to get users the highest probability of being able to
recover their data, so we will use rescue=all to turn on all the fanciest rescue
options that we have, and then use the discrete options for more fine grained
recovery.  Thanks,

Josef

Josef Bacik (5):
  btrfs: unify the ro checking for mount options
  btrfs: push the NODATASUM check into btrfs_lookup_bio_sums
  btrfs: introduce rescue=ignorebadroots
  btrfs: introduce rescue=ignoredatacsums
  btrfs: introduce rescue=all

 fs/btrfs/block-group.c | 48 +++++++++++++++++++++++++++
 fs/btrfs/block-rsv.c   |  8 +++++
 fs/btrfs/compression.c | 17 ++++------
 fs/btrfs/ctree.h       |  2 ++
 fs/btrfs/disk-io.c     | 74 +++++++++++++++++++++++++++---------------
 fs/btrfs/file-item.c   |  4 +++
 fs/btrfs/inode.c       | 18 +++++++---
 fs/btrfs/super.c       | 49 ++++++++++++++++++++++++----
 fs/btrfs/volumes.c     |  7 ++++
 9 files changed, 179 insertions(+), 48 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/5] btrfs: unify the ro checking for mount options
  2020-09-24 15:32 [PATCH 0/5] New rescue mount options Josef Bacik
@ 2020-09-24 15:32 ` Josef Bacik
  2020-09-25  0:36   ` Qu Wenruo
  2020-09-28 12:37   ` Johannes Thumshirn
  2020-09-24 15:32 ` [PATCH 2/5] btrfs: push the NODATASUM check into btrfs_lookup_bio_sums Josef Bacik
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 18+ messages in thread
From: Josef Bacik @ 2020-09-24 15:32 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We're going to be adding more options that require RDONLY, so add a
helper to do the check and error out if we don't have RDONLY set.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/super.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 8840a4fa81eb..f99e89ec46b2 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -458,6 +458,17 @@ static const match_table_t rescue_tokens = {
 	{Opt_err, NULL},
 };
 
+static bool check_ro_option(struct btrfs_fs_info *fs_info, unsigned long opt,
+			    const char *opt_name)
+{
+	if (fs_info->mount_opt & opt) {
+		btrfs_err(fs_info, "%s must be used with ro mount option",
+			  opt_name);
+		return true;
+	}
+	return false;
+}
+
 static int parse_rescue_options(struct btrfs_fs_info *info, const char *options)
 {
 	char *opts;
@@ -968,14 +979,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 		}
 	}
 check:
-	/*
-	 * Extra check for current option against current flag
-	 */
-	if (btrfs_test_opt(info, NOLOGREPLAY) && !(new_flags & SB_RDONLY)) {
-		btrfs_err(info,
-			  "nologreplay must be used with ro mount option");
+	/* We're read-only, don't have to check. */
+	if (new_flags & SB_RDONLY)
+		goto out;
+
+	if (check_ro_option(info, BTRFS_MOUNT_NOLOGREPLAY, "nologreplay"))
 		ret = -EINVAL;
-	}
 out:
 	if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE) &&
 	    !btrfs_test_opt(info, FREE_SPACE_TREE) &&
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/5] btrfs: push the NODATASUM check into btrfs_lookup_bio_sums
  2020-09-24 15:32 [PATCH 0/5] New rescue mount options Josef Bacik
  2020-09-24 15:32 ` [PATCH 1/5] btrfs: unify the ro checking for " Josef Bacik
@ 2020-09-24 15:32 ` Josef Bacik
  2020-09-25  0:39   ` Qu Wenruo
  2020-09-28 12:39   ` Johannes Thumshirn
  2020-09-24 15:32 ` [PATCH 3/5] btrfs: introduce rescue=ignorebadroots Josef Bacik
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 18+ messages in thread
From: Josef Bacik @ 2020-09-24 15:32 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When we move to being able to handle NULL csum_roots it'll be cleaner to
just check in btrfs_lookup_bio_sums instead of at all of the caller
locations, so push the NODATASUM check into it as well so it's unified.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/compression.c | 14 +++++---------
 fs/btrfs/file-item.c   |  3 +++
 fs/btrfs/inode.c       | 12 +++++++++---
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index eeface30facd..7e1eb57b923c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -722,11 +722,9 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 			 */
 			refcount_inc(&cb->pending_bios);
 
-			if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)) {
-				ret = btrfs_lookup_bio_sums(inode, comp_bio,
-							    (u64)-1, sums);
-				BUG_ON(ret); /* -ENOMEM */
-			}
+			ret = btrfs_lookup_bio_sums(inode, comp_bio, (u64)-1,
+						    sums);
+			BUG_ON(ret); /* -ENOMEM */
 
 			nr_sectors = DIV_ROUND_UP(comp_bio->bi_iter.bi_size,
 						  fs_info->sectorsize);
@@ -751,10 +749,8 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 	ret = btrfs_bio_wq_end_io(fs_info, comp_bio, BTRFS_WQ_ENDIO_DATA);
 	BUG_ON(ret); /* -ENOMEM */
 
-	if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)) {
-		ret = btrfs_lookup_bio_sums(inode, comp_bio, (u64)-1, sums);
-		BUG_ON(ret); /* -ENOMEM */
-	}
+	ret = btrfs_lookup_bio_sums(inode, comp_bio, (u64)-1, sums);
+	BUG_ON(ret); /* -ENOMEM */
 
 	ret = btrfs_map_bio(fs_info, comp_bio, mirror_num);
 	if (ret) {
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 8f4f2bd6d9b9..8083d71d6af6 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -272,6 +272,9 @@ blk_status_t btrfs_lookup_bio_sums(struct inode *inode, struct bio *bio,
 	int count = 0;
 	u16 csum_size = btrfs_super_csum_size(fs_info->super_copy);
 
+	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
+		return BLK_STS_OK;
+
 	path = btrfs_alloc_path();
 	if (!path)
 		return BLK_STS_RESOURCE;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d526833b5ec0..d262944c4297 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2202,7 +2202,12 @@ blk_status_t btrfs_submit_data_bio(struct inode *inode, struct bio *bio,
 							   mirror_num,
 							   bio_flags);
 			goto out;
-		} else if (!skip_sum) {
+		} else {
+			/*
+			 * Lookup bio sums does extra checks around whether we
+			 * need to csum or not, which is why we ignore skip_sum
+			 * here.
+			 */
 			ret = btrfs_lookup_bio_sums(inode, bio, (u64)-1, NULL);
 			if (ret)
 				goto out;
@@ -7781,7 +7786,6 @@ static blk_qc_t btrfs_submit_direct(struct inode *inode, struct iomap *iomap,
 		struct bio *dio_bio, loff_t file_offset)
 {
 	const bool write = (bio_op(dio_bio) == REQ_OP_WRITE);
-	const bool csum = !(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM);
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	const bool raid56 = (btrfs_data_alloc_profile(fs_info) &
 			     BTRFS_BLOCK_GROUP_RAID56_MASK);
@@ -7808,10 +7812,12 @@ static blk_qc_t btrfs_submit_direct(struct inode *inode, struct iomap *iomap,
 		return BLK_QC_T_NONE;
 	}
 
-	if (!write && csum) {
+	if (!write) {
 		/*
 		 * Load the csums up front to reduce csum tree searches and
 		 * contention when submitting bios.
+		 *
+		 * If we have csums disabled this will do nothing.
 		 */
 		status = btrfs_lookup_bio_sums(inode, dio_bio, file_offset,
 					       dip->csums);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/5] btrfs: introduce rescue=ignorebadroots
  2020-09-24 15:32 [PATCH 0/5] New rescue mount options Josef Bacik
  2020-09-24 15:32 ` [PATCH 1/5] btrfs: unify the ro checking for " Josef Bacik
  2020-09-24 15:32 ` [PATCH 2/5] btrfs: push the NODATASUM check into btrfs_lookup_bio_sums Josef Bacik
@ 2020-09-24 15:32 ` Josef Bacik
  2020-09-25  0:47   ` Qu Wenruo
  2020-09-24 15:32 ` [PATCH 4/5] btrfs: introduce rescue=ignoredatacsums Josef Bacik
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Josef Bacik @ 2020-09-24 15:32 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

In the face of extent root corruption, or any other core fs wide root
corruption we will fail to mount the file system.  This makes recovery
kind of a pain, because you need to fall back to userspace tools to
scrape off data.  Instead provide a mechanism to gracefully handle bad
roots, so we can at least mount read-only and possibly recover data from
the file system.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-group.c | 48 +++++++++++++++++++++++++++++++
 fs/btrfs/block-rsv.c   |  8 ++++++
 fs/btrfs/compression.c |  3 +-
 fs/btrfs/ctree.h       |  1 +
 fs/btrfs/disk-io.c     | 65 ++++++++++++++++++++++++++----------------
 fs/btrfs/file-item.c   |  3 +-
 fs/btrfs/inode.c       |  6 +++-
 fs/btrfs/super.c       | 11 ++++++-
 fs/btrfs/volumes.c     |  7 +++++
 9 files changed, 124 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index c0f1d6818df7..7e32f7820543 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1985,6 +1985,51 @@ static int read_one_block_group(struct btrfs_fs_info *info,
 	return ret;
 }
 
+static int fill_dummy_bgs(struct btrfs_fs_info *fs_info)
+{
+	struct extent_map_tree *em_tree = &fs_info->mapping_tree;
+	struct extent_map *em;
+	struct map_lookup *map;
+	struct btrfs_block_group *bg;
+	struct btrfs_space_info *space_info;
+	struct rb_node *node;
+	int ret = 0;
+
+	for (node = rb_first_cached(&em_tree->map); node;
+	     node = rb_next(node)) {
+		em = rb_entry(node, struct extent_map, rb_node);
+		map = em->map_lookup;
+		bg = btrfs_create_block_group_cache(fs_info, em->start);
+		if (!bg) {
+			ret = -ENOMEM;
+			break;
+		}
+
+		/* Fill dummy cache as FULL */
+		bg->length = em->len;
+		bg->flags = map->type;
+		bg->last_byte_to_unpin = (u64)-1;
+		bg->cached = BTRFS_CACHE_FINISHED;
+		bg->used = em->len;
+		bg->flags = map->type;
+		ret = btrfs_add_block_group_cache(fs_info, bg);
+		if (ret) {
+			btrfs_remove_free_space_cache(bg);
+			btrfs_put_block_group(bg);
+			break;
+		}
+		btrfs_update_space_info(fs_info, bg->flags, em->len, em->len,
+					0, &space_info);
+		bg->space_info = space_info;
+		link_block_group(bg);
+
+		set_avail_alloc_bits(fs_info, bg->flags);
+	}
+	if (!ret)
+		btrfs_init_global_block_rsv(fs_info);
+	return ret;
+}
+
 int btrfs_read_block_groups(struct btrfs_fs_info *info)
 {
 	struct btrfs_path *path;
@@ -1995,6 +2040,9 @@ int btrfs_read_block_groups(struct btrfs_fs_info *info)
 	int need_clear = 0;
 	u64 cache_gen;
 
+	if (!info->extent_root)
+		return fill_dummy_bgs(info);
+
 	key.objectid = 0;
 	key.offset = 0;
 	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 7e1549a84fcc..58b38912498d 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -426,6 +426,14 @@ void btrfs_init_global_block_rsv(struct btrfs_fs_info *fs_info)
 	fs_info->delayed_block_rsv.space_info = space_info;
 	fs_info->delayed_refs_rsv.space_info = space_info;
 
+	/*
+	 * Our various recovery options can leave us with NULL roots, so check
+	 * here and just bail before we go deref'ing NULLS everywhere.
+	 */
+	if (!fs_info->extent_root || !fs_info->csum_root ||
+	    !fs_info->dev_root || !fs_info->chunk_root || !fs_info->tree_root)
+		return;
+
 	fs_info->extent_root->block_rsv = &fs_info->delayed_refs_rsv;
 	fs_info->csum_root->block_rsv = &fs_info->delayed_refs_rsv;
 	fs_info->dev_root->block_rsv = &fs_info->global_block_rsv;
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 7e1eb57b923c..46dc621f599d 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -150,7 +150,8 @@ static int check_compressed_csum(struct btrfs_inode *inode, struct bio *bio,
 	struct compressed_bio *cb = bio->bi_private;
 	u8 *cb_sum = cb->sums;
 
-	if (inode->flags & BTRFS_INODE_NODATASUM)
+	if (!fs_info->csum_root ||
+	    inode->flags & BTRFS_INODE_NODATASUM)
 		return 0;
 
 	shash->tfm = fs_info->csum_shash;
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index aac3d6f4e35b..fb3cfd0aaf1e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1295,6 +1295,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_fs_info *info)
 #define BTRFS_MOUNT_NOLOGREPLAY		(1 << 27)
 #define BTRFS_MOUNT_REF_VERIFY		(1 << 28)
 #define BTRFS_MOUNT_DISCARD_ASYNC	(1 << 29)
+#define BTRFS_MOUNT_IGNOREBADROOTS	(1 << 30)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
 #define BTRFS_DEFAULT_MAX_INLINE	(2048)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3d39f5d47ad3..5deedfb0e5c7 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2247,30 +2247,39 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 
 	root = btrfs_read_tree_root(tree_root, &location);
 	if (IS_ERR(root)) {
-		ret = PTR_ERR(root);
-		goto out;
+		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+			ret = PTR_ERR(root);
+			goto out;
+		}
+	} else {
+		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+		fs_info->extent_root = root;
 	}
-	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
-	fs_info->extent_root = root;
 
 	location.objectid = BTRFS_DEV_TREE_OBJECTID;
 	root = btrfs_read_tree_root(tree_root, &location);
 	if (IS_ERR(root)) {
-		ret = PTR_ERR(root);
-		goto out;
+		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+			ret = PTR_ERR(root);
+			goto out;
+		}
+	} else {
+		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+		fs_info->dev_root = root;
+		btrfs_init_devices_late(fs_info);
 	}
-	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
-	fs_info->dev_root = root;
-	btrfs_init_devices_late(fs_info);
 
 	location.objectid = BTRFS_CSUM_TREE_OBJECTID;
 	root = btrfs_read_tree_root(tree_root, &location);
 	if (IS_ERR(root)) {
-		ret = PTR_ERR(root);
-		goto out;
+		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+			ret = PTR_ERR(root);
+			goto out;
+		}
+	} else {
+		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+		fs_info->csum_root = root;
 	}
-	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
-	fs_info->csum_root = root;
 
 	/*
 	 * This tree can share blocks with some other fs tree during relocation
@@ -2279,11 +2288,14 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 	root = btrfs_get_fs_root(tree_root->fs_info,
 				 BTRFS_DATA_RELOC_TREE_OBJECTID, true);
 	if (IS_ERR(root)) {
-		ret = PTR_ERR(root);
-		goto out;
+		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+			ret = PTR_ERR(root);
+			goto out;
+		}
+	} else {
+		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+		fs_info->data_reloc_root = root;
 	}
-	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
-	fs_info->data_reloc_root = root;
 
 	location.objectid = BTRFS_QUOTA_TREE_OBJECTID;
 	root = btrfs_read_tree_root(tree_root, &location);
@@ -2296,9 +2308,11 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 	location.objectid = BTRFS_UUID_TREE_OBJECTID;
 	root = btrfs_read_tree_root(tree_root, &location);
 	if (IS_ERR(root)) {
-		ret = PTR_ERR(root);
-		if (ret != -ENOENT)
-			goto out;
+		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+			ret = PTR_ERR(root);
+			if (ret != -ENOENT)
+				goto out;
+		}
 	} else {
 		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
 		fs_info->uuid_root = root;
@@ -2308,11 +2322,14 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 		location.objectid = BTRFS_FREE_SPACE_TREE_OBJECTID;
 		root = btrfs_read_tree_root(tree_root, &location);
 		if (IS_ERR(root)) {
-			ret = PTR_ERR(root);
-			goto out;
+			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+				ret = PTR_ERR(root);
+				goto out;
+			}
+		}  else {
+			set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+			fs_info->free_space_root = root;
 		}
-		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
-		fs_info->free_space_root = root;
 	}
 
 	return 0;
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 8083d71d6af6..90c728652033 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -272,7 +272,8 @@ blk_status_t btrfs_lookup_bio_sums(struct inode *inode, struct bio *bio,
 	int count = 0;
 	u16 csum_size = btrfs_super_csum_size(fs_info->super_copy);
 
-	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
+	if (!fs_info->csum_root ||
+	    BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
 		return BLK_STS_OK;
 
 	path = btrfs_alloc_path();
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d262944c4297..27752c66dec4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2187,7 +2187,8 @@ blk_status_t btrfs_submit_data_bio(struct inode *inode, struct bio *bio,
 	int skip_sum;
 	int async = !atomic_read(&BTRFS_I(inode)->sync_writers);
 
-	skip_sum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM;
+	skip_sum = (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM) ||
+		!fs_info->csum_root;
 
 	if (btrfs_is_free_space_inode(BTRFS_I(inode)))
 		metadata = BTRFS_WQ_ENDIO_FREE_SPACE;
@@ -2844,6 +2845,9 @@ int btrfs_verify_data_csum(struct btrfs_io_bio *io_bio, u64 phy_offset,
 	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
 		return 0;
 
+	if (!root->fs_info->csum_root)
+		return 0;
+
 	if (root->root_key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID &&
 	    test_range_bit(io_tree, start, end, EXTENT_NODATASUM, 1, NULL)) {
 		clear_extent_bits(io_tree, start, end, EXTENT_NODATASUM);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f99e89ec46b2..7cc7a9233f5e 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -360,6 +360,7 @@ enum {
 	Opt_rescue,
 	Opt_usebackuproot,
 	Opt_nologreplay,
+	Opt_ignorebadroots,
 
 	/* Deprecated options */
 	Opt_recovery,
@@ -455,6 +456,7 @@ static const match_table_t tokens = {
 static const match_table_t rescue_tokens = {
 	{Opt_usebackuproot, "usebackuproot"},
 	{Opt_nologreplay, "nologreplay"},
+	{Opt_ignorebadroots, "ignorebadroots"},
 	{Opt_err, NULL},
 };
 
@@ -498,6 +500,10 @@ static int parse_rescue_options(struct btrfs_fs_info *info, const char *options)
 			btrfs_set_and_info(info, NOLOGREPLAY,
 					   "disabling log replay at mount time");
 			break;
+		case Opt_ignorebadroots:
+			btrfs_set_and_info(info, IGNOREBADROOTS,
+					   "ignoring bad roots");
+			break;
 		case Opt_err:
 			btrfs_info(info, "unrecognized rescue option '%s'", p);
 			ret = -EINVAL;
@@ -983,7 +989,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 	if (new_flags & SB_RDONLY)
 		goto out;
 
-	if (check_ro_option(info, BTRFS_MOUNT_NOLOGREPLAY, "nologreplay"))
+	if (check_ro_option(info, BTRFS_MOUNT_NOLOGREPLAY, "nologreplay") ||
+	    check_ro_option(info, BTRFS_MOUNT_IGNOREBADROOTS, "ignorebadroots"))
 		ret = -EINVAL;
 out:
 	if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE) &&
@@ -1430,6 +1437,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 		seq_puts(seq, ",notreelog");
 	if (btrfs_test_opt(info, NOLOGREPLAY))
 		seq_puts(seq, ",rescue=nologreplay");
+	if (btrfs_test_opt(info, IGNOREBADROOTS))
+		seq_puts(seq, ",rescue=ignorebadroots");
 	if (btrfs_test_opt(info, FLUSHONCOMMIT))
 		seq_puts(seq, ",flushoncommit");
 	if (btrfs_test_opt(info, DISCARD_SYNC))
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 46f4efd58652..08b3ca60f3df 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7656,6 +7656,13 @@ int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info)
 	u64 prev_dev_ext_end = 0;
 	int ret = 0;
 
+	/*
+	 * We don't have a dev_root because we mounted with ignorebadroots and
+	 * failed to load the root, so skip the verification.
+	 */
+	if (!root)
+		return 0;
+
 	key.objectid = 1;
 	key.type = BTRFS_DEV_EXTENT_KEY;
 	key.offset = 0;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/5] btrfs: introduce rescue=ignoredatacsums
  2020-09-24 15:32 [PATCH 0/5] New rescue mount options Josef Bacik
                   ` (2 preceding siblings ...)
  2020-09-24 15:32 ` [PATCH 3/5] btrfs: introduce rescue=ignorebadroots Josef Bacik
@ 2020-09-24 15:32 ` Josef Bacik
  2020-09-25  0:50   ` Qu Wenruo
  2020-09-24 15:32 ` [PATCH 5/5] btrfs: introduce rescue=all Josef Bacik
  2020-09-25  0:34 ` [PATCH 0/5] New rescue mount options Qu Wenruo
  5 siblings, 1 reply; 18+ messages in thread
From: Josef Bacik @ 2020-09-24 15:32 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

There are cases where you can end up with bad data csums because of
misbehaving applications.  This happens when an application modifies a
buffer in-flight when doing an O_DIRECT write.  In order to recover the
file we need a way to turn off data checksums so you can copy the file
off, and then you can delete the file and restore it properly later.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h   |  1 +
 fs/btrfs/disk-io.c | 21 ++++++++++++---------
 fs/btrfs/super.c   | 11 ++++++++++-
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index fb3cfd0aaf1e..397f5f6b88a4 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1296,6 +1296,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_fs_info *info)
 #define BTRFS_MOUNT_REF_VERIFY		(1 << 28)
 #define BTRFS_MOUNT_DISCARD_ASYNC	(1 << 29)
 #define BTRFS_MOUNT_IGNOREBADROOTS	(1 << 30)
+#define BTRFS_MOUNT_IGNOREDATACSUMS	(1 << 31)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
 #define BTRFS_DEFAULT_MAX_INLINE	(2048)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 5deedfb0e5c7..6f9d37567a10 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2269,16 +2269,19 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 		btrfs_init_devices_late(fs_info);
 	}
 
-	location.objectid = BTRFS_CSUM_TREE_OBJECTID;
-	root = btrfs_read_tree_root(tree_root, &location);
-	if (IS_ERR(root)) {
-		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
-			ret = PTR_ERR(root);
-			goto out;
+	/* If IGNOREDATASCUMS is set don't bother reading the csum root. */
+	if (!btrfs_test_opt(fs_info, IGNOREDATACSUMS)) {
+		location.objectid = BTRFS_CSUM_TREE_OBJECTID;
+		root = btrfs_read_tree_root(tree_root, &location);
+		if (IS_ERR(root)) {
+			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+				ret = PTR_ERR(root);
+				goto out;
+			}
+		} else {
+			set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+			fs_info->csum_root = root;
 		}
-	} else {
-		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
-		fs_info->csum_root = root;
 	}
 
 	/*
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 7cc7a9233f5e..2282f0240c1d 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -361,6 +361,7 @@ enum {
 	Opt_usebackuproot,
 	Opt_nologreplay,
 	Opt_ignorebadroots,
+	Opt_ignoredatacsums,
 
 	/* Deprecated options */
 	Opt_recovery,
@@ -457,6 +458,7 @@ static const match_table_t rescue_tokens = {
 	{Opt_usebackuproot, "usebackuproot"},
 	{Opt_nologreplay, "nologreplay"},
 	{Opt_ignorebadroots, "ignorebadroots"},
+	{Opt_ignoredatacsums, "ignoredatacsums"},
 	{Opt_err, NULL},
 };
 
@@ -504,6 +506,10 @@ static int parse_rescue_options(struct btrfs_fs_info *info, const char *options)
 			btrfs_set_and_info(info, IGNOREBADROOTS,
 					   "ignoring bad roots");
 			break;
+		case Opt_ignoredatacsums:
+			btrfs_set_and_info(info, IGNOREDATACSUMS,
+					   "ignoring data csums");
+			break;
 		case Opt_err:
 			btrfs_info(info, "unrecognized rescue option '%s'", p);
 			ret = -EINVAL;
@@ -990,7 +996,10 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 		goto out;
 
 	if (check_ro_option(info, BTRFS_MOUNT_NOLOGREPLAY, "nologreplay") ||
-	    check_ro_option(info, BTRFS_MOUNT_IGNOREBADROOTS, "ignorebadroots"))
+	    check_ro_option(info, BTRFS_MOUNT_IGNOREBADROOTS,
+			    "ignorebadroots") ||
+	    check_ro_option(info, BTRFS_MOUNT_IGNOREDATACSUMS,
+			    "ignoredatacsums"))
 		ret = -EINVAL;
 out:
 	if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE) &&
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/5] btrfs: introduce rescue=all
  2020-09-24 15:32 [PATCH 0/5] New rescue mount options Josef Bacik
                   ` (3 preceding siblings ...)
  2020-09-24 15:32 ` [PATCH 4/5] btrfs: introduce rescue=ignoredatacsums Josef Bacik
@ 2020-09-24 15:32 ` Josef Bacik
  2020-09-25  0:51   ` Qu Wenruo
  2020-09-25  0:34 ` [PATCH 0/5] New rescue mount options Qu Wenruo
  5 siblings, 1 reply; 18+ messages in thread
From: Josef Bacik @ 2020-09-24 15:32 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that we have the building blocks for some better recovery options
with corrupted file systems, add a rescue=all option to enable all of
the relevant rescue options.  This will allow distro's to simply default
to rescue=all for the "oh dear lord the world's on fire" recovery
without needing to know all the different options that we have and may
add in the future.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/super.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 2282f0240c1d..3412763a9a0d 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -362,6 +362,7 @@ enum {
 	Opt_nologreplay,
 	Opt_ignorebadroots,
 	Opt_ignoredatacsums,
+	Opt_all,
 
 	/* Deprecated options */
 	Opt_recovery,
@@ -459,6 +460,7 @@ static const match_table_t rescue_tokens = {
 	{Opt_nologreplay, "nologreplay"},
 	{Opt_ignorebadroots, "ignorebadroots"},
 	{Opt_ignoredatacsums, "ignoredatacsums"},
+	{Opt_all, "all"},
 	{Opt_err, NULL},
 };
 
@@ -510,6 +512,12 @@ static int parse_rescue_options(struct btrfs_fs_info *info, const char *options)
 			btrfs_set_and_info(info, IGNOREDATACSUMS,
 					   "ignoring data csums");
 			break;
+		case Opt_all:
+			btrfs_set_opt(info->mount_opt, IGNOREDATACSUMS);
+			btrfs_set_opt(info->mount_opt, IGNOREBADROOTS);
+			btrfs_set_opt(info->mount_opt, NOLOGREPLAY);
+			btrfs_info(info, "enabling all of the rescue options");
+			break;
 		case Opt_err:
 			btrfs_info(info, "unrecognized rescue option '%s'", p);
 			ret = -EINVAL;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] New rescue mount options
  2020-09-24 15:32 [PATCH 0/5] New rescue mount options Josef Bacik
                   ` (4 preceding siblings ...)
  2020-09-24 15:32 ` [PATCH 5/5] btrfs: introduce rescue=all Josef Bacik
@ 2020-09-25  0:34 ` Qu Wenruo
  5 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2020-09-25  0:34 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2592 bytes --]



On 2020/9/24 下午11:32, Josef Bacik wrote:
> Hello,
> 
> This is the next version of my rescue=all patches, this time broken up in to a
> few discrete patches, with some cleanups as well.  I have a PR for the xfstest
> that exercises these options, it can be found here
> 
>   https://github.com/btrfs/fstests/pull/35
> 
> This is the same idea as the previous versions, except I've made a mechanical
> change.  Instead we have rescue=ignorebadroots, which ignores any global roots
> that we may not be able to read.  We will still fail to mount if thinks like the
> chunk root or the tree root are corrupt, but this will allow us to mount read
> only if the extent root or csum root are completely hosed.
> 
> I've added a new patch this go around, rescue=ignoredatacsums.  Somebody had
> indicated that they would prefer that the original rescue=all allowed us to
> continue to read csums if the root was still intact.  In order to handle that
> usecase that's what you get with rescue=ignorebadroots, however if your csum
> tree is corrupt in the middle of the tree you could still end up with problems.
> Thus we have rescue=ignoredatacsums which will completely disable the csum tree
> as well.

The extra hard ignore for data csum is really good.

> 
> And finally we have rescue=all, which simply enables ignoredatacsums,
> ignorebadroots, and notreelogreplay.  We need an easy catch-all option for
> distros to fallback on to get users the highest probability of being able to
> recover their data, so we will use rescue=all to turn on all the fanciest rescue
> options that we have, and then use the discrete options for more fine grained
> recovery.  Thanks,

Now rescue=all makes more sense. It's just an alias for one to salvage
as much data as possible.

Thanks,
Qu
> 
> Josef
> 
> Josef Bacik (5):
>   btrfs: unify the ro checking for mount options
>   btrfs: push the NODATASUM check into btrfs_lookup_bio_sums
>   btrfs: introduce rescue=ignorebadroots
>   btrfs: introduce rescue=ignoredatacsums
>   btrfs: introduce rescue=all
> 
>  fs/btrfs/block-group.c | 48 +++++++++++++++++++++++++++
>  fs/btrfs/block-rsv.c   |  8 +++++
>  fs/btrfs/compression.c | 17 ++++------
>  fs/btrfs/ctree.h       |  2 ++
>  fs/btrfs/disk-io.c     | 74 +++++++++++++++++++++++++++---------------
>  fs/btrfs/file-item.c   |  4 +++
>  fs/btrfs/inode.c       | 18 +++++++---
>  fs/btrfs/super.c       | 49 ++++++++++++++++++++++++----
>  fs/btrfs/volumes.c     |  7 ++++
>  9 files changed, 179 insertions(+), 48 deletions(-)
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] btrfs: unify the ro checking for mount options
  2020-09-24 15:32 ` [PATCH 1/5] btrfs: unify the ro checking for " Josef Bacik
@ 2020-09-25  0:36   ` Qu Wenruo
  2020-09-28 12:37   ` Johannes Thumshirn
  1 sibling, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2020-09-25  0:36 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1744 bytes --]



On 2020/9/24 下午11:32, Josef Bacik wrote:
> We're going to be adding more options that require RDONLY, so add a
> helper to do the check and error out if we don't have RDONLY set.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/super.c | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 8840a4fa81eb..f99e89ec46b2 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -458,6 +458,17 @@ static const match_table_t rescue_tokens = {
>  	{Opt_err, NULL},
>  };
>  
> +static bool check_ro_option(struct btrfs_fs_info *fs_info, unsigned long opt,
> +			    const char *opt_name)
> +{
> +	if (fs_info->mount_opt & opt) {
> +		btrfs_err(fs_info, "%s must be used with ro mount option",
> +			  opt_name);
> +		return true;
> +	}
> +	return false;
> +}
> +
>  static int parse_rescue_options(struct btrfs_fs_info *info, const char *options)
>  {
>  	char *opts;
> @@ -968,14 +979,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
>  		}
>  	}
>  check:
> -	/*
> -	 * Extra check for current option against current flag
> -	 */
> -	if (btrfs_test_opt(info, NOLOGREPLAY) && !(new_flags & SB_RDONLY)) {
> -		btrfs_err(info,
> -			  "nologreplay must be used with ro mount option");
> +	/* We're read-only, don't have to check. */
> +	if (new_flags & SB_RDONLY)
> +		goto out;
> +
> +	if (check_ro_option(info, BTRFS_MOUNT_NOLOGREPLAY, "nologreplay"))
>  		ret = -EINVAL;
> -	}
>  out:
>  	if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE) &&
>  	    !btrfs_test_opt(info, FREE_SPACE_TREE) &&
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] btrfs: push the NODATASUM check into btrfs_lookup_bio_sums
  2020-09-24 15:32 ` [PATCH 2/5] btrfs: push the NODATASUM check into btrfs_lookup_bio_sums Josef Bacik
@ 2020-09-25  0:39   ` Qu Wenruo
  2020-09-28 18:28     ` Josef Bacik
  2020-09-28 12:39   ` Johannes Thumshirn
  1 sibling, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2020-09-25  0:39 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 4176 bytes --]



On 2020/9/24 下午11:32, Josef Bacik wrote:
> When we move to being able to handle NULL csum_roots it'll be cleaner to
> just check in btrfs_lookup_bio_sums instead of at all of the caller
> locations, so push the NODATASUM check into it as well so it's unified.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

But an off-topic question inlined below:

> ---
>  fs/btrfs/compression.c | 14 +++++---------
>  fs/btrfs/file-item.c   |  3 +++
>  fs/btrfs/inode.c       | 12 +++++++++---
>  3 files changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index eeface30facd..7e1eb57b923c 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -722,11 +722,9 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
>  			 */
>  			refcount_inc(&cb->pending_bios);
>  
> -			if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)) {
> -				ret = btrfs_lookup_bio_sums(inode, comp_bio,
> -							    (u64)-1, sums);
> -				BUG_ON(ret); /* -ENOMEM */

Is it really possible to have compressed extent without data csum?
Won't nodatacsum disable compression?

Or are we just here to handle some old compressed but not csumed data?

Thanks,
Qu

> -			}
> +			ret = btrfs_lookup_bio_sums(inode, comp_bio, (u64)-1,
> +						    sums);
> +			BUG_ON(ret); /* -ENOMEM */
>  
>  			nr_sectors = DIV_ROUND_UP(comp_bio->bi_iter.bi_size,
>  						  fs_info->sectorsize);
> @@ -751,10 +749,8 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
>  	ret = btrfs_bio_wq_end_io(fs_info, comp_bio, BTRFS_WQ_ENDIO_DATA);
>  	BUG_ON(ret); /* -ENOMEM */
>  
> -	if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)) {
> -		ret = btrfs_lookup_bio_sums(inode, comp_bio, (u64)-1, sums);
> -		BUG_ON(ret); /* -ENOMEM */
> -	}
> +	ret = btrfs_lookup_bio_sums(inode, comp_bio, (u64)-1, sums);
> +	BUG_ON(ret); /* -ENOMEM */
>  
>  	ret = btrfs_map_bio(fs_info, comp_bio, mirror_num);
>  	if (ret) {
> diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
> index 8f4f2bd6d9b9..8083d71d6af6 100644
> --- a/fs/btrfs/file-item.c
> +++ b/fs/btrfs/file-item.c
> @@ -272,6 +272,9 @@ blk_status_t btrfs_lookup_bio_sums(struct inode *inode, struct bio *bio,
>  	int count = 0;
>  	u16 csum_size = btrfs_super_csum_size(fs_info->super_copy);
>  
> +	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
> +		return BLK_STS_OK;
> +
>  	path = btrfs_alloc_path();
>  	if (!path)
>  		return BLK_STS_RESOURCE;
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index d526833b5ec0..d262944c4297 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -2202,7 +2202,12 @@ blk_status_t btrfs_submit_data_bio(struct inode *inode, struct bio *bio,
>  							   mirror_num,
>  							   bio_flags);
>  			goto out;
> -		} else if (!skip_sum) {
> +		} else {
> +			/*
> +			 * Lookup bio sums does extra checks around whether we
> +			 * need to csum or not, which is why we ignore skip_sum
> +			 * here.
> +			 */
>  			ret = btrfs_lookup_bio_sums(inode, bio, (u64)-1, NULL);
>  			if (ret)
>  				goto out;
> @@ -7781,7 +7786,6 @@ static blk_qc_t btrfs_submit_direct(struct inode *inode, struct iomap *iomap,
>  		struct bio *dio_bio, loff_t file_offset)
>  {
>  	const bool write = (bio_op(dio_bio) == REQ_OP_WRITE);
> -	const bool csum = !(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM);
>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>  	const bool raid56 = (btrfs_data_alloc_profile(fs_info) &
>  			     BTRFS_BLOCK_GROUP_RAID56_MASK);
> @@ -7808,10 +7812,12 @@ static blk_qc_t btrfs_submit_direct(struct inode *inode, struct iomap *iomap,
>  		return BLK_QC_T_NONE;
>  	}
>  
> -	if (!write && csum) {
> +	if (!write) {
>  		/*
>  		 * Load the csums up front to reduce csum tree searches and
>  		 * contention when submitting bios.
> +		 *
> +		 * If we have csums disabled this will do nothing.
>  		 */
>  		status = btrfs_lookup_bio_sums(inode, dio_bio, file_offset,
>  					       dip->csums);
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] btrfs: introduce rescue=ignorebadroots
  2020-09-24 15:32 ` [PATCH 3/5] btrfs: introduce rescue=ignorebadroots Josef Bacik
@ 2020-09-25  0:47   ` Qu Wenruo
  2020-09-28 18:24     ` Josef Bacik
  0 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2020-09-25  0:47 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1645 bytes --]



On 2020/9/24 下午11:32, Josef Bacik wrote:
> In the face of extent root corruption, or any other core fs wide root
> corruption we will fail to mount the file system.  This makes recovery
> kind of a pain, because you need to fall back to userspace tools to
> scrape off data.  Instead provide a mechanism to gracefully handle bad
> roots, so we can at least mount read-only and possibly recover data from
> the file system.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Mostly OK, but still a small problem inlined below.
[...]
> index 46f4efd58652..08b3ca60f3df 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -7656,6 +7656,13 @@ int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info)
>  	u64 prev_dev_ext_end = 0;
>  	int ret = 0;
>  
> +	/*
> +	 * We don't have a dev_root because we mounted with ignorebadroots and
> +	 * failed to load the root, so skip the verification.
> +	 */
> +	if (!root)
> +		return 0;
> +

The check itself is mostly for write, to ensure we won't have
missing/unnecessary dev extents to mess up chunk allocation.

For RO operations, the check makes little sense, and can be safely
ignored for ignorebadroots.

Furthermore this only handles the case where the device tree root is
corrupted.
But if only part of the device tree is corrupted, we still continue
checking and fail to mount.

It's better to skip the whole check for dev extents if we're using
ignorebadroots rescue option.
No matter if the root is corrupted or not.


Thanks,
Qu

>  	key.objectid = 1;
>  	key.type = BTRFS_DEV_EXTENT_KEY;
>  	key.offset = 0;
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] btrfs: introduce rescue=ignoredatacsums
  2020-09-24 15:32 ` [PATCH 4/5] btrfs: introduce rescue=ignoredatacsums Josef Bacik
@ 2020-09-25  0:50   ` Qu Wenruo
  0 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2020-09-25  0:50 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 4180 bytes --]



On 2020/9/24 下午11:32, Josef Bacik wrote:
> There are cases where you can end up with bad data csums because of
> misbehaving applications.  This happens when an application modifies a
> buffer in-flight when doing an O_DIRECT write.  In order to recover the
> file we need a way to turn off data checksums so you can copy the file
> off, and then you can delete the file and restore it properly later.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/ctree.h   |  1 +
>  fs/btrfs/disk-io.c | 21 ++++++++++++---------
>  fs/btrfs/super.c   | 11 ++++++++++-
>  3 files changed, 23 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index fb3cfd0aaf1e..397f5f6b88a4 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -1296,6 +1296,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_fs_info *info)
>  #define BTRFS_MOUNT_REF_VERIFY		(1 << 28)
>  #define BTRFS_MOUNT_DISCARD_ASYNC	(1 << 29)
>  #define BTRFS_MOUNT_IGNOREBADROOTS	(1 << 30)
> +#define BTRFS_MOUNT_IGNOREDATACSUMS	(1 << 31)
>  
>  #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
>  #define BTRFS_DEFAULT_MAX_INLINE	(2048)
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 5deedfb0e5c7..6f9d37567a10 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2269,16 +2269,19 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
>  		btrfs_init_devices_late(fs_info);
>  	}
>  
> -	location.objectid = BTRFS_CSUM_TREE_OBJECTID;
> -	root = btrfs_read_tree_root(tree_root, &location);
> -	if (IS_ERR(root)) {
> -		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
> -			ret = PTR_ERR(root);
> -			goto out;
> +	/* If IGNOREDATASCUMS is set don't bother reading the csum root. */
> +	if (!btrfs_test_opt(fs_info, IGNOREDATACSUMS)) {

This indeed matches the name, ignoredatacsums, no matter if the data
csum matches or not.

I guess if the user is using this option, they really don't bother the
datacsum and just want to grab whatever they can get.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> +		location.objectid = BTRFS_CSUM_TREE_OBJECTID;
> +		root = btrfs_read_tree_root(tree_root, &location);
> +		if (IS_ERR(root)) {
> +			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
> +				ret = PTR_ERR(root);
> +				goto out;
> +			}
> +		} else {
> +			set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
> +			fs_info->csum_root = root;
>  		}
> -	} else {
> -		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
> -		fs_info->csum_root = root;
>  	}
>  
>  	/*
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 7cc7a9233f5e..2282f0240c1d 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -361,6 +361,7 @@ enum {
>  	Opt_usebackuproot,
>  	Opt_nologreplay,
>  	Opt_ignorebadroots,
> +	Opt_ignoredatacsums,
>  
>  	/* Deprecated options */
>  	Opt_recovery,
> @@ -457,6 +458,7 @@ static const match_table_t rescue_tokens = {
>  	{Opt_usebackuproot, "usebackuproot"},
>  	{Opt_nologreplay, "nologreplay"},
>  	{Opt_ignorebadroots, "ignorebadroots"},
> +	{Opt_ignoredatacsums, "ignoredatacsums"},
>  	{Opt_err, NULL},
>  };
>  
> @@ -504,6 +506,10 @@ static int parse_rescue_options(struct btrfs_fs_info *info, const char *options)
>  			btrfs_set_and_info(info, IGNOREBADROOTS,
>  					   "ignoring bad roots");
>  			break;
> +		case Opt_ignoredatacsums:
> +			btrfs_set_and_info(info, IGNOREDATACSUMS,
> +					   "ignoring data csums");
> +			break;
>  		case Opt_err:
>  			btrfs_info(info, "unrecognized rescue option '%s'", p);
>  			ret = -EINVAL;
> @@ -990,7 +996,10 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
>  		goto out;
>  
>  	if (check_ro_option(info, BTRFS_MOUNT_NOLOGREPLAY, "nologreplay") ||
> -	    check_ro_option(info, BTRFS_MOUNT_IGNOREBADROOTS, "ignorebadroots"))
> +	    check_ro_option(info, BTRFS_MOUNT_IGNOREBADROOTS,
> +			    "ignorebadroots") ||
> +	    check_ro_option(info, BTRFS_MOUNT_IGNOREDATACSUMS,
> +			    "ignoredatacsums"))
>  		ret = -EINVAL;
>  out:
>  	if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE) &&
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] btrfs: introduce rescue=all
  2020-09-24 15:32 ` [PATCH 5/5] btrfs: introduce rescue=all Josef Bacik
@ 2020-09-25  0:51   ` Qu Wenruo
  0 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2020-09-25  0:51 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1780 bytes --]



On 2020/9/24 下午11:32, Josef Bacik wrote:
> Now that we have the building blocks for some better recovery options
> with corrupted file systems, add a rescue=all option to enable all of
> the relevant rescue options.  This will allow distro's to simply default
> to rescue=all for the "oh dear lord the world's on fire" recovery
> without needing to know all the different options that we have and may
> add in the future.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/super.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 2282f0240c1d..3412763a9a0d 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -362,6 +362,7 @@ enum {
>  	Opt_nologreplay,
>  	Opt_ignorebadroots,
>  	Opt_ignoredatacsums,
> +	Opt_all,
>  
>  	/* Deprecated options */
>  	Opt_recovery,
> @@ -459,6 +460,7 @@ static const match_table_t rescue_tokens = {
>  	{Opt_nologreplay, "nologreplay"},
>  	{Opt_ignorebadroots, "ignorebadroots"},
>  	{Opt_ignoredatacsums, "ignoredatacsums"},
> +	{Opt_all, "all"},
>  	{Opt_err, NULL},
>  };
>  
> @@ -510,6 +512,12 @@ static int parse_rescue_options(struct btrfs_fs_info *info, const char *options)
>  			btrfs_set_and_info(info, IGNOREDATACSUMS,
>  					   "ignoring data csums");
>  			break;
> +		case Opt_all:
> +			btrfs_set_opt(info->mount_opt, IGNOREDATACSUMS);
> +			btrfs_set_opt(info->mount_opt, IGNOREBADROOTS);
> +			btrfs_set_opt(info->mount_opt, NOLOGREPLAY);
> +			btrfs_info(info, "enabling all of the rescue options");
> +			break;
>  		case Opt_err:
>  			btrfs_info(info, "unrecognized rescue option '%s'", p);
>  			ret = -EINVAL;
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] btrfs: unify the ro checking for mount options
  2020-09-24 15:32 ` [PATCH 1/5] btrfs: unify the ro checking for " Josef Bacik
  2020-09-25  0:36   ` Qu Wenruo
@ 2020-09-28 12:37   ` Johannes Thumshirn
  2020-09-28 18:23     ` Josef Bacik
  1 sibling, 1 reply; 18+ messages in thread
From: Johannes Thumshirn @ 2020-09-28 12:37 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

On 24/09/2020 17:33, Josef Bacik wrote:
> We're going to be adding more options that require RDONLY, so add a
> helper to do the check and error out if we don't have RDONLY set.
>
> +	/* We're read-only, don't have to check. */
> +	if (new_flags & SB_RDONLY)
> +		goto out;
> +

Why aren't you moving the SB_RDONLY check into the new check_ro_option() as well?
This is what I would have thought this patch does after just reading the commit message.

Thanks,
	Johannes

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] btrfs: push the NODATASUM check into btrfs_lookup_bio_sums
  2020-09-24 15:32 ` [PATCH 2/5] btrfs: push the NODATASUM check into btrfs_lookup_bio_sums Josef Bacik
  2020-09-25  0:39   ` Qu Wenruo
@ 2020-09-28 12:39   ` Johannes Thumshirn
  1 sibling, 0 replies; 18+ messages in thread
From: Johannes Thumshirn @ 2020-09-28 12:39 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] btrfs: unify the ro checking for mount options
  2020-09-28 12:37   ` Johannes Thumshirn
@ 2020-09-28 18:23     ` Josef Bacik
  2020-09-29  6:36       ` Johannes Thumshirn
  0 siblings, 1 reply; 18+ messages in thread
From: Josef Bacik @ 2020-09-28 18:23 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-btrfs, kernel-team

On 9/28/20 8:37 AM, Johannes Thumshirn wrote:
> On 24/09/2020 17:33, Josef Bacik wrote:
>> We're going to be adding more options that require RDONLY, so add a
>> helper to do the check and error out if we don't have RDONLY set.
>>
>> +	/* We're read-only, don't have to check. */
>> +	if (new_flags & SB_RDONLY)
>> +		goto out;
>> +
> 
> Why aren't you moving the SB_RDONLY check into the new check_ro_option() as well?
> This is what I would have thought this patch does after just reading the commit message.
> 

To avoid the multiple calls if we're not read only, otherwise it'll be multiple 
function calls to check that that SB_RDONLY is set.  The compiler will probably 
optimize that away, but I just went with this instead.  I'm good either way if 
people have strong opinions one way or the other.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] btrfs: introduce rescue=ignorebadroots
  2020-09-25  0:47   ` Qu Wenruo
@ 2020-09-28 18:24     ` Josef Bacik
  0 siblings, 0 replies; 18+ messages in thread
From: Josef Bacik @ 2020-09-28 18:24 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kernel-team

On 9/24/20 8:47 PM, Qu Wenruo wrote:
> 
> 
> On 2020/9/24 下午11:32, Josef Bacik wrote:
>> In the face of extent root corruption, or any other core fs wide root
>> corruption we will fail to mount the file system.  This makes recovery
>> kind of a pain, because you need to fall back to userspace tools to
>> scrape off data.  Instead provide a mechanism to gracefully handle bad
>> roots, so we can at least mount read-only and possibly recover data from
>> the file system.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> Mostly OK, but still a small problem inlined below.
> [...]
>> index 46f4efd58652..08b3ca60f3df 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -7656,6 +7656,13 @@ int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info)
>>   	u64 prev_dev_ext_end = 0;
>>   	int ret = 0;
>>   
>> +	/*
>> +	 * We don't have a dev_root because we mounted with ignorebadroots and
>> +	 * failed to load the root, so skip the verification.
>> +	 */
>> +	if (!root)
>> +		return 0;
>> +
> 
> The check itself is mostly for write, to ensure we won't have
> missing/unnecessary dev extents to mess up chunk allocation.
> 
> For RO operations, the check makes little sense, and can be safely
> ignored for ignorebadroots.
> 
> Furthermore this only handles the case where the device tree root is
> corrupted.
> But if only part of the device tree is corrupted, we still continue
> checking and fail to mount.
> 
> It's better to skip the whole check for dev extents if we're using
> ignorebadroots rescue option.
> No matter if the root is corrupted or not.
> 

Yeah good point, I'll fix this up.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] btrfs: push the NODATASUM check into btrfs_lookup_bio_sums
  2020-09-25  0:39   ` Qu Wenruo
@ 2020-09-28 18:28     ` Josef Bacik
  0 siblings, 0 replies; 18+ messages in thread
From: Josef Bacik @ 2020-09-28 18:28 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kernel-team

On 9/24/20 8:39 PM, Qu Wenruo wrote:
> 
> 
> On 2020/9/24 下午11:32, Josef Bacik wrote:
>> When we move to being able to handle NULL csum_roots it'll be cleaner to
>> just check in btrfs_lookup_bio_sums instead of at all of the caller
>> locations, so push the NODATASUM check into it as well so it's unified.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> 
> But an off-topic question inlined below:
> 
>> ---
>>   fs/btrfs/compression.c | 14 +++++---------
>>   fs/btrfs/file-item.c   |  3 +++
>>   fs/btrfs/inode.c       | 12 +++++++++---
>>   3 files changed, 17 insertions(+), 12 deletions(-)
>>
>> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
>> index eeface30facd..7e1eb57b923c 100644
>> --- a/fs/btrfs/compression.c
>> +++ b/fs/btrfs/compression.c
>> @@ -722,11 +722,9 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
>>   			 */
>>   			refcount_inc(&cb->pending_bios);
>>   
>> -			if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)) {
>> -				ret = btrfs_lookup_bio_sums(inode, comp_bio,
>> -							    (u64)-1, sums);
>> -				BUG_ON(ret); /* -ENOMEM */
> 
> Is it really possible to have compressed extent without data csum?
> Won't nodatacsum disable compression?
> 
> Or are we just here to handle some old compressed but not csumed data?
> 

We used to allow it, so I'm content to leave this here.  Maybe at some point 
we'll allow it in the future, but IDK it doesn't hurt anything to handle it 
here.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] btrfs: unify the ro checking for mount options
  2020-09-28 18:23     ` Josef Bacik
@ 2020-09-29  6:36       ` Johannes Thumshirn
  0 siblings, 0 replies; 18+ messages in thread
From: Johannes Thumshirn @ 2020-09-29  6:36 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

On 28/09/2020 20:23, Josef Bacik wrote:
> To avoid the multiple calls if we're not read only, otherwise it'll be multiple 
> function calls to check that that SB_RDONLY is set.  The compiler will probably 
> optimize that away, but I just went with this instead.  I'm good either way if 
> people have strong opinions one way or the other.

The name check_ro_option() really suggests it checking for a read-only option and
having multiple calls if RO isn't set really won't be a problem IMHO as we're not
really in the fast-path here, but I don't have a strong opinion on it either.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-09-29  6:36 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-24 15:32 [PATCH 0/5] New rescue mount options Josef Bacik
2020-09-24 15:32 ` [PATCH 1/5] btrfs: unify the ro checking for " Josef Bacik
2020-09-25  0:36   ` Qu Wenruo
2020-09-28 12:37   ` Johannes Thumshirn
2020-09-28 18:23     ` Josef Bacik
2020-09-29  6:36       ` Johannes Thumshirn
2020-09-24 15:32 ` [PATCH 2/5] btrfs: push the NODATASUM check into btrfs_lookup_bio_sums Josef Bacik
2020-09-25  0:39   ` Qu Wenruo
2020-09-28 18:28     ` Josef Bacik
2020-09-28 12:39   ` Johannes Thumshirn
2020-09-24 15:32 ` [PATCH 3/5] btrfs: introduce rescue=ignorebadroots Josef Bacik
2020-09-25  0:47   ` Qu Wenruo
2020-09-28 18:24     ` Josef Bacik
2020-09-24 15:32 ` [PATCH 4/5] btrfs: introduce rescue=ignoredatacsums Josef Bacik
2020-09-25  0:50   ` Qu Wenruo
2020-09-24 15:32 ` [PATCH 5/5] btrfs: introduce rescue=all Josef Bacik
2020-09-25  0:51   ` Qu Wenruo
2020-09-25  0:34 ` [PATCH 0/5] New rescue mount options Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.