* [PATCH v3 00/54] Cleanup error handling in relocation
@ 2020-12-02 19:50 Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 01/54] btrfs: fix error handling in commit_fs_roots Josef Bacik
                   ` (53 more replies)
  0 siblings, 54 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

v2->v3:
- A lot of extra patches fixing various things that I encountered while
  debugging the corruption problem that was uncovered by these patches.
- Fixed the panic that Zygo was seeing and other issues.
- Fixed up the comments from Nikolay and Filipe.

A slight note: the first set of patches could probably be taken now.  In fact

  btrfs: fix error handling in commit_fs_roots

was sent earlier this week; it is very important and needs to be reviewed and
merged ASAP.  The following are safe and could be merged outside of the rest of
this series:

  btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block
  btrfs: fix lockdep splat in btrfs_recover_relocation
  btrfs: keep track of the root owner for relocation reads
  btrfs: noinline btrfs_should_cancel_balance
  btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node
  btrfs: pass down the tree block level through ref-verify
  btrfs: make sure owner is set in ref-verify
  btrfs: don't clear ret in btrfs_start_dirty_block_groups

The rest are obviously all about the actual error handling.

v1->v2:
- fixed a bug where I accidentally dropped reading flags in relocate_block_group
  when I dropped the extra checks that we handle in the tree checker.

--- Original message ---
Hello,

Relocation is the last place that is not able to handle errors at all, which
results in all sorts of lovely panics if you encounter corruptions or IO errors.
I'm going to start cleaning up relocation, but before I move code around I want
the error handling to be somewhat sane, so I'm not changing behavior and error
handling at the same time.

These patches are purely about error handling; there is no behavior change
other than returning errors up the chain properly.  There is a lot of room for
follow up cleanups, which will happen next.  However I wanted to get this series
done today and out so we could get it merged ASAP, and then the follow up
cleanups can happen later as they are less important and less critical.

The only exceptions to the above are the patch to add the error injection sites
for btrfs_cow_block and btrfs_search_slot, and a lockdep fix that I discovered
while running my tests; those are the first two patches in the series.

I tested this with my error injection stress test, where I keep track of all
stack traces that have been tested and only inject errors when we have a new
stack trace, which means I should have covered all of the various error
conditions.  With this patchset I'm no longer panicking while stressing the error
conditions.  Thanks,

Josef
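
The "only inject on a new stack trace" policy described above can be sketched
in userspace C.  This is a minimal illustration under stated assumptions, not
the actual test harness: the names seen_stacks, hash_stack and should_inject
are hypothetical, the stack is assumed to arrive as an array of return
addresses, and a fixed-size table with a linear scan stands in for whatever
bookkeeping the real tool uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEEN 1024	/* hypothetical cap on remembered stacks */

struct seen_stacks {
	uint64_t hashes[MAX_SEEN];
	size_t count;
};

/* FNV-1a over the bytes of every return address in the stack. */
static uint64_t hash_stack(const uintptr_t *frames, size_t nframes)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < nframes; i++) {
		for (size_t b = 0; b < sizeof(uintptr_t); b++) {
			h ^= (frames[i] >> (8 * b)) & 0xff;
			h *= 0x100000001b3ULL;
		}
	}
	return h;
}

/*
 * Return true exactly once per distinct stack: the first time a stack is
 * seen it is recorded and an error should be injected; later hits on the
 * same stack run without injection.
 */
bool should_inject(struct seen_stacks *seen, const uintptr_t *frames,
		   size_t nframes)
{
	uint64_t h;

	assert(seen != NULL && frames != NULL);
	h = hash_stack(frames, nframes);
	for (size_t i = 0; i < seen->count; i++)
		if (seen->hashes[i] == h)
			return false;
	if (seen->count == MAX_SEEN)
		return false;	/* table full: stop injecting */
	seen->hashes[seen->count++] = h;
	return true;
}
```

Calling should_inject() twice with the same frames returns true and then
false, so each call path gets exactly one injected failure.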

Josef Bacik (54):
  btrfs: fix error handling in commit_fs_roots
  btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block
  btrfs: fix lockdep splat in btrfs_recover_relocation
  btrfs: keep track of the root owner for relocation reads
  btrfs: noinline btrfs_should_cancel_balance
  btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node
  btrfs: pass down the tree block level through ref-verify
  btrfs: make sure owner is set in ref-verify
  btrfs: don't clear ret in btrfs_start_dirty_block_groups
  btrfs: convert some BUG_ON()'s to ASSERT()'s in do_relocation
  btrfs: convert BUG_ON()'s in relocate_tree_block
  btrfs: return an error from btrfs_record_root_in_trans
  btrfs: handle errors from select_reloc_root()
  btrfs: convert BUG_ON()'s in select_reloc_root() to proper errors
  btrfs: check record_root_in_trans related failures in
    select_reloc_root
  btrfs: do proper error handling in record_reloc_root_in_trans
  btrfs: handle btrfs_record_root_in_trans failure in
    btrfs_rename_exchange
  btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename
  btrfs: handle btrfs_record_root_in_trans failure in
    btrfs_delete_subvolume
  btrfs: handle btrfs_record_root_in_trans failure in
    btrfs_recover_log_trees
  btrfs: handle btrfs_record_root_in_trans failure in create_subvol
  btrfs: btrfs: handle btrfs_record_root_in_trans failure in
    relocate_tree_block
  btrfs: handle btrfs_record_root_in_trans failure in start_transaction
  btrfs: handle record_root_in_trans failure in qgroup_account_snapshot
  btrfs: handle record_root_in_trans failure in
    btrfs_record_root_in_trans
  btrfs: handle record_root_in_trans failure in create_pending_snapshot
  btrfs: do not panic in __add_reloc_root
  btrfs: have proper error handling in btrfs_init_reloc_root
  btrfs: do proper error handling in create_reloc_root
  btrfs: validate ->reloc_root after recording root in trans
  btrfs: handle btrfs_update_reloc_root failure in commit_fs_roots
  btrfs: change insert_dirty_subvol to return errors
  btrfs: handle btrfs_update_reloc_root failure in insert_dirty_subvol
  btrfs: handle btrfs_update_reloc_root failure in prepare_to_merge
  btrfs: do proper error handling in btrfs_update_reloc_root
  btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s
  btrfs: handle initial btrfs_cow_block error in replace_path
  btrfs: handle the loop btrfs_cow_block error in replace_path
  btrfs: handle btrfs_search_slot failure in replace_path
  btrfs: handle errors in reference count manipulation in replace_path
  btrfs: handle extent reference errors in do_relocation
  btrfs: check for BTRFS_BLOCK_FLAG_FULL_BACKREF being set improperly
  btrfs: remove the extent item sanity checks in relocate_block_group
  btrfs: do proper error handling in create_reloc_inode
  btrfs: handle __add_reloc_root failure in btrfs_recover_relocation
  btrfs: handle __add_reloc_root failure in btrfs_reloc_post_snapshot
  btrfs: cleanup error handling in prepare_to_merge
  btrfs: handle extent corruption with select_one_root properly
  btrfs: do proper error handling in merge_reloc_roots
  btrfs: check return value of btrfs_commit_transaction in relocation
  btrfs: do not WARN_ON() if we can't find the reloc root
  btrfs: print the actual offset in btrfs_root_name
  btrfs: fix reloc root leak with 0 ref reloc roots on recovery
  btrfs: splice remaining dirty_bg's onto the transaction dirty bg list

 fs/btrfs/backref.c      |   9 +-
 fs/btrfs/block-group.c  |   6 +-
 fs/btrfs/ctree.c        |   2 +
 fs/btrfs/disk-io.c      |   2 +-
 fs/btrfs/inode.c        |  20 +-
 fs/btrfs/ioctl.c        |   6 +-
 fs/btrfs/print-tree.c   |  10 +-
 fs/btrfs/print-tree.h   |   2 +-
 fs/btrfs/ref-verify.c   |  43 ++--
 fs/btrfs/relocation.c   | 438 +++++++++++++++++++++++++++++++---------
 fs/btrfs/transaction.c  |  46 +++--
 fs/btrfs/tree-checker.c |   5 +
 fs/btrfs/tree-log.c     |   8 +-
 fs/btrfs/volumes.c      |   2 +
 14 files changed, 440 insertions(+), 159 deletions(-)

-- 
2.26.2



* [PATCH v3 01/54] btrfs: fix error handling in commit_fs_roots
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  1:45   ` Qu Wenruo
  2020-12-03  8:09   ` Johannes Thumshirn
  2020-12-02 19:50 ` [PATCH v3 02/54] btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block Josef Bacik
                   ` (52 subsequent siblings)
  53 siblings, 2 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While doing error injection I would sometimes get a corrupt file system.
This is because I was injecting errors at btrfs_search_slot, but would
only do it one time per stack.  This uncovered a problem in
commit_fs_roots, where if we got an error we would just break.  However
we're in a nested loop, so the break only leaves the inner loop; the
outer loop keeps looking up dirty fs roots, and subsequent root updates
would succeed, clearing the error value.

This isn't likely to happen in real scenarios, however we could
potentially get a random ENOMEM once and then not again, and we'd end up
with a corrupted file system.  Fix this by moving the error checking
around a bit to the nested loop, as this is the only place where
something will fail, and return the error as soon as it occurs.

With this patch my reproducer no longer corrupts the file system.
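
The failure mode can be reproduced in a small userspace model.  This is only
a sketch: results[] stands in for the return values of successive
btrfs_update_root() calls, and commit_roots_old / commit_roots_new are
hypothetical names for the loop shape before and after this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define GANG_SIZE 8	/* mirrors the gang[8] lookup batch */

/* Old shape: err outlives both loops and break only exits the inner one. */
int commit_roots_old(const int *results, size_t nroots)
{
	size_t next = 0;
	int err = 0;

	assert(results != NULL || nroots == 0);
	while (next < nroots) {
		size_t end = nroots - next > GANG_SIZE ? next + GANG_SIZE : nroots;

		while (next < end) {
			err = results[next++];	/* models btrfs_update_root() */
			if (err)
				break;	/* the outer loop keeps going */
		}
	}
	return err;	/* a later success has overwritten the failure */
}

/* New shape: the error is returned as soon as it occurs. */
int commit_roots_new(const int *results, size_t nroots)
{
	assert(results != NULL || nroots == 0);
	for (size_t i = 0; i < nroots; i++) {
		int err = results[i];

		if (err)
			return err;
	}
	return 0;
}
```

With results of { 0, -ENOMEM, 0, 0 } the old shape reports success because the
two later updates reset err, while the new shape returns -ENOMEM.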

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 8e0f7a1029c6..a614f7699ce4 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1319,7 +1319,6 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
 	struct btrfs_root *gang[8];
 	int i;
 	int ret;
-	int err = 0;
 
 	spin_lock(&fs_info->fs_roots_radix_lock);
 	while (1) {
@@ -1331,6 +1330,8 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
 			break;
 		for (i = 0; i < ret; i++) {
 			struct btrfs_root *root = gang[i];
+			int err;
+
 			radix_tree_tag_clear(&fs_info->fs_roots_radix,
 					(unsigned long)root->root_key.objectid,
 					BTRFS_ROOT_TRANS_TAG);
@@ -1353,14 +1354,14 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
 			err = btrfs_update_root(trans, fs_info->tree_root,
 						&root->root_key,
 						&root->root_item);
-			spin_lock(&fs_info->fs_roots_radix_lock);
 			if (err)
-				break;
+				return err;
+			spin_lock(&fs_info->fs_roots_radix_lock);
 			btrfs_qgroup_free_meta_all_pertrans(root);
 		}
 	}
 	spin_unlock(&fs_info->fs_roots_radix_lock);
-	return err;
+	return 0;
 }
 
 /*
-- 
2.26.2



* [PATCH v3 02/54] btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 01/54] btrfs: fix error handling in commit_fs_roots Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  1:48   ` Qu Wenruo
  2020-12-03  8:21   ` Johannes Thumshirn
  2020-12-02 19:50 ` [PATCH v3 03/54] btrfs: fix lockdep splat in btrfs_recover_relocation Josef Bacik
                   ` (51 subsequent siblings)
  53 siblings, 2 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

The following patches are going to address error handling in relocation.
In order to test those patches I need to be able to inject errors in
btrfs_search_slot and btrfs_cow_block, as we call both of these pretty
often in different cases during relocation.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index e5a0941c4bde..f40d3a2590a5 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1494,6 +1494,7 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
 
 	return ret;
 }
+ALLOW_ERROR_INJECTION(btrfs_cow_block, ERRNO);
 
 /*
  * helper function for defrag to decide if two blocks pointed to by a
@@ -2800,6 +2801,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 		btrfs_release_path(p);
 	return ret;
 }
+ALLOW_ERROR_INJECTION(btrfs_search_slot, ERRNO);
 
 /*
  * Like btrfs_search_slot, this looks for a key in the given tree. It uses the
-- 
2.26.2



* [PATCH v3 03/54] btrfs: fix lockdep splat in btrfs_recover_relocation
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 01/54] btrfs: fix error handling in commit_fs_roots Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 02/54] btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  1:49   ` Qu Wenruo
  2020-12-03  8:44   ` Johannes Thumshirn
  2020-12-02 19:50 ` [PATCH v3 04/54] btrfs: keep track of the root owner for relocation reads Josef Bacik
                   ` (50 subsequent siblings)
  53 siblings, 2 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While testing the error paths of relocation I hit the following lockdep
splat

======================================================
WARNING: possible circular locking dependency detected
5.10.0-rc6+ #217 Not tainted
------------------------------------------------------
mount/779 is trying to acquire lock:
ffffa0e676945418 (&fs_info->balance_mutex){+.+.}-{3:3}, at: btrfs_recover_balance+0x2f0/0x340

but task is already holding lock:
ffffa0e60ee31da8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x100

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (btrfs-root-00){++++}-{3:3}:
       down_read_nested+0x43/0x130
       __btrfs_tree_read_lock+0x27/0x100
       btrfs_read_lock_root_node+0x31/0x40
       btrfs_search_slot+0x462/0x8f0
       btrfs_update_root+0x55/0x2b0
       btrfs_drop_snapshot+0x398/0x750
       clean_dirty_subvols+0xdf/0x120
       btrfs_recover_relocation+0x534/0x5a0
       btrfs_start_pre_rw_mount+0xcb/0x170
       open_ctree+0x151f/0x1726
       btrfs_mount_root.cold+0x12/0xea
       legacy_get_tree+0x30/0x50
       vfs_get_tree+0x28/0xc0
       vfs_kern_mount.part.0+0x71/0xb0
       btrfs_mount+0x10d/0x380
       legacy_get_tree+0x30/0x50
       vfs_get_tree+0x28/0xc0
       path_mount+0x433/0xc10
       __x64_sys_mount+0xe3/0x120
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #1 (sb_internal#2){.+.+}-{0:0}:
       start_transaction+0x444/0x700
       insert_balance_item.isra.0+0x37/0x320
       btrfs_balance+0x354/0xf40
       btrfs_ioctl_balance+0x2cf/0x380
       __x64_sys_ioctl+0x83/0xb0
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (&fs_info->balance_mutex){+.+.}-{3:3}:
       __lock_acquire+0x1120/0x1e10
       lock_acquire+0x116/0x370
       __mutex_lock+0x7e/0x7b0
       btrfs_recover_balance+0x2f0/0x340
       open_ctree+0x1095/0x1726
       btrfs_mount_root.cold+0x12/0xea
       legacy_get_tree+0x30/0x50
       vfs_get_tree+0x28/0xc0
       vfs_kern_mount.part.0+0x71/0xb0
       btrfs_mount+0x10d/0x380
       legacy_get_tree+0x30/0x50
       vfs_get_tree+0x28/0xc0
       path_mount+0x433/0xc10
       __x64_sys_mount+0xe3/0x120
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

other info that might help us debug this:

Chain exists of:
  &fs_info->balance_mutex --> sb_internal#2 --> btrfs-root-00

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(btrfs-root-00);
                               lock(sb_internal#2);
                               lock(btrfs-root-00);
  lock(&fs_info->balance_mutex);

 *** DEADLOCK ***

2 locks held by mount/779:
 #0: ffffa0e60dc040e0 (&type->s_umount_key#47/1){+.+.}-{3:3}, at: alloc_super+0xb5/0x380
 #1: ffffa0e60ee31da8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x100

stack backtrace:
CPU: 0 PID: 779 Comm: mount Not tainted 5.10.0-rc6+ #217
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
 dump_stack+0x8b/0xb0
 check_noncircular+0xcf/0xf0
 ? trace_call_bpf+0x139/0x260
 __lock_acquire+0x1120/0x1e10
 lock_acquire+0x116/0x370
 ? btrfs_recover_balance+0x2f0/0x340
 __mutex_lock+0x7e/0x7b0
 ? btrfs_recover_balance+0x2f0/0x340
 ? btrfs_recover_balance+0x2f0/0x340
 ? rcu_read_lock_sched_held+0x3f/0x80
 ? kmem_cache_alloc_trace+0x2c4/0x2f0
 ? btrfs_get_64+0x5e/0x100
 btrfs_recover_balance+0x2f0/0x340
 open_ctree+0x1095/0x1726
 btrfs_mount_root.cold+0x12/0xea
 ? rcu_read_lock_sched_held+0x3f/0x80
 legacy_get_tree+0x30/0x50
 vfs_get_tree+0x28/0xc0
 vfs_kern_mount.part.0+0x71/0xb0
 btrfs_mount+0x10d/0x380
 ? __kmalloc_track_caller+0x2f2/0x320
 legacy_get_tree+0x30/0x50
 vfs_get_tree+0x28/0xc0
 ? capable+0x3a/0x60
 path_mount+0x433/0xc10
 __x64_sys_mount+0xe3/0x120
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

This is thankfully straightforward to fix: simply release the path
before we take the balance_mutex.
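
The dependency chain in the splat can be replayed with a toy lock-order
checker.  This is a rough userspace sketch of the general idea (record "B was
taken while A was held" edges, refuse an acquisition that would close a
cycle); it is not how lockdep is implemented, and lock_model, model_lock and
model_unlock are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* Lock classes named after the ones in the splat. */
enum { BALANCE_MUTEX, SB_INTERNAL, BTRFS_ROOT };

struct lock_model {
	bool edge[MAX_CLASSES][MAX_CLASSES];	/* edge[a][b]: b taken under a */
	int held[MAX_CLASSES];
	int nheld;
};

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool reachable(const struct lock_model *m, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (m->edge[from][next] && !seen[next] &&
		    reachable(m, next, to, seen))
			return true;
	return false;
}

/* Returns false if taking cls now would create a circular dependency. */
bool model_lock(struct lock_model *m, int cls)
{
	assert(cls >= 0 && cls < MAX_CLASSES);
	if (m->nheld == MAX_CLASSES)
		return false;
	for (int i = 0; i < m->nheld; i++) {
		bool seen[MAX_CLASSES] = { false };

		if (reachable(m, cls, m->held[i], seen))
			return false;
	}
	for (int i = 0; i < m->nheld; i++)
		m->edge[m->held[i]][cls] = true;
	m->held[m->nheld++] = cls;
	return true;
}

void model_unlock(struct lock_model *m, int cls)
{
	for (int i = m->nheld - 1; i >= 0; i--) {
		if (m->held[i] != cls)
			continue;
		memmove(&m->held[i], &m->held[i + 1],
			(size_t)(m->nheld - i - 1) * sizeof(m->held[0]));
		m->nheld--;
		return;
	}
}
```

After recording balance_mutex -> sb_internal and sb_internal -> btrfs-root,
taking balance_mutex while btrfs-root is still held is rejected; dropping
btrfs-root first (the equivalent of btrfs_release_path()) makes the same
acquisition succeed.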

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7930e1c78c45..49ba941f0314 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4318,6 +4318,8 @@ int btrfs_recover_balance(struct btrfs_fs_info *fs_info)
 		btrfs_warn(fs_info,
 	"balance: cannot set exclusive op status, resume manually");
 
+	btrfs_release_path(path);
+
 	mutex_lock(&fs_info->balance_mutex);
 	BUG_ON(fs_info->balance_ctl);
 	spin_lock(&fs_info->balance_lock);
-- 
2.26.2



* [PATCH v3 04/54] btrfs: keep track of the root owner for relocation reads
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (2 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 03/54] btrfs: fix lockdep splat in btrfs_recover_relocation Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:04   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance Josef Bacik
                   ` (49 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While testing the error paths in relocation, I hit the following lockdep
splat

======================================================
WARNING: possible circular locking dependency detected
5.10.0-rc3+ #206 Not tainted
------------------------------------------------------
btrfs-balance/1571 is trying to acquire lock:
ffff8cdbcc8f77d0 (&head_ref->mutex){+.+.}-{3:3}, at: btrfs_lookup_extent_info+0x156/0x3b0

but task is already holding lock:
ffff8cdbc54adbf8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x27/0x100

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (btrfs-tree-00){++++}-{3:3}:
       down_write_nested+0x43/0x80
       __btrfs_tree_lock+0x27/0x100
       btrfs_search_slot+0x248/0x890
       relocate_tree_blocks+0x490/0x650
       relocate_block_group+0x1ba/0x5d0
       kretprobe_trampoline+0x0/0x50

-> #1 (btrfs-csum-01){++++}-{3:3}:
       down_read_nested+0x43/0x130
       __btrfs_tree_read_lock+0x27/0x100
       btrfs_read_lock_root_node+0x31/0x40
       btrfs_search_slot+0x5ab/0x890
       btrfs_del_csums+0x10b/0x3c0
       __btrfs_free_extent+0x49d/0x8e0
       __btrfs_run_delayed_refs+0x283/0x11f0
       btrfs_run_delayed_refs+0x86/0x220
       btrfs_start_dirty_block_groups+0x2ba/0x520
       kretprobe_trampoline+0x0/0x50

-> #0 (&head_ref->mutex){+.+.}-{3:3}:
       __lock_acquire+0x1167/0x2150
       lock_acquire+0x116/0x3e0
       __mutex_lock+0x7e/0x7b0
       btrfs_lookup_extent_info+0x156/0x3b0
       walk_down_proc+0x1c3/0x280
       walk_down_tree+0x64/0xe0
       btrfs_drop_subtree+0x182/0x260
       do_relocation+0x52e/0x660
       relocate_tree_blocks+0x2ae/0x650
       relocate_block_group+0x1ba/0x5d0
       kretprobe_trampoline+0x0/0x50

other info that might help us debug this:

Chain exists of:
  &head_ref->mutex --> btrfs-csum-01 --> btrfs-tree-00

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(btrfs-tree-00);
                               lock(btrfs-csum-01);
                               lock(btrfs-tree-00);
  lock(&head_ref->mutex);

 *** DEADLOCK ***

5 locks held by btrfs-balance/1571:
 #0: ffff8cdb89749ff8 (&fs_info->delete_unused_bgs_mutex){+.+.}-{3:3}, at: btrfs_balance+0x563/0xf40
 #1: ffff8cdb89748838 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_relocate_block_group+0x156/0x300
 #2: ffff8cdbc2c16650 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x413/0x5c0
 #3: ffff8cdbc135f538 (btrfs-treloc-01){+.+.}-{3:3}, at: __btrfs_tree_lock+0x27/0x100
 #4: ffff8cdbc54adbf8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x27/0x100

stack backtrace:
CPU: 1 PID: 1571 Comm: btrfs-balance Not tainted 5.10.0-rc3+ #206
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
 dump_stack+0x8b/0xb0
 check_noncircular+0xcf/0xf0
 ? trace_call_bpf+0x139/0x260
 __lock_acquire+0x1167/0x2150
 lock_acquire+0x116/0x3e0
 ? btrfs_lookup_extent_info+0x156/0x3b0
 __mutex_lock+0x7e/0x7b0
 ? btrfs_lookup_extent_info+0x156/0x3b0
 ? btrfs_lookup_extent_info+0x156/0x3b0
 ? release_extent_buffer+0x124/0x170
 ? _raw_spin_unlock+0x1f/0x30
 ? release_extent_buffer+0x124/0x170
 btrfs_lookup_extent_info+0x156/0x3b0
 walk_down_proc+0x1c3/0x280
 walk_down_tree+0x64/0xe0
 btrfs_drop_subtree+0x182/0x260
 do_relocation+0x52e/0x660
 relocate_tree_blocks+0x2ae/0x650
 ? add_tree_block+0x149/0x1b0
 relocate_block_group+0x1ba/0x5d0
 elfcorehdr_read+0x40/0x40
 ? elfcorehdr_read+0x40/0x40
 ? btrfs_balance+0x796/0xf40
 ? __kthread_parkme+0x66/0x90
 ? btrfs_balance+0xf40/0xf40
 ? balance_kthread+0x37/0x50
 ? kthread+0x137/0x150
 ? __kthread_bind_mask+0x60/0x60
 ? ret_from_fork+0x1f/0x30

As you can see this is bogus, we never take another tree's lock under
the csum lock.  This happens because sometimes we have to read tree
blocks from disk without knowing which root they belong to during
relocation.  We defaulted to an owner of 0, which translates to an fs
tree.  This is fine as all fs trees have the same class, but obviously
isn't fine if the block belongs to a cow only tree.

Thankfully cow only trees only have their owner's root as a reference to
them, and since we already look up the extent information during
relocation, go ahead and check and see if this block might belong to a
cow only tree, and if so save the owner in the struct tree_block.  This
allows us to read_tree_block with the proper owner, which gets rid of
this lockdep splat.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 47 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 44 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 19b7db8b2117..2b30e39e922a 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -98,6 +98,7 @@ struct tree_block {
 		u64 bytenr;
 	}; /* Use rb_simple_node for search/insert */
 	struct btrfs_key key;
+	u64 owner;
 	unsigned int level:8;
 	unsigned int key_ready:1;
 };
@@ -2393,8 +2394,8 @@ static int get_tree_block_key(struct btrfs_fs_info *fs_info,
 {
 	struct extent_buffer *eb;
 
-	eb = read_tree_block(fs_info, block->bytenr, 0, block->key.offset,
-			     block->level, NULL);
+	eb = read_tree_block(fs_info, block->bytenr, block->owner,
+			     block->key.offset, block->level, NULL);
 	if (IS_ERR(eb)) {
 		return PTR_ERR(eb);
 	} else if (!extent_buffer_uptodate(eb)) {
@@ -2493,7 +2494,8 @@ int relocate_tree_blocks(struct btrfs_trans_handle *trans,
 	/* Kick in readahead for tree blocks with missing keys */
 	rbtree_postorder_for_each_entry_safe(block, next, blocks, rb_node) {
 		if (!block->key_ready)
-			btrfs_readahead_tree_block(fs_info, block->bytenr, 0, 0,
+			btrfs_readahead_tree_block(fs_info, block->bytenr,
+						   block->owner, 0,
 						   block->level);
 	}
 
@@ -2801,21 +2803,59 @@ static int add_tree_block(struct reloc_control *rc,
 	u32 item_size;
 	int level = -1;
 	u64 generation;
+	u64 owner = 0;
 
 	eb =  path->nodes[0];
 	item_size = btrfs_item_size_nr(eb, path->slots[0]);
 
 	if (extent_key->type == BTRFS_METADATA_ITEM_KEY ||
 	    item_size >= sizeof(*ei) + sizeof(*bi)) {
+		unsigned long ptr = 0, end;
+
 		ei = btrfs_item_ptr(eb, path->slots[0],
 				struct btrfs_extent_item);
+		end = (unsigned long)ei + item_size;
 		if (extent_key->type == BTRFS_EXTENT_ITEM_KEY) {
 			bi = (struct btrfs_tree_block_info *)(ei + 1);
 			level = btrfs_tree_block_level(eb, bi);
+			ptr = (unsigned long)(bi + 1);
 		} else {
 			level = (int)extent_key->offset;
+			ptr = (unsigned long)(ei + 1);
 		}
 		generation = btrfs_extent_generation(eb, ei);
+
+		/*
+		 * We're reading random blocks without knowing their owner ahead
+		 * of time.  This is ok most of the time, as all reloc roots and
+		 * fs roots have the same lock type.  However normal trees do
+		 * not, and the only way to know ahead of time is to read the
+		 * inline ref offset.  We know it's an fs root if
+		 *
+		 * 1. There's more than one ref.
+		 * 2. There's a SHARED_DATA_REF_KEY set.
+		 * 3. FULL_BACKREF is set on the flags.
+		 *
+		 * Otherwise it's safe to assume that the ref offset == the
+		 * owner of this block, so we can use that when calling
+		 * read_tree_block.
+		 */
+		if (btrfs_extent_refs(eb, ei) == 1 &&
+		    !(btrfs_extent_flags(eb, ei) &
+		      BTRFS_BLOCK_FLAG_FULL_BACKREF) &&
+		    ptr < end) {
+			struct btrfs_extent_inline_ref *iref;
+			int type;
+
+			iref = (struct btrfs_extent_inline_ref *)ptr;
+			type = btrfs_get_extent_inline_ref_type(eb, iref,
+							BTRFS_REF_TYPE_BLOCK);
+			if (type == BTRFS_REF_TYPE_INVALID)
+				return -EINVAL;
+			if (type == BTRFS_TREE_BLOCK_REF_KEY)
+				owner = btrfs_extent_inline_ref_offset(eb,
+								       iref);
+		}
 	} else if (unlikely(item_size == sizeof(struct btrfs_extent_item_v0))) {
 		btrfs_print_v0_err(eb->fs_info);
 		btrfs_handle_fs_error(eb->fs_info, -EINVAL, NULL);
@@ -2837,6 +2877,7 @@ static int add_tree_block(struct reloc_control *rc,
 	block->key.offset = generation;
 	block->level = level;
 	block->key_ready = 0;
+	block->owner = owner;
 
 	rb_node = rb_simple_insert(blocks, block->bytenr, &block->rb_node);
 	if (rb_node)
-- 
2.26.2



* [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (3 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 04/54] btrfs: keep track of the root owner for relocation reads Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:06   ` Qu Wenruo
                     ` (2 more replies)
  2020-12-02 19:50 ` [PATCH v3 06/54] btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node Josef Bacik
                   ` (48 subsequent siblings)
  53 siblings, 3 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

I was attempting to reproduce a problem that Zygo hit, but my error
injection wasn't firing for a few of the common calls to
btrfs_should_cancel_balance.  This is because the compiler decided to
inline it at these spots.  Keep this from happening by explicitly
noinline'ing the function so that error injection will always work.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 2b30e39e922a..ce935139d87b 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2617,7 +2617,7 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end,
 /*
  * Allow error injection to test balance cancellation
  */
-int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
+noinline int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
 {
 	return atomic_read(&fs_info->balance_cancel_req) ||
 		fatal_signal_pending(current);
-- 
2.26.2



* [PATCH v3 06/54] btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (4 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:08   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 07/54] btrfs: pass down the tree block level through ref-verify Josef Bacik
                   ` (47 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Zygo reported the following panic when testing my error handling patches
for relocation

------------[ cut here ]------------
kernel BUG at fs/btrfs/backref.c:2545!
invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 3 PID: 8472 Comm: btrfs Tainted: G        W 14
Hardware name: QEMU Standard PC (i440FX + PIIX,

Call Trace:
 btrfs_backref_error_cleanup+0x4df/0x530
 build_backref_tree+0x1a5/0x700
 ? _raw_spin_unlock+0x22/0x30
 ? release_extent_buffer+0x225/0x280
 ? free_extent_buffer.part.52+0xd7/0x140
 relocate_tree_blocks+0x2a6/0xb60
 ? kasan_unpoison_shadow+0x35/0x50
 ? do_relocation+0xc10/0xc10
 ? kasan_kmalloc+0x9/0x10
 ? kmem_cache_alloc_trace+0x6a3/0xcb0
 ? free_extent_buffer.part.52+0xd7/0x140
 ? rb_insert_color+0x342/0x360
 ? add_tree_block.isra.36+0x236/0x2b0
 relocate_block_group+0x2eb/0x780
 ? merge_reloc_roots+0x470/0x470
 btrfs_relocate_block_group+0x26e/0x4c0
 btrfs_relocate_chunk+0x52/0x120
 btrfs_balance+0xe2e/0x18f0
 ? pvclock_clocksource_read+0xeb/0x190
 ? btrfs_relocate_chunk+0x120/0x120
 ? lock_contended+0x620/0x6e0
 ? do_raw_spin_lock+0x1e0/0x1e0
 ? do_raw_spin_unlock+0xa8/0x140
 btrfs_ioctl_balance+0x1f9/0x460
 btrfs_ioctl+0x24c8/0x4380
 ? __kasan_check_read+0x11/0x20
 ? check_chain_key+0x1f4/0x2f0
 ? __asan_loadN+0xf/0x20
 ? btrfs_ioctl_get_supported_features+0x30/0x30
 ? kvm_sched_clock_read+0x18/0x30
 ? check_chain_key+0x1f4/0x2f0
 ? lock_downgrade+0x3f0/0x3f0
 ? handle_mm_fault+0xad6/0x2150
 ? do_vfs_ioctl+0xfc/0x9d0
 ? ioctl_file_clone+0xe0/0xe0
 ? check_flags.part.50+0x6c/0x1e0
 ? check_flags.part.50+0x6c/0x1e0
 ? check_flags+0x26/0x30
 ? lock_is_held_type+0xc3/0xf0
 ? syscall_enter_from_user_mode+0x1b/0x60
 ? do_syscall_64+0x13/0x80
 ? rcu_read_lock_sched_held+0xa1/0xd0
 ? __kasan_check_read+0x11/0x20
 ? __fget_light+0xae/0x110
 __x64_sys_ioctl+0xc3/0x100
 do_syscall_64+0x37/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

This occurs because of this check

if (RB_EMPTY_NODE(&upper->rb_node))
	BUG_ON(!list_empty(&node->upper));

As we are dropping the backref node, if we discover that the upper node
in the edge we just cleaned up isn't linked into the cache, we assume
that we are now done with this node, thus the BUG_ON().

However this is an erroneous assumption, as we will look up all the
references for a node first, and then process the pending edges.  All of
the 'upper' nodes in our pending edges won't be in the cache's rb_tree
yet, because they haven't been processed.  We could very well have many
edges still left to clean up on this node.

The fact is we simply do not need this check, we can just process all of
the edges only for this node, because below this check we do the
following

if (list_empty(&upper->lower)) {
	list_add_tail(&upper->lower, &cache->leaves);
	upper->lowest = 1;
}

If the upper node truly isn't used yet, then we add it to the
cache->leaves list to be cleaned up later.  If it is still used then the
last child node that has it linked into its node will add it to the
leaves list and then it will be cleaned up.

Fix this problem by dropping this logic altogether.  With this fix I no
longer see the panic when testing with error injection in the backref
code.
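
The "last child queues the upper node" behaviour can be modelled in a few
lines of userspace C.  This is a deliberately reduced sketch: model_node and
model_cleanup_node are hypothetical, a counter replaces the upper->lower edge
list, and a flag replaces membership in cache->leaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_node {
	int nr_lower;	/* children still linked to this node by an edge */
	bool on_leaves;	/* queued for later cleanup */
};

/*
 * Drop every edge from one node to its upper nodes.  An upper node is
 * queued only once its last child edge is gone.  Returns how many upper
 * nodes were queued by this call.
 */
size_t model_cleanup_node(struct model_node **uppers, size_t nr_uppers)
{
	size_t queued = 0;

	for (size_t i = 0; i < nr_uppers; i++) {
		struct model_node *upper = uppers[i];

		assert(upper != NULL && upper->nr_lower > 0);
		upper->nr_lower--;	/* models freeing the edge */
		if (upper->nr_lower == 0 && !upper->on_leaves) {
			upper->on_leaves = true;
			queued++;
		}
	}
	return queued;
}
```

If two children share an upper node, cleaning up the first child leaves the
shared node alone and cleaning up the second one queues it, with no need to
walk upwards from the node being dropped.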

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/backref.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 02d7d7b2563b..56f7c840031e 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -2541,13 +2541,6 @@ void btrfs_backref_cleanup_node(struct btrfs_backref_cache *cache,
 		list_del(&edge->list[UPPER]);
 		btrfs_backref_free_edge(cache, edge);
 
-		if (RB_EMPTY_NODE(&upper->rb_node)) {
-			BUG_ON(!list_empty(&node->upper));
-			btrfs_backref_drop_node(cache, node);
-			node = upper;
-			node->lowest = 1;
-			continue;
-		}
 		/*
 		 * Add the node to leaf node list if no other child block
 		 * cached.
-- 
2.26.2


* [PATCH v3 07/54] btrfs: pass down the tree block level through ref-verify
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (5 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 06/54] btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 08/54] btrfs: make sure owner is set in ref-verify Josef Bacik
                   ` (46 subsequent siblings)
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

I noticed that sometimes I would have the wrong level printed out with
ref-verify while testing some error injection related problems.  This is
because we only get the level from the main extent item, but our
references could go off the current leaf into another, and at that point
we lose our level.  Fix this by keeping track of the last tree block
level that we found, the same way we keep track of our bytenr and
num_bytes, in case we happen to wander into another leaf while still
processing the references for a bytenr.
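
A minimal userspace sketch of the idea, assuming a simplified item
layout: enum item_kind, struct item and this process_leaf() are
hypothetical and only model how caller-owned state survives the move
from one leaf to the next.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item kinds standing in for the btrfs key types. */
enum item_kind { ITEM_EXTENT, ITEM_TREE_BLOCK_REF };

struct item {
	enum item_kind kind;
	int level;	/* only meaningful for ITEM_EXTENT */
};

/*
 * Process one leaf.  The level lives in caller-owned storage, so a
 * value learned from an extent item in an earlier leaf is still there
 * when a later leaf starts with a bare ref item.  recorded[] receives
 * the level attached to each ref item; the return value is the number
 * of ref items seen in this leaf.
 */
static size_t process_leaf(const struct item *items, size_t nritems,
			   int *tree_block_level, int *recorded)
{
	size_t nrefs = 0;

	for (size_t i = 0; i < nritems; i++) {
		if (items[i].kind == ITEM_EXTENT)
			*tree_block_level = items[i].level;
		else
			recorded[nrefs++] = *tree_block_level;
	}
	return nrefs;
}
```

With a function-local level that is reset to 0 on every call, the ref
in the second leaf would have been recorded at level 0.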

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ref-verify.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/ref-verify.c b/fs/btrfs/ref-verify.c
index 4b9b6c52a83b..409b02566b25 100644
--- a/fs/btrfs/ref-verify.c
+++ b/fs/btrfs/ref-verify.c
@@ -495,14 +495,15 @@ static int process_extent_item(struct btrfs_fs_info *fs_info,
 }
 
 static int process_leaf(struct btrfs_root *root,
-			struct btrfs_path *path, u64 *bytenr, u64 *num_bytes)
+			struct btrfs_path *path, u64 *bytenr, u64 *num_bytes,
+			int *tree_block_level)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct extent_buffer *leaf = path->nodes[0];
 	struct btrfs_extent_data_ref *dref;
 	struct btrfs_shared_data_ref *sref;
 	u32 count;
-	int i = 0, tree_block_level = 0, ret = 0;
+	int i = 0, ret = 0;
 	struct btrfs_key key;
 	int nritems = btrfs_header_nritems(leaf);
 
@@ -515,15 +516,15 @@ static int process_leaf(struct btrfs_root *root,
 		case BTRFS_METADATA_ITEM_KEY:
 			*bytenr = key.objectid;
 			ret = process_extent_item(fs_info, path, &key, i,
-						  &tree_block_level);
+						  tree_block_level);
 			break;
 		case BTRFS_TREE_BLOCK_REF_KEY:
 			ret = add_tree_block(fs_info, key.offset, 0,
-					     key.objectid, tree_block_level);
+					     key.objectid, *tree_block_level);
 			break;
 		case BTRFS_SHARED_BLOCK_REF_KEY:
 			ret = add_tree_block(fs_info, 0, key.offset,
-					     key.objectid, tree_block_level);
+					     key.objectid, *tree_block_level);
 			break;
 		case BTRFS_EXTENT_DATA_REF_KEY:
 			dref = btrfs_item_ptr(leaf, i,
@@ -549,7 +550,8 @@ static int process_leaf(struct btrfs_root *root,
 
 /* Walk down to the leaf from the given level */
 static int walk_down_tree(struct btrfs_root *root, struct btrfs_path *path,
-			  int level, u64 *bytenr, u64 *num_bytes)
+			  int level, u64 *bytenr, u64 *num_bytes,
+			  int *tree_block_level)
 {
 	struct extent_buffer *eb;
 	int ret = 0;
@@ -565,7 +567,8 @@ static int walk_down_tree(struct btrfs_root *root, struct btrfs_path *path,
 			path->slots[level-1] = 0;
 			path->locks[level-1] = BTRFS_READ_LOCK;
 		} else {
-			ret = process_leaf(root, path, bytenr, num_bytes);
+			ret = process_leaf(root, path, bytenr, num_bytes,
+					   tree_block_level);
 			if (ret)
 				break;
 		}
@@ -974,6 +977,7 @@ int btrfs_build_ref_tree(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_path *path;
 	struct extent_buffer *eb;
+	int tree_block_level = 0;
 	u64 bytenr = 0, num_bytes = 0;
 	int ret, level;
 
@@ -998,7 +1002,7 @@ int btrfs_build_ref_tree(struct btrfs_fs_info *fs_info)
 		 * different leaf from the original extent item.
 		 */
 		ret = walk_down_tree(fs_info->extent_root, path, level,
-				     &bytenr, &num_bytes);
+				     &bytenr, &num_bytes, &tree_block_level);
 		if (ret)
 			break;
 		ret = walk_up_tree(path, &level);
-- 
2.26.2


* [PATCH v3 08/54] btrfs: make sure owner is set in ref-verify
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (6 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 07/54] btrfs: pass down the tree block level through ref-verify Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 09/54] btrfs: don't clear ret in btrfs_start_dirty_block_groups Josef Bacik
                   ` (45 subsequent siblings)
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

I noticed that shared ref entries in ref-verify didn't have the proper
owner set, which caused me to think there was something seriously wrong.
However the problem is if we have a parent we simply weren't filling out
the owner part of the reference, even though we have it.  Fix this by
making sure we set all the proper fields when we modify a reference,
this way we'll have the proper owner if a problem happens and we don't
waste time thinking we're updating the wrong level.
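
As a rough illustration for the metadata case, assuming a trimmed-down
entry: struct ref_entry here and fill_metadata_ref() are hypothetical
and are not the ref-verify definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down version of a ref-verify entry. */
struct ref_entry {
	uint64_t root_objectid;
	uint64_t parent;
	uint64_t owner;
	uint64_t offset;
};

/*
 * Fill a metadata ref.  Every field is written on every call: a shared
 * ref (parent != 0) keeps root_objectid at 0 but still records the
 * level in owner.
 */
static void fill_metadata_ref(struct ref_entry *ref, uint64_t parent,
			      uint64_t root, uint64_t level)
{
	ref->parent = parent;
	ref->owner = level;
	ref->root_objectid = parent ? 0 : root;
	ref->offset = 0;
}
```

Writing all four fields unconditionally means a shared ref no longer
shows up with an unset owner.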

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ref-verify.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/ref-verify.c b/fs/btrfs/ref-verify.c
index 409b02566b25..2b490becbe67 100644
--- a/fs/btrfs/ref-verify.c
+++ b/fs/btrfs/ref-verify.c
@@ -669,18 +669,18 @@ int btrfs_ref_tree_mod(struct btrfs_fs_info *fs_info,
 	u64 bytenr = generic_ref->bytenr;
 	u64 num_bytes = generic_ref->len;
 	u64 parent = generic_ref->parent;
-	u64 ref_root;
-	u64 owner;
-	u64 offset;
+	u64 ref_root = 0;
+	u64 owner = 0;
+	u64 offset = 0;
 
 	if (!btrfs_test_opt(fs_info, REF_VERIFY))
 		return 0;
 
 	if (generic_ref->type == BTRFS_REF_METADATA) {
-		ref_root = generic_ref->tree_ref.root;
+		if (!parent)
+			ref_root = generic_ref->tree_ref.root;
 		owner = generic_ref->tree_ref.level;
-		offset = 0;
-	} else {
+	} else if (!parent) {
 		ref_root = generic_ref->data_ref.ref_root;
 		owner = generic_ref->data_ref.ino;
 		offset = generic_ref->data_ref.offset;
@@ -696,13 +696,10 @@ int btrfs_ref_tree_mod(struct btrfs_fs_info *fs_info,
 		goto out;
 	}
 
-	if (parent) {
-		ref->parent = parent;
-	} else {
-		ref->root_objectid = ref_root;
-		ref->owner = owner;
-		ref->offset = offset;
-	}
+	ref->parent = parent;
+	ref->owner = owner;
+	ref->root_objectid = ref_root;
+	ref->offset = offset;
 	ref->num_refs = (action == BTRFS_DROP_DELAYED_REF) ? -1 : 1;
 
 	memcpy(&ra->ref, ref, sizeof(struct ref_entry));
-- 
2.26.2


* [PATCH v3 09/54] btrfs: don't clear ret in btrfs_start_dirty_block_groups
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (7 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 08/54] btrfs: make sure owner is set in ref-verify Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:13   ` Qu Wenruo
  2020-12-03  8:58   ` Johannes Thumshirn
  2020-12-02 19:50 ` [PATCH v3 10/54] btrfs: convert some BUG_ON()'s to ASSERT()'s in do_relocation Josef Bacik
                   ` (44 subsequent siblings)
  53 siblings, 2 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we fail to update a block group item in the loop we'll break, however
we'll do btrfs_run_delayed_refs and lose our error value in ret, and
thus not clean up properly.  Fix this by only running the delayed refs
if there was no failure.
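
The difference can be shown with a short userspace sketch.  The stub
and both finish_*() helpers are hypothetical; only the control flow
matches the change.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for btrfs_run_delayed_refs(); counts calls. */
static int run_calls;

static int run_delayed_refs_stub(void)
{
	run_calls++;
	return 0;
}

/* Old shape: the follow-up call overwrites whatever the loop stored. */
static int finish_clobbering(int loop_ret)
{
	int ret = loop_ret;

	ret = run_delayed_refs_stub();
	return ret;
}

/* New shape: an earlier failure is kept and the follow-up is skipped. */
static int finish_preserving(int loop_ret)
{
	int ret = loop_ret;

	if (!ret)
		ret = run_delayed_refs_stub();
	return ret;
}
```

Passing -EIO to finish_clobbering() returns 0, which is the lost error
described above; finish_preserving() returns -EIO unchanged.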

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-group.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 52f2198d44c9..0886e81e5540 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2669,7 +2669,8 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans)
 	 * Go through delayed refs for all the stuff we've just kicked off
 	 * and then loop back (just once)
 	 */
-	ret = btrfs_run_delayed_refs(trans, 0);
+	if (!ret)
+		ret = btrfs_run_delayed_refs(trans, 0);
 	if (!ret && loops == 0) {
 		loops++;
 		spin_lock(&cur_trans->dirty_bgs_lock);
-- 
2.26.2


* [PATCH v3 10/54] btrfs: convert some BUG_ON()'s to ASSERT()'s in do_relocation
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (8 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 09/54] btrfs: don't clear ret in btrfs_start_dirty_block_groups Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:14   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 11/54] btrfs: convert BUG_ON()'s in relocate_tree_block Josef Bacik
                   ` (43 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

A few of these are checking for correctness, and won't be triggered by
corrupted file systems, so convert them to ASSERT() instead of BUG_ON()
and add a comment explaining their existence.
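
Note that the condition has to be inverted in the conversion: BUG_ON()
fires when its argument is true, ASSERT() fires when its argument is
false.  A small userspace check of that equivalence for the first
conversion follows; bug_on_fires() and assert_fires() are hypothetical
helpers, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the old BUG_ON(lowest && node->eb) would have fired. */
static bool bug_on_fires(bool lowest, bool has_eb)
{
	return lowest && has_eb;
}

/*
 * True when the new ASSERT(!lowest || !node->eb) would fire, that is
 * when the asserted expression evaluates to false.
 */
static bool assert_fires(bool lowest, bool has_eb)
{
	return !(!lowest || !has_eb);
}
```

The two agree on all four input combinations.  The practical difference
is in the build, assuming the usual btrfs definition where ASSERT()
compiles to nothing unless CONFIG_BTRFS_ASSERT is enabled.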

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index ce935139d87b..d0ce771a2a8d 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2183,7 +2183,11 @@ static int do_relocation(struct btrfs_trans_handle *trans,
 	int slot;
 	int ret = 0;
 
-	BUG_ON(lowest && node->eb);
+	/*
+	 * If we are lowest then this is the first time we're processing this
+	 * block, and thus shouldn't have an eb associated with it yet.
+	 */
+	ASSERT(!lowest || !node->eb);
 
 	path->lowest_level = node->level + 1;
 	rc->backref_cache.path[node->level] = node;
@@ -2268,7 +2272,11 @@ static int do_relocation(struct btrfs_trans_handle *trans,
 			free_extent_buffer(eb);
 			if (ret < 0)
 				goto next;
-			BUG_ON(node->eb != eb);
+			/*
+			 * We've just cow'ed this block, it should have updated
+			 * the correct backref node entry.
+			 */
+			ASSERT(node->eb == eb);
 		} else {
 			btrfs_set_node_blockptr(upper->eb, slot,
 						node->eb->start);
@@ -2304,7 +2312,12 @@ static int do_relocation(struct btrfs_trans_handle *trans,
 	}
 
 	path->lowest_level = 0;
-	BUG_ON(ret == -ENOSPC);
+
+	/*
+	 * We should have allocated all of our space in the block rsv and thus
+	 * shouldn't ENOSPC.
+	 */
+	ASSERT(ret != -ENOSPC);
 	return ret;
 }
 
-- 
2.26.2


* [PATCH v3 11/54] btrfs: convert BUG_ON()'s in relocate_tree_block
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (9 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 10/54] btrfs: convert some BUG_ON()'s to ASSERT()'s in do_relocation Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:15   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 12/54] btrfs: return an error from btrfs_record_root_in_trans Josef Bacik
                   ` (42 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We have a couple of BUG_ON()'s in relocate_tree_block() that can be
tripped if we have file system corruption.  Convert these to ASSERT()'s
so developers still get yelled at when they break the backref code, but
error out nicely for users so the whole box doesn't go down.
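
The "assert for developers, error out for users" shape looks roughly
like this in userspace.  DEV_ASSERT, DEV_BUILD, struct node_state and
check_first_visit() are hypothetical names for this sketch; DEV_ASSERT
plays the role that ASSERT() plays in a kernel built with assertions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* value used on Linux; fallback for other systems */
#endif

/* Active only when built with -DDEV_BUILD, otherwise a no-op. */
#ifdef DEV_BUILD
#define DEV_ASSERT(expr) assert(expr)
#else
#define DEV_ASSERT(expr) ((void)0)
#endif

/* Hypothetical subset of the backref node state being checked. */
struct node_state {
	unsigned long long new_bytenr;
	bool on_changed_list;
};

/*
 * A developer build stops at the assertion.  A normal build falls
 * through to the explicit check and reports -EUCLEAN to the caller.
 */
static int check_first_visit(const struct node_state *node)
{
	DEV_ASSERT(node->new_bytenr == 0);
	DEV_ASSERT(!node->on_changed_list);
	if (node->new_bytenr || node->on_changed_list)
		return -EUCLEAN;
	return 0;
}
```

The explicit if () repeats the asserted conditions on purpose, since
the assertions vanish from a non-developer build.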

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index d0ce771a2a8d..4333ee329290 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2456,8 +2456,28 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
 
 	if (root) {
 		if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state)) {
-			BUG_ON(node->new_bytenr);
-			BUG_ON(!list_empty(&node->list));
+			/*
+			 * This block was the root block of a root, and this is
+			 * the first time we're processing the block and thus it
+			 * should not have had the ->new_bytenr modified and
+			 * should have not been included on the changed list.
+			 *
+			 * However in the case of corruption we could have
+			 * multiple refs pointing to the same block improperly,
+			 * and thus we would trip over these checks.  ASSERT()
+			 * for the developer case, because it could indicate a
+			 * bug in the backref code, however error out for a
+			 * normal user in the case of corruption.
+			 */
+			ASSERT(node->new_bytenr == 0);
+			ASSERT(list_empty(&node->list));
+			if (node->new_bytenr || !list_empty(&node->list)) {
+				btrfs_err(root->fs_info,
+				  "bytenr %llu has improper references to it",
+					  node->bytenr);
+				ret = -EUCLEAN;
+				goto out;
+			}
 			btrfs_record_root_in_trans(trans, root);
 			root = root->reloc_root;
 			node->new_bytenr = root->node->start;
-- 
2.26.2


* [PATCH v3 12/54] btrfs: return an error from btrfs_record_root_in_trans
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (10 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 11/54] btrfs: convert BUG_ON()'s in relocate_tree_block Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:20   ` Qu Wenruo
  2020-12-03 13:50   ` Johannes Thumshirn
  2020-12-02 19:50 ` [PATCH v3 13/54] btrfs: handle errors from select_reloc_root() Josef Bacik
                   ` (41 subsequent siblings)
  53 siblings, 2 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We can create a reloc root when we record the root in the trans, which
can fail for all sorts of different reasons.  Propagate this error up
the chain of callers.  Future patches will fix the callers of
btrfs_record_root_in_trans() to handle the error.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index a614f7699ce4..28e7a7464b60 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -400,6 +400,7 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
 			       int force)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
+	int ret = 0;
 
 	if ((test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
 	    root->last_trans < trans->transid) || force) {
@@ -448,11 +449,11 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
 		 * lock.  smp_wmb() makes sure that all the writes above are
 		 * done before we pop in the zero below
 		 */
-		btrfs_init_reloc_root(trans, root);
+		ret = btrfs_init_reloc_root(trans, root);
 		smp_mb__before_atomic();
 		clear_bit(BTRFS_ROOT_IN_TRANS_SETUP, &root->state);
 	}
-	return 0;
+	return ret;
 }
 
 
-- 
2.26.2


* [PATCH v3 13/54] btrfs: handle errors from select_reloc_root()
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (11 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 12/54] btrfs: return an error from btrfs_record_root_in_trans Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:23   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 14/54] btrfs: convert BUG_ON()'s in select_reloc_root() to proper errors Josef Bacik
                   ` (40 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Currently select_reloc_root() doesn't return an error, but followup
patches will make it possible for it to return an error.  We do have
proper error recovery in do_relocation however, so handle the
possibility of select_reloc_root() having an error properly instead of
BUG_ON(!root).  I've also adjusted select_reloc_root() to return
ERR_PTR(-ENOENT) if we don't find a root, instead of NULL, to make the
error case easier to deal with.  I've replaced the BUG_ON(!root) with an
ASSERT(ret != -ENOENT), as this indicates we messed up the backref
walking code, but could indicate corruption so we do not want to have a
BUG_ON() here.
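
For readers not familiar with the error pointer convention, here is a
userspace re-implementation of the idea.  The three helpers mimic the
kernel's ERR_PTR(), PTR_ERR() and IS_ERR() but are written for this
sketch; struct root, select_root() and use_root() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct root {
	int id;
};

/* Hypothetical lookup: returns the match or ERR_PTR(-ENOENT). */
static struct root *select_root(struct root *roots, size_t n, int id)
{
	for (size_t i = 0; i < n; i++)
		if (roots[i].id == id)
			return &roots[i];
	return ERR_PTR(-ENOENT);
}

/* Caller side: a single IS_ERR() test covers every failure. */
static int use_root(struct root *roots, size_t n, int id)
{
	struct root *root = select_root(roots, n, id);

	if (IS_ERR(root))
		return (int)PTR_ERR(root);
	return 0;
}
```

IS_ERR(NULL) is false, so a function that could return either NULL or
an error pointer would need two checks at every call site.  Returning
ERR_PTR(-ENOENT) instead of NULL avoids that.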

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 4333ee329290..66515ccc04fe 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2027,7 +2027,7 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
 			break;
 	}
 	if (!root)
-		return NULL;
+		return ERR_PTR(-ENOENT);
 
 	next = node;
 	/* setup backref node path for btrfs_reloc_cow_block */
@@ -2198,7 +2198,18 @@ static int do_relocation(struct btrfs_trans_handle *trans,
 
 		upper = edge->node[UPPER];
 		root = select_reloc_root(trans, rc, upper, edges);
-		BUG_ON(!root);
+		if (IS_ERR(root)) {
+			ret = PTR_ERR(root);
+
+			/*
+			 * This can happen if there's fs corruption, but if we
+			 * have ASSERT()'s on then we're developers and we
+			 * likely made a logic mistake in the backref code, so
+			 * check for this error condition.
+			 */
+			ASSERT(ret != -ENOENT);
+			goto next;
+		}
 
 		if (upper->eb && !upper->locked) {
 			if (!lowest) {
-- 
2.26.2


* [PATCH v3 14/54] btrfs: convert BUG_ON()'s in select_reloc_root() to proper errors
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (12 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 13/54] btrfs: handle errors from select_reloc_root() Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:29   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 15/54] btrfs: check record_root_in_trans related failures in select_reloc_root Josef Bacik
                   ` (39 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We have several BUG_ON()'s in select_reloc_root() that can be tripped if
you have extent tree corruption.  Convert these to ASSERT()'s, because
if we hit them during testing it really is bad, or could indicate a
problem with the backref walking code.

However if users hit these problems it generally indicates corruption.
I've hit a few machines in the fleet that trip over these with clearly
corrupted extent trees, so be nice and spit out an error message and
return an error instead of bringing the whole box down.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 51 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 47 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 66515ccc04fe..bf4e1018356a 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1996,8 +1996,35 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
 		cond_resched();
 		next = walk_up_backref(next, edges, &index);
 		root = next->root;
-		BUG_ON(!root);
-		BUG_ON(!test_bit(BTRFS_ROOT_SHAREABLE, &root->state));
+
+		/*
+		 * If there is no root, then our references for this block are
+		 * incomplete, as we should be able to walk all the way up to a
+		 * block that is owned by a root.
+		 *
+		 * This path is only for SHAREABLE roots, so if we come upon a
+		 * non-SHAREABLE root then we have backrefs that resolve
+		 * improperly.
+		 *
+		 * Both of these cases indicate file system corruption, or a bug
+		 * in the backref walking code.  The ASSERT() is to make sure
+		 * developers get bitten as soon as possible, proper error
+		 * handling is for users who may have corrupt file systems.
+		 */
+		if (!root) {
+			ASSERT(root);
+			btrfs_err(trans->fs_info,
+		"bytenr %llu doesn't have a backref path ending in a root",
+				  node->bytenr);
+			return ERR_PTR(-EUCLEAN);
+		}
+		if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state)) {
+			ASSERT(test_bit(BTRFS_ROOT_SHAREABLE, &root->state));
+			btrfs_err(trans->fs_info,
+"bytenr %llu has multiple refs with one ending in a non shareable root",
+				  node->bytenr);
+			return ERR_PTR(-EUCLEAN);
+		}
 
 		if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
 			record_reloc_root_in_trans(trans, root);
@@ -2008,8 +2035,24 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
 		root = root->reloc_root;
 
 		if (next->new_bytenr != root->node->start) {
-			BUG_ON(next->new_bytenr);
-			BUG_ON(!list_empty(&next->list));
+			/*
+			 * We just created the reloc root, so we shouldn't have
+			 * ->new_bytenr set and this shouldn't be in the changed
+			 *  list.  If it is then we have multiple roots pointing
+			 *  at the same bytenr, or we've made a mistake in the
+			 *  backref walking code.  ASSERT() for developers,
+			 *  error out for users, as it indicates corruption or a
+			 *  bad bug.
+			 */
+			ASSERT(next->new_bytenr == 0);
+			ASSERT(list_empty(&next->list));
+			if (next->new_bytenr || !list_empty(&next->list)) {
+				btrfs_err(trans->fs_info,
+"bytenr %llu possibly has multiple roots pointing at the same bytenr %llu",
+					  node->bytenr, next->bytenr);
+				return ERR_PTR(-EUCLEAN);
+			}
+
 			next->new_bytenr = root->node->start;
 			btrfs_put_root(next->root);
 			next->root = btrfs_grab_root(root);
-- 
2.26.2


* [PATCH v3 15/54] btrfs: check record_root_in_trans related failures in select_reloc_root
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (13 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 14/54] btrfs: convert BUG_ON()'s in select_reloc_root() to proper errors Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:33   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 16/54] btrfs: do proper error handling in record_reloc_root_in_trans Josef Bacik
                   ` (38 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We will record the fs root or the reloc root in the trans in
select_reloc_root.  These will actually return errors in the following
patches, so check their return value here and return it up the stack.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index bf4e1018356a..d663d8fc085d 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1990,6 +1990,7 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
 	struct btrfs_backref_node *next;
 	struct btrfs_root *root;
 	int index = 0;
+	int ret;
 
 	next = node;
 	while (1) {
@@ -2027,11 +2028,15 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
 		}
 
 		if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
-			record_reloc_root_in_trans(trans, root);
+			ret = record_reloc_root_in_trans(trans, root);
+			if (ret)
+				return ERR_PTR(ret);
 			break;
 		}
 
-		btrfs_record_root_in_trans(trans, root);
+		ret = btrfs_record_root_in_trans(trans, root);
+		if (ret)
+			return ERR_PTR(ret);
 		root = root->reloc_root;
 
 		if (next->new_bytenr != root->node->start) {
-- 
2.26.2


* [PATCH v3 16/54] btrfs: do proper error handling in record_reloc_root_in_trans
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (14 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 15/54] btrfs: check record_root_in_trans related failures in select_reloc_root Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:39   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 17/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename_exchange Josef Bacik
                   ` (37 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Generally speaking this shouldn't ever fail: the corresponding fs root
for the reloc root will already be in memory, so we won't get -ENOMEM
here.

However if there is no corresponding root for the reloc root then we
could get -ENOMEM when we try to allocate it or we could get -ENOENT
when we look it up and see that it doesn't exist.

Convert these BUG_ON()'s into ASSERT()'s + proper error handling for the
case of corruption.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index d663d8fc085d..5a4b44857522 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1973,8 +1973,30 @@ static int record_reloc_root_in_trans(struct btrfs_trans_handle *trans,
 		return 0;
 
 	root = btrfs_get_fs_root(fs_info, reloc_root->root_key.offset, false);
-	BUG_ON(IS_ERR(root));
-	BUG_ON(root->reloc_root != reloc_root);
+
+	/*
+	 * This should succeed, since we can't have a reloc root without having
+	 * already looked up the actual root and created the reloc root for this
+	 * root.
+	 *
+	 * However if there's some sort of corruption where we have a ref to a
+	 * reloc root without a corresponding root this could return -ENOENT.
+	 *
+	 * The ASSERT()'s are to catch this case in testing, because it could
+	 * indicate a bug, but for non-developers it indicates corruption and we
+	 * should error out.
+	 */
+	ASSERT(!IS_ERR(root));
+	ASSERT(root->reloc_root == reloc_root);
+	if (IS_ERR(root))
+		return PTR_ERR(root);
+	if (root->reloc_root != reloc_root) {
+		btrfs_err(fs_info,
+			  "root %llu has two reloc roots associated with it",
+			  reloc_root->root_key.offset);
+		btrfs_put_root(root);
+		return -EUCLEAN;
+	}
 	ret = btrfs_record_root_in_trans(trans, root);
 	btrfs_put_root(root);
 
-- 
2.26.2


* [PATCH v3 17/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename_exchange
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (15 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 16/54] btrfs: do proper error handling in record_reloc_root_in_trans Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:40   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 18/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename Josef Bacik
                   ` (36 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_record_root_in_trans will return errors in the future, so handle
the error properly in btrfs_rename_exchange.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/inode.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0ce42d52d53e..d34cba37a08f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8878,8 +8878,11 @@ static int btrfs_rename_exchange(struct inode *old_dir,
 		goto out_notrans;
 	}
 
-	if (dest != root)
-		btrfs_record_root_in_trans(trans, dest);
+	if (dest != root) {
+		ret = btrfs_record_root_in_trans(trans, dest);
+		if (ret)
+			goto out_fail;
+	}
 
 	/*
 	 * We need to find a free sequence number both in the source and
-- 
2.26.2


* [PATCH v3 18/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (16 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 17/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename_exchange Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 19/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_delete_subvolume Josef Bacik
                   ` (35 subsequent siblings)
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_record_root_in_trans will return errors in the future, so handle
the error properly in btrfs_rename.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/inode.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d34cba37a08f..40601a0ff4f2 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9186,8 +9186,11 @@ static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		goto out_notrans;
 	}
 
-	if (dest != root)
-		btrfs_record_root_in_trans(trans, dest);
+	if (dest != root) {
+		ret = btrfs_record_root_in_trans(trans, dest);
+		if (ret)
+			goto out_fail;
+	}
 
 	ret = btrfs_set_inode_index(BTRFS_I(new_dir), &index);
 	if (ret)
-- 
2.26.2


* [PATCH v3 19/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_delete_subvolume
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (17 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 18/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:41   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 20/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_recover_log_trees Josef Bacik
                   ` (34 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_record_root_in_trans will return errors in the future, so handle
the error properly in btrfs_delete_subvolume.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/inode.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 40601a0ff4f2..1f9fa63ef194 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4157,7 +4157,11 @@ int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry)
 		goto out_end_trans;
 	}
 
-	btrfs_record_root_in_trans(trans, dest);
+	ret = btrfs_record_root_in_trans(trans, dest);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto out_end_trans;
+	}
 
 	memset(&dest->root_item.drop_progress, 0,
 		sizeof(dest->root_item.drop_progress));
-- 
2.26.2


* [PATCH v3 20/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_recover_log_trees
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (18 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 19/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_delete_subvolume Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:42   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 21/54] btrfs: handle btrfs_record_root_in_trans failure in create_subvol Josef Bacik
                   ` (33 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_record_root_in_trans will return errors in the future, so handle
the error properly in btrfs_recover_log_trees.

This looks tricky; however, we hold a reference on the destination
root, so if this fails we need to continue on in the loop to make sure
the proper cleanup is done.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/tree-log.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 254c2ee43aae..77adeb3c988d 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -6286,8 +6286,12 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
 		}
 
 		wc.replay_dest->log_root = log;
-		btrfs_record_root_in_trans(trans, wc.replay_dest);
-		ret = walk_log_tree(trans, log, &wc);
+		ret = btrfs_record_root_in_trans(trans, wc.replay_dest);
+		if (ret)
+			btrfs_handle_fs_error(fs_info, ret,
+				"Couldn't record the root in the transaction.");
+		else
+			ret = walk_log_tree(trans, log, &wc);
 
 		if (!ret && wc.stage == LOG_WALK_REPLAY_ALL) {
 			ret = fixup_inode_link_counts(trans, wc.replay_dest,
-- 
2.26.2
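
The control flow described in the commit message above can be sketched in
userspace roughly as follows. This is only an illustration under stated
assumptions: struct demo_root and the demo_* helpers are hypothetical
stand-ins, not btrfs code. The point is that a recording failure skips the
walk but still falls through to the reference drop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted root; not the btrfs structure. */
struct demo_root {
	int refs;		/* references currently held */
	int fail_record;	/* non-zero: pretend recording fails */
	int walked;		/* set when the walk step ran */
};

/* Simulated recording step that may fail. */
static int demo_record_root(struct demo_root *root)
{
	return root->fail_record ? -ENOMEM : 0;
}

/* Simulated walk step; only reached when recording succeeded. */
static int demo_walk(struct demo_root *root)
{
	root->walked = 1;
	return 0;
}

/*
 * Replay one root. On a recording failure the walk is skipped, but the
 * reference taken at the top is still dropped before returning the error.
 */
int demo_replay_one(struct demo_root *root)
{
	int ret;

	assert(root != NULL);
	root->refs++;		/* reference held for the loop body */

	ret = demo_record_root(root);
	if (!ret)
		ret = demo_walk(root);

	root->refs--;		/* cleanup runs on success and on failure */
	return ret;
}
```

An early return on the recording error would leak the reference in this
model, which is why the sketch uses if/else-style fall-through instead.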


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 21/54] btrfs: handle btrfs_record_root_in_trans failure in create_subvol
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (19 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 20/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_recover_log_trees Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:43   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 22/54] btrfs: btrfs: handle btrfs_record_root_in_trans failure in relocate_tree_block Josef Bacik
                   ` (32 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_record_root_in_trans will return errors in the future, so handle
the error properly in create_subvol.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ioctl.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 703212ff50a5..ad50e654ee64 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -714,7 +714,11 @@ static noinline int create_subvol(struct inode *dir,
 	/* Freeing will be done in btrfs_put_root() of new_root */
 	anon_dev = 0;
 
-	btrfs_record_root_in_trans(trans, new_root);
+	ret = btrfs_record_root_in_trans(trans, new_root);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto fail;
+	}
 
 	ret = btrfs_create_subvol_root(trans, new_root, root, new_dirid);
 	btrfs_put_root(new_root);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 22/54] btrfs: btrfs: handle btrfs_record_root_in_trans failure in relocate_tree_block
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (20 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 21/54] btrfs: handle btrfs_record_root_in_trans failure in create_subvol Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:44   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 23/54] btrfs: handle btrfs_record_root_in_trans failure in start_transaction Josef Bacik
                   ` (31 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_record_root_in_trans will return errors in the future, so handle
the error properly in relocate_tree_block.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 5a4b44857522..e9d445899818 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2559,7 +2559,9 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
 				ret = -EUCLEAN;
 				goto out;
 			}
-			btrfs_record_root_in_trans(trans, root);
+			ret = btrfs_record_root_in_trans(trans, root);
+			if (ret)
+				goto out;
 			root = root->reloc_root;
 			node->new_bytenr = root->node->start;
 			btrfs_put_root(node->root);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 23/54] btrfs: handle btrfs_record_root_in_trans failure in start_transaction
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (21 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 22/54] btrfs: btrfs: handle btrfs_record_root_in_trans failure in relocate_tree_block Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:47   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 24/54] btrfs: handle record_root_in_trans failure in qgroup_account_snapshot Josef Bacik
                   ` (30 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_record_root_in_trans will return errors in the future, so handle
the error properly in start_transaction.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 28e7a7464b60..c17ab5194f5a 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -734,7 +734,11 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
 	 * Thus it need to be called after current->journal_info initialized,
 	 * or we can deadlock.
 	 */
-	btrfs_record_root_in_trans(h, root);
+	ret = btrfs_record_root_in_trans(h, root);
+	if (ret) {
+		btrfs_end_transaction(h);
+		return ERR_PTR(ret);
+	}
 
 	return h;
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 24/54] btrfs: handle record_root_in_trans failure in qgroup_account_snapshot
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (22 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 23/54] btrfs: handle btrfs_record_root_in_trans failure in start_transaction Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:48   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 25/54] btrfs: handle record_root_in_trans failure in btrfs_record_root_in_trans Josef Bacik
                   ` (29 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

record_root_in_trans can fail currently, so handle this failure
properly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index c17ab5194f5a..db676d99b098 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1436,7 +1436,9 @@ static int qgroup_account_snapshot(struct btrfs_trans_handle *trans,
 	 * recorded root will never be updated again, causing an outdated root
 	 * item.
 	 */
-	record_root_in_trans(trans, src, 1);
+	ret = record_root_in_trans(trans, src, 1);
+	if (ret)
+		return ret;
 
 	/*
 	 * We are going to commit transaction, see btrfs_commit_transaction()
@@ -1488,7 +1490,7 @@ static int qgroup_account_snapshot(struct btrfs_trans_handle *trans,
 	 * insert_dir_item()
 	 */
 	if (!ret)
-		record_root_in_trans(trans, parent, 1);
+		ret = record_root_in_trans(trans, parent, 1);
 	return ret;
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 25/54] btrfs: handle record_root_in_trans failure in btrfs_record_root_in_trans
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (23 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 24/54] btrfs: handle record_root_in_trans failure in qgroup_account_snapshot Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 26/54] btrfs: handle record_root_in_trans failure in create_pending_snapshot Josef Bacik
                   ` (28 subsequent siblings)
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

record_root_in_trans can currently fail, so handle this failure
properly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index db676d99b098..087d919de9fb 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -480,6 +480,7 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
 			       struct btrfs_root *root)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
+	int ret;
 
 	if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 		return 0;
@@ -494,10 +495,10 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
 		return 0;
 
 	mutex_lock(&fs_info->reloc_mutex);
-	record_root_in_trans(trans, root, 0);
+	ret = record_root_in_trans(trans, root, 0);
 	mutex_unlock(&fs_info->reloc_mutex);
 
-	return 0;
+	return ret;
 }
 
 static inline int is_transaction_blocked(struct btrfs_transaction *trans)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 26/54] btrfs: handle record_root_in_trans failure in create_pending_snapshot
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (24 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 25/54] btrfs: handle record_root_in_trans failure in btrfs_record_root_in_trans Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  2:56   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 27/54] btrfs: do not panic in __add_reloc_root Josef Bacik
                   ` (27 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

record_root_in_trans can currently fail, so handle this failure
properly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 087d919de9fb..5393c0c4926c 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1568,8 +1568,9 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 	dentry = pending->dentry;
 	parent_inode = pending->dir;
 	parent_root = BTRFS_I(parent_inode)->root;
-	record_root_in_trans(trans, parent_root, 0);
-
+	ret = record_root_in_trans(trans, parent_root, 0);
+	if (ret)
+		goto fail;
 	cur_time = current_time(parent_inode);
 
 	/*
@@ -1605,7 +1606,11 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 		goto fail;
 	}
 
-	record_root_in_trans(trans, root, 0);
+	ret = record_root_in_trans(trans, root, 0);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto fail;
+	}
 	btrfs_set_root_last_snapshot(&root->root_item, trans->transid);
 	memcpy(new_root_item, &root->root_item, sizeof(*new_root_item));
 	btrfs_check_and_init_root_item(new_root_item);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 27/54] btrfs: do not panic in __add_reloc_root
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (25 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 26/54] btrfs: handle record_root_in_trans failure in create_pending_snapshot Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  3:00   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 28/54] btrfs: have proper error handling in btrfs_init_reloc_root Josef Bacik
                   ` (26 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we have a duplicate entry for a reloc root then we could have fs
corruption that resulted in a double allocation.  This generally
shouldn't happen, so leave an ASSERT() for this case, but return an
error instead of panicking in the normal user case.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index e9d445899818..7993a34a46ca 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -637,10 +637,12 @@ static int __must_check __add_reloc_root(struct btrfs_root *root)
 	rb_node = rb_simple_insert(&rc->reloc_root_tree.rb_root,
 				   node->bytenr, &node->rb_node);
 	spin_unlock(&rc->reloc_root_tree.lock);
+	ASSERT(rb_node == NULL);
 	if (rb_node) {
-		btrfs_panic(fs_info, -EEXIST,
+		btrfs_err(fs_info,
 			    "Duplicate root found for start=%llu while inserting into relocation tree",
 			    node->bytenr);
+		return -EEXIST;
 	}
 
 	list_add_tail(&root->root_list, &rc->reloc_roots);
-- 
2.26.2
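
As a rough userspace illustration of the "return -EEXIST instead of
panicking" behaviour, the sketch below uses a sorted singly linked list in
place of the rb-tree. Everything named demo_* is hypothetical and exists only
for this example; the debug-build ASSERT() from the patch is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical node keyed by bytenr; the list stands in for the rb-tree. */
struct demo_node {
	unsigned long long bytenr;
	struct demo_node *next;
};

/* Returns NULL on success, or the existing node that already holds the key. */
static struct demo_node *demo_simple_insert(struct demo_node **head,
					    struct demo_node *node)
{
	struct demo_node **p = head;

	while (*p && (*p)->bytenr < node->bytenr)
		p = &(*p)->next;
	if (*p && (*p)->bytenr == node->bytenr)
		return *p;	/* duplicate key: nothing is inserted */
	node->next = *p;
	*p = node;
	return NULL;
}

/* Reports a duplicate to the caller instead of taking the system down. */
int demo_add_root(struct demo_node **head, struct demo_node *node)
{
	assert(head != NULL && node != NULL);
	if (demo_simple_insert(head, node))
		return -EEXIST;
	return 0;
}
```

The caller then decides what to do with -EEXIST, which is what the next
patch in the series does for btrfs_init_reloc_root.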


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 28/54] btrfs: have proper error handling in btrfs_init_reloc_root
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (26 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 27/54] btrfs: do not panic in __add_reloc_root Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 29/54] btrfs: do proper error handling in create_reloc_root Josef Bacik
                   ` (25 subsequent siblings)
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

create_reloc_root will return errors in the future, and __add_reloc_root
can return -ENOMEM or -EEXIST, so handle these errors properly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 7993a34a46ca..6d3a80d54b32 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -860,9 +860,14 @@ int btrfs_init_reloc_root(struct btrfs_trans_handle *trans,
 	reloc_root = create_reloc_root(trans, root, root->root_key.objectid);
 	if (clear_rsv)
 		trans->block_rsv = rsv;
+	if (IS_ERR(reloc_root))
+		return PTR_ERR(reloc_root);
 
 	ret = __add_reloc_root(reloc_root);
-	BUG_ON(ret < 0);
+	if (ret) {
+		btrfs_put_root(reloc_root);
+		return ret;
+	}
 	root->reloc_root = btrfs_grab_root(reloc_root);
 	return 0;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 29/54] btrfs: do proper error handling in create_reloc_root
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (27 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 28/54] btrfs: have proper error handling in btrfs_init_reloc_root Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  3:29   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 30/54] btrfs: validate ->reloc_root after recording root in trans Josef Bacik
                   ` (24 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We do memory allocations and read blocks from disk here, all sorts of
operations that could easily fail at any given point.  Instead of
panicking the box, simply return the error back up the chain; all
callers at this point have proper error handling.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 6d3a80d54b32..cebf8e9d7d96 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -737,10 +737,11 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans,
 	struct extent_buffer *eb;
 	struct btrfs_root_item *root_item;
 	struct btrfs_key root_key;
-	int ret;
+	int ret = 0;
 
 	root_item = kmalloc(sizeof(*root_item), GFP_NOFS);
-	BUG_ON(!root_item);
+	if (!root_item)
+		return ERR_PTR(-ENOMEM);
 
 	root_key.objectid = BTRFS_TREE_RELOC_OBJECTID;
 	root_key.type = BTRFS_ROOT_ITEM_KEY;
@@ -752,7 +753,9 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans,
 		/* called by btrfs_init_reloc_root */
 		ret = btrfs_copy_root(trans, root, root->commit_root, &eb,
 				      BTRFS_TREE_RELOC_OBJECTID);
-		BUG_ON(ret);
+		if (ret)
+			goto fail;
+
 		/*
 		 * Set the last_snapshot field to the generation of the commit
 		 * root - like this ctree.c:btrfs_block_can_be_shared() behaves
@@ -773,7 +776,8 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans,
 		 */
 		ret = btrfs_copy_root(trans, root, root->node, &eb,
 				      BTRFS_TREE_RELOC_OBJECTID);
-		BUG_ON(ret);
+		if (ret)
+			goto fail;
 	}
 
 	memcpy(root_item, &root->root_item, sizeof(*root_item));
@@ -793,14 +797,20 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans,
 
 	ret = btrfs_insert_root(trans, fs_info->tree_root,
 				&root_key, root_item);
-	BUG_ON(ret);
+	if (ret)
+		goto fail;
+
 	kfree(root_item);
 
 	reloc_root = btrfs_read_tree_root(fs_info->tree_root, &root_key);
-	BUG_ON(IS_ERR(reloc_root));
+	if (IS_ERR(reloc_root))
+		return reloc_root;
 	set_bit(BTRFS_ROOT_SHAREABLE, &reloc_root->state);
 	reloc_root->last_trans = trans->transid;
 	return reloc_root;
+fail:
+	kfree(root_item);
+	return ERR_PTR(ret);
 }
 
 /*
-- 
2.26.2
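
For readers less familiar with the pattern, here is a minimal userspace
sketch of returning an error through a pointer with a single "fail" cleanup
label. The demo_err_ptr/demo_ptr_err/demo_is_err helpers only approximate the
kernel's ERR_PTR/PTR_ERR/IS_ERR and assume that error codes map onto the top
4095 addresses; struct demo_root and the step_err parameter are made up for
the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace approximations of the kernel helpers (assumption, see above). */
#define DEMO_MAX_ERRNO 4095

static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_root {
	int id;
};

/* step_err simulates a failing intermediate step (0 means success). */
struct demo_root *demo_create_root(int id, int step_err)
{
	struct demo_root *root;
	int *scratch;
	int ret;

	assert(step_err <= 0);

	/* Temporary buffer, freed on every exit path. */
	scratch = malloc(sizeof(*scratch));
	if (!scratch)
		return demo_err_ptr(-ENOMEM);

	ret = step_err;
	if (ret)
		goto fail;

	root = malloc(sizeof(*root));
	if (!root) {
		ret = -ENOMEM;
		goto fail;
	}
	root->id = id;
	free(scratch);
	return root;
fail:
	free(scratch);
	return demo_err_ptr(ret);
}
```

A caller checks the result with demo_is_err() before dereferencing it, the
same way btrfs_init_reloc_root checks IS_ERR(reloc_root) in patch 28.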


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 30/54] btrfs: validate ->reloc_root after recording root in trans
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (28 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 29/54] btrfs: do proper error handling in create_reloc_root Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  4:49   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 31/54] btrfs: handle btrfs_update_reloc_root failure in commit_fs_roots Josef Bacik
                   ` (23 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Zygo Blaxell

If we fail to set up a ->reloc_root in a different thread, that path
will error out; however, it leaves root->reloc_root NULL while the root
still appears set up in the transaction.  Subsequent calls to
btrfs_record_root_in_trans would succeed without attempting to create
the reloc root, as the transid has already been updated.  Handle this
case by making sure we have a root->reloc_root set after a
btrfs_record_root_in_trans call so we don't end up deref'ing a NULL
pointer.

Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index cebf8e9d7d96..c9df05f02649 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2078,6 +2078,13 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
 			return ERR_PTR(ret);
 		root = root->reloc_root;
 
+		/*
+		 * We could have raced with another thread which failed, so
+		 * ->reloc_root may not be set, return -ENOENT in this case.
+		 */
+		if (!root)
+			return ERR_PTR(-ENOENT);
+
 		if (next->new_bytenr != root->node->start) {
 			/*
 			 * We just created the reloc root, so we shouldn't have
@@ -2579,6 +2586,14 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
 			ret = btrfs_record_root_in_trans(trans, root);
 			if (ret)
 				goto out;
+			/*
+			 * Another thread could have failed, need to check if we
+			 * have ->reloc_root actually set.
+			 */
+			if (!root->reloc_root) {
+				ret = -ENOENT;
+				goto out;
+			}
 			root = root->reloc_root;
 			node->new_bytenr = root->node->start;
 			btrfs_put_root(node->root);
-- 
2.26.2
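
The failure mode described above can be modelled, very loosely and in a
single thread, as below. This is a sketch under assumptions, not the btrfs
logic: the demo_* names are hypothetical, and passing a NULL reloc argument
simply simulates the setup failing in the other thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified root. */
struct demo_root {
	unsigned long long last_trans;
	struct demo_root *reloc_root;
};

/*
 * Record the root for transid. The transid is updated before the reloc
 * root is set up, so a failed setup leaves reloc_root NULL while later
 * calls for the same transid return 0 without retrying.
 */
int demo_record_root(struct demo_root *root, unsigned long long transid,
		     struct demo_root *reloc)
{
	if (root->last_trans == transid)
		return 0;	/* already recorded: no second attempt */
	root->last_trans = transid;
	if (!reloc)
		return -ENOMEM;	/* simulated setup failure */
	root->reloc_root = reloc;
	return 0;
}

/* Success from demo_record_root() is not enough; validate the pointer. */
int demo_use_reloc_root(struct demo_root *root, unsigned long long transid,
			struct demo_root *reloc, struct demo_root **out)
{
	int ret;

	assert(root != NULL && out != NULL);
	ret = demo_record_root(root, transid, reloc);
	if (ret)
		return ret;
	if (!root->reloc_root)
		return -ENOENT;
	*out = root->reloc_root;
	return 0;
}
```

In this model the second caller in the same transaction gets -ENOENT rather
than a NULL pointer to dereference.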


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 31/54] btrfs: handle btrfs_update_reloc_root failure in commit_fs_roots
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (29 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 30/54] btrfs: validate ->reloc_root after recording root in trans Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  4:51   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 32/54] btrfs: change insert_dirty_subvol to return errors Josef Bacik
                   ` (22 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_update_reloc_root will return errors in the future, so handle
the error properly in commit_fs_roots.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 5393c0c4926c..5064beff3f9f 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1344,7 +1344,9 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
 			spin_unlock(&fs_info->fs_roots_radix_lock);
 
 			btrfs_free_log(trans, root);
-			btrfs_update_reloc_root(trans, root);
+			err = btrfs_update_reloc_root(trans, root);
+			if (err)
+				return err;
 
 			/* see comments in should_cow_block() */
 			clear_bit(BTRFS_ROOT_FORCE_COW, &root->state);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 32/54] btrfs: change insert_dirty_subvol to return errors
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (30 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 31/54] btrfs: handle btrfs_update_reloc_root failure in commit_fs_roots Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 33/54] btrfs: handle btrfs_update_reloc_root failure in insert_dirty_subvol Josef Bacik
                   ` (21 subsequent siblings)
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

insert_dirty_subvol will be able to return errors in the future, so
change it to return an int and handle the error in merge_reloc_root.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index c9df05f02649..6b2d7168f98e 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1556,9 +1556,9 @@ static int find_next_key(struct btrfs_path *path, int level,
 /*
  * Insert current subvolume into reloc_control::dirty_subvol_roots
  */
-static void insert_dirty_subvol(struct btrfs_trans_handle *trans,
-				struct reloc_control *rc,
-				struct btrfs_root *root)
+static int insert_dirty_subvol(struct btrfs_trans_handle *trans,
+			       struct reloc_control *rc,
+			       struct btrfs_root *root)
 {
 	struct btrfs_root *reloc_root = root->reloc_root;
 	struct btrfs_root_item *reloc_root_item;
@@ -1578,6 +1578,7 @@ static void insert_dirty_subvol(struct btrfs_trans_handle *trans,
 		btrfs_grab_root(root);
 		list_add_tail(&root->reloc_dirty_list, &rc->dirty_subvol_roots);
 	}
+	return 0;
 }
 
 static int clean_dirty_subvols(struct reloc_control *rc)
@@ -1779,8 +1780,11 @@ static noinline_for_stack int merge_reloc_root(struct reloc_control *rc,
 out:
 	btrfs_free_path(path);
 
-	if (ret == 0)
-		insert_dirty_subvol(trans, rc, root);
+	if (ret == 0) {
+		ret = insert_dirty_subvol(trans, rc, root);
+		if (ret)
+			btrfs_abort_transaction(trans, ret);
+	}
 
 	if (trans)
 		btrfs_end_transaction_throttle(trans);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 33/54] btrfs: handle btrfs_update_reloc_root failure in insert_dirty_subvol
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (31 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 32/54] btrfs: change insert_dirty_subvol to return errors Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 34/54] btrfs: handle btrfs_update_reloc_root failure in prepare_to_merge Josef Bacik
                   ` (20 subsequent siblings)
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_update_reloc_root will return errors in the future, so handle
the error properly in insert_dirty_subvol.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 6b2d7168f98e..96cc9376b3a6 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1562,6 +1562,7 @@ static int insert_dirty_subvol(struct btrfs_trans_handle *trans,
 {
 	struct btrfs_root *reloc_root = root->reloc_root;
 	struct btrfs_root_item *reloc_root_item;
+	int ret;
 
 	/* @root must be a subvolume tree root with a valid reloc tree */
 	ASSERT(root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID);
@@ -1572,7 +1573,9 @@ static int insert_dirty_subvol(struct btrfs_trans_handle *trans,
 		sizeof(reloc_root_item->drop_progress));
 	btrfs_set_root_drop_level(reloc_root_item, 0);
 	btrfs_set_root_refs(reloc_root_item, 0);
-	btrfs_update_reloc_root(trans, root);
+	ret = btrfs_update_reloc_root(trans, root);
+	if (ret)
+		return ret;
 
 	if (list_empty(&root->reloc_dirty_list)) {
 		btrfs_grab_root(root);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 34/54] btrfs: handle btrfs_update_reloc_root failure in prepare_to_merge
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (32 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 33/54] btrfs: handle btrfs_update_reloc_root failure in insert_dirty_subvol Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-02 19:50 ` [PATCH v3 35/54] btrfs: do proper error handling in btrfs_update_reloc_root Josef Bacik
                   ` (19 subsequent siblings)
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_update_reloc_root will return errors in the future, so handle
an error properly in prepare_to_merge.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 96cc9376b3a6..e41d14958b8b 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1860,10 +1860,21 @@ int prepare_to_merge(struct reloc_control *rc, int err)
 		 */
 		if (!err)
 			btrfs_set_root_refs(&reloc_root->root_item, 1);
-		btrfs_update_reloc_root(trans, root);
+		ret = btrfs_update_reloc_root(trans, root);
 
+		/*
+		 * Even if we have an error we need this reloc root back on our
+		 * list so we can clean up properly.
+		 */
 		list_add(&reloc_root->root_list, &reloc_roots);
 		btrfs_put_root(root);
+
+		if (ret) {
+			btrfs_abort_transaction(trans, ret);
+			if (!err)
+				err = ret;
+			break;
+		}
 	}
 
 	list_splice(&reloc_roots, &rc->reloc_roots);
-- 
2.26.2
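
The ordering in the hunk above (put the item back first, then look at the
error, and never overwrite an earlier error) can be sketched in plain C as
follows. The array, struct demo_item and demo_prepare are hypothetical and
only stand in for the list handling in prepare_to_merge.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical item: update_err is the simulated result of its update. */
struct demo_item {
	int update_err;
	int relisted;	/* set once the item is back on the cleanup list */
};

/*
 * Process items in order. Each item that was touched is re-listed before
 * its update result is checked, so cleanup can still find it. The first
 * error wins: an incoming err is never overwritten by a later one.
 */
int demo_prepare(struct demo_item *items, size_t n, int err)
{
	size_t i;

	assert(items != NULL || n == 0);
	for (i = 0; i < n; i++) {
		int ret = items[i].update_err;

		items[i].relisted = 1;	/* re-list before checking ret */
		if (ret) {
			if (!err)
				err = ret;
			break;
		}
	}
	return err;
}
```

Items after the failing one are left untouched in this sketch.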


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 35/54] btrfs: do proper error handling in btrfs_update_reloc_root
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (33 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 34/54] btrfs: handle btrfs_update_reloc_root failure in prepare_to_merge Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  4:54   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 36/54] btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s Josef Bacik
                   ` (18 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We call btrfs_update_root in btrfs_update_reloc_root, which can fail for
all sorts of reasons, including IO errors.  Instead of panicking the box,
let's return the error, now that all callers properly handle those
errors.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index e41d14958b8b..2fcb07bc8450 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -894,7 +894,7 @@ int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
 	int ret;
 
 	if (!have_reloc_root(root))
-		goto out;
+		return 0;
 
 	reloc_root = root->reloc_root;
 	root_item = &reloc_root->root_item;
@@ -927,10 +927,8 @@ int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
 
 	ret = btrfs_update_root(trans, fs_info->tree_root,
 				&reloc_root->root_key, root_item);
-	BUG_ON(ret);
 	btrfs_put_root(reloc_root);
-out:
-	return 0;
+	return ret;
 }
 
 /*
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 36/54] btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (34 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 35/54] btrfs: do proper error handling in btrfs_update_reloc_root Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  4:55   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 37/54] btrfs: handle initial btrfs_cow_block error in replace_path Josef Bacik
                   ` (17 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

A few BUG_ON()'s in replace_path are purely to keep us from making
logical mistakes, so replace them with ASSERT()'s.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 2fcb07bc8450..b872a64de8bb 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1202,8 +1202,8 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
 	int ret;
 	int slot;
 
-	BUG_ON(src->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID);
-	BUG_ON(dest->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID);
+	ASSERT(src->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID);
+	ASSERT(dest->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID);
 
 	last_snapshot = btrfs_root_last_snapshot(&src->root_item);
 again:
@@ -1234,7 +1234,7 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
 	parent = eb;
 	while (1) {
 		level = btrfs_header_level(parent);
-		BUG_ON(level < lowest_level);
+		ASSERT(level >= lowest_level);
 
 		ret = btrfs_bin_search(parent, &key, &slot);
 		if (ret < 0)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 37/54] btrfs: handle initial btrfs_cow_block error in replace_path
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (35 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 36/54] btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  5:05   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 38/54] btrfs: handle the loop " Josef Bacik
                   ` (16 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we error out cow'ing the root node when doing a replace_path then we
simply unlock and free the buffer and return the error.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index b872a64de8bb..52d6e7ab4265 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1222,7 +1222,11 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
 	if (cow) {
 		ret = btrfs_cow_block(trans, dest, eb, NULL, 0, &eb,
 				      BTRFS_NESTING_COW);
-		BUG_ON(ret);
+		if (ret) {
+			btrfs_tree_unlock(eb);
+			free_extent_buffer(eb);
+			return ret;
+		}
 	}
 
 	if (next_key) {
-- 
2.26.2



* [PATCH v3 38/54] btrfs: handle the loop btrfs_cow_block error in replace_path
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (36 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 37/54] btrfs: handle initial btrfs_cow_block error in replace_path Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  5:11   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 39/54] btrfs: handle btrfs_search_slot failure " Josef Bacik
                   ` (15 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

As we loop through the path to replace it, we will have to cow each node
we hit on the path down to the lowest_level.  If this fails we simply
unlock and free the block and break from the loop.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 52d6e7ab4265..781908f3a3af 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1286,7 +1286,11 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
 				ret = btrfs_cow_block(trans, dest, eb, parent,
 						      slot, &eb,
 						      BTRFS_NESTING_COW);
-				BUG_ON(ret);
+				if (ret) {
+					btrfs_tree_unlock(eb);
+					free_extent_buffer(eb);
+					break;
+				}
 			}
 
 			btrfs_tree_unlock(parent);
-- 
2.26.2



* [PATCH v3 39/54] btrfs: handle btrfs_search_slot failure in replace_path
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (37 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 38/54] btrfs: handle the loop " Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  5:13   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 40/54] btrfs: handle errors in reference count manipulation " Josef Bacik
                   ` (14 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_search_slot() can fail for any number of reasons, so there is no
reason to bring the whole box down with it.  Break out of the loop
instead of hitting the BUG_ON().

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 781908f3a3af..8c407ebc5500 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1314,7 +1314,8 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
 		path->lowest_level = level;
 		ret = btrfs_search_slot(trans, src, &key, path, 0, 1);
 		path->lowest_level = 0;
-		BUG_ON(ret);
+		if (ret)
+			break;
 
 		/*
 		 * Info qgroup to trace both subtrees.
-- 
2.26.2



* [PATCH v3 40/54] btrfs: handle errors in reference count manipulation in replace_path
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (38 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 39/54] btrfs: handle btrfs_search_slot failure " Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  5:14   ` Qu Wenruo
  2020-12-02 19:50 ` [PATCH v3 41/54] btrfs: handle extent reference errors in do_relocation Josef Bacik
                   ` (13 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If any of the reference count manipulations in replace_path fail we
need to abort the transaction, as we've already modified the blocks.
We can simply break at this point and everything will be cleaned up.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 8c407ebc5500..ef33b89e352e 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1355,27 +1355,39 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
 		ref.skip_qgroup = true;
 		btrfs_init_tree_ref(&ref, level - 1, src->root_key.objectid);
 		ret = btrfs_inc_extent_ref(trans, &ref);
-		BUG_ON(ret);
+		if (ret) {
+			btrfs_abort_transaction(trans, ret);
+			break;
+		}
 		btrfs_init_generic_ref(&ref, BTRFS_ADD_DELAYED_REF, new_bytenr,
 				       blocksize, 0);
 		ref.skip_qgroup = true;
 		btrfs_init_tree_ref(&ref, level - 1, dest->root_key.objectid);
 		ret = btrfs_inc_extent_ref(trans, &ref);
-		BUG_ON(ret);
+		if (ret) {
+			btrfs_abort_transaction(trans, ret);
+			break;
+		}
 
 		btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF, new_bytenr,
 				       blocksize, path->nodes[level]->start);
 		btrfs_init_tree_ref(&ref, level - 1, src->root_key.objectid);
 		ref.skip_qgroup = true;
 		ret = btrfs_free_extent(trans, &ref);
-		BUG_ON(ret);
+		if (ret) {
+			btrfs_abort_transaction(trans, ret);
+			break;
+		}
 
 		btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF, old_bytenr,
 				       blocksize, 0);
 		btrfs_init_tree_ref(&ref, level - 1, dest->root_key.objectid);
 		ref.skip_qgroup = true;
 		ret = btrfs_free_extent(trans, &ref);
-		BUG_ON(ret);
+		if (ret) {
+			btrfs_abort_transaction(trans, ret);
+			break;
+		}
 
 		btrfs_unlock_up_safe(path, 0);
 
-- 
2.26.2



* [PATCH v3 41/54] btrfs: handle extent reference errors in do_relocation
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (39 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 40/54] btrfs: handle errors in reference count manipulation " Josef Bacik
@ 2020-12-02 19:50 ` Josef Bacik
  2020-12-03  5:15   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 42/54] btrfs: check for BTRFS_BLOCK_FLAG_FULL_BACKREF being set improperly Josef Bacik
                   ` (12 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We can already deal with errors appropriately from do_relocation, so
simply handle any errors that come from changing the refs at this point
cleanly.  We have to abort the transaction if we fail here as we've
already modified metadata.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index ef33b89e352e..3159f6517588 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2433,10 +2433,11 @@ static int do_relocation(struct btrfs_trans_handle *trans,
 			btrfs_init_tree_ref(&ref, node->level,
 					    btrfs_header_owner(upper->eb));
 			ret = btrfs_inc_extent_ref(trans, &ref);
-			BUG_ON(ret);
-
-			ret = btrfs_drop_subtree(trans, root, eb, upper->eb);
-			BUG_ON(ret);
+			if (ret) {
+				btrfs_abort_transaction(trans, ret);
+				goto next;
+			}
+			btrfs_drop_subtree(trans, root, eb, upper->eb);
 		}
 next:
 		if (!upper->pending)
-- 
2.26.2



* [PATCH v3 42/54] btrfs: check for BTRFS_BLOCK_FLAG_FULL_BACKREF being set improperly
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (40 preceding siblings ...)
  2020-12-02 19:50 ` [PATCH v3 41/54] btrfs: handle extent reference errors in do_relocation Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:19   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 43/54] btrfs: remove the extent item sanity checks in relocate_block_group Josef Bacik
                   ` (11 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We need to validate that a data extent item does not have the
FULL_BACKREF flag set in its flags.
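
A minimal userspace sketch of the rule (not part of this patch): a data
extent combined with the full-backref bit is rejected with -EUCLEAN,
while the same bit on a tree block is left alone.  The function name and
the flag macros are stand-ins for illustration; the real definitions
live in the btrfs headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in flag values for illustration only. */
#define EXTENT_FLAG_DATA        (1ULL << 0)
#define EXTENT_FLAG_TREE_BLOCK  (1ULL << 1)
#define BLOCK_FLAG_FULL_BACKREF (1ULL << 8)

/* EUCLEAN is Linux-specific; provide a fallback so the sketch builds elsewhere. */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif

/* Hypothetical helper: returns 0 if the flags are acceptable, -EUCLEAN if not. */
static int check_data_extent_flags(uint64_t flags)
{
	if ((flags & EXTENT_FLAG_DATA) && (flags & BLOCK_FLAG_FULL_BACKREF))
		return -EUCLEAN;
	return 0;
}

/* Small usage example covering the accepted and rejected combinations. */
static void flag_check_examples(void)
{
	assert(check_data_extent_flags(EXTENT_FLAG_DATA) == 0);
	assert(check_data_extent_flags(EXTENT_FLAG_DATA |
				       BLOCK_FLAG_FULL_BACKREF) == -EUCLEAN);
	assert(check_data_extent_flags(EXTENT_FLAG_TREE_BLOCK |
				       BLOCK_FLAG_FULL_BACKREF) == 0);
}
```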

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/tree-checker.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 028e733e42f3..39714aeb9b36 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -1283,6 +1283,11 @@ static int check_extent_item(struct extent_buffer *leaf,
 				   key->offset, fs_info->sectorsize);
 			return -EUCLEAN;
 		}
+		if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
+			extent_err(leaf, slot,
+			"invalid extent flag, data has full backref set");
+			return -EUCLEAN;
+		}
 	}
 	ptr = (unsigned long)(struct btrfs_extent_item *)(ei + 1);
 
-- 
2.26.2



* [PATCH v3 43/54] btrfs: remove the extent item sanity checks in relocate_block_group
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (41 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 42/54] btrfs: check for BTRFS_BLOCK_FLAG_FULL_BACKREF being set improperly Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:20   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 44/54] btrfs: do proper error handling in create_reloc_inode Josef Bacik
                   ` (10 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

These checks are all taken care of for us by the tree checker code.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 29 +----------------------------
 1 file changed, 1 insertion(+), 28 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 3159f6517588..8f4f1e21c770 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3370,20 +3370,6 @@ static void unset_reloc_control(struct reloc_control *rc)
 	mutex_unlock(&fs_info->reloc_mutex);
 }
 
-static int check_extent_flags(u64 flags)
-{
-	if ((flags & BTRFS_EXTENT_FLAG_DATA) &&
-	    (flags & BTRFS_EXTENT_FLAG_TREE_BLOCK))
-		return 1;
-	if (!(flags & BTRFS_EXTENT_FLAG_DATA) &&
-	    !(flags & BTRFS_EXTENT_FLAG_TREE_BLOCK))
-		return 1;
-	if ((flags & BTRFS_EXTENT_FLAG_DATA) &&
-	    (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF))
-		return 1;
-	return 0;
-}
-
 static noinline_for_stack
 int prepare_to_relocate(struct reloc_control *rc)
 {
@@ -3435,7 +3421,6 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
 	struct btrfs_path *path;
 	struct btrfs_extent_item *ei;
 	u64 flags;
-	u32 item_size;
 	int ret;
 	int err = 0;
 	int progress = 0;
@@ -3484,19 +3469,7 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
 
 		ei = btrfs_item_ptr(path->nodes[0], path->slots[0],
 				    struct btrfs_extent_item);
-		item_size = btrfs_item_size_nr(path->nodes[0], path->slots[0]);
-		if (item_size >= sizeof(*ei)) {
-			flags = btrfs_extent_flags(path->nodes[0], ei);
-			ret = check_extent_flags(flags);
-			BUG_ON(ret);
-		} else if (unlikely(item_size == sizeof(struct btrfs_extent_item_v0))) {
-			err = -EINVAL;
-			btrfs_print_v0_err(trans->fs_info);
-			btrfs_abort_transaction(trans, err);
-			break;
-		} else {
-			BUG();
-		}
+		flags = btrfs_extent_flags(path->nodes[0], ei);
 
 		if (flags & BTRFS_EXTENT_FLAG_TREE_BLOCK) {
 			ret = add_tree_block(rc, &key, path, &blocks);
-- 
2.26.2



* [PATCH v3 44/54] btrfs: do proper error handling in create_reloc_inode
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (42 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 43/54] btrfs: remove the extent item sanity checks in relocate_block_group Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:25   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 45/54] btrfs: handle __add_reloc_root failure in btrfs_recover_relocation Josef Bacik
                   ` (9 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We already handle some errors in this function, and the callers do the
correct error handling, so clean up the rest of the function to do the
appropriate error handling.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 8f4f1e21c770..bcced4e436af 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3634,10 +3634,15 @@ struct inode *create_reloc_inode(struct btrfs_fs_info *fs_info,
 		goto out;
 
 	err = __insert_orphan_inode(trans, root, objectid);
-	BUG_ON(err);
+	if (err)
+		goto out;
 
 	inode = btrfs_iget(fs_info->sb, objectid, root);
-	BUG_ON(IS_ERR(inode));
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		inode = NULL;
+		goto out;
+	}
 	BTRFS_I(inode)->index_cnt = group->start;
 
 	err = btrfs_orphan_add(trans, BTRFS_I(inode));
-- 
2.26.2



* [PATCH v3 45/54] btrfs: handle __add_reloc_root failure in btrfs_recover_relocation
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (43 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 44/54] btrfs: do proper error handling in create_reloc_inode Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:32   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 46/54] btrfs: handle __add_reloc_root failure in btrfs_reloc_post_snapshot Josef Bacik
                   ` (8 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We can already handle errors appropriately from this function, so deal
with an error coming from __add_reloc_root appropriately.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index bcced4e436af..6315e74c1da0 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3984,7 +3984,12 @@ int btrfs_recover_relocation(struct btrfs_root *root)
 		}
 
 		err = __add_reloc_root(reloc_root);
-		BUG_ON(err < 0); /* -ENOMEM or logic error */
+		if (err) {
+			list_add_tail(&reloc_root->root_list, &reloc_roots);
+			btrfs_put_root(fs_root);
+			btrfs_end_transaction(trans);
+			goto out_unset;
+		}
 		fs_root->reloc_root = btrfs_grab_root(reloc_root);
 		btrfs_put_root(fs_root);
 	}
-- 
2.26.2



* [PATCH v3 46/54] btrfs: handle __add_reloc_root failure in btrfs_reloc_post_snapshot
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (44 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 45/54] btrfs: handle __add_reloc_root failure in btrfs_recover_relocation Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:34   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 47/54] btrfs: cleanup error handling in prepare_to_merge Josef Bacik
                   ` (7 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we fail to add the reloc root, drop it and return the error.  All
callers of this function already handle errors appropriately.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 6315e74c1da0..695a52cd07b0 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4204,7 +4204,10 @@ int btrfs_reloc_post_snapshot(struct btrfs_trans_handle *trans,
 		return PTR_ERR(reloc_root);
 
 	ret = __add_reloc_root(reloc_root);
-	BUG_ON(ret < 0);
+	if (ret) {
+		btrfs_put_root(reloc_root);
+		return ret;
+	}
 	new_root->reloc_root = btrfs_grab_root(reloc_root);
 
 	if (rc->create_reloc_tree)
-- 
2.26.2



* [PATCH v3 47/54] btrfs: cleanup error handling in prepare_to_merge
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (45 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 46/54] btrfs: handle __add_reloc_root failure in btrfs_reloc_post_snapshot Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:39   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 48/54] btrfs: handle extent corruption with select_one_root properly Josef Bacik
                   ` (6 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

A btrfs_get_fs_root() failure here probably can't happen even with a
corrupt file system, because we would have failed much earlier.  However
there's no reason we can't just check and bail out as appropriate, so do
that and convert the correctness BUG_ON() to an ASSERT().

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 695a52cd07b0..d4656a8f507d 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1870,8 +1870,14 @@ int prepare_to_merge(struct reloc_control *rc, int err)
 
 		root = btrfs_get_fs_root(fs_info, reloc_root->root_key.offset,
 				false);
-		BUG_ON(IS_ERR(root));
-		BUG_ON(root->reloc_root != reloc_root);
+		if (IS_ERR(root)) {
+			list_add(&reloc_root->root_list, &reloc_roots);
+			btrfs_abort_transaction(trans, (int)PTR_ERR(root));
+			if (!err)
+				err = PTR_ERR(root);
+			break;
+		}
+		ASSERT(root->reloc_root == reloc_root);
 
 		/*
 		 * set reference count to 1, so btrfs_recover_relocation
-- 
2.26.2



* [PATCH v3 48/54] btrfs: handle extent corruption with select_one_root properly
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (46 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 47/54] btrfs: cleanup error handling in prepare_to_merge Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:40   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 49/54] btrfs: do proper error handling in merge_reloc_roots Josef Bacik
                   ` (5 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

In corruption cases we could have paths from a block up to no root at
all, and thus we'll BUG_ON(!root) in select_one_root.  Handle this by
adding an ASSERT() for developers, and returning an error for normal
users.
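
The caller side of this change (the relocate_tree_block() hunk in this
patch) can be sketched in userspace roughly as follows: -ENOENT is
treated as benign and marks the block processed, any other encoded error
is propagated.  ERR_PTR(), PTR_ERR() and IS_ERR() are reimplemented here
as simplified stand-ins for the kernel helpers, and
handle_select_result() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* fallback for non-Linux builds */
#endif

#define MAX_ERRNO 4095

/* Simplified stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * Hypothetical helper mirroring the caller logic: returns 0 for a valid
 * root or for -ENOENT (setting *marked_processed), else the error code.
 */
static int handle_select_result(void *root, int *marked_processed)
{
	int ret = 0;

	if (IS_ERR(root)) {
		ret = (int)PTR_ERR(root);
		if (ret == -ENOENT) {
			ret = 0;
			*marked_processed = 1;
		}
	}
	return ret;
}

/* Small usage example for the three cases. */
static void select_result_examples(void)
{
	int dummy_root = 0;
	int marked = 0;

	assert(handle_select_result(&dummy_root, &marked) == 0 && marked == 0);
	assert(handle_select_result(ERR_PTR(-ENOENT), &marked) == 0 && marked == 1);
	marked = 0;
	assert(handle_select_result(ERR_PTR(-EUCLEAN), &marked) == -EUCLEAN);
	assert(marked == 0);
}
```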

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index d4656a8f507d..91479979d2a7 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2200,7 +2200,16 @@ struct btrfs_root *select_one_root(struct btrfs_backref_node *node)
 		cond_resched();
 		next = walk_up_backref(next, edges, &index);
 		root = next->root;
-		BUG_ON(!root);
+
+		/*
+		 * This can occur if we have incomplete extent refs leading all
+		 * the way up a particular path, in this case return -EUCLEAN.
+		 * However leave as an ASSERT() for developers, because it could
+		 * indicate a bug in the backref code.
+		 */
+		ASSERT(root);
+		if (!root)
+			return ERR_PTR(-EUCLEAN);
 
 		/* No other choice for non-shareable tree */
 		if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
@@ -2598,8 +2607,12 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
 
 	BUG_ON(node->processed);
 	root = select_one_root(node);
-	if (root == ERR_PTR(-ENOENT)) {
-		update_processed_blocks(rc, node);
+	if (IS_ERR(root)) {
+		ret = PTR_ERR(root);
+		if (ret == -ENOENT) {
+			ret = 0;
+			update_processed_blocks(rc, node);
+		}
 		goto out;
 	}
 
-- 
2.26.2



* [PATCH v3 49/54] btrfs: do proper error handling in merge_reloc_roots
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (47 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 48/54] btrfs: handle extent corruption with select_one_root properly Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:42   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 50/54] btrfs: check return value of btrfs_commit_transaction in relocation Josef Bacik
                   ` (4 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We have a BUG_ON() if we get an error back from btrfs_get_fs_root().
This honestly should never fail, as at this point we have a solid
coordination of fs root to reloc root, and these roots will all be in
memory.  But in the name of killing BUG_ON()'s remove this one and
handle the error properly.  Change the remaining BUG_ON() to an
ASSERT().

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 91479979d2a7..099a64b47020 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1949,9 +1949,18 @@ void merge_reloc_roots(struct reloc_control *rc)
 
 		root = btrfs_get_fs_root(fs_info, reloc_root->root_key.offset,
 					 false);
+		if (IS_ERR(root)) {
+			/*
+			 * This likely won't happen, since we would have failed
+			 * at a higher level.  However for correctness sake
+			 * handle the error anyway.
+			 */
+			ret = PTR_ERR(root);
+			goto out;
+		}
+
 		if (btrfs_root_refs(&reloc_root->root_item) > 0) {
-			BUG_ON(IS_ERR(root));
-			BUG_ON(root->reloc_root != reloc_root);
+			ASSERT(root->reloc_root == reloc_root);
 			ret = merge_reloc_root(rc, root);
 			btrfs_put_root(root);
 			if (ret) {
-- 
2.26.2



* [PATCH v3 50/54] btrfs: check return value of btrfs_commit_transaction in relocation
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (48 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 49/54] btrfs: do proper error handling in merge_reloc_roots Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:42   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 51/54] btrfs: do not WARN_ON() if we can't find the reloc root Josef Bacik
                   ` (3 subsequent siblings)
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

There are a few places where we don't check the return value of
btrfs_commit_transaction in relocation.c.  Thankfully all these places
have straightforward error handling, so simply change all of the sites
at once.
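
The relocate_block_group() hunk keeps the first error seen and only
records the commit failure if nothing failed earlier.  A minimal
userspace sketch of that pattern, with a hypothetical helper name:

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: fold the commit result into the running error
 * without overwriting an earlier failure.
 */
static int fold_commit_result(int commit_ret, int err)
{
	if (commit_ret && !err)
		err = commit_ret;
	return err;
}

/* Small usage example for the four combinations. */
static void fold_examples(void)
{
	assert(fold_commit_result(0, 0) == 0);
	assert(fold_commit_result(-EIO, 0) == -EIO);
	assert(fold_commit_result(-EIO, -ENOMEM) == -ENOMEM);
	assert(fold_commit_result(0, -ENOMEM) == -ENOMEM);
}
```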

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 099a64b47020..15b6e54394b7 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1905,7 +1905,7 @@ int prepare_to_merge(struct reloc_control *rc, int err)
 	list_splice(&reloc_roots, &rc->reloc_roots);
 
 	if (!err)
-		btrfs_commit_transaction(trans);
+		err = btrfs_commit_transaction(trans);
 	else
 		btrfs_end_transaction(trans);
 	return err;
@@ -3436,8 +3436,7 @@ int prepare_to_relocate(struct reloc_control *rc)
 		 */
 		return PTR_ERR(trans);
 	}
-	btrfs_commit_transaction(trans);
-	return 0;
+	return btrfs_commit_transaction(trans);
 }
 
 static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
@@ -3596,7 +3595,9 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
 		err = PTR_ERR(trans);
 		goto out_free;
 	}
-	btrfs_commit_transaction(trans);
+	ret = btrfs_commit_transaction(trans);
+	if (ret && !err)
+		err = ret;
 out_free:
 	ret = clean_dirty_subvols(rc);
 	if (ret < 0 && !err)
-- 
2.26.2



* [PATCH v3 51/54] btrfs: do not WARN_ON() if we can't find the reloc root
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (49 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 50/54] btrfs: check return value of btrfs_commit_transaction in relocation Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-02 19:51 ` [PATCH v3 52/54] btrfs: print the actual offset in btrfs_root_name Josef Bacik
                   ` (2 subsequent siblings)
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Zygo Blaxell

Any number of things could have gone wrong, like ENOMEM or EIO, so don't
WARN_ON() if we're unable to find the reloc root in the backref code.

Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/backref.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 56f7c840031e..525815d2914b 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -2617,7 +2617,7 @@ static int handle_direct_tree_backref(struct btrfs_backref_cache *cache,
 		/* Only reloc backref cache cares about a specific root */
 		if (cache->is_reloc) {
 			root = find_reloc_root(cache->fs_info, cur->bytenr);
-			if (WARN_ON(!root))
+			if (!root)
 				return -ENOENT;
 			cur->root = root;
 		} else {
-- 
2.26.2



* [PATCH v3 52/54] btrfs: print the actual offset in btrfs_root_name
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (50 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 51/54] btrfs: do not WARN_ON() if we can't find the reloc root Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:44   ` Qu Wenruo
  2020-12-02 19:51 ` [PATCH v3 53/54] btrfs: fix reloc root leak with 0 ref reloc roots on recovery Josef Bacik
  2020-12-02 19:51 ` [PATCH v3 54/54] btrfs: splice remaining dirty_bg's onto the transaction dirty bg list Josef Bacik
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We're supposed to print the root_key.offset in btrfs_root_name in the
case of a reloc root, not the objectid.  Fix this helper to take the key
so we have access to the offset when we need it.
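
A simplified userspace sketch of the fixed helper (not the kernel code;
it omits the root_map lookup): for a reloc root the offset is printed,
otherwise the objectid.  The struct, the function name, and the macro
values are stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in values for illustration only. */
#define ROOT_NAME_BUF_LEN   48
#define TREE_RELOC_OBJECTID ((uint64_t)-8)

/* Stand-in for the parts of struct btrfs_key used here. */
struct key {
	uint64_t objectid;
	uint64_t offset;
};

/* Hypothetical helper: format a root name from the whole key. */
static const char *root_name(const struct key *key, char *buf)
{
	if (key->objectid == TREE_RELOC_OBJECTID) {
		/* Reloc roots are identified by the offset, not the objectid. */
		snprintf(buf, ROOT_NAME_BUF_LEN, "TREE_RELOC offset=%llu",
			 (unsigned long long)key->offset);
		return buf;
	}
	snprintf(buf, ROOT_NAME_BUF_LEN, "%llu",
		 (unsigned long long)key->objectid);
	return buf;
}

/* Small usage example for both branches. */
static void root_name_examples(void)
{
	char buf[ROOT_NAME_BUF_LEN];
	struct key reloc = { TREE_RELOC_OBJECTID, 257 };
	struct key plain = { 5, 0 };

	assert(strcmp(root_name(&reloc, buf), "TREE_RELOC offset=257") == 0);
	assert(strcmp(root_name(&plain, buf), "5") == 0);
}
```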

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/disk-io.c    |  2 +-
 fs/btrfs/print-tree.c | 10 +++++-----
 fs/btrfs/print-tree.h |  2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 46dd9e0b077e..c73d172aa1f7 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1458,7 +1458,7 @@ void btrfs_check_leaked_roots(struct btrfs_fs_info *fs_info)
 		root = list_first_entry(&fs_info->allocated_roots,
 					struct btrfs_root, leak_list);
 		btrfs_err(fs_info, "leaked root %s refcount %d",
-			  btrfs_root_name(root->root_key.objectid, buf),
+			  btrfs_root_name(&root->root_key, buf),
 			  refcount_read(&root->refs));
 		while (refcount_read(&root->refs) > 1)
 			btrfs_put_root(root);
diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c
index fe5e0026129d..b8137dbf6a3a 100644
--- a/fs/btrfs/print-tree.c
+++ b/fs/btrfs/print-tree.c
@@ -26,22 +26,22 @@ static const struct root_name_map root_map[] = {
 	{ BTRFS_DATA_RELOC_TREE_OBJECTID,	"DATA_RELOC_TREE"	},
 };
 
-const char *btrfs_root_name(u64 objectid, char *buf)
+const char *btrfs_root_name(struct btrfs_key *key, char *buf)
 {
 	int i;
 
-	if (objectid == BTRFS_TREE_RELOC_OBJECTID) {
+	if (key->objectid == BTRFS_TREE_RELOC_OBJECTID) {
 		snprintf(buf, BTRFS_ROOT_NAME_BUF_LEN,
-			 "TREE_RELOC offset=%llu", objectid);
+			 "TREE_RELOC offset=%llu", key->offset);
 		return buf;
 	}
 
 	for (i = 0; i < ARRAY_SIZE(root_map); i++) {
-		if (root_map[i].id == objectid)
+		if (root_map[i].id == key->objectid)
 			return root_map[i].name;
 	}
 
-	snprintf(buf, BTRFS_ROOT_NAME_BUF_LEN, "%llu", objectid);
+	snprintf(buf, BTRFS_ROOT_NAME_BUF_LEN, "%llu", key->objectid);
 	return buf;
 }
 
diff --git a/fs/btrfs/print-tree.h b/fs/btrfs/print-tree.h
index 78b99385a503..802628dd1a6e 100644
--- a/fs/btrfs/print-tree.h
+++ b/fs/btrfs/print-tree.h
@@ -11,6 +11,6 @@
 
 void btrfs_print_leaf(struct extent_buffer *l);
 void btrfs_print_tree(struct extent_buffer *c, bool follow);
-const char *btrfs_root_name(u64 objectid, char *buf);
+const char *btrfs_root_name(struct btrfs_key *key, char *buf);
 
 #endif
-- 
2.26.2



* [PATCH v3 53/54] btrfs: fix reloc root leak with 0 ref reloc roots on recovery
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (51 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 52/54] btrfs: print the actual offset in btrfs_root_name Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-02 19:51 ` [PATCH v3 54/54] btrfs: splice remaining dirty_bg's onto the transaction dirty bg list Josef Bacik
  53 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When recovering a relocation, if we run into a reloc root that has 0
refs we simply add it to the reloc_control->reloc_roots list, and then
clean it up later.  The problem with this is that __del_reloc_root()
doesn't do anything if the root isn't in the rb tree, which in this case
it won't be because we never call __add_reloc_root() on the reloc_root.

This early exit simply isn't correct.  During normal
operation we can remove ourselves from the rb tree and then we're meant
to clean up later at merge_reloc_roots() time, and this happens
correctly.  During recovery we're depending on free_reloc_roots() to
drop our references, but we're short-circuiting.

Fix this by continuing to check if we're on the list and dropping
ourselves from the reloc_control root list and dropping our reference
appropriately.  Change the corresponding BUG_ON() to an ASSERT() that
does the correct thing if we aren't in the rb tree.
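
As a rough user-space illustration of the short-circuit described above
(not the kernel code: struct toy_root and the toy_* helpers are made up
for this sketch), the old and new shapes of the function differ only in
whether the list cleanup still runs when the tree lookup finds nothing:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reloc root; not the kernel structure. */
struct toy_root {
	bool in_tree;	/* linked into the lookup tree */
	bool on_list;	/* linked into the control structure's root list */
	int refs;	/* references held, one of them on behalf of the list */
};

/* Shared tail: drop the list entry and the reference that came with it. */
void toy_drop_from_list(struct toy_root *root)
{
	if (!root->on_list)
		return;
	assert(root->refs > 0);
	root->on_list = false;
	root->refs--;
}

/* Old shape: return early when the root is not in the tree. */
void toy_del_root_old(struct toy_root *root)
{
	if (!root->in_tree)
		return;	/* short-circuit: list entry and reference leak */
	root->in_tree = false;
	toy_drop_from_list(root);
}

/* New shape: tree removal is optional, list cleanup always runs. */
void toy_del_root_new(struct toy_root *root)
{
	root->in_tree = false;
	toy_drop_from_list(root);
}
```

With a root that is on the list but was never added to the tree, the old
shape leaves on_list set and refs untouched, while the new shape clears
both.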

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 15b6e54394b7..a49a422f2f9b 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -671,9 +671,7 @@ static void __del_reloc_root(struct btrfs_root *root)
 			RB_CLEAR_NODE(&node->rb_node);
 		}
 		spin_unlock(&rc->reloc_root_tree.lock);
-		if (!node)
-			return;
-		BUG_ON((struct btrfs_root *)node->data != root);
+		ASSERT(!node || (struct btrfs_root *)node->data == root);
 	}
 
 	/*
-- 
2.26.2



* [PATCH v3 54/54] btrfs: splice remaining dirty_bg's onto the transaction dirty bg list
  2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
                   ` (52 preceding siblings ...)
  2020-12-02 19:51 ` [PATCH v3 53/54] btrfs: fix reloc root leak with 0 ref reloc roots on recovery Josef Bacik
@ 2020-12-02 19:51 ` Josef Bacik
  2020-12-03  5:47   ` Qu Wenruo
  53 siblings, 1 reply; 114+ messages in thread
From: Josef Bacik @ 2020-12-02 19:51 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While doing error injection testing with my relocation patches I hit the
following ASSERT()

assertion failed: list_empty(&block_group->dirty_list), in fs/btrfs/block-group.c:3356
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.h:3357!
invalid opcode: 0000 [#1] SMP NOPTI
CPU: 0 PID: 24351 Comm: umount Tainted: G        W         5.10.0-rc3+ #193
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
RIP: 0010:assertfail.constprop.0+0x18/0x1a
RSP: 0018:ffffa09b019c7e00 EFLAGS: 00010282
RAX: 0000000000000056 RBX: ffff8f6492c18000 RCX: 0000000000000000
RDX: ffff8f64fbc27c60 RSI: ffff8f64fbc19050 RDI: ffff8f64fbc19050
RBP: ffff8f6483bbdc00 R08: 0000000000000000 R09: 0000000000000000
R10: ffffa09b019c7c38 R11: ffffffff85d70928 R12: ffff8f6492c18100
R13: ffff8f6492c18148 R14: ffff8f6483bbdd70 R15: dead000000000100
FS:  00007fbfda4cdc40(0000) GS:ffff8f64fbc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbfda666fd0 CR3: 000000013cf66002 CR4: 0000000000370ef0
Call Trace:
 btrfs_free_block_groups.cold+0x55/0x55
 close_ctree+0x2c5/0x306
 ? fsnotify_destroy_marks+0x14/0x100
 generic_shutdown_super+0x6c/0x100
 kill_anon_super+0x14/0x30
 btrfs_kill_super+0x12/0x20
 deactivate_locked_super+0x36/0xa0
 cleanup_mnt+0x12d/0x190
 task_work_run+0x5c/0xa0
 exit_to_user_mode_prepare+0x1b1/0x1d0
 syscall_exit_to_user_mode+0x54/0x280
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

This happened because I injected an error in btrfs_cow_block() while
running the dirty block groups.  When we run the dirty block groups, we
splice the list onto a local list to process.  However if an error
occurs, we only clean up the transaction's dirty block group list, not any
pending block groups we have on our locally spliced list.  Fix this by
splicing the list back onto the transaction's dirty block group list, so
any remaining block groups are cleaned up.
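
A minimal user-space sketch of the splice-back idea, assuming a toy
circular list that only loosely mimics the kernel's list_head (all toy_*
names are invented here, and 'scratch' stands in for the on-stack list):

```c
#include <assert.h>
#include <stddef.h>

struct toy_list {
	struct toy_list *next, *prev;
};

void toy_list_init(struct toy_list *head)
{
	head->next = head;
	head->prev = head;
}

int toy_list_empty(const struct toy_list *head)
{
	return head->next == head;
}

void toy_list_add_tail(struct toy_list *entry, struct toy_list *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

void toy_list_del_init(struct toy_list *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	toy_list_init(entry);
}

/* Move every entry of 'from' to the front of 'to', leaving 'from' empty. */
void toy_list_splice_init(struct toy_list *from, struct toy_list *to)
{
	struct toy_list *first = from->next;
	struct toy_list *last = from->prev;

	if (toy_list_empty(from))
		return;
	first->prev = to;
	last->next = to->next;
	to->next->prev = last;
	to->next = first;
	toy_list_init(from);
}

size_t toy_list_count(const struct toy_list *head)
{
	const struct toy_list *pos;
	size_t n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}

/*
 * Process 'shared' via a private list.  Processing fails at index
 * 'fail_at' (negative means never).  'put_back' selects whether the
 * unprocessed entries are returned to 'shared' on failure.
 */
int toy_run_dirty(struct toy_list *shared, struct toy_list *scratch,
		  int fail_at, int put_back)
{
	int i = 0;

	assert(shared != scratch);
	toy_list_init(scratch);
	toy_list_splice_init(shared, scratch);

	while (!toy_list_empty(scratch)) {
		if (i == fail_at) {
			if (put_back)
				toy_list_splice_init(scratch, shared);
			return -1;
		}
		toy_list_del_init(scratch->next);
		i++;
	}
	return 0;
}
```

Without put_back, a cleanup pass that only walks 'shared' sees an empty
list while the unprocessed entries are still linked to 'scratch', which
is the situation the ASSERT() above complains about.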

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-group.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 0886e81e5540..5cfa52b1a3b8 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2685,6 +2685,9 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans)
 		}
 		spin_unlock(&cur_trans->dirty_bgs_lock);
 	} else if (ret < 0) {
+		spin_lock(&cur_trans->dirty_bgs_lock);
+		list_splice_init(&dirty, &cur_trans->dirty_bgs);
+		spin_unlock(&cur_trans->dirty_bgs_lock);
 		btrfs_cleanup_dirty_bgs(cur_trans, fs_info);
 	}
 
-- 
2.26.2



* Re: [PATCH v3 01/54] btrfs: fix error handling in commit_fs_roots
  2020-12-02 19:50 ` [PATCH v3 01/54] btrfs: fix error handling in commit_fs_roots Josef Bacik
@ 2020-12-03  1:45   ` Qu Wenruo
  2020-12-03  8:09   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  1:45 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team





On 2020/12/3 3:50 AM, Josef Bacik wrote:
> While doing error injection I would sometimes get a corrupt file system.
> This is because I was injecting errors at btrfs_search_slot, but would
> only do it one time per stack.  This uncovered a problem in
> commit_fs_roots, where if we get an error we would just break.  However
> we're in a nested loop, the outer loop being the one that finds all the
> dirty fs roots, so the break only leaves the inner loop and subsequent
> root updates would succeed, clearing the error value.
> 
> This isn't likely to happen in real scenarios, however we could
> potentially get a random ENOMEM once and then not again, and we'd end up
> with a corrupted file system.  Fix this by moving the error checking
> around a bit to the nested loop, as this is the only place where
> something will fail, and return the error as soon as it occurs.
> 
> With this patch my reproducer no longer corrupts the file system.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Yep, that err can be overwritten by the next loop iteration, so this is
definitely a problem.
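
To make the overwrite concrete, here is a small user-space sketch (the
toy_* functions and the batching scheme are assumptions made for
illustration, not the kernel's gang lookup): a break only leaves the
inner loop, so a later successful update resets err.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-item update: fails only for 'bad_item'. */
int toy_update(int item, int bad_item)
{
	return item == bad_item ? -ENOMEM : 0;
}

/* Buggy shape: err lives across batches and is overwritten by later successes. */
int toy_commit_buggy(const int *items, size_t count, size_t batch,
		     int bad_item)
{
	size_t start, i;
	int err = 0;

	assert(items != NULL);
	if (batch == 0)
		return -EINVAL;

	for (start = 0; start < count; start += batch) {
		for (i = start; i < count && i < start + batch; i++) {
			err = toy_update(items[i], bad_item);
			if (err)
				break;	/* leaves only the inner loop */
		}
	}
	return err;
}

/* Fixed shape: return the error as soon as it occurs. */
int toy_commit_fixed(const int *items, size_t count, size_t batch,
		     int bad_item)
{
	size_t start, i;

	assert(items != NULL);
	if (batch == 0)
		return -EINVAL;

	for (start = 0; start < count; start += batch) {
		for (i = start; i < count && i < start + batch; i++) {
			int err = toy_update(items[i], bad_item);

			if (err)
				return err;
		}
	}
	return 0;
}
```

With four items in batches of two and a failure on the first item, the
buggy shape reports success because the second batch succeeds; the error
only survives when the failing item happens to be in the last batch.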

Thanks,
Qu
> ---
>  fs/btrfs/transaction.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 8e0f7a1029c6..a614f7699ce4 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1319,7 +1319,6 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  	struct btrfs_root *gang[8];
>  	int i;
>  	int ret;
> -	int err = 0;
>  
>  	spin_lock(&fs_info->fs_roots_radix_lock);
>  	while (1) {
> @@ -1331,6 +1330,8 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  			break;
>  		for (i = 0; i < ret; i++) {
>  			struct btrfs_root *root = gang[i];
> +			int err;
> +
>  			radix_tree_tag_clear(&fs_info->fs_roots_radix,
>  					(unsigned long)root->root_key.objectid,
>  					BTRFS_ROOT_TRANS_TAG);
> @@ -1353,14 +1354,14 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  			err = btrfs_update_root(trans, fs_info->tree_root,
>  						&root->root_key,
>  						&root->root_item);
> -			spin_lock(&fs_info->fs_roots_radix_lock);
>  			if (err)
> -				break;
> +				return err;
> +			spin_lock(&fs_info->fs_roots_radix_lock);
>  			btrfs_qgroup_free_meta_all_pertrans(root);
>  		}
>  	}
>  	spin_unlock(&fs_info->fs_roots_radix_lock);
> -	return err;
> +	return 0;
>  }
>  
>  /*
> 



* Re: [PATCH v3 02/54] btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block
  2020-12-02 19:50 ` [PATCH v3 02/54] btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block Josef Bacik
@ 2020-12-03  1:48   ` Qu Wenruo
  2020-12-03  8:21   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  1:48 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team





On 2020/12/3 3:50 AM, Josef Bacik wrote:
> The following patches are going to address error handling in relocation,
> in order to test those patches I need to be able to inject errors in
> btrfs_search_slot and btrfs_cow_block, as we call both of these pretty
> often in different cases during relocation.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/ctree.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index e5a0941c4bde..f40d3a2590a5 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -1494,6 +1494,7 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
>  
>  	return ret;
>  }
> +ALLOW_ERROR_INJECTION(btrfs_cow_block, ERRNO);
>  
>  /*
>   * helper function for defrag to decide if two blocks pointed to by a
> @@ -2800,6 +2801,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root,
>  		btrfs_release_path(p);
>  	return ret;
>  }
> +ALLOW_ERROR_INJECTION(btrfs_search_slot, ERRNO);
>  
>  /*
>   * Like btrfs_search_slot, this looks for a key in the given tree. It uses the
> 



* Re: [PATCH v3 03/54] btrfs: fix lockdep splat in btrfs_recover_relocation
  2020-12-02 19:50 ` [PATCH v3 03/54] btrfs: fix lockdep splat in btrfs_recover_relocation Josef Bacik
@ 2020-12-03  1:49   ` Qu Wenruo
  2020-12-03  8:44   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  1:49 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team





On 2020/12/3 3:50 AM, Josef Bacik wrote:
> While testing the error paths of relocation I hit the following lockdep
> splat
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.10.0-rc6+ #217 Not tainted
> ------------------------------------------------------
> mount/779 is trying to acquire lock:
> ffffa0e676945418 (&fs_info->balance_mutex){+.+.}-{3:3}, at: btrfs_recover_balance+0x2f0/0x340
> 
> but task is already holding lock:
> ffffa0e60ee31da8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x100
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #2 (btrfs-root-00){++++}-{3:3}:
>        down_read_nested+0x43/0x130
>        __btrfs_tree_read_lock+0x27/0x100
>        btrfs_read_lock_root_node+0x31/0x40
>        btrfs_search_slot+0x462/0x8f0
>        btrfs_update_root+0x55/0x2b0
>        btrfs_drop_snapshot+0x398/0x750
>        clean_dirty_subvols+0xdf/0x120
>        btrfs_recover_relocation+0x534/0x5a0
>        btrfs_start_pre_rw_mount+0xcb/0x170
>        open_ctree+0x151f/0x1726
>        btrfs_mount_root.cold+0x12/0xea
>        legacy_get_tree+0x30/0x50
>        vfs_get_tree+0x28/0xc0
>        vfs_kern_mount.part.0+0x71/0xb0
>        btrfs_mount+0x10d/0x380
>        legacy_get_tree+0x30/0x50
>        vfs_get_tree+0x28/0xc0
>        path_mount+0x433/0xc10
>        __x64_sys_mount+0xe3/0x120
>        do_syscall_64+0x33/0x40
>        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> -> #1 (sb_internal#2){.+.+}-{0:0}:
>        start_transaction+0x444/0x700
>        insert_balance_item.isra.0+0x37/0x320
>        btrfs_balance+0x354/0xf40
>        btrfs_ioctl_balance+0x2cf/0x380
>        __x64_sys_ioctl+0x83/0xb0
>        do_syscall_64+0x33/0x40
>        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> -> #0 (&fs_info->balance_mutex){+.+.}-{3:3}:
>        __lock_acquire+0x1120/0x1e10
>        lock_acquire+0x116/0x370
>        __mutex_lock+0x7e/0x7b0
>        btrfs_recover_balance+0x2f0/0x340
>        open_ctree+0x1095/0x1726
>        btrfs_mount_root.cold+0x12/0xea
>        legacy_get_tree+0x30/0x50
>        vfs_get_tree+0x28/0xc0
>        vfs_kern_mount.part.0+0x71/0xb0
>        btrfs_mount+0x10d/0x380
>        legacy_get_tree+0x30/0x50
>        vfs_get_tree+0x28/0xc0
>        path_mount+0x433/0xc10
>        __x64_sys_mount+0xe3/0x120
>        do_syscall_64+0x33/0x40
>        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> other info that might help us debug this:
> 
> Chain exists of:
>   &fs_info->balance_mutex --> sb_internal#2 --> btrfs-root-00
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(btrfs-root-00);
>                                lock(sb_internal#2);
>                                lock(btrfs-root-00);
>   lock(&fs_info->balance_mutex);
> 
>  *** DEADLOCK ***
> 
> 2 locks held by mount/779:
>  #0: ffffa0e60dc040e0 (&type->s_umount_key#47/1){+.+.}-{3:3}, at: alloc_super+0xb5/0x380
>  #1: ffffa0e60ee31da8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x100
> 
> stack backtrace:
> CPU: 0 PID: 779 Comm: mount Not tainted 5.10.0-rc6+ #217
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
> Call Trace:
>  dump_stack+0x8b/0xb0
>  check_noncircular+0xcf/0xf0
>  ? trace_call_bpf+0x139/0x260
>  __lock_acquire+0x1120/0x1e10
>  lock_acquire+0x116/0x370
>  ? btrfs_recover_balance+0x2f0/0x340
>  __mutex_lock+0x7e/0x7b0
>  ? btrfs_recover_balance+0x2f0/0x340
>  ? btrfs_recover_balance+0x2f0/0x340
>  ? rcu_read_lock_sched_held+0x3f/0x80
>  ? kmem_cache_alloc_trace+0x2c4/0x2f0
>  ? btrfs_get_64+0x5e/0x100
>  btrfs_recover_balance+0x2f0/0x340
>  open_ctree+0x1095/0x1726
>  btrfs_mount_root.cold+0x12/0xea
>  ? rcu_read_lock_sched_held+0x3f/0x80
>  legacy_get_tree+0x30/0x50
>  vfs_get_tree+0x28/0xc0
>  vfs_kern_mount.part.0+0x71/0xb0
>  btrfs_mount+0x10d/0x380
>  ? __kmalloc_track_caller+0x2f2/0x320
>  legacy_get_tree+0x30/0x50
>  vfs_get_tree+0x28/0xc0
>  ? capable+0x3a/0x60
>  path_mount+0x433/0xc10
>  __x64_sys_mount+0xe3/0x120
>  do_syscall_64+0x33/0x40
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> This is thankfully straightforward to fix: simply release the path
> before we set up the reloc_ctl.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/volumes.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 7930e1c78c45..49ba941f0314 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -4318,6 +4318,8 @@ int btrfs_recover_balance(struct btrfs_fs_info *fs_info)
>  		btrfs_warn(fs_info,
>  	"balance: cannot set exclusive op status, resume manually");
>  
> +	btrfs_release_path(path);
> +
>  	mutex_lock(&fs_info->balance_mutex);
>  	BUG_ON(fs_info->balance_ctl);
>  	spin_lock(&fs_info->balance_lock);
> 



* Re: [PATCH v3 04/54] btrfs: keep track of the root owner for relocation reads
  2020-12-02 19:50 ` [PATCH v3 04/54] btrfs: keep track of the root owner for relocation reads Josef Bacik
@ 2020-12-03  2:04   ` Qu Wenruo
  2020-12-03 15:55     ` Josef Bacik
  0 siblings, 1 reply; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:04 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team





On 2020/12/3 3:50 AM, Josef Bacik wrote:
> While testing the error paths in relocation, I hit the following lockdep
> splat
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.10.0-rc3+ #206 Not tainted
> ------------------------------------------------------
> btrfs-balance/1571 is trying to acquire lock:
> ffff8cdbcc8f77d0 (&head_ref->mutex){+.+.}-{3:3}, at: btrfs_lookup_extent_info+0x156/0x3b0
> 
> but task is already holding lock:
> ffff8cdbc54adbf8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x27/0x100
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #2 (btrfs-tree-00){++++}-{3:3}:
>        down_write_nested+0x43/0x80
>        __btrfs_tree_lock+0x27/0x100
>        btrfs_search_slot+0x248/0x890
>        relocate_tree_blocks+0x490/0x650
>        relocate_block_group+0x1ba/0x5d0
>        kretprobe_trampoline+0x0/0x50
> 
> -> #1 (btrfs-csum-01){++++}-{3:3}:
>        down_read_nested+0x43/0x130
>        __btrfs_tree_read_lock+0x27/0x100
>        btrfs_read_lock_root_node+0x31/0x40
>        btrfs_search_slot+0x5ab/0x890
>        btrfs_del_csums+0x10b/0x3c0
>        __btrfs_free_extent+0x49d/0x8e0
>        __btrfs_run_delayed_refs+0x283/0x11f0
>        btrfs_run_delayed_refs+0x86/0x220
>        btrfs_start_dirty_block_groups+0x2ba/0x520
>        kretprobe_trampoline+0x0/0x50
> 
> -> #0 (&head_ref->mutex){+.+.}-{3:3}:
>        __lock_acquire+0x1167/0x2150
>        lock_acquire+0x116/0x3e0
>        __mutex_lock+0x7e/0x7b0
>        btrfs_lookup_extent_info+0x156/0x3b0
>        walk_down_proc+0x1c3/0x280
>        walk_down_tree+0x64/0xe0
>        btrfs_drop_subtree+0x182/0x260
>        do_relocation+0x52e/0x660
>        relocate_tree_blocks+0x2ae/0x650
>        relocate_block_group+0x1ba/0x5d0
>        kretprobe_trampoline+0x0/0x50
> 
> other info that might help us debug this:
> 
> Chain exists of:
>   &head_ref->mutex --> btrfs-csum-01 --> btrfs-tree-00
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(btrfs-tree-00);
>                                lock(btrfs-csum-01);
>                                lock(btrfs-tree-00);

I find it a little confusing that subvolume trees get the name "tree".

Maybe a separate patch renaming it to something like "fs" or "subv"
would be better?

[...]
> 
> As you can see this is bogus, we never take another tree's lock under
> the csum lock.  This happens because sometimes we have to read tree
> blocks from disk without knowing which root they belong to during
> relocation.  We defaulted to an owner of 0, which translates to an fs
> tree.  This is fine as all fs trees have the same class, but obviously
> isn't fine if the block belongs to a cow only tree.
> 
> Thankfully cow only trees only have their owner's root as a reference to
> them, and since we already look up the extent information during
> relocation, go ahead and check and see if this block might belong to a
> cow only tree, and if so save the owner in the struct tree_block.  This
> allows us to read_tree_block with the proper owner, which gets rid of
> this lockdep splat.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

The fix is OK, although I have some extra comments inline below.
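
For readers following along, the owner decision described in the commit
message can be modelled roughly as below.  This is only a sketch under
stated assumptions: struct toy_extent_item and the TOY_* names are
invented, and the real code parses the on-disk extent item instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the first inline ref of an extent item. */
enum toy_ref_type {
	TOY_NO_INLINE_REF,
	TOY_TREE_BLOCK_REF,
	TOY_SHARED_BLOCK_REF,
};

/* Hypothetical, already-decoded view of a tree block's extent item. */
struct toy_extent_item {
	uint64_t refs;			/* total reference count */
	bool full_backref;		/* FULL_BACKREF flag set */
	enum toy_ref_type first_type;	/* type of the first inline ref */
	uint64_t first_offset;		/* offset of the first inline ref */
};

/*
 * Return the owning root if the block has exactly one plain tree block
 * ref, otherwise 0, meaning "treat it like an fs tree block".
 */
uint64_t toy_guess_owner(const struct toy_extent_item *ei)
{
	assert(ei != NULL);
	if (ei->refs != 1 || ei->full_backref)
		return 0;
	if (ei->first_type != TOY_TREE_BLOCK_REF)
		return 0;
	return ei->first_offset;
}
```

Falling back to 0 keeps the previous behaviour for shared or multiply
referenced blocks, so only the single-owner case changes.
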
> ---
>  fs/btrfs/relocation.c | 47 ++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 19b7db8b2117..2b30e39e922a 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -98,6 +98,7 @@ struct tree_block {
>  		u64 bytenr;
>  	}; /* Use rb_simple_node for search/insert */
>  	struct btrfs_key key;
> +	u64 owner;
>  	unsigned int level:8;
>  	unsigned int key_ready:1;
>  };
> @@ -2393,8 +2394,8 @@ static int get_tree_block_key(struct btrfs_fs_info *fs_info,
>  {
>  	struct extent_buffer *eb;
>  
> -	eb = read_tree_block(fs_info, block->bytenr, 0, block->key.offset,
> -			     block->level, NULL);
> +	eb = read_tree_block(fs_info, block->bytenr, block->owner,
> +			     block->key.offset, block->level, NULL);
>  	if (IS_ERR(eb)) {
>  		return PTR_ERR(eb);
>  	} else if (!extent_buffer_uptodate(eb)) {
> @@ -2493,7 +2494,8 @@ int relocate_tree_blocks(struct btrfs_trans_handle *trans,
>  	/* Kick in readahead for tree blocks with missing keys */
>  	rbtree_postorder_for_each_entry_safe(block, next, blocks, rb_node) {
>  		if (!block->key_ready)
> -			btrfs_readahead_tree_block(fs_info, block->bytenr, 0, 0,
> +			btrfs_readahead_tree_block(fs_info, block->bytenr,
> +						   block->owner, 0,
>  						   block->level);
>  	}
>  
> @@ -2801,21 +2803,59 @@ static int add_tree_block(struct reloc_control *rc,
>  	u32 item_size;
>  	int level = -1;
>  	u64 generation;
> +	u64 owner = 0;
>  
>  	eb =  path->nodes[0];
>  	item_size = btrfs_item_size_nr(eb, path->slots[0]);
>  
>  	if (extent_key->type == BTRFS_METADATA_ITEM_KEY ||
>  	    item_size >= sizeof(*ei) + sizeof(*bi)) {
> +		unsigned long ptr = 0, end;

Do we really need that 'end' to iterate through the extent item?

For cow-only trees, we only COW them when doing the balance, which means
the metadata/extent item for them should contain only one inline ref and
can never have a keyed ref.

If the metadata/extent item has more than one inline ref, it cannot
belong to a cow-only tree.

Can't we use the extent item size as a quick check?

This also inspires me to add a tree-checker check for the extent item size.

Thanks,
Qu

>  		ei = btrfs_item_ptr(eb, path->slots[0],
>  				struct btrfs_extent_item);
> +		end = (unsigned long)ei + item_size;
>  		if (extent_key->type == BTRFS_EXTENT_ITEM_KEY) {
>  			bi = (struct btrfs_tree_block_info *)(ei + 1);
>  			level = btrfs_tree_block_level(eb, bi);
> +			ptr = (unsigned long)(bi + 1);
>  		} else {
>  			level = (int)extent_key->offset;
> +			ptr = (unsigned long)(ei + 1);
>  		}
>  		generation = btrfs_extent_generation(eb, ei);
> +
> +		/*
> +		 * We're reading random blocks without knowing their owner ahead
> +		 * of time.  This is ok most of the time, as all reloc roots and
> +		 * fs roots have the same lock type.  However normal trees do
> +		 * not, and the only way to know ahead of time is to read the
> +		 * inline ref offset.  We know it's an fs root if
> +		 *
> +		 * 1. There's more than one ref.
> +		 * 2. There's a SHARED_DATA_REF_KEY set.
> +		 * 3. FULL_BACKREF is set on the flags.
> +		 *
> +		 * Otherwise it's safe to assume that the ref offset == the
> +		 * owner of this block, so we can use that when calling
> +		 * read_tree_block.
> +		 */
> +		if (btrfs_extent_refs(eb, ei) == 1 &&
> +		    !(btrfs_extent_flags(eb, ei) &
> +		      BTRFS_BLOCK_FLAG_FULL_BACKREF) &&
> +		    ptr < end) {
> +			struct btrfs_extent_inline_ref *iref;
> +			int type;
> +
> +			iref = (struct btrfs_extent_inline_ref *)ptr;
> +			type = btrfs_get_extent_inline_ref_type(eb, iref,
> +							BTRFS_REF_TYPE_BLOCK);
> +			if (type == BTRFS_REF_TYPE_INVALID)
> +				return -EINVAL;
> +			if (type == BTRFS_TREE_BLOCK_REF_KEY)
> +				owner = btrfs_extent_inline_ref_offset(eb,
> +								       iref);
> +		}
>  	} else if (unlikely(item_size == sizeof(struct btrfs_extent_item_v0))) {
>  		btrfs_print_v0_err(eb->fs_info);
>  		btrfs_handle_fs_error(eb->fs_info, -EINVAL, NULL);
> @@ -2837,6 +2877,7 @@ static int add_tree_block(struct reloc_control *rc,
>  	block->key.offset = generation;
>  	block->level = level;
>  	block->key_ready = 0;
> +	block->owner = owner;
>  
>  	rb_node = rb_simple_insert(blocks, block->bytenr, &block->rb_node);
>  	if (rb_node)
> 



* Re: [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance
  2020-12-02 19:50 ` [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance Josef Bacik
@ 2020-12-03  2:06   ` Qu Wenruo
  2020-12-03  8:44   ` Johannes Thumshirn
  2020-12-03  9:00   ` Nikolay Borisov
  2 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:06 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team





On 2020/12/3 3:50 AM, Josef Bacik wrote:
> I was attempting to reproduce a problem that Zygo hit, but my error
> injection wasn't firing for a few of the common calls to
> btrfs_should_cancel_balance.  This is because the compiler decided to
> inline it at these spots.  Keep this from happening by explicitly
> noinline'ing the function so that error injection will always work.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Oh, I should have added noinline for the error injection I introduced.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks for the fix.
Qu

> ---
>  fs/btrfs/relocation.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 2b30e39e922a..ce935139d87b 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2617,7 +2617,7 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end,
>  /*
>   * Allow error injection to test balance cancellation
>   */
> -int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
> +noinline int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
>  {
>  	return atomic_read(&fs_info->balance_cancel_req) ||
>  		fatal_signal_pending(current);
> 



* Re: [PATCH v3 06/54] btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node
  2020-12-02 19:50 ` [PATCH v3 06/54] btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node Josef Bacik
@ 2020-12-03  2:08   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:08 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team





On 2020/12/3 3:50 AM, Josef Bacik wrote:
> Zygo reported the following panic when testing my error handling patches
> for relocation
> 
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/backref.c:2545!
> invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 3 PID: 8472 Comm: btrfs Tainted: G        W 14
> Hardware name: QEMU Standard PC (i440FX + PIIX,
> 
> Call Trace:
>  btrfs_backref_error_cleanup+0x4df/0x530
>  build_backref_tree+0x1a5/0x700
>  ? _raw_spin_unlock+0x22/0x30
>  ? release_extent_buffer+0x225/0x280
>  ? free_extent_buffer.part.52+0xd7/0x140
>  relocate_tree_blocks+0x2a6/0xb60
>  ? kasan_unpoison_shadow+0x35/0x50
>  ? do_relocation+0xc10/0xc10
>  ? kasan_kmalloc+0x9/0x10
>  ? kmem_cache_alloc_trace+0x6a3/0xcb0
>  ? free_extent_buffer.part.52+0xd7/0x140
>  ? rb_insert_color+0x342/0x360
>  ? add_tree_block.isra.36+0x236/0x2b0
>  relocate_block_group+0x2eb/0x780
>  ? merge_reloc_roots+0x470/0x470
>  btrfs_relocate_block_group+0x26e/0x4c0
>  btrfs_relocate_chunk+0x52/0x120
>  btrfs_balance+0xe2e/0x18f0
>  ? pvclock_clocksource_read+0xeb/0x190
>  ? btrfs_relocate_chunk+0x120/0x120
>  ? lock_contended+0x620/0x6e0
>  ? do_raw_spin_lock+0x1e0/0x1e0
>  ? do_raw_spin_unlock+0xa8/0x140
>  btrfs_ioctl_balance+0x1f9/0x460
>  btrfs_ioctl+0x24c8/0x4380
>  ? __kasan_check_read+0x11/0x20
>  ? check_chain_key+0x1f4/0x2f0
>  ? __asan_loadN+0xf/0x20
>  ? btrfs_ioctl_get_supported_features+0x30/0x30
>  ? kvm_sched_clock_read+0x18/0x30
>  ? check_chain_key+0x1f4/0x2f0
>  ? lock_downgrade+0x3f0/0x3f0
>  ? handle_mm_fault+0xad6/0x2150
>  ? do_vfs_ioctl+0xfc/0x9d0
>  ? ioctl_file_clone+0xe0/0xe0
>  ? check_flags.part.50+0x6c/0x1e0
>  ? check_flags.part.50+0x6c/0x1e0
>  ? check_flags+0x26/0x30
>  ? lock_is_held_type+0xc3/0xf0
>  ? syscall_enter_from_user_mode+0x1b/0x60
>  ? do_syscall_64+0x13/0x80
>  ? rcu_read_lock_sched_held+0xa1/0xd0
>  ? __kasan_check_read+0x11/0x20
>  ? __fget_light+0xae/0x110
>  __x64_sys_ioctl+0xc3/0x100
>  do_syscall_64+0x37/0x80
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> This occurs because of this check
> 
> if (RB_EMPTY_NODE(&upper->rb_node))
> 	BUG_ON(!list_empty(&node->upper));
> 
> As we are dropping the backref node, if we discover that the upper node
> in the edge we just cleaned up isn't linked into the cache, we assume we
> are now done with this node, thus the BUG_ON().
> 
> However this is an erroneous assumption, as we will look up all the
> references for a node first, and then process the pending edges.  All of
> the 'upper' nodes in our pending edges won't be in the cache's rb_tree
> yet, because they haven't been processed.  We could very well have many
> edges still left to cleanup on this node.
> 
> The fact is we simply do not need this check, we can just process all of
> the edges only for this node, because below this check we do the
> following
> 
> if (list_empty(&upper->lower)) {
> 	list_add_tail(&upper->lower, &cache->leaves);
> 	upper->lowest = 1;
> }
> 
> If the upper node truly isn't used yet, then we add it to the
> cache->leaves list to be cleaned up later.  If it is still used then the
> last child node that has it linked into its node will add it to the
> leaves list and then it will be cleaned up.

That's true.

> 
> Fix this problem by dropping this logic altogether.  With this fix I no
> longer see the panic when testing with error injection in the backref
> code.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>  fs/btrfs/backref.c | 7 -------
>  1 file changed, 7 deletions(-)
> 
> diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
> index 02d7d7b2563b..56f7c840031e 100644
> --- a/fs/btrfs/backref.c
> +++ b/fs/btrfs/backref.c
> @@ -2541,13 +2541,6 @@ void btrfs_backref_cleanup_node(struct btrfs_backref_cache *cache,
>  		list_del(&edge->list[UPPER]);
>  		btrfs_backref_free_edge(cache, edge);
>  
> -		if (RB_EMPTY_NODE(&upper->rb_node)) {
> -			BUG_ON(!list_empty(&node->upper));
> -			btrfs_backref_drop_node(cache, node);
> -			node = upper;
> -			node->lowest = 1;
> -			continue;
> -		}
>  		/*
>  		 * Add the node to leaf node list if no other child block
>  		 * cached.
> 



* Re: [PATCH v3 09/54] btrfs: don't clear ret in btrfs_start_dirty_block_groups
  2020-12-02 19:50 ` [PATCH v3 09/54] btrfs: don't clear ret in btrfs_start_dirty_block_groups Josef Bacik
@ 2020-12-03  2:13   ` Qu Wenruo
  2020-12-03  8:58   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:13 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team





On 2020/12/3 3:50 AM, Josef Bacik wrote:
> If we fail to update a block group item in the loop we'll break, however
> we'll still call btrfs_run_delayed_refs and lose our error value in ret, and
> thus not clean up properly.  Fix this by only running the delayed refs
> if there was no failure.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/block-group.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 52f2198d44c9..0886e81e5540 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2669,7 +2669,8 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans)
>  	 * Go through delayed refs for all the stuff we've just kicked off
>  	 * and then loop back (just once)
>  	 */
> -	ret = btrfs_run_delayed_refs(trans, 0);
> +	if (!ret)
> +		ret = btrfs_run_delayed_refs(trans, 0);
>  	if (!ret && loops == 0) {
>  		loops++;
>  		spin_lock(&cur_trans->dirty_bgs_lock);
> 



* Re: [PATCH v3 10/54] btrfs: convert some BUG_ON()'s to ASSERT()'s in do_relocation
  2020-12-02 19:50 ` [PATCH v3 10/54] btrfs: convert some BUG_ON()'s to ASSERT()'s in do_relocation Josef Bacik
@ 2020-12-03  2:14   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:14 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team





On 2020/12/3 3:50 AM, Josef Bacik wrote:
> A few of these are checking for correctness, and won't be triggered by
> corrupted file systems, so convert them to ASSERT() instead of BUG_ON()
> and add a comment explaining their existence.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index ce935139d87b..d0ce771a2a8d 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2183,7 +2183,11 @@ static int do_relocation(struct btrfs_trans_handle *trans,
>  	int slot;
>  	int ret = 0;
>  
> -	BUG_ON(lowest && node->eb);
> +	/*
> +	 * If we are lowest then this is the first time we're processing this
> +	 * block, and thus shouldn't have an eb associated with it yet.
> +	 */
> +	ASSERT(!lowest || !node->eb);
>  
>  	path->lowest_level = node->level + 1;
>  	rc->backref_cache.path[node->level] = node;
> @@ -2268,7 +2272,11 @@ static int do_relocation(struct btrfs_trans_handle *trans,
>  			free_extent_buffer(eb);
>  			if (ret < 0)
>  				goto next;
> -			BUG_ON(node->eb != eb);
> +			/*
> +			 * We've just cow'ed this block, it should have updated
> +			 * the correct backref node entry.
> +			 */
> +			ASSERT(node->eb == eb);
>  		} else {
>  			btrfs_set_node_blockptr(upper->eb, slot,
>  						node->eb->start);
> @@ -2304,7 +2312,12 @@ static int do_relocation(struct btrfs_trans_handle *trans,
>  	}
>  
>  	path->lowest_level = 0;
> -	BUG_ON(ret == -ENOSPC);
> +
> +	/*
> +	 * We should have allocated all of our space in the block rsv and thus
> +	 * shouldn't ENOSPC.
> +	 */
> +	ASSERT(ret != -ENOSPC);
>  	return ret;
>  }
>  
> 
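
A small sketch of why BUG_ON(lowest && node->eb) and ASSERT(!lowest || !node->eb) test the same thing: BUG_ON() fires when its condition is true, ASSERT() fires when its condition is false, so the conversion negates the expression (De Morgan). The helper names below are made up for illustration. What does change is when the check runs, since as far as I know ASSERT() in btrfs is only compiled in on builds with CONFIG_BTRFS_ASSERT.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition under which the old BUG_ON(lowest && node->eb) would fire. */
static bool old_bug_fires(bool lowest, bool has_eb)
{
	return lowest && has_eb;
}

/* Condition that the new ASSERT(!lowest || !node->eb) requires to hold. */
static bool new_assert_holds(bool lowest, bool has_eb)
{
	return !lowest || !has_eb;
}

/* Returns true if the two checks agree on every input combination. */
static bool checks_are_equivalent(void)
{
	for (int lowest = 0; lowest <= 1; lowest++)
		for (int has_eb = 0; has_eb <= 1; has_eb++)
			if (old_bug_fires(lowest, has_eb) ==
			    new_assert_holds(lowest, has_eb))
				return false;
	return true;
}
```

checks_are_equivalent() walks all four input combinations and confirms that the old check fires exactly when the new one fails.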



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 11/54] btrfs: convert BUG_ON()'s in relocate_tree_block
  2020-12-02 19:50 ` [PATCH v3 11/54] btrfs: convert BUG_ON()'s in relocate_tree_block Josef Bacik
@ 2020-12-03  2:15   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:15 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2044 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> We have a couple of BUG_ON()'s in relocate_tree_block() that can be
> tripped if we have file system corruption.  Convert these to ASSERT()'s
> so developers still get yelled at when they break the backref code, but
> error out nicely for users so the whole box doesn't go down.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index d0ce771a2a8d..4333ee329290 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2456,8 +2456,28 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
>  
>  	if (root) {
>  		if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state)) {
> -			BUG_ON(node->new_bytenr);
> -			BUG_ON(!list_empty(&node->list));
> +			/*
> +			 * This block was the root block of a root, and this is
> +			 * the first time we're processing the block and thus it
> +			 * should not have had the ->new_bytenr modified and
> +			 * should have not been included on the changed list.
> +			 *
> +			 * However in the case of corruption we could have
> +			 * multiple refs pointing to the same block improperly,
> +			 * and thus we would trip over these checks.  ASSERT()
> +			 * for the developer case, because it could indicate a
> +			 * bug in the backref code, however error out for a
> +			 * normal user in the case of corruption.
> +			 */
> +			ASSERT(node->new_bytenr == 0);
> +			ASSERT(list_empty(&node->list));
> +			if (node->new_bytenr || !list_empty(&node->list)) {
> +				btrfs_err(root->fs_info,
> +				  "bytenr %llu has improper references to it",
> +					  node->bytenr);
> +				ret = -EUCLEAN;
> +				goto out;
> +			}
>  			btrfs_record_root_in_trans(trans, root);
>  			root = root->reloc_root;
>  			node->new_bytenr = root->node->start;
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 12/54] btrfs: return an error from btrfs_record_root_in_trans
  2020-12-02 19:50 ` [PATCH v3 12/54] btrfs: return an error from btrfs_record_root_in_trans Josef Bacik
@ 2020-12-03  2:20   ` Qu Wenruo
  2020-12-03 13:50   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:20 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1733 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> We can create a reloc root when we record the root in the trans, which
> can fail for all sorts of different reasons.  Propagate this error up
> the chain of callers.  Future patches will fix the callers of
> btrfs_record_root_in_trans() to handle the error.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

But one unrelated comment inline below.
> ---
>  fs/btrfs/transaction.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index a614f7699ce4..28e7a7464b60 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -400,6 +400,7 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
>  			       int force)
>  {
>  	struct btrfs_fs_info *fs_info = root->fs_info;
> +	int ret = 0;
>  
>  	if ((test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
>  	    root->last_trans < trans->transid) || force) {
> @@ -448,11 +449,11 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
>  		 * lock.  smp_wmb() makes sure that all the writes above are
>  		 * done before we pop in the zero below
>  		 */

The large comment block is not really about btrfs_init_reloc_root(),
but more about ROOT_IN_TRANS_SETUP and root->last_trans.

This is a little confusing, and maybe it's a good idea to move the
comment in a separate patch.

Thanks,
Qu

> -		btrfs_init_reloc_root(trans, root);
> +		ret = btrfs_init_reloc_root(trans, root);
>  		smp_mb__before_atomic();
>  		clear_bit(BTRFS_ROOT_IN_TRANS_SETUP, &root->state);
>  	}
> -	return 0;
> +	return ret;
>  }
>  
>  
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 13/54] btrfs: handle errors from select_reloc_root()
  2020-12-02 19:50 ` [PATCH v3 13/54] btrfs: handle errors from select_reloc_root() Josef Bacik
@ 2020-12-03  2:23   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:23 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2006 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> Currently select_reloc_root() doesn't return an error, but followup
> patches will make it possible for it to return an error.  We do have
> proper error recovery in do_relocation however, so handle the
> possibility of select_reloc_root() having an error properly instead of
> BUG_ON(!root).  I've also adjusted select_reloc_root() to return
> ERR_PTR(-ENOENT) if we don't find a root, instead of NULL, to make the
> error case easier to deal with.  I've replaced the BUG_ON(!root) with an
> ASSERT(ret != -ENOENT), as this indicates we messed up the backref
> walking code, but could indicate corruption so we do not want to have a
> BUG_ON() here.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 4333ee329290..66515ccc04fe 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2027,7 +2027,7 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
>  			break;
>  	}
>  	if (!root)
> -		return NULL;
> +		return ERR_PTR(-ENOENT);
>  
>  	next = node;
>  	/* setup backref node path for btrfs_reloc_cow_block */
> @@ -2198,7 +2198,18 @@ static int do_relocation(struct btrfs_trans_handle *trans,
>  
>  		upper = edge->node[UPPER];
>  		root = select_reloc_root(trans, rc, upper, edges);
> -		BUG_ON(!root);
> +		if (IS_ERR(root)) {
> +			ret = PTR_ERR(root);
> +
> +			/*
> +			 * This can happen if there's fs corruption, but if we
> +			 * have ASSERT()'s on then we're developers and we
> +			 * likely made a logic mistake in the backref code, so
> +			 * check for this error condition.
> +			 */
> +			ASSERT(ret != -ENOENT);
> +			goto next;
> +		}
>  
>  		if (upper->eb && !upper->locked) {
>  			if (!lowest) {
> 
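
For readers less familiar with the ERR_PTR() convention used in this patch, here is a minimal userspace sketch. The three helpers are simplified stand-ins for the kernel's versions, and select_root()/use_root() are hypothetical names rather than the btrfs functions. It shows why returning ERR_PTR(-ENOENT) instead of NULL lets the caller handle "not found" with the same IS_ERR()/PTR_ERR() path as any other failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct root { int id; };

/* Hypothetical lookup: returns ERR_PTR(-ENOENT) rather than NULL on a miss. */
static struct root *select_root(struct root *table, size_t n, int id)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].id == id)
			return &table[i];
	return ERR_PTR(-ENOENT);
}

/* Caller in the style of the patched do_relocation(): 0 or a negative errno. */
static int use_root(struct root *table, size_t n, int id)
{
	struct root *root = select_root(table, n, id);

	if (IS_ERR(root))
		return (int)PTR_ERR(root);
	return 0;
}
```

Note that NULL is not an error pointer under this scheme, which is one reason the patch switches the "no root found" case away from NULL.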



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 14/54] btrfs: convert BUG_ON()'s in select_reloc_root() to proper errors
  2020-12-02 19:50 ` [PATCH v3 14/54] btrfs: convert BUG_ON()'s in select_reloc_root() to proper errors Josef Bacik
@ 2020-12-03  2:29   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:29 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 3946 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> We have several BUG_ON()'s in select_reloc_root() that can be tripped if
> you have extent tree corruption.  Convert these to ASSERT()'s, because
> if we hit it during testing it really is bad, or could indicate a
> problem with the backref walking code.
> 
> However if users hit these problems it generally indicates corruption,
> I've hit a few machines in the fleet that trip over these with clearly
> corrupted extent trees, so be nice and spit out an error message and
> return an error instead of bringing the whole box down.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/relocation.c | 51 +++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 47 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 66515ccc04fe..bf4e1018356a 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1996,8 +1996,35 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
>  		cond_resched();
>  		next = walk_up_backref(next, edges, &index);
>  		root = next->root;
> -		BUG_ON(!root);
> -		BUG_ON(!test_bit(BTRFS_ROOT_SHAREABLE, &root->state));
> +
> +		/*
> +		 * If there is no root, then our references for this block are
> +		 * incomplete, as we should be able to walk all the way up to a
> +		 * block that is owned by a root.
> +		 *
> +		 * This path is only for SHAREABLE roots, so if we come upon a
> +		 * non-SHAREABLE root then we have backrefs that resolve
> +		 * improperly.
> +		 *
> +		 * Both of these cases indicate file system corruption, or a bug
> +		 * in the backref walking code.  The ASSERT() is to make sure
> +		 * developers get bitten as soon as possible, proper error
> +		 * handling is for users who may have corrupt file systems.
> +		 */
> +		if (!root) {
> +			ASSERT(root);

ASSERT(0); may be a little less confusing.

> +			btrfs_err(trans->fs_info,
> +		"bytenr %llu doesn't have a backref path ending in a root",
> +				  node->bytenr);
> +			return ERR_PTR(-EUCLEAN);
> +		}
> +		if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state)) {
> +			ASSERT(test_bit(BTRFS_ROOT_SHAREABLE, &root->state));
Same here.

> +			btrfs_err(trans->fs_info,
> +"bytenr %llu has multiple refs with one ending in a non shareable root",
> +				  node->bytenr);
> +			return ERR_PTR(-EUCLEAN);
> +		}
>  
>  		if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
>  			record_reloc_root_in_trans(trans, root);
> @@ -2008,8 +2035,24 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
>  		root = root->reloc_root;
>  
>  		if (next->new_bytenr != root->node->start) {
> -			BUG_ON(next->new_bytenr);
> -			BUG_ON(!list_empty(&next->list));
> +			/*
> +			 * We just created the reloc root, so we shouldn't have
> +			 * ->new_bytenr set and this shouldn't be in the changed
> +			 *  list.  If it is then we have multiple roots pointing
> +			 *  at the same bytenr, or we've made a mistake in the
> +			 *  backref walking code.  ASSERT() for developers,
> +			 *  error out for users, as it indicates corruption or a
> +			 *  bad bug.

Repeating the ASSERT() explanation in every comment seems like overkill.

> +			 */
> +			ASSERT(next->new_bytenr == 0);
> +			ASSERT(list_empty(&next->list));
> +			if (next->new_bytenr || !list_empty(&next->list)) {

Just ASSERT(0); here would be good enough.

Despite that, the new behavior of using ASSERT() for developers and doing
proper error handling for users is really awesome.

Thanks,
Qu

> +				btrfs_err(trans->fs_info,
> +"bytenr %llu possibly has multiple roots pointing at the same bytenr %llu",
> +					  node->bytenr, next->bytenr);
> +				return ERR_PTR(-EUCLEAN);
> +			}
> +
>  			next->new_bytenr = root->node->start;
>  			btrfs_put_root(next->root);
>  			next->root = btrfs_grab_root(root);
> 
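
A minimal userspace sketch of the ASSERT(0)-inside-the-branch variant suggested above, so the condition is written only once. Everything here is an assumption for illustration: SKETCH_ASSERT is a made-up switch standing in for a debug-only build option, check_node() is a hypothetical helper, and fprintf() replaces btrfs_err(). With the switch off (the default), callers simply get -EUCLEAN back.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; defined here only for portability */
#endif

/*
 * Hypothetical stand-in for btrfs' ASSERT(): active only when
 * SKETCH_ASSERT is defined, mirroring a debug-only build option.
 */
#ifdef SKETCH_ASSERT
#define ASSERT(expr) assert(expr)
#else
#define ASSERT(expr) ((void)0)
#endif

struct node {
	unsigned long long bytenr;
	unsigned long long new_bytenr;
};

/* Returns 0 if the node is untouched, -EUCLEAN if it looks corrupted. */
static int check_node(const struct node *node)
{
	if (node->new_bytenr) {
		/* Developers trip here; everyone else gets an error back. */
		ASSERT(0);
		fprintf(stderr, "bytenr %llu has improper references to it\n",
			node->bytenr);
		return -EUCLEAN;
	}
	return 0;
}
```

Building with -DSKETCH_ASSERT makes the same corrupted input abort at the ASSERT(0), which is the "developers get yelled at" half of the pattern.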



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 15/54] btrfs: check record_root_in_trans related failures in select_reloc_root
  2020-12-02 19:50 ` [PATCH v3 15/54] btrfs: check record_root_in_trans related failures in select_reloc_root Josef Bacik
@ 2020-12-03  2:33   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:33 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1468 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> We will record the fs root or the reloc root in the trans in
> select_reloc_root.  These will actually return errors in the following
> patches, so check their return value here and return it up the stack.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index bf4e1018356a..d663d8fc085d 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1990,6 +1990,7 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
>  	struct btrfs_backref_node *next;
>  	struct btrfs_root *root;
>  	int index = 0;
> +	int ret;
>  
>  	next = node;
>  	while (1) {
> @@ -2027,11 +2028,15 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
>  		}
>  
>  		if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
> -			record_reloc_root_in_trans(trans, root);
> +			ret = record_reloc_root_in_trans(trans, root);
> +			if (ret)
> +				return ERR_PTR(ret);
>  			break;
>  		}
>  
> -		btrfs_record_root_in_trans(trans, root);
> +		ret = btrfs_record_root_in_trans(trans, root);
> +		if (ret)
> +			return ERR_PTR(ret);
>  		root = root->reloc_root;
>  
>  		if (next->new_bytenr != root->node->start) {
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 16/54] btrfs: do proper error handling in record_reloc_root_in_trans
  2020-12-02 19:50 ` [PATCH v3 16/54] btrfs: do proper error handling in record_reloc_root_in_trans Josef Bacik
@ 2020-12-03  2:39   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:39 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2257 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> Generally speaking this shouldn't ever fail, the corresponding fs root
> for the reloc root will already be in memory, so we won't get -ENOMEM
> here.
> 
> However if there is no corresponding root for the reloc root then we
> could get -ENOMEM when we try to allocate it or we could get -ENOENT
> when we look it up and see that it doesn't exist.
> 
> Convert these BUG_ON()'s into ASSERT()'s + proper error handling for the
> case of corruption.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/relocation.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index d663d8fc085d..5a4b44857522 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1973,8 +1973,30 @@ static int record_reloc_root_in_trans(struct btrfs_trans_handle *trans,
>  		return 0;
>  
>  	root = btrfs_get_fs_root(fs_info, reloc_root->root_key.offset, false);
> -	BUG_ON(IS_ERR(root));
> -	BUG_ON(root->reloc_root != reloc_root);
> +
> +	/*
> +	 * This should succeed, since we can't have a reloc root without having
> +	 * already looked up the actual root and created the reloc root for this
> +	 * root.
> +	 *
> +	 * However if there's some sort of corruption where we have a ref to a
> +	 * reloc root without a corresponding root this could return -ENOENT.
> +	 *
> +	 * The ASSERT()'s are to catch this case in testing, because it could
> +	 * indicate a bug, but for non-developers it indicates corruption and we
> +	 * should error out.

Repeating the same ASSERT() explanation now really looks like overkill.
> +	 */
> +	ASSERT(!IS_ERR(root));
> +	ASSERT(root->reloc_root == reloc_root);
> +	if (IS_ERR(root))
> +		return PTR_ERR(root);
> +	if (root->reloc_root != reloc_root) {

ASSERT(0) would be easier to read here IMHO.

Apart from that, looks good to me.

Thanks,
Qu
> +		btrfs_err(fs_info,
> +			  "root %llu has two reloc roots associated with it",
> +			  reloc_root->root_key.offset);
> +		btrfs_put_root(root);
> +		return -EUCLEAN;
> +	}
>  	ret = btrfs_record_root_in_trans(trans, root);
>  	btrfs_put_root(root);
>  
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 17/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename_exchange
  2020-12-02 19:50 ` [PATCH v3 17/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename_exchange Josef Bacik
@ 2020-12-03  2:40   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:40 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 926 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> btrfs_record_root_in_trans will return errors in the future, so handle
> the error properly in btrfs_rename_exchange.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/inode.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 0ce42d52d53e..d34cba37a08f 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -8878,8 +8878,11 @@ static int btrfs_rename_exchange(struct inode *old_dir,
>  		goto out_notrans;
>  	}
>  
> -	if (dest != root)
> -		btrfs_record_root_in_trans(trans, dest);
> +	if (dest != root) {
> +		ret = btrfs_record_root_in_trans(trans, dest);
> +		if (ret)
> +			goto out_fail;
> +	}
>  
>  	/*
>  	 * We need to find a free sequence number both in the source and
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 19/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_delete_subvolume
  2020-12-02 19:50 ` [PATCH v3 19/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_delete_subvolume Josef Bacik
@ 2020-12-03  2:41   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:41 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 951 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> btrfs_record_root_in_trans will return errors in the future, so handle
> the error properly in btrfs_delete_subvolume.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/inode.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 40601a0ff4f2..1f9fa63ef194 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -4157,7 +4157,11 @@ int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry)
>  		goto out_end_trans;
>  	}
>  
> -	btrfs_record_root_in_trans(trans, dest);
> +	ret = btrfs_record_root_in_trans(trans, dest);
> +	if (ret) {
> +		btrfs_abort_transaction(trans, ret);
> +		goto out_end_trans;
> +	}
>  
>  	memset(&dest->root_item.drop_progress, 0,
>  		sizeof(dest->root_item.drop_progress));
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 20/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_recover_log_trees
  2020-12-02 19:50 ` [PATCH v3 20/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_recover_log_trees Josef Bacik
@ 2020-12-03  2:42   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:42 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1331 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> btrfs_record_root_in_trans will return errors in the future, so handle
> the error properly in btrfs_recover_log_trees.
> 
> This appears tricky, however we have a reference count on the
> destination root, so if this fails we need to continue on in the loop to
> make sure the proper cleanup is done.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/tree-log.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 254c2ee43aae..77adeb3c988d 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -6286,8 +6286,12 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
>  		}
>  
>  		wc.replay_dest->log_root = log;
> -		btrfs_record_root_in_trans(trans, wc.replay_dest);
> -		ret = walk_log_tree(trans, log, &wc);
> +		ret = btrfs_record_root_in_trans(trans, wc.replay_dest);
> +		if (ret)
> +			btrfs_handle_fs_error(fs_info, ret,
> +				"Couldn't record the root in the transaction.");
> +		else
> +			ret = walk_log_tree(trans, log, &wc);
>  
>  		if (!ret && wc.stage == LOG_WALK_REPLAY_ALL) {
>  			ret = fixup_inode_link_counts(trans, wc.replay_dest,
> 
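
A toy model of the "continue on in the loop so cleanup still happens" point from the changelog. The struct and function names are hypothetical and the reference count is a plain integer. It only illustrates the shape of the change: a failed record step skips the walk, but the code that drops the reference still runs.

```c
#include <assert.h>
#include <errno.h>

struct dest {
	int refs;		/* reference count held by the loop */
	int fail_record;	/* make the hypothetical record step fail */
	int walked;		/* set when the replay step ran */
};

static int record_root(struct dest *d)
{
	return d->fail_record ? -ENOMEM : 0;
}

static int walk_log(struct dest *d)
{
	d->walked = 1;
	return 0;
}

/* Process one destination; the reference is dropped on every path. */
static int replay_one(struct dest *d)
{
	int ret;

	d->refs++;			/* lookup takes a reference */
	ret = record_root(d);
	if (!ret)
		ret = walk_log(d);	/* skipped when recording failed */
	d->refs--;			/* cleanup runs regardless of ret */
	return ret;
}
```

An early return straight after the failed record_root() call would leave refs at 1, which is the leak that falling through to the common cleanup avoids.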



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 21/54] btrfs: handle btrfs_record_root_in_trans failure in create_subvol
  2020-12-02 19:50 ` [PATCH v3 21/54] btrfs: handle btrfs_record_root_in_trans failure in create_subvol Josef Bacik
@ 2020-12-03  2:43   ` Qu Wenruo
  2020-12-03 16:06     ` Josef Bacik
  0 siblings, 1 reply; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:43 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1075 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> btrfs_record_root_in_trans will return errors in the future, so handle
> the error properly in create_subvol.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/ioctl.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 703212ff50a5..ad50e654ee64 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -714,7 +714,11 @@ static noinline int create_subvol(struct inode *dir,
>  	/* Freeing will be done in btrfs_put_root() of new_root */
>  	anon_dev = 0;
>  
> -	btrfs_record_root_in_trans(trans, new_root);
> +	ret = btrfs_record_root_in_trans(trans, new_root);
> +	if (ret) {

Don't we need to call btrfs_put_root()? Or, since we're going to abort the
transaction anyway, does it not matter that much any more?

Thanks,
Qu
> +		btrfs_abort_transaction(trans, ret);
> +		goto fail;
> +	}
>  
>  	ret = btrfs_create_subvol_root(trans, new_root, root, new_dirid);
>  	btrfs_put_root(new_root);
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 22/54] btrfs: btrfs: handle btrfs_record_root_in_trans failure in relocate_tree_block
  2020-12-02 19:50 ` [PATCH v3 22/54] btrfs: btrfs: handle btrfs_record_root_in_trans failure in relocate_tree_block Josef Bacik
@ 2020-12-03  2:44   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:44 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 939 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> btrfs_record_root_in_trans will return errors in the future, so handle
> the error properly in relocate_tree_block.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 5a4b44857522..e9d445899818 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2559,7 +2559,9 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
>  				ret = -EUCLEAN;
>  				goto out;
>  			}
> -			btrfs_record_root_in_trans(trans, root);
> +			ret = btrfs_record_root_in_trans(trans, root);
> +			if (ret)
> +				goto out;
>  			root = root->reloc_root;
>  			node->new_bytenr = root->node->start;
>  			btrfs_put_root(node->root);
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 23/54] btrfs: handle btrfs_record_root_in_trans failure in start_transaction
  2020-12-02 19:50 ` [PATCH v3 23/54] btrfs: handle btrfs_record_root_in_trans failure in start_transaction Josef Bacik
@ 2020-12-03  2:47   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:47 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 957 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> btrfs_record_root_in_trans will return errors in the future, so handle
> the error properly in start_transaction.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/transaction.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 28e7a7464b60..c17ab5194f5a 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -734,7 +734,11 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
>  	 * Thus it need to be called after current->journal_info initialized,
>  	 * or we can deadlock.
>  	 */
> -	btrfs_record_root_in_trans(h, root);
> +	ret = btrfs_record_root_in_trans(h, root);
> +	if (ret) {
> +		btrfs_end_transaction(h);
> +		return ERR_PTR(ret);
> +	}
>  
>  	return h;
>  
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 24/54] btrfs: handle record_root_in_trans failure in qgroup_account_snapshot
  2020-12-02 19:50 ` [PATCH v3 24/54] btrfs: handle record_root_in_trans failure in qgroup_account_snapshot Josef Bacik
@ 2020-12-03  2:48   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:48 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1193 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> record_root_in_trans can fail currently, so handle this failure
> properly.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/transaction.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index c17ab5194f5a..db676d99b098 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1436,7 +1436,9 @@ static int qgroup_account_snapshot(struct btrfs_trans_handle *trans,
>  	 * recorded root will never be updated again, causing an outdated root
>  	 * item.
>  	 */
> -	record_root_in_trans(trans, src, 1);
> +	ret = record_root_in_trans(trans, src, 1);
> +	if (ret)
> +		return ret;
>  
>  	/*
>  	 * We are going to commit transaction, see btrfs_commit_transaction()
> @@ -1488,7 +1490,7 @@ static int qgroup_account_snapshot(struct btrfs_trans_handle *trans,
>  	 * insert_dir_item()
>  	 */
>  	if (!ret)
> -		record_root_in_trans(trans, parent, 1);
> +		ret = record_root_in_trans(trans, parent, 1);
>  	return ret;
>  }
>  
> 



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 26/54] btrfs: handle record_root_in_trans failure in create_pending_snapshot
  2020-12-02 19:50 ` [PATCH v3 26/54] btrfs: handle record_root_in_trans failure in create_pending_snapshot Josef Bacik
@ 2020-12-03  2:56   ` Qu Wenruo
  2020-12-03 16:14     ` Josef Bacik
  0 siblings, 1 reply; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  2:56 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1559 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> record_root_in_trans can currently fail, so handle this failure
> properly.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

But I guess it would be better to fold patches 17-26 into one big patch,
since each of them is really small.

Thanks,
Qu

> ---
>  fs/btrfs/transaction.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 087d919de9fb..5393c0c4926c 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1568,8 +1568,9 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
>  	dentry = pending->dentry;
>  	parent_inode = pending->dir;
>  	parent_root = BTRFS_I(parent_inode)->root;
> -	record_root_in_trans(trans, parent_root, 0);
> -
> +	ret = record_root_in_trans(trans, parent_root, 0);
> +	if (ret)
> +		goto fail;
>  	cur_time = current_time(parent_inode);
>  
>  	/*
> @@ -1605,7 +1606,11 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
>  		goto fail;
>  	}
>  
> -	record_root_in_trans(trans, root, 0);
> +	ret = record_root_in_trans(trans, root, 0);
> +	if (ret) {
> +		btrfs_abort_transaction(trans, ret);
> +		goto fail;
> +	}
>  	btrfs_set_root_last_snapshot(&root->root_item, trans->transid);
>  	memcpy(new_root_item, &root->root_item, sizeof(*new_root_item));
>  	btrfs_check_and_init_root_item(new_root_item);
> 



* Re: [PATCH v3 27/54] btrfs: do not panic in __add_reloc_root
  2020-12-02 19:50 ` [PATCH v3 27/54] btrfs: do not panic in __add_reloc_root Josef Bacik
@ 2020-12-03  3:00   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  3:00 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1337 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> If we have a duplicate entry for a reloc root then we could have fs
> corruption that resulted in a double allocation.  This shouldn't happen
> generally so leave an ASSERT() for this case, but return an error
> instead of panicking in the normal user case.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Despite the same comment on using ASSERT(0) inside the error branch, it
looks fine.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index e9d445899818..7993a34a46ca 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -637,10 +637,12 @@ static int __must_check __add_reloc_root(struct btrfs_root *root)
>  	rb_node = rb_simple_insert(&rc->reloc_root_tree.rb_root,
>  				   node->bytenr, &node->rb_node);
>  	spin_unlock(&rc->reloc_root_tree.lock);
> +	ASSERT(rb_node == NULL);
>  	if (rb_node) {
> -		btrfs_panic(fs_info, -EEXIST,
> +		btrfs_err(fs_info,
>  			    "Duplicate root found for start=%llu while inserting into relocation tree",
>  			    node->bytenr);
> +		return -EEXIST;
>  	}
>  
>  	list_add_tail(&root->root_list, &rc->reloc_roots);
> 
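
For readers following along, the behaviour the patch moves to (report a
duplicate key to the caller instead of bringing the box down) can be
sketched in userspace C.  This is only an illustration under stated
assumptions: the sorted array, struct root_index and root_index_insert()
are hypothetical stand-ins for the rb-tree insert and are not btrfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the reloc root tree: a small sorted array. */
#define MAX_ROOTS 16

struct root_index {
	unsigned long long bytenr[MAX_ROOTS];
	size_t nr;
};

/*
 * Insert @bytenr, keeping the array sorted.  Returns 0 on success,
 * -EEXIST if the key is already present and -ENOSPC if the array is
 * full.  A duplicate is reported to the caller instead of aborting.
 */
static int root_index_insert(struct root_index *idx, unsigned long long bytenr)
{
	size_t pos = 0;
	size_t i;

	/* Find the first slot whose key is not smaller than @bytenr. */
	while (pos < idx->nr && idx->bytenr[pos] < bytenr)
		pos++;
	if (pos < idx->nr && idx->bytenr[pos] == bytenr)
		return -EEXIST;
	if (idx->nr == MAX_ROOTS)
		return -ENOSPC;

	/* Shift the tail up by one slot to make room. */
	for (i = idx->nr; i > pos; i--)
		idx->bytenr[i] = idx->bytenr[i - 1];
	idx->bytenr[pos] = bytenr;
	idx->nr++;
	return 0;
}
```

A caller then propagates -EEXIST like any other error, which is what the
later patches in the series rely on.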



* Re: [PATCH v3 29/54] btrfs: do proper error handling in create_reloc_root
  2020-12-02 19:50 ` [PATCH v3 29/54] btrfs: do proper error handling in create_reloc_root Josef Bacik
@ 2020-12-03  3:29   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  3:29 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2622 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> We do memory allocations here, read blocks from disk, all sorts of
> operations that could easily fail at any given point.  Instead of
> panicking the box, simply return the error back up the chain; all callers
> at this point have proper error handling.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>  fs/btrfs/relocation.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 6d3a80d54b32..cebf8e9d7d96 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -737,10 +737,11 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans,
>  	struct extent_buffer *eb;
>  	struct btrfs_root_item *root_item;
>  	struct btrfs_key root_key;
> -	int ret;
> +	int ret = 0;
>  
>  	root_item = kmalloc(sizeof(*root_item), GFP_NOFS);
> -	BUG_ON(!root_item);
> +	if (!root_item)
> +		return ERR_PTR(-ENOMEM);
>  
>  	root_key.objectid = BTRFS_TREE_RELOC_OBJECTID;
>  	root_key.type = BTRFS_ROOT_ITEM_KEY;
> @@ -752,7 +753,9 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans,
>  		/* called by btrfs_init_reloc_root */
>  		ret = btrfs_copy_root(trans, root, root->commit_root, &eb,
>  				      BTRFS_TREE_RELOC_OBJECTID);
> -		BUG_ON(ret);
> +		if (ret)
> +			goto fail;
> +
>  		/*
>  		 * Set the last_snapshot field to the generation of the commit
>  		 * root - like this ctree.c:btrfs_block_can_be_shared() behaves
> @@ -773,7 +776,8 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans,
>  		 */
>  		ret = btrfs_copy_root(trans, root, root->node, &eb,
>  				      BTRFS_TREE_RELOC_OBJECTID);
> -		BUG_ON(ret);
> +		if (ret)
> +			goto fail;
>  	}
>  
>  	memcpy(root_item, &root->root_item, sizeof(*root_item));
> @@ -793,14 +797,20 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans,
>  
>  	ret = btrfs_insert_root(trans, fs_info->tree_root,
>  				&root_key, root_item);
> -	BUG_ON(ret);
> +	if (ret)
> +		goto fail;
> +
>  	kfree(root_item);
>  
>  	reloc_root = btrfs_read_tree_root(fs_info->tree_root, &root_key);
> -	BUG_ON(IS_ERR(reloc_root));
> +	if (IS_ERR(reloc_root))
> +		return reloc_root;
>  	set_bit(BTRFS_ROOT_SHAREABLE, &reloc_root->state);
>  	reloc_root->last_trans = trans->transid;
>  	return reloc_root;
> +fail:
> +	kfree(root_item);
> +	return ERR_PTR(ret);
>  }
>  
>  /*
> 
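
The shape of this conversion (early ERR_PTR return, then a single fail
label that frees the temporary allocation) can be sketched in userspace C.
This is a rough illustration only: the ERR_PTR/PTR_ERR/IS_ERR helpers below
are simplified stand-ins for the kernel ones, and struct item, struct
object, copy_step() and create_object() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct item { int value; };
struct object { int value; };

/* Hypothetical fallible step; fails when @inject_error is non-zero. */
static int copy_step(struct item *item, int inject_error)
{
	if (inject_error)
		return -EIO;
	item->value = 42;
	return 0;
}

/* Returns a new object, or an ERR_PTR-encoded errno on failure. */
static struct object *create_object(int inject_error)
{
	struct item *item;
	struct object *obj;
	int ret;

	item = malloc(sizeof(*item));
	if (!item)
		return ERR_PTR(-ENOMEM);	/* nothing to clean up yet */

	ret = copy_step(item, inject_error);
	if (ret)
		goto fail;

	obj = malloc(sizeof(*obj));
	if (!obj) {
		ret = -ENOMEM;
		goto fail;
	}
	obj->value = item->value;
	free(item);
	return obj;
fail:
	/* Single exit path that releases the temporary allocation. */
	free(item);
	return ERR_PTR(ret);
}
```

Callers check the result with IS_ERR() and recover the errno with
PTR_ERR(), rather than testing for NULL.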



* Re: [PATCH v3 30/54] btrfs: validate ->reloc_root after recording root in trans
  2020-12-02 19:50 ` [PATCH v3 30/54] btrfs: validate ->reloc_root after recording root in trans Josef Bacik
@ 2020-12-03  4:49   ` Qu Wenruo
  2020-12-03 16:18     ` Josef Bacik
  0 siblings, 1 reply; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  4:49 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Zygo Blaxell


[-- Attachment #1.1: Type: text/plain, Size: 2181 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> If we fail to set up a ->reloc_root in a different thread, that path will
> error out; however, it still leaves root->reloc_root NULL while the root
> appears set up in the transaction.  Subsequent calls to
> btrfs_record_root_in_trans would succeed without attempting to
> create the reloc root, as the transid has already been updated.  Handle
> this case by making sure we have a root->reloc_root set after a
> btrfs_record_root_in_trans call so we don't end up deref'ing a
> NULL pointer.
> 
> Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

The fix here is mostly based on the fact that pointer assignment is atomic.

But I'm wondering if we could do it better by using something like a
spinlock to make it more explicit.
Or would such a root->reloc_lock be overkill?

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index cebf8e9d7d96..c9df05f02649 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2078,6 +2078,13 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
>  			return ERR_PTR(ret);
>  		root = root->reloc_root;
>  
> +		/*
> +		 * We could have raced with another thread which failed, so
> +		 * ->reloc_root may not be set, return -ENOENT in this case.
> +		 */
> +		if (!root)
> +			return ERR_PTR(-ENOENT);
> +
>  		if (next->new_bytenr != root->node->start) {
>  			/*
>  			 * We just created the reloc root, so we shouldn't have
> @@ -2579,6 +2586,14 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
>  			ret = btrfs_record_root_in_trans(trans, root);
>  			if (ret)
>  				goto out;
> +			/*
> +			 * Another thread could have failed, need to check if we
> +			 * have ->reloc_root actually set.
> +			 */
> +			if (!root->reloc_root) {
> +				ret = -ENOENT;
> +				goto out;
> +			}
>  			root = root->reloc_root;
>  			node->new_bytenr = root->node->start;
>  			btrfs_put_root(node->root);
> 
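
The state the commit message describes (transid already stamped, but
->reloc_root still NULL after a failed setup) can be modelled with a
small single-threaded sketch.  Everything here is an assumption made for
illustration: struct root, struct reloc, record_root() and
use_reloc_root() are hypothetical and only mimic the ordering described
above, not the real btrfs code paths or their locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct reloc { int id; };

struct root {
	unsigned long long last_trans;	/* transid this root was recorded in */
	struct reloc *reloc_root;	/* stays NULL if setup failed */
};

/*
 * Hypothetical model of recording a root: the transid is stamped first,
 * then the reloc root is set up.  A failure in the second step leaves
 * the stamp in place, so later calls return 0 without retrying.
 */
static int record_root(struct root *root, unsigned long long transid,
		       struct reloc *reloc, int inject_error)
{
	if (root->last_trans == transid)
		return 0;		/* already recorded, nothing to do */
	root->last_trans = transid;
	if (inject_error)
		return -ENOMEM;
	root->reloc_root = reloc;
	return 0;
}

/* Caller-side check in the spirit of the patch; returns the id or -errno. */
static int use_reloc_root(struct root *root, unsigned long long transid,
			  struct reloc *reloc)
{
	int ret = record_root(root, transid, reloc, 0);

	if (ret)
		return ret;
	/* Recording "succeeded", but an earlier caller may have failed. */
	if (!root->reloc_root)
		return -ENOENT;
	return root->reloc_root->id;
}
```

Without the NULL check, the second caller would dereference
root->reloc_root even though record_root() reported success.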



* Re: [PATCH v3 31/54] btrfs: handle btrfs_update_reloc_root failure in commit_fs_roots
  2020-12-02 19:50 ` [PATCH v3 31/54] btrfs: handle btrfs_update_reloc_root failure in commit_fs_roots Josef Bacik
@ 2020-12-03  4:51   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  4:51 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1124 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> btrfs_update_reloc_root will return errors in the future, so handle
> the error properly in commit_fs_roots.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

The patch itself is OK.

Reviewed-by: Qu Wenruo <wqu@suse.com>

But it would really help more if all the btrfs_update_reloc_root() error
handling patches could be merged into one.

Thanks,
Qu
> ---
>  fs/btrfs/transaction.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 5393c0c4926c..5064beff3f9f 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1344,7 +1344,9 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>  			spin_unlock(&fs_info->fs_roots_radix_lock);
>  
>  			btrfs_free_log(trans, root);
> -			btrfs_update_reloc_root(trans, root);
> +			err = btrfs_update_reloc_root(trans, root);
> +			if (err)
> +				return err;
>  
>  			/* see comments in should_cow_block() */
>  			clear_bit(BTRFS_ROOT_FORCE_COW, &root->state);
> 



* Re: [PATCH v3 35/54] btrfs: do proper error handling in btrfs_update_reloc_root
  2020-12-02 19:50 ` [PATCH v3 35/54] btrfs: do proper error handling in btrfs_update_reloc_root Josef Bacik
@ 2020-12-03  4:54   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  4:54 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1344 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> We call btrfs_update_root in btrfs_update_reloc_root, which can fail for
> all sorts of reasons, including IO errors.  Instead of panicking the box,
> let's return the error, now that all callers properly handle those
> errors.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

But I'm a little surprised that btrfs_update_reloc_root() has an int return
value, yet we still used BUG_ON() for error handling.

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index e41d14958b8b..2fcb07bc8450 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -894,7 +894,7 @@ int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
>  	int ret;
>  
>  	if (!have_reloc_root(root))
> -		goto out;
> +		return 0;
>  
>  	reloc_root = root->reloc_root;
>  	root_item = &reloc_root->root_item;
> @@ -927,10 +927,8 @@ int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
>  
>  	ret = btrfs_update_root(trans, fs_info->tree_root,
>  				&reloc_root->root_key, root_item);
> -	BUG_ON(ret);
>  	btrfs_put_root(reloc_root);
> -out:
> -	return 0;
> +	return ret;
>  }
>  
>  /*
> 



* Re: [PATCH v3 36/54] btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s
  2020-12-02 19:50 ` [PATCH v3 36/54] btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s Josef Bacik
@ 2020-12-03  4:55   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  4:55 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1450 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> A few BUG_ON()'s in replace_path are purely to keep us from making
> logical mistakes, so replace them with ASSERT()'s.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Indeed, these are really just to prevent developers from passing the wrong
parameters.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 2fcb07bc8450..b872a64de8bb 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1202,8 +1202,8 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
>  	int ret;
>  	int slot;
>  
> -	BUG_ON(src->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID);
> -	BUG_ON(dest->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID);
> +	ASSERT(src->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID);
> +	ASSERT(dest->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID);
>  
>  	last_snapshot = btrfs_root_last_snapshot(&src->root_item);
>  again:
> @@ -1234,7 +1234,7 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
>  	parent = eb;
>  	while (1) {
>  		level = btrfs_header_level(parent);
> -		BUG_ON(level < lowest_level);
> +		ASSERT(level >= lowest_level);
>  
>  		ret = btrfs_bin_search(parent, &key, &slot);
>  		if (ret < 0)
> 
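
Note that each converted condition is logically negated: BUG_ON() fires
when its condition is true, while ASSERT() fires when its expression is
false.  A tiny sketch of that equivalence for the level check, using
hypothetical helper names (bug_on_fires(), assert_fires() and
level_check_equivalent() are made up for illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Would a BUG_ON(cond) trigger?  It fires when the condition is true. */
static bool bug_on_fires(bool cond)
{
	return cond;
}

/* Would an ASSERT(expr) trigger?  It fires when the expression is false. */
static bool assert_fires(bool expr)
{
	return !expr;
}

/*
 * BUG_ON(level < lowest_level) and ASSERT(level >= lowest_level) must
 * trigger for exactly the same inputs.
 */
static bool level_check_equivalent(int level, int lowest_level)
{
	return bug_on_fires(level < lowest_level) ==
	       assert_fires(level >= lowest_level);
}
```

The same inversion applies to the two objectid checks at the top of
replace_path().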



* Re: [PATCH v3 37/54] btrfs: handle initial btrfs_cow_block error in replace_path
  2020-12-02 19:50 ` [PATCH v3 37/54] btrfs: handle initial btrfs_cow_block error in replace_path Josef Bacik
@ 2020-12-03  5:05   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:05 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1413 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> If we error out cow'ing the root node when doing a replace_path then we
> simply unlock and free the buffer and return the error.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

One unrelated thing inlined below.
> ---
>  fs/btrfs/relocation.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index b872a64de8bb..52d6e7ab4265 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1222,7 +1222,11 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
>  	if (cow) {
>  		ret = btrfs_cow_block(trans, dest, eb, NULL, 0, &eb,
>  				      BTRFS_NESTING_COW);

Is it only me, or does such a btrfs_cow_block() call, using eb as source and
again eb as dest, look pretty strange?

Although I have seen a lot of callers in ctree.c doing the same thing;
in fact, ALL btrfs_cow_block() calls use the same eb for source and
dest.

Either it means we can remove one parameter of btrfs_cow_block(), or it's
really confusing and we should avoid such a use case.

Anyway, it would be another patch.

Thanks,
Qu

> -		BUG_ON(ret);
> +		if (ret) {
> +			btrfs_tree_unlock(eb);
> +			free_extent_buffer(eb);
> +			return ret;
> +		}
>  	}
>  
>  	if (next_key) {
> 



* Re: [PATCH v3 38/54] btrfs: handle the loop btrfs_cow_block error in replace_path
  2020-12-02 19:50 ` [PATCH v3 38/54] btrfs: handle the loop " Josef Bacik
@ 2020-12-03  5:11   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:11 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1110 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> As we loop through the path to replace it, we will have to cow each node
> we hit on the path down to the lowest_level.  If this fails we simply
> unlock and free the block and break from the loop.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

There are two btrfs_cow_block() calls in replace_path().
It would be better to handle them in the same patch.

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 52d6e7ab4265..781908f3a3af 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1286,7 +1286,11 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
>  				ret = btrfs_cow_block(trans, dest, eb, parent,
>  						      slot, &eb,
>  						      BTRFS_NESTING_COW);
> -				BUG_ON(ret);
> +				if (ret) {
> +					btrfs_tree_unlock(eb);
> +					free_extent_buffer(eb);
> +					break;
> +				}
>  			}
>  
>  			btrfs_tree_unlock(parent);
> 



* Re: [PATCH v3 39/54] btrfs: handle btrfs_search_slot failure in replace_path
  2020-12-02 19:50 ` [PATCH v3 39/54] btrfs: handle btrfs_search_slot failure " Josef Bacik
@ 2020-12-03  5:13   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:13 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 850 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> This can fail for any number of reasons, why bring the whole box down
> with it?
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 781908f3a3af..8c407ebc5500 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1314,7 +1314,8 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
>  		path->lowest_level = level;
>  		ret = btrfs_search_slot(trans, src, &key, path, 0, 1);
>  		path->lowest_level = 0;
> -		BUG_ON(ret);
> +		if (ret)
> +			break;
>  
>  		/*
>  		 * Info qgroup to trace both subtrees.
> 



* Re: [PATCH v3 40/54] btrfs: handle errors in reference count manipulation in replace_path
  2020-12-02 19:50 ` [PATCH v3 40/54] btrfs: handle errors in reference count manipulation " Josef Bacik
@ 2020-12-03  5:14   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:14 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2143 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> If any of the reference count manipulation stuff fails in replace_path
> we need to abort the transaction, as we've modified the blocks already.
> We can simply break at this point and everything will be cleaned up.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 8c407ebc5500..ef33b89e352e 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1355,27 +1355,39 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
>  		ref.skip_qgroup = true;
>  		btrfs_init_tree_ref(&ref, level - 1, src->root_key.objectid);
>  		ret = btrfs_inc_extent_ref(trans, &ref);
> -		BUG_ON(ret);
> +		if (ret) {
> +			btrfs_abort_transaction(trans, ret);
> +			break;
> +		}
>  		btrfs_init_generic_ref(&ref, BTRFS_ADD_DELAYED_REF, new_bytenr,
>  				       blocksize, 0);
>  		ref.skip_qgroup = true;
>  		btrfs_init_tree_ref(&ref, level - 1, dest->root_key.objectid);
>  		ret = btrfs_inc_extent_ref(trans, &ref);
> -		BUG_ON(ret);
> +		if (ret) {
> +			btrfs_abort_transaction(trans, ret);
> +			break;
> +		}
>  
>  		btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF, new_bytenr,
>  				       blocksize, path->nodes[level]->start);
>  		btrfs_init_tree_ref(&ref, level - 1, src->root_key.objectid);
>  		ref.skip_qgroup = true;
>  		ret = btrfs_free_extent(trans, &ref);
> -		BUG_ON(ret);
> +		if (ret) {
> +			btrfs_abort_transaction(trans, ret);
> +			break;
> +		}
>  
>  		btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF, old_bytenr,
>  				       blocksize, 0);
>  		btrfs_init_tree_ref(&ref, level - 1, dest->root_key.objectid);
>  		ref.skip_qgroup = true;
>  		ret = btrfs_free_extent(trans, &ref);
> -		BUG_ON(ret);
> +		if (ret) {
> +			btrfs_abort_transaction(trans, ret);
> +			break;
> +		}
>  
>  		btrfs_unlock_up_safe(path, 0);
>  
> 



* Re: [PATCH v3 41/54] btrfs: handle extent reference errors in do_relocation
  2020-12-02 19:50 ` [PATCH v3 41/54] btrfs: handle extent reference errors in do_relocation Josef Bacik
@ 2020-12-03  5:15   ` Qu Wenruo
  2020-12-03 16:26     ` Josef Bacik
  0 siblings, 1 reply; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:15 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1258 bytes --]



On 2020/12/3 3:50 AM, Josef Bacik wrote:
> We can already deal with errors appropriately from do_relocation, simply
> handle any errors that come from changing the refs at this point
> cleanly.  We have to abort the transaction if we fail here as we've
> modified metadata at this point.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/relocation.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index ef33b89e352e..3159f6517588 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2433,10 +2433,11 @@ static int do_relocation(struct btrfs_trans_handle *trans,
>  			btrfs_init_tree_ref(&ref, node->level,
>  					    btrfs_header_owner(upper->eb));
>  			ret = btrfs_inc_extent_ref(trans, &ref);
> -			BUG_ON(ret);
> -
> -			ret = btrfs_drop_subtree(trans, root, eb, upper->eb);
> -			BUG_ON(ret);
> +			if (ret) {
> +				btrfs_abort_transaction(trans, ret);
> +				goto next;
> +			}
> +			btrfs_drop_subtree(trans, root, eb, upper->eb);

Wait a second. Now we don't handle the error from btrfs_drop_subtree()
at all?

Thanks,
Qu
>  		}
>  next:
>  		if (!upper->pending)
> 



* Re: [PATCH v3 42/54] btrfs: check for BTRFS_BLOCK_FLAG_FULL_BACKREF being set improperly
  2020-12-02 19:51 ` [PATCH v3 42/54] btrfs: check for BTRFS_BLOCK_FLAG_FULL_BACKREF being set improperly Josef Bacik
@ 2020-12-03  5:19   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:19 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1116 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> We need to validate that a data extent item does not have the
> FULL_BACKREF flag set on its flags.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

But an idea for a new patch is inlined below.
> ---
>  fs/btrfs/tree-checker.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index 028e733e42f3..39714aeb9b36 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -1283,6 +1283,11 @@ static int check_extent_item(struct extent_buffer *leaf,
>  				   key->offset, fs_info->sectorsize);
>  			return -EUCLEAN;
>  		}
> +		if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
> +			extent_err(leaf, slot,
> +			"invalid extent flag, data has full backref set");
> +			return -EUCLEAN;
> +		}

Since we're already in tree-checker, another possible check is to
ensure a COW tree only has one inline ref, and no keyed ref.

Thanks,
Qu
>  	}
>  	ptr = (unsigned long)(struct btrfs_extent_item *)(ei + 1);
>  
> 



* Re: [PATCH v3 43/54] btrfs: remove the extent item sanity checks in relocate_block_group
  2020-12-02 19:51 ` [PATCH v3 43/54] btrfs: remove the extent item sanity checks in relocate_block_group Josef Bacik
@ 2020-12-03  5:20   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:20 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2347 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> These checks are all taken care of for us by the tree checker code.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Yeah! Finally we see the day where tree-checker is involved in removing
duplicated checks.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>  fs/btrfs/relocation.c | 29 +----------------------------
>  1 file changed, 1 insertion(+), 28 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 3159f6517588..8f4f1e21c770 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -3370,20 +3370,6 @@ static void unset_reloc_control(struct reloc_control *rc)
>  	mutex_unlock(&fs_info->reloc_mutex);
>  }
>  
> -static int check_extent_flags(u64 flags)
> -{
> -	if ((flags & BTRFS_EXTENT_FLAG_DATA) &&
> -	    (flags & BTRFS_EXTENT_FLAG_TREE_BLOCK))
> -		return 1;
> -	if (!(flags & BTRFS_EXTENT_FLAG_DATA) &&
> -	    !(flags & BTRFS_EXTENT_FLAG_TREE_BLOCK))
> -		return 1;
> -	if ((flags & BTRFS_EXTENT_FLAG_DATA) &&
> -	    (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF))
> -		return 1;
> -	return 0;
> -}
> -
>  static noinline_for_stack
>  int prepare_to_relocate(struct reloc_control *rc)
>  {
> @@ -3435,7 +3421,6 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
>  	struct btrfs_path *path;
>  	struct btrfs_extent_item *ei;
>  	u64 flags;
> -	u32 item_size;
>  	int ret;
>  	int err = 0;
>  	int progress = 0;
> @@ -3484,19 +3469,7 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
>  
>  		ei = btrfs_item_ptr(path->nodes[0], path->slots[0],
>  				    struct btrfs_extent_item);
> -		item_size = btrfs_item_size_nr(path->nodes[0], path->slots[0]);
> -		if (item_size >= sizeof(*ei)) {
> -			flags = btrfs_extent_flags(path->nodes[0], ei);
> -			ret = check_extent_flags(flags);
> -			BUG_ON(ret);
> -		} else if (unlikely(item_size == sizeof(struct btrfs_extent_item_v0))) {
> -			err = -EINVAL;
> -			btrfs_print_v0_err(trans->fs_info);
> -			btrfs_abort_transaction(trans, err);
> -			break;
> -		} else {
> -			BUG();
> -		}
> +		flags = btrfs_extent_flags(path->nodes[0], ei);
>  
>  		if (flags & BTRFS_EXTENT_FLAG_TREE_BLOCK) {
>  			ret = add_tree_block(rc, &key, path, &blocks);
> 
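
For reference, the three conditions the removed check_extent_flags()
rejected can be restated as a standalone userspace function.  This is a
sketch under assumptions: the FLAG_* names and bit positions below are
illustrative placeholders chosen for this example, and
extent_flags_invalid() is a hypothetical name, not a tree-checker
function.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag bits; names and values are assumptions for this sketch. */
#define FLAG_DATA		(1ULL << 0)
#define FLAG_TREE_BLOCK		(1ULL << 1)
#define FLAG_FULL_BACKREF	(1ULL << 8)

/* Returns 0 if the flag combination is valid, 1 otherwise. */
static int extent_flags_invalid(uint64_t flags)
{
	int is_data = !!(flags & FLAG_DATA);
	int is_tree = !!(flags & FLAG_TREE_BLOCK);

	/* An extent must be exactly one of data or tree block. */
	if (is_data == is_tree)
		return 1;
	/* A data extent must not carry the full backref flag. */
	if (is_data && (flags & FLAG_FULL_BACKREF))
		return 1;
	return 0;
}
```

With checks of this kind done once when a leaf is read, callers such as
relocate_block_group() can read the flags directly.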



* Re: [PATCH v3 44/54] btrfs: do proper error handling in create_reloc_inode
  2020-12-02 19:51 ` [PATCH v3 44/54] btrfs: do proper error handling in create_reloc_inode Josef Bacik
@ 2020-12-03  5:25   ` Qu Wenruo
  2020-12-03 16:34     ` Josef Bacik
  0 siblings, 1 reply; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:25 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1470 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> We already handle some errors in this function, and the callers do the
> correct error handling, so clean up the rest of the function to do the
> appropriate error handling.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/relocation.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 8f4f1e21c770..bcced4e436af 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -3634,10 +3634,15 @@ struct inode *create_reloc_inode(struct btrfs_fs_info *fs_info,
>  		goto out;
>  
>  	err = __insert_orphan_inode(trans, root, objectid);
> -	BUG_ON(err);
> +	if (err)
> +		goto out;
>  
>  	inode = btrfs_iget(fs_info->sb, objectid, root);
> -	BUG_ON(IS_ERR(inode));
> +	if (IS_ERR(inode)) {

When an error happens here, we have already inserted an inode item into the
data reloc root, without the orphan item to clean it up.

It won't cause any problem, since we have a u64 to store almost endless
inodes in a mostly empty tree.

But I guess we'd still better try to delete the inserted inode item, or the
data reloc tree may one day become a landfill with all those inode items.

Thanks,
Qu
> +		err = PTR_ERR(inode);
> +		inode = NULL;
> +		goto out;
> +	}
>  	BTRFS_I(inode)->index_cnt = group->start;
>  
>  	err = btrfs_orphan_add(trans, BTRFS_I(inode));
> 



* Re: [PATCH v3 45/54] btrfs: handle __add_reloc_root failure in btrfs_recover_relocation
  2020-12-02 19:51 ` [PATCH v3 45/54] btrfs: handle __add_reloc_root failure in btrfs_recover_relocation Josef Bacik
@ 2020-12-03  5:32   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:32 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1188 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> We can already handle errors appropriately from this function, deal with
> an error coming from __add_reloc_root appropriately.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

It turns out that we can do fewer cleanups: if we error out here, the
fs won't be mounted anyway.

Thus things like the reloc tree don't need to be dropped.

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index bcced4e436af..6315e74c1da0 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -3984,7 +3984,12 @@ int btrfs_recover_relocation(struct btrfs_root *root)
>  		}
>  
>  		err = __add_reloc_root(reloc_root);
> -		BUG_ON(err < 0); /* -ENOMEM or logic error */
> +		if (err) {
> +			list_add_tail(&reloc_root->root_list, &reloc_roots);
> +			btrfs_put_root(fs_root);
> +			btrfs_end_transaction(trans);
> +			goto out_unset;
> +		}
>  		fs_root->reloc_root = btrfs_grab_root(reloc_root);
>  		btrfs_put_root(fs_root);
>  	}
> 



* Re: [PATCH v3 46/54] btrfs: handle __add_reloc_root failure in btrfs_reloc_post_snapshot
  2020-12-02 19:51 ` [PATCH v3 46/54] btrfs: handle __add_reloc_root failure in btrfs_reloc_post_snapshot Josef Bacik
@ 2020-12-03  5:34   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:34 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 989 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> If we fail to add the reloc root, drop it and return the error.  All
> callers of this function already handle errors appropriately.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Better to fold this into the previous patch.

Other than that,

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>  fs/btrfs/relocation.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 6315e74c1da0..695a52cd07b0 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -4204,7 +4204,10 @@ int btrfs_reloc_post_snapshot(struct btrfs_trans_handle *trans,
>  		return PTR_ERR(reloc_root);
>  
>  	ret = __add_reloc_root(reloc_root);
> -	BUG_ON(ret < 0);
> +	if (ret) {
> +		btrfs_put_root(reloc_root);
> +		return ret;
> +	}
>  	new_root->reloc_root = btrfs_grab_root(reloc_root);
>  
>  	if (rc->create_reloc_tree)
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 47/54] btrfs: cleanup error handling in prepare_to_merge
  2020-12-02 19:51 ` [PATCH v3 47/54] btrfs: cleanup error handling in prepare_to_merge Josef Bacik
@ 2020-12-03  5:39   ` Qu Wenruo
  2020-12-03 16:53     ` Josef Bacik
  0 siblings, 1 reply; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:39 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1527 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> This probably can't happen even with a corrupt file system, because we
> would have failed much earlier on than here.  However there's no reason
> we can't just check and bail out as appropriate, so do that and convert
> the correctness BUG_ON() to an ASSERT().
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

The handling itself is kinda OK.

Reviewed-by: Qu Wenruo <wqu@suse.com>

But I still have a (maybe unrelated) question inline below.
> ---
>  fs/btrfs/relocation.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 695a52cd07b0..d4656a8f507d 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1870,8 +1870,14 @@ int prepare_to_merge(struct reloc_control *rc, int err)
>  
>  		root = btrfs_get_fs_root(fs_info, reloc_root->root_key.offset,
>  				false);
> -		BUG_ON(IS_ERR(root));
> -		BUG_ON(root->reloc_root != reloc_root);
> +		if (IS_ERR(root)) {
> +			list_add(&reloc_root->root_list, &reloc_roots);

I find it pretty strange that even if prepare_to_merge() fails, we
still go on to call merge_reloc_roots().

I guess we'd better handle that first?

Thanks,
Qu
> +			btrfs_abort_transaction(trans, (int)PTR_ERR(root));
> +			if (!err)
> +				err = PTR_ERR(root);
> +			break;
> +		}
> +		ASSERT(root->reloc_root == reloc_root);
>  
>  		/*
>  		 * set reference count to 1, so btrfs_recover_relocation
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 48/54] btrfs: handle extent corruption with select_one_root properly
  2020-12-02 19:51 ` [PATCH v3 48/54] btrfs: handle extent corruption with select_one_root properly Josef Bacik
@ 2020-12-03  5:40   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:40 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1822 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> In corruption cases we could have paths from a block up to no root at
> all, and thus we'll BUG_ON(!root) in select_one_root.  Handle this by
> adding an ASSERT() for developers, and returning an error for normal
> users.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/relocation.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index d4656a8f507d..91479979d2a7 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2200,7 +2200,16 @@ struct btrfs_root *select_one_root(struct btrfs_backref_node *node)
>  		cond_resched();
>  		next = walk_up_backref(next, edges, &index);
>  		root = next->root;
> -		BUG_ON(!root);
> +
> +		/*
> +		 * This can occur if we have incomplete extent refs leading all
> +		 * the way up a particular path, in this case return -EUCLEAN.
> +		 * However leave as an ASSERT() for developers, because it could
> +		 * indicate a bug in the backref code.
> +		 */
> +		ASSERT(root);
> +		if (!root)
> +			return ERR_PTR(-EUCLEAN);

Same comment as before about using ASSERT(0) in the error branch.

Other than that, it looks OK to me.

Thanks,
Qu
>  
>  		/* No other choice for non-shareable tree */
>  		if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
> @@ -2598,8 +2607,12 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
>  
>  	BUG_ON(node->processed);
>  	root = select_one_root(node);
> -	if (root == ERR_PTR(-ENOENT)) {
> -		update_processed_blocks(rc, node);
> +	if (IS_ERR(root)) {
> +		ret = PTR_ERR(root);
> +		if (ret == -ENOENT) {
> +			ret = 0;
> +			update_processed_blocks(rc, node);
> +		}
>  		goto out;
>  	}
>  
> 
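One reading of the ASSERT(0) suggestion, sketched as standalone userspace C
rather than kernel code. Everything named demo_* or DEBUG_* here is
hypothetical: DEBUG_ASSERT() only stands in for an assertion that is compiled
out on production builds, and a plain int return replaces ERR_PTR().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux's value, as a fallback for other platforms */
#endif

/*
 * Hypothetical stand-in for a debug-only assertion: a real check only when
 * DEBUG_BUILD is defined, a no-op otherwise.
 */
#ifdef DEBUG_BUILD
#define DEBUG_ASSERT(cond) assert(cond)
#else
#define DEBUG_ASSERT(cond) ((void)0)
#endif

struct demo_node {
	const char *root;	/* NULL models a backref path that reaches no root */
};

/* Returns 0 and stores the root in *out, or -EUCLEAN if the node has none. */
static int demo_select_root(const struct demo_node *node, const char **out)
{
	if (!node->root) {
		DEBUG_ASSERT(0);	/* loud for developers */
		return -EUCLEAN;	/* graceful for everyone else */
	}
	*out = node->root;
	return 0;
}
```

Keeping the assertion inside the branch means the condition is spelled out
only once, and the error return is still reached when assertions are off.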


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 49/54] btrfs: do proper error handling in merge_reloc_roots
  2020-12-02 19:51 ` [PATCH v3 49/54] btrfs: do proper error handling in merge_reloc_roots Josef Bacik
@ 2020-12-03  5:42   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:42 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1561 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> We have a BUG_ON() if we get an error back from btrfs_get_fs_root().
> This honestly should never fail, as at this point we have a solid
> coordination of fs root to reloc root, and these roots will all be in
> memory.  But in the name of killing BUG_ON()'s remove this one and
> handle the error properly.  Change the remaining BUG_ON() to an
> ASSERT().
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/relocation.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 91479979d2a7..099a64b47020 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1949,9 +1949,18 @@ void merge_reloc_roots(struct reloc_control *rc)
>  
>  		root = btrfs_get_fs_root(fs_info, reloc_root->root_key.offset,
>  					 false);
> +		if (IS_ERR(root)) {
> +			/*
> +			 * This likely won't happen, since we would have failed
> +			 * at a higher level.  However for correctness sake
> +			 * handle the error anyway.
> +			 */

Maybe another ASSERT(0)?

Other than that, looks good.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> +			ret = PTR_ERR(root);
> +			goto out;
> +		}
> +
>  		if (btrfs_root_refs(&reloc_root->root_item) > 0) {
> -			BUG_ON(IS_ERR(root));
> -			BUG_ON(root->reloc_root != reloc_root);
> +			ASSERT(root->reloc_root == reloc_root);
>  			ret = merge_reloc_root(rc, root);
>  			btrfs_put_root(root);
>  			if (ret) {
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 50/54] btrfs: check return value of btrfs_commit_transaction in relocation
  2020-12-02 19:51 ` [PATCH v3 50/54] btrfs: check return value of btrfs_commit_transaction in relocation Josef Bacik
@ 2020-12-03  5:42   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:42 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 1625 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> There's a few places where we don't check the return value of
> btrfs_commit_transaction in relocation.c.  Thankfully all these places
> have straightforward error handling, so simply change all of the sites
> at once.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/relocation.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 099a64b47020..15b6e54394b7 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1905,7 +1905,7 @@ int prepare_to_merge(struct reloc_control *rc, int err)
>  	list_splice(&reloc_roots, &rc->reloc_roots);
>  
>  	if (!err)
> -		btrfs_commit_transaction(trans);
> +		err = btrfs_commit_transaction(trans);
>  	else
>  		btrfs_end_transaction(trans);
>  	return err;
> @@ -3436,8 +3436,7 @@ int prepare_to_relocate(struct reloc_control *rc)
>  		 */
>  		return PTR_ERR(trans);
>  	}
> -	btrfs_commit_transaction(trans);
> -	return 0;
> +	return btrfs_commit_transaction(trans);
>  }
>  
>  static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
> @@ -3596,7 +3595,9 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
>  		err = PTR_ERR(trans);
>  		goto out_free;
>  	}
> -	btrfs_commit_transaction(trans);
> +	ret = btrfs_commit_transaction(trans);
> +	if (ret && !err)
> +		err = ret;
>  out_free:
>  	ret = clean_dirty_subvols(rc);
>  	if (ret < 0 && !err)
> 
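The "if (ret && !err) err = ret;" idiom used in the last hunk can be shown
in isolation. This is only a sketch with hypothetical helper names
(demo_keep_first_error, demo_run_steps); it illustrates that the first
failure in a sequence is the one reported, while later steps still run.

```c
#include <assert.h>
#include <errno.h>

/* Remember only the first failure seen so far. */
static int demo_keep_first_error(int err, int ret)
{
	if (ret && !err)
		err = ret;
	return err;
}

/* Runs through every step result in order and reports the first non-zero one. */
static int demo_run_steps(const int *results, int count)
{
	int err = 0;

	assert(count >= 0);
	for (int i = 0; i < count; i++)
		err = demo_keep_first_error(err, results[i]);
	return err;
}
```

With results of { 0, -EIO, -ENOMEM } the caller sees -EIO, which is usually
the error closest to the root cause.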


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 52/54] btrfs: print the actual offset in btrfs_root_name
  2020-12-02 19:51 ` [PATCH v3 52/54] btrfs: print the actual offset in btrfs_root_name Josef Bacik
@ 2020-12-03  5:44   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:44 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2669 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> We're supposed to print the root_key.offset in btrfs_root_name in the
> case of a reloc root, not the objectid.  Fix this helper to take the key
> so we have access to the offset when we need it.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>  fs/btrfs/disk-io.c    |  2 +-
>  fs/btrfs/print-tree.c | 10 +++++-----
>  fs/btrfs/print-tree.h |  2 +-
>  3 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 46dd9e0b077e..c73d172aa1f7 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1458,7 +1458,7 @@ void btrfs_check_leaked_roots(struct btrfs_fs_info *fs_info)
>  		root = list_first_entry(&fs_info->allocated_roots,
>  					struct btrfs_root, leak_list);
>  		btrfs_err(fs_info, "leaked root %s refcount %d",
> -			  btrfs_root_name(root->root_key.objectid, buf),
> +			  btrfs_root_name(&root->root_key, buf),
>  			  refcount_read(&root->refs));
>  		while (refcount_read(&root->refs) > 1)
>  			btrfs_put_root(root);
> diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c
> index fe5e0026129d..b8137dbf6a3a 100644
> --- a/fs/btrfs/print-tree.c
> +++ b/fs/btrfs/print-tree.c
> @@ -26,22 +26,22 @@ static const struct root_name_map root_map[] = {
>  	{ BTRFS_DATA_RELOC_TREE_OBJECTID,	"DATA_RELOC_TREE"	},
>  };
>  
> -const char *btrfs_root_name(u64 objectid, char *buf)
> +const char *btrfs_root_name(struct btrfs_key *key, char *buf)
>  {
>  	int i;
>  
> -	if (objectid == BTRFS_TREE_RELOC_OBJECTID) {
> +	if (key->objectid == BTRFS_TREE_RELOC_OBJECTID) {
>  		snprintf(buf, BTRFS_ROOT_NAME_BUF_LEN,
> -			 "TREE_RELOC offset=%llu", objectid);
> +			 "TREE_RELOC offset=%llu", key->offset);
>  		return buf;
>  	}
>  
>  	for (i = 0; i < ARRAY_SIZE(root_map); i++) {
> -		if (root_map[i].id == objectid)
> +		if (root_map[i].id == key->objectid)
>  			return root_map[i].name;
>  	}
>  
> -	snprintf(buf, BTRFS_ROOT_NAME_BUF_LEN, "%llu", objectid);
> +	snprintf(buf, BTRFS_ROOT_NAME_BUF_LEN, "%llu", key->objectid);
>  	return buf;
>  }
>  
> diff --git a/fs/btrfs/print-tree.h b/fs/btrfs/print-tree.h
> index 78b99385a503..802628dd1a6e 100644
> --- a/fs/btrfs/print-tree.h
> +++ b/fs/btrfs/print-tree.h
> @@ -11,6 +11,6 @@
>  
>  void btrfs_print_leaf(struct extent_buffer *l);
>  void btrfs_print_tree(struct extent_buffer *c, bool follow);
> -const char *btrfs_root_name(u64 objectid, char *buf);
> +const char *btrfs_root_name(struct btrfs_key *key, char *buf);
>  
>  #endif
> 
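For readers without the tree at hand, a rough userspace sketch of why the
helper needs the whole key. The demo_* names, the buffer length and the
objectid constant are illustrative assumptions, not the kernel definitions,
and the lookup table of well-known roots is left out.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_ROOT_NAME_BUF_LEN 48
#define DEMO_TREE_RELOC_OBJECTID ((unsigned long long)-8)	/* illustrative value */

struct demo_key {
	unsigned long long objectid;
	unsigned long long offset;
};

/*
 * Every reloc root shares one objectid, so printing it says nothing about
 * which root is meant. The offset is what tells reloc roots apart.
 */
static const char *demo_root_name(const struct demo_key *key, char *buf)
{
	assert(key != NULL && buf != NULL);
	if (key->objectid == DEMO_TREE_RELOC_OBJECTID) {
		snprintf(buf, DEMO_ROOT_NAME_BUF_LEN,
			 "TREE_RELOC offset=%llu", key->offset);
		return buf;
	}
	snprintf(buf, DEMO_ROOT_NAME_BUF_LEN, "%llu", key->objectid);
	return buf;
}
```

Passing only the objectid, as the old signature did, leaves the reloc branch
with nothing but the shared constant to print.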


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 54/54] btrfs: splice remaining dirty_bg's onto the transaction dirty bg list
  2020-12-02 19:51 ` [PATCH v3 54/54] btrfs: splice remaining dirty_bg's onto the transaction dirty bg list Josef Bacik
@ 2020-12-03  5:47   ` Qu Wenruo
  0 siblings, 0 replies; 114+ messages in thread
From: Qu Wenruo @ 2020-12-03  5:47 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team


[-- Attachment #1.1: Type: text/plain, Size: 2861 bytes --]



On 2020/12/3 3:51 AM, Josef Bacik wrote:
> While doing error injection testing with my relocation patches I hit the
> following ASSERT()
> 
> assertion failed: list_empty(&block_group->dirty_list), in fs/btrfs/block-group.c:3356
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/ctree.h:3357!
> invalid opcode: 0000 [#1] SMP NOPTI
> CPU: 0 PID: 24351 Comm: umount Tainted: G        W         5.10.0-rc3+ #193
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
> RIP: 0010:assertfail.constprop.0+0x18/0x1a
> RSP: 0018:ffffa09b019c7e00 EFLAGS: 00010282
> RAX: 0000000000000056 RBX: ffff8f6492c18000 RCX: 0000000000000000
> RDX: ffff8f64fbc27c60 RSI: ffff8f64fbc19050 RDI: ffff8f64fbc19050
> RBP: ffff8f6483bbdc00 R08: 0000000000000000 R09: 0000000000000000
> R10: ffffa09b019c7c38 R11: ffffffff85d70928 R12: ffff8f6492c18100
> R13: ffff8f6492c18148 R14: ffff8f6483bbdd70 R15: dead000000000100
> FS:  00007fbfda4cdc40(0000) GS:ffff8f64fbc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fbfda666fd0 CR3: 000000013cf66002 CR4: 0000000000370ef0
> Call Trace:
>  btrfs_free_block_groups.cold+0x55/0x55
>  close_ctree+0x2c5/0x306
>  ? fsnotify_destroy_marks+0x14/0x100
>  generic_shutdown_super+0x6c/0x100
>  kill_anon_super+0x14/0x30
>  btrfs_kill_super+0x12/0x20
>  deactivate_locked_super+0x36/0xa0
>  cleanup_mnt+0x12d/0x190
>  task_work_run+0x5c/0xa0
>  exit_to_user_mode_prepare+0x1b1/0x1d0
>  syscall_exit_to_user_mode+0x54/0x280
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> This happened because I injected an error in btrfs_cow_block() while
> running the dirty block groups.  When we run the dirty block groups, we
> splice the list onto a local list to process.  However if an error
> occurs, we only cleanup the transactions dirty block group list, not any
> pending block groups we have on our locally spliced list.  Fix this by
> splicing the list back onto the transactions dirty block group list, so
> any remaining block groups are cleaned up.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>  fs/btrfs/block-group.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 0886e81e5540..5cfa52b1a3b8 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2685,6 +2685,9 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans)
>  		}
>  		spin_unlock(&cur_trans->dirty_bgs_lock);
>  	} else if (ret < 0) {
> +		spin_lock(&cur_trans->dirty_bgs_lock);
> +		list_splice_init(&dirty, &cur_trans->dirty_bgs);
> +		spin_unlock(&cur_trans->dirty_bgs_lock);
>  		btrfs_cleanup_dirty_bgs(cur_trans, fs_info);
>  	}
>  
> 
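The failure mode described in the changelog can be modelled with a toy
singly linked list in userspace. This is a sketch under assumptions: the
demo_* types and helpers are hypothetical, there is no locking, and the
"failure" is simulated by matching an id.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_bg {
	int id;
	struct demo_bg *next;
};

struct demo_list {
	struct demo_bg *head;
};

static void demo_push(struct demo_list *list, struct demo_bg *bg)
{
	bg->next = list->head;
	list->head = bg;
}

static struct demo_bg *demo_pop(struct demo_list *list)
{
	struct demo_bg *bg = list->head;

	assert(bg != NULL);
	list->head = bg->next;
	bg->next = NULL;
	return bg;
}

/* Move every entry from src to dst and leave src empty. */
static void demo_splice_init(struct demo_list *src, struct demo_list *dst)
{
	while (src->head)
		demo_push(dst, demo_pop(src));
}

static int demo_count(const struct demo_list *list)
{
	int n = 0;

	for (const struct demo_bg *bg = list->head; bg; bg = bg->next)
		n++;
	return n;
}

/*
 * Process the shared dirty list via a local copy. If the entry with id
 * fail_id "fails", hand the unprocessed entries back to the shared list so
 * that a later cleanup pass over the shared list can still find them.
 */
static int demo_start_dirty(struct demo_list *shared, int fail_id)
{
	struct demo_list local = { NULL };
	int ret = 0;

	demo_splice_init(shared, &local);
	while (local.head) {
		struct demo_bg *bg = demo_pop(&local);

		if (bg->id == fail_id) {
			ret = -EIO;
			break;
		}
	}
	if (ret < 0)
		demo_splice_init(&local, shared);
	return ret;
}
```

Without the final splice, whatever is still on the local list when the
function bails out would be invisible to any cleanup that walks only the
shared list.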


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 01/54] btrfs: fix error handling in commit_fs_roots
  2020-12-02 19:50 ` [PATCH v3 01/54] btrfs: fix error handling in commit_fs_roots Josef Bacik
  2020-12-03  1:45   ` Qu Wenruo
@ 2020-12-03  8:09   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Johannes Thumshirn @ 2020-12-03  8:09 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

On 02/12/2020 20:54, Josef Bacik wrote:
> While doing error injection I would sometimes get a corrupt file system.
> This is because I was injecting errors at btrfs_search_slot, but would
> only do it one time per stack.  This uncovered a problem in
> commit_fs_roots, where if we get an error we would just break.  However
> we're in a nested loop, the main loop being a loop to find all the dirty
> fs roots, and then subsequent root updates would succeed clearing the
> error value.
> 
> This isn't likely to happen in real scenarios, however we could
> potentially get a random ENOMEM once and then not again, and we'd end up
> with a corrupted file system.  Fix this by moving the error checking
> around a bit to the nested loop, as this is the only place where
> something will fail, and return the error as soon as it occurs.
> 
> With this patch my reproducer no longer corrupts the file system.

Better to abort the transaction than to corrupt the FS,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
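
The nested-loop problem described in the changelog, reduced to a standalone
sketch. The demo_* functions and the flat results array are hypothetical
stand-ins for the real root updates; only the control flow is the point.

```c
#include <assert.h>
#include <errno.h>

/*
 * Buggy shape: "break" only leaves the inner loop, so a later batch that
 * succeeds overwrites the earlier error.
 */
static int demo_commit_buggy(const int *results, int batches, int per_batch)
{
	int err = 0;

	assert(per_batch > 0);
	for (int b = 0; b < batches; b++) {
		for (int i = 0; i < per_batch; i++) {
			err = results[b * per_batch + i];
			if (err)
				break;
		}
	}
	return err;
}

/* Fixed shape: return the error from the inner loop as soon as it occurs. */
static int demo_commit_fixed(const int *results, int batches, int per_batch)
{
	assert(per_batch > 0);
	for (int b = 0; b < batches; b++) {
		for (int i = 0; i < per_batch; i++) {
			int err = results[b * per_batch + i];

			if (err)
				return err;
		}
	}
	return 0;
}
```

With two batches of two results, { 0, -ENOMEM } then { 0, 0 }, the buggy
version reports success while the fixed one reports -ENOMEM.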

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 02/54] btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block
  2020-12-02 19:50 ` [PATCH v3 02/54] btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block Josef Bacik
  2020-12-03  1:48   ` Qu Wenruo
@ 2020-12-03  8:21   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Johannes Thumshirn @ 2020-12-03  8:21 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 03/54] btrfs: fix lockdep splat in btrfs_recover_relocation
  2020-12-02 19:50 ` [PATCH v3 03/54] btrfs: fix lockdep splat in btrfs_recover_relocation Josef Bacik
  2020-12-03  1:49   ` Qu Wenruo
@ 2020-12-03  8:44   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Johannes Thumshirn @ 2020-12-03  8:44 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance
  2020-12-02 19:50 ` [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance Josef Bacik
  2020-12-03  2:06   ` Qu Wenruo
@ 2020-12-03  8:44   ` Johannes Thumshirn
  2020-12-03  9:00   ` Nikolay Borisov
  2 siblings, 0 replies; 114+ messages in thread
From: Johannes Thumshirn @ 2020-12-03  8:44 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 09/54] btrfs: don't clear ret in btrfs_start_dirty_block_groups
  2020-12-02 19:50 ` [PATCH v3 09/54] btrfs: don't clear ret in btrfs_start_dirty_block_groups Josef Bacik
  2020-12-03  2:13   ` Qu Wenruo
@ 2020-12-03  8:58   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Johannes Thumshirn @ 2020-12-03  8:58 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance
  2020-12-02 19:50 ` [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance Josef Bacik
  2020-12-03  2:06   ` Qu Wenruo
  2020-12-03  8:44   ` Johannes Thumshirn
@ 2020-12-03  9:00   ` Nikolay Borisov
  2020-12-03 17:04     ` Josef Bacik
  2 siblings, 1 reply; 114+ messages in thread
From: Nikolay Borisov @ 2020-12-03  9:00 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 2.12.20 at 21:50, Josef Bacik wrote:
> I was attempting to reproduce a problem that Zygo hit, but my error
> injection wasn't firing for a few of the common calls to
> btrfs_should_cancel_balance.  This is because the compiler decided to
> inline it at these spots.  Keep this from happening by explicitly
> noinline'ing the function so that error injection will always work.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/relocation.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 2b30e39e922a..ce935139d87b 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2617,7 +2617,7 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end,
>  /*
>   * Allow error injection to test balance cancellation
>   */
> -int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
> +noinline int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
>  {
>  	return atomic_read(&fs_info->balance_cancel_req) ||
>  		fatal_signal_pending(current);
> 

I'd really like to not pay the cost of non-inlining when error injection
is disabled. How about introducing a new noinline_for_err define that
adds noinline when error injection is enabled and expands to nothing when
injection is off? Alternatively, though that would be slightly more work,
the ALLOW_ERROR_INJECTION macro could be modified to take the function
declaration as well, so that its definition would look like:

ALLOW_ERROR_INJECTION(ftdec, fname, _etype)
// same body as before
// ...

noinline ftdec

and functions that want error injection could be defined as:

ALLOW_ERROR_INJECTION(int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info),
		      btrfs_should_cancel_balance, ERRNO)

Though that seems a bit unwieldy TBH.

I'm making the case that we shouldn't introduce extra overhead when it's
not required.
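
A rough userspace sketch of the first option, the conditional noinline
define. The assumptions are: DEMO_ERROR_INJECTION stands in for the kernel's
CONFIG_FUNCTION_ERROR_INJECTION option, the attribute syntax is the
GCC/Clang one, and the demo_* functions are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/*
 * Expands to a noinline attribute only when error injection is compiled in,
 * so builds without it leave the compiler free to inline.
 */
#ifdef DEMO_ERROR_INJECTION
#define noinline_for_err __attribute__((noinline))
#else
#define noinline_for_err
#endif

static int demo_cancel_req;

static noinline_for_err int demo_should_cancel(void)
{
	return demo_cancel_req != 0;
}

/* Runs up to "steps" units of work, stopping early if cancellation is requested. */
static int demo_balance(int steps)
{
	int done = 0;

	assert(steps >= 0);
	while (done < steps) {
		if (demo_should_cancel())
			return -ECANCELED;
		done++;
	}
	return done;
}
```

The behaviour is identical either way; only whether the call site can be
patched by an injection framework changes with the define.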

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 12/54] btrfs: return an error from btrfs_record_root_in_trans
  2020-12-02 19:50 ` [PATCH v3 12/54] btrfs: return an error from btrfs_record_root_in_trans Josef Bacik
  2020-12-03  2:20   ` Qu Wenruo
@ 2020-12-03 13:50   ` Johannes Thumshirn
  1 sibling, 0 replies; 114+ messages in thread
From: Johannes Thumshirn @ 2020-12-03 13:50 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

On 02/12/2020 20:54, Josef Bacik wrote:

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

In a separate future patch we might also want to reverse this if 
statement, so we save a level of indent.

>  	if ((test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
>  	    root->last_trans < trans->transid) || force) {


^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 04/54] btrfs: keep track of the root owner for relocation reads
  2020-12-03  2:04   ` Qu Wenruo
@ 2020-12-03 15:55     ` Josef Bacik
  0 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-03 15:55 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kernel-team

On 12/2/20 9:04 PM, Qu Wenruo wrote:
> 
> 
> On 2020/12/3 3:50 AM, Josef Bacik wrote:
>> While testing the error paths in relocation, I hit the following lockdep
>> splat
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 5.10.0-rc3+ #206 Not tainted
>> ------------------------------------------------------
>> btrfs-balance/1571 is trying to acquire lock:
>> ffff8cdbcc8f77d0 (&head_ref->mutex){+.+.}-{3:3}, at: btrfs_lookup_extent_info+0x156/0x3b0
>>
>> but task is already holding lock:
>> ffff8cdbc54adbf8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x27/0x100
>>
>> which lock already depends on the new lock.
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #2 (btrfs-tree-00){++++}-{3:3}:
>>         down_write_nested+0x43/0x80
>>         __btrfs_tree_lock+0x27/0x100
>>         btrfs_search_slot+0x248/0x890
>>         relocate_tree_blocks+0x490/0x650
>>         relocate_block_group+0x1ba/0x5d0
>>         kretprobe_trampoline+0x0/0x50
>>
>> -> #1 (btrfs-csum-01){++++}-{3:3}:
>>         down_read_nested+0x43/0x130
>>         __btrfs_tree_read_lock+0x27/0x100
>>         btrfs_read_lock_root_node+0x31/0x40
>>         btrfs_search_slot+0x5ab/0x890
>>         btrfs_del_csums+0x10b/0x3c0
>>         __btrfs_free_extent+0x49d/0x8e0
>>         __btrfs_run_delayed_refs+0x283/0x11f0
>>         btrfs_run_delayed_refs+0x86/0x220
>>         btrfs_start_dirty_block_groups+0x2ba/0x520
>>         kretprobe_trampoline+0x0/0x50
>>
>> -> #0 (&head_ref->mutex){+.+.}-{3:3}:
>>         __lock_acquire+0x1167/0x2150
>>         lock_acquire+0x116/0x3e0
>>         __mutex_lock+0x7e/0x7b0
>>         btrfs_lookup_extent_info+0x156/0x3b0
>>         walk_down_proc+0x1c3/0x280
>>         walk_down_tree+0x64/0xe0
>>         btrfs_drop_subtree+0x182/0x260
>>         do_relocation+0x52e/0x660
>>         relocate_tree_blocks+0x2ae/0x650
>>         relocate_block_group+0x1ba/0x5d0
>>         kretprobe_trampoline+0x0/0x50
>>
>> other info that might help us debug this:
>>
>> Chain exists of:
>>    &head_ref->mutex --> btrfs-csum-01 --> btrfs-tree-00
>>
>>   Possible unsafe locking scenario:
>>
>>         CPU0                    CPU1
>>         ----                    ----
>>    lock(btrfs-tree-00);
>>                                 lock(btrfs-csum-01);
>>                                 lock(btrfs-tree-00);
> 
> I find it a little confusing that subvolume trees get the name "tree".
> 
> Maybe another patch to rename it to something like "fs" or "subv" would
> be better?
> 
> [...]
>>
>> As you can see this is bogus, we never take another tree's lock under
>> the csum lock.  This happens because sometimes we have to read tree
>> blocks from disk without knowing which root they belong to during
>> relocation.  We defaulted to an owner of 0, which translates to an fs
>> tree.  This is fine as all fs trees have the same class, but obviously
>> isn't fine if the block belongs to a cow only tree.
>>
>> Thankfully cow only trees only have their owners root as a reference to
>> them, and since we already look up the extent information during
>> relocation, go ahead and check and see if this block might belong to a
>> cow only tree, and if so save the owner in the struct tree_block.  This
>> allows us to read_tree_block with the proper owner, which gets rid of
>> this lockdep splat.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> The fix is OK, although I have some extra comments inline below.
>> ---
>>   fs/btrfs/relocation.c | 47 ++++++++++++++++++++++++++++++++++++++++---
>>   1 file changed, 44 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index 19b7db8b2117..2b30e39e922a 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -98,6 +98,7 @@ struct tree_block {
>>   		u64 bytenr;
>>   	}; /* Use rb_simple_node for search/insert */
>>   	struct btrfs_key key;
>> +	u64 owner;
>>   	unsigned int level:8;
>>   	unsigned int key_ready:1;
>>   };
>> @@ -2393,8 +2394,8 @@ static int get_tree_block_key(struct btrfs_fs_info *fs_info,
>>   {
>>   	struct extent_buffer *eb;
>>   
>> -	eb = read_tree_block(fs_info, block->bytenr, 0, block->key.offset,
>> -			     block->level, NULL);
>> +	eb = read_tree_block(fs_info, block->bytenr, block->owner,
>> +			     block->key.offset, block->level, NULL);
>>   	if (IS_ERR(eb)) {
>>   		return PTR_ERR(eb);
>>   	} else if (!extent_buffer_uptodate(eb)) {
>> @@ -2493,7 +2494,8 @@ int relocate_tree_blocks(struct btrfs_trans_handle *trans,
>>   	/* Kick in readahead for tree blocks with missing keys */
>>   	rbtree_postorder_for_each_entry_safe(block, next, blocks, rb_node) {
>>   		if (!block->key_ready)
>> -			btrfs_readahead_tree_block(fs_info, block->bytenr, 0, 0,
>> +			btrfs_readahead_tree_block(fs_info, block->bytenr,
>> +						   block->owner, 0,
>>   						   block->level);
>>   	}
>>   
>> @@ -2801,21 +2803,59 @@ static int add_tree_block(struct reloc_control *rc,
>>   	u32 item_size;
>>   	int level = -1;
>>   	u64 generation;
>> +	u64 owner = 0;
>>   
>>   	eb =  path->nodes[0];
>>   	item_size = btrfs_item_size_nr(eb, path->slots[0]);
>>   
>>   	if (extent_key->type == BTRFS_METADATA_ITEM_KEY ||
>>   	    item_size >= sizeof(*ei) + sizeof(*bi)) {
>> +		unsigned long ptr = 0, end;
> 
> Do we really need that end to iterate through the extent item?
> 
> For cow-only trees, we only cow them to do the balance, which means
> metadata/extent item for them should only contain one inline item and no
> way to have keyed item.
> 
> If the metadata/extent item has more than one inline ref, it must not be
> for COW trees.
> 
> Can't we use extent item size as a quick check?

If you look further down you'll see that I only check the first inline ref, I 
don't loop through all of them.  I also don't bother to check if num_refs > 1, 
or if FULL_BACKREF is set.  The only time we actually check is if there is only 
one inline ref.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 21/54] btrfs: handle btrfs_record_root_in_trans failure in create_subvol
  2020-12-03  2:43   ` Qu Wenruo
@ 2020-12-03 16:06     ` Josef Bacik
  0 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-03 16:06 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kernel-team

On 12/2/20 9:43 PM, Qu Wenruo wrote:
> 
> 
> On 2020/12/3 3:50 AM, Josef Bacik wrote:
>> btrfs_record_root_in_trans will return errors in the future, so handle
>> the error properly in create_subvol.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/ioctl.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 703212ff50a5..ad50e654ee64 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -714,7 +714,11 @@ static noinline int create_subvol(struct inode *dir,
>>   	/* Freeing will be done in btrfs_put_root() of new_root */
>>   	anon_dev = 0;
>>   
>> -	btrfs_record_root_in_trans(trans, new_root);
>> +	ret = btrfs_record_root_in_trans(trans, new_root);
>> +	if (ret) {
> 
> Don't we need to call btrfs_put_root()? Or since we're going to abort
> transaction anyway, it doesn't matter that much any more?
> 

Nope you're right, and in fact it's a little broken without this patch as well, 
I'll fix the existing brokenness and fix this mistake as well.  Good catch!

Josef

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 26/54] btrfs: handle record_root_in_trans failure in create_pending_snapshot
  2020-12-03  2:56   ` Qu Wenruo
@ 2020-12-03 16:14     ` Josef Bacik
  0 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-03 16:14 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kernel-team

On 12/2/20 9:56 PM, Qu Wenruo wrote:
> 
> 
> On 2020/12/3 3:50 AM, Josef Bacik wrote:
>> record_root_in_trans can currently fail, so handle this failure
>> properly.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> 
> But I guess it would be better to fold patches 17~26 into one big patch,
> since each of them is really small.
> 

I don't like to do that because it makes it easier for us to just gloss over the 
change rather than checking each site.  You prove my point by noticing that I 
wasn't dropping the new_root ref in the error case for

   btrfs: handle btrfs_record_root_in_trans failure in create_subvol

It would have been easy for you to gloss over that change if it were in a giant 
patch.  I find it nice to have it in distinct patches so I'm forced to check the 
context of every patch I'm reviewing.  Thanks,

Josef


* Re: [PATCH v3 30/54] btrfs: validate ->reloc_root after recording root in trans
  2020-12-03  4:49   ` Qu Wenruo
@ 2020-12-03 16:18     ` Josef Bacik
  0 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-03 16:18 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kernel-team; +Cc: Zygo Blaxell

On 12/2/20 11:49 PM, Qu Wenruo wrote:
> 
> 
> On 2020/12/3 3:50 AM, Josef Bacik wrote:
>> If we fail to set up a ->reloc_root in a different thread, that path will
>> error out; however, it still leaves root->reloc_root NULL while the root
>> still appears set up in the transaction.  Subsequent calls to
>> btrfs_record_root_in_trans would succeed without attempting to
>> create the reloc root, as the transid has already been updated.  Handle
>> this case by making sure we have a root->reloc_root set after a
>> btrfs_record_root_in_trans call so we don't end up deref'ing a
>> NULL pointer.
>>
>> Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> The fix here is mostly based on the fact that pointer assignment is atomic.
> 
> But I'm wondering if we could do better by using something like a
> spinlock to make it more explicit.
> Or would such a root->reloc_lock be overkill?

We are essentially doing that already, as these checks are _after_ the 
btrfs_record_root_in_trans, which does the appropriate locking and such.  The 
"race" is resolved inside of that function itself, so we will either have 
->reloc_root set, or we won't, but it won't magically change between exiting 
btrfs_record_root_in_trans() and us checking root->reloc_root.  Thanks,

Josef
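
The failure mode from the commit message, and the check that follows the
record call, can be modelled in userspace as below. It is a sketch under
stated assumptions: the demo_* names, the last_trans field and the -ENOENT
return value are all hypothetical, and the locking done inside the real
btrfs_record_root_in_trans() is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_reloc_root {
	int unused;
};

struct demo_root {
	unsigned long last_trans;		/* transid the root was last recorded in */
	struct demo_reloc_root *reloc_root;	/* stays NULL if setup failed */
};

/*
 * Stand-in for btrfs_record_root_in_trans(): once last_trans matches the
 * running transaction it returns success without retrying the setup.
 */
static int demo_record_root_in_trans(struct demo_root *root,
				     unsigned long transid,
				     struct demo_reloc_root *new_reloc)
{
	assert(root != NULL);
	if (root->last_trans == transid)
		return 0;	/* already recorded: nothing is re-attempted */
	root->last_trans = transid;
	if (!new_reloc)
		return -ENOMEM;	/* setup failed, but last_trans is updated */
	root->reloc_root = new_reloc;
	return 0;
}

/* Caller-side check: a zero return is not proof that reloc_root is set. */
static int demo_use_reloc_root(struct demo_root *root, unsigned long transid,
			       struct demo_reloc_root *new_reloc)
{
	int ret = demo_record_root_in_trans(root, transid, new_reloc);

	if (ret)
		return ret;
	if (!root->reloc_root)
		return -ENOENT;	/* hypothetical errno chosen for the sketch */
	return 0;
}
```

The second call in the same transaction succeeds inside the record step, so
only the explicit NULL check keeps the caller from dereferencing a pointer
that was never set.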


* Re: [PATCH v3 41/54] btrfs: handle extent reference errors in do_relocation
  2020-12-03  5:15   ` Qu Wenruo
@ 2020-12-03 16:26     ` Josef Bacik
  0 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-03 16:26 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kernel-team

On 12/3/20 12:15 AM, Qu Wenruo wrote:
> 
> 
> On 2020/12/3 3:50 AM, Josef Bacik wrote:
>> We can already deal with errors appropriately from do_relocation, simply
>> handle any errors that come from changing the refs at this point
>> cleanly.  We have to abort the transaction if we fail here as we've
>> modified metadata at this point.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/relocation.c | 9 +++++----
>>   1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index ef33b89e352e..3159f6517588 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -2433,10 +2433,11 @@ static int do_relocation(struct btrfs_trans_handle *trans,
>>   			btrfs_init_tree_ref(&ref, node->level,
>>   					    btrfs_header_owner(upper->eb));
>>   			ret = btrfs_inc_extent_ref(trans, &ref);
>> -			BUG_ON(ret);
>> -
>> -			ret = btrfs_drop_subtree(trans, root, eb, upper->eb);
>> -			BUG_ON(ret);
>> +			if (ret) {
>> +				btrfs_abort_transaction(trans, ret);
>> +				goto next;
>> +			}
>> +			btrfs_drop_subtree(trans, root, eb, upper->eb);
> 
> Wait a second. Now we don't handle the error from btrfs_drop_subtree()
> at all?
> 

Lol I saw this yesterday and cleaned it up already, idk how the hell that 
happened, thanks,

Josef
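
One plausible shape for handling both return values is sketched below as a
userspace model. The cleaned-up hunk is not shown in this thread, so this is
an assumption about its intent rather than the actual fix; the demo_* names
are hypothetical, and inc_ret and drop_ret stand in for the results of
btrfs_inc_extent_ref() and btrfs_drop_subtree().

```c
#include <assert.h>
#include <errno.h>

/* Records the first error the transaction was aborted with (0 = none). */
struct demo_trans {
	int aborted;
};

static void demo_abort_transaction(struct demo_trans *trans, int err)
{
	assert(err != 0);
	if (!trans->aborted)
		trans->aborted = err;
}

/*
 * Models one step of the loop in do_relocation(): both calls modify
 * metadata, so a failure in either one aborts the transaction.
 */
static int demo_update_refs(struct demo_trans *trans, int inc_ret, int drop_ret)
{
	int ret;

	ret = inc_ret;
	if (ret) {
		demo_abort_transaction(trans, ret);
		return ret;
	}

	ret = drop_ret;	/* the result the posted hunk discarded */
	if (ret)
		demo_abort_transaction(trans, ret);
	return ret;
}
```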


* Re: [PATCH v3 44/54] btrfs: do proper error handling in create_reloc_inode
  2020-12-03  5:25   ` Qu Wenruo
@ 2020-12-03 16:34     ` Josef Bacik
  0 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-03 16:34 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kernel-team

On 12/3/20 12:25 AM, Qu Wenruo wrote:
> 
> 
> On 2020/12/3 3:51 AM, Josef Bacik wrote:
>> We already handle some errors in this function, and the callers do the
>> correct error handling, so clean up the rest of the function to do the
>> appropriate error handling.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/relocation.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index 8f4f1e21c770..bcced4e436af 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -3634,10 +3634,15 @@ struct inode *create_reloc_inode(struct btrfs_fs_info *fs_info,
>>   		goto out;
>>   
>>   	err = __insert_orphan_inode(trans, root, objectid);
>> -	BUG_ON(err);
>> +	if (err)
>> +		goto out;
>>   
>>   	inode = btrfs_iget(fs_info->sb, objectid, root);
>> -	BUG_ON(IS_ERR(inode));
>> +	if (IS_ERR(inode)) {
> 
> When an error happens here, we have already inserted an inode item into the
> data reloc root, without an orphan item to clean it up.
> 
> It won't cause any problem, since a u64 gives us room for an almost endless
> number of inodes in a mostly empty tree.
> 
> But I guess we'd still better try to delete the inserted inode item, or the
> data reloc tree may one day become a landfill of all those inode items.
> 
> 

Yeah we shouldn't be in the business of leaving random artifacts around, I'll 
fix this up.  Thanks,

Josef
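
The cleanup agreed on above amounts to undoing the insert when the later
lookup fails. The following is a minimal userspace sketch of that rollback,
assuming a simple counter in place of the data reloc tree; the demo_* names
are hypothetical and iget_ret stands in for the outcome of btrfs_iget().

```c
#include <assert.h>
#include <errno.h>

/* Counts the inode items currently present in the modelled tree. */
struct demo_tree {
	int inode_items;
};

static int demo_insert_inode_item(struct demo_tree *tree)
{
	tree->inode_items++;
	return 0;
}

static void demo_delete_inode_item(struct demo_tree *tree)
{
	assert(tree->inode_items > 0);
	tree->inode_items--;
}

/* Insert the item, then roll it back if the lookup step reports an error. */
static int demo_create_reloc_inode(struct demo_tree *tree, int iget_ret)
{
	int err = demo_insert_inode_item(tree);

	if (err)
		return err;
	if (iget_ret) {
		demo_delete_inode_item(tree);	/* leave nothing behind */
		return iget_ret;
	}
	return 0;
}
```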



* Re: [PATCH v3 47/54] btrfs: cleanup error handling in prepare_to_merge
  2020-12-03  5:39   ` Qu Wenruo
@ 2020-12-03 16:53     ` Josef Bacik
  0 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-03 16:53 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kernel-team

On 12/3/20 12:39 AM, Qu Wenruo wrote:
> 
> 
> On 2020/12/3 3:51 AM, Josef Bacik wrote:
>> This probably can't happen even with a corrupt file system, because we
>> would have failed much earlier on than here.  However there's no reason
>> we can't just check and bail out as appropriate, so do that and convert
>> the correctness BUG_ON() to an ASSERT().
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> The handling itself is kinda OK.
> 
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> 
> But I still have some (maybe unrelated) questions inline below.
>> ---
>>   fs/btrfs/relocation.c | 10 ++++++++--
>>   1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index 695a52cd07b0..d4656a8f507d 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -1870,8 +1870,14 @@ int prepare_to_merge(struct reloc_control *rc, int err)
>>   
>>   		root = btrfs_get_fs_root(fs_info, reloc_root->root_key.offset,
>>   				false);
>> -		BUG_ON(IS_ERR(root));
>> -		BUG_ON(root->reloc_root != reloc_root);
>> +		if (IS_ERR(root)) {
>> +			list_add(&reloc_root->root_list, &reloc_roots);
> 
> I found it pretty strange that even if prepare_to_merge() fails, we
> still go on to call merge_reloc_roots().
> 
> I guess we'd better handle that first?
> 

This is because the cleaning up of the rc->reloc_roots is dealt with in 
merge_reloc_roots().  It's kinda shitty, but something I'm going to address 
later when I rework all of this code.  I tried to limit the scope of this 
patchset to purely the error handling, and then I'll clean up the awfulness in a 
more complicated follow up patchset.  Thanks,

Josef
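
The reason for putting the reloc root back on the list before bailing out can
be shown with a simplified userspace model. This is a sketch only: it uses a
hand-rolled singly linked list instead of the kernel's list_head, the demo_*
names are hypothetical, and the lookup_fails flag stands in for
btrfs_get_fs_root() returning an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_reloc_root {
	int lookup_fails;		/* pretend the fs root lookup fails */
	struct demo_reloc_root *next;
};

static void demo_push(struct demo_reloc_root **list, struct demo_reloc_root *rr)
{
	assert(rr->next == NULL);
	rr->next = *list;
	*list = rr;
}

static struct demo_reloc_root *demo_pop(struct demo_reloc_root **list)
{
	struct demo_reloc_root *rr = *list;

	if (rr) {
		*list = rr->next;
		rr->next = NULL;
	}
	return rr;
}

static int demo_count(const struct demo_reloc_root *list)
{
	int n = 0;

	for (; list; list = list->next)
		n++;
	return n;
}

/*
 * Models the loop in prepare_to_merge(): each root is taken off the
 * pending list and, even on failure, re-added to a local list that is
 * handed back at the end so the later cleanup still sees every root.
 */
static int demo_prepare_to_merge(struct demo_reloc_root **pending)
{
	struct demo_reloc_root *done = NULL, *rr;
	int ret = 0;

	while ((rr = demo_pop(pending)) != NULL) {
		if (rr->lookup_fails) {
			demo_push(&done, rr);	/* re-add so cleanup finds it */
			ret = -ENOENT;
			break;
		}
		demo_push(&done, rr);
	}
	while ((rr = demo_pop(&done)) != NULL)
		demo_push(pending, rr);	/* hand all roots back */
	return ret;
}
```

After a failure every root is still reachable from the pending list, which is
what lets a later cleanup pass, merge_reloc_roots() in the thread above, free
them.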


* Re: [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance
  2020-12-03  9:00   ` Nikolay Borisov
@ 2020-12-03 17:04     ` Josef Bacik
  0 siblings, 0 replies; 114+ messages in thread
From: Josef Bacik @ 2020-12-03 17:04 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team

On 12/3/20 4:00 AM, Nikolay Borisov wrote:
> 
> 
> On 2.12.20 at 21:50, Josef Bacik wrote:
>> I was attempting to reproduce a problem that Zygo hit, but my error
>> injection wasn't firing for a few of the common calls to
>> btrfs_should_cancel_balance.  This is because the compiler decided to
>> inline it at these spots.  Keep this from happening by explicitly
>> noinline'ing the function so that error injection will always work.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/relocation.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index 2b30e39e922a..ce935139d87b 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -2617,7 +2617,7 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end,
>>   /*
>>    * Allow error injection to test balance cancellation
>>    */
>> -int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
>> +noinline int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
>>   {
>>   	return atomic_read(&fs_info->balance_cancel_req) ||
>>   		fatal_signal_pending(current);
>>
> 
> I'd really like to not pay the cost of non-inlining when error injection is
> disabled. How about introducing a new noinline_for_err define that adds
> noinline when error injection is enabled and is optimized away when
> injection is off? Alternatively, though that would be slightly more work, the
> ALLOW_ERROR_INJECTION macro could be modified so that all functions that want
> error injection are declared as:
> 
> ALLOW_ERROR_INJECTION(ftdec, fname, _etype)
> // same body as before
> //
> //
> //
> 
> noinline ftdec
> 
> 
> so functions could be defined as :
> 
> 
> ALLOW_ERROR_INJECTION(int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info), btrfs_should_cancel_balance,  ERRNO)
> 
> Though that seems a bit unwieldy TBH.
> 
> I'm making the case that we shouldn't introduce extra overhead when it's not required.
> 

This is something we could address later on a global scale, but this isn't a hot 
path function.  Alexei had to do something similar for 
__add_to_page_cache_locked; for now let's stick with this and I'll work with the 
bpf guys to figure out a reasonable solution.  Thanks,

Josef
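
The noinline_for_err idea from the review could look roughly like the
userspace sketch below. It is not an existing kernel macro: the name comes
from the suggestion above, DEMO_ERROR_INJECTION is a hypothetical switch
standing in for the kernel's CONFIG_FUNCTION_ERROR_INJECTION, and the
attribute syntax assumes GCC or Clang.

```c
#include <assert.h>

/*
 * Expands to a noinline attribute only when error injection is built in,
 * so normal builds leave the inlining decision to the compiler.
 */
#ifdef DEMO_ERROR_INJECTION
#define noinline_for_err __attribute__((noinline))
#else
#define noinline_for_err
#endif

/* Hypothetical stand-in for the relevant part of struct btrfs_fs_info. */
struct demo_fs_info {
	int balance_cancel_req;
};

/* Forced out of line only when injection needs a real call site to hook. */
noinline_for_err int demo_should_cancel_balance(const struct demo_fs_info *fs_info)
{
	return fs_info->balance_cancel_req != 0;
}
```

Building with -DDEMO_ERROR_INJECTION keeps the function out of line; without
it the macro expands to nothing and the behaviour of the function is
unchanged.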


Thread overview: 114+ messages
2020-12-02 19:50 [PATCH v3 00/54] Cleanup error handling in relocation Josef Bacik
2020-12-02 19:50 ` [PATCH v3 01/54] btrfs: fix error handling in commit_fs_roots Josef Bacik
2020-12-03  1:45   ` Qu Wenruo
2020-12-03  8:09   ` Johannes Thumshirn
2020-12-02 19:50 ` [PATCH v3 02/54] btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block Josef Bacik
2020-12-03  1:48   ` Qu Wenruo
2020-12-03  8:21   ` Johannes Thumshirn
2020-12-02 19:50 ` [PATCH v3 03/54] btrfs: fix lockdep splat in btrfs_recover_relocation Josef Bacik
2020-12-03  1:49   ` Qu Wenruo
2020-12-03  8:44   ` Johannes Thumshirn
2020-12-02 19:50 ` [PATCH v3 04/54] btrfs: keep track of the root owner for relocation reads Josef Bacik
2020-12-03  2:04   ` Qu Wenruo
2020-12-03 15:55     ` Josef Bacik
2020-12-02 19:50 ` [PATCH v3 05/54] btrfs: noinline btrfs_should_cancel_balance Josef Bacik
2020-12-03  2:06   ` Qu Wenruo
2020-12-03  8:44   ` Johannes Thumshirn
2020-12-03  9:00   ` Nikolay Borisov
2020-12-03 17:04     ` Josef Bacik
2020-12-02 19:50 ` [PATCH v3 06/54] btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node Josef Bacik
2020-12-03  2:08   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 07/54] btrfs: pass down the tree block level through ref-verify Josef Bacik
2020-12-02 19:50 ` [PATCH v3 08/54] btrfs: make sure owner is set in ref-verify Josef Bacik
2020-12-02 19:50 ` [PATCH v3 09/54] btrfs: don't clear ret in btrfs_start_dirty_block_groups Josef Bacik
2020-12-03  2:13   ` Qu Wenruo
2020-12-03  8:58   ` Johannes Thumshirn
2020-12-02 19:50 ` [PATCH v3 10/54] btrfs: convert some BUG_ON()'s to ASSERT()'s in do_relocation Josef Bacik
2020-12-03  2:14   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 11/54] btrfs: convert BUG_ON()'s in relocate_tree_block Josef Bacik
2020-12-03  2:15   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 12/54] btrfs: return an error from btrfs_record_root_in_trans Josef Bacik
2020-12-03  2:20   ` Qu Wenruo
2020-12-03 13:50   ` Johannes Thumshirn
2020-12-02 19:50 ` [PATCH v3 13/54] btrfs: handle errors from select_reloc_root() Josef Bacik
2020-12-03  2:23   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 14/54] btrfs: convert BUG_ON()'s in select_reloc_root() to proper errors Josef Bacik
2020-12-03  2:29   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 15/54] btrfs: check record_root_in_trans related failures in select_reloc_root Josef Bacik
2020-12-03  2:33   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 16/54] btrfs: do proper error handling in record_reloc_root_in_trans Josef Bacik
2020-12-03  2:39   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 17/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename_exchange Josef Bacik
2020-12-03  2:40   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 18/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_rename Josef Bacik
2020-12-02 19:50 ` [PATCH v3 19/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_delete_subvolume Josef Bacik
2020-12-03  2:41   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 20/54] btrfs: handle btrfs_record_root_in_trans failure in btrfs_recover_log_trees Josef Bacik
2020-12-03  2:42   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 21/54] btrfs: handle btrfs_record_root_in_trans failure in create_subvol Josef Bacik
2020-12-03  2:43   ` Qu Wenruo
2020-12-03 16:06     ` Josef Bacik
2020-12-02 19:50 ` [PATCH v3 22/54] btrfs: btrfs: handle btrfs_record_root_in_trans failure in relocate_tree_block Josef Bacik
2020-12-03  2:44   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 23/54] btrfs: handle btrfs_record_root_in_trans failure in start_transaction Josef Bacik
2020-12-03  2:47   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 24/54] btrfs: handle record_root_in_trans failure in qgroup_account_snapshot Josef Bacik
2020-12-03  2:48   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 25/54] btrfs: handle record_root_in_trans failure in btrfs_record_root_in_trans Josef Bacik
2020-12-02 19:50 ` [PATCH v3 26/54] btrfs: handle record_root_in_trans failure in create_pending_snapshot Josef Bacik
2020-12-03  2:56   ` Qu Wenruo
2020-12-03 16:14     ` Josef Bacik
2020-12-02 19:50 ` [PATCH v3 27/54] btrfs: do not panic in __add_reloc_root Josef Bacik
2020-12-03  3:00   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 28/54] btrfs: have proper error handling in btrfs_init_reloc_root Josef Bacik
2020-12-02 19:50 ` [PATCH v3 29/54] btrfs: do proper error handling in create_reloc_root Josef Bacik
2020-12-03  3:29   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 30/54] btrfs: validate ->reloc_root after recording root in trans Josef Bacik
2020-12-03  4:49   ` Qu Wenruo
2020-12-03 16:18     ` Josef Bacik
2020-12-02 19:50 ` [PATCH v3 31/54] btrfs: handle btrfs_update_reloc_root failure in commit_fs_roots Josef Bacik
2020-12-03  4:51   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 32/54] btrfs: change insert_dirty_subvol to return errors Josef Bacik
2020-12-02 19:50 ` [PATCH v3 33/54] btrfs: handle btrfs_update_reloc_root failure in insert_dirty_subvol Josef Bacik
2020-12-02 19:50 ` [PATCH v3 34/54] btrfs: handle btrfs_update_reloc_root failure in prepare_to_merge Josef Bacik
2020-12-02 19:50 ` [PATCH v3 35/54] btrfs: do proper error handling in btrfs_update_reloc_root Josef Bacik
2020-12-03  4:54   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 36/54] btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s Josef Bacik
2020-12-03  4:55   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 37/54] btrfs: handle initial btrfs_cow_block error in replace_path Josef Bacik
2020-12-03  5:05   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 38/54] btrfs: handle the loop " Josef Bacik
2020-12-03  5:11   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 39/54] btrfs: handle btrfs_search_slot failure " Josef Bacik
2020-12-03  5:13   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 40/54] btrfs: handle errors in reference count manipulation " Josef Bacik
2020-12-03  5:14   ` Qu Wenruo
2020-12-02 19:50 ` [PATCH v3 41/54] btrfs: handle extent reference errors in do_relocation Josef Bacik
2020-12-03  5:15   ` Qu Wenruo
2020-12-03 16:26     ` Josef Bacik
2020-12-02 19:51 ` [PATCH v3 42/54] btrfs: check for BTRFS_BLOCK_FLAG_FULL_BACKREF being set improperly Josef Bacik
2020-12-03  5:19   ` Qu Wenruo
2020-12-02 19:51 ` [PATCH v3 43/54] btrfs: remove the extent item sanity checks in relocate_block_group Josef Bacik
2020-12-03  5:20   ` Qu Wenruo
2020-12-02 19:51 ` [PATCH v3 44/54] btrfs: do proper error handling in create_reloc_inode Josef Bacik
2020-12-03  5:25   ` Qu Wenruo
2020-12-03 16:34     ` Josef Bacik
2020-12-02 19:51 ` [PATCH v3 45/54] btrfs: handle __add_reloc_root failure in btrfs_recover_relocation Josef Bacik
2020-12-03  5:32   ` Qu Wenruo
2020-12-02 19:51 ` [PATCH v3 46/54] btrfs: handle __add_reloc_root failure in btrfs_reloc_post_snapshot Josef Bacik
2020-12-03  5:34   ` Qu Wenruo
2020-12-02 19:51 ` [PATCH v3 47/54] btrfs: cleanup error handling in prepare_to_merge Josef Bacik
2020-12-03  5:39   ` Qu Wenruo
2020-12-03 16:53     ` Josef Bacik
2020-12-02 19:51 ` [PATCH v3 48/54] btrfs: handle extent corruption with select_one_root properly Josef Bacik
2020-12-03  5:40   ` Qu Wenruo
2020-12-02 19:51 ` [PATCH v3 49/54] btrfs: do proper error handling in merge_reloc_roots Josef Bacik
2020-12-03  5:42   ` Qu Wenruo
2020-12-02 19:51 ` [PATCH v3 50/54] btrfs: check return value of btrfs_commit_transaction in relocation Josef Bacik
2020-12-03  5:42   ` Qu Wenruo
2020-12-02 19:51 ` [PATCH v3 51/54] btrfs: do not WARN_ON() if we can't find the reloc root Josef Bacik
2020-12-02 19:51 ` [PATCH v3 52/54] btrfs: print the actual offset in btrfs_root_name Josef Bacik
2020-12-03  5:44   ` Qu Wenruo
2020-12-02 19:51 ` [PATCH v3 53/54] btrfs: fix reloc root leak with 0 ref reloc roots on recovery Josef Bacik
2020-12-02 19:51 ` [PATCH v3 54/54] btrfs: splice remaining dirty_bg's onto the transaction dirty bg list Josef Bacik
2020-12-03  5:47   ` Qu Wenruo
