* [PATCH 0/2] btrfs: balance root leak and runaway balance fix
@ 2020-05-20  6:58 Qu Wenruo
  2020-05-20  6:58 ` [PATCH 1/2] btrfs: relocation: Fix reloc root leakage and the NULL pointer reference caused by the leakage Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Qu Wenruo @ 2020-05-20  6:58 UTC (permalink / raw)
  To: linux-btrfs

This patchset will fix the most wanted balance bug, runaway balance.
All my fault, and all small fixes.

The first patch fixes the root leakage and NULL pointer dereference
caused by it.

The second patch fixes the runaway balance and adds a warning to catch
such problems if they happen again.
The runaway fix depends on the root leakage fix, thus they are sent in a
patchset.

The first patch is just resent without any modification.

For backports to older kernels, the first patch needs a small
modification to use atomic_t rather than refcount_t.

Qu Wenruo (2):
  btrfs: relocation: Fix reloc root leakage and the NULL pointer
    reference caused by the leakage
  btrfs: relocation: Clear the DEAD_RELOC_TREE bit for orphan roots to
    prevent runaway balance

 fs/btrfs/disk-io.c    |  1 +
 fs/btrfs/relocation.c | 15 ++++++++++++---
 2 files changed, 13 insertions(+), 3 deletions(-)

-- 
2.26.2



* [PATCH 1/2] btrfs: relocation: Fix reloc root leakage and the NULL pointer reference caused by the leakage
  2020-05-20  6:58 [PATCH 0/2] btrfs: balance root leak and runaway balance fix Qu Wenruo
@ 2020-05-20  6:58 ` Qu Wenruo
  2020-05-20  6:58 ` [PATCH 2/2] btrfs: relocation: Clear the DEAD_RELOC_TREE bit for orphan roots to prevent runaway balance Qu Wenruo
  2020-05-22 11:13 ` [PATCH 0/2] btrfs: balance root leak and runaway balance fix David Sterba
  2 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2020-05-20  6:58 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
When balance is canceled, there is a pretty high chance that unmounting
the fs leads to a NULL pointer dereference:

  BTRFS warning (device dm-3): page private not zero on page 223158272
  ...
  BTRFS warning (device dm-3): page private not zero on page 223162368
  BTRFS error (device dm-3): leaked root 18446744073709551608-304 refcount 1
  BUG: kernel NULL pointer dereference, address: 0000000000000168
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 2 PID: 5793 Comm: umount Tainted: G           O      5.7.0-rc5-custom+ #53
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:__lock_acquire+0x5dc/0x24c0
  Call Trace:
   lock_acquire+0xab/0x390
   _raw_spin_lock+0x39/0x80
   btrfs_release_extent_buffer_pages+0xd7/0x200 [btrfs]
   release_extent_buffer+0xb2/0x170 [btrfs]
   free_extent_buffer+0x66/0xb0 [btrfs]
   btrfs_put_root+0x8e/0x130 [btrfs]
   btrfs_check_leaked_roots.cold+0x5/0x5d [btrfs]
   btrfs_free_fs_info+0xe5/0x120 [btrfs]
   btrfs_kill_super+0x1f/0x30 [btrfs]
   deactivate_locked_super+0x3b/0x80
   deactivate_super+0x3e/0x50
   cleanup_mnt+0x109/0x160
   __cleanup_mnt+0x12/0x20
   task_work_run+0x67/0xa0
   exit_to_usermode_loop+0xc5/0xd0
   syscall_return_slowpath+0x205/0x360
   do_syscall_64+0x6e/0xb0
   entry_SYSCALL_64_after_hwframe+0x49/0xb3
  RIP: 0033:0x7fd028ef740b

[CAUSE]
When balance is canceled, all reloc roots are marked orphan, and orphan
reloc roots are going to be cleaned up.

However, for orphan reloc roots and merged reloc roots, their lifespans
are quite different:
	Merged reloc roots	|	Orphan reloc roots by cancel
--------------------------------------------------------------------
create_reloc_root()		| create_reloc_root()
|- refs == 1			| |- refs == 1
				|
btrfs_grab_root(reloc_root);	| btrfs_grab_root(reloc_root);
|- refs == 2			| |- refs == 2
				|
root->reloc_root = reloc_root;	| root->reloc_root = reloc_root;
		>>> No difference so far <<<
				|
prepare_to_merge()		| prepare_to_merge()
|- btrfs_set_root_refs(item, 1);| |- if (!err) (err == -EINTR)
				|
merge_reloc_roots()		| merge_reloc_roots()
|- merge_reloc_root()		| |- Doing nothing to put reloc root
   |- insert_dirty_subvol()	| |- refs == 2
      |- __del_reloc_root()	|
         |- btrfs_put_root()	|
            |- refs == 1	|
		>>> Now orphan reloc roots still have refs 2 <<<
				|
clean_dirty_subvols()		| clean_dirty_subvols()
|- btrfs_drop_snapshot()	| |- btrfs_drop_snapshot()
   |- reloc_root get freed	|    |- reloc_root still has refs 2
				|	related ebs get freed, but
				|	reloc_root still recorded in
				|	allocated_roots
btrfs_check_leaked_roots()	| btrfs_check_leaked_roots()
|- No leaked roots		| |- Leaked reloc_roots detected
				| |- btrfs_put_root()
				|    |- free_extent_buffer(root->node);
				|       |- eb already freed, caused NULL
				|	   pointer dereference

[FIX]
The fix is to clear fs_root->reloc_root and put it at
merge_reloc_roots() time, so that we won't leak reloc roots.
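
The reference counting in the table above can be followed with a small
userspace model. This is only a sketch, not kernel code: struct toy_root,
toy_grab(), toy_put() and toy_orphan_refs_at_umount() are hypothetical
names, and the assumption that btrfs_drop_snapshot() drops exactly one
reference is read off the table rather than from the real call chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct btrfs_root; all names here are hypothetical. */
struct toy_root {
	int refs;                    /* models root->refs */
	struct toy_root *reloc_root; /* models root->reloc_root */
};

static void toy_grab(struct toy_root *r)
{
	r->refs++;
}

static void toy_put(struct toy_root *r)
{
	assert(r->refs > 0);
	r->refs--;
}

/*
 * Walk the "orphan reloc roots by cancel" column of the table and return
 * the number of references still held on the reloc root by the time the
 * leaked-root check would run.
 */
static int toy_orphan_refs_at_umount(bool with_fix)
{
	struct toy_root fs_root = { .refs = 1, .reloc_root = NULL };
	struct toy_root reloc = { .refs = 0, .reloc_root = NULL };

	/* create_reloc_root(): refs == 1 */
	reloc.refs = 1;
	/* btrfs_grab_root(reloc_root): refs == 2 */
	toy_grab(&reloc);
	fs_root.reloc_root = &reloc;

	/* merge_reloc_roots(), branch taken when the root item refs are 0 */
	if (with_fix && fs_root.reloc_root == &reloc) {
		fs_root.reloc_root = NULL;
		toy_put(&reloc);
	}

	/* clean_dirty_subvols(): assumed to drop one reference */
	toy_put(&reloc);

	return reloc.refs;
}
```

Without the fix the model ends with one reference left, which matches
the "leaked root ... refcount 1" line in the log above. With the fix the
count reaches zero before the leak check.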

Fixes: d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/relocation.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 9afc1a6928cf..420348606123 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1917,12 +1917,11 @@ void merge_reloc_roots(struct reloc_control *rc)
 		reloc_root = list_entry(reloc_roots.next,
 					struct btrfs_root, root_list);
 
+		root = read_fs_root(fs_info,
+				    reloc_root->root_key.offset);
 		if (btrfs_root_refs(&reloc_root->root_item) > 0) {
-			root = read_fs_root(fs_info,
-					    reloc_root->root_key.offset);
 			BUG_ON(IS_ERR(root));
 			BUG_ON(root->reloc_root != reloc_root);
-
 			ret = merge_reloc_root(rc, root);
 			btrfs_put_root(root);
 			if (ret) {
@@ -1932,6 +1931,14 @@ void merge_reloc_roots(struct reloc_control *rc)
 				goto out;
 			}
 		} else {
+			if (!IS_ERR(root)) {
+				if (root->reloc_root == reloc_root) {
+					root->reloc_root = NULL;
+					btrfs_put_root(reloc_root);
+				}
+				btrfs_put_root(root);
+			}
+
 			list_del_init(&reloc_root->root_list);
 			/* Don't forget to queue this reloc root for cleanup */
 			list_add_tail(&reloc_root->reloc_dirty_list,
-- 
2.26.2



* [PATCH 2/2] btrfs: relocation: Clear the DEAD_RELOC_TREE bit for orphan roots to prevent runaway balance
  2020-05-20  6:58 [PATCH 0/2] btrfs: balance root leak and runaway balance fix Qu Wenruo
  2020-05-20  6:58 ` [PATCH 1/2] btrfs: relocation: Fix reloc root leakage and the NULL pointer reference caused by the leakage Qu Wenruo
@ 2020-05-20  6:58 ` Qu Wenruo
  2020-05-22 11:13 ` [PATCH 0/2] btrfs: balance root leak and runaway balance fix David Sterba
  2 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2020-05-20  6:58 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
There are several reports of runaway balance, where balance floods the
kernel log with "Found X extents" and the X never changes.

[CAUSE]
Commit d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after
merge_reloc_roots") introduced BTRFS_ROOT_DEAD_RELOC_TREE bit to
indicate that one subvolume has finished its tree blocks swap with its
reloc tree.

However if balance is canceled or hits ENOSPC halfway, we didn't clear
the BTRFS_ROOT_DEAD_RELOC_TREE bit, leaving that bit hanging forever
until unmount.

Any subvolume root with that bit set causes the backref cache to skip
its tree blocks, on the assumption that the root has already finished
its tree block swap.
As a result, all tree blocks of that root are ignored by balance,
leading to runaway balance.
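
The effect of a stale bit can be pictured with a toy model. It is a
deliberately simplified sketch and not how relocation is implemented:
struct toy_subvol, toy_relocate_pass(), toy_balance() and TOY_MAX_PASSES
are all hypothetical, and a whole pass is reduced to "move everything
unless the root is skipped".

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_PASSES 8 /* hypothetical cap so the model always terminates */

struct toy_subvol {
	bool dead_reloc_tree; /* models BTRFS_ROOT_DEAD_RELOC_TREE */
	int blocks_left;      /* tree blocks still inside the block group */
};

/* One relocation pass: returns the number of extents "found". */
static int toy_relocate_pass(struct toy_subvol *sv)
{
	int found;

	assert(sv->blocks_left >= 0);
	found = sv->blocks_left;

	/* A root flagged as already swapped is skipped, so nothing moves. */
	if (!sv->dead_reloc_tree)
		sv->blocks_left = 0;

	return found;
}

/*
 * Repeat passes until one finds nothing. Returns the number of passes
 * used, or -1 if the block group never drains within the cap.
 */
static int toy_balance(struct toy_subvol *sv)
{
	for (int pass = 1; pass <= TOY_MAX_PASSES; pass++) {
		if (toy_relocate_pass(sv) == 0)
			return pass;
	}
	return -1;
}
```

With the bit left set, every pass finds the same number of extents and
the loop never finishes on its own, which is the "Found X extents"
flood. With the bit cleared, the second pass finds nothing and the model
stops.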

[FIX]
Fix the problem by also clearing the BTRFS_ROOT_DEAD_RELOC_TREE bit for
the original subvolume of an orphan reloc root.

Furthermore, to keep this bug from biting anyone again, add an unmount
time check to detect and warn about this bit.

Fixes: d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/disk-io.c    | 1 +
 fs/btrfs/relocation.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index fced949b150c..e6def8fc87dd 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1996,6 +1996,7 @@ void btrfs_put_root(struct btrfs_root *root)
 
 	if (refcount_dec_and_test(&root->refs)) {
 		WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
+		WARN_ON(test_bit(BTRFS_ROOT_DEAD_RELOC_TREE, &root->state));
 		if (root->anon_dev)
 			free_anon_bdev(root->anon_dev);
 		btrfs_drew_lock_destroy(&root->snapshot_lock);
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 420348606123..595097c4a060 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1936,6 +1936,8 @@ void merge_reloc_roots(struct reloc_control *rc)
 					root->reloc_root = NULL;
 					btrfs_put_root(reloc_root);
 				}
+				clear_bit(BTRFS_ROOT_DEAD_RELOC_TREE,
+					  &root->state);
 				btrfs_put_root(root);
 			}
 
-- 
2.26.2



* Re: [PATCH 0/2] btrfs: balance root leak and runaway balance fix
  2020-05-20  6:58 [PATCH 0/2] btrfs: balance root leak and runaway balance fix Qu Wenruo
  2020-05-20  6:58 ` [PATCH 1/2] btrfs: relocation: Fix reloc root leakage and the NULL pointer reference caused by the leakage Qu Wenruo
  2020-05-20  6:58 ` [PATCH 2/2] btrfs: relocation: Clear the DEAD_RELOC_TREE bit for orphan roots to prevent runaway balance Qu Wenruo
@ 2020-05-22 11:13 ` David Sterba
  2020-07-23 21:54   ` Zygo Blaxell
  2 siblings, 1 reply; 7+ messages in thread
From: David Sterba @ 2020-05-22 11:13 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, May 20, 2020 at 02:58:49PM +0800, Qu Wenruo wrote:
> This patchset will fix the most wanted balance bug, runaway balance.
> All my fault, and all small fixes.

Well, that happens.

d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")

is the most broken patch in recent history (5.1+), there were so many
fixups but hopefully this is the last one. I've tagged the patches for
5.1+ stable but we'll need manual backports due to the root refcount
changes in 5.7.

I reproduced the umount crash and verified the fix, the runaway balance
does not happen anymore in the test so I guess we have all the needed
fixes in place to allow the fast balance cancel. Thanks.


* Re: [PATCH 0/2] btrfs: balance root leak and runaway balance fix
  2020-05-22 11:13 ` [PATCH 0/2] btrfs: balance root leak and runaway balance fix David Sterba
@ 2020-07-23 21:54   ` Zygo Blaxell
  2020-07-24  0:05     ` Qu Wenruo
  0 siblings, 1 reply; 7+ messages in thread
From: Zygo Blaxell @ 2020-07-23 21:54 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs

On Fri, May 22, 2020 at 01:13:47PM +0200, David Sterba wrote:
> On Wed, May 20, 2020 at 02:58:49PM +0800, Qu Wenruo wrote:
> > This patchset will fix the most wanted balance bug, runaway balance.
> > All my fault, and all small fixes.
> 
> Well, that happens.
> 
> d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
> 
> is the most broken patch in recent history (5.1+), there were so many
> fixups but hopefully this is the last one. I've tagged the patches for
> 5.1+ stable but we'll need manual backports due to the root refcount
> changes in 5.7.

The patch 1dae7e0e58b4 "btrfs: reloc: clear DEAD_RELOC_TREE bit for
orphan roots to prevent runaway balance" does apply to 5.7 itself, but
it is not present in 5.7.10.  I've been running it in test (and even a
few pre-prod) systems since May.

We still get someone in IRC with a runaway balance every week or so.
Currently we can only tell them to wait for 5.8, or roll all the way
back to 4.19.

> I reproduced the umount crash and verified the fix, the runaway balance
> does not happen anymore in the test so I guess we have all the needed
> fixes in place to allow the fast balance cancel. Thanks.


* Re: [PATCH 0/2] btrfs: balance root leak and runaway balance fix
  2020-07-23 21:54   ` Zygo Blaxell
@ 2020-07-24  0:05     ` Qu Wenruo
  2020-07-24  9:33       ` David Sterba
  0 siblings, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2020-07-24  0:05 UTC (permalink / raw)
  To: Zygo Blaxell, dsterba, Qu Wenruo, linux-btrfs





On 2020/7/24 5:54 AM, Zygo Blaxell wrote:
> On Fri, May 22, 2020 at 01:13:47PM +0200, David Sterba wrote:
>> On Wed, May 20, 2020 at 02:58:49PM +0800, Qu Wenruo wrote:
>>> This patchset will fix the most wanted balance bug, runaway balance.
>>> All my fault, and all small fixes.
>>
>> Well, that happens.
>>
>> d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
>>
>> is the most broken patch in recent history (5.1+), there were so many
>> fixups but hopefully this is the last one. I've tagged the patches for
>> 5.1+ stable but we'll need manual backports due to the root refcount
>> changes in 5.7.
> 
> The patch 1dae7e0e58b4 "btrfs: reloc: clear DEAD_RELOC_TREE bit for
> orphan roots to prevent runaway balance" does apply to 5.7 itself, but
> it is not present in 5.7.10.  I've been running it in test (and even a
> few pre-prod) systems since May.

Strange, I see no mail about merge failure nor merge success.

I'll send the backport manually to all older branches.

BTW, what's the proper tag for stable branch ranges?

Thanks,
Qu

> 
> We still get someone in IRC with a runaway balance every week or so.
> Currently we can only tell them to wait for 5.8, or roll all the way
> back to 4.19.
> 
>> I reproduced the umount crash and verified the fix, the runaway balance
>> does not happen anymore in the test so I guess we have all the needed
>> fixes in place to allow the fast balance cancel. Thanks.




* Re: [PATCH 0/2] btrfs: balance root leak and runaway balance fix
  2020-07-24  0:05     ` Qu Wenruo
@ 2020-07-24  9:33       ` David Sterba
  0 siblings, 0 replies; 7+ messages in thread
From: David Sterba @ 2020-07-24  9:33 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Zygo Blaxell, dsterba, Qu Wenruo, linux-btrfs

On Fri, Jul 24, 2020 at 08:05:16AM +0800, Qu Wenruo wrote:
> 
> 
> On 2020/7/24 5:54 AM, Zygo Blaxell wrote:
> > On Fri, May 22, 2020 at 01:13:47PM +0200, David Sterba wrote:
> >> On Wed, May 20, 2020 at 02:58:49PM +0800, Qu Wenruo wrote:
> >>> This patchset will fix the most wanted balance bug, runaway balance.
> >>> All my fault, and all small fixes.
> >>
> >> Well, that happens.
> >>
> >> d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
> >>
> >> is the most broken patch in recent history (5.1+), there were so many
> >> fixups but hopefully this is the last one. I've tagged the patches for
> >> 5.1+ stable but we'll need manual backports due to the root refcount
> >> changes in 5.7.
> > 
> > The patch 1dae7e0e58b4 "btrfs: reloc: clear DEAD_RELOC_TREE bit for
> > orphan roots to prevent runaway balance" does apply to 5.7 itself, but
> > it is not present in 5.7.10.  I've been running it in test (and even a
> > few pre-prod) systems since May.
> 
> Strange, I see no mail about merge failure nor merge success.
> 
> I'll send the backport manually to all older branches.
> 
> BTW, what's the proper tag for stable branch ranges?

For inspiration, look at the subjects at https://lore.kernel.org/stable/ .
The version needs to be visible without looking into the patch,
something like:

"[PATCH for 5.4] btrfs: ...."

You can send it as a thread with various versions in case the patches
differ, or use [PATCH for 5.4+].

