linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH][v3] btrfs: use refcount_inc_not_zero in kill_all_nodes
@ 2019-09-26 12:29 Josef Bacik
  2019-09-26 12:33 ` Nikolay Borisov
  2019-09-30 17:34 ` David Sterba
  0 siblings, 2 replies; 3+ messages in thread
From: Josef Bacik @ 2019-09-26 12:29 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We hit the following warning while running down a different problem

[ 6197.175850] ------------[ cut here ]------------
[ 6197.185082] refcount_t: underflow; use-after-free.
[ 6197.194704] WARNING: CPU: 47 PID: 966 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
[ 6197.521792] Call Trace:
[ 6197.526687]  __btrfs_release_delayed_node+0x76/0x1c0
[ 6197.536615]  btrfs_kill_all_delayed_nodes+0xec/0x130
[ 6197.546532]  ? __btrfs_btree_balance_dirty+0x60/0x60
[ 6197.556482]  btrfs_clean_one_deleted_snapshot+0x71/0xd0
[ 6197.566910]  cleaner_kthread+0xfa/0x120
[ 6197.574573]  kthread+0x111/0x130
[ 6197.581022]  ? kthread_create_on_node+0x60/0x60
[ 6197.590086]  ret_from_fork+0x1f/0x30
[ 6197.597228] ---[ end trace 424bb7ae00509f56 ]---

This is because the free side drops the ref without the lock, and then
takes the lock if our refcount is 0.  So you can have nodes on the tree
that have a refcount of 0.  Fix this by zero'ing out that element in our
temporary array so we don't try to kill it again.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
v2->v3:
- always git show after a revision kids, to make sure your new code still makes
  sense with the original code, not your previous version.

 fs/btrfs/delayed-inode.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 1f7f39b10bd0..ac71615a480c 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1949,12 +1949,15 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root)
 		}
 
 		inode_id = delayed_nodes[n - 1]->inode_id + 1;
-
-		for (i = 0; i < n; i++)
-			refcount_inc(&delayed_nodes[i]->refs);
+		for (i = 0; i < n; i++) {
+			if (!refcount_inc_not_zero(&delayed_nodes[i]->refs))
+				delayed_nodes[i] = NULL;
+		}
 		spin_unlock(&root->inode_lock);
 
 		for (i = 0; i < n; i++) {
+			if (!delayed_nodes[i])
+				continue;
 			__btrfs_kill_delayed_node(delayed_nodes[i]);
 			btrfs_release_delayed_node(delayed_nodes[i]);
 		}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH][v3] btrfs: use refcount_inc_not_zero in kill_all_nodes
  2019-09-26 12:29 [PATCH][v3] btrfs: use refcount_inc_not_zero in kill_all_nodes Josef Bacik
@ 2019-09-26 12:33 ` Nikolay Borisov
  2019-09-30 17:34 ` David Sterba
  1 sibling, 0 replies; 3+ messages in thread
From: Nikolay Borisov @ 2019-09-26 12:33 UTC (permalink / raw)
  To: Josef Bacik, kernel-team, linux-btrfs



On 26.09.19 г. 15:29 ч., Josef Bacik wrote:
> We hit the following warning while running down a different problem
> 
> [ 6197.175850] ------------[ cut here ]------------
> [ 6197.185082] refcount_t: underflow; use-after-free.
> [ 6197.194704] WARNING: CPU: 47 PID: 966 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
> [ 6197.521792] Call Trace:
> [ 6197.526687]  __btrfs_release_delayed_node+0x76/0x1c0
> [ 6197.536615]  btrfs_kill_all_delayed_nodes+0xec/0x130
> [ 6197.546532]  ? __btrfs_btree_balance_dirty+0x60/0x60
> [ 6197.556482]  btrfs_clean_one_deleted_snapshot+0x71/0xd0
> [ 6197.566910]  cleaner_kthread+0xfa/0x120
> [ 6197.574573]  kthread+0x111/0x130
> [ 6197.581022]  ? kthread_create_on_node+0x60/0x60
> [ 6197.590086]  ret_from_fork+0x1f/0x30
> [ 6197.597228] ---[ end trace 424bb7ae00509f56 ]---
> 
> This is because the free side drops the ref without the lock, and then
> takes the lock if our refcount is 0.  So you can have nodes on the tree
> that have a refcount of 0.  Fix this by zero'ing out that element in our
> temporary array so we don't try to kill it again.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
> v2->v3:
> - always git show after a revision kids, to make sure your new code still makes
>   sense with the original code, not your previous version.
> 
>  fs/btrfs/delayed-inode.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
> index 1f7f39b10bd0..ac71615a480c 100644
> --- a/fs/btrfs/delayed-inode.c
> +++ b/fs/btrfs/delayed-inode.c
> @@ -1949,12 +1949,15 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root)
>  		}
>  
>  		inode_id = delayed_nodes[n - 1]->inode_id + 1;
> -
> -		for (i = 0; i < n; i++)
> -			refcount_inc(&delayed_nodes[i]->refs);
> +		for (i = 0; i < n; i++) {
> +			if (!refcount_inc_not_zero(&delayed_nodes[i]->refs))
> +				delayed_nodes[i] = NULL;
> +		}
>  		spin_unlock(&root->inode_lock);
>  
>  		for (i = 0; i < n; i++) {
> +			if (!delayed_nodes[i])
> +				continue;
>  			__btrfs_kill_delayed_node(delayed_nodes[i]);
>  			btrfs_release_delayed_node(delayed_nodes[i]);
>  		}
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH][v3] btrfs: use refcount_inc_not_zero in kill_all_nodes
  2019-09-26 12:29 [PATCH][v3] btrfs: use refcount_inc_not_zero in kill_all_nodes Josef Bacik
  2019-09-26 12:33 ` Nikolay Borisov
@ 2019-09-30 17:34 ` David Sterba
  1 sibling, 0 replies; 3+ messages in thread
From: David Sterba @ 2019-09-30 17:34 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, Sep 26, 2019 at 08:29:32AM -0400, Josef Bacik wrote:
> We hit the following warning while running down a different problem
> 
> [ 6197.175850] ------------[ cut here ]------------
> [ 6197.185082] refcount_t: underflow; use-after-free.
> [ 6197.194704] WARNING: CPU: 47 PID: 966 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
> [ 6197.521792] Call Trace:
> [ 6197.526687]  __btrfs_release_delayed_node+0x76/0x1c0
> [ 6197.536615]  btrfs_kill_all_delayed_nodes+0xec/0x130
> [ 6197.546532]  ? __btrfs_btree_balance_dirty+0x60/0x60
> [ 6197.556482]  btrfs_clean_one_deleted_snapshot+0x71/0xd0
> [ 6197.566910]  cleaner_kthread+0xfa/0x120
> [ 6197.574573]  kthread+0x111/0x130
> [ 6197.581022]  ? kthread_create_on_node+0x60/0x60
> [ 6197.590086]  ret_from_fork+0x1f/0x30
> [ 6197.597228] ---[ end trace 424bb7ae00509f56 ]---
> 
> This is because the free side drops the ref without the lock, and then
> takes the lock if our refcount is 0.  So you can have nodes on the tree
> that have a refcount of 0.

This sounds like breaking the assumptions of the refcounts, if the
object is in the tree it should not be accessible by anything else than
the freeing code, with refs == 0.

Now the delayed nodes do not follow that and there were bugs where the 0
was increased again (ec35e48b286959991c "btrfs: fix refcount_t usage
when deleting btrfs_delayed_nodes").

Your patch fixes another instance, I still see some other refcount_inc
that seem to be safe but the call to refcount_inc_not_zero should have a
comment, like is in the mentioned commit.

> Fix this by zero'ing out that element in our
> temporary array so we don't try to kill it again.

The fix looks correct.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-09-30 21:03 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-26 12:29 [PATCH][v3] btrfs: use refcount_inc_not_zero in kill_all_nodes Josef Bacik
2019-09-26 12:33 ` Nikolay Borisov
2019-09-30 17:34 ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).