* [PATCH] btrfs: use refcount_inc_not_zero in kill_all_nodes
@ 2019-09-26 10:54 Josef Bacik
2019-09-26 11:13 ` Nikolay Borisov
0 siblings, 1 reply; 3+ messages in thread
From: Josef Bacik @ 2019-09-26 10:54 UTC (permalink / raw)
To: kernel-team, linux-btrfs
We hit the following warning while running down a different problem
[ 6197.175850] ------------[ cut here ]------------
[ 6197.185082] refcount_t: underflow; use-after-free.
[ 6197.194704] WARNING: CPU: 47 PID: 966 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
[ 6197.521792] Call Trace:
[ 6197.526687] __btrfs_release_delayed_node+0x76/0x1c0
[ 6197.536615] btrfs_kill_all_delayed_nodes+0xec/0x130
[ 6197.546532] ? __btrfs_btree_balance_dirty+0x60/0x60
[ 6197.556482] btrfs_clean_one_deleted_snapshot+0x71/0xd0
[ 6197.566910] cleaner_kthread+0xfa/0x120
[ 6197.574573] kthread+0x111/0x130
[ 6197.581022] ? kthread_create_on_node+0x60/0x60
[ 6197.590086] ret_from_fork+0x1f/0x30
[ 6197.597228] ---[ end trace 424bb7ae00509f56 ]---
This is because we're unconditionally grabbing a ref to every node, but
there could be nodes with a 0 refcount. Fix this to instead use
refcount_inc_not_zero() and only process the list for the nodes we get a
refcount on.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/delayed-inode.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 1f7f39b10bd0..320503750896 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1936,7 +1936,7 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root)
{
u64 inode_id = 0;
struct btrfs_delayed_node *delayed_nodes[8];
- int i, n;
+ int i, n, count;
while (1) {
spin_lock(&root->inode_lock);
@@ -1948,13 +1948,16 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root)
break;
}
- inode_id = delayed_nodes[n - 1]->inode_id + 1;
-
- for (i = 0; i < n; i++)
- refcount_inc(&delayed_nodes[i]->refs);
+ count = 0;
+ for (i = 0; i < n; i++) {
+ if (!refcount_inc_not_zero(&delayed_nodes[i]->refs))
+ break;
+ count++;
+ }
+ inode_id = delayed_nodes[count - 1]->inode_id + 1;
spin_unlock(&root->inode_lock);
- for (i = 0; i < n; i++) {
+ for (i = 0; i < count; i++) {
__btrfs_kill_delayed_node(delayed_nodes[i]);
btrfs_release_delayed_node(delayed_nodes[i]);
}
--
2.21.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] btrfs: use refcount_inc_not_zero in kill_all_nodes
2019-09-26 10:54 [PATCH] btrfs: use refcount_inc_not_zero in kill_all_nodes Josef Bacik
@ 2019-09-26 11:13 ` Nikolay Borisov
2019-09-26 11:28 ` Josef Bacik
0 siblings, 1 reply; 3+ messages in thread
From: Nikolay Borisov @ 2019-09-26 11:13 UTC (permalink / raw)
To: Josef Bacik, kernel-team, linux-btrfs
On 26.09.19 г. 13:54 ч., Josef Bacik wrote:
> We hit the following warning while running down a different problem
>
> [ 6197.175850] ------------[ cut here ]------------
> [ 6197.185082] refcount_t: underflow; use-after-free.
> [ 6197.194704] WARNING: CPU: 47 PID: 966 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
> [ 6197.521792] Call Trace:
> [ 6197.526687] __btrfs_release_delayed_node+0x76/0x1c0
> [ 6197.536615] btrfs_kill_all_delayed_nodes+0xec/0x130
> [ 6197.546532] ? __btrfs_btree_balance_dirty+0x60/0x60
> [ 6197.556482] btrfs_clean_one_deleted_snapshot+0x71/0xd0
> [ 6197.566910] cleaner_kthread+0xfa/0x120
> [ 6197.574573] kthread+0x111/0x130
> [ 6197.581022] ? kthread_create_on_node+0x60/0x60
> [ 6197.590086] ret_from_fork+0x1f/0x30
> [ 6197.597228] ---[ end trace 424bb7ae00509f56 ]---
>
> This is because we're unconditionally grabbing a ref to every node, but
> there could be nodes with a 0 refcount. Fix this to instead use
> refcount_inc_not_zero() and only process the list for the nodes we get a
> refcount on.
I'd also add the detail that __btrfs_release_delayed_node actually does
the refcount_dec_and_test outside of &root->inode_lock which allows this
scenario.
And this bug seems rather old, ever since :
16cdcec736cd ("btrfs: implement delayed inode items operation")
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
> fs/btrfs/delayed-inode.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
> index 1f7f39b10bd0..320503750896 100644
> --- a/fs/btrfs/delayed-inode.c
> +++ b/fs/btrfs/delayed-inode.c
> @@ -1936,7 +1936,7 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root)
> {
> u64 inode_id = 0;
> struct btrfs_delayed_node *delayed_nodes[8];
> - int i, n;
> + int i, n, count;
>
> while (1) {
> spin_lock(&root->inode_lock);
> @@ -1948,13 +1948,16 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root)
> break;
> }
>
> - inode_id = delayed_nodes[n - 1]->inode_id + 1;
> -
> - for (i = 0; i < n; i++)
> - refcount_inc(&delayed_nodes[i]->refs);
> + count = 0;
> + for (i = 0; i < n; i++) {
> + if (!refcount_inc_not_zero(&delayed_nodes[i]->refs))
> + break;
> + count++;
This is buggy, if the very first inode in the gang causes the break
statement then the code does delayed_nodes[0 - 1]->inode_id. E.g. the
increment should be before the refcount_inc_not_zero.
> + }
> + inode_id = delayed_nodes[count - 1]->inode_id + 1;
> spin_unlock(&root->inode_lock);
>
> - for (i = 0; i < n; i++) {
> + for (i = 0; i < count; i++) {
> __btrfs_kill_delayed_node(delayed_nodes[i]);
> btrfs_release_delayed_node(delayed_nodes[i]);
> }
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] btrfs: use refcount_inc_not_zero in kill_all_nodes
2019-09-26 11:13 ` Nikolay Borisov
@ 2019-09-26 11:28 ` Josef Bacik
0 siblings, 0 replies; 3+ messages in thread
From: Josef Bacik @ 2019-09-26 11:28 UTC (permalink / raw)
To: Nikolay Borisov; +Cc: Josef Bacik, kernel-team, linux-btrfs
On Thu, Sep 26, 2019 at 02:13:44PM +0300, Nikolay Borisov wrote:
>
>
> On 26.09.19 г. 13:54 ч., Josef Bacik wrote:
> > We hit the following warning while running down a different problem
> >
> > [ 6197.175850] ------------[ cut here ]------------
> > [ 6197.185082] refcount_t: underflow; use-after-free.
> > [ 6197.194704] WARNING: CPU: 47 PID: 966 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
> > [ 6197.521792] Call Trace:
> > [ 6197.526687] __btrfs_release_delayed_node+0x76/0x1c0
> > [ 6197.536615] btrfs_kill_all_delayed_nodes+0xec/0x130
> > [ 6197.546532] ? __btrfs_btree_balance_dirty+0x60/0x60
> > [ 6197.556482] btrfs_clean_one_deleted_snapshot+0x71/0xd0
> > [ 6197.566910] cleaner_kthread+0xfa/0x120
> > [ 6197.574573] kthread+0x111/0x130
> > [ 6197.581022] ? kthread_create_on_node+0x60/0x60
> > [ 6197.590086] ret_from_fork+0x1f/0x30
> > [ 6197.597228] ---[ end trace 424bb7ae00509f56 ]---
> >
> > This is because we're unconditionally grabbing a ref to every node, but
> > there could be nodes with a 0 refcount. Fix this to instead use
> > refcount_inc_not_zero() and only process the list for the nodes we get a
> > refcount on.
>
>
> I'd also add the detail that __btrfs_release_delayed_node actually does
> the refcount_dec_and_test outside of &root->inode_lock which allows this
> scenario.
>
> And this bug seems rather old, ever since :
>
> 16cdcec736cd ("btrfs: implement delayed inode items operation")
>
>
> >
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> > fs/btrfs/delayed-inode.c | 15 +++++++++------
> > 1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
> > index 1f7f39b10bd0..320503750896 100644
> > --- a/fs/btrfs/delayed-inode.c
> > +++ b/fs/btrfs/delayed-inode.c
> > @@ -1936,7 +1936,7 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root)
> > {
> > u64 inode_id = 0;
> > struct btrfs_delayed_node *delayed_nodes[8];
> > - int i, n;
> > + int i, n, count;
> >
> > while (1) {
> > spin_lock(&root->inode_lock);
> > @@ -1948,13 +1948,16 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root)
> > break;
> > }
> >
> > - inode_id = delayed_nodes[n - 1]->inode_id + 1;
> > -
> > - for (i = 0; i < n; i++)
> > - refcount_inc(&delayed_nodes[i]->refs);
> > + count = 0;
> > + for (i = 0; i < n; i++) {
> > + if (!refcount_inc_not_zero(&delayed_nodes[i]->refs))
> > + break;
> > + count++;
>
> This is buggy, if the very first inode in the gang causes the break
> statement then the code does delayed_nodes[0 - 1]->inode_id. E.g. the
> increment should be before the refcount_inc_not_zero.
>
Sigh, no patches from me before breakfast. I'll fix this up after I eat.
Thanks,
Josef
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-09-26 11:28 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-26 10:54 [PATCH] btrfs: use refcount_inc_not_zero in kill_all_nodes Josef Bacik
2019-09-26 11:13 ` Nikolay Borisov
2019-09-26 11:28 ` Josef Bacik
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.