* [PATCH] btrfs: qgroup: don't try to wait flushing if we're already holding a transaction
@ 2020-12-04  1:24 Qu Wenruo
  2020-12-04  1:24 ` [PATCH] btrfs: qgroup: don't commit transaction when we already hold the handle Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Qu Wenruo @ 2020-12-04  1:24 UTC
  To: linux-btrfs

There is a race in qgroup flushing which may lead to a
deadlock:

	Thread A		|	Thread B
   (not holding a trans handle)	|  (holding a trans handle)
--------------------------------+--------------------------------
__btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
|- try_flush_qgroup()		| |- try_flush_qgroup()
   |- QGROUP_FLUSHING bit set   |    |
   |				|    |- test_and_set_bit()
   |				|    |- wait_event()
   |- btrfs_join_transaction()	|
   |- btrfs_commit_transaction()|

			!!! DEAD LOCK !!!

Thread A wants to commit the transaction, but thread B is holding a
transaction handle, which blocks the commit.
At the same time, thread B is waiting for thread A to finish its commit.

This is just a hot fix, and would lead to more EDQUOT when we're near
the qgroup limit.

The root fix would be to make all metadata/data reservations happen
without holding a transaction handle.
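
The check order this patch introduces can be sketched in userspace C. This
is only a single-threaded model, not the kernel code: the enum, the function
name flush_decision() and the plain bool standing in for the
BTRFS_ROOT_QGROUP_FLUSHING bit are all hypothetical, and holds_trans stands
in for the current->journal_info test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes, used only for this illustration. */
enum flush_action {
	FLUSH_RUN,	/* we set the flushing bit, so we run the flush */
	FLUSH_WAIT,	/* a flush is running and we hold no handle: wait */
	FLUSH_SKIP,	/* a flush is running and we hold a handle: return */
};

/*
 * Model of the decision after the patch.  '*flushing' stands in for
 * BTRFS_ROOT_QGROUP_FLUSHING; the read-then-set below is a
 * single-threaded stand-in for test_and_set_bit().
 */
static enum flush_action flush_decision(bool *flushing, bool holds_trans)
{
	bool was_set;

	assert(flushing != NULL);
	was_set = *flushing;
	*flushing = true;

	if (!was_set)
		return FLUSH_RUN;

	/* Waiting while holding a handle is what could deadlock. */
	return holds_trans ? FLUSH_SKIP : FLUSH_WAIT;
}
```

In this model, thread B from the diagram above gets FLUSH_SKIP instead of
sleeping in wait_event(), so thread A's commit is no longer blocked by a
waiter that holds a transaction handle.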

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/qgroup.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index fe3046007f52..7785dfa348d2 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3530,16 +3530,6 @@ static int try_flush_qgroup(struct btrfs_root *root)
 	int ret;
 	bool can_commit = true;
 
-	/*
-	 * We don't want to run flush again and again, so if there is a running
-	 * one, we won't try to start a new flush, but exit directly.
-	 */
-	if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
-		wait_event(root->qgroup_flush_wait,
-			!test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
-		return 0;
-	}
-
 	/*
 	 * If current process holds a transaction, we shouldn't flush, as we
 	 * assume all space reservation happens before a transaction handle is
@@ -3554,6 +3544,27 @@ static int try_flush_qgroup(struct btrfs_root *root)
 	    current->journal_info != BTRFS_SEND_TRANS_STUB)
 		can_commit = false;
 
+	/*
+	 * We don't want to run flush again and again, so if there is a running
+	 * one, we won't try to start a new flush, but exit directly.
+	 */
+	if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
+		/*
+		 * We are already holding a trans, thus we can block other
+		 * threads from flushing.
+		 * So exit right now. This increases the chance of EDQUOT for
+		 * heavy load and near limit cases.
+		 * But we can argue that if we're already near limit, EDQUOT
+		 * is unavoidable anyway.
+		 */
+		if (!can_commit)
+			return 0;
+
+		wait_event(root->qgroup_flush_wait,
+			!test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
+		return 0;
+	}
+
 	ret = btrfs_start_delalloc_snapshot(root);
 	if (ret < 0)
 		goto out;
-- 
2.29.2



* [PATCH] btrfs: qgroup: don't commit transaction when we already hold the handle
  2020-12-04  1:24 [PATCH] btrfs: qgroup: don't try to wait flushing if we're already holding a transaction Qu Wenruo
@ 2020-12-04  1:24 ` Qu Wenruo
  2020-12-04  7:37   ` Nikolay Borisov
  2020-12-04 11:48 ` [PATCH] btrfs: qgroup: don't try to wait flushing if we're already holding a transaction Filipe Manana
  2020-12-04 17:28 ` David Sterba
  2 siblings, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2020-12-04  1:24 UTC
  To: linux-btrfs; +Cc: Filipe Manana, David Sterba

[BUG]
When running the following script, btrfs will trigger an ASSERT():

  #!/bin/bash
  mkfs.btrfs -f $dev
  mount $dev $mnt
  xfs_io -f -c "pwrite 0 1G" $mnt/file
  sync
  btrfs quota enable $mnt
  btrfs quota rescan -w $mnt

  # Manually set the limit below current usage
  btrfs qgroup limit 512M $mnt $mnt

  # Crash happens
  touch $mnt/file

The dmesg looks like this:

  assertion failed: refcount_read(&trans->use_count) == 1, in fs/btrfs/transaction.c:2022
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/ctree.h:3230!
  invalid opcode: 0000 [#1] SMP PTI
  RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
   btrfs_commit_transaction.cold+0x11/0x5d [btrfs]
   try_flush_qgroup+0x67/0x100 [btrfs]
   __btrfs_qgroup_reserve_meta+0x3a/0x60 [btrfs]
   btrfs_delayed_update_inode+0xaa/0x350 [btrfs]
   btrfs_update_inode+0x9d/0x110 [btrfs]
   btrfs_dirty_inode+0x5d/0xd0 [btrfs]
   touch_atime+0xb5/0x100
   iterate_dir+0xf1/0x1b0
   __x64_sys_getdents64+0x78/0x110
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7fb5afe588db

[CAUSE]
In try_flush_qgroup(), we assume we don't hold a transaction handle at
all.  This is true for data reservation and mostly true for metadata,
since data space reservation always happens before we start a
transaction, and for most metadata operations we reserve space in
start_transaction().

But there is an exception, btrfs_delayed_inode_reserve_metadata().
It holds a transaction handle, while still trying to reserve extra
metadata space.

When we hit EDQUOT inside btrfs_delayed_inode_reserve_metadata(), we
will join the current transaction and commit it, while we still have a
transaction handle from qgroup code.
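
Why the ASSERT() fires can be illustrated with a tiny userspace model. It
assumes that joining a transaction while the task already holds a handle
reuses that handle and increments its use_count, which is consistent with
the "use_count == 1" assertion in the dmesg above. The struct and function
names here are hypothetical, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the use_count field of a transaction handle. */
struct trans_model {
	int use_count;
};

/* Assumed behaviour: every join on the same handle bumps use_count. */
static void join_transaction(struct trans_model *t)
{
	t->use_count++;
}

/* Mirrors the condition checked by the failing assertion. */
static bool commit_allowed(const struct trans_model *t)
{
	return t->use_count == 1;
}
```

With one join the commit condition holds. With a second, nested join (the
delayed inode path followed by try_flush_qgroup()), use_count is 2 and the
condition fails.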

[FIX]
Let's check current->journal_info before we join the transaction.

If current->journal_info is unset or BTRFS_SEND_TRANS_STUB, it means
we are not holding a transaction, thus are able to join and then commit
the transaction.

If current->journal_info is a valid transaction handle, we avoid
committing the transaction and just end it.

This is less effective than committing current transaction, as it won't
free metadata reserved space, but we may still free some data space
before new data writes.
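
The rule described above can be sketched as a small userspace C model. The
SEND_TRANS_STUB sentinel is a hypothetical stand-in for
BTRFS_SEND_TRANS_STUB, and the enum and function names are made up for this
illustration; only the decision itself follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel standing in for BTRFS_SEND_TRANS_STUB. */
#define SEND_TRANS_STUB ((void *)1)

enum end_action {
	ACTION_COMMIT,	/* models btrfs_commit_transaction() */
	ACTION_END,	/* models btrfs_end_transaction() */
};

/* No handle is held if journal_info is unset or is the send stub. */
static bool can_commit_transaction(const void *journal_info)
{
	return journal_info == NULL || journal_info == SEND_TRANS_STUB;
}

static enum end_action pick_end_action(const void *journal_info)
{
	return can_commit_transaction(journal_info) ? ACTION_COMMIT
						    : ACTION_END;
}
```

Any other non-NULL value is treated as a real transaction handle, so the
model picks ACTION_END for it.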

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1178634
Fixes: c53e9653605d ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/qgroup.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 77c54749f432..4621b8043021 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3512,6 +3512,7 @@ static int try_flush_qgroup(struct btrfs_root *root)
 {
 	struct btrfs_trans_handle *trans;
 	int ret;
+	bool can_commit = true;
 
 	/*
 	 * We don't want to run flush again and again, so if there is a running
@@ -3523,6 +3524,20 @@ static int try_flush_qgroup(struct btrfs_root *root)
 		return 0;
 	}
 
+	/*
+	 * If current process holds a transaction, we shouldn't flush, as we
+	 * assume all space reservation happens before a transaction handle is
+	 * held.
+	 *
+	 * But there are cases like btrfs_delayed_item_reserve_metadata() where
+	 * we try to reserve space with one transction handle already held.
+	 * In that case we can't commit transaction, but at least try to end it
+	 * and hope the started data writes can free some space.
+	 */
+	if (current->journal_info &&
+	    current->journal_info != BTRFS_SEND_TRANS_STUB)
+		can_commit = false;
+
 	ret = btrfs_start_delalloc_snapshot(root);
 	if (ret < 0)
 		goto out;
@@ -3534,7 +3549,10 @@ static int try_flush_qgroup(struct btrfs_root *root)
 		goto out;
 	}
 
-	ret = btrfs_commit_transaction(trans);
+	if (can_commit)
+		ret = btrfs_commit_transaction(trans);
+	else
+		ret = btrfs_end_transaction(trans);
 out:
 	clear_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state);
 	wake_up(&root->qgroup_flush_wait);
-- 
2.29.2



* Re: [PATCH] btrfs: qgroup: don't commit transaction when we already hold the handle
  2020-12-04  1:24 ` [PATCH] btrfs: qgroup: don't commit transaction when we already hold the handle Qu Wenruo
@ 2020-12-04  7:37   ` Nikolay Borisov
  2020-12-04  7:46     ` Qu Wenruo
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolay Borisov @ 2020-12-04  7:37 UTC
  To: Qu Wenruo, linux-btrfs; +Cc: Filipe Manana, David Sterba



On 4.12.20 at 3:24, Qu Wenruo wrote:
> [BUG]
> When running the following script, btrfs will trigger an ASSERT():
> 
>   #/bin/bash
>   mkfs.btrfs -f $dev
>   mount $dev $mnt
>   xfs_io -f -c "pwrite 0 1G" $mnt/file
>   sync
>   btrfs quota enable $mnt
>   btrfs quota rescan -w $mnt
> 
>   # Manually set the limit below current usage
>   btrfs qgroup limit 512M $mnt $mnt
> 
>   # Crash happens
>   touch $mnt/file
> 
> The dmesg looks like this:
> 
>   assertion failed: refcount_read(&trans->use_count) == 1, in fs/btrfs/transaction.c:2022
>   ------------[ cut here ]------------
>   kernel BUG at fs/btrfs/ctree.h:3230!
>   invalid opcode: 0000 [#1] SMP PTI
>   RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
>    btrfs_commit_transaction.cold+0x11/0x5d [btrfs]
>    try_flush_qgroup+0x67/0x100 [btrfs]
>    __btrfs_qgroup_reserve_meta+0x3a/0x60 [btrfs]
>    btrfs_delayed_update_inode+0xaa/0x350 [btrfs]
>    btrfs_update_inode+0x9d/0x110 [btrfs]
>    btrfs_dirty_inode+0x5d/0xd0 [btrfs]
>    touch_atime+0xb5/0x100
>    iterate_dir+0xf1/0x1b0
>    __x64_sys_getdents64+0x78/0x110
>    do_syscall_64+0x33/0x80
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7fb5afe588db
> 
> [CAUSE]
> In try_flush_qgroup(), we assume we don't hold a transaction handle at
> all.  This is true for data reservation and mostly true for metadata.
> Since data space reservation always happens before we start a
> transaction, and for most metadata operation we reserve space in
> start_transaction().
> 
> But there is an exception, btrfs_delayed_inode_reserve_metadata().
> It holds a transaction handle, while still trying to reserve extra
> metadata space.
> 
> When we hit EDQUOT inside btrfs_delayed_inode_reserve_metadata(), we
> will join current transaction and commit, while we still have
> transaction handle from qgroup code.
> 
> [FIX]
> Let's check current->journal before we join the transaction.
> 
> If current->journal is unset or BTRFS_SEND_TRANS_STUB, it means
> we are not holding a transaction, thus are able to join and then commit
> transaction.
> 
> If current->journal is a valid transaction handle, we avoid committing
> transaction and just end it
> 
> This is less effective than committing current transaction, as it won't
> free metadata reserved space, but we may still free some data space
> before new data writes.
> 
> Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1178634
> Fixes: c53e9653605d ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: David Sterba <dsterba@suse.com>

Wasn't this submitted already? Also, are you going to turn the example
script into an fstest?


* Re: [PATCH] btrfs: qgroup: don't commit transaction when we already hold the handle
  2020-12-04  7:37   ` Nikolay Borisov
@ 2020-12-04  7:46     ` Qu Wenruo
  0 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2020-12-04  7:46 UTC
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs; +Cc: Filipe Manana, David Sterba



On 2020/12/4 3:37 PM, Nikolay Borisov wrote:
>
>
> On 4.12.20 at 3:24, Qu Wenruo wrote:
>> [BUG]
>> When running the following script, btrfs will trigger an ASSERT():
>>
>>   #/bin/bash
>>   mkfs.btrfs -f $dev
>>   mount $dev $mnt
>>   xfs_io -f -c "pwrite 0 1G" $mnt/file
>>   sync
>>   btrfs quota enable $mnt
>>   btrfs quota rescan -w $mnt
>>
>>   # Manually set the limit below current usage
>>   btrfs qgroup limit 512M $mnt $mnt
>>
>>   # Crash happens
>>   touch $mnt/file
>>
>> The dmesg looks like this:
>>
>>   assertion failed: refcount_read(&trans->use_count) == 1, in fs/btrfs/transaction.c:2022
>>   ------------[ cut here ]------------
>>   kernel BUG at fs/btrfs/ctree.h:3230!
>>   invalid opcode: 0000 [#1] SMP PTI
>>   RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
>>    btrfs_commit_transaction.cold+0x11/0x5d [btrfs]
>>    try_flush_qgroup+0x67/0x100 [btrfs]
>>    __btrfs_qgroup_reserve_meta+0x3a/0x60 [btrfs]
>>    btrfs_delayed_update_inode+0xaa/0x350 [btrfs]
>>    btrfs_update_inode+0x9d/0x110 [btrfs]
>>    btrfs_dirty_inode+0x5d/0xd0 [btrfs]
>>    touch_atime+0xb5/0x100
>>    iterate_dir+0xf1/0x1b0
>>    __x64_sys_getdents64+0x78/0x110
>>    do_syscall_64+0x33/0x80
>>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>   RIP: 0033:0x7fb5afe588db
>>
>> [CAUSE]
>> In try_flush_qgroup(), we assume we don't hold a transaction handle at
>> all.  This is true for data reservation and mostly true for metadata.
>> Since data space reservation always happens before we start a
>> transaction, and for most metadata operation we reserve space in
>> start_transaction().
>>
>> But there is an exception, btrfs_delayed_inode_reserve_metadata().
>> It holds a transaction handle, while still trying to reserve extra
>> metadata space.
>>
>> When we hit EDQUOT inside btrfs_delayed_inode_reserve_metadata(), we
>> will join current transaction and commit, while we still have
>> transaction handle from qgroup code.
>>
>> [FIX]
>> Let's check current->journal before we join the transaction.
>>
>> If current->journal is unset or BTRFS_SEND_TRANS_STUB, it means
>> we are not holding a transaction, thus are able to join and then commit
>> transaction.
>>
>> If current->journal is a valid transaction handle, we avoid committing
>> transaction and just end it
>>
>> This is less effective than committing current transaction, as it won't
>> free metadata reserved space, but we may still free some data space
>> before new data writes.
>>
>> Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1178634
>> Fixes: c53e9653605d ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
>> Reviewed-by: Filipe Manana <fdmanana@suse.com>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> Signed-off-by: David Sterba <dsterba@suse.com>
>
> Wasn't this submitted already? Also are you going to turn the example
> script into a fstest?
>
Sorry, I forgot to clean up my patches directory. (Facepalm)

The fstests test case has already been submitted:
https://patchwork.kernel.org/project/linux-btrfs/patch/20201111113152.136729-1-wqu@suse.com/

Thanks,
Qu


* Re: [PATCH] btrfs: qgroup: don't try to wait flushing if we're already holding a transaction
  2020-12-04  1:24 [PATCH] btrfs: qgroup: don't try to wait flushing if we're already holding a transaction Qu Wenruo
  2020-12-04  1:24 ` [PATCH] btrfs: qgroup: don't commit transaction when we already hold the handle Qu Wenruo
@ 2020-12-04 11:48 ` Filipe Manana
  2020-12-04 17:28 ` David Sterba
  2 siblings, 0 replies; 7+ messages in thread
From: Filipe Manana @ 2020-12-04 11:48 UTC
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Dec 4, 2020 at 1:29 AM Qu Wenruo <wqu@suse.com> wrote:
>
> There is a chance of racing for qgroup flushing which may lead to
> deadlock:
>
>         Thread A                |       Thread B
>    (no trans handler hold)      |  (already hold a trans handler)

handler -> handle

"not holding a trans handle" and "holding a trans handle" respectively.

> --------------------------------+--------------------------------
> __btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
> |- try_flush_qgroup()           | |- try_flushing_qgroup()

try_flushing_qgroup() -> try_flush_qgroup()

>    |- QGROUP_FLUSHING bit set   |    |
>    |                            |    |- test_and_set_bit()
>    |                            |    |- wait_event()
>    |- btrfs_join_transaction()  |
>    |- btrfs_commit_transaction()|
>
>                         !!! DEAD LOCK !!!
>
> Since thread A want to commit transaction, but thread B is hold a
> transaction handler, blocking the commit.

"thread B is hold a transaction handler" -> "thread B is holding a
transaction handle"

> At the same time, thread B is waiting thread A to finish it commit.

waiting for, it -> its

>
> This is just a hot fix, and would lead to more EDQUOT when we're near
> the qgroup limit.
>
> The root fix would to make all metadata/data reservation to happen

would -> would be

> without a transaction handler hold.

without a transaction handler hold -> without holding a transaction handle

>
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Other than the grammar, it looks fine, thanks.

Reviewed-by: Filipe Manana <fdmanana@suse.com>

> ---
>  fs/btrfs/qgroup.c | 31 +++++++++++++++++++++----------
>  1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index fe3046007f52..7785dfa348d2 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -3530,16 +3530,6 @@ static int try_flush_qgroup(struct btrfs_root *root)
>         int ret;
>         bool can_commit = true;
>
> -       /*
> -        * We don't want to run flush again and again, so if there is a running
> -        * one, we won't try to start a new flush, but exit directly.
> -        */
> -       if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
> -               wait_event(root->qgroup_flush_wait,
> -                       !test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
> -               return 0;
> -       }
> -
>         /*
>          * If current process holds a transaction, we shouldn't flush, as we
>          * assume all space reservation happens before a transaction handle is
> @@ -3554,6 +3544,27 @@ static int try_flush_qgroup(struct btrfs_root *root)
>             current->journal_info != BTRFS_SEND_TRANS_STUB)
>                 can_commit = false;
>
> +       /*
> +        * We don't want to run flush again and again, so if there is a running
> +        * one, we won't try to start a new flush, but exit directly.
> +        */
> +       if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
> +               /*
> +                * We are already holding a trans, thus we can block other
> +                * threads from flushing.
> +                * So exit right now. This increases the chance of EDQUOT for
> +                * heavy load and near limit cases.
> +                * But we can argue that if we're already near limit, EDQUOT
> +                * is unavoidable anyway.
> +                */
> +               if (!can_commit)
> +                       return 0;
> +
> +               wait_event(root->qgroup_flush_wait,
> +                       !test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
> +               return 0;
> +       }
> +
>         ret = btrfs_start_delalloc_snapshot(root);
>         if (ret < 0)
>                 goto out;
> --
> 2.29.2
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”


* Re: [PATCH] btrfs: qgroup: don't try to wait flushing if we're already holding a transaction
  2020-12-04  1:24 [PATCH] btrfs: qgroup: don't try to wait flushing if we're already holding a transaction Qu Wenruo
  2020-12-04  1:24 ` [PATCH] btrfs: qgroup: don't commit transaction when we already hold the handle Qu Wenruo
  2020-12-04 11:48 ` [PATCH] btrfs: qgroup: don't try to wait flushing if we're already holding a transaction Filipe Manana
@ 2020-12-04 17:28 ` David Sterba
  2020-12-05  2:55   ` Qu Wenruo
  2 siblings, 1 reply; 7+ messages in thread
From: David Sterba @ 2020-12-04 17:28 UTC
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Dec 04, 2020 at 09:24:47AM +0800, Qu Wenruo wrote:
> There is a chance of racing for qgroup flushing which may lead to
> deadlock:
> 
> 	Thread A		|	Thread B
>    (no trans handler hold)	|  (already hold a trans handler)
> --------------------------------+--------------------------------
> __btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
> |- try_flush_qgroup()		| |- try_flushing_qgroup()
>    |- QGROUP_FLUSHING bit set   |    |
>    |				|    |- test_and_set_bit()
>    |				|    |- wait_event()
>    |- btrfs_join_transaction()	|
>    |- btrfs_commit_transaction()|
> 
> 			!!! DEAD LOCK !!!
> 
> Since thread A want to commit transaction, but thread B is hold a
> transaction handler, blocking the commit.
> At the same time, thread B is waiting thread A to finish it commit.
> 
> This is just a hot fix, and would lead to more EDQUOT when we're near
> the qgroup limit.
> 
> The root fix would to make all metadata/data reservation to happen
> without a transaction handler hold.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Added to misc-next, thanks.


* Re: [PATCH] btrfs: qgroup: don't try to wait flushing if we're already holding a transaction
  2020-12-04 17:28 ` David Sterba
@ 2020-12-05  2:55   ` Qu Wenruo
  0 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2020-12-05  2:55 UTC
  To: dsterba, Qu Wenruo, linux-btrfs





On 2020/12/5 1:28 AM, David Sterba wrote:
> On Fri, Dec 04, 2020 at 09:24:47AM +0800, Qu Wenruo wrote:
>> There is a chance of racing for qgroup flushing which may lead to
>> deadlock:
>>
>> 	Thread A		|	Thread B
>>    (no trans handler hold)	|  (already hold a trans handler)
>> --------------------------------+--------------------------------
>> __btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
>> |- try_flush_qgroup()		| |- try_flushing_qgroup()
>>    |- QGROUP_FLUSHING bit set   |    |
>>    |				|    |- test_and_set_bit()
>>    |				|    |- wait_event()
>>    |- btrfs_join_transaction()	|
>>    |- btrfs_commit_transaction()|
>>
>> 			!!! DEAD LOCK !!!
>>
>> Since thread A want to commit transaction, but thread B is hold a
>> transaction handler, blocking the commit.
>> At the same time, thread B is waiting thread A to finish it commit.
>>
>> This is just a hot fix, and would lead to more EDQUOT when we're near
>> the qgroup limit.
>>
>> The root fix would to make all metadata/data reservation to happen
>> without a transaction handler hold.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
> 
> Added to misc-next, thanks.
> 
Sorry, I forgot the Fixes tag:

Fixes: c53e9653605d ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")

Would you mind adding that in the branch?

Thanks,
Qu


