linux-btrfs.vger.kernel.org archive mirror
* [PATCH] btrfs: qgroup: don't commit transaction when we have already hold a transaction handler
From: Qu Wenruo @ 2020-11-11  9:30 UTC
  To: linux-btrfs

[BUG]
There is a report about v5.9.6 hitting ASSERT() when running fstests:

 assertion failed: refcount_read(&trans->use_count) == 1, in fs/btrfs/transaction.c:2022
 ------------[ cut here ]------------
 kernel BUG at fs/btrfs/ctree.h:3230!
 invalid opcode: 0000 [#1] SMP PTI
 RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
  btrfs_commit_transaction.cold+0x11/0x5d [btrfs]
  try_flush_qgroup+0x67/0x100 [btrfs]
  __btrfs_qgroup_reserve_meta+0x3a/0x60 [btrfs]
  btrfs_delayed_update_inode+0xaa/0x350 [btrfs]
  btrfs_update_inode+0x9d/0x110 [btrfs]
  btrfs_dirty_inode+0x5d/0xd0 [btrfs]
  touch_atime+0xb5/0x100
  iterate_dir+0xf1/0x1b0
  __x64_sys_getdents64+0x78/0x110
  do_syscall_64+0x33/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fb5afe588db

[CAUSE]
In try_flush_qgroup(), we assume the caller does not hold a transaction
handle at all.

This is true for data reservation and mostly true for metadata: data
space reservation always happens before we start a transaction, and most
metadata operations reserve space in start_transaction().

But there is an exception, btrfs_delayed_inode_reserve_metadata().
It holds a transaction handle while still trying to reserve extra
metadata space.

When we hit -EDQUOT inside btrfs_delayed_inode_reserve_metadata(), we
join the current transaction and commit it while the caller, outside the
qgroup code, still holds its transaction handle. The join reuses that
handle and raises its use_count above 1, which trips the ASSERT() above.
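
For illustration only, the nesting problem can be modeled in user space.
This is a rough sketch, not kernel code: struct trans_handle, the global
journal_info and all helper names below are hypothetical stand-ins for
struct btrfs_trans_handle, current->journal_info and the btrfs
transaction API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct btrfs_trans_handle. */
struct trans_handle {
	int use_count;
};

/* Hypothetical stand-in for current->journal_info. */
static struct trans_handle *journal_info;

/* Joining while a handle is already held reuses it and bumps use_count. */
static struct trans_handle *join_transaction(struct trans_handle *fresh)
{
	if (journal_info) {
		journal_info->use_count++;
		return journal_info;
	}
	fresh->use_count = 1;
	journal_info = fresh;
	return fresh;
}

/* Dropping the last reference clears the per-task pointer. */
static void end_transaction(struct trans_handle *trans)
{
	assert(trans->use_count > 0);
	if (--trans->use_count == 0)
		journal_info = NULL;
}

/* Models the check that fired: a commit requires use_count == 1. */
static bool commit_would_assert(const struct trans_handle *trans)
{
	return trans->use_count != 1;
}

/* The caller holds a handle, then the flush path joins again. */
static bool nested_flush_would_assert(void)
{
	struct trans_handle outer_storage, unused;
	struct trans_handle *outer = join_transaction(&outer_storage);
	struct trans_handle *inner = join_transaction(&unused);
	bool bad = commit_would_assert(inner);

	end_transaction(inner);
	end_transaction(outer);
	return bad;
}
```

In this model nested_flush_would_assert() reports a failure because the
second join returns the caller's handle with use_count at 2, while a
single, non-nested join would pass the check.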

[FIX]
Check current->journal_info before we join the transaction.

If current->journal_info is NULL or BTRFS_SEND_TRANS_STUB, we are not
holding a transaction, so we can join and then commit the transaction.

If current->journal_info is a valid transaction handle, we avoid
committing the transaction and just end the current transaction.

This is less effective than committing the transaction, as it won't
free reserved metadata space, but the data writes we started may still
free some data space.
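
As a minimal sketch of that decision, the check can be written as a
small user-space function. The names here are assumptions made for
illustration: SEND_TRANS_STUB is a placeholder sentinel standing in for
BTRFS_SEND_TRANS_STUB, and enum flush_action and pick_flush_action() do
not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder sentinel; stands in for BTRFS_SEND_TRANS_STUB. */
#define SEND_TRANS_STUB ((void *)1)

/* Hypothetical names for the two ways the flush can finish. */
enum flush_action {
	FLUSH_COMMIT,	/* no handle held: join, then commit */
	FLUSH_END_ONLY,	/* handle already held: join, then only end */
};

/* journal_info models the value of current->journal_info. */
static enum flush_action pick_flush_action(const void *journal_info)
{
	if (journal_info && journal_info != SEND_TRANS_STUB)
		return FLUSH_END_ONLY;
	return FLUSH_COMMIT;
}

/* Small self-check covering the three cases described above. */
static void flush_action_self_check(void)
{
	int held_handle = 0;	/* any real address models a held handle */

	assert(pick_flush_action(NULL) == FLUSH_COMMIT);
	assert(pick_flush_action(SEND_TRANS_STUB) == FLUSH_COMMIT);
	assert(pick_flush_action(&held_handle) == FLUSH_END_ONLY);
}
```

The function only decides how the flush finishes; starting delalloc
writeback happens in both cases, which is why the end-only path can still
free some data space.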

Link: https://bugzilla.suse.com/show_bug.cgi?id=1178634
Fixes: c53e9653605d ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/qgroup.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index bf4b02a40ecc..e320bb574421 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3478,6 +3478,7 @@ static int try_flush_qgroup(struct btrfs_root *root)
 {
 	struct btrfs_trans_handle *trans;
 	int ret;
+	bool can_commit = true;
 
 	/*
 	 * We don't want to run flush again and again, so if there is a running
@@ -3489,6 +3490,19 @@ static int try_flush_qgroup(struct btrfs_root *root)
 		return 0;
 	}
 
+	/*
+	 * If current process holds a transaction, we shouldn't flush, as we
+	 * assume all space reservation happens before a trans handle is held.
+	 *
+	 * But there are cases like btrfs_delayed_item_reserve_metadata() where
+	 * we try to reserve space with one trans handle already held.
+	 * In that case we can't commit transaction, but at most end
+	 * transaction, and hope the started data writes can free some space.
+	 */
+	if (current->journal_info &&
+	    current->journal_info != BTRFS_SEND_TRANS_STUB)
+		can_commit = false;
+
 	ret = btrfs_start_delalloc_snapshot(root);
 	if (ret < 0)
 		goto out;
@@ -3500,7 +3514,10 @@ static int try_flush_qgroup(struct btrfs_root *root)
 		goto out;
 	}
 
-	ret = btrfs_commit_transaction(trans);
+	if (can_commit)
+		ret = btrfs_commit_transaction(trans);
+	else
+		ret = btrfs_end_transaction(trans);
 out:
 	clear_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state);
 	wake_up(&root->qgroup_flush_wait);
-- 
2.29.2



* Re: [PATCH] btrfs: qgroup: don't commit transaction when we have already hold a transaction handler
From: Filipe Manana @ 2020-11-11 10:10 UTC
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Nov 11, 2020 at 9:32 AM Qu Wenruo <wqu@suse.com> wrote:
>
> [BUG]
> There is a report about v5.9.6 hitting ASSERT() when running fstests:

Please mention the test case which triggered the failure.

>
>  assertion failed: refcount_read(&trans->use_count) == 1, in fs/btrfs/transaction.c:2022
>  ------------[ cut here ]------------
>  kernel BUG at fs/btrfs/ctree.h:3230!
>  invalid opcode: 0000 [#1] SMP PTI
>  RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
>   btrfs_commit_transaction.cold+0x11/0x5d [btrfs]
>   try_flush_qgroup+0x67/0x100 [btrfs]
>   __btrfs_qgroup_reserve_meta+0x3a/0x60 [btrfs]
>   btrfs_delayed_update_inode+0xaa/0x350 [btrfs]
>   btrfs_update_inode+0x9d/0x110 [btrfs]
>   btrfs_dirty_inode+0x5d/0xd0 [btrfs]
>   touch_atime+0xb5/0x100
>   iterate_dir+0xf1/0x1b0
>   __x64_sys_getdents64+0x78/0x110
>   do_syscall_64+0x33/0x80
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>  RIP: 0033:0x7fb5afe588db
>
> [CAUSE]
> In try_flush_qgroup(), we assume we don't hold a transaction handler at
> all.
> This is true for data reservation and mostly true for metadata.
> Since data space reservation always happens before we start a
> transaction, and for most metadata operation we reserve space in
> start_transaction().
>
> But there is an exception, btrfs_delayed_inode_reserve_metadata().
> It holds a transaction handler, while still try to reserve extra
> metadata space.
>
> When we hit -EDQUOT inside btrfs_delayed_inode_reserve_metadata(), we
> will join into current transaction, and commit current transaction,
> while we still have transaction handler out of qgroup code.
>
> [FIX]
> Let's check current->journal before we join the transaction.
>
> If current->journal is empty or BTRFS_SEND_TRANS_STUB, it means
> ourselves are not holding a transaction, thus is able to join and then
> commit transaction.
>
> If current->journal is a valid transaction handler, we avoid committing
> transaction, but just end current transaction.
>
> This is less effective than committing current transaction, as it won't
> free metadata reserved space, but we may still free some data space by
> the incoming data write.
>
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1178634
> Fixes: c53e9653605d ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.

> ---
>  fs/btrfs/qgroup.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index bf4b02a40ecc..e320bb574421 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -3478,6 +3478,7 @@ static int try_flush_qgroup(struct btrfs_root *root)
>  {
>         struct btrfs_trans_handle *trans;
>         int ret;
> +       bool can_commit = true;
>
>         /*
>          * We don't want to run flush again and again, so if there is a running
> @@ -3489,6 +3490,19 @@ static int try_flush_qgroup(struct btrfs_root *root)
>                 return 0;
>         }
>
> +       /*
> +        * If current process holds a transaction, we shouldn't flush, as we
> +        * assume all space reservation happens before a trans handler is hold.
> +        *
> +        * But there are cases like btrfs_delayed_item_reserve_metadata() where
> +        * we try to reserve space with one trans handler already hold.
> +        * In that case we can't commit transaction. but at most end
> +        * transaction, and hope the started data writes can free some space.
> +        */
> +       if (current->journal_info &&
> +           current->journal_info != BTRFS_SEND_TRANS_STUB)
> +               can_commit = false;
> +
>         ret = btrfs_start_delalloc_snapshot(root);
>         if (ret < 0)
>                 goto out;
> @@ -3500,7 +3514,10 @@ static int try_flush_qgroup(struct btrfs_root *root)
>                 goto out;
>         }
>
> -       ret = btrfs_commit_transaction(trans);
> +       if (can_commit)
> +               ret = btrfs_commit_transaction(trans);
> +       else
> +               ret = btrfs_end_transaction(trans);
>  out:
>         clear_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state);
>         wake_up(&root->qgroup_flush_wait);
> --
> 2.29.2
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”
