linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Btrfs: check if a log root exists before locking the log_mutex on unlink
@ 2020-06-15  9:38 fdmanana
  2020-06-15 17:48 ` Josef Bacik
  2020-06-16 13:12 ` David Sterba
  0 siblings, 2 replies; 4+ messages in thread
From: fdmanana @ 2020-06-15  9:38 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

This brings back an optimization that commit e678934cbe5f02 ("btrfs:
Remove unnecessary check from join_running_log_trans") removed, but in
a different form. So it's almost equivalent to a revert.

That commit removed an optimization where we avoid locking a root's
log_mutex when there is no log tree created in the current transaction.
The affected code path is triggered through unlink operations.

That commit was based on the assumption that the optimization was not
necessary because we used to have the following checks when the patch
was authored:

  int btrfs_del_dir_entries_in_log(...)
  {
        (...)
        if (dir->logged_trans < trans->transid)
            return 0;

        ret = join_running_log_trans(root);
        (...)
   }

   int btrfs_del_inode_ref_in_log(...)
   {
        (...)
        if (inode->logged_trans < trans->transid)
            return 0;

        ret = join_running_log_trans(root);
        (...)
   }

However before that patch was merged, another patch was merged first which
replaced those checks because they were buggy.

That other patch corresponds to commit 803f0f64d17769 ("Btrfs: fix fsync
not persisting dentry deletions due to inode evictions"). The assumption
that if the logged_trans field of an inode had a smaller value then the
current transaction's generation (transid) meant that the inode was not
logged in the current transaction was only correct if the inode was not
evicted and reloaded in the current transaction. So the corresponding bug
fix changed those checks and replaced them with the following helper
function:

  static bool inode_logged(struct btrfs_trans_handle *trans,
                           struct btrfs_inode *inode)
  {
        if (inode->logged_trans == trans->transid)
                return true;

        if (inode->last_trans == trans->transid &&
            test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags) &&
            !test_bit(BTRFS_FS_LOG_RECOVERING, &trans->fs_info->flags))
                return true;

        return false;
  }

So if we have a subvolume without a log tree in the current transaction
(because we had no fsyncs), every time we unlink an inode we can end up
trying to lock the log_mutex of the root through join_running_log_trans()
twice, once for the inode being unlinked (by btrfs_del_inode_ref_in_log())
and once for the parent directory (with btrfs_del_dir_entries_in_log()).

This means if we have several unlink operations happening in parallel for
inodes in the same subvolume, and the those inodes and/or their parent
inode were changed in the current transaction, we end up having a lot of
contention on the log_mutex.

The test robots from intel reported a -30.7% performance regression for
a REAIM test after commit e678934cbe5f02 ("btrfs: Remove unnecessary check
from join_running_log_trans").

So just bring back the optimization to join_running_log_trans() where we
check first if a log root exists before trying to lock the log_mutex. This
is done by checking for a bit that is set on the root when a log tree is
created and removed when a log tree is freed (at transaction commit time).

Commit e678934cbe5f02 ("btrfs: Remove unnecessary check from
join_running_log_trans") was merged in the 5.4 merge window while commit
803f0f64d17769 ("Btrfs: fix fsync not persisting dentry deletions due to
inode evictions") was merged in the 5.3 merge window. But the first
commit was actually authored before the second commit (May 23 2019 vs
June 19 2019).

Reported-by: kernel test robot <rong.a.chen@intel.com>
Link: https://lore.kernel.org/lkml/20200611090233.GL12456@shao2-debian/
Fixes: e678934cbe5f02 ("btrfs: Remove unnecessary check from join_running_log_trans")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ctree.h    | 2 ++
 fs/btrfs/tree-log.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 30ce7039bc27..d404cce8ae40 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1009,6 +1009,8 @@ enum {
 	BTRFS_ROOT_DEAD_RELOC_TREE,
 	/* Mark dead root stored on device whose cleanup needs to be resumed */
 	BTRFS_ROOT_DEAD_TREE,
+	/* The root has a log tree. Used only for subvolume roots. */
+	BTRFS_ROOT_HAS_LOG_TREE,
 };
 
 /*
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 920cee312f4e..cd5348f352dd 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -169,6 +169,7 @@ static int start_log_trans(struct btrfs_trans_handle *trans,
 		if (ret)
 			goto out;
 
+		set_bit(BTRFS_ROOT_HAS_LOG_TREE, &root->state);
 		clear_bit(BTRFS_ROOT_MULTI_LOG_TASKS, &root->state);
 		root->log_start_pid = current->pid;
 	}
@@ -195,6 +196,9 @@ static int join_running_log_trans(struct btrfs_root *root)
 {
 	int ret = -ENOENT;
 
+	if (!test_bit(BTRFS_ROOT_HAS_LOG_TREE, &root->state))
+		return ret;
+
 	mutex_lock(&root->log_mutex);
 	if (root->log_root) {
 		ret = 0;
@@ -3303,6 +3307,7 @@ int btrfs_free_log(struct btrfs_trans_handle *trans, struct btrfs_root *root)
 	if (root->log_root) {
 		free_log_tree(trans, root->log_root);
 		root->log_root = NULL;
+		clear_bit(BTRFS_ROOT_HAS_LOG_TREE, &root->state);
 	}
 	return 0;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] Btrfs: check if a log root exists before locking the log_mutex on unlink
  2020-06-15  9:38 [PATCH] Btrfs: check if a log root exists before locking the log_mutex on unlink fdmanana
@ 2020-06-15 17:48 ` Josef Bacik
  2020-06-16 13:12 ` David Sterba
  1 sibling, 0 replies; 4+ messages in thread
From: Josef Bacik @ 2020-06-15 17:48 UTC (permalink / raw)
  To: fdmanana, linux-btrfs

On 6/15/20 5:38 AM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> This brings back an optimization that commit e678934cbe5f02 ("btrfs:
> Remove unnecessary check from join_running_log_trans") removed, but in
> a different form. So it's almost equivalent to a revert.
> 
> That commit removed an optimization where we avoid locking a root's
> log_mutex when there is no log tree created in the current transaction.
> The affected code path is triggered through unlink operations.
> 
> That commit was based on the assumption that the optimization was not
> necessary because we used to have the following checks when the patch
> was authored:
> 
>    int btrfs_del_dir_entries_in_log(...)
>    {
>          (...)
>          if (dir->logged_trans < trans->transid)
>              return 0;
> 
>          ret = join_running_log_trans(root);
>          (...)
>     }
> 
>     int btrfs_del_inode_ref_in_log(...)
>     {
>          (...)
>          if (inode->logged_trans < trans->transid)
>              return 0;
> 
>          ret = join_running_log_trans(root);
>          (...)
>     }
> 
> However before that patch was merged, another patch was merged first which
> replaced those checks because they were buggy.
> 
> That other patch corresponds to commit 803f0f64d17769 ("Btrfs: fix fsync
> not persisting dentry deletions due to inode evictions"). The assumption
> that if the logged_trans field of an inode had a smaller value then the
> current transaction's generation (transid) meant that the inode was not
> logged in the current transaction was only correct if the inode was not
> evicted and reloaded in the current transaction. So the corresponding bug
> fix changed those checks and replaced them with the following helper
> function:
> 
>    static bool inode_logged(struct btrfs_trans_handle *trans,
>                             struct btrfs_inode *inode)
>    {
>          if (inode->logged_trans == trans->transid)
>                  return true;
> 
>          if (inode->last_trans == trans->transid &&
>              test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags) &&
>              !test_bit(BTRFS_FS_LOG_RECOVERING, &trans->fs_info->flags))
>                  return true;
> 
>          return false;
>    }
> 
> So if we have a subvolume without a log tree in the current transaction
> (because we had no fsyncs), every time we unlink an inode we can end up
> trying to lock the log_mutex of the root through join_running_log_trans()
> twice, once for the inode being unlinked (by btrfs_del_inode_ref_in_log())
> and once for the parent directory (with btrfs_del_dir_entries_in_log()).
> 
> This means if we have several unlink operations happening in parallel for
> inodes in the same subvolume, and the those inodes and/or their parent
> inode were changed in the current transaction, we end up having a lot of
> contention on the log_mutex.
> 
> The test robots from intel reported a -30.7% performance regression for
> a REAIM test after commit e678934cbe5f02 ("btrfs: Remove unnecessary check
> from join_running_log_trans").
> 
> So just bring back the optimization to join_running_log_trans() where we
> check first if a log root exists before trying to lock the log_mutex. This
> is done by checking for a bit that is set on the root when a log tree is
> created and removed when a log tree is freed (at transaction commit time).
> 
> Commit e678934cbe5f02 ("btrfs: Remove unnecessary check from
> join_running_log_trans") was merged in the 5.4 merge window while commit
> 803f0f64d17769 ("Btrfs: fix fsync not persisting dentry deletions due to
> inode evictions") was merged in the 5.3 merge window. But the first
> commit was actually authored before the second commit (May 23 2019 vs
> June 19 2019).
> 
> Reported-by: kernel test robot <rong.a.chen@intel.com>
> Link: https://lore.kernel.org/lkml/20200611090233.GL12456@shao2-debian/
> Fixes: e678934cbe5f02 ("btrfs: Remove unnecessary check from join_running_log_trans")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Btrfs: check if a log root exists before locking the log_mutex on unlink
  2020-06-15  9:38 [PATCH] Btrfs: check if a log root exists before locking the log_mutex on unlink fdmanana
  2020-06-15 17:48 ` Josef Bacik
@ 2020-06-16 13:12 ` David Sterba
  2020-06-16 13:21   ` Filipe Manana
  1 sibling, 1 reply; 4+ messages in thread
From: David Sterba @ 2020-06-16 13:12 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Mon, Jun 15, 2020 at 10:38:44AM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> This brings back an optimization that commit e678934cbe5f02 ("btrfs:
> Remove unnecessary check from join_running_log_trans") removed, but in
> a different form. So it's almost equivalent to a revert.

I very much prefer the bit to be the synchronization mechanism, the
logic is easy to follow instead of the cryptic barrier.

The original patch came with numbers to support the 'not needed and no
perf impact
(https://lore.kernel.org/linux-btrfs/20190523115126.10532-1-nborisov@suse.com/)
but it probably wasn't triggering the right load.

[...]
> The test robots from intel reported a -30.7% performance regression for
> a REAIM test after commit e678934cbe5f02 ("btrfs: Remove unnecessary check
> from join_running_log_trans").

Thanks for fixing the perf regression and points for the test robot too.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Btrfs: check if a log root exists before locking the log_mutex on unlink
  2020-06-16 13:12 ` David Sterba
@ 2020-06-16 13:21   ` Filipe Manana
  0 siblings, 0 replies; 4+ messages in thread
From: Filipe Manana @ 2020-06-16 13:21 UTC (permalink / raw)
  To: dsterba, Filipe Manana, linux-btrfs

On Tue, Jun 16, 2020 at 2:12 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Mon, Jun 15, 2020 at 10:38:44AM +0100, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > This brings back an optimization that commit e678934cbe5f02 ("btrfs:
> > Remove unnecessary check from join_running_log_trans") removed, but in
> > a different form. So it's almost equivalent to a revert.
>
> I very much prefer the bit to be the synchronization mechanism, the
> logic is easy to follow instead of the cryptic barrier.
>
> The original patch came with numbers to support the 'not needed and no
> perf impact
> (https://lore.kernel.org/linux-btrfs/20190523115126.10532-1-nborisov@suse.com/)
> but it probably wasn't triggering the right load.

The patch was tested without the other patch that fixes bugs in the
"inode was logged before" check, so no regression should happen in
that case.
As I mentioned in the change log, it was fine from a performance point
of view before the bug fix, after which the hot code path can be hit
again.

And 5 processes only will probably not be enough to detect it in many
environments.

>
> [...]
> > The test robots from intel reported a -30.7% performance regression for
> > a REAIM test after commit e678934cbe5f02 ("btrfs: Remove unnecessary check
> > from join_running_log_trans").
>
> Thanks for fixing the perf regression and points for the test robot too.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-06-16 13:22 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-15  9:38 [PATCH] Btrfs: check if a log root exists before locking the log_mutex on unlink fdmanana
2020-06-15 17:48 ` Josef Bacik
2020-06-16 13:12 ` David Sterba
2020-06-16 13:21   ` Filipe Manana

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).