All of lore.kernel.org
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@gmail.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>,
	Filipe David Borba Manana <fdmanana@suse.com>
Subject: Re: [PATCH] btrfs: Simplify join_running_log_trans
Date: Mon, 20 May 2019 15:42:58 +0100	[thread overview]
Message-ID: <CAL3q7H6ftyYNwKRH3Nx+pXd1fTavZZ3-yo=nwaLEhi7E82jodA@mail.gmail.com> (raw)
In-Reply-To: <e4c42032-ee2b-1289-3f93-96214c753cc4@suse.com>

On Mon, May 20, 2019 at 10:13 AM Nikolay Borisov <nborisov@suse.com> wrote:
>
>
>
> On 20.05.19 г. 12:10 ч., Filipe Manana wrote:
> > On Mon, May 20, 2019 at 9:23 AM Nikolay Borisov <nborisov@suse.com> wrote:
> >>
> >>
> >>
> >> On 20.05.19 г. 11:11 ч., Nikolay Borisov wrote:
> >>> This patch removes stray smp_mb before root->log_root from join_running_log_trans
> >>> as well as the unlocked check for root->log_root. log_root is only set in
> >>> btrfs_add_log_tree, called from start_log_trans under root->log_mutex.
> >>> Furthermore, log_root is freed in btrfs_free_log, called from commit_fs_root,
> >>> which in turn is called from transaction's critical section (TRANS_COMMIT_DOING).
> >>> Those 2 locking invariants ensure join_running_log_trans don't see invalid
> >>> values of ->log_root.
> >>>
> >>> Additionally this results in around 26% improvement when deleting 500k files/dir.
> >>> All values are in seconds.
> >>>
> >>>       With Patch (real)       With patch (sys)        Without patch (real)    Without patch (sys)
> >>>           80                                  78                                              91                                              90
> >>>               63                                      62                                              93                                              91
> >>>               65                                      64                                              92                                              90
> >>>               67                                      65                                              93                                              90
> >>>               75                                      73                                              90                                              88
> >>>               75                                      73                                              91                                              89
> >>>               75                                      73                                              93                                              90
> >>>               74                                      73                                              89                                              87
> >>>               76                                      74                                              91                                              89
> >>> stddev        5.76146200581454        5.45690184791497        1.42400062421959        1.22474487139159
> >>> mean  72.2222222222222        70.5555555555556        91.4444444444444        89.3333333333333
> >>> median  75                                    73                                              91                                              90
> >>>
> >>     With Patch (real)   With patch (sys)    Without patch (real)    Without patch (sys)
> >>         80                   78                     91                      90
> >>         63                   62                     93                      91
> >>         65                   64                     92                      90
> >>         67                   65                     93                      90
> >>         75                   73                     90                      88
> >>         75                   73                     91                      89
> >>         75                   73                     93                      90
> >>         74                   73                     89                      87
> >>         76                   74                     91                      89
> >> stddev  5.76146200581454    5.45690184791497    1.42400062421959    1.22474487139159
> >> mean    72.2222222222222    70.5555555555556    91.4444444444444    89.3333333333333
> >> median  75                   73                     91                      90
> >
> > Great.
> >
> > How was that test done?
> > Simply deleting the files with nothing else running in parallel?
>
> Yes
>
> >
> > How does it behave if while the files are being deleted,  there are
> > concurrent fsyncs on other files of the same subvolume, that is, while
> > the mutex is held?
> >
> > Because that check without holding the mutex, is likely there for
> > performance reasons.
>
> So I will repeat the test when there are concurrent fsyncs running and
> also with no concurrent fsyncs running but just removing the smp_mb, I
> think the performance gain should be from removing the smp_mb and not
> necessarily the check. In the worst case I can leave the check intact
> and just remove the smp_mb because it doesn't add any value
> correctness-wise.

So the barrier is likely there to make sure we see non-null log_root
after it was set, concurrently by someone calling start_log_trans().
It pairs, implicitly, with the mutex_unlock there, at
start_log_trans(), since that implies a memory barrier.

With your patch, we always end up doing a mutex_lock(), which also
implies a memory barrier.

Sorry I missed this before, but your test doesn't make any sense. How
can a memory barrier have such a big impact on a code path (unlink)
that does lot of much heavier stuff? Like btree searches/deletes,
deleting  delayed items, etc.
Your test case is even flawed because join_running_log_trans() can
never be called.

Look at its only two callers, btrfs_del_inode_ref_in_log() and
btrfs_del_dir_entries_in_log(), they both do:

if (dir->logged_trans < trans->transid)
return 0;

ret = join_running_log_trans(root);
if (ret)
return 0;

and

if (inode->logged_trans < trans->transid)
return 0;

ret = join_running_log_trans(root);
if (ret)
return 0;

Since you never fsync the files nor the directory containing them in
that test, we never end up calling join_running_log_trans().

So I don't know where you got the 26%...

Thanks.

>
> >
> > Thanks.
> >
> >
> >>
> >>
> >> Here's the perf data without being butchered.
> >>
> >>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> >>> ---
> >>>
> >>> This passed full xfstest run and the performance results were obtained with the
> >>> following testcase:
> >>>
> >>> #!/bin/bash
> >>> for i in {1..10}; do
> >>>     echo "Testun run : %i"
> >>>     ./ltp/fsstress -z -d /media/scratch/ -f creat=100 -n 500000
> >>>     sync
> >>>     time rm -rf /media/scratch/*
> >>>     sync
> >>> done
> >>>
> >>>  fs/btrfs/tree-log.c | 4 ----
> >>>  1 file changed, 4 deletions(-)
> >>>
> >>> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> >>> index 6adcd8a2c5c7..61744d8af106 100644
> >>> --- a/fs/btrfs/tree-log.c
> >>> +++ b/fs/btrfs/tree-log.c
> >>> @@ -188,10 +188,6 @@ static int join_running_log_trans(struct btrfs_root *root)
> >>>  {
> >>>       int ret = -ENOENT;
> >>>
> >>> -     smp_mb();
> >>> -     if (!root->log_root)
> >>> -             return -ENOENT;
> >>> -
> >>>       mutex_lock(&root->log_mutex);
> >>>       if (root->log_root) {
> >>>               ret = 0;
> >>>
> >
> >
> >
> > --
> > Filipe David Manana,
> >
> > “Whether you think you can, or you think you can't — you're right.”
> >



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

      reply	other threads:[~2019-05-20 14:43 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-20  8:11 [PATCH] btrfs: Simplify join_running_log_trans Nikolay Borisov
2019-05-20  8:18 ` Nikolay Borisov
2019-05-20  9:10   ` Filipe Manana
2019-05-20  9:13     ` Nikolay Borisov
2019-05-20 14:42       ` Filipe Manana [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAL3q7H6ftyYNwKRH3Nx+pXd1fTavZZ3-yo=nwaLEhi7E82jodA@mail.gmail.com' \
    --to=fdmanana@gmail.com \
    --cc=fdmanana@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nborisov@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.