* [PATCH] btrfs: Simplify join_running_log_trans
From: Nikolay Borisov @ 2019-05-20 8:11 UTC
To: linux-btrfs; +Cc: fdmanana, Nikolay Borisov
This patch removes the stray smp_mb() that precedes the root->log_root check in
join_running_log_trans, along with the unlocked check of root->log_root itself.
log_root is only set in btrfs_add_log_tree, which is called from start_log_trans
under root->log_mutex. Furthermore, log_root is freed in btrfs_free_log, called
from commit_fs_root, which in turn is called from the transaction's critical
section (TRANS_COMMIT_DOING). These two locking invariants ensure that
join_running_log_trans never sees an invalid value of ->log_root.
Additionally, this results in an improvement of around 26% when deleting 500k files/dir.
All values are in seconds.
        With patch (real)  With patch (sys)  Without patch (real)  Without patch (sys)
        80                 78                91                    90
        63                 62                93                    91
        65                 64                92                    90
        67                 65                93                    90
        75                 73                90                    88
        75                 73                91                    89
        75                 73                93                    90
        74                 73                89                    87
        76                 74                91                    89
stddev  5.76146200581454   5.45690184791497  1.42400062421959      1.22474487139159
mean    72.2222222222222   70.5555555555556  91.4444444444444      89.3333333333333
median  75                 73                91                    90
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
This passed a full xfstests run, and the performance results were obtained with
the following test case:

#!/bin/bash
for i in {1..10}; do
	echo "Test run: $i"
	# Populate the scratch mount using only creat operations
	./ltp/fsstress -z -d /media/scratch/ -f creat=100 -n 500000
	sync
	# Time the deletion, which is the operation being measured
	time rm -rf /media/scratch/*
	sync
done
fs/btrfs/tree-log.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 6adcd8a2c5c7..61744d8af106 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -188,10 +188,6 @@ static int join_running_log_trans(struct btrfs_root *root)
 {
 	int ret = -ENOENT;

-	smp_mb();
-	if (!root->log_root)
-		return -ENOENT;
-
 	mutex_lock(&root->log_mutex);
 	if (root->log_root) {
 		ret = 0;
--
2.17.1
* Re: [PATCH] btrfs: Simplify join_running_log_trans
From: Nikolay Borisov @ 2019-05-20 8:18 UTC
To: linux-btrfs; +Cc: fdmanana
On 20.05.19 at 11:11, Nikolay Borisov wrote:
> This patch removes stray smp_mb before root->log_root from join_running_log_trans
> as well as the unlocked check for root->log_root. log_root is only set in
> btrfs_add_log_tree, called from start_log_trans under root->log_mutex.
> Furthermore, log_root is freed in btrfs_free_log, called from commit_fs_root,
> which in turn is called from transaction's critical section (TRANS_COMMIT_DOING).
> Those 2 locking invariants ensure join_running_log_trans don't see invalid
> values of ->log_root.
>
> Additionally this results in around 26% improvement when deleting 500k files/dir.
> All values are in seconds.
>
> With Patch (real) With patch (sys) Without patch (real) Without patch (sys)
> 80 78 91 90
> 63 62 93 91
> 65 64 92 90
> 67 65 93 90
> 75 73 90 88
> 75 73 91 89
> 75 73 93 90
> 74 73 89 87
> 76 74 91 89
> stddev 5.76146200581454 5.45690184791497 1.42400062421959 1.22474487139159
> mean 72.2222222222222 70.5555555555556 91.4444444444444 89.3333333333333
> median 75 73 91 90
>
        With patch (real)  With patch (sys)  Without patch (real)  Without patch (sys)
        80                 78                91                    90
        63                 62                93                    91
        65                 64                92                    90
        67                 65                93                    90
        75                 73                90                    88
        75                 73                91                    89
        75                 73                93                    90
        74                 73                89                    87
        76                 74                91                    89
stddev  5.76146200581454   5.45690184791497  1.42400062421959      1.22474487139159
mean    72.2222222222222   70.5555555555556  91.4444444444444      89.3333333333333
median  75                 73                91                    90
Here's the perf data without being butchered.
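The summary rows and the quoted improvement can be rechecked from the raw "real" timings. What follows is a minimal C sketch, assuming the stddev row is the sample (n - 1) standard deviation; the names demo_mean, demo_sample_variance and demo_speedup are illustrative and not part of the patch. With these numbers the ratio of the mean runtimes is about 1.27, which appears to be where the "around 26%" comes from; expressed as a reduction in elapsed time it is closer to 21%.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_RUNS 9

/* Timings in seconds, copied from the two "real" columns of the table. */
static const double with_patch_real[DEMO_RUNS] = {
	80, 63, 65, 67, 75, 75, 75, 74, 76
};
static const double without_patch_real[DEMO_RUNS] = {
	91, 93, 92, 93, 90, 91, 93, 89, 91
};

/* Arithmetic mean of n samples. */
static double demo_mean(const double *v, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Sample variance (n - 1 denominator); its square root is the stddev. */
static double demo_sample_variance(const double *v, size_t n)
{
	double m, ss = 0.0;

	assert(n > 1);
	m = demo_mean(v, n);
	for (size_t i = 0; i < n; i++)
		ss += (v[i] - m) * (v[i] - m);
	return ss / (double)(n - 1);
}

/* Ratio of mean runtimes; above 1.0 means the patched runs were faster. */
static double demo_speedup(void)
{
	return demo_mean(without_patch_real, DEMO_RUNS) /
	       demo_mean(with_patch_real, DEMO_RUNS);
}
```

The sample variance of the with-patch column comes out near 33.19, whose square root matches the 5.76 in the stddev row, so the n - 1 assumption looks consistent with the table.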
* Re: [PATCH] btrfs: Simplify join_running_log_trans
From: Filipe Manana @ 2019-05-20 9:10 UTC
To: Nikolay Borisov; +Cc: linux-btrfs, Filipe David Borba Manana
On Mon, May 20, 2019 at 9:23 AM Nikolay Borisov <nborisov@suse.com> wrote:
>
>
>
> On 20.05.19 at 11:11, Nikolay Borisov wrote:
> > This patch removes stray smp_mb before root->log_root from join_running_log_trans
> > as well as the unlocked check for root->log_root. log_root is only set in
> > btrfs_add_log_tree, called from start_log_trans under root->log_mutex.
> > Furthermore, log_root is freed in btrfs_free_log, called from commit_fs_root,
> > which in turn is called from transaction's critical section (TRANS_COMMIT_DOING).
> > Those 2 locking invariants ensure join_running_log_trans don't see invalid
> > values of ->log_root.
> >
> > Additionally this results in around 26% improvement when deleting 500k files/dir.
> > All values are in seconds.
> >
> > With Patch (real) With patch (sys) Without patch (real) Without patch (sys)
> > 80 78 91 90
> > 63 62 93 91
> > 65 64 92 90
> > 67 65 93 90
> > 75 73 90 88
> > 75 73 91 89
> > 75 73 93 90
> > 74 73 89 87
> > 76 74 91 89
> > stddev 5.76146200581454 5.45690184791497 1.42400062421959 1.22474487139159
> > mean 72.2222222222222 70.5555555555556 91.4444444444444 89.3333333333333
> > median 75 73 91 90
> >
> With Patch (real) With patch (sys) Without patch (real) Without patch (sys)
> 80 78 91 90
> 63 62 93 91
> 65 64 92 90
> 67 65 93 90
> 75 73 90 88
> 75 73 91 89
> 75 73 93 90
> 74 73 89 87
> 76 74 91 89
> stddev 5.76146200581454 5.45690184791497 1.42400062421959 1.22474487139159
> mean 72.2222222222222 70.5555555555556 91.4444444444444 89.3333333333333
> median 75 73 91 90
Great.
How was that test done?
Simply deleting the files with nothing else running in parallel?
How does it behave if, while the files are being deleted, there are
concurrent fsyncs on other files of the same subvolume, that is, while
the mutex is held?
I ask because that check, done without holding the mutex, is likely
there for performance reasons.
Thanks.
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”
* Re: [PATCH] btrfs: Simplify join_running_log_trans
From: Nikolay Borisov @ 2019-05-20 9:13 UTC
To: fdmanana; +Cc: linux-btrfs, Filipe David Borba Manana
On 20.05.19 at 12:10, Filipe Manana wrote:
> On Mon, May 20, 2019 at 9:23 AM Nikolay Borisov <nborisov@suse.com> wrote:
>>
>>
>>
>> On 20.05.19 at 11:11, Nikolay Borisov wrote:
>>> This patch removes stray smp_mb before root->log_root from join_running_log_trans
>>> as well as the unlocked check for root->log_root. log_root is only set in
>>> btrfs_add_log_tree, called from start_log_trans under root->log_mutex.
>>> Furthermore, log_root is freed in btrfs_free_log, called from commit_fs_root,
>>> which in turn is called from transaction's critical section (TRANS_COMMIT_DOING).
>>> Those 2 locking invariants ensure join_running_log_trans don't see invalid
>>> values of ->log_root.
>>>
>>> Additionally this results in around 26% improvement when deleting 500k files/dir.
>>> All values are in seconds.
>>>
>>> With Patch (real) With patch (sys) Without patch (real) Without patch (sys)
>>> 80 78 91 90
>>> 63 62 93 91
>>> 65 64 92 90
>>> 67 65 93 90
>>> 75 73 90 88
>>> 75 73 91 89
>>> 75 73 93 90
>>> 74 73 89 87
>>> 76 74 91 89
>>> stddev 5.76146200581454 5.45690184791497 1.42400062421959 1.22474487139159
>>> mean 72.2222222222222 70.5555555555556 91.4444444444444 89.3333333333333
>>> median 75 73 91 90
>>>
>> With Patch (real) With patch (sys) Without patch (real) Without patch (sys)
>> 80 78 91 90
>> 63 62 93 91
>> 65 64 92 90
>> 67 65 93 90
>> 75 73 90 88
>> 75 73 91 89
>> 75 73 93 90
>> 74 73 89 87
>> 76 74 91 89
>> stddev 5.76146200581454 5.45690184791497 1.42400062421959 1.22474487139159
>> mean 72.2222222222222 70.5555555555556 91.4444444444444 89.3333333333333
>> median 75 73 91 90
>
> Great.
>
> How was that test done?
> Simply deleting the files with nothing else running in parallel?
Yes
>
> How does it behave if while the files are being deleted, there are
> concurrent fsyncs on other files of the same subvolume, that is, while
> the mutex is held?
>
> Because that check without holding the mutex, is likely there for
> performance reasons.
So I will repeat the test with concurrent fsyncs running, and also with
no concurrent fsyncs but with only the smp_mb removed. I think the
performance gain should come from removing the smp_mb and not
necessarily the check. In the worst case I can leave the check intact
and just remove the smp_mb, because it doesn't add any value
correctness-wise.
* Re: [PATCH] btrfs: Simplify join_running_log_trans
From: Filipe Manana @ 2019-05-20 14:42 UTC
To: Nikolay Borisov; +Cc: linux-btrfs, Filipe David Borba Manana
On Mon, May 20, 2019 at 10:13 AM Nikolay Borisov <nborisov@suse.com> wrote:
>
>
>
> On 20.05.19 at 12:10, Filipe Manana wrote:
> > On Mon, May 20, 2019 at 9:23 AM Nikolay Borisov <nborisov@suse.com> wrote:
> >>
> >>
> >>
> >> On 20.05.19 at 11:11, Nikolay Borisov wrote:
> >>> This patch removes stray smp_mb before root->log_root from join_running_log_trans
> >>> as well as the unlocked check for root->log_root. log_root is only set in
> >>> btrfs_add_log_tree, called from start_log_trans under root->log_mutex.
> >>> Furthermore, log_root is freed in btrfs_free_log, called from commit_fs_root,
> >>> which in turn is called from transaction's critical section (TRANS_COMMIT_DOING).
> >>> Those 2 locking invariants ensure join_running_log_trans don't see invalid
> >>> values of ->log_root.
> >>>
> >>> Additionally this results in around 26% improvement when deleting 500k files/dir.
> >>> All values are in seconds.
> >>>
> >>> With Patch (real) With patch (sys) Without patch (real) Without patch (sys)
> >>> 80 78 91 90
> >>> 63 62 93 91
> >>> 65 64 92 90
> >>> 67 65 93 90
> >>> 75 73 90 88
> >>> 75 73 91 89
> >>> 75 73 93 90
> >>> 74 73 89 87
> >>> 76 74 91 89
> >>> stddev 5.76146200581454 5.45690184791497 1.42400062421959 1.22474487139159
> >>> mean 72.2222222222222 70.5555555555556 91.4444444444444 89.3333333333333
> >>> median 75 73 91 90
> >>>
> >> With Patch (real) With patch (sys) Without patch (real) Without patch (sys)
> >> 80 78 91 90
> >> 63 62 93 91
> >> 65 64 92 90
> >> 67 65 93 90
> >> 75 73 90 88
> >> 75 73 91 89
> >> 75 73 93 90
> >> 74 73 89 87
> >> 76 74 91 89
> >> stddev 5.76146200581454 5.45690184791497 1.42400062421959 1.22474487139159
> >> mean 72.2222222222222 70.5555555555556 91.4444444444444 89.3333333333333
> >> median 75 73 91 90
> >
> > Great.
> >
> > How was that test done?
> > Simply deleting the files with nothing else running in parallel?
>
> Yes
>
> >
> > How does it behave if while the files are being deleted, there are
> > concurrent fsyncs on other files of the same subvolume, that is, while
> > the mutex is held?
> >
> > Because that check without holding the mutex, is likely there for
> > performance reasons.
>
> So I will repeat the test when there are concurrent fsyncs running and
> also with no concurrent fsyncs running but just removing the smp_mb, I
> think the performance gain should be from removing the smp_mb and not
> necessarily the check. In the worst case I can leave the check intact
> and just remove the smp_mb because it doesn't add any value
> correctness-wise.
So the barrier is likely there to make sure we see a non-NULL log_root
after it was set concurrently by someone calling start_log_trans().
It pairs, implicitly, with the mutex_unlock() in start_log_trans(),
since that implies a memory barrier.
With your patch, we always end up doing a mutex_lock(), which also
implies a memory barrier.
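The two shapes of join_running_log_trans() being compared here can be sketched in userspace C. This is only an illustration under stated assumptions: struct demo_root, the demo_* functions and the lock_acquisitions counter are hypothetical stand-ins, pthread_mutex_t replaces the kernel mutex, and a sequentially consistent C11 atomic load only loosely approximates smp_mb() followed by a plain read.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct btrfs_root, not the kernel type. */
struct demo_root {
	pthread_mutex_t log_mutex;
	_Atomic(void *) log_root;	/* NULL while no log tree exists */
	int lock_acquisitions;		/* illustration only: joins that took the mutex */
};

static void demo_root_init(struct demo_root *root)
{
	assert(root != NULL);
	pthread_mutex_init(&root->log_mutex, NULL);
	atomic_init(&root->log_root, NULL);
	root->lock_acquisitions = 0;
}

/* Stand-in for the writer side: log_root is only published under log_mutex. */
static void demo_set_log_root(struct demo_root *root, void *log_root)
{
	pthread_mutex_lock(&root->log_mutex);
	atomic_store(&root->log_root, log_root);
	pthread_mutex_unlock(&root->log_mutex);
}

/* Locked check shared by both variants. */
static int demo_join_locked(struct demo_root *root)
{
	int ret = -ENOENT;

	pthread_mutex_lock(&root->log_mutex);
	root->lock_acquisitions++;
	if (atomic_load(&root->log_root))
		ret = 0;
	pthread_mutex_unlock(&root->log_mutex);
	return ret;
}

/* Pre-patch shape: unlocked check first, so the mutex is skipped when NULL. */
static int demo_join_with_fast_path(struct demo_root *root)
{
	if (!atomic_load(&root->log_root))
		return -ENOENT;
	return demo_join_locked(root);
}

/* Patched shape: always take the mutex and check under it. */
static int demo_join_locked_only(struct demo_root *root)
{
	return demo_join_locked(root);
}
```

Both variants return the same result; the difference is that only the fast-path variant avoids touching the mutex when no log tree exists, which is the case the concurrent-fsync question is about.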
Sorry I missed this before, but your test doesn't make any sense. How
can a memory barrier have such a big impact on a code path (unlink)
that does a lot of much heavier stuff, like btree searches/deletes,
deleting delayed items, etc.?
Your test case is even flawed, because with it join_running_log_trans()
can never be called.
Look at its only two callers, btrfs_del_inode_ref_in_log() and
btrfs_del_dir_entries_in_log(). They do:
	if (dir->logged_trans < trans->transid)
		return 0;

	ret = join_running_log_trans(root);
	if (ret)
		return 0;
and
	if (inode->logged_trans < trans->transid)
		return 0;

	ret = join_running_log_trans(root);
	if (ret)
		return 0;
Since you never fsync the files nor the directory containing them in
that test, we never end up calling join_running_log_trans().
So I don't know where you got the 26%...
Thanks.
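The guard described above can be modelled in a few lines of userspace C. This is a sketch under assumptions, not kernel code: struct demo_inode, struct demo_trans and the demo_* functions are hypothetical, and it assumes logged_trans is 0 for an inode that was never logged and equals the transaction id once the inode has been fsynced in that transaction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; only the two field names mirror the snippets. */
struct demo_inode {
	uint64_t logged_trans;	/* assumed 0 when the inode was never logged */
};

struct demo_trans {
	uint64_t transid;
};

/* Counts how often the join function is actually reached. */
static int demo_join_calls;

static int demo_join_running_log_trans(void)
{
	demo_join_calls++;
	return 0;
}

/* Same early-return shape as the two callers quoted above. */
static int demo_del_from_log(const struct demo_inode *inode,
			     const struct demo_trans *trans)
{
	assert(inode && trans);

	if (inode->logged_trans < trans->transid)
		return 0;	/* not logged in this transaction: nothing to do */

	if (demo_join_running_log_trans())
		return 0;

	/* The real callers would remove entries from the log tree here. */
	return 0;
}
```

With a never-logged inode and any positive transid, demo_join_calls stays at zero, which mirrors why a create-then-rm workload without fsync would not exercise the patched function.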
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”