From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Filipe Manana <fdmanana@kernel.org>, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [POC for v5.15 0/2] btrfs: defrag: what if v5.15 is doing proper defrag
Date: Tue, 25 Jan 2022 18:55:41 +0800
Message-ID: <791ca198-d4d0-91b3-ed9b-63cc19c78437@gmx.com>
In-Reply-To: <Ye/S15/clpSOG3y6@debian9.Home>



On 2022/1/25 18:37, Filipe Manana wrote:
> On Tue, Jan 25, 2022 at 02:50:55PM +0800, Qu Wenruo wrote:
>> ** DON'T MERGE, THIS IS JUST A PROOF OF CONCEPT **
>>
>> There are several reports that btrfs autodefrag in v5.16 causes more
>> IO than in v5.15.
>>
>> But it turns out that commit f458a3873ae ("btrfs: fix race when
>> defragmenting leads to unnecessary IO") makes defrag do less work
>> than it should, thus damping the IO for autodefrag.
>>
>> This POC series makes the v5.15 kernel do proper defrag of all good
>> candidates while still not defragging any hole/preallocated range.
>>
>> The test script here looks like this:
>>
>> 	wipefs -fa $dev
>> 	mkfs.btrfs -f $dev -U $uuid > /dev/null
>> 	mount $dev $mnt -o autodefrag
>> 	$fsstress -w -n 2000 -p 1 -d $mnt -s 1642319517
>> 	sync
>> 	echo "=== baseline ==="
>> 	cat /sys/fs/btrfs/$uuid/debug/io_accounting/data_write
>> 	echo 0 > /sys/fs/btrfs/$uuid/debug/cleaner_trigger
>> 	sleep 3
>> 	sync
>> 	echo "=== after autodefrag ==="
>> 	cat /sys/fs/btrfs/$uuid/debug/io_accounting/data_write
>> 	umount $mnt
>>
>> <uuid>/debug/io_accounting/data_write is a new debug feature showing
>> how many bytes have been written for a btrfs filesystem.
>> The numbers are before chunk mapping.
>> cleaner_trigger is the trigger to wake up cleaner_kthread so autodefrag
>> can do its work.
>>
>> Here is the result:
>>
>>                 | Data bytes written | Diff to baseline
>> ----------------+--------------------+------------------
>> no autodefrag   | 36896768           | 0
>> v5.15 vanilla   | 40079360           | +8.6%
>> v5.15 POC       | 42491904           | +15.2%
>> v5.16 fixes     | 42536960           | +15.3%
>>
>> The data shows that, although v5.15 vanilla really causes the least
>> amount of IO for autodefrag, if v5.15 is patched with the POC to do
>> proper defrag, the final IO is almost the same as v5.16 with the
>> submitted fixes.
>>
>> So this proves that the lower IO of v5.15 is not a valid default, but
>> a regression which leads to less efficient defrag.
>>
>> And the IO increase is in fact proof of a regression being fixed.
>
> Are you sure that's the only thing?

Not the only thing, but it proves that the v5.15 baseline is not a
reliable one.

> Users report massive IO difference, 15% more does not seem to be massive.
> François for example reported a difference of 10 ops/s vs 1k ops/s [1]

This is just for the seed I'm using, as it provides a reliable and
constant baseline on which all my previous testing and debugging are
based.

It can definitely be way more IO if the load involves more full cluster
rejection.
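
For reference, the percentages in the quoted table follow directly from
the raw data_write byte counts. A minimal sketch of that arithmetic in
shell (the helper name pct_diff is only for illustration and is not part
of the test script; the byte counts are the ones from the table):

```shell
#!/bin/sh
# Print the relative change of a measured byte count against a baseline,
# as a signed percentage with one decimal place.
# pct_diff is a hypothetical helper name, not something from the series.
pct_diff() {
    # $1 = baseline bytes, $2 = measured bytes
    awk -v base="$1" -v val="$2" \
        'BEGIN { printf "%+.1f%%\n", (val - base) / base * 100 }'
}

baseline=36896768                  # no autodefrag
pct_diff "$baseline" 40079360      # v5.15 vanilla
pct_diff "$baseline" 42491904      # v5.15 POC
pct_diff "$baseline" 42536960      # v5.16 fixes
```

These figures only hold for this one fsstress seed, so they say nothing
about how large the difference gets under other workloads.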

>
> It also does not explain the 100% cpu usage of the cleaner kthread.
> Scanning the whole file based on extent maps and not using
> btrfs_search_forward() anymore, as discussed yesterday on slack, can
> however contribute to much higher cpu usage.

That's definitely one possible reason.

But this particular analysis has io_accounting focused on data_write,
thus metadata can definitely play its part in it.

Nevertheless, this already shows there are problems in the old
autodefrag path that were not really exposed.

Thanks,
Qu

>
> [1] https://lore.kernel.org/linux-btrfs/CAEwRaO4y3PPPUdwYjNDoB9m9CLzfd3DFFk2iK1X6OyyEWG5-mg@mail.gmail.com/
>
> Thanks.
>
>>
>> Qu Wenruo (2):
>>    btrfs: defrag: don't defrag preallocated extents
>>    btrfs: defrag: limit cluster size to the first hole/prealloc range
>>
>>   fs/btrfs/ioctl.c | 48 ++++++++++++++++++++++++++++++++++++++++++------
>>   1 file changed, 42 insertions(+), 6 deletions(-)
>>
>> --
>> 2.34.1
>>
