From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Filipe Manana <fdmanana@kernel.org>,
	dsterba@suse.cz, Qu Wenruo <wqu@suse.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH RFC 0/2] btrfs: remove the metadata readahead mechanism
Date: Thu, 9 Dec 2021 21:25:58 +0800	[thread overview]
Message-ID: <7eb7b1f6-6f2b-ebcd-e5da-f5945843da3f@gmx.com> (raw)
In-Reply-To: <YbHZhGGpBvqoqfiT@debian9.Home>

[-- Attachment #1: Type: text/plain, Size: 4739 bytes --]



On 2021/12/9 18:25, Filipe Manana wrote:
> On Wed, Dec 08, 2021 at 03:04:11PM +0100, David Sterba wrote:
>> On Tue, Dec 07, 2021 at 03:53:22PM +0000, Filipe Manana wrote:
>>>>> I'm doing some tests, in a VM on a dedicated HDD.
>>>>
>>>> There's some measurable difference:
>>>>
>>>> With readahead:
>>>>
>>>> Duration:         0:00:20
>>>> Total to scrub:   7.02GiB
>>>> Rate:             236.92MiB/s
>>>>
>>>> Duration:         0:00:48
>>>> Total to scrub:   12.02GiB
>>>> Rate:             198.02MiB/s
>>>>
>>>> Without readahead:
>>>>
>>>> Duration:         0:00:22
>>>> Total to scrub:   7.02GiB
>>>> Rate:             215.10MiB/s
>>>>
>>>> Duration:         0:00:50
>>>> Total to scrub:   12.02GiB
>>>> Rate:             190.66MiB/s
>>>>
>>>> The setup is: data/single, metadata/dup, no-holes, free-space-tree,
>>>> there are 8 backing devices but all reside on one HDD.
>>>>
>>>> Data generated by fio like
>>>>
>>>> fio --rw=randrw --randrepeat=1 --size=3000m \
>>>>           --bsrange=512b-64k --bs_unaligned \
>>>>           --ioengine=libaio --fsync=1024 \
>>>>           --name=job0 --name=job1 \
>>>>
>>>> and scrub starts right after this. The VM has 4G of memory and 4 CPUs.
>>>
>>> How about using bare metal? And was it a debug kernel, or a default
>>> kernel config from a distro?
>>
>> It was the debug config I use for normal testing, I'll try to redo it on
>> another physical box.
>>
>>> Those details often make all the difference (either for the best or
>>> for the worse).
>>>
>>> I'm curious to see as well the results when:
>>>
>>> 1) The reada.c code is changed to work with commit roots;
>>>
>>> 2) The standard btree readahead (struct btrfs_path::reada) is used
>>> instead of the reada.c code.
>>>
>>>>
>>>> The difference is 2 seconds, roughly 4% but the sample is not large
>>>> enough to be conclusive.
>>>
>>> A bit too small.
>>
>> What's worse, I did a few more rounds and the results were too unstable,
>> from 44 seconds to 25 seconds (all on the removed readahead branch), but
>> the machine was not quiescent.
>
> I get such huge variations too when using a debug kernel and virtualized
> disks for any tests, even for single threaded tests.
>
> That's why I use a default, non-debug, kernel config from a popular distro
> and without any virtualization (or at least have qemu use a raw device, not
> a file backed disk on top of another filesystem) when measuring performance.
>
I got my 2.5" HDD installed and ran the tests.

[CONCLUSION]

There is a small but very consistent performance drop on the HDD:

Without patchset:	average rate = 106.46 MiB/s
With patchset:		average rate = 100.74 MiB/s

Diff = -5.37% (drop relative to the rate without the patchset)

[TEST ENV]

HDD:	2TB 2.5-inch HDD, 5400 RPM, device-managed SMR
	(WDC WD20SPZX-22UA7T0)
HOST:	CPU:	AMD Ryzen 5900X
	MEM:	32GB DDR4-3200, no ECC

	No obvious CPU/IO load on the host during the test

VM:	CPU:	16 vcores
	MEM:	4GB
	CACHE:	none (even writeback caching would cause reads to be cached)

Although I'm still using a VM, the whole disk is passed through to the VM
directly, with the cache=none option.
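
For reference, a hypothetical qemu invocation matching the setup above
(the host device path and the omitted boot/network options are
assumptions; the relevant part is passing the raw whole disk with
cache=none):

  qemu-system-x86_64 -enable-kvm -smp 16 -m 4G \
      -drive file=/dev/sdb,format=raw,cache=none,if=virtio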

The initial fs uses single-device RAID0, as this causes more stripe-based
scrub, and thus triggers more small metadata readahead.
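
Something along these lines would create it (the device path inside the
VM is an assumption, and RAID0 on a single device needs a kernel and
btrfs-progs recent enough to accept the reduced minimum device count):

  mkfs.btrfs -f -d raid0 /dev/vdb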

The initial content of the fs was created with the following fio job:

[scrub-populate]
directory=/mnt/btrfs
nrfiles=16384
openfiles=16
filesize=2k-512k
readwrite=randwrite
ioengine=libaio
fallocate=none
numjobs=4
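
Saved as e.g. scrub-populate.fio (the file name is an assumption), the
job is run simply as:

  fio scrub-populate.fio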

Then 1/16th of the files (4096) were removed at random to create enough
gaps in the extent tree.
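
A minimal sketch of that step (the exact selection method does not
matter, any random removal works):

  find /mnt/btrfs -type f | shuf -n 4096 | xargs -d '\n' rm --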

Then scrub was run on the fs with both the original code and the
patchset, with path reada enabled for both the extent tree (one new
one-line patch) and the csum tree (already enabled in
btrfs_lookup_csums_range()).

Each case gets 8 scrub runs; between runs, all caches are dropped and the
fs is unmounted and re-mounted.
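
The per-run procedure is roughly the following (device path and mount
point are assumptions):

  for i in $(seq 1 8); do
          umount /mnt/btrfs
          echo 3 > /proc/sys/vm/drop_caches   # drop page/dentry/inode caches
          mount /dev/vdb /mnt/btrfs
          btrfs scrub start -B /mnt/btrfs     # -B: stay in foreground, print stats
  done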

(Yes, this is the ideal situation for the original code, as the fs is not
being changed, so the current root is the same as the commit root.)

Each scrub run shows very little variance; all duration differences are
within 1 second.

The results favor the original code, although with btrfs_reada_add()
removed the difference is not that large.

[POSSIBLE REASONS]

- Synchronous readahead
   Maybe this makes the readahead less disruptive to data reads?
   With btrfs_reada_add() removed, path reada is always asynchronous.

- Dedicated readahead thread IO priority
   Unlike path reada, the readahead thread runs with a dedicated IO priority.

I can definitely rework the framework to make it more modern while still
keeping the above two features.

Or is the 5% performance drop acceptable?

Raw scrub test results are attached.
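
As a sanity check, the average rates above can be recomputed from the
attached logs with something like:

  awk '/^Rate:/ { sub(/MiB\/s/, "", $2); sum += $2; n++ }
       END { printf "%.2f MiB/s\n", sum / n }' scrub.log.original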

Thanks,
Qu

[-- Attachment #2: scrub.log.original --]
[-- Type: text/plain, Size: 1912 bytes --]

scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:22:40 2021
Status:           finished
Duration:         0:02:26
Total to scrub:   18.02GiB
Rate:             106.00MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:25:06 2021
Status:           finished
Duration:         0:02:25
Total to scrub:   18.02GiB
Rate:             106.73MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:27:32 2021
Status:           finished
Duration:         0:02:25
Total to scrub:   18.02GiB
Rate:             106.73MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:29:57 2021
Status:           finished
Duration:         0:02:26
Total to scrub:   18.02GiB
Rate:             106.00MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:32:23 2021
Status:           finished
Duration:         0:02:25
Total to scrub:   18.02GiB
Rate:             106.73MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:34:49 2021
Status:           finished
Duration:         0:02:25
Total to scrub:   18.02GiB
Rate:             106.73MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:37:14 2021
Status:           finished
Duration:         0:02:26
Total to scrub:   18.02GiB
Rate:             106.00MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:39:40 2021
Status:           finished
Duration:         0:02:25
Total to scrub:   18.02GiB
Rate:             106.73MiB/s
Error summary:    no errors found

[-- Attachment #3: scrub.log.removed --]
[-- Type: text/plain, Size: 1912 bytes --]

scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:43:04 2021
Status:           finished
Duration:         0:02:34
Total to scrub:   18.02GiB
Rate:             100.50MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:45:39 2021
Status:           finished
Duration:         0:02:33
Total to scrub:   18.02GiB
Rate:             101.15MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:48:13 2021
Status:           finished
Duration:         0:02:34
Total to scrub:   18.02GiB
Rate:             100.50MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:50:48 2021
Status:           finished
Duration:         0:02:33
Total to scrub:   18.02GiB
Rate:             101.15MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:53:22 2021
Status:           finished
Duration:         0:02:33
Total to scrub:   18.02GiB
Rate:             101.15MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:55:56 2021
Status:           finished
Duration:         0:02:34
Total to scrub:   18.02GiB
Rate:             100.50MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 20:58:30 2021
Status:           finished
Duration:         0:02:34
Total to scrub:   18.02GiB
Rate:             100.50MiB/s
Error summary:    no errors found
scrub done for 16ecd3f9-5466-4f99-854b-2a50a4369a97
Scrub started:    Thu Dec  9 21:01:04 2021
Status:           finished
Duration:         0:02:34
Total to scrub:   18.02GiB
Rate:             100.50MiB/s
Error summary:    no errors found

Thread overview: 15+ messages
2021-12-07  7:43 [PATCH RFC 0/2] btrfs: remove the metadata readahead mechanism Qu Wenruo
2021-12-07  7:43 ` [PATCH RFC 1/2] btrfs: remove the unnecessary path parameter for scrub_raid56_parity() Qu Wenruo
2021-12-07  7:44 ` [PATCH RFC 2/2] btrfs: remove reada mechanism Qu Wenruo
2021-12-07 11:02 ` [PATCH RFC 0/2] btrfs: remove the metadata readahead mechanism Filipe Manana
2021-12-07 11:43   ` Qu Wenruo
2021-12-07 11:56     ` Filipe Manana
2021-12-07 12:01       ` Qu Wenruo
2021-12-07 14:53         ` David Sterba
2021-12-07 15:40           ` David Sterba
2021-12-07 15:53             ` Filipe Manana
2021-12-08  0:08               ` Qu Wenruo
2021-12-08 14:04               ` David Sterba
2021-12-09 10:25                 ` Filipe Manana
2021-12-09 13:25                   ` Qu Wenruo [this message]
2021-12-09 14:33                     ` Josef Bacik
