From: Nikolay Borisov <nborisov@suse.com>
To: "Wilson, Ellis" <ellisw@panasas.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Understanding BTRFS RAID0 Performance
Date: Fri, 5 Oct 2018 11:45:32 +0300
Message-ID: <2da81104-f0c0-96a3-8c8f-98813e5dbeea@suse.com>
In-Reply-To: <54026c92-9cd1-2ac8-5747-c5405dd82087@panasas.com>



On  5.10.2018 00:33, Wilson, Ellis wrote:
> Hi all,
> 
> I'm attempting to understand a roughly 30% degradation in BTRFS RAID0 
> for large read I/Os across six disks compared with ext4 atop mdadm RAID0.
> 
> Specifically, I achieve performance parity with BTRFS in terms of 
> single-threaded write and read, and multi-threaded write, but poor 
> performance for multi-threaded read.  The relative discrepancy appears 
> to grow as one adds disks.  At 6 disks in a RAID0 (yes, I know, and I do 
> not care about data persistence as I have this solved at a different 
> layer) I see approximately 1.3GB/s for ext4 atop mdadm, but only about 
> 950MB/s for BTRFS, both using four threads to read and write four 
> different large files.  Across a large number of my nodes this 
> aggregates to a sizable performance loss.
> 
> This has been a long and winding road for me, but to keep my question 
> somewhat succinct, I'm down to the level of block tracing and one thing 
> that stands out between the two traces is the number of rather small 
> read I/O's that reach one of the drives in the test is vastly different 
> for mdadm RAID0 vs BTRFS, which I think explains (in part at least) the 
> performance drop off.  The read queue depth for BTRFS hovers in the 
> upper single digits while the ext4/mdadm queue depth is towards 20.  I'm 
> unsure right now if this is related or not.
> 
> Benchmark: FIO was used with the following command:
> fio --name=read --rw=read --bs=1M --direct=0 --size=16G --numjobs=4 
> --runtime=120 --group_reporting

Right, so you are doing sequential buffered reads (--direct=0). btrfs
uses generic_file_read_iter for its read path, which simply calls
btrfs_readpage, which ends up in:

btrfs_readpage
  extent_read_full_page
   __extent_read_full_page
    __do_readpage
      submit_extent_page <- Here we have some code which is supposed to
detect contiguous bios and merge them

So my first guess would be to instrument the code around the merging
logic and see if it works as expected and is able to merge the majority
of the bios.
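
As a rough userspace illustration of what "merging contiguous bios" means,
the sketch below coalesces back-to-back requests and reports how many were
absorbed into a neighbour. It is not the kernel's logic: the function names
(merge_contiguous, merge_ratio) and the (start, length) tuple format are
assumptions made for this example, and the input would have to come from
whatever tracing you already have in place.

```python
def merge_contiguous(requests):
    """Coalesce requests that start exactly where the previous one ended.

    `requests` is a list of (start, length) tuples in submission order.
    This is a simplified model: it only looks at the immediately
    preceding request and ignores any size limit on the merged result.
    """
    merged = []
    for start, length in requests:
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_length = merged[-1]
            merged[-1] = (prev_start, prev_length + length)
        else:
            merged.append((start, length))
    return merged


def merge_ratio(requests):
    """Fraction of requests that were folded into an earlier request."""
    if not requests:
        return 0.0
    return 1 - len(merge_contiguous(requests)) / len(requests)


if __name__ == "__main__":
    # Hypothetical trace: three back-to-back reads, then a jump.
    trace = [(0, 128), (128, 128), (256, 128), (1024, 128)]
    print(merge_contiguous(trace))
    print(merge_ratio(trace))
```

A ratio close to zero on a purely sequential workload would suggest that
the requests reach the lower layers unmerged.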

> 
> The block sizes and counts of I/Os at that size I'm seeing for both 
> cases comes in like the following (my max_segment_kb_size is 4K, hence 
> the above typical upper-end):
> 
> BTRFS:
>   Count  Read I/O Size
>    21849 128
>       18 640
>        9 768
>        3 1280
>        9 1408
>        3 2048
>        3 2560
>     1011 2688
>      507 2816
> 
> ext4 on mdadm RAID0:
>   Count  Read I/O Size
>        9 8
>        3 16
>        5 256
>        5 768
>       19 1024
>      716 1536
>        5 1592
>        5 2504
>      695 2560
>       24 4096
>       21 6656
>      477 8192
> 
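
To put numbers on the two histograms above, here is a small Python sketch
that computes the share of small reads and the mean read size for each. The
counts and sizes are copied from the quoted tables in whatever unit the
trace reported them. The cut-off of 128 for "small" and the helper name
summarize are assumptions chosen for this example only.

```python
# {read I/O size: count}, copied from the quoted tables.
BTRFS = {128: 21849, 640: 18, 768: 9, 1280: 3, 1408: 9,
         2048: 3, 2560: 3, 2688: 1011, 2816: 507}
MDRAID = {8: 9, 16: 3, 256: 5, 768: 5, 1024: 19, 1536: 716,
          1592: 5, 2504: 5, 2560: 695, 4096: 24, 6656: 21, 8192: 477}


def summarize(hist, small_limit=128):
    """Summarize a {size: count} histogram of read I/Os.

    `small_limit` is an arbitrary cut-off (an assumption of this sketch):
    reads of that size or smaller are counted as "small".
    """
    ios = sum(hist.values())
    volume = sum(size * count for size, count in hist.items())
    small_ios = sum(c for s, c in hist.items() if s <= small_limit)
    small_volume = sum(s * c for s, c in hist.items() if s <= small_limit)
    return {
        "ios": ios,
        "mean_size": volume / ios,
        "small_io_share": small_ios / ios,
        "small_volume_share": small_volume / volume,
    }


if __name__ == "__main__":
    for name, hist in (("btrfs", BTRFS), ("ext4/mdadm", MDRAID)):
        print(name, summarize(hist))
```

Because both shares are ratios, they do not depend on knowing the unit of
the size column.
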
> Before I dive into the BTRFS source or try tracing in a different way, I 
> wanted to see if this was a well-known artifact of BTRFS RAID0 and, even 
> better, if there's any tunables available for RAID0 in BTRFS I could 
> play with.  The man page for mkfs.btrfs and btrfstune in the tuning 
> regard seemed...sparse.
> 
> Any help or pointers are greatly appreciated!
> 
> Thanks,
> 
> ellis
> 
