From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Andrei Borzenkov <arvidjaar@gmail.com>,
	kreijack@inwind.it, George Shammas <btrfs@shamm.as>,
	linux-btrfs@vger.kernel.org
Subject: Re: What exactly is BTRFS Raid 10?
Date: Sun, 21 Aug 2022 08:23:00 +0800
Message-ID: <e341879f-9e16-d0f9-dbb5-7c54a6bd28c2@gmx.com>
In-Reply-To: <c0080bf6-c433-30f1-83aa-de8ecba60bee@gmail.com>



On 2022/8/21 02:11, Andrei Borzenkov wrote:
> On 20.08.2022 14:28, Goffredo Baroncelli wrote:
>>
>> RAID1:
>> A new chunk is allocated on the two disks with the most space available. Each new chunk has a size of 1GB x 2 = 2GB, but only 1GB is available for data because the other 1GB holds a copy of the data.
>> A raid1 layout may have more than two disks. However, the data is copied only two times, which means that you can tolerate the loss of only one device.
>> For example, the first chunk is allocated on the first two disks; the 2nd chunk is allocated on the 1st and 3rd disks; the 3rd chunk is allocated on the 2nd and 3rd disks....
>>
> ...
>>
>> RAID10:
>> It is a mix of RAID0 and RAID1: the data is copied two times (so you can tolerate the loss of one device), but it is spread over nearly all the disks.
>> If you have 7 disks, a new chunk is allocated over 6 disks (the greatest even number <= the disk count), choosing those with the most space available.
>> If you write data to a disk, the first 64K are written to the 1st disk and the 2nd disk (as 2nd copy). When you write the 2nd 64K of data, it is written to the 3rd disk and the 4th disk (as 2nd copy). And so on until you fill the chunk.
>> When the chunk is filled, a new allocation occurs. Likely the 7th disk is used for the new chunk and one of the first 6 isn't.
>>
>
> Is large IO processed in parallel? If I have 8 disks raid10 and issue
> 256K request - will btrfs submit 4 concurrent 64K requests to each disk?

That depends on the RAID0/RAID10 stripe size.
Btrfs uses a fixed stripe size (64K).

So if you have an 8-disk raid10 and issue a 256K request, it will
first be split into 4 stripes.

Then the first stripe goes to the first 2-disk group (substripe).
The 2nd stripe goes to the 2nd substripe.
And so on, until the last stripe goes to the last substripe.

All the submissions are in parallel.
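
To make the mapping above concrete, here is a minimal Python sketch.
It is not btrfs code: the function name, the "offset within the chunk"
argument and the disk numbering (positions inside the chunk, not real
device ids) are assumptions made only for illustration.

```python
STRIPE_LEN = 64 * 1024  # fixed 64K stripe size, as stated above


def map_raid10_write(offset, length, num_disks, sub_stripes=2):
    """Split a write into per-stripe pieces and list the disks for each.

    Hypothetical helper: `offset` is a byte offset inside one chunk and
    disks are numbered 0..num_disks-1 by their position in that chunk.
    """
    groups = num_disks // sub_stripes  # 8 disks -> 4 substripe groups
    pieces = []
    end = offset + length
    while offset < end:
        stripe_nr = offset // STRIPE_LEN
        group = stripe_nr % groups
        # Never cross a stripe boundary within one piece.
        piece = min(end - offset, STRIPE_LEN - offset % STRIPE_LEN)
        disks = [group * sub_stripes + i for i in range(sub_stripes)]
        pieces.append((offset, piece, disks))
        offset += piece
    return pieces


# 256K request on an 8-disk raid10: four 64K pieces, one per disk pair.
for off, size, disks in map_raid10_write(0, 256 * 1024, 8):
    print(off // 1024, "K +", size // 1024, "K -> disks", disks)
```

Each piece goes to both disks of its group (the two copies), and the
four pieces target four different groups, so they can be in flight at
the same time.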


Strictly speaking, though, we never submit a full 256K request.
Btrfs submits the first 64K as soon as the write reaches a stripe
boundary.
(This may very slightly reduce the parallelism, but it also very
slightly reduces memory usage.)

We have some pending changes to submit a larger bio in the logical
layer and then do the split.
But the change in performance should not even be observable.

>
> And for raid1 - will there be single 256K physical disk request or 4 x
> 64K requests?

Stripe length only applies to RAID0/RAID10/RAID5/RAID6.

DUP/SINGLE/RAID1* don't care about the stripe length, so a single
256K bio is submitted to each of the RAID1* disks.
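
The difference between the two cases can be sketched like this. It is
an illustration only: the helper name is made up, and it assumes the
write starts on a stripe boundary.

```python
STRIPE_LEN = 64 * 1024  # fixed 64K stripe size
STRIPED_PROFILES = {"RAID0", "RAID10", "RAID5", "RAID6"}


def write_piece_sizes(profile, length):
    """Return the sizes of the pieces one logical write is cut into.

    Hypothetical helper; assumes the write is stripe-aligned.
    """
    if profile.upper() in STRIPED_PROFILES:
        full, rest = divmod(length, STRIPE_LEN)
        return [STRIPE_LEN] * full + ([rest] if rest else [])
    # DUP/SINGLE/RAID1*: stripe length is ignored, nothing is split.
    return [length]


print(write_piece_sizes("RAID10", 256 * 1024))  # four 64K pieces
print(write_piece_sizes("RAID1", 256 * 1024))   # one 256K piece
```

For RAID1* the single piece is then written once per copy.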

>
> What about read requests - will all disks in raid1/raid10 be used
> concurrently or btrfs always reads from the "primary" copy (and how it
> is determined then)?

Currently we use the pid as the criterion to load-balance reads for
DUP/RAID1* profiles.

Anand Jain has some pending patches to allow different load-balancing
policies to be applied for DUP/RAID1* profiles, though.
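
As a rough sketch of what pid-based balancing means, assume the copy
is picked as the pid modulo the number of copies. The modulo rule and
the function name are assumptions for illustration; the text above
only says that the pid is the criterion.

```python
import os


def pick_mirror(pid, num_copies):
    """Choose which copy to read from, based only on the reader's pid.

    Hypothetical helper: assumes a simple pid % num_copies rule.
    """
    return pid % num_copies


# Different processes tend to land on different copies, but one
# process keeps reading from the same copy for its whole lifetime.
print("this process reads copy", pick_mirror(os.getpid(), 2))
```

Under this assumption a single-threaded reader does not spread its
reads across the copies; the balancing only shows up across several
processes.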

Thanks,
Qu
