From: Chris Murphy <lists@colorremedies.com>
To: "Sebastian Döring" <moralapostel@gmail.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs-scrub: slow scrub speed (raid5)
Date: Thu, 6 Feb 2020 13:51:06 -0700
Message-ID: <CAJCQCtTgK08eY3j4VYC=htY5bYj6cu9w3_58nzGo4BoWCQL7uQ@mail.gmail.com>
In-Reply-To: <CADkZQan+F47nHo49RRhWLi2DfWeJLrhCYvw4=Zw_W7gFedneDw@mail.gmail.com>

On Thu, Feb 6, 2020 at 10:33 AM Sebastian Döring <moralapostel@gmail.com> wrote:
>
> Hi everyone,
>
> when I run a scrub on my 5 disk raid5 array (data: raid5, metadata:
> raid6) I notice very slow scrubbing speed: max. 5MB/s per device,
> about 23-24 MB/s in sum (according to btrfs scrub status).

raid56 is not recommended for metadata. With raid5 data, it's
recommended to use raid1 metadata. It's possible to convert from raid6
to raid1 metadata, but you'll need to use the -f flag because the
conversion reduces redundancy.
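
A minimal sketch of that conversion, assuming the volume is mounted at
/mountpoint (a placeholder) and that you want to review the command before
running it as root. The run and convert_metadata helpers are hypothetical
names used only for illustration. With DRY_RUN left at its default the
script only prints the command.

```shell
#!/bin/sh
# Sketch: convert metadata from raid6 to raid1 on a mounted Btrfs volume.
# MNT is a placeholder mount point. DRY_RUN=1 (the default here) only
# prints the command; set DRY_RUN=0 to actually execute it as root.
MNT="${MNT:-/mountpoint}"

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

convert_metadata() {
    # $1 = target metadata profile, $2 = mount point
    # -f is needed because going from raid6 to raid1 reduces redundancy.
    run btrfs balance start -mconvert="$1" -f "$2"
}

convert_metadata raid1 "$MNT"
```

The same helper would take raid1c3 or raid1c4 as the profile on kernels
that support them.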

If you can consistently depend on kernel 5.5+ you can use raid1c3 or
raid1c4 for metadata. But even though the file system itself could then
survive a two or three device failure, most of your data won't
survive. It would only allow getting some fraction of the files smaller
than 64KiB (raid5 strip size) off the volume.

I'm not sure this accounts for the slow scrub though. It could be some
combination of heavy block group fragmentation, i.e. a lot of free
space in block groups, in both metadata and data block groups, and
then raid6 on top of it. But I'm not convinced. It'd be useful to see
IO and utilization during the scrub from iostat -x 5, to see if any one
of the drives is ever getting close to 100% utilization.
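
One way to watch for that, sketched under the assumption that iostat comes
from sysstat and that its extended output has a header line starting with
"Device" and a column named %util. busy_devices is a hypothetical helper
name, and the 90% threshold is arbitrary.

```shell
#!/bin/sh
# Sketch: print devices whose %util is at or above a threshold.
# Reads iostat extended output on stdin. The %util column is located by
# name from the header line, since its position differs between versions.
busy_devices() {
    awk -v limit="${1:-90}" '
        $1 ~ /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && $col + 0 >= limit { print $1, $col }
    '
}
```

During a scrub this could be fed with something like
"iostat -dx 5 | busy_devices 90". A drive that shows up in nearly every
interval is probably the bottleneck.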

>
> What's interesting is at the same time the gross read speed across the
> involved devices (according to iostat) is about ~71 MB/s in sum (14-15
> MB/s per device). Where are the remaining 47 MB/s going? I expect
> there would be some overhead because it's a raid5, but it shouldn't be
> much more than a factor of (n-1) / n , no? At the moment it appears to
> be only scrubbing 1/3 of all data that is being read and the rest is
> thrown out (and probably re-read again at a different time).
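
Taking the quoted numbers at face value, the arithmetic can be sketched as
below. scrub_efficiency is a hypothetical helper, and 23.5 MB/s is just the
midpoint of the reported 23-24 MB/s.

```shell
#!/bin/sh
# Sketch: compare the reported scrub rate with what the (n-1)/n estimate
# would predict from the gross read rate.
scrub_efficiency() {
    # $1 = number of devices, $2 = gross read MB/s, $3 = reported scrub MB/s
    awk -v n="$1" -v gross="$2" -v scrub="$3" 'BEGIN {
        expected = gross * (n - 1) / n
        printf "expected=%.1f observed_ratio=%.2f\n", expected, scrub / gross
    }'
}

scrub_efficiency 5 71 23.5
```

With five devices the estimate gives about 56.8 MB/s of scrubbed data for
71 MB/s of reads, while the reported rate is about a third of the gross
reads, which matches the "1/3" observation above.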

What do you get for
btrfs fi df /mountpoint/
btrfs fi us /mountpoint/

Is it consistently this slow or does it vary a lot?

>
> Surely this can't be right? Are iostat or possibly btrfs scrub status
> lying to me? What am I seeing here? I've never seen this problem with
> scrubbing a raid1, so maybe there's a bug in how scrub is reading data
> from raid5 data profile?

I'd say more likely it's a lack of optimization for the moderate to
high fragmentation case. Neither LVM nor mdadm raid has any idea what
the file system layout is; there's no fs metadata to take into account,
so every scrub read is a full stripe read. That means they read unused
portions of the array too, where Btrfs won't, because every Btrfs read
is deliberate. But that also means Btrfs scrub performance can be
impacted by disk contention.


> It seems to me that I could perform a much faster scrub by rsyncing
> the whole fs into /dev/null... btrfs is comparing the checksums anyway
> when reading data, no?

Yes.
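
A rough sketch of such a read-everything pass, using find and cat instead
of rsync. read_all_files is a hypothetical helper name. As far as I know,
a plain read like this only verifies the data blocks it actually reads. It
doesn't check raid5 parity the way a scrub does, so it's not a full
substitute.

```shell
#!/bin/sh
# Sketch: read every regular file under a directory and discard the data.
# Btrfs verifies data checksums on read, so a file with a bad checksum
# that can't be repaired should surface as an I/O error from cat.
read_all_files() {
    # -xdev keeps find from crossing into other mounted file systems
    find "$1" -xdev -type f -exec cat {} + > /dev/null
}
```

Usage would be something like "read_all_files /mountpoint/", then checking
dmesg for "csum failed" messages and "btrfs device stats /mountpoint/" for
non-zero error counters.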

-- 
Chris Murphy
