From: Kai Krakow <hurikhan77@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: Repair broken btrfs raid6?
Date: Fri, 13 Feb 2015 02:12:15 +0100 [thread overview]
Message-ID: <fkvvqb-ecc.ln1@hurikhan77.spdns.de> (raw)
In-Reply-To: 486ed2b2-3c80-4856-8701-bcd71a212b18@aei.ca
Ed Tomlinson <edt@aei.ca> wrote:
> On Tuesday, February 10, 2015 2:17:43 AM EST, Kai Krakow wrote:
>> Tobias Holst <tobby@tobby.eu> wrote:
>>
>>> and "btrfs scrub status /[device]" gives me the following output:
>>>> "scrub status for [UUID]
>>>> scrub started at Mon Feb 9 18:16:38 2015 and was aborted after 2008
>>>> seconds total bytes scrubbed: 113.04GiB with 0 errors"
>>
>> That doesn't look right to me:
>>
>> Why should a scrub on a six-drive btrfs array that is probably multiple
>> terabytes big (as you state a restore from backup would take days) take
>> only ~2000 seconds, and scrub only ~120 GiB worth of data? Either your 6
>> devices are really small (then why RAID-6), or your data is very sparse
>> (then why does it take so long), or scrub aborts prematurely and never
>> checks the complete devices (I guess this is it).
>>
>> And that's what it actually says: "aborted after 2008 seconds". I'd
>> expect "finished after XXXX seconds" if I remember my scrub runs
>> correctly (which I don't run regularly at the moment because they take
>> long and IO performance suffers while they run).
>
> IO performance does suffer during a scrub. I use the following:
>
> ionice -c 3 btrfs scrub start -Bd -n 19 /<target>
That doesn't work with the deadline scheduler... Although, when my btrfs was
still fresh (and already held a lot of data), I hardly noticed a running
scrub in the background. But ever since I ran a balance, IO performance has
been poor across the board.
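As a side note on why ionice may do nothing here: the idle class (-c 3) is
only honored by schedulers that implement I/O priorities (CFQ/BFQ); deadline
and noop ignore it. A quick sketch to check what each device actually uses
before relying on ionice (the mountpoint in the comment is a placeholder,
not from this thread):

```shell
#!/bin/sh
# Print the active I/O scheduler for every block device.
# The entry shown in [brackets] is the one in effect.
for q in /sys/block/*/queue/scheduler; do
    [ -r "$q" ] || continue
    dev=${q#/sys/block/}
    printf '%s: %s\n' "${dev%%/*}" "$(cat "$q")"
done

# Only where CFQ/BFQ is active does something like Ed's invocation
# actually deprioritize the scrub:
#   ionice -c 3 btrfs scrub start -Bd -n 19 /mnt/data
```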
Off-topic but maybe interesting in this regard:
Meanwhile, I switched away from deadline (which served me better than CFQ at
the time) and am now running the BFQ scheduler. It works really nicely,
though booting is slower and application startup is a little less snappy.
But it copes much better with background IO since the "balance incident".
I went one step further and deployed bcache into the setup, and everything
is really snappy now. So I'm toying with the idea of re-enabling regular
scrub runs. But I still need to figure out whether that would destroy the
bcache hit ratio and fill the cache with irrelevant data.
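On the cache-pollution worry: bcache by default bypasses the cache for
sequential streams beyond a per-backing-device cutoff, and scrub reads are
largely sequential, so the hit-ratio damage may be smaller than feared. A
sketch, assuming bcache0 is the cached device (adjust to your setup):

```shell
#!/bin/sh
# bcache sends sequential streams larger than sequential_cutoff
# (default ~4 MB) straight to the backing device, skipping the cache.
B=/sys/block/bcache0/bcache
if [ -d "$B" ]; then
    cat "$B/sequential_cutoff"          # current cutoff
    # Lower it so even shorter streams bypass the cache during a scrub:
    # echo 1M > "$B/sequential_cutoff"
fi
```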
And thinking further about it: I'm not sure btrfs RAID protection and scrub
make much sense at all with bcache in between... Due to the nature of
bcache, errors may slip through undetected until the bcache LRU forces
cached good copies out of the cache. If that data isn't dirty, it is never
written back to the backing device on eviction. In that case there are three
possible outcomes: the associated blocks on HDD are in perfect shape, one
copy is rotten and one is good, or both are rotten. In the last case, btrfs
can no longer help me there... Scrub may not have caught those because the
good copies sat in bcache until shortly before. I wonder if bcache should
have a policy for writing back even non-dirty blocks when they are evicted
from the cache...
> The combo of -n19 and ionice makes it workable here.
Yeah, that should work here too, now that I'm using BFQ. But then again, I'm
not sure: the bcache frontend runs on an SSD whose block device uses the
deadline scheduler. My bcache backends run on HDDs with the BFQ scheduler.
The virtual bcache partitions sitting in between magically set themselves to
the noop scheduler (or maybe it even shows "none", I'm not sure) - which is
intended, I guess.
So kernel access probably goes like this:
  ---> bcache0-2 [noop] <---> phys. SSD [deadline] **
            |
            `---> phys. HDD 1-3 [bfq], mraid-1, draid-0
So, I guess part of the block accesses pass through two schedulers when both
devices are needed (frontend and backend), with bcache acting as a huge
block-sorting scheduler itself (which is where its performance comes from).
But for scrub, the deadline scheduler may become the dominating one, which
brings me back to the situation I had at the start while running scrub.
** --> maybe I should put noop here, too
Does my thought experiment make sense?
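Acting on the footnote above would be a one-liner; a sketch, assuming the
SSD behind bcache is sda (substitute your actual cache device):

```shell
#!/bin/sh
# Switch the SSD under bcache to noop ("none" on blk-mq kernels),
# since bcache already does its own request sorting in front of it.
f=/sys/block/sda/queue/scheduler
if [ -w "$f" ]; then
    if grep -qw noop "$f"; then
        echo noop > "$f"
    elif grep -qw none "$f"; then
        echo none > "$f"
    fi
    cat "$f"                # verify which scheduler is now active
fi
```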
--
Replies to list only preferred.