From: Chris Murphy <lists@colorremedies.com>
To: Marat Khalili <mkh@rqc.ru>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Exactly what is wrong with RAID5/6
Date: Wed, 21 Jun 2017 12:43:26 -0600
Message-ID: <CAJCQCtSJNi3sGJci8xbjgoMOVjc92howEZ5PfJBThWAbm0AVMg@mail.gmail.com>
In-Reply-To: <a88f7896-5693-ad90-383e-a24c61110e8a@rqc.ru>

On Wed, Jun 21, 2017 at 12:51 AM, Marat Khalili <mkh@rqc.ru> wrote:
> On 21/06/17 06:48, Chris Murphy wrote:
>>
>> Another possibility is to ensure a new write is written to a new *not*
>> full stripe, i.e. dynamic stripe size. So if the modification is a 50K
>> file on a 4 disk raid5; instead of writing 3 64K data strips + 1 64K
>> parity strip (a full stripe write); write out 1 64K data strip + 1 64K
>> parity strip. In effect, a 4 disk raid5 would quickly get not just 3
>> data + 1 parity strip Btrfs block groups; but 1 data + 1 parity, and 2
>> data + 1 parity chunks, and direct those writes to the proper chunk
>> based on size. Anyway that's beyond my ability to assess how much
>> allocator work that is. Balance I'd expect to rewrite everything to
>> max data strips possible; the optimization would only apply to normal
>> operation COW.

> This will make some filesystems mostly RAID1, negating all space savings of
> RAID5, won't it?

No. It'd only apply to partial stripe writes, typically small files.
But small-file, metadata-centric workloads suck for raid5 anyway, and
should use raid1. So making the implementation behave more like raid1
than raid5 for the RMW (read-modify-write) case is, I think, still
better than Btrfs raid56 RMW writes being in effect no-COW.
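
To put rough numbers on the dynamic stripe size idea quoted above, here
is a minimal sketch. It is not Btrfs allocator code; the function names
and the single-parity assumption are mine, and the 64K strip size comes
from the example in the quoted text.

```python
STRIP_SIZE = 64 * 1024  # 64K strips, as in the quoted example


def stripe_layout(write_bytes, num_disks, strip_size=STRIP_SIZE):
    """Pick (data_strips, parity_strips) for one single-parity write.

    Hypothetical helper: use only as many data strips as the write
    needs, capped at the full-stripe width (num_disks - 1).
    """
    if write_bytes <= 0:
        raise ValueError("write_bytes must be positive")
    if num_disks < 2:
        raise ValueError("need at least 1 data + 1 parity device")
    needed = (write_bytes + strip_size - 1) // strip_size  # ceiling division
    return min(needed, num_disks - 1), 1


def space_efficiency(data_strips, parity_strips):
    """Fraction of the written strips that hold data rather than parity."""
    return data_strips / (data_strips + parity_strips)


# The 50K file on a 4 disk raid5 from the quoted example:
small = stripe_layout(50 * 1024, 4)   # 1 data + 1 parity, raid1-like
full = stripe_layout(192 * 1024, 4)   # 3 data + 1 parity, a full stripe
```

Under these assumptions only the narrow stripes pay the raid1-like 50%
overhead; anything that fills a full stripe keeps the usual 75% on 4
disks.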


> Isn't it easier to recalculate the parity block using the previous state of
> two rewritten strips, parity and data? I don't understand all performance
> implications, but it might scale better with number of devices.

The problem is atomicity. Either the data strip or the parity strip is
overwritten first, and until the other is committed, the file system
is not merely inconsistent, it's basically lying: there's no way to
know for sure after the fact whether the data or the parity was properly
written. The metadata is inconsistent too, because it can only
describe the unmodified state and the successfully modified state,
whereas a 3rd state, "partially modified", is possible, and there is no
way to really fix it.
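
A toy illustration of that 3rd state, assuming plain XOR parity over
3 data strips + 1 parity strip. The strip contents and helper name are
made up for the example, and nothing here models the real Btrfs code
paths.

```python
def xor_strips(*strips):
    """XOR equal-length byte strings together (single-parity math)."""
    out = bytearray(len(strips[0]))
    for strip in strips:
        for i, byte in enumerate(strip):
            out[i] ^= byte
    return bytes(out)


# Committed state: 3 data strips and their parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_strips(d0, d1, d2)

# RMW update of d1: new parity is derived from old parity, old data, new data.
new_d1 = b"XXXX"
new_parity = xor_strips(parity, d1, new_d1)
rmw_matches_full_recompute = new_parity == xor_strips(d0, new_d1, d2)

# Crash after new_d1 hit the disk but before new_parity did:
# the stripe now holds new data with stale parity.
torn_stripe_consistent = xor_strips(d0, new_d1, d2) == parity

# If the device holding d0 is lost now, rebuilding from the torn stripe
# returns garbage for a strip that was never touched by the write.
rebuilt_d0 = xor_strips(new_d1, d2, parity)
rebuild_is_correct = rebuilt_d0 == d0
```

The RMW arithmetic itself is fine (rmw_matches_full_recompute is True);
the trouble is the window where only one of the two overwrites has
landed, where both torn_stripe_consistent and rebuild_is_correct come
out False.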


-- 
Chris Murphy
