From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: David Sterba <dsterba@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 0/4] 3- and 4- copy RAID1
Date: Fri, 20 Jul 2018 18:35:51 +0200
Message-ID: <20180720163551.GF26141@twin.jikos.cz>
In-Reply-To: <88531904-288b-f73e-1157-560845f8e72d@gmx.com>

On Thu, Jul 19, 2018 at 03:27:17PM +0800, Qu Wenruo wrote:
> On 2018-07-14 02:46, David Sterba wrote:
> > Hi,
> > 
> > I have some goodies that go into the RAID56 problem. Although they don't
> > implement all the remaining features, they can be useful independently.
> > 
> > This time my hackweek project
> > 
> > https://hackweek.suse.com/17/projects/do-something-about-btrfs-and-raid56
> > 
> > aimed to implement the fix for the write hole problem but I spent more
> > time with analysis and design of the solution and don't have a working
> > prototype for that yet.
> > 
> > This patchset brings a feature that will be used by the raid56 log. The
> > log has to be on the same redundancy level, and thus we need a 3-copy
> > replication for raid6. As it was easy to extend to higher replication,
> > I've added a 4-copy replication, which would allow triple copy raid (that
> > does not have a standardized name).
> 
> So will this special level be used only for RAID56 for now?
> Or will it also be usable for metadata, just like the current RAID1?

It's a new profile usable in the same way as raid1, i.e. for data or
metadata. The patch that adds support to btrfs-progs has an mkfs
example.

The raid56 will use that to store the log, essentially data forcibly
stored on the n-copy raid1 chunk and used only for logging.
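
A rough way to see why raid6 calls for a 3-copy log: an n-copy mirror
survives n-1 device losses, so the log chunk needs at least as many spare
copies as the parity profile tolerates losses. The sketch below is
illustrative Python, not kernel code. raid1c3 and raid1c4 are the profile
names used by this patchset; the table and both helper functions are
hypothetical.

```python
# Illustrative only: copy counts for the mirrored profiles in this patchset.
COPIES = {"raid1": 2, "raid1c3": 3, "raid1c4": 4}


def tolerated_failures(profile):
    """An n-copy mirror still has one good copy after n-1 device losses."""
    return COPIES[profile] - 1


def log_profile_for(failures_to_survive):
    """Pick the smallest mirror profile that survives the given number of
    device losses (assumed: 1 for raid5, 2 for raid6)."""
    for name, copies in sorted(COPIES.items(), key=lambda kv: kv[1]):
        if copies - 1 >= failures_to_survive:
            return name
    raise ValueError("no mirror profile with enough copies")
```

Under these assumptions a log for raid6 (two tolerated losses) lands on
raid1c3, and the 4-copy profile covers a layout that tolerates three.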

> If the latter, the metadata scrub problem will need to be considered more.
> 
> With more-copy RAID1, it becomes more likely that one or two devices go
> missing and the filesystem is then scrubbed.
> For metadata scrub, the inline csum can't ensure that a copy is the latest one.
> 
> So for such a RAID1 scrub, we need to read out all copies and compare
> their generations to find the correct copy.
> At least from the changeset, it doesn't look like this is addressed yet.

Nothing like this is implemented in the patches, but I don't understand
how this differs from the current raid1 with one missing device. Sure, we
can't have 2 missing devices there, so the existing copy is automatically
considered correct and up to date.

There are more corner-case recovery scenarios where 3 copies could be
slightly out of date due to device loss and a scrub attempt, so yes,
this would need to be addressed.
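
The comparison Qu describes could look roughly like this. It is a minimal
sketch in Python, assuming every copy has already been read and its
checksum verified. The Copy type and both functions are hypothetical, and
treating the highest generation among checksum-valid copies as the correct
one is a simplifying assumption, not what the patches do.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Copy:
    devid: int
    csum_ok: bool     # did the inline checksum verify?
    generation: int   # generation stored in the metadata block header


def pick_good_copy(copies: Sequence[Copy]) -> Optional[Copy]:
    """Return the checksum-valid copy with the highest generation, if any."""
    valid = [c for c in copies if c.csum_ok]
    if not valid:
        return None
    return max(valid, key=lambda c: c.generation)


def stale_copies(copies: Sequence[Copy]) -> List[Copy]:
    """Copies that would need rewriting: bad checksum or older generation."""
    best = pick_good_copy(copies)
    if best is None:
        return []
    return [c for c in copies
            if not c.csum_ok or c.generation < best.generation]
```

A copy with a valid checksum but an older generation is exactly the case
that a checksum-only scrub would miss.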

> And this also reminds me that the current scrub is not as flexible as
> balance. I would really like to be able to filter block groups to scrub
> just like balance does, and to scrub on a per-block-group basis rather
> than a per-devid basis.
> That is to say, for a block group scrub, we don't really care which
> device we're scrubbing, we just need to ensure that all devices in this
> block group are storing correct data.

Right, a subset of the balance filters would be nice.
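
To make the idea concrete, here is a sketch of block group selection for a
scrub, modeled loosely on how balance filters narrow down block groups.
Everything in it is hypothetical: no such scrub interface exists in these
patches, and the BlockGroup type, its fields and the filter arguments are
invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class BlockGroup:
    start: int                # logical start offset
    kind: str                 # "data", "metadata" or "system"
    profile: str              # e.g. "raid1", "raid1c3"
    devids: FrozenSet[int]    # devices holding stripes of this block group


def select_block_groups(bgs: Iterable[BlockGroup],
                        kind: Optional[str] = None,
                        profile: Optional[str] = None,
                        devid: Optional[int] = None) -> List[BlockGroup]:
    """Keep block groups matching every filter that is set (None = any)."""
    selected = []
    for bg in bgs:
        if kind is not None and bg.kind != kind:
            continue
        if profile is not None and bg.profile != profile:
            continue
        if devid is not None and devid not in bg.devids:
            continue
        selected.append(bg)
    return selected
```

A scrub driven this way would walk the selected block groups and verify
every device in each one, instead of walking one devid at a time.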
