From: antlists <antlists@youngman.org.uk>
To: Brian Allen Vanderburg II <brianvanderburg2@aim.com>,
	linux-raid@vger.kernel.org
Subject: Re: Linux raid-like idea
Date: Fri, 11 Sep 2020 20:16:28 +0100
Message-ID: <ddd9b5b9-88e6-e730-29f4-30dfafd3a736@youngman.org.uk>
In-Reply-To: <274cb804-9cf1-f56c-9ee4-56463f052c09@aim.com>

On 11/09/2020 16:14, Brian Allen Vanderburg II wrote:
> 
> On 9/5/20 6:42 PM, Wols Lists wrote:
>> I doubt I understand what you're getting at, but this is sounding a bit
>> like raid-4, if you have data disk(s) and a separate parity disk. People
>> don't use raid 4 because it has a nasty performance hit.
> 
> Yes, it is a bit like raid-4, since the data and parity disks are
> separated.  In fact the idea could better be called a parity-backed
> collection of independently accessed disks.  While you would not get
> the performance advantage of reads/writes going across multiple disks,
> the idea is primarily targeted at read-heavy applications, so in
> typical use read performance should be no worse than reading directly
> from a single un-raided disk, except in the case of a disk failure,
> where the parity is used to reconstruct a block read from the missing
> disk.  Writes would have more overhead, since they would also have to
> calculate/update parity.

Ummm...

So let me word this differently. You're looking at pairing disks up,
with a filesystem on each pair (data/parity), and then using mergerfs
on top. Compared with simple raid, that looks like a lose-lose scenario
to me.

A raid-1 will read faster than a single disk, because it optimises
which disk to read from, and it will write faster too: in a two-disk
data/parity pair the parity calculation is a no-op (the parity is just
a copy of the data), yet it might not be optimised out, so you still
pay for computing it.
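
To make that concrete, here's a rough sketch of my own (nothing to do
with md's actual code, and the names are invented) of plain xor parity.
Run it with one data disk plus one parity disk and the "parity" comes
out byte-for-byte identical to the data - a copy computed the expensive
way:

#include <stdio.h>
#include <string.h>

#define BLOCK 8

/* parity = xor of all data blocks; with ndisks == 1 this is a copy */
static void xor_parity(unsigned char *parity,
                       unsigned char data[][BLOCK], int ndisks)
{
        memset(parity, 0, BLOCK);
        for (int d = 0; d < ndisks; d++)
                for (int i = 0; i < BLOCK; i++)
                        parity[i] ^= data[d][i];
}

int main(void)
{
        unsigned char one_disk[1][BLOCK] = { "block A" };
        unsigned char parity[BLOCK];

        xor_parity(parity, one_disk, 1);
        printf("parity %s the data\n",
               memcmp(parity, one_disk[0], BLOCK) ? "differs from"
                                                  : "equals");
        return 0;
}

(The same xor is what rebuilds a missing block in the general case:
xor the parity with the surviving data blocks.)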
> 
>> Personally, I'm looking at something like raid-61 as a project. That
>> would let you survive four disk failures ...
> 
> Interesting.  I'll check that out more later, but so far it seems
> there is a lot of overhead (10 1TB disks would only give 3TB of data:
> two 5-disk arrays mirrored, then raid6 on each, leaving 3 disks' worth
> of data).  My current solution, since it's basically just storing bulk
> data, is mergerfs and snapraid, and from snapraid's documentation, 10
> 1TB disks would provide 6TB if using 4 for parity.  However, its
> parity calculations seem to be more complex as well.

Actually no. Don't forget that, as far as linux is concerned, raid-10 
and raid-1+0 are two *completely* *different* things. You can raid-10 
three disks, but you need four for raid-1+0.
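
If it helps picture the difference, here's a toy sketch of my own (a
simplification, not md's real layout code) of the raid10 "near 2"
arrangement: every chunk gets two copies laid out round-robin across
however many disks you have, which is why three disks work for raid-10
while raid-1+0 needs whole mirror pairs:

#include <stdio.h>

int main(void)
{
        int ndisks = 3, copies = 2, chunks = 6;

        /* copy j of chunk c lands in slot c*copies + j, filled across
         * the disks row by row - an odd disk count is no problem */
        for (int c = 0; c < chunks; c++)
                for (int j = 0; j < copies; j++) {
                        int slot = c * copies + j;
                        printf("chunk %d copy %d -> disk %d stripe %d\n",
                               c, j, slot % ndisks, slot / ndisks);
                }
        return 0;
}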

You've mis-calculated raid-6+1 - that gives you 6TB for 10 disks (two 
3TB arrays). I think I would probably get more with raid-61, but every 
time I think about it my brain goes "whoa!!!", and I'll need to start 
concentrating on it to work out exactly what's going on.
> 
>> Also, one of the biggest problems when a disk fails and you have to
>> replace it is that, at present, with nearly all raid levels even if you
>> have lots of disks, rebuilding a failed disk is pretty much guaranteed
>> to hammer just one or two surviving disks, pushing them into failure if
>> they're at all dodgy. I'm also looking at finding some randomisation
>> algorithm that will smear the blocks out across all the disks, so that
>> rebuilding one disk spreads the load evenly across all disks.
> 
> This is actually the main purpose of the idea.  Because a traditional
> raid5/6 maps the data of multiple disks into a single logical block
> device, the structures of any filesystems and their files are
> scattered across all the disks, so losing one disk more than the array
> can tolerate makes the entire filesystem(s) and all files virtually
> unrecoverable.

But raid 5/6 give you much more usable space than a mirror. What I'm
having trouble getting to grips with in your idea is how it is an
improvement on a mirror. It looks to me like you're proposing a 2-disk
raid-4 as the underlying storage medium, with mergerfs on top. Which is
effectively giving you a poorly-performing mirror. A crappy raid-1+0,
basically.
> 
> By keeping each data disk separate and exposed as its own block device
> with some parity backup, each disk contains an entire filesystem on
> its own, to be used however the user decides.  The loss of one of the
> disks during a rebuild would no longer cause full data loss, only the
> loss of the filesystem(s) on that disk.  The data on the other disks
> would still be intact and readable, although, depending on usage,
> files may be missing if a union/merge filesystem was used on top of
> them.  A rebuild would still have the same issue: it would have to
> read all the remaining disks to rebuild the lost disk.  I'm not really
> sure of any way around that, since parity would essentially be
> calculated as the xor of the same block on all the data disks.
> 
And as I understand your setup, you also suffer from the same problem
as raid-10 - lose one disk and you're fine, lose two and it's Russian
roulette whether you can recover your data. raid-6 is *any* two and
you're fine, raid-61 would be *any* four and you're fine.
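
A quick back-of-the-envelope, assuming the raid-10 behaves as 5 mirror
pairs (disks 2k and 2k+1 partnered - my simplification): it counts how
many of the 45 possible two-disk failures take data with them.  raid-6
over the same 10 disks would survive all 45:

#include <stdio.h>

int main(void)
{
        int ndisks = 10, fatal = 0, total = 0;

        for (int a = 0; a < ndisks; a++)
                for (int b = a + 1; b < ndisks; b++) {
                        total++;
                        if (a / 2 == b / 2)  /* both halves of one pair */
                                fatal++;
                }
        /* prints: 5 of 45 two-disk failures are fatal (1 in 9) */
        printf("%d of %d two-disk failures are fatal (1 in %d)\n",
               fatal, total, total / fatal);
        return 0;
}
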
>>
>> At the end of the day, if you think what you're doing is a good idea,
>> scratch that itch, bounce stuff off here (and the kernel newbies list if
>> you're not a kernel programmer yet), and see how it goes. Personally, I
>> don't think it'll fly, but I'm sure people here would say the same about
>> some of my pet ideas too. Give it a go!
>>
Cheers,
Wol
