From: Brian Allen Vanderburg II <brianvanderburg2@aim.com>
To: Wols Lists <antlists@youngman.org.uk>, linux-raid@vger.kernel.org
Subject: Re: Linux raid-like idea
Date: Fri, 11 Sep 2020 11:14:37 -0400
Message-ID: <274cb804-9cf1-f56c-9ee4-56463f052c09@aim.com>
In-Reply-To: <5F54146F.40808@youngman.org.uk>


On 9/5/20 6:42 PM, Wols Lists wrote:
> I doubt I understand what you're getting at, but this is sounding a bit
> like raid-4, if you have data disk(s) and a separate parity disk. People
> don't use raid 4 because it has a nasty performance hit.

Yes, it is a bit like raid-4, since the data and parity disks are
separated.  In fact, the idea could be better described as a
parity-backed collection of independently accessed disks.  While you
would not get the performance increase of reads/writes being spread
across multiple disks, the idea is primarily targeted at read-heavy
applications, so in typical use, read performance should be no worse
than reading directly from a single un-raided disk, except in the case
of a disk failure, where the parity is used to reconstruct a block read
from the missing disk.  Writes would have more overhead, since they
would also have to calculate/update the parity.
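
To make that write overhead concrete, here is a minimal sketch in C of
the read-modify-write that a single XOR parity scheme implies; the
function name and signature are just made up for illustration, not
taken from any real driver:

    /* Overwriting one block on a data disk costs two reads (old data,
     * old parity) and two writes (new data, new parity): XOR the old
     * data out of the parity and the new data in. */
    #include <stddef.h>
    #include <stdint.h>

    void parity_update(uint8_t *parity, const uint8_t *old_data,
                       const uint8_t *new_data, size_t blocksize)
    {
        for (size_t i = 0; i < blocksize; i++)
            parity[i] ^= old_data[i] ^ new_data[i];
    }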

> Personally, I'm looking at something like raid-61 as a project. That
> would let you survive four disk failures ...

Interesting.  I'll look into that more later, but from what I've seen
so far there is a lot of overhead: 10 1TB disks would yield only 3TB of
data (two mirrored 5-disk arrays with raid6 on each, leaving 3 disks'
worth of data).  My current solution, since it's basically just storing
bulk data, is mergerfs and snapraid; according to the snapraid
documentation, 10 1TB disks would provide 6TB when using 4 for parity.
However, its parity calculations seem to be more complex as well.
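
Spelling out that arithmetic (my reading of both layouts, so treat the
numbers as an assumption):

    raid-61:  2 mirrored 5-disk raid6 legs; each leg stores 5 - 2 = 3 TB,
              and mirroring keeps only one copy        -> 3 TB usable
    snapraid: 10 disks with 4 dedicated parity disks   -> 10 - 4 = 6 TB usable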

> Also, one of the biggest problems when a disk fails and you have to
> replace it is that, at present, with nearly all raid levels even if you
> have lots of disks, rebuilding a failed disk is pretty much guaranteed
> to hammer just one or two surviving disks, pushing them into failure if
> they're at all dodgy. I'm also looking at finding some randomisation
> algorithm that will smear the blocks out across all the disks, so that
> rebuilding one disk spreads the load evenly across all disks.

This is actually the main purpose of the idea.  Because a traditional
raid5/6 maps the data from multiple disks onto a single logical block
device, the structures of any filesystems and their files are scattered
across all the disks, so losing even one disk more than the redundancy
allows makes the entire filesystem(s) and all files virtually
unrecoverable.

By keeping each data disk separate and exposed as its own block device,
with some parity backup, each disk contains one or more entire
filesystems on its own, to be used however the user decides.  The loss
of one of the disks during a rebuild would no longer cause full data
loss, but only the loss of the filesystem(s) on that disk.  The data on
the other disks would still be intact and readable, although, depending
on usage, files may appear to be missing if a union/merge filesystem
was layered on top of them.  A rebuild would still have the same issue:
it would have to read all the remaining disks to rebuild the lost one.
I'm not really sure of any way around that, since the parity would
essentially be calculated as the XOR of the same block on all the data
disks, as in the sketch below.
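
For reference, a minimal sketch of that reconstruction in C (again with
invented names, just to illustrate the point): the missing disk's block
is the XOR of the parity block and the corresponding block from every
surviving data disk, which is why the whole array must be read.

    /* Rebuild one block of the lost disk: start from the parity block
     * and XOR in the same block from each surviving data disk; what
     * remains is the missing disk's data. */
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    void rebuild_block(uint8_t *out, const uint8_t *parity,
                       const uint8_t *const *surviving, size_t ndisks,
                       size_t blocksize)
    {
        memcpy(out, parity, blocksize);
        for (size_t d = 0; d < ndisks; d++)
            for (size_t i = 0; i < blocksize; i++)
                out[i] ^= surviving[d][i];
    }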

>
> At the end of the day, if you think what you're doing is a good idea,
> scratch that itch, bounce stuff off here (and the kernel newbies list if
> you're not a kernel programmer yet), and see how it goes. Personally, I
> don't think it'll fly, but I'm sure people here would say the same about
> some of my pet ideas too. Give it a go!
>
> Cheers,
> Wol

