Subject: Re: Linux raid-like idea
From: antlists
To: Brian Allen Vanderburg II, linux-raid@vger.kernel.org
Date: Fri, 11 Sep 2020 20:16:28 +0100

On 11/09/2020 16:14, Brian Allen Vanderburg II wrote:
>
> On 9/5/20 6:42 PM, Wols Lists wrote:
>> I doubt I understand what you're getting at, but this is sounding a bit
>> like raid-4, if you have data disk(s) and a separate parity disk. People
>> don't use raid 4 because it has a nasty performance hit.
>
> Yes, it is a bit like raid-4, since the data and parity disks are
> separated. In fact the idea could better be called a parity-backed
> collection of independently accessed disks. While you would not get the
> advantage/performance increase of reads/writes going across multiple
> disks, the idea is primarily targeted at read-heavy applications, so in
> typical use read performance should be no worse than reading directly
> from a single un-raided disk, except in the case of a disk failure,
> where the parity is being used to calculate a block read from a missing
> disk. Writes would have more overhead, since they would also have to
> calculate/update parity.

Ummm... So let me word this differently. You're looking at pairing disks
up, with a filesystem on each pair (data/parity), and then using mergefs
on top.

Compared with simple raid, that looks like a lose-lose scenario to me. A
raid-1 will read faster than a single disk, because it optimises which
disk to read from, and it will write faster too, because your typical
parity calculation for a two-disk scenario is a no-op - one which might
not get optimised out.
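To make that concrete: block parity is just a byte-wise XOR across the
data disks, and with a single data disk the XOR degenerates into a
straight copy. A rough sketch in C (buffer layout and names invented for
illustration, nothing to do with the real md code):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define BLOCK_SIZE 4096

    /* parity block = XOR of the same block from every data disk */
    static void compute_parity(uint8_t *parity,
                               uint8_t *const data[], size_t ndisks)
    {
        memset(parity, 0, BLOCK_SIZE);
        for (size_t d = 0; d < ndisks; d++)
            for (size_t i = 0; i < BLOCK_SIZE; i++)
                parity[i] ^= data[d][i];
    }
    /* With ndisks == 1 this just copies the data block - the "parity"
     * disk is a mirror with extra steps. */

So you'd be paying the raid-4 write overhead to end up with what a
mirror gives you for free.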
>> Personally, I'm looking at something like raid-61 as a project. That
>> would let you survive four disk failures ...
>
> Interesting. I'll check that out more later, but from what it seems so
> far there is a lot of overhead (10 1TB disks would only be 3TB of data:
> 2x 5-disk arrays mirrored, then raid-6 on each, leaving 3 disks' worth
> of data). My current solution, since it's basically just storing bulk
> data, is mergerfs and snapraid, and from the snapraid documentation, 10
> 1TB disks would provide 6TB if using 4 for parity. However, its parity
> calculations seem to be more complex as well.

Actually no. Don't forget that, as far as linux is concerned, raid-10
and raid-1+0 are two *completely* *different* things. You can raid-10
three disks, but you need four for raid-1+0.

You've mis-calculated raid-6+1 - that gives you 6TB for 10 disks (two
3TB arrays). I think I would probably get more with raid-61, but every
time I think about it my brain goes "whoa!!!", and I'll need to start
concentrating on it to work out exactly what's going on.

>
>> Also, one of the biggest problems when a disk fails and you have to
>> replace it is that, at present, with nearly all raid levels even if
>> you have lots of disks, rebuilding a failed disk is pretty much
>> guaranteed to hammer just one or two surviving disks, pushing them
>> into failure if they're at all dodgy. I'm also looking at finding some
>> randomisation algorithm that will smear the blocks out across all the
>> disks, so that rebuilding one disk spreads the load evenly across all
>> disks.
>
> This is actually the main purpose of the idea. Due to the data on the
> disks in a traditional raid-5/6 being mapped from multiple disks to a
> single logical block device, and so the structures of any filesystems
> and their files being scattered across all the disks, losing one disk
> more than the array can tolerate would make the entire filesystem(s)
> and all files virtually unrecoverable.

But raid-5/6 give you much more usable space than a mirror. What I'm
having trouble getting to grips with in your idea is how it is an
improvement on a mirror. It looks to me like you're proposing a 2-disk
raid-4 as the underlying storage medium, with mergefs on top. Which is
effectively giving you a poorly-performing mirror. A crappy raid-1+0,
basically.

>
> By keeping each data disk separate and exposed as its own block device
> with some parity backup, each disk contains an entire filesystem(s) on
> its own, to be used however a user decides. The loss of one of the
> disks during a rebuild would no longer cause full data loss, but only
> loss of the filesystem(s) on that disk. The data on the other disks
> would still be intact and readable, although depending on the user's
> usage it may be missing files if they used a union/merge filesystem on
> top of them. A rebuild would still have the same issues: it would have
> to read all the remaining disks to rebuild the lost disk. I'm not
> really sure of any way around that, since parity would essentially be
> calculated as the xor of the same block on all the data disks.
>

And as I understand your setup, you also suffer from the same problem
as raid-10 - lose one disk and you're fine, lose two and it's russian
roulette whether you can recover your data. raid-6 is *any* two and
you're fine, raid-61 would be *any* four and you're fine.
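On the rebuild cost you mention: reconstruction really is just the same
XOR run the other way, which is why there's no way round reading every
surviving member in full. Another rough sketch (again, names and layout
invented for illustration):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define BLOCK_SIZE 4096

    /* Rebuild one block of the failed data disk: XOR the parity block
     * with the same block from every surviving data disk. */
    static void rebuild_block(uint8_t *lost, const uint8_t *parity,
                              uint8_t *const surviving[],
                              size_t nsurviving)
    {
        memcpy(lost, parity, BLOCK_SIZE);
        for (size_t d = 0; d < nsurviving; d++)
            for (size_t i = 0; i < BLOCK_SIZE; i++)
                lost[i] ^= surviving[d][i];
    }
    /* Every surviving disk gets read end to end, so the rebuild load
     * lands on all of them no matter how the filesystems are arranged
     * on top. */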
>>
>> At the end of the day, if you think what you're doing is a good idea,
>> scratch that itch, bounce stuff off here (and the kernel newbies list
>> if you're not a kernel programmer yet), and see how it goes.
>> Personally, I don't think it'll fly, but I'm sure people here would
>> say the same about some of my pet ideas too. Give it a go!
>>
Cheers,
Wol