* Recommended filesystem for RAID 6
@ 2020-08-11  4:42 George Rapp
  2020-08-11 15:06 ` Roy Sigurd Karlsbakk
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: George Rapp @ 2020-08-11  4:42 UTC (permalink / raw)
  To: Linux-RAID

Hello Linux RAID community -

I've been running an assortment of software RAID arrays for a while
now (my oldest Creation Time according to 'mdadm --detail' is April
2011) but have been meaning to consolidate my five active arrays into
something easier to manage. This weekend, I finally found enough cheap
2TB disks to get started. I'm planning on creating a RAID 6 array due
to the age and consumer-grade quality of my 16 2TB disks.

Use case is long-term storage of many small files and a few large ones
(family photos and videos, backups of other systems, working copies of
photo, audio, and video edits, etc.). Current usable space is about
10TB but my end state vision is probably upwards of 20TB. I'll
probably consign the slowest working disks in the server to an archive
filesystem, either RAID 1 or RAID 5, for stuff I care less about and
backups; the archive part can be ignored for the purposes of this
exercise.

My question is: what filesystem type would be best practice for my use
case and size requirements on the big array? (I have reviewed
https://raid.wiki.kernel.org/index.php/RAID_and_filesystems, but am
looking for practitioners' recommendations.)  I've run ext4
exclusively on my arrays to date, but have been reading up on xfs; is
there another filesystem type I should consider? Finally, are there
any pitfalls I should know about in my high-level design?

Details:
# uname -a
Linux backend5 5.7.11-200.fc32.x86_64 #1 SMP Wed Jul 29 17:15:52 UTC
2020 x86_64 x86_64 x86_64 GNU/Linux
# mdadm --version
mdadm - v4.1 - 2018-10-01

Finally -- and this is inadequate given the value I've received from the
efforts of this group -- thanks for many years of supporting mdadm
and helping with software RAID issues, including the recovery
procedures you have written up and guided me through. This group's
efforts have saved my data, my bacon, and my sanity on more than one
occasion.
--
George Rapp  (Pataskala, OH) Home: george.rapp -- at -- gmail.com
LinkedIn profile: https://www.linkedin.com/in/georgerapp
Phone: +1 740 936 RAPP (740 936 7277)

* Re: Recommended filesystem for RAID 6
  2020-08-11  4:42 Recommended filesystem for RAID 6 George Rapp
@ 2020-08-11 15:06 ` Roy Sigurd Karlsbakk
  2020-08-11 19:19   ` Michael Fritscher
  2020-08-11 15:22 ` antlists
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 25+ messages in thread
From: Roy Sigurd Karlsbakk @ 2020-08-11 15:06 UTC (permalink / raw)
  To: George Rapp; +Cc: Linux Raid

> Hello Linux RAID community -

Hi

> I've been running an assortment of software RAID arrays for a while
> now (my oldest Creation Time according to 'mdadm --detail' is April
> 2011) but have been meaning to consolidate my five active arrays into
> something easier to manage. This weekend, I finally found enough cheap
> 2TB disks to get started. I'm planning on creating a RAID 6 array due
> to the age and consumer-grade quality of my 16 2TB disks.

I'd recommend first checking each drive's SMART data if they're old and rusty. You can start off with 'smartctl -H /dev/sdX' and, if that's OK, check 'smartctl -a' and look for errors, in particular current pending sectors. A 'smartctl -t short' or even '-t long' self-test won't hurt either. If you find pending sectors or other bad signs, either leave that drive out or at least make sure you have sufficient redundancy. 16 drives in a single RAID-6 may be a bit high, but it should work. With any more than that (or even fewer), make several RAIDs and use LVM or md to stripe the data across them.
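
For the record, a quick per-drive pass could look something like this
(a sketch; device names are placeholders for your actual drives):

# smartctl -H /dev/sdb
# smartctl -a /dev/sdb | grep -Ei 'error|pending|realloc'
# smartctl -t long /dev/sdb

and then re-check 'smartctl -a /dev/sdb' once the long self-test finishes.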

> Use case is long-term storage of many small files and a few large ones
> (family photos and videos, backups of other systems, working copies of
> photo, audio, and video edits, etc.). Current usable space is about
> 10TB but my end state vision is probably upwards of 20TB. I'll
> probably consign the slowest working disks in the server to an archive
> filesystem, either RAID 1 or RAID 5, for stuff I care less about and
> backups; the archive part can be ignored for the purposes of this
> exercise.

RAID-6 is nice for archival stuff. RAID-1 (or RAID-10) gives you better IOPS and so on, but for mass storage, RAID-10 isn't really much safer than RAID-6. RAID-5 also works, but then suddenly, one day, a disk dies, you swap it with a new one, and another shows bad sectors. Then you have data corruption. I rarely use RAID-5 anymore, since RAID-6 isn't much heavier on the CPU, and the cost of another drive is low compared to the time I'd spend rebuilding a broken array in case of the above, or even a double disk failure (yes, that happens too).

> My question is: what filesystem type would be best practice for my use
> case and size requirements on the big array? (I have reviewed
> https://raid.wiki.kernel.org/index.php/RAID_and_filesystems, but am
> looking for practitioners' recommendations.)  I've run ext4
> exclusively on my arrays to date, but have been reading up on xfs; is
> there another filesystem type I should consider? Finally, are there
> any pitfalls I should know about in my high-level design?

I've mostly ditched ext4 on large filesystems, since AFAIK it still makes a 32-bit fs if created on something <16TiB, and then you're unable to grow it past 16TiB without recreating it (backup/create/restore). Also, when something goes bad (it might be a power spike, a sudden power failure, a bug, something), you'll need to run a check on the filesystem. With fsck.ext4, this may take hours, many hours, on such a large filesystem. With xfs_repair, it doesn't take nearly so long. This is the main reason RHEL/CentOS switched to XFS as the default from v7 onward. The only thing that comes to mind as a good excuse for using ext4 is that it can be shrunk, something xfs doesn't support (yet).
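
If you do stick with ext4, you can sidestep the 16TiB growth trap by
forcing the 64-bit feature at creation time; a sketch (the md device name
is a placeholder):

# mkfs.ext4 -O 64bit /dev/md0
# dumpe2fs -h /dev/md0 | grep -i 'features'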

Vennlig hilsen

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
Hið góða skaltu í stein höggva, hið illa í snjó rita.

* Re: Recommended filesystem for RAID 6
  2020-08-11  4:42 Recommended filesystem for RAID 6 George Rapp
  2020-08-11 15:06 ` Roy Sigurd Karlsbakk
@ 2020-08-11 15:22 ` antlists
  2020-08-11 16:23 ` Roman Mamedov
  2020-08-12 20:44 ` Peter Grandi
  3 siblings, 0 replies; 25+ messages in thread
From: antlists @ 2020-08-11 15:22 UTC (permalink / raw)
  To: George Rapp, Linux-RAID

On 11/08/2020 05:42, George Rapp wrote:
> Hello Linux RAID community -
> 
> I've been running an assortment of software RAID arrays for a while
> now (my oldest Creation Time according to 'mdadm --detail' is April
> 2011) but have been meaning to consolidate my five active arrays into
> something easier to manage. This weekend, I finally found enough cheap
> 2TB disks to get started. I'm planning on creating a RAID 6 array due
> to the age and consumer-grade quality of my 16 2TB disks.

SCT/ERC ???
> 
> Use case is long-term storage of many small files and a few large ones
> (family photos and videos, backups of other systems, working copies of
> photo, audio, and video edits, etc.). Current usable space is about
> 10TB but my end state vision is probably upwards of 20TB. I'll
> probably consign the slowest working disks in the server to an archive
> filesystem, either RAID 1 or RAID 5, for stuff I care less about and
> backups; the archive part can be ignored for the purposes of this
> exercise.

If you haven't got ERC, I'd be more inclined to raid-10 than raid 6.
Your 16 disks would give you about 10TB if you used a 3-way mirror, or
16TB if it's a 2-way.
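
Something along these lines, I believe (an untested sketch; member names
are placeholders):

# mdadm --create /dev/md0 --level=10 --layout=n3 --raid-devices=16 /dev/sd[b-q]1
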
> 
> My question is: what filesystem type would be best practice for my use
> case and size requirements on the big array? (I have reviewed
> https://raid.wiki.kernel.org/index.php/RAID_and_filesystems, but am
> looking for practitioners' recommendations.)  I've run ext4
> exclusively on my arrays to date, but have been reading up on xfs; is
> there another filesystem type I should consider? Finally, are there
> any pitfalls I should know about in my high-level design?

Think about dm-integrity and LVM. Take a look at 
https://raid.wiki.kernel.org/index.php/System2020
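
The basic dm-integrity plumbing under md looks roughly like this (a
sketch, untested; device names are placeholders, and the first two steps
are repeated per member before building the array on the /dev/mapper
devices):

# integritysetup format /dev/sdb1
# integritysetup open /dev/sdb1 int-sdb1
# mdadm --create /dev/md0 --level=6 --raid-devices=16 /dev/mapper/int-*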

I'm still working my way through building that system, so that page is an
incomplete mess of musings at the moment, but it might give you ideas.

File systems? Ext4 is a good choice. Remember that filesystems like 
btrfs and zfs are trying to replace raid, lvm etc. and subsume it all 
into the filesystem. Do you want a layered KISS setup, or an 
all-things-to-all-men filesystem?

And look at slowly upgrading your disks to decent raid-happy 4TB drives 
or so ... Ironwolves aren't that expensive ...
> 
> Details:
> # uname -a
> Linux backend5 5.7.11-200.fc32.x86_64 #1 SMP Wed Jul 29 17:15:52 UTC
> 2020 x86_64 x86_64 x86_64 GNU/Linux
> # mdadm --version
> mdadm - v4.1 - 2018-10-01
> 
> Finally, and this is inadequate given the value I've received from the
> efforts of this group, but thanks for many years of supporting mdadm
> and helping with software RAID issues, including the recovery
> procedures you have written up and guided me through. This group's
> efforts have saved my data, my bacon, and my sanity on more than one
> occasion.

Thanks very much :-)

Cheers,
Wol

* Re: Recommended filesystem for RAID 6
  2020-08-11  4:42 Recommended filesystem for RAID 6 George Rapp
  2020-08-11 15:06 ` Roy Sigurd Karlsbakk
  2020-08-11 15:22 ` antlists
@ 2020-08-11 16:23 ` Roman Mamedov
  2020-08-11 18:57   ` Reindl Harald
                     ` (2 more replies)
  2020-08-12 20:44 ` Peter Grandi
  3 siblings, 3 replies; 25+ messages in thread
From: Roman Mamedov @ 2020-08-11 16:23 UTC (permalink / raw)
  To: George Rapp; +Cc: Linux-RAID

On Tue, 11 Aug 2020 00:42:33 -0400
George Rapp <george.rapp@gmail.com> wrote:

> Use case is long-term storage of many small files and a few large ones
> (family photos and videos, backups of other systems, working copies of
> photo, audio, and video edits, etc.). Current usable space is about
> 10TB but my end state vision is probably upwards of 20TB. I'll
> probably consign the slowest working disks in the server to an archive
> filesystem, either RAID 1 or RAID 5, for stuff I care less about and
> backups; the archive part can be ignored for the purposes of this
> exercise.
> 
> My question is: what filesystem type would be best practice for my use
> case and size requirements on the big array? (I have reviewed
> https://raid.wiki.kernel.org/index.php/RAID_and_filesystems, but am
> looking for practitioners' recommendations.)  I've run ext4
> exclusively on my arrays to date, but have been reading up on xfs; is
> there another filesystem type I should consider? Finally, are there
> any pitfalls I should know about in my high-level design?

Whichever filesystem you choose, you will end up with a huge single point of
failure, and any trouble with that FS or the underlying array puts all your
data instantly at risk. "But RAID6" -- what about a SATA controller failure,
or flaky cabling/PSU/backplane, which disconnects, say, 4 disks at once "on
the fly"? What about a sudden power loss amid heavy write load? And so on.

First of all, ask yourself -- is all of this backed up? If no, then go and buy
more drives until the answer is yes. With current drive prices, or as you say,
with having lots of spare old drives lying around, there's no excuse to leave
anything non-trivial not backed up.

Secondly -- if all of this... is BACKED UP ANYWAY, why even run RAID? And with
RAID6, even waste 2 more drives on redundancy. Do you need 24x7 uptime of your
home NAS? Do you have hotswap cages? Do you require that the server absolutely
stays online while a disk is being replaced?

Most likely you do not. And the RAID's main purpose in that case is to just
have a unified storage pool, for the convenience of not having to manage free
space across so many drives. But given the above, I would suggest leaving the
drives with their individual FSes, and just running MergerFS on top: 
https://www.teknophiles.com/2018/02/19/disk-pooling-in-linux-with-mergerfs/
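
The pool mount itself can be a single fstab line, something like this
(a sketch; mount points are placeholders, options per the mergerfs docs):

/mnt/disk1:/mnt/disk2:/mnt/disk3  /mnt/pool  fuse.mergerfs  defaults,allow_other,use_ino  0 0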

Massively simpler and more resilient: no longer a huge array which also needs
to be painstakingly reshaped up and down when you add/remove space. Just add
an extra disk and done. Of course there's no redundancy, hence the backups
part. If a drive fails, everything that was on that drive is gone. But the
best part is, ONLY what was on that drive. Plug in a new one, restore the
lost files from backup, done. One caveat: you need to keep a record of what's
on each drive; I do that with a command like "find /mnt/* >
/somewhere/list-$date.txt", kept periodically updated. Yes, I use such a
solution myself now, having migrated from Btrfs on top of MD RAID after a
"flaky cabling"-induced complete failure of the array-wide FS.

For the FS considerations, the dealbreaker of XFS for me is its inability to
be shrunk. The ivory tower people do not think that is important enough, but
for me that limits the FS applicability severely. Also it loved truncating
currently-open files to zero bytes on power loss (dunno if that's been
improved). IIRC JFS can't be shrunk either, but it seems like that one can be
considered legacy at this point. The remaining filesystems that can be freely
resized are Ext4 and Btrfs. In any case, do not go with Btrfs' built-in RAID
yet:
https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@hungrycats.org/

-- 
With respect,
Roman

* Re: Recommended filesystem for RAID 6
  2020-08-11 16:23 ` Roman Mamedov
@ 2020-08-11 18:57   ` Reindl Harald
  2020-08-11 19:33     ` Roman Mamedov
  2020-08-11 22:14   ` Roy Sigurd Karlsbakk
  2020-08-12 14:16   ` Nix
  2 siblings, 1 reply; 25+ messages in thread
From: Reindl Harald @ 2020-08-11 18:57 UTC (permalink / raw)
  To: Roman Mamedov, George Rapp; +Cc: Linux-RAID



Am 11.08.20 um 18:23 schrieb Roman Mamedov:
> On Tue, 11 Aug 2020 00:42:33 -0400
> George Rapp <george.rapp@gmail.com> wrote:
> 
>> Use case is long-term storage of many small files and a few large ones
>> (family photos and videos, backups of other systems, working copies of
>> photo, audio, and video edits, etc.). Current usable space is about
>> 10TB but my end state vision is probably upwards of 20TB. I'll
>> probably consign the slowest working disks in the server to an archive
>> filesystem, either RAID 1 or RAID 5, for stuff I care less about and
>> backups; the archive part can be ignored for the purposes of this
>> exercise.
>>
>> My question is: what filesystem type would be best practice for my use
>> case and size requirements on the big array? (I have reviewed
>> https://raid.wiki.kernel.org/index.php/RAID_and_filesystems, but am
>> looking for practitioners' recommendations.)  I've run ext4
>> exclusively on my arrays to date, but have been reading up on xfs; is
>> there another filesystem type I should consider? Finally, are there
>> any pitfalls I should know about in my high-level design?
> 
> Whichever filesystem you choose, you will end up with a huge single point of
> failure, and any trouble with that FS or the underlying array put all your
> data instantly at risk. 

calling an array where you can lose *two* disks a
single point of failure is absurd

no RAID can replace backups anyway

> Most likely you do not. And the RAID's main purpose in that case is to just
> have a unified storage pool, for the convenience of not having to manage free
> space across so many drives. But given the above, I would suggest leaving the
> drives with their individual FSes, and just running MergerFS on top: 
> https://www.teknophiles.com/2018/02/19/disk-pooling-in-linux-with-mergerfs/

you just move the complexity to something not used by many people -- to
gain what, exactly? the drives are still in the same machine

"Secondly -- if all of this... is BACKED UP ANYWAY, why even run RAID?"
is pure nonsense! the best backups are the ones you never need, and before
i set up something where a dying drive takes more steps than swapping it
out, i would commit suicide

* Re: Recommended filesystem for RAID 6
  2020-08-11 15:06 ` Roy Sigurd Karlsbakk
@ 2020-08-11 19:19   ` Michael Fritscher
  2020-08-11 19:45     ` Wols Lists
  2020-08-12 14:07     ` Nix
  0 siblings, 2 replies; 25+ messages in thread
From: Michael Fritscher @ 2020-08-11 19:19 UTC (permalink / raw)
  To: linux-raid

Hi,

if you really want to use these tiny 2 TB HDDs -- yes, RAID 6 (2x -- the
second array being for the backup system in a physically different
location) is a good choice.

But: if you can, buy some 8-12 TB HDDs and forget the old rusty tiny
ones. You'll save a lot on the system -- and on power.

ext4 is fine. In my experience it is rock-solid, and fsck.ext4 is also
fairly quick (I don't know what Roy is doing that makes it so slow -- did
you really make a full-fledged ext4 with a journal, or an old ext2
filesystem?^^)

Another way would be deploying zfs with raid-z2 (Yes, I can hear the
screams :-D )

Best regards,
Michael Fritscher

* Re: Recommended filesystem for RAID 6
  2020-08-11 18:57   ` Reindl Harald
@ 2020-08-11 19:33     ` Roman Mamedov
  2020-08-11 19:49       ` Rudy Zijlstra
  2020-08-11 20:12       ` Reindl Harald
  0 siblings, 2 replies; 25+ messages in thread
From: Roman Mamedov @ 2020-08-11 19:33 UTC (permalink / raw)
  To: Reindl Harald; +Cc: George Rapp, Linux-RAID

On Tue, 11 Aug 2020 20:57:15 +0200
Reindl Harald <h.reindl@thelounge.net> wrote:

> > Whichever filesystem you choose, you will end up with a huge single point of
> > failure, and any trouble with that FS or the underlying array put all your
> > data instantly at risk. 
> 
> calling an array where you can lose *two* disks as
> single-point-of-failure is absurd

As noted before, it's not just *disks* that can fail; there is plenty else to
fail in a storage server, and it can easily take down, say, half of all
disks, or random portions of them in increments of four. Even if only
temporarily -- that surely will be "unexpected" to that single precious 20TB
filesystem. How will it behave? Who knows. Do you know? For added fun,
reconnect the drives 30 seconds later. Oh, let's write to linux-raid about
how to bring back half of a RAID6 from the Spare (S) status. Or find some
HOWTO suggesting a random --create without --assume-clean. And if the FS goes
corrupt, you now suddenly need *all* your backups, not just 1 drive's worth
of them.

> no raid can replace backups anyways

All too often I've seen RAID being used as an implicit excuse to be lenient
about backups. Heck, I know from personal experience how enticing that can be.

> > Most likely you do not. And the RAID's main purpose in that case is to just
> > have a unified storage pool, for the convenience of not having to manage free
> > space across so many drives. But given the above, I would suggest leaving the
> > drives with their individual FSes, and just running MergerFS on top: 
> > https://www.teknophiles.com/2018/02/19/disk-pooling-in-linux-with-mergerfs/
> 
> you just move the complexity to something not used by many people for
> what exactly to gain? the rives are still in the same machine

To gain total independence of drives from each other, you can pull any drive
out of the machine, plug it in somewhere else, and it will have a proper
filesystem and readable files on it. Writable, even.

Compared to a 16-drive RAID6, where you either have a whopping 14 disks
present, connected, powered, spinning, healthy, online, OR you have useless
binary junk instead of any data.

Of course I do not insist my way is the best for everyone, but I hope now you
can better understand the concerns and reasons for choosing it. :)

-- 
With respect,
Roman

* Re: Recommended filesystem for RAID 6
  2020-08-11 19:19   ` Michael Fritscher
@ 2020-08-11 19:45     ` Wols Lists
  2020-08-22  1:31       ` David C. Rankin
  2020-08-22 18:50       ` Chris Murphy
  2020-08-12 14:07     ` Nix
  1 sibling, 2 replies; 25+ messages in thread
From: Wols Lists @ 2020-08-11 19:45 UTC (permalink / raw)
  To: Michael Fritscher, linux-raid

On 11/08/20 20:19, Michael Fritscher wrote:
> Hi,
> 
> if you really want to use these tiny 2 TB HDDs - yes, RAID 6 (2x - the
> second for the backup system on a physically different location) is a
> good choice.
> 
> But: If you can, buy some 8-12 TB HDDs and forget the old rusty tiny
> HDDs. You'll save a lot at the system - and power.
> 
I'm looking at one of these ...
https://www.amazon.co.uk/Seagate-ST8000DM004-Barracuda-internal-Silver/dp/B075WYBQXJ/ref=pd_ybh_a_8?_encoding=UTF8&psc=1&refRID=WF1CTS2K9RWY96D1RENJ

Note that it IS a shingled drive, so fine for backup, much less so for
anything else. I'm not sure whether btrfs would be a good choice or not ...

> ext4 is fine. In my experience, it is rock-solid, and also fsck.ext4 is
> fairly qick (don't know what Roy is doing that it is so slow - do you
> really made a full-fledged ext4 with journal or a old ext2 file system?^^)
> 
> Another way would be deploying zfs with raid-z2 (Yes, I can hear the
> screams :-D )
> 
Cheers,
Wol

* Re: Recommended filesystem for RAID 6
  2020-08-11 19:33     ` Roman Mamedov
@ 2020-08-11 19:49       ` Rudy Zijlstra
  2020-08-11 20:13         ` Roman Mamedov
  2020-08-11 20:12       ` Reindl Harald
  1 sibling, 1 reply; 25+ messages in thread
From: Rudy Zijlstra @ 2020-08-11 19:49 UTC (permalink / raw)
  To: Roman Mamedov, Reindl Harald; +Cc: George Rapp, Linux-RAID



Op 11-08-2020 om 21:33 schreef Roman Mamedov:
> On Tue, 11 Aug 2020 20:57:15 +0200
> Reindl Harald <h.reindl@thelounge.net> wrote:
>
>>> Whichever filesystem you choose, you will end up with a huge single point of
>>> failure, and any trouble with that FS or the underlying array put all your
>>> data instantly at risk.
>> calling an array where you can lose *two* disks as
>> single-point-of-failure is absurd
> As noted before, not just *disks* can fail, plenty of other things to fail in
> a storage server, and they can easily take down, say, a half of all disks, or
> random portions of them in increments of four. Even if temporarily -- that
> surely will be "unexpected" to that single precious 20TB filesystem. How will
> it behave, who knows. Do you know? For added fun, reconnect the drives back 30
> seconds later. Oh, let's write to linux-raid for how to bring back a half of
> RAID6 from the Spare (S) status. Or find some HOWTO suggesting a random
> --create without --assume-clean. And if the FS goes corrupt, now you suddenly
> need *all* your backups, not just 1 drive worth of them.
actually, recovering from such an occasion is not the difficult part. 
The difficult part is controlling the human doing the recovery. Too 
often time pressure results in panic and the wrong choices. Which is 
where having a good backup is so very valuable.
Yes, I have had problems like that, and recovered. mdadm -A -f is your 
friend :)
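
For the record, that is along the lines of (a sketch; array and member
names are placeholders):

# mdadm --assemble --force /dev/md0 /dev/sd[b-q]1
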
>
>> no raid can replace backups anyways
> All too often I've seen RAID being used as an implicit excuse to be lenient
> about backups. Heck, I know from personal experience how enticing that can be.
Is the backup not automatic? In that case it is no backup.
>
>>> Most likely you do not. And the RAID's main purpose in that case is to just
>>> have a unified storage pool, for the convenience of not having to manage free
>>> space across so many drives. But given the above, I would suggest leaving the
>>> drives with their individual FSes, and just running MergerFS on top:
>>> https://www.teknophiles.com/2018/02/19/disk-pooling-in-linux-with-mergerfs/
>> you just move the complexity to something not used by many people for
>> what exactly to gain? the rives are still in the same machine
> To gain total independence of drives from each other, you can pull any drive
> out of the machine, plug it in somewhere else, and it will have a proper
> filesystem and readable files on it. Writable, even.
>
> Compared to a 16-drive RAID6, where you either have a whopping 14 disks
> present, connected, powered, spinning, healthy, online, OR you have useless
> binary junk instead of any data.
>
> Of course I do not insist my way is the best for everyone, but I hope now you
> can better understand the concerns and reasons for choosing it. :)
You have simply chosen a different set of mistakes to make. Considering 
you need to update the "what is where" list regularly (is that 
automated?) you may actually have more options for mistakes.

My biggest raid currently is a 6 disk raid10

Cheers

Rudy

* Re: Recommended filesystem for RAID 6
  2020-08-11 19:33     ` Roman Mamedov
  2020-08-11 19:49       ` Rudy Zijlstra
@ 2020-08-11 20:12       ` Reindl Harald
  1 sibling, 0 replies; 25+ messages in thread
From: Reindl Harald @ 2020-08-11 20:12 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: George Rapp, Linux-RAID



Am 11.08.20 um 21:33 schrieb Roman Mamedov:
> On Tue, 11 Aug 2020 20:57:15 +0200
> Reindl Harald <h.reindl@thelounge.net> wrote:
> 
>>> Whichever filesystem you choose, you will end up with a huge single point of
>>> failure, and any trouble with that FS or the underlying array put all your
>>> data instantly at risk. 
>>
>> calling an array where you can lose *two* disks as
>> single-point-of-failure is absurd
> 
> As noted before, not just *disks* can fail, plenty of other things to fail in
> a storage server, and they can easily take down, say, a half of all disks, or
> random portions of them in increments of four. Even if temporarily -- that
> surely will be "unexpected" to that single precious 20TB filesystem.

a solar storm can wipe everything

> How will
> it behave, who knows. Do you know? For added fun, reconnect the drives back 30
> seconds later. Oh, let's write to linux-raid for how to bring back a half of
> RAID6 from the Spare (S) status. Or find some HOWTO suggesting a random
> --create without --assume-clean. And if the FS goes corrupt, now you suddenly
> need *all* your backups, not just 1 drive worth of them.

i have not lost a single bit on a RAID in my whole lifetime, but i have
faced a ton of dying drives

>> no raid can replace backups anyways
> 
> All too often I've seen RAID being used as an implicit excuse to be lenient
> about backups. Heck, I know from personal experience how enticing that can be.
> 
>>> Most likely you do not. And the RAID's main purpose in that case is to just
>>> have a unified storage pool, for the convenience of not having to manage free
>>> space across so many drives. But given the above, I would suggest leaving the
>>> drives with their individual FSes, and just running MergerFS on top: 
>>> https://www.teknophiles.com/2018/02/19/disk-pooling-in-linux-with-mergerfs/
>>
>> you just move the complexity to something not used by many people for
>> what exactly to gain? the rives are still in the same machine
> 
> To gain total independence of drives from each other, you can pull any drive
> out of the machine, plug it in somewhere else, and it will have a proper
> filesystem and readable files on it. Writable, even.

i have not lost a single bit on a RAID in my whole lifetime, but i have
faced a ton of dying drives

* Re: Recommended filesystem for RAID 6
  2020-08-11 19:49       ` Rudy Zijlstra
@ 2020-08-11 20:13         ` Roman Mamedov
  2020-08-11 20:17           ` Reindl Harald
  0 siblings, 1 reply; 25+ messages in thread
From: Roman Mamedov @ 2020-08-11 20:13 UTC (permalink / raw)
  To: Rudy Zijlstra; +Cc: Reindl Harald, George Rapp, Linux-RAID

On Tue, 11 Aug 2020 21:49:07 +0200
Rudy Zijlstra <rudy@grumpydevil.homelinux.org> wrote:

> >> no raid can replace backups anyways
> > All too often I've seen RAID being used as an implicit excuse to be lenient
> > about backups. Heck, I know from personal experience how enticing that can be.
> Is the backup not automatic? in that case it is no backup.

Sure it's automatic, by lenient I meant starting to exclude parts of their
data set from being backed up, deciding what is "important" and what is not,
and for the latter portion just hoping for the best because "it is RAID after
all, I can lose TWO drives, what could possibly go wrong". And then something
goes wrong, and it turns out even the data that was "not important" is actually
important enough to warrant scrambling for how to recover it, because
reobtaining it turns out to be costly / a huge effort / not actually possible,
even if they thought it would be.

> You have simply chosen a different set of mistakes to make. Considering 
> you need to update the "what is where" list regularly (is that 
> automated?)

Of course. In fact I'd suggest keeping similar lists no matter which storage
setup you run. One thing worse than losing data, is losing data *and* not
remembering what you had on there in the first place :)

-- 
With respect,
Roman

* Re: Recommended filesystem for RAID 6
  2020-08-11 20:13         ` Roman Mamedov
@ 2020-08-11 20:17           ` Reindl Harald
  0 siblings, 0 replies; 25+ messages in thread
From: Reindl Harald @ 2020-08-11 20:17 UTC (permalink / raw)
  To: Roman Mamedov, Rudy Zijlstra; +Cc: George Rapp, Linux-RAID



Am 11.08.20 um 22:13 schrieb Roman Mamedov:
> On Tue, 11 Aug 2020 21:49:07 +0200
> Rudy Zijlstra <rudy@grumpydevil.homelinux.org> wrote:
> 
>> You have simply chosen a different set of mistakes to make. Considering 
>> you need to update the "what is where" list regularly (is that 
>> automated?)
> 
> Of course. In fact I'd suggest keeping similar lists no matter which storage
> setup you run. One thing worse than losing data, is losing data *and* not
> remembering what you had on there in the first place :)

if you don't remember it and don't miss anything, everything is fine

however, what is so difficult about running a RAID plus an rsync cronjob
to back everything up?

data without a backup is lost data, you just don't know the point in time
yet -- and in that case i prefer to delete it now

* Re: Recommended filesystem for RAID 6
  2020-08-11 16:23 ` Roman Mamedov
  2020-08-11 18:57   ` Reindl Harald
@ 2020-08-11 22:14   ` Roy Sigurd Karlsbakk
  2020-08-12 14:16   ` Nix
  2 siblings, 0 replies; 25+ messages in thread
From: Roy Sigurd Karlsbakk @ 2020-08-11 22:14 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: George Rapp, Linux Raid

>> Use case is long-term storage of many small files and a few large ones
>> (family photos and videos, backups of other systems, working copies of
>> photo, audio, and video edits, etc.). Current usable space is about
>> 10TB but my end state vision is probably upwards of 20TB. I'll
>> probably consign the slowest working disks in the server to an archive
>> filesystem, either RAID 1 or RAID 5, for stuff I care less about and
>> backups; the archive part can be ignored for the purposes of this
>> exercise.
>> 
>> My question is: what filesystem type would be best practice for my use
>> case and size requirements on the big array? (I have reviewed
>> https://raid.wiki.kernel.org/index.php/RAID_and_filesystems, but am
>> looking for practitioners' recommendations.)  I've run ext4
>> exclusively on my arrays to date, but have been reading up on xfs; is
>> there another filesystem type I should consider? Finally, are there
>> any pitfalls I should know about in my high-level design?
> 
> Whichever filesystem you choose, you will end up with a huge single point of
> failure, and any trouble with that FS or the underlying array put all your
> data instantly at risk. "But RAID6" -- what about a SATA controller failure,
> or a flaky cabling/PSU/backplane, which disconnects, say, 4 disks at once "on
> the fly". What about a sudden power loss amidst heavy write load. And so on.

If that happens, you just connect the drives to another controller and reassemble. If you are really, really unlucky and lose more than two of the drives in a RAID-6, you restore from backup.

> First of all, ask yourself -- is all of this backed up? If no, then go and buy
> more drives until the answer is yes. With current drive prices, or as you say,
> with having lots of spare old drives lying around, there's no excuse to leave
> anything non-trivial not backed up.
> 
> Secondly -- if all of this... is BACKED UP ANYWAY, why even run RAID? And with
> RAID6, even waste 2 more drives for redundancy. Do you need 24x7 uptime of your
> home NAS, do you have hotswap cages, do you require that the server absolutely
> stays online while a disk is being replaced.

Simply because if you lose 10-20TiB of data on a disk failure, you also lose a week or two to troubleshooting and restoring, instead of just replacing that disk.


> (blablbla)
>
> For the FS considerations, the dealbreaker of XFS for me is its inability to
> be shrunk. 

And how many times have you had the need to shrink a large filesystem used for the mentioned purposes? To me that's zero. It's practical, yes, but the adverse effects of ext4 on large filesystems are worse.

But hey - go on with your "good" advice. I just stick to my own so far.

- Remember that RAID is not backup, it's redundancy. It may fail any day at any time, but normally just one drive fails, sometimes two. With sufficient redundancy, you just swap those drives out.
- Redundancy has a cost, but time also has a cost, even at home. If you need to spend hours or days to restore a system, the time you spent could be used for family or friends.

And so on…

Vennlig hilsen

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
Hið góða skaltu í stein höggva, hið illa í snjó rita.

* Re: Recommended filesystem for RAID 6
  2020-08-11 19:19   ` Michael Fritscher
  2020-08-11 19:45     ` Wols Lists
@ 2020-08-12 14:07     ` Nix
  1 sibling, 0 replies; 25+ messages in thread
From: Nix @ 2020-08-12 14:07 UTC (permalink / raw)
  To: Michael Fritscher; +Cc: linux-raid

On 11 Aug 2020, Michael Fritscher told this:
> ext4 is fine. In my experience, it is rock-solid, and also fsck.ext4 is
> fairly qick (don't know what Roy is doing that it is so slow - do you
> really made a full-fledged ext4 with journal or a old ext2 file system?^^)

I note that modern mke2fs leaves whole block groups uninitialized if
it can, and any block groups that end up with no files in them again get
marked uninitialized once more (as of e2fsprogs 1.43). If an older
e2fsprogs than that is in use, or if this is an fs too old to support
uninitialized block groups, or if the fs simply doesn't have uninit_bg
enabled (which requires explicit action at creation time, these days),
e2fsck will be massively slower than if it can exploit the
uninitialized bgs to (basically) skip huge chunks of the fsck work on
most of the fs that is known to be empty.

Without this optimization, one component of fsck time is linear in the
total size of the fs: with it, it's linear in the *allocated* space used
on the fs. (There are other passes that scale as number of allocated
inodes, number of directories, etc.)
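
You can check whether a given fs has the feature with something like
(a sketch; the device name is a placeholder):

# dumpe2fs -h /dev/md0 | grep -i 'features'

and look for uninit_bg (or, IIRC, metadata_csum on newer filesystems,
which subsumes it).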

-- 
NULL && (void)

* Re: Recommended filesystem for RAID 6
  2020-08-11 16:23 ` Roman Mamedov
  2020-08-11 18:57   ` Reindl Harald
  2020-08-11 22:14   ` Roy Sigurd Karlsbakk
@ 2020-08-12 14:16   ` Nix
  2020-08-12 14:41     ` Roman Mamedov
  2 siblings, 1 reply; 25+ messages in thread
From: Nix @ 2020-08-12 14:16 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: George Rapp, Linux-RAID

On 11 Aug 2020, Roman Mamedov stated:
> For the FS considerations, the dealbreaker of XFS for me is its inability to
> be shrunk. The ivory tower people do not think that is important enough, but
> for me that limits the FS applicability severely. Also it loved truncating
> currently-open files to zero bytes on power loss (dunno if that's been
> improved).

I've been using XFS for more than ten years now and have never seen this
allegedly frequent behaviour at all. It certainly seems to be less
common than, say, fs damage due to the (unjournalled) RAID write hole.

I suspect you're talking about this:
<https://xfs.org/index.php/XFS_FAQ#Q:_Why_do_I_see_binary_NULLS_in_some_files_after_recovery_when_I_unplugged_the_power.3F>,
which was fixed in *2007*. So... ignore it, it's *long* dead. (Equally,
ignore complaints about xfs being really slow under heavy metadata
updates: this was true before delayed logging was implemented, but
delaylog has been non-experimental since 2.6.39 (2011) and the
non-delaylog option was removed in 2015. xfs is often now faster than
ext4 at metadata operations, and is generally on par with it.)

Shrinking xfs is relatively irrelevant these days: if you want to be
able to shrink, use thin provisioning and run fstrim periodically. The
space used by the fs will then shrink whenever fstrim is run, with no
need to mess about with filesystem resizing.
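
A minimal sketch of that, assuming LVM thin provisioning (names and sizes
are placeholders):

# lvcreate --type thin-pool -L 10T -n pool0 vg0
# lvcreate --thin -V 20T -n data vg0/pool0
# mkfs.xfs /dev/vg0/data

then mount it and run 'fstrim' on the mount point periodically (or enable
fstrim.timer).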

* Re: Recommended filesystem for RAID 6
  2020-08-12 14:16   ` Nix
@ 2020-08-12 14:41     ` Roman Mamedov
  0 siblings, 0 replies; 25+ messages in thread
From: Roman Mamedov @ 2020-08-12 14:41 UTC (permalink / raw)
  To: Nix; +Cc: George Rapp, Linux-RAID

On Wed, 12 Aug 2020 15:16:37 +0100
Nix <nix@esperi.org.uk> wrote:

> On 11 Aug 2020, Roman Mamedov stated:
> > For the FS considerations, the dealbreaker of XFS for me is its inability to
> > be shrunk. The ivory tower people do not think that is important enough, but
> > for me that limits the FS applicability severely. Also it loved truncating
> > currently-open files to zero bytes on power loss (dunno if that's been
> > improved).
> 
> I've been using XFS for more than ten years now and have never seen this
> allegedly frequent behaviour at all. It certainly seems to be less
> common than, say, fs damage due to the (unjournalled) RAID write hole.
> 
> I suspect you're talking about this:
> <https://xfs.org/index.php/XFS_FAQ#Q:_Why_do_I_see_binary_NULLS_in_some_files_after_recovery_when_I_unplugged_the_power.3F>,
> whicih was fixed in *2007*.

No, it was not nulls inside files, but files becoming zero bytes in size.
Like reported here https://bugzilla.redhat.com/show_bug.cgi?id=845233
Or maybe https://access.redhat.com/solutions/272673 too (don't have an account)
Maybe that's also been fixed since 2012. If so, good for you.

> Shrinking xfs is relatively irrelevant these days: if you want to be
> able to shrink, use thin provisioning and run fstrim periodically. The
> space used by the fs will then shrink whenever fstrim is run, with no
> need to mess about with filesystem resizing.

It also means you can't practically remove drives from that 16-drive RAID6 if
you go with XFS. Cannot shrink the filesystem, cannot reshape to fewer disks.
Cannot switch to a different disk enclosure with fewer slots.
Would you suggest patching that up by running an array-wide thin volume as
well? That will not work, as the LVM thin backing volume *itself* cannot be
shrunk either. And even in more suitable conditions, thin has its own
overheads and added complexity.
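
With ext4 on md, by contrast, dropping a member is tedious but doable;
roughly this (an untested sketch -- double-check every size and have
backups before trying it on real data):

# resize2fs /dev/md0 <new-smaller-size>
# mdadm --grow /dev/md0 --array-size=<just-above-the-new-fs-size>
# mdadm --grow /dev/md0 --raid-devices=15 --backup-file=/root/md0-reshape.bak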

Well, if we are to mention something positive about XFS, it's another major
Linux FS other than Btrfs (which is not everyone's cup of tea for its own
reasons), that now got support for reflink copies (cp --reflink). Backing up
file-based VMs can be done online and atomically now, like with LVM. And in
general, it's the next-best thing to Btrfs snapshots, but in a mature and
all-around high-performing FS.
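
E.g. something like this (assuming a reflink-enabled XFS, the default on
recent mkfs.xfs):

# cp --reflink=always vm-disk.img vm-disk-backup.img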

-- 
With respect,
Roman

* Re: Recommended filesystem for RAID 6
  2020-08-11  4:42 Recommended filesystem for RAID 6 George Rapp
                   ` (2 preceding siblings ...)
  2020-08-11 16:23 ` Roman Mamedov
@ 2020-08-12 20:44 ` Peter Grandi
  3 siblings, 0 replies; 25+ messages in thread
From: Peter Grandi @ 2020-08-12 20:44 UTC (permalink / raw)
  To: Linux-RAID

> [...] enough cheap 2TB disks to get started. I'm planning on
> creating a RAID 6 array due to the age and consumer-grade
> quality of my 16 2TB disks. [...] Use case is long-term
> storage of many small files and a few large ones (family
> photos and videos, backups of other systems, working copies of
> photo, audio, and video edits, etc.). Current usable space is
> about 10TB but my end state vision is probably upwards of
> 20TB.

As other people have remarked, a single RAID is not long-term
storage; RAID redundancy is designed for *online continuity*
(that is, for storage systems to remain available despite media
failures), not for "backup" or "integrity" (even if LVM RAID has
got some recent additions for that).

Also, RAID6 is terrible for writing small files, small files are
bad for any filesystem type, and recovery times on a large RAID6
are risky. Also, it is much better BTW to use 'ar' or 'zip' etc.
('zip -0' for already-compressed files) to avoid many small
files, especially if they are part of a read-only collection;
most GUI tools can access archive files as if they were
directories. Some tools have checksums or even redundancy codes
for members.
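
For instance, something like this packs a year of already-compressed
photos into a single archive file (a sketch; paths are placeholders):

# zip -0 -r photos-2019.zip photos/2019/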

But overall the idea is that if you are doing archival, a single
large storage pool is a terrible idea (it is a terrible idea in
general, but especially for archival):

  http://www.sabi.co.uk/blog/0805may.html?080516#080516

Probably your best option is to have a series of separate
smaller "silos" (or "data drawers" or "data shelves"), where you
fill one "silo" at a time; when full, it can be mounted
read-only, and then you fill the next, and so on. An alternative
is to have two RW silos: one for archival, and one permanently RW
for home directories.

For backup you can buy some large disk drives, at least 3 (well,
2 is the minimum but that is rather riskier), to use in rotation,
and partition them in the size of the "silos", and dump the
currently-RW "silo" with 'tar' or 'dump' (or even 'dd', but be
careful about duplicate UUIDs or labels), but not rsync if you
have many small files. The backups can easily be encrypted even
if the live "silos" are not.

As the currently-RW "silo" fills up you keep backing it up to
distinct backup disks, and once it is full you can stop backing
it up and keep its backup disks on a shelf.
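
A minimal sketch of the tape-style variant (untested; the device and
paths are placeholders):

# tar -cf /dev/sdx3 -C /srv/silo0 .
# tar -tf /dev/sdx3 | head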

For example you could have a number of 2+1 or 3+1 or 4+1 RAID5s
as "silos", with a 4TB or 6TB or 8TB raw capacity, and fill one
after the other, and then get 8-12TB backup drives, partitioned
accordingly. 4TB is for me a good limit for "live" silos.

The filesystem type for the silos does not matter much, but for
archiving I like the checksumming COW filesystem types, like
NILFS2/Btrfs/ZFS (but I never use the Btrfs volume manager, I
prefer MD RAID by far, while the ZFS volume manager is
acceptable). Otherwise I like JFS (or F2FS or even UDF).

* Re: Recommended filesystem for RAID 6
  2020-08-11 19:45     ` Wols Lists
@ 2020-08-22  1:31       ` David C. Rankin
  2020-08-22  7:25         ` Peter Grandi
  2020-08-22 18:50       ` Chris Murphy
  1 sibling, 1 reply; 25+ messages in thread
From: David C. Rankin @ 2020-08-22  1:31 UTC (permalink / raw)
  To: mdraid

On 8/11/20 2:45 PM, Wols Lists wrote:
> Note that it IS a shingled drive, so fine for backup, much less so for
> anything else. I'm not sure whether btrfs would be a good choice or not ...
> 

I'll defer to you, but last I checked btrfs did NOT play well with raid 5/6.
It may be old info, but:

https://superuser.com/questions/1131701/btrfs-over-mdadm-raid6

-- 
David C. Rankin, J.D.,P.E.

* Re: Recommended filesystem for RAID 6
  2020-08-22  1:31       ` David C. Rankin
@ 2020-08-22  7:25         ` Peter Grandi
  2020-08-22  9:38           ` Wols Lists
  2020-08-22 19:04           ` Chris Murphy
  0 siblings, 2 replies; 25+ messages in thread
From: Peter Grandi @ 2020-08-22  7:25 UTC (permalink / raw)
  To: list Linux RAID

>> [...] Note that it IS a shingled drive, so fine for backup,
>> much less so for anything else.

It is fine for backup, especially if used like a tape that is,
say, divided into partitions with the backup done using 'dd' (but
be careful if using Btrfs) or 'tar' or similar. 'rsync' and the
like still write a lot of inodes, and often small files if they
are present in the source.

>> I'm not sure whether btrfs would be a good choice or not ...

> [...] btrfs did NOT play well with raid 5/6. It may be old
> info, but:
> https://superuser.com/questions/1131701/btrfs-over-mdadm-raid6

That seems based on some misunderstanding: native Btrfs RAID 5/6
has some "limitations", like most of its volume management, but
running over MD RAID 5/6 is much the same as running on top of a
single disk, so it's fine. MD RAID 5/6 takes care of redundancy
so there is no need to have metadata in 'dup' mode.

Using RAID 5/6 with SMR drives can result in pretty huge delays
(IIRC a previous poster has given some relevant URL) on writing
or on rebuilding, as the "chunk" size is very likely not to be
congruent with the SMR zones.

* Re: Recommended filesystem for RAID 6
  2020-08-22  7:25         ` Peter Grandi
@ 2020-08-22  9:38           ` Wols Lists
  2020-08-22 19:21             ` Chris Murphy
  2020-08-22 19:04           ` Chris Murphy
  1 sibling, 1 reply; 25+ messages in thread
From: Wols Lists @ 2020-08-22  9:38 UTC (permalink / raw)
  To: Peter Grandi, list Linux RAID

On 22/08/20 08:25, Peter Grandi wrote:
>>> [...] Note that it IS a shingled drive, so fine for backup,
>>> >> much less so for anything else.
> It is fine for backup especially if used as a tape that is say
> divided into partitions and backup is done using 'dd' (but
> careful if using Btrfs) or 'tar' or similar. If using 'rsync' or
> similar those still write a lot of inodes and often small files
> if they are present in the source.
> 
The idea is an "in place" rsync, with lvm or btrfs or something
providing snapshots.

That way, I have full backups each of which only takes up the marginal
space required by an incremental :-)
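
Something like this, I'd guess (a sketch; paths are placeholders and the
target directory is assumed to be a btrfs subvolume):

# rsync -a --inplace --delete /data/ /backup/current/
# btrfs subvolume snapshot -r /backup/current /backup/snap-$(date +%F)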

Cheers,
Wol

* Re: Recommended filesystem for RAID 6
  2020-08-11 19:45     ` Wols Lists
  2020-08-22  1:31       ` David C. Rankin
@ 2020-08-22 18:50       ` Chris Murphy
  2020-08-22 19:54         ` Kai Stian Olstad
  2020-08-22 23:50         ` antlists
  1 sibling, 2 replies; 25+ messages in thread
From: Chris Murphy @ 2020-08-22 18:50 UTC (permalink / raw)
  To: Wols Lists; +Cc: Linux-RAID

On Tue, Aug 11, 2020 at 1:47 PM Wols Lists <antlists@youngman.org.uk> wrote:
>
> On 11/08/20 20:19, Michael Fritscher wrote:
> > Hi,
> >
> > if you really want to use these tiny 2 TB HDDs - yes, RAID 6 (2x - the
> > second for the backup system on a physically different location) is a
> > good choice.
> >
> > But: If you can, buy some 8-12 TB HDDs and forget the old rusty tiny
> > HDDs. You'll save a lot at the system - and power.
> >
> I'm looking at one of these ...
> https://www.amazon.co.uk/Seagate-ST8000DM004-Barracuda-internal-Silver/dp/B075WYBQXJ/ref=pd_ybh_a_8?_encoding=UTF8&psc=1&refRID=WF1CTS2K9RWY96D1RENJ
>
> Note that it IS a shingled drive, so fine for backup, much less so for
> anything else.

How can you tell? From the spec, I can't find anything that indicates
it. Let alone which of three varieties it is.
https://www.seagate.com/www-content/product-content/barracuda-fam/barracuda-new/en-us/docs/100805918d.pdf

> I'm not sure whether btrfs would be a good choice or not ...

Btrfs tries to write sequentially, both data and metadata, which
favors SMR drives.

For device managed SMR there are some likely optimizations to help
avoid random writes. Top on that list is for the workload to avoid
fsync. And also using mount options: longer commit time, notreelog,
space_cache v2, and nossd. If the drive reports rotational in sysfs,
then nossd is used by default. Space cache v2 is slated to become the
default soon.
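
As a sketch, that combination might look like:

# mount -o commit=300,notreelog,space_cache=v2,nossd /dev/sdb /mnt/backup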

For host managed SMR there are significant requirements. Including a
log structured super block.
https://lwn.net/Articles/806327/

Quite a lot of preparatory work has been happening before this series
lands in mainline. For other file systems, I'm not sure, but my guess
is using dm-zoned, basically making non-sequential writes from XFS and
ext4 into sequential writes and ensuring the various alignment
requirements.


-- 
Chris Murphy

* Re: Recommended filesystem for RAID 6
  2020-08-22  7:25         ` Peter Grandi
  2020-08-22  9:38           ` Wols Lists
@ 2020-08-22 19:04           ` Chris Murphy
  1 sibling, 0 replies; 25+ messages in thread
From: Chris Murphy @ 2020-08-22 19:04 UTC (permalink / raw)
  To: Peter Grandi; +Cc: list Linux RAID

On Sat, Aug 22, 2020 at 1:25 AM Peter Grandi <pg@mdraid.list.sabi.co.uk> wrote:
>
> >> [...] Note that it IS a shingled drive, so fine for backup,
> >> much less so for anything else.
>
> It is fine for backup especially if used as a tape that is say
> divided into partitions and backup is done using 'dd' (but
> careful if using Btrfs) or 'tar' or similar. If using 'rsync' or
> similar those still write a lot of inodes and often small files
> if they are present in the source.
>
> >> I'm not sure whether btrfs would be a good choice or not ...
>
> > [...] btrfs did NOT play well with raid 5/6. It may be old
> > info, but:
> > https://superuser.com/questions/1131701/btrfs-over-mdadm-raid6
>
> That seems based on some misunderstanding: native Btrfs 5/6 has
> some "limitations", like most of its volume management, but
> running over MS RAID 5/6 is much the same as running on top of a
> single disk, so fine. MD RAID 5/6 takes care of redundancy so
> there is no need to have metadata in 'dup' mode.

True but dup metadata is a small cost to have self-healing file system
metadata. While read errors mean md will reconstruct from parity, and
Btrfs would be none the wiser, Btrfs is more sensitive to certain
kinds of metadata loss than other file systems.
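
On top of md that would be something like (a sketch; the device name is a
placeholder):

# mkfs.btrfs -m dup -d single /dev/md0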

Where dup is pointless is the case of deduplication by SSDs with
concurrent writes, and dup metadata uses concurrent writes -- i.e. it's
not going to delay writing one of the copies, which is what it'd take
to thwart common internal SSD optimizations. Even on drives that don't
dedup, concurrent writes end up on the same erase block. And if
there's a hardware failure, it tends to affect an entire erase block.

> Using RAID 5/6 with SMR drives can result in pretty huge delays
> (IIRC a previous poster has given some relevant URL) on writing
> or on rebuilding, as the "chunk" size is very likely not to be
> very congruent with the SMT zones.

At least hmzoned Btrfs will disable features considered incompatible
with SMR: raid56, nodatacow (overwrites), and fallocate, which also
implies overwrites.

-- 
Chris Murphy

* Re: Recommended filesystem for RAID 6
  2020-08-22  9:38           ` Wols Lists
@ 2020-08-22 19:21             ` Chris Murphy
  0 siblings, 0 replies; 25+ messages in thread
From: Chris Murphy @ 2020-08-22 19:21 UTC (permalink / raw)
  To: Wols Lists; +Cc: Peter Grandi, list Linux RAID

On Sat, Aug 22, 2020 at 3:38 AM Wols Lists <antlists@youngman.org.uk> wrote:
>
> On 22/08/20 08:25, Peter Grandi wrote:
> >>> [...] Note that it IS a shingled drive, so fine for backup,
> >>> >> much less so for anything else.
> > It is fine for backup especially if used as a tape that is say
> > divided into partitions and backup is done using 'dd' (but
> > careful if using Btrfs) or 'tar' or similar. If using 'rsync' or
> > similar those still write a lot of inodes and often small files
> > if they are present in the source.
> >
> The idea is an "in place" rsync, with lvm or btrfs or something
> providing snapshots.
>
> That way, I have full backups each of which only takes up the marginal
> space required by an incremental :-)

In the case where the source and the destination are Btrfs, there can
be an advantage to using 'btrfs send' and 'btrfs receive'. No deep
traversal is required on either the send or the receive side to
determine the incremental changes between two snapshots.
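
Roughly like this (a sketch; paths are placeholders, and the parent
snapshot must already exist on the receiving side):

# btrfs subvolume snapshot -r /data /data/.snaps/today
# btrfs send -p /data/.snaps/yesterday /data/.snaps/today | btrfs receive /backup/.snaps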


-- 
Chris Murphy

* Re: Recommended filesystem for RAID 6
  2020-08-22 18:50       ` Chris Murphy
@ 2020-08-22 19:54         ` Kai Stian Olstad
  2020-08-22 23:50         ` antlists
  1 sibling, 0 replies; 25+ messages in thread
From: Kai Stian Olstad @ 2020-08-22 19:54 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Wols Lists, Linux-RAID

On Sat, Aug 22, 2020 at 12:50:50PM -0600, Chris Murphy wrote:
> On Tue, Aug 11, 2020 at 1:47 PM Wols Lists <antlists@youngman.org.uk> wrote:
> >
> > On 11/08/20 20:19, Michael Fritscher wrote:
> > > Hi,
> > >
> > > if you really want to use these tiny 2 TB HDDs - yes, RAID 6 (2x - the
> > > second for the backup system on a physically different location) is a
> > > good choice.
> > >
> > > But: If you can, buy some 8-12 TB HDDs and forget the old rusty tiny
> > > HDDs. You'll save a lot at the system - and power.
> > >
> > I'm looking at one of these ...
> > https://www.amazon.co.uk/Seagate-ST8000DM004-Barracuda-internal-Silver/dp/B075WYBQXJ/ref=pd_ybh_a_8?_encoding=UTF8&psc=1&refRID=WF1CTS2K9RWY96D1RENJ
> >
> > Note that it IS a shingled drive, so fine for backup, much less so for
> > anything else.
> 
> How can you tell? From the spec, I can't find anything that indicates
> it. Let alone which of three varieties it is.
> https://www.seagate.com/www-content/product-content/barracuda-fam/barracuda-new/en-us/docs/100805918d.pdf

You can't from the spec[1], but you have this one[2], published in the
wake of WD Reds starting to use SMR without telling anyone.

[1] https://blocksandfiles.com/2020/04/15/seagate-2-4-and-8tb-barracuda-and-desktop-hdd-smr/
[2] https://www.seagate.com/gb/en/internal-hard-drives/cmr-smr-list/


-- 
Kai Stian Olstad
PS: Sorry Chris, you get this twice, I forgot reply-to-all.

* Re: Recommended filesystem for RAID 6
  2020-08-22 18:50       ` Chris Murphy
  2020-08-22 19:54         ` Kai Stian Olstad
@ 2020-08-22 23:50         ` antlists
  1 sibling, 0 replies; 25+ messages in thread
From: antlists @ 2020-08-22 23:50 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Linux-RAID

On 22/08/2020 19:50, Chris Murphy wrote:
> How can you tell? From the spec, I can't find anything that indicates
> it. Let alone which of three varieties it is.
> https://www.seagate.com/www-content/product-content/barracuda-fam/barracuda-new/en-us/docs/100805918d.pdf

Simple.

Seagate now say that all the BarraCuda range are SMR. And this is a 
BarraCuda, not a Barracuda. (What's the difference, you say? The capital C.)

I've been digging, and I think if you want an old-fashioned CMR desktop 
drive you need the BarraCuda Pro or FireCuda. But at least -- like with 
the WD Reds and Red Pluses -- there is now an official naming convention 
that tells you what's what.

https://www.seagate.com/gb/en/internal-hard-drives/cmr-smr-list/

This is linked to on the wiki timeout problem page, as is the official 
WD statement.

Cheers,
Wol
