* Re: Suggestions for building new 44TB Raid5 array
@ 2022-06-11  4:51 Marc MERLIN
  2022-06-11  9:30 ` Roman Mamedov
  ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Marc MERLIN @ 2022-06-11 4:51 UTC (permalink / raw)
To: Andrei Borzenkov
Cc: Zygo Blaxell, Josef Bacik, linux-btrfs, Chris Murphy, Qu Wenruo

so, my apologies to all for the thread of death that is hopefully going
to be over soon. I still want to help Josef fix the tools though;
hopefully we'll get that filesystem back to a mountable state.

That said, it's been over 2 months now, and I do need to get this
filesystem back up from backup, so I ended up buying new drives (5x
11TiB in raid5).

Given the pretty massive corruption that happened in ways that I still
can't explain, I'll make sure to turn off all the drive write caches,
but I think I'm not sure I want to trust bcache anymore even though
I had it in writethrough mode.

Here's the Email from March, questions still apply:

Kernel will be 5.16. Filesystem will be 24TB and contain mostly bigger
files (100MB to 10GB).

1) mdadm --create /dev/md7 --level=5 --consistency-policy=ppl --raid-devices=5 /dev/sd[abdef]1 --chunk=256 --bitmap=internal
2) echo 0fb96f02-d8da-45ce-aba7-070a1a8420e3 > /sys/block/bcache64/bcache/attach
   gargamel:/dev# cat /sys/block/md7/bcache/cache_mode
   [writethrough] writeback writearound none
3) cryptsetup luksFormat --align-payload=2048 -s 256 -c aes-xts-plain64 /dev/bcache64
4) cryptsetup luksOpen /dev/bcache64 dshelf1
5) mkfs.btrfs -m dup -L dshelf1 /dev/mapper/dshelf1

Any other btrfs options I should set at format time to improve
reliability first and performance second?
I'm told I should use space_cache=v2, is it default now with btrfs-progs 5.10.1-2 ?

As for bcache, I'm really thinking about dropping it, unless I'm told
it should be safe to use.

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
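The space_cache=v2 question above can be settled on first mount regardless of the btrfs-progs default. A minimal sketch; device and mountpoint names are the ones from this thread and may differ, and the block is written as a dry run that only prints the commands it would run:

```shell
#!/bin/sh
# Sketch: enable the free-space tree (space_cache=v2) on first mount if
# mkfs didn't already default to it. One mount with the option converts
# the filesystem persistently; later mounts don't need it.
# DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }

run mount -o space_cache=v2 /dev/mapper/dshelf1 /mnt/dshelf1
# After conversion the FREE_SPACE_TREE compat_ro flag is set in the
# superblock, which can be inspected with:
run btrfs inspect-internal dump-super /dev/mapper/dshelf1
```

Set DRY_RUN=0 to actually execute the commands on a real system.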
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-11  4:51 Suggestions for building new 44TB Raid5 array Marc MERLIN
@ 2022-06-11  9:30 ` Roman Mamedov
  [not found] ` <CAK-xaQYc1PufsvksqP77HMe4ZVTkWuRDn2C3P-iMTQzrbQPLGQ@mail.gmail.com>
  2022-06-11 23:44 ` Zygo Blaxell
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Roman Mamedov @ 2022-06-11 9:30 UTC (permalink / raw)
To: Marc MERLIN
Cc: Andrei Borzenkov, Zygo Blaxell, Josef Bacik, linux-btrfs, Chris Murphy, Qu Wenruo

On Fri, 10 Jun 2022 21:51:20 -0700 Marc MERLIN <marc@merlins.org> wrote:

> Kernel will be 5.16. Filesystem will be 24TB and contain mostly bigger
> files (100MB to 10GB).

> 2) echo 0fb96f02-d8da-45ce-aba7-070a1a8420e3 > /sys/block/bcache64/bcache/attach
> gargamel:/dev# cat /sys/block/md7/bcache/cache_mode
> [writethrough] writeback writearound none

Maybe try LVM Cache this time?

> 3) cryptsetup luksFormat --align-payload=2048 -s 256 -c aes-xts-plain64 /dev/bcache64
> 4) cryptsetup luksOpen /dev/bcache64 dshelf1

What's the threat scenario for LUKS on the array? A major one for me
would be not having to RMA a disk with all my data still on the
platters. But with RAID5, a single disk by itself would not contain
easily discernible or usable data. Or if you're protecting against
unauthorized access to the entire array, then never mind.

> 5) mkfs.btrfs -m dup -L dshelf1 /dev/mapper/dshelf1

Personally I have switched from Btrfs on MD to individual disks and
MergerFS. The rationale for no RAID is the simplicity and resilience of
the individual single-disk filesystems, and that anything important or
not easily re-obtainable is backed up anyway; so the protection from
single disk failures is not as important, compared to the introduced
complexity and the chance of losing the entire huge FS (like you had).

-- 
With respect,
Roman

^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <CAK-xaQYc1PufsvksqP77HMe4ZVTkWuRDn2C3P-iMTQzrbQPLGQ@mail.gmail.com>]
* Re: Suggestions for building new 44TB Raid5 array
  [not found] ` <CAK-xaQYc1PufsvksqP77HMe4ZVTkWuRDn2C3P-iMTQzrbQPLGQ@mail.gmail.com>
@ 2022-06-11 14:52 ` Marc MERLIN
  2022-06-11 17:54   ` Roman Mamedov
  ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Marc MERLIN @ 2022-06-11 14:52 UTC (permalink / raw)
To: Andrea Gelmini, Roman Mamedov
Cc: Andrei Borzenkov, Zygo Blaxell, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Sat, Jun 11, 2022 at 09:27:57AM +0200, Andrea Gelmini wrote:
> On Sat, 11 Jun 2022 at 09:16, Marc MERLIN
> <marc@merlins.org> wrote:
> > As for bcache, I'm really thinking about dropping it, unless I'm told
> > it should be safe to use.
>
> https://lwn.net/Articles/895266/

Mmmh, bcachefs, I was not aware of this new one. Not sure if I want to
add yet another layer, especially if it's new.

On Sat, Jun 11, 2022 at 02:30:33PM +0500, Roman Mamedov wrote:
> > 2) echo 0fb96f02-d8da-45ce-aba7-070a1a8420e3 > /sys/block/bcache64/bcache/attach
> > gargamel:/dev# cat /sys/block/md7/bcache/cache_mode
> > [writethrough] writeback writearound none
>
> Maybe try LVM Cache this time?

Hard to know either way, trading one layer for another, and LVM has
always seemed heavier.

> > 3) cryptsetup luksFormat --align-payload=2048 -s 256 -c aes-xts-plain64 /dev/bcache64
> > 4) cryptsetup luksOpen /dev/bcache64 dshelf1
>
> What's the threat scenario for LUKS on the array?

In case my computer gets stolen, and indeed being able to recycle old
drives without having to worry is a nice bonus.

> > 5) mkfs.btrfs -m dup -L dshelf1 /dev/mapper/dshelf1
>
> Personally I have switched from Btrfs on MD to individual disks and MergerFS.

That gives you no redundancy if a drive dies, correct?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.

Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-11 14:52 ` Marc MERLIN
@ 2022-06-11 17:54   ` Roman Mamedov
  2022-06-12 17:31     ` Marc MERLIN
  2022-06-12 21:21   ` Roman Mamedov
  2022-06-20 20:37   ` Andrea Gelmini
  2 siblings, 1 reply; 24+ messages in thread
From: Roman Mamedov @ 2022-06-11 17:54 UTC (permalink / raw)
To: Marc MERLIN
Cc: Andrea Gelmini, Andrei Borzenkov, Zygo Blaxell, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Sat, 11 Jun 2022 07:52:59 -0700 Marc MERLIN <marc@merlins.org> wrote:

> 1) mdadm --create /dev/md7 --level=5 --consistency-policy=ppl
> --raid-devices=5 /dev/sd[abdef]1 --chunk=256 --bitmap=internal

One more thing I wanted to mention: did you have PPL on your previous
array, or was it not implemented yet back then? I know it is supposed
to protect against the write hole, which could have caused your
previous FS corruption.

> > > 5) mkfs.btrfs -m dup -L dshelf1 /dev/mapper/dshelf1
> >
> > Personally I have switched from Btrfs on MD to individual disks and MergerFS.
>
> That gives you no redundancy if a drive dies, correct?

Yes, but in MergerFS each file is stored entirely within a single disk;
there's no striping. So only files which happened to be on the failed
disk are lost and need to be restored from backups. For this it helps
to keep track of what was where, with something like
"find /mnt/ > `date`.lst" in crontab.

-- 
With respect,
Roman

^ permalink raw reply [flat|nested] 24+ messages in thread
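As an aside, the nightly listing suggested above could look like the following as an actual crontab entry. The paths and schedule are hypothetical, and note that % must be escaped as \% inside a crontab (the inline backtick example glosses over this):

```shell
# m  h  dom mon dow  command
30   3  *   *   *    find /mnt/disk* > "/var/log/filelists/$(date +\%F).lst"
```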
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-11 17:54 ` Roman Mamedov
@ 2022-06-12 17:31   ` Marc MERLIN
  0 siblings, 0 replies; 24+ messages in thread
From: Marc MERLIN @ 2022-06-12 17:31 UTC (permalink / raw)
To: Roman Mamedov
Cc: Andrea Gelmini, Andrei Borzenkov, Zygo Blaxell, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Sat, Jun 11, 2022 at 10:54:16PM +0500, Roman Mamedov wrote:
> On Sat, 11 Jun 2022 07:52:59 -0700
> Marc MERLIN <marc@merlins.org> wrote:
>
> > 1) mdadm --create /dev/md7 --level=5 --consistency-policy=ppl
> > --raid-devices=5 /dev/sd[abdef]1 --chunk=256 --bitmap=internal
>
> One more thing I wanted to mention: did you have PPL on your previous
> array, or was it not implemented yet back then? I know it is supposed
> to protect against the write hole, which could have caused your
> previous FS corruption.

Looks like I had an internal bitmap instead:

gargamel:~# mdadm --query --detail /dev/md7
/dev/md7:
           Version : 1.2
     Creation Time : Sun Feb 11 20:38:30 2018
        Raid Level : raid5
        Array Size : 23441561600 (22355.62 GiB 24004.16 GB)
     Used Dev Size : 5860390400 (5588.90 GiB 6001.04 GB)
      Raid Devices : 5
     Total Devices : 5
       Persistence : Superblock is persistent
     Intent Bitmap : Internal
       Update Time : Fri Jun 10 12:09:08 2022
             State : clean
    Active Devices : 5
   Working Devices : 5
    Failed Devices : 0
     Spare Devices : 0
            Layout : left-symmetric
        Chunk Size : 512K
Consistency Policy : bitmap

I'll switch to PPL instead, thanks for that. Actually I need to migrate
my other raid5 arrays to that too. It looks like it can be done at
runtime.

> Yes, but in MergerFS each file is stored entirely within a single disk,
> there's no striping. So only files which happened to be on the failed disk are
> lost and need to be restored from backups. For this it helps to keep track of
> what was where, with something like "find /mnt/ > `date`.lst" in crontab.

Right, I figured. It's not bad, but I do want no data loss if I lose a
drive, so I'll take raid5.
I realize that filesystem-aware raid5, like the raid5 in btrfs (which
I'm still not sure I can really trust?), could lay out the files to be
one per disk without striping.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.

Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
^ permalink raw reply [flat|nested] 24+ messages in thread
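The runtime migration mentioned above could be sketched as follows. PPL and the internal write-intent bitmap are mutually exclusive, so the bitmap has to be dropped first; the device name is the one from the thread, and the block is written as a dry run that only prints the commands it would run:

```shell
#!/bin/sh
# Sketch: switch an existing md RAID5 array from an internal bitmap to
# PPL at runtime. /dev/md7 is the array name used in this thread;
# adjust as needed. DRY_RUN=1 (the default here) prints commands
# instead of running them.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }

# PPL and the internal write-intent bitmap cannot coexist, so drop the
# bitmap first, then enable PPL; both changes are online operations.
run mdadm --grow --bitmap=none /dev/md7
run mdadm --grow --consistency-policy=ppl /dev/md7
# Confirm the new policy:
run mdadm --detail /dev/md7
```

Set DRY_RUN=0 to actually execute the commands against a live array.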
* Re: Suggestions for building new 44TB Raid5 array 2022-06-11 14:52 ` Marc MERLIN 2022-06-11 17:54 ` Roman Mamedov @ 2022-06-12 21:21 ` Roman Mamedov 2022-06-13 17:46 ` Marc MERLIN 2022-06-13 18:13 ` Marc MERLIN 2022-06-20 20:37 ` Andrea Gelmini 2 siblings, 2 replies; 24+ messages in thread From: Roman Mamedov @ 2022-06-12 21:21 UTC (permalink / raw) To: Marc MERLIN Cc: Andrea Gelmini, Andrei Borzenkov, Zygo Blaxell, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs On Sat, 11 Jun 2022 07:52:59 -0700 Marc MERLIN <marc@merlins.org> wrote: > On Sat, Jun 11, 2022 at 02:30:33PM +0500, Roman Mamedov wrote: > > > 2) echo 0fb96f02-d8da-45ce-aba7-070a1a8420e3 > /sys/block/bcache64/bcache/attach > > > gargamel:/dev# cat /sys/block/md7/bcache/cache_mode > > > [writethrough] writeback writearound none > > > > Maybe try LVM Cache this time? > > Hard to know either way, trading one layer for another, and LVM has > always seemed heavier I'd suggest to put the LUKS volume onto an LV still (in case you don't), so you can add and remove cache just to see how it works; unlike with bcache, an LVM cache can be added to an existing LV and then removed without a trace, all without having to displace 44 TB of data for that. And plain no-cache LVM doesn't add much in terms of being a "layer". -- With respect, Roman ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-12 21:21 ` Roman Mamedov
@ 2022-06-13 17:46   ` Marc MERLIN
  2022-06-13 18:06     ` Roman Mamedov
  2022-06-13 18:10     ` Zygo Blaxell
  1 sibling, 2 replies; 24+ messages in thread
From: Marc MERLIN @ 2022-06-13 17:46 UTC (permalink / raw)
To: Roman Mamedov
Cc: Andrea Gelmini, Andrei Borzenkov, Zygo Blaxell, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Mon, Jun 13, 2022 at 02:21:07AM +0500, Roman Mamedov wrote:
> On Sat, 11 Jun 2022 07:52:59 -0700
> Marc MERLIN <marc@merlins.org> wrote:
>
> > On Sat, Jun 11, 2022 at 02:30:33PM +0500, Roman Mamedov wrote:
> > > > 2) echo 0fb96f02-d8da-45ce-aba7-070a1a8420e3 > /sys/block/bcache64/bcache/attach
> > > > gargamel:/dev# cat /sys/block/md7/bcache/cache_mode
> > > > [writethrough] writeback writearound none
> > >
> > > Maybe try LVM Cache this time?
> >
> > Hard to know either way, trading one layer for another, and LVM has
> > always seemed heavier
>
> I'd suggest to put the LUKS volume onto an LV still (in case you don't), so you
> can add and remove cache just to see how it works; unlike with bcache, an LVM
> cache can be added to an existing LV and then removed without a trace, all
> without having to displace 44 TB of data for that.

Thanks. I've always felt that LVM was heavyweight and required extra
steps and tools, so I've been avoiding it, but maybe that wasn't
rational.
bcache, by the way, you can set it up without a cache device and then
use it normally without the cache layer. I think it's actually pretty
similar, but you have to set it up beforehand (just like LVM).

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.

Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-13 17:46 ` Marc MERLIN
@ 2022-06-13 18:06   ` Roman Mamedov
  2022-06-14  4:51     ` Marc MERLIN
  0 siblings, 1 reply; 24+ messages in thread
From: Roman Mamedov @ 2022-06-13 18:06 UTC (permalink / raw)
To: Marc MERLIN
Cc: Andrea Gelmini, Andrei Borzenkov, Zygo Blaxell, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Mon, 13 Jun 2022 10:46:40 -0700 Marc MERLIN <marc@merlins.org> wrote:

> bcache, by the way, you can set it up without a cache device and then
> use it normally without the cache layer. I think it's actually pretty
> similar, but you have to set it up beforehand (just like LVM).

What I mean is bcache in this way stays bcache-without-a-cache forever,
which feels odd; it still goes through the bcache code, has the module
loaded, keeps the device name, etc. Whereas in LVM caching is a
completely optional side-feature, and many people would just run LVM in
any case, not even thinking about enabling cache. LVM is basically "the
next generation" of disk partitions, with way more features, but not
much more overhead.

-- 
With respect,
Roman

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-13 18:06 ` Roman Mamedov
@ 2022-06-14  4:51   ` Marc MERLIN
  0 siblings, 0 replies; 24+ messages in thread
From: Marc MERLIN @ 2022-06-14 4:51 UTC (permalink / raw)
To: Roman Mamedov, Zygo Blaxell
Cc: Andrea Gelmini, Andrei Borzenkov, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

Thanks to you both for your kind help. If I'm rebuilding everything,
might as well future-proof it as well as possible.

On Mon, Jun 13, 2022 at 11:06:25PM +0500, Roman Mamedov wrote:
> What I mean is bcache in this way stays bcache-without-a-cache forever, which
> feels odd; it still goes through the bcache code, has the module loaded, keeps
> the device name, etc;

Fair point. I have done that, but I see what you're saying.

> Whereas in LVM caching is a completely optional side-feature, and many people
> would just run LVM in any case, not even thinking about enabling cache. LVM is
> basically "the next generation" of disk partitions, with way more features,
> but not much more overhead.

Fair enough. I have used LVM for many years, since the now defunct
lvm1, and I've run through a fair amount of issues, some reliability,
some performance. That was many, many years ago though, so I'll take
your word for it that it's a lot more lightweight and safe now.
Actually I think I stopped using LVM around the same time I started
using btrfs, because effectively btrfs subvolumes were close enough to
LVM LVs for my use, but yes, I understand that different LVs are
actually different filesystems and you can do extra stuff like caching.
Actually I have another array where there were so many files and
snapshots that I split it into different LVs with dm-thin, so that I
didn't stress the btrfs code too much (which I'm told gets unhappy when
you have hundreds of snapshots).

On Mon, Jun 13, 2022 at 02:10:56PM -0400, Zygo Blaxell wrote:
> You can trivially convert from lvmcache to plain LV on the fly. It's a
> pretty essential capability for long-term maintenance, since you can't
> move or resize the LV while it's cached.
>
> If you have an LV and you want it to be cached with bcache, you can hack
> up the LVM configuration after the fact with https://github.com/g2p/blocks

Got it, thanks much.

On Mon, Jun 13, 2022 at 11:29:07PM +0500, Roman Mamedov wrote:
> It is a question of whether you want to cache encrypted, or plain-text data. I
> guess the former should be preferable, for a complete peace-of-mind against
> data forensics vs the cache device, but with a toll on performance, due to the
> need to re-decrypt even the cache hits each time.

Right, I know that tradeoff. Also, LUKS makes things a bit more
complicated if you want to grow the FS.

> In case of caching encrypted, it's:
>
> mdraid => PV => LV => LUKS
>               |
>            (cache)
>
> Otherwise:
>
> mdraid => LUKS => PV => LV
>                          |
>                       (cache)

Right. I'll probably do that.

On Mon, Jun 13, 2022 at 04:08:07PM -0400, Zygo Blaxell wrote:
> Add a cache LV to an existing LV with:
>
> lvcreate $vg -n $meta -L 1G $device
> lvcreate $vg -n $pool -l 90%PVS $device
> lvconvert -f --type cache-pool --poolmetadata $vg/$meta $vg/$pool
> lvconvert -f --type cache --cachepool $vg/$pool $vg/$data --cachemode writethrough
>
> Uncache with:
>
> lvconvert -f --uncache $vg/$data
>
> Note that 'lvconvert' will flush the entire cache back to the backing
> store during uncache at minimum IO priority, so it will take some time
> and can be prolonged indefinitely by a continuous IO workload on top.
> Also, the uncache operation will propagate any corruption in the SSD
> cache back to the HDD LV, even in writethrough mode.

Thanks much for the heads up.

Best,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.

Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-13 17:46 ` Marc MERLIN
  2022-06-13 18:06 ` Roman Mamedov
@ 2022-06-13 18:10 ` Zygo Blaxell
  1 sibling, 0 replies; 24+ messages in thread
From: Zygo Blaxell @ 2022-06-13 18:10 UTC (permalink / raw)
To: Marc MERLIN
Cc: Roman Mamedov, Andrea Gelmini, Andrei Borzenkov, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Mon, Jun 13, 2022 at 10:46:40AM -0700, Marc MERLIN wrote:
> On Mon, Jun 13, 2022 at 02:21:07AM +0500, Roman Mamedov wrote:
> > I'd suggest to put the LUKS volume onto an LV still (in case you don't), so you
> > can add and remove cache just to see how it works; unlike with bcache, an LVM
> > cache can be added to an existing LV and then removed without a trace, all
> > without having to displace 44 TB of data for that.
>
> Thanks. I've always felt that LVM was heavyweight and required extra
> steps and tools, so I've been avoiding it, but maybe that wasn't
> rational.
> bcache, by the way, you can set it up without a cache device and then
> use it normally without the cache layer. I think it's actually pretty
> similar, but you have to set it up beforehand (just like LVM).

You can trivially convert from lvmcache to plain LV on the fly. It's a
pretty essential capability for long-term maintenance, since you can't
move or resize the LV while it's cached.

If you have an LV and you want it to be cached with bcache, you can hack
up the LVM configuration after the fact with https://github.com/g2p/blocks

> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>
> Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-12 21:21 ` Roman Mamedov
  2022-06-13 17:46 ` Marc MERLIN
@ 2022-06-13 18:13 ` Marc MERLIN
  2022-06-13 18:29   ` Roman Mamedov
  2022-06-13 20:08   ` Zygo Blaxell
  1 sibling, 2 replies; 24+ messages in thread
From: Marc MERLIN @ 2022-06-13 18:13 UTC (permalink / raw)
To: Roman Mamedov
Cc: Andrea Gelmini, Andrei Borzenkov, Zygo Blaxell, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Mon, Jun 13, 2022 at 02:21:07AM +0500, Roman Mamedov wrote:
> I'd suggest to put the LUKS volume onto an LV still (in case you don't), so you
> can add and remove cache just to see how it works; unlike with bcache, an LVM

In case I decide to give that a shot, what would the actual LVM
command(s) look like to create a null LVM? You'd just make a single PV
using the cryptsetup-decrypted version of the mdadm raid5, and then an
LV that takes all of it, but after the fact you can modify the LV and
add a cache?

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.

Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-13 18:13 ` Marc MERLIN
@ 2022-06-13 18:29   ` Roman Mamedov
  0 siblings, 0 replies; 24+ messages in thread
From: Roman Mamedov @ 2022-06-13 18:29 UTC (permalink / raw)
To: Marc MERLIN
Cc: Andrea Gelmini, Andrei Borzenkov, Zygo Blaxell, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Mon, 13 Jun 2022 11:13:22 -0700 Marc MERLIN <marc@merlins.org> wrote:

> On Mon, Jun 13, 2022 at 02:21:07AM +0500, Roman Mamedov wrote:
> > I'd suggest to put the LUKS volume onto an LV still (in case you don't), so you
> > can add and remove cache just to see how it works; unlike with bcache, an LVM
>
> In case I decide to give that a shot, what would the actual LVM
> command(s) look like to create a null LVM? You'd just make a single PV
> using the cryptsetup-decrypted version of the mdadm raid5

It is a question of whether you want to cache encrypted, or plain-text
data. I guess the former should be preferable, for a complete
peace-of-mind against data forensics vs the cache device, but with a
toll on performance, due to the need to re-decrypt even the cache hits
each time.

In case of caching encrypted, it's:

  mdraid => PV => LV => LUKS
                |
             (cache)

Otherwise:

  mdraid => LUKS => PV => LV
                           |
                        (cache)

For the actual commands see e.g.
https://tomlankhorst.nl/setup-lvm-raid-array-mdadm-linux#set-up-logical-volume-management-lvm

> an LV that takes all of it, but after the fact you can modify the LV and add
> a cache?

Yes.

-- 
With respect,
Roman

^ permalink raw reply [flat|nested] 24+ messages in thread
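The first stack above (mdraid => PV => LV => LUKS, so a later cache only ever sees ciphertext) might be assembled roughly as follows. This is a sketch under the thread's device and label names, not a tested recipe, and it is written as a dry run that only prints the commands it would run:

```shell
#!/bin/sh
# Sketch of the "cache the encrypted side" stack: PV/VG/LV on the raw
# md array, LUKS on top of the LV. All names are placeholders taken
# from the thread. DRY_RUN=1 (the default here) prints commands
# instead of running them.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }

run pvcreate /dev/md7
run vgcreate vg_dshelf1 /dev/md7
run lvcreate -n data -l 100%FREE vg_dshelf1
# (A cache pool can later live on an SSD PV added with vgextend.)
# LUKS goes on top of the LV, so a cache attached to the LV only ever
# sees ciphertext:
run cryptsetup luksFormat /dev/vg_dshelf1/data
run cryptsetup luksOpen /dev/vg_dshelf1/data dshelf1
run mkfs.btrfs -m dup -L dshelf1 /dev/mapper/dshelf1
```

Set DRY_RUN=0 to actually execute the commands on real devices.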
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-13 18:13 ` Marc MERLIN
  2022-06-13 18:29 ` Roman Mamedov
@ 2022-06-13 20:08 ` Zygo Blaxell
  2022-06-14  6:36   ` Torbjörn Jansson
  1 sibling, 1 reply; 24+ messages in thread
From: Zygo Blaxell @ 2022-06-13 20:08 UTC (permalink / raw)
To: Marc MERLIN
Cc: Roman Mamedov, Andrea Gelmini, Andrei Borzenkov, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Mon, Jun 13, 2022 at 11:13:22AM -0700, Marc MERLIN wrote:
> On Mon, Jun 13, 2022 at 02:21:07AM +0500, Roman Mamedov wrote:
> > I'd suggest to put the LUKS volume onto an LV still (in case you don't), so you
> > can add and remove cache just to see how it works; unlike with bcache, an LVM
>
> In case I decide to give that a shot, what would the actual LVM
> command(s) look like to create a null LVM? You'd just make a single PV
> using the cryptsetup-decrypted version of the mdadm raid5 and then an LV
> that takes all of it, but after the fact you can modify the LV and add a
> cache?

Some variables:

	vg=name of VG...
	device=name of cache device (SSD) PV...
	data=name of existing backing (HDD) LV...
	meta=meta$data
	pool=pool$data

Add a cache LV to an existing LV with:

	lvcreate $vg -n $meta -L 1G $device
	lvcreate $vg -n $pool -l 90%PVS $device
	lvconvert -f --type cache-pool --poolmetadata $vg/$meta $vg/$pool
	lvconvert -f --type cache --cachepool $vg/$pool $vg/$data --cachemode writethrough

Uncache with:

	lvconvert -f --uncache $vg/$data

Note that 'lvconvert' will flush the entire cache back to the backing
store during uncache at minimum IO priority, so it will take some time
and can be prolonged indefinitely by a continuous IO workload on top.
Also, the uncache operation will propagate any corruption in the SSD
cache back to the HDD LV, even in writethrough mode.

> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> > Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08 > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-13 20:08 ` Zygo Blaxell
@ 2022-06-14  6:36   ` Torbjörn Jansson
  0 siblings, 0 replies; 24+ messages in thread
From: Torbjörn Jansson @ 2022-06-14 6:36 UTC (permalink / raw)
To: Zygo Blaxell, Marc MERLIN
Cc: Roman Mamedov, Andrea Gelmini, Andrei Borzenkov, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On 2022-06-13 22:08, Zygo Blaxell wrote:
> On Mon, Jun 13, 2022 at 11:13:22AM -0700, Marc MERLIN wrote:
> > On Mon, Jun 13, 2022 at 02:21:07AM +0500, Roman Mamedov wrote:
> > > I'd suggest to put the LUKS volume onto an LV still (in case you don't), so you
> > > can add and remove cache just to see how it works; unlike with bcache, an LVM
> >
> > In case I decide to give that a shot, what would the actual LVM
> > command(s) look like to create a null LVM? You'd just make a single PV
> > using the cryptsetup-decrypted version of the mdadm raid5 and then an LV
> > that takes all of it, but after the fact you can modify the LV and add a
> > cache?
>
> Some variables:
>
> 	vg=name of VG...
> 	device=name of cache device (SSD) PV...
> 	data=name of existing backing (HDD) LV...
> 	meta=meta$data
> 	pool=pool$data
>
> Add a cache LV to an existing LV with:
>
> 	lvcreate $vg -n $meta -L 1G $device
> 	lvcreate $vg -n $pool -l 90%PVS $device
> 	lvconvert -f --type cache-pool --poolmetadata $vg/$meta $vg/$pool
> 	lvconvert -f --type cache --cachepool $vg/$pool $vg/$data --cachemode writethrough
>
> Uncache with:
>
> 	lvconvert -f --uncache $vg/$data
>
> Note that 'lvconvert' will flush the entire cache back to the backing
> store during uncache at minimum IO priority, so it will take some time
> and can be prolonged indefinitely by a continuous IO workload on top.
> Also, the uncache operation will propagate any corruption in the SSD
> cache back to the HDD LV, even in writethrough mode.

Personally, when I set up lvmcache I always use the "all in one"
command to create it. And if I forget the syntax (the man pages are a
bit unclear on how to do it exactly), I go to
https://wiki.archlinux.org/title/LVM#Cache for a refresher.

It is something like:

	lvcreate --type cache --cachemode writethrough -l 100%FREE -n root_cachepool MyVolGroup/rootvol /dev/fastdisk

I usually change -l to -L and specify the size of the cache there. The
name (-n) is not too important, since the LV will still be named the
same as before the cache was enabled; this name is "just" something
that shows up in, for example, lvs output.

Only type=cache allows you to create and remove the cache with a live
filesystem; the other type, writecache, requires the filesystem (and
probably also the LV) to be deactivated. I don't remember exactly, but
it was more effort to turn writecache on/off vs the normal cache.

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-11 14:52 ` Marc MERLIN
  2022-06-11 17:54 ` Roman Mamedov
  2022-06-12 21:21 ` Roman Mamedov
@ 2022-06-20 20:37 ` Andrea Gelmini
  2022-06-21  5:26   ` Zygo Blaxell
  2 siblings, 1 reply; 24+ messages in thread
From: Andrea Gelmini @ 2022-06-20 20:37 UTC (permalink / raw)
To: Marc MERLIN
Cc: Roman Mamedov, Andrei Borzenkov, Zygo Blaxell, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Sat, 11 Jun 2022 at 16:53, Marc MERLIN <marc@merlins.org> wrote:
> Mmmh, bcachefs, I was not aware of this new one. Not sure if I want to
> add yet another layer, especially if it's new.

I shared the link just to say: the bcache author does great work.
Bcachefs could be a way to replace a few layers, not to add a new one.

Just for the record, I'm using a 120TB array with BTRFS and bcache
caching over a raid1 of 2TB SSDs. I tried the same setup with LVM,
but, sadly, the lvm tool complained about a maximum cache size (16GB
max, if I remember correctly; anyway, no way to use the full 2TB).
Sad, because this is not mentioned anywhere. I played a little bit
with the kernel source, but eventually didn't want to risk too much
with a server I want to use in production at work.

On my side, in the end, I really like cryptsetup for each HD, with
mergerfs and snapraid (for my home setup). Very handy for replacing,
playing, experimenting and so on. Each time I tried a one big single
volume setup, eventually I regretted it.

Ciao,
Gelma

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-20 20:37 ` Andrea Gelmini
@ 2022-06-21  5:26   ` Zygo Blaxell
  2022-07-06  9:09     ` Andrea Gelmini
  0 siblings, 1 reply; 24+ messages in thread
From: Zygo Blaxell @ 2022-06-21 5:26 UTC (permalink / raw)
To: Andrea Gelmini
Cc: Marc MERLIN, Roman Mamedov, Andrei Borzenkov, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Mon, Jun 20, 2022 at 10:37:25PM +0200, Andrea Gelmini wrote:
> On Sat, 11 Jun 2022 at 16:53, Marc MERLIN
> <marc@merlins.org> wrote:
> > Mmmh, bcachefs, I was not aware of this new one. Not sure if I want to
> > add yet another layer, especially if it's new.
>
> I shared the link just to say: the bcache author does great work.
> Bcachefs could be a way to replace a few layers, not to add a new one.
>
> Just for the record, I'm using a 120TB array with BTRFS and bcache
> caching over a raid1 of 2TB SSDs. I tried the same setup with LVM,
> but, sadly, the lvm tool complained about a maximum cache size (16GB
> max, if I remember correctly; anyway, no way to use the full 2TB).
> Sad, because this is not mentioned anywhere.

How many years ago was this? There's no such limit today. Here is a
500TB LV with 4TB of cache:

	# lvs -o +cachemode
	  LV          VG Attr       LSize   Pool                Origin         Data%  Meta%  Move Log Cpy%Sync Convert CacheMode
	  lvol0       tv Cwi-aoC--- 500.00t [lvol0_cache0_cvol] [lvol0_corig]  0.19   27.78           34.14            writeback
	  lvol0_cache tv -wi-a-----   4.00t

There is a limit on metadata size, but you can override it in lvm.conf.
Presumably if you have a 100TB+ filesystem, you also have enough RAM lying
around to make the metadata size larger (it's 2.8GB at chunk size 128,
but that's not unreasonable for 4096GB of cache).

> I played a little bit with the kernel source, but eventually didn't
> want to risk too much with a server I want to use in production at
> work.
>
> On my side, in the end, I really like cryptsetup for each HD, with
> mergerfs and snapraid (for my home setup). Very handy for replacing,
> playing, experimenting and so on. Each time I tried a one big single
> volume setup, eventually I regretted it.
>
> Ciao,
> Gelma

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-21  5:26 ` Zygo Blaxell
@ 2022-07-06  9:09   ` Andrea Gelmini
  0 siblings, 0 replies; 24+ messages in thread
From: Andrea Gelmini @ 2022-07-06 9:09 UTC (permalink / raw)
To: Zygo Blaxell
Cc: Marc MERLIN, Roman Mamedov, Andrei Borzenkov, Josef Bacik, Chris Murphy, Qu Wenruo, linux-btrfs

On Tue, 21 Jun 2022 at 07:26, Zygo Blaxell
<ce3g8jdj@umail.furryterror.org> wrote:
> How many years ago was this? There's no such limit today. Here is a
> 500TB LV with 4TB of cache:

Good to know, thanks!

> There is a limit on metadata size, but you can override it in lvm.conf.
> Presumably if you have a 100TB+ filesystem, you also have enough RAM lying
> around to make the metadata size larger (it's 2.8GB at chunk size 128,
> but that's not unreasonable for 4096GB of cache).

Yeap. We have at least 48GB of RAM on each server. Well, I use it all
to run beesd at night (I tweaked the sources).

Fun part, I found this project:
https://github.com/pkolaczk/fclones

It's a recent(!) file-based deduplicator in Rust, with parallelism at
every stage, hard/soft links and range clones. Perfect for my needs,
and blazing fast.

Ciao,
Gelma

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-11  4:51 Suggestions for building new 44TB Raid5 array Marc MERLIN
  2022-06-11  9:30 ` Roman Mamedov
@ 2022-06-11 23:44 ` Zygo Blaxell
  2022-06-14 11:03 ` ronnie sahlberg
  [not found] ` <5e1733e6-471e-e7cb-9588-3280e659bfc2@aqueos.com>
  3 siblings, 0 replies; 24+ messages in thread
From: Zygo Blaxell @ 2022-06-11 23:44 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Andrei Borzenkov, Josef Bacik, linux-btrfs, Chris Murphy, Qu Wenruo

On Fri, Jun 10, 2022 at 09:51:20PM -0700, Marc MERLIN wrote:
> so, my apologies to all for the thread of death that is hopefully going
> to be over soon. I still want to help Josef fix the tools though,
> hopefully we'll get that filesystem back to a mountable state.
>
> That said, it's been over 2 months now, and I do need to get this
> filesystem back up from backup, so I ended up buying new drives (5x
> 11TiB in raid5).
>
> Given the pretty massive corruption that happened in ways that I still
> can't explain, I'll make sure to turn off all the drive write caches
> but I think I'm not sure I want to trust bcache anymore even though
> I had it in writethrough mode.
>
> Here's the Email from March, questions still apply:
>
> Kernel will be 5.16. Filesystem will be 24TB and contain mostly bigger
> files (100MB to 10GB).
>
> 1) mdadm --create /dev/md7 --level=5 --consistency-policy=ppl --raid-devices=5 /dev/sd[abdef]1 --chunk=256 --bitmap=internal
> 2) echo 0fb96f02-d8da-45ce-aba7-070a1a8420e3 > /sys/block/bcache64/bcache/attach
>    gargamel:/dev# cat /sys/block/md7/bcache/cache_mode
>    [writethrough] writeback writearound none
> 3) cryptsetup luksFormat --align-payload=2048 -s 256 -c aes-xts-plain64 /dev/bcache64
> 4) cryptsetup luksOpen /dev/bcache64 dshelf1
> 5) mkfs.btrfs -m dup -L dshelf1 /dev/mapper/dshelf1
>
> Any other btrfs options I should set for format to improve reliability
> first and performance second?
> I'm told I should use space_cache=v2, is it default now with btrfs-progs 5.10.1-2 ?

It's the default with current btrfs-progs. I'm not sure what the cutoff
version is, but it doesn't matter--you can convert to v2 on first mount,
which will be fast on an empty filesystem.

> As for bcache, I'm really thinking about dropping it, unless I'm told
> it should be safe to use.

I would not recommend the cache in this configuration for resilience,
because it doesn't keep device failures in separate failure domains.
Common SSD failure modes (e.g. silent data corruption, dropped writes)
can be detected but not repaired, and can affect any part of the
filesystem when viewed through the cache.

Unfortunately, a cache is only resilient with btrfs raid1 using SSD+HDD
cached device pairs, so that a failure of any SSD or HDD affects at most
one btrfs device. That configuration works reasonably well, but you'll
need a pile more disks (both HDD and SSD) to match the capacity.

btrfs raid5 of SSD+HDD devices doesn't work--it will keep all IO
accesses below the cache's sequential IO size cutoff, which will wear
out the SSDs too fast (in addition to the other btrfs raid5 problems).
Same problem with raid10 or raid0.

I've tested btrfs with both bcache and lvmcache. I mostly use lvmcache,
and have had no problems with it. bcache had problems in testing, so
I've never used bcache outside of test environments.

bcache has a few sharp edges when SSD devices fail that prevent recovery
with the filesystem still online. It seems to trigger
service-interrupting firmware bugs in some SSD models with its access
patterns compared to lvmcache (failures that are common on one
vendor/model/firmware that never happen on any other
vendor/model/firmware, and that occur much more often, or at all, when
bcache is in use compared to when bcache is not in use).
I have not lost data with bcache when SSD corruption is not present--it
survived hundreds of power-fail crash test cycles and came back after
all the SSD firmware crashes in testing--but the service interruptions
from crashing firmware, and the inability to recover from a failed drive
while keeping the filesystem online, were a problem. We worked around
this by using lvmcache instead.

If your IO subsystem has problems with write dropping, then it's going
to be much worse with any cache. Neither bcache nor lvmcache has any
sort of hardening against SSD corruption or failure. They both fail
badly on SSD corruption tests, even in writethrough mode.

> Thanks,
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>
> Home page: http://marc.merlins.org/
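The "SSD+HDD cached device pairs" layout Zygo recommends can be sketched as follows. All device names, VG names, and sizes here are hypothetical placeholders; the idea is one cache per HDD, each pair in its own VG, so losing any single SSD or HDD takes down at most one btrfs device, which btrfs raid1 can then repair from the other copy.

```shell
# Sketch only: /dev/sda+/dev/nvme0n1 and /dev/sdb+/dev/nvme1n1 are
# hypothetical HDD+SSD pairs, each kept in its own VG (failure domain).
vgcreate pair0 /dev/sda /dev/nvme0n1
lvcreate -n data -l 100%PVS pair0 /dev/sda
lvcreate -n cache -L 500G pair0 /dev/nvme0n1
lvconvert -y --type cache --cachevol cache pair0/data

vgcreate pair1 /dev/sdb /dev/nvme1n1
lvcreate -n data -l 100%PVS pair1 /dev/sdb
lvcreate -n cache -L 500G pair1 /dev/nvme1n1
lvconvert -y --type cache --cachevol cache pair1/data

# btrfs raid1 across the two cached devices: a failure of any one SSD
# or HDD corrupts at most one btrfs device, so scrub can repair it.
mkfs.btrfs -d raid1 -m raid1 /dev/pair0/data /dev/pair1/data
```

The trade-off is the one Zygo states: matching the capacity of a raid5 array this way takes roughly twice the HDDs, plus one SSD per HDD.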
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-11  4:51 Suggestions for building new 44TB Raid5 array Marc MERLIN
  2022-06-11  9:30 ` Roman Mamedov
  2022-06-11 23:44 ` Zygo Blaxell
@ 2022-06-14 11:03 ` ronnie sahlberg
  [not found] ` <5e1733e6-471e-e7cb-9588-3280e659bfc2@aqueos.com>
  3 siblings, 0 replies; 24+ messages in thread
From: ronnie sahlberg @ 2022-06-14 11:03 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Andrei Borzenkov, Zygo Blaxell, Josef Bacik, linux-btrfs,
      Chris Murphy, Qu Wenruo

On Sat, 11 Jun 2022 at 17:16, Marc MERLIN <marc@merlins.org> wrote:
>
> so, my apologies to all for the thread of death that is hopefully going
> to be over soon. I still want to help Josef fix the tools though,
> hopefully we'll get that filesystem back to a mountable state.
>
> That said, it's been over 2 months now, and I do need to get this
> filesystem back up from backup, so I ended up buying new drives (5x
> 11TiB in raid5).
>
> Given the pretty massive corruption that happened in ways that I still
> can't explain, I'll make sure to turn off all the drive write caches
> but I think I'm not sure I want to trust bcache anymore even though
> I had it in writethrough mode.
>
> Here's the Email from March, questions still apply:
>
> Kernel will be 5.16. Filesystem will be 24TB and contain mostly bigger
> files (100MB to 10GB).
>
> 1) mdadm --create /dev/md7 --level=5 --consistency-policy=ppl --raid-devices=5 /dev/sd[abdef]1 --chunk=256 --bitmap=internal
> 2) echo 0fb96f02-d8da-45ce-aba7-070a1a8420e3 > /sys/block/bcache64/bcache/attach
>    gargamel:/dev# cat /sys/block/md7/bcache/cache_mode
>    [writethrough] writeback writearound none
> 3) cryptsetup luksFormat --align-payload=2048 -s 256 -c aes-xts-plain64 /dev/bcache64
> 4) cryptsetup luksOpen /dev/bcache64 dshelf1
> 5) mkfs.btrfs -m dup -L dshelf1 /dev/mapper/dshelf1
>
> Any other btrfs options I should set for format to improve reliability
> first and performance second?
> I'm told I should use space_cache=v2, is it default now with btrfs-progs 5.10.1-2 ?
>
> As for bcache, I'm really thinking about dropping it, unless I'm told
> it should be safe to use.
>
> Thanks,
> Marc

My needs are much more basic. I have a LOT of large files: ISO images
and QEMU disk images. I also have hundreds of thousands of photos.

I used different multi-disk solutions but found them all too fragile,
so for the last 8 years I have used a setup that is basically 5 disks,
each with their own EXT4 filesystem on top of LUKS, and two additional
drives to provide 2-disk parity in snapraid.

Now, snapraid does not do in-line raid updates, so I carefully manage
how I handle the data. Audio, photos and QEMU base images are
immutable, so these files are not a problem.

For VM images I have, for each machine, an immutable 'base' image that
snapraid takes care of, and live images based on that which are not
handled by snapraid. (qemu-img create -b base.img cow.img)
If a live image goes corrupt due to a power outage or similar, I just
re-create it on top of the latest archived base image. As I often do
stuff to the VM images that causes kernel panics, this is a very
convenient way to restore them quickly, and with little effort, to a
known good state.

If one disk has a catastrophic failure, I only lose the files on that
particular disk and just have to restore them, but nothing else.
Now, I do export them as 5 different filesystems/shares, but that is
just because I am too lazy to set up some kind of "merge fs".

If your use case is mostly-read and mostly-archive, this might work for
you too, and it is VERY reliable. Peace of mind knowing that if a
single disk dies I do not have a total data loss scenario. If the
Windows VM disk dies, I just restore those images while all the other
disks are still online.

It is simple, primitive, 1980s-type technology, but it works. And it is
reliable.

> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> > Home page: http://marc.merlins.org/
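Ronnie's base+overlay scheme can be sketched with qemu-img like this. The file names are hypothetical placeholders; note that newer qemu-img versions require `-F` to declare the backing file's format, which the short command quoted in the message predates.

```shell
# Sketch: an immutable base image (covered by snapraid) plus a
# throwaway copy-on-write overlay (excluded from snapraid).
qemu-img create -f qcow2 base.qcow2 20G
qemu-img create -f qcow2 -b base.qcow2 -F qcow2 cow.qcow2

# Run the VM against cow.qcow2; base.qcow2 is never written to.

# After a crash or corruption, discard the overlay and start fresh
# from the last known-good base image:
rm -f cow.qcow2
qemu-img create -f qcow2 -b base.qcow2 -F qcow2 cow.qcow2
```

Because only the overlay ever changes, snapraid's parity over the base images stays valid without re-syncing after every VM run, which is what makes the scheme work with snapraid's offline parity model.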
* Re: Suggestions for building new 44TB Raid5 array
  [not found] ` <5e1733e6-471e-e7cb-9588-3280e659bfc2@aqueos.com>
@ 2022-06-20 15:01 ` Marc MERLIN
  2022-06-20 15:52 ` Ghislain Adnet
  2022-06-20 17:02 ` Andrei Borzenkov
  0 siblings, 2 replies; 24+ messages in thread
From: Marc MERLIN @ 2022-06-20 15:01 UTC (permalink / raw)
  To: Ghislain Adnet; +Cc: linux-btrfs

> I have a stupid question to ask: why use btrfs here? Is mdadm+xfs not good enough?

I use btrfs for historical snapshots and btrfs send/receive remote
backups.

> If you want snapshots, why not use ZFS then? I try to use btrfs myself
> and met a lot of issues with it that I did not have with mdadm+ext4.
> Perhaps btrfs is not suited to that use (here raid5)?

ZFS is not GPL compatible and out of tree.

> ZFS already has crypt, a raid5-like array, snapshots and the L2ARC
> cache, so no need to stack 4 layers on it. It seems a solution for you.

It has a few of its own issues, but yes, if it were actually GPL
compatible and in the linux kernel source tree, I'd consider it.

It's also owned by a company (Oracle) that has tried to sue others for
billions of dollars over software patents, or even an algorithm, i.e.
not a company I'm willing to trust by any means.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-20 15:01 ` Marc MERLIN
@ 2022-06-20 15:52 ` Ghislain Adnet
  2022-06-20 16:27 ` Marc MERLIN
  2022-06-20 17:02 ` Andrei Borzenkov
  1 sibling, 1 reply; 24+ messages in thread
From: Ghislain Adnet @ 2022-06-20 15:52 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: linux-btrfs

>
> It has a few of its own issues, but yes, if it were actually GPL
> compatible and in the linux kernel source tree, I'd consider it.
>
> It's also owned by a company (Oracle) that has tried to sue others for
> billions of dollars over software patents, or even an algorithm, i.e.
> not a company I'm willing to trust by any means.
>

Well, I completely understand; I use btrfs for the same reason. But it
seems, on your side, that this use case is a little far from the
features provided. The more layers I use, the more I fear a Tower of
Pisa syndrome :)

Good luck with the setup!

--
cordialement,
Ghislain
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-20 15:52 ` Ghislain Adnet
@ 2022-06-20 16:27 ` Marc MERLIN
  0 siblings, 0 replies; 24+ messages in thread
From: Marc MERLIN @ 2022-06-20 16:27 UTC (permalink / raw)
  To: Ghislain Adnet; +Cc: linux-btrfs

On Mon, Jun 20, 2022 at 05:52:14PM +0200, Ghislain Adnet wrote:
> Well, I completely understand; I use btrfs for the same reason. But it
> seems, on your side, that this use case is a little far from the
> features provided.
> The more layers I use, the more I fear a Tower of Pisa syndrome :)

I share that worry, but using ZFS simply isn't an option for me, for
the reasons explained. But indeed, I removed bcache as a layer.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-20 15:01 ` Marc MERLIN
  2022-06-20 15:52 ` Ghislain Adnet
@ 2022-06-20 17:02 ` Andrei Borzenkov
  2022-06-20 17:26 ` Marc MERLIN
  1 sibling, 1 reply; 24+ messages in thread
From: Andrei Borzenkov @ 2022-06-20 17:02 UTC (permalink / raw)
  To: Marc MERLIN, Ghislain Adnet; +Cc: linux-btrfs

On 20.06.2022 18:01, Marc MERLIN wrote:
>> I have a stupid question to ask: why use btrfs here? Is mdadm+xfs not good enough?
>
> I use btrfs for historical snapshots and btrfs send/receive remote
> backups.
>
>> If you want snapshots, why not use ZFS then? I try to use btrfs myself
>> and met a lot of issues with it that I did not have with mdadm+ext4.
>> Perhaps btrfs is not suited to that use (here raid5)?
>
> ZFS is not GPL compatible and out of tree.
>
>> ZFS already has crypt, a raid5-like array, snapshots and the L2ARC
>> cache, so no need to stack 4 layers on it. It seems a solution for you.
>
> It has a few of its own issues, but yes, if it were actually GPL
> compatible and in the linux kernel source tree, I'd consider it.
>
> It's also owned by a company (Oracle)

ZFS on Linux is not owned by Oracle, to the best of my knowledge.

https://openzfs.github.io/openzfs-docs/License.html

> that has tried to sue others for
> billions of dollars over software patents, or even an algorithm, i.e.
> not a company I'm willing to trust by any means.
>
> Marc
* Re: Suggestions for building new 44TB Raid5 array
  2022-06-20 17:02 ` Andrei Borzenkov
@ 2022-06-20 17:26 ` Marc MERLIN
  0 siblings, 0 replies; 24+ messages in thread
From: Marc MERLIN @ 2022-06-20 17:26 UTC (permalink / raw)
  To: Andrei Borzenkov; +Cc: Ghislain Adnet, linux-btrfs

On Mon, Jun 20, 2022 at 08:02:59PM +0300, Andrei Borzenkov wrote:
> ZFS on Linux is not owned by Oracle, to the best of my knowledge.
>
> https://openzfs.github.io/openzfs-docs/License.html

Oracle bought Sun, and its patent portfolio in the process, including
all claims to any patents in ZFS. I simply will never trust them given
what they've already done.

I did give a full talk about this issue years ago:
https://marc.merlins.org/linux/talks/Btrfs-LC2014-JP/Btrfs.pdf
(go to page #5)
and
https://www.theregister.com/2010/09/09/oracle_netapp_zfs_dismiss/

Basically, there are likely Netapp patents in ZFS too, but I'm less
worried about Netapp suing others over patents, and they did settle
with Sun back in the day.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08