* Btrfs on LUKS on loopback mounted image on Btrfs
@ 2019-08-21 19:42 Chris Murphy
  2019-08-21 20:12 ` Roman Mamedov
  0 siblings, 1 reply; 3+ messages in thread
From: Chris Murphy @ 2019-08-21 19:42 UTC (permalink / raw)
  To: Btrfs BTRFS

Hi,

Why do this? a) compression for home, b) encryption for home, c) home
is portable because it's a file, d) I still get Btrfs snapshots
anywhere (I tend to snapshot subvolumes inside of cryptohome, but I
could "snapshot" outside of it by reflink-copying the backing file).
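As a sketch, such a reflink "snapshot" of the backing file could look
like this (paths are hypothetical; --reflink only shares extents on
file systems that support it, such as Btrfs):

```shell
set -e
# Stand-in backing file for demonstration (the real one is 4GiB).
truncate -s 1M /tmp/home.img
# Reflink-copy: on Btrfs this shares extents instead of duplicating
# data; --reflink=auto falls back to a plain copy elsewhere,
# --reflink=always would refuse instead.
cp --reflink=auto /tmp/home.img /tmp/home.img.snap
```

Note this copies whatever is on disk at that instant; consistency
while the stack is mounted is a separate question.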

But I'm curious about others' experiences. I've been doing this for a
while (a few years) in the following configuration.

NVMe -> plain partition -> Btrfs sysroot -> fallocated file with
chattr +C applied -> attached to a loop device -> luksOpen -> Btrfs on
the dm-crypt device -> mounted at /home.

sysroot mkfs options: -dsingle -msingle
cryptohome mkfs options: -M

Btrfs sysroot mount options: noatime,compress=zstd:3,ssd,space_cache=v2
dmcrypt discard passthrough is enabled
Btrfs crypto home mount options: noatime,compress=zstd:3,ssd,space_cache=v2
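For reference, the stack above might be assembled roughly like this;
device paths, file names, and sizes are hypothetical, and every
command needs root:

```shell
# Create the backing file on the Btrfs sysroot; +C must be set while
# the file is empty for NOCOW to take effect.
truncate -s 0 /var/home.img
chattr +C /var/home.img
fallocate -l 4G /var/home.img

# Attach it to a loop device and set up LUKS on top.
LOOP=$(losetup --find --show /var/home.img)
cryptsetup luksFormat "$LOOP"
cryptsetup open --allow-discards "$LOOP" cryptohome

# Mixed block groups (-M) for the small encrypted file system.
mkfs.btrfs -M /dev/mapper/cryptohome
mount -o noatime,compress=zstd:3,ssd,space_cache=v2 \
    /dev/mapper/cryptohome /home
```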

Ergo, pretty much the same, except the smallish home uses mixed block
groups. I did that mainly to avoid any balance-related issues in
home, figuring the allocation behavior at this layer is
irrelevant/virtual anyway. The Btrfs on top of the actual device does
use separate block groups, and sees the "stream" from the loop device
as all data.

I have done some crazy things with this: I routinely, intentionally
force power off on the laptop while this is all assembled as
described. Literally hundreds of times. Zero complaints from either
Btrfs file system (no mount-time complaints, no btrfs check
complaints, no scrub complaints, no Firefox database complaints). I
admit I do not often do super crazy things like simultaneous heavy
writes to both sysroot and home, and *then* force the power off. I
have done it, just not enough times to say for sure it's not possible
to corrupt either of these file systems.

I have not benchmarked this setup at all, but I notice no unusual
latency. It might exist; it's just that my regular use cases don't
show any additional latency (I do go back and forth between a crypto
home and a plaintext home on the same system). For VMs, the images
tend to be +C raw images in /var/lib/libvirt/images; but a valid use
case exists for VM user sessions, including GNOME Boxes, which
creates a qcow2 file in /home. That's a curious case I haven't
tested. There's now a new virtio-fs driver that might be better for
this use case, and could directly use a subvolume in cryptohome, with
no VM backing file needed. (?)

Cryptohome is subject to fstrim.timer, which passes through and
punches holes in the file just fine. But, as a consequence of this
entire arrangement, the loopback-mounted file fragments quite a lot.
It's only a 4GiB image file, not even half full, and the file has
18000+ fragments. I never defragment it, and I don't use autodefrag.
But I'm on NVMe, which has very low latency and supports multiqueue.
I think it would be a problem on a conventional single-queue SATA SSD
or HDD.
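The fragment count and the trim pass-through can be observed with
standard tools (hypothetical path; fstrim needs root and a mounted
file system that supports discard):

```shell
# Count the extents of the backing file; hole-punching from trims
# inside the guest file system shows up here as extra extents.
filefrag /var/home.img

# Trim from inside the mounted cryptohome, propagating discards down
# through dm-crypt (with discard passthrough enabled) and the loop
# device into hole punches on the backing file.
fstrim -v /home
```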

And to amp this up a notch, I wonder about parallelism or multiqueue
limitations of the loop device. I know XFS and Btrfs both leverage
parallelism quite a bit.

Anyway, the point is, I'm curious about this arrangement, other
arrangements, and avoiding pathological cases.

-- 
Chris Murphy


* Re: Btrfs on LUKS on loopback mounted image on Btrfs
  2019-08-21 19:42 Btrfs on LUKS on loopback mounted image on Btrfs Chris Murphy
@ 2019-08-21 20:12 ` Roman Mamedov
  2019-08-22 21:21   ` Chris Murphy
  0 siblings, 1 reply; 3+ messages in thread
From: Roman Mamedov @ 2019-08-21 20:12 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

On Wed, 21 Aug 2019 13:42:53 -0600
Chris Murphy <lists@colorremedies.com> wrote:

> Why do this? a) compression for home, b) encryption for home, c) home
> is portable because it's a file, d) I still get btrfs snapshots
> anywhere (I tend to snapshot subvolumes inside of cryptohome; but

Storing Btrfs on Btrfs really feels suboptimal; good that at least you
are using NOCOW, though of course it will still be CoW'ed in the case
of snapshots. Also, you are likely to run into the space-wasting issue
discussed in
https://www.spinics.net/lists/linux-btrfs/msg90352.html

I'd strongly suggest that you look into deploying LVM Thin instead. There you
can specify an arbitrary CoW chunk size, and a value such as 1MB or more will
reduce management overhead and fragmentation dramatically.
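A sketch of that suggestion, with a hypothetical volume group vg0 and
hypothetical sizes (needs root):

```shell
# Thin pool with a 1MiB chunk size, so CoW happens at a coarse
# granularity instead of per-4KiB-block.
lvcreate --type thin-pool --size 20G --chunksize 1m --name homepool vg0

# Thin volume to hold the LUKS + Btrfs stack for /home.
lvcreate --thin --virtualsize 4G --name cryptohome vg0/homepool
```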

Or, if the partition size in question is just 4GB, then with today's
SSD sizes just store it as a regular LV and save quite a bit of
complexity and brittleness.

> I could "snapshot" outside of it by reflink copying the backing file.

Pretty sure "cp -a" is not atomic, so beware: you cannot safely do
this while /home is open and mounted. On the other hand, if you keep
this file inside a subvolume and then snapshot the subvolume, it is
safe(r).
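In other words, something like this (hypothetical paths; needs root
and a Btrfs file system):

```shell
# Keep the backing image in its own subvolume.
btrfs subvolume create /var/images
# ... place home.img inside /var/images ...

# Snapshotting the subvolume is atomic even while the loop/LUKS stack
# is active, though the captured state is crash-consistent rather
# than a clean unmount.
btrfs subvolume snapshot -r /var/images /var/images-snap
```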

> sysroot mkfs options: -dsingle -msingle

This is asking for trouble. Even if, as you said, you power-cut it
constantly, there is little reason to run with "single" metadata, not
even on SSDs. Some insinuate that "DUP" is always magically 100%
deduped internally by the SSD during writes at speeds of 600-2500
MB/sec; but since we can't see the internals, and SSD firmware is
proprietary, nobody can reliably confirm or deny that, and it seems
very unlikely. More importantly, there are other places where one
(and in your case the only) copy of metadata might get corrupted:
RAM, the storage controller, cabling. Even a sudden poweroff has more
chance of finally doing its thing when there's no possible "other
copy of metadata" to refer to, and the broken one is all you get.
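For comparison, DUP metadata is chosen at mkfs time, or converted in
place with a balance (hypothetical device and mount point; mkfs
destroys existing contents):

```shell
# Two copies of every metadata block, single copy of data.
mkfs.btrfs -m dup -d single /dev/mapper/cryptohome

# Or convert an existing file system's metadata in place:
btrfs balance start -mconvert=dup /mnt/point

# Check which profiles are actually in use:
btrfs filesystem df /mnt/point
```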

-- 
With respect,
Roman


* Re: Btrfs on LUKS on loopback mounted image on Btrfs
  2019-08-21 20:12 ` Roman Mamedov
@ 2019-08-22 21:21   ` Chris Murphy
  0 siblings, 0 replies; 3+ messages in thread
From: Chris Murphy @ 2019-08-22 21:21 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: Chris Murphy, Btrfs BTRFS

On Wed, Aug 21, 2019 at 2:12 PM Roman Mamedov <rm@romanrm.net> wrote:
>
> On Wed, 21 Aug 2019 13:42:53 -0600
> Chris Murphy <lists@colorremedies.com> wrote:
>
> > Why do this? a) compression for home, b) encryption for home, c) home
> > is portable because it's a file, d) I still get btrfs snapshots
> > anywhere (I tend to snapshot subvolumes inside of cryptohome; but
>
> Storing Btrfs on Btrfs really feels suboptimal,

Yes, although not having native encryption in Btrfs is also
suboptimal. That leaves a separate file system on LUKS for /home.

>good that at least you are
> using NOCOW; of course it still will be CoW'ed in case of snapshots. Also you
> are likely to run into the space wasting issue as discussed in
> https://www.spinics.net/lists/linux-btrfs/msg90352.html

Interesting problem. I think it can mostly be worked around by
snapshotting the "upper" (plaintext) file system subvolumes, rather
than the ciphertext backing file.

> I'd strongly suggest that you look into deploying LVM Thin instead. There you
> can specify an arbitrary CoW chunk size, and a value such as 1MB or more will
> reduce management overhead and fragmentation dramatically.

Yes, on paper LVM thinp is well suited for this. I used to use it
quite a lot for throwaway VMs; it's not directly supported by
virt-manager, but it is possible to add a thin LV using virsh. The
thing is, for mortal users it's even more complicated than plain LVM,
both conceptually and should any repairs be needed. I'm looking for
something simpler that doesn't depend on LVM.

> > sysroot mkfs options: -dsingle -msingle
>
> This is asking for trouble, even if you said you power-cut it constantly,

In any case, if the hardware is working correctly, the file system is
always consistent regardless of how many copies of metadata there
are. I'm not sure what this gets me, even hypothetically speaking,
setting aside that the upstream default is single metadata for all
SSDs.

The file system definitely needs one copy committed to stable media;
two copies don't improve the chances of commitment to stable media.
Two copies are insurance against subsequent corruption. There's no
such thing as torn or redirected writes with SSDs. If the first
committed copy is corrupt but the second isn't, then Btrfs
automatically recovers and repairs the bad copy. But I don't see how
it improves the chance of data or metadata getting onto stable media.

If anything, the slight additional latency of writing out a second
copy delays writing the superblock that points to the new tree roots.
So it improves handling of corruption, but maybe increases the chance
of an automatic rollback to an older tree at the next mount?


-- 
Chris Murphy

