All of lore.kernel.org
* Btrfs - distribute files equally across multiple devices
@ 2015-07-06 16:22 Johannes Pfrang
  2015-07-06 16:45 ` Roman Mamedov
  2015-07-06 17:53 ` Hugo Mills
  0 siblings, 2 replies; 5+ messages in thread
From: Johannes Pfrang @ 2015-07-06 16:22 UTC (permalink / raw)
  To: linux-btrfs

Cross-posting my unix.stackexchange.com question[1] to the btrfs list
(slightly modified):

[1]
https://unix.stackexchange.com/questions/214009/btrfs-distribute-files-equally-across-multiple-devices

---------------------------------------------------------------------------------

I have a btrfs volume across two devices that has metadata RAID 1 and
data RAID 0. AFAIK, in the event one drive would fail, practically all
files above the 64KB default stripe size would be corrupted. As this
partition isn't performance critical, but should be space-efficient,
I've thought about re-balancing the filesystem to distribute files
equally across disks, but something like that doesn't seem to exist. The
ultimate goal would be to be able to still read some of the files in the
event of a drive failure.

AFAIK, using "single"/linear data allocation just fills up drives one by
one (at least that's what the wiki says).

Simple example (to the best of my knowledge):

Write two 128KB files (file0, file1) to two devices (dev0, dev1):

RAID0:

    file0/chunk0 (64KB): dev0
    file0/chunk1 (64KB): dev1
    file1/chunk0 (64KB): dev0
    file1/chunk1 (64KB): dev1

Linear:

    file0 (128KB): dev0
    file1 (128KB): dev0

Distribute files:

    file0 (128KB): dev0
    file1 (128KB): dev1

The simplest implementation would probably be something like: Always
write files to the disk with the least amount of space used. I think
this may be a valid software-raid use-case, as it combines RAID 0 (w/o
some of the performance gains[2]) with recoverability of about half of
the data/files (balanced by filled space or amount of files) in the
event of a drive-failure[3] by using filesystem information a
hardware-raid doesn't have. In the end this is more or less JBOD with
balanced disk usage + filesystem intelligence.
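The placement policy proposed above can be sketched as a toy simulation. All names, sizes, and the dict layout here are illustrative assumptions for the sketch, not btrfs internals:

```python
# Toy sketch of the proposed policy: place each whole file on the device
# with the least used space that can still hold it. Device names and the
# data layout are made up for illustration; this is not btrfs code.

def place_file(disks, size):
    """disks maps a device name to {'cap': capacity, 'used': used space}."""
    # Candidate devices that can hold the whole file.
    fits = [d for d in disks if disks[d]['cap'] - disks[d]['used'] >= size]
    if not fits:
        raise OSError("no single device can hold the file")
    # Pick the candidate with the least used space.
    target = min(fits, key=lambda d: disks[d]['used'])
    disks[target]['used'] += size
    return target

disks = {'dev0': {'cap': 4, 'used': 0}, 'dev1': {'cap': 4, 'used': 0}}
placements = [place_file(disks, 1) for _ in range(4)]
# With equal, empty devices the files alternate: dev0, dev1, dev0, dev1.
```

Losing one of two equally filled devices then leaves roughly half of the files readable, which is the recoverability argument made above.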

Is there something like that already in btrfs or could this be something
the btrfs-devs would consider?


[2] Multiple files can still be read from / written to different disks,
so performance is reduced only for single-file reads/writes
[3] with two disks; otherwise (totalDisks-failedDisks)/totalDisks


* Re: Btrfs - distribute files equally across multiple devices
  2015-07-06 16:22 Btrfs - distribute files equally across multiple devices Johannes Pfrang
@ 2015-07-06 16:45 ` Roman Mamedov
  2015-07-06 17:31   ` Johannes Pfrang
  2015-07-06 17:53 ` Hugo Mills
  1 sibling, 1 reply; 5+ messages in thread
From: Roman Mamedov @ 2015-07-06 16:45 UTC (permalink / raw)
  To: Johannes Pfrang; +Cc: linux-btrfs


On Mon, 6 Jul 2015 18:22:52 +0200
Johannes Pfrang <johannespfrang@gmail.com> wrote:

> The simplest implementation would probably be something like: Always
> write files to the disk with the least amount of space used. I think
> this may be a valid software-raid use-case, as it combines RAID 0 (w/o
> some of the performance gains[2]) with recoverability of about half of
> the data/files (balanced by filled space or amount of files) in the
> event of a drive-failure[3] by using filesystem information a
> hardware-raid doesn't have. In the end this is more or less JBOD with
> balanced disk usage + filesystem intelligence.

mhddfs does exactly that: https://romanrm.net/mhddfs

-- 
With respect,
Roman



* Re: Btrfs - distribute files equally across multiple devices
  2015-07-06 16:45 ` Roman Mamedov
@ 2015-07-06 17:31   ` Johannes Pfrang
  0 siblings, 0 replies; 5+ messages in thread
From: Johannes Pfrang @ 2015-07-06 17:31 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-btrfs


That looks quite interesting!
Unfortunately this removes the ability to specify different RAID-levels
for metadata vs data and actually behaves more like btrfs "single" mode.
According to your link it fills drive by drive instead of distributing
files equally across them:

"When you create a new file in the virtual filesystem, mhddfs will
look at the free space, which remains on each of the drives. If the
first drive has enough free space, the file will be created on that
first drive."

What I propose (simplest implementation):

"When you create a new file in the filesystem, btrfs will look at the
used space on each of the drives. The file will be created on the drive
with the least used space that can hold the file."

Difference:

mhddfs only achieves maximum recoverability once the filesystem is full
(just like "single"), while my proposal achieves such recoverability
from the start.
(Here, maximum recoverability means (totalDisks-failedDisks)/totalDisks
as the fraction of recoverable data or files, depending on whether the
fs is balanced by used space or by file count.)
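The difference can be illustrated by simulating the two placement policies side by side. The capacities and equal-size files are made-up toy inputs; neither function models the real mhddfs or btrfs allocators:

```python
def fill_first(used, caps, size):
    # mhddfs-like: the first drive with enough free space wins.
    for i in range(len(caps)):
        if caps[i] - used[i] >= size:
            return i
    raise OSError("no drive has enough free space")

def least_used(used, caps, size):
    # Proposed: the drive with the least used space that can hold the file.
    fits = [i for i in range(len(caps)) if caps[i] - used[i] >= size]
    return min(fits, key=lambda i: used[i])

def simulate(policy, caps, nfiles, size=1):
    # Place nfiles files of equal size and record which drive got each one.
    used = [0] * len(caps)
    placements = []
    for _ in range(nfiles):
        i = policy(used, caps, size)
        used[i] += size
        placements.append(i)
    return placements

caps = [10, 10]
half_full_a = simulate(fill_first, caps, 10)   # everything on drive 0
half_full_b = simulate(least_used, caps, 10)   # alternates between drives
```

At the half-full point, losing drive 0 under fill-first loses all ten files, while under the least-used policy only five are lost: the from-the-start recoverability claimed above.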

Also, I'm not sure whether it's compatible with btrfs's special
remaining-space-calculation magic ^^

On 06.07.2015 18:45, Roman Mamedov wrote:
> On Mon, 6 Jul 2015 18:22:52 +0200
> Johannes Pfrang <johannespfrang@gmail.com> wrote:
>
>> The simplest implementation would probably be something like: Always
>> write files to the disk with the least amount of space used. I think
>> this may be a valid software-raid use-case, as it combines RAID 0 (w/o
>> some of the performance gains[2]) with recoverability of about half of
>> the data/files (balanced by filled space or amount of files) in the
>> event of a drive-failure[3] by using filesystem information a
>> hardware-raid doesn't have. In the end this is more or less JBOD with
>> balanced disk usage + filesystem intelligence.
> mhddfs does exactly that: https://romanrm.net/mhddfs
>





* Re: Btrfs - distribute files equally across multiple devices
  2015-07-06 16:22 Btrfs - distribute files equally across multiple devices Johannes Pfrang
  2015-07-06 16:45 ` Roman Mamedov
@ 2015-07-06 17:53 ` Hugo Mills
  2015-07-06 18:34   ` Johannes Pfrang
  1 sibling, 1 reply; 5+ messages in thread
From: Hugo Mills @ 2015-07-06 17:53 UTC (permalink / raw)
  To: Johannes Pfrang; +Cc: linux-btrfs


On Mon, Jul 06, 2015 at 06:22:52PM +0200, Johannes Pfrang wrote:
> Cross-posting my unix.stackexchange.com question[1] to the btrfs list
> (slightly modified):
> 
> [1]
> https://unix.stackexchange.com/questions/214009/btrfs-distribute-files-equally-across-multiple-devices
> 
> ---------------------------------------------------------------------------------
> 
> I have a btrfs volume across two devices that has metadata RAID 1 and
> data RAID 0. AFAIK, in the event one drive would fail, practically all
> files above the 64KB default stripe size would be corrupted. As this
> partition isn't performance critical, but should be space-efficient,
> I've thought about re-balancing the filesystem to distribute files
> equally across disks, but something like that doesn't seem to exist. The
> ultimate goal would be to be able to still read some of the files in the
> event of a drive failure.
> 
> AFAIK, using "single"/linear data allocation just fills up drives one by
> one (at least that's what the wiki says).

   Not quite. In single mode, the FS will allocate linear chunks of
space 1 GiB in size, and use those to write into (fitting many files
into each chunk, potentially). The chunks are allocated as needed, and
will go on the device with the most unallocated space.

   So, with equal-sized devices, the first 1 GiB will go on the first
device, the second 1 GiB on the second device, and so on.

   With unequal devices, you'll put data on the largest device, until
its free space reaches the size of the next largest, and then the
chunks will be alternated between those two, until the free space on
each of the two largest reaches the size of the third-largest, and so
on.

   (e.g. for devices sized 6 TB, 4 TB, 3 TB, the first 2 TB will go
exclusively on the first device; the next 2 TB will go on the first
two devices, alternating in 1 GiB chunks; the rest goes across all
three devices, again alternating in 1 GiB chunks.)
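The greedy most-unallocated-space rule described above can be checked with a small simulation. This is not the kernel's actual allocator; it assumes decimal units (1 TB = 1000 GB) and 1 GB chunks to keep the arithmetic simple:

```python
def allocate_chunks(sizes_tb, n_chunks, chunk_gb=1):
    # Each new chunk goes to the device with the most unallocated space,
    # as described above. Decimal units are a simplification; real btrfs
    # data chunks are 1 GiB.
    free = [s * 1000 for s in sizes_tb]      # unallocated space in GB
    counts = [0] * len(free)                 # chunks placed per device
    for _ in range(n_chunks):
        i = max(range(len(free)), key=lambda j: free[j])
        free[i] -= chunk_gb
        counts[i] += 1
    return counts

# 6 TB, 4 TB, 3 TB devices: the first 2 TB of chunks land on dev 0 alone;
# the next 2 TB alternate between dev 0 and dev 1.
first_2tb = allocate_chunks([6, 4, 3], 2000)    # [2000, 0, 0]
first_4tb = allocate_chunks([6, 4, 3], 4000)    # [3000, 1000, 0]
```

The simulation reproduces the worked example: allocation stays on the largest device until its free space matches the next largest, then alternates.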

   This is all very well for an append-only filesystem, but if you're
changing the files on the FS at all, there's no guarantee as to where
the changed extents will end up -- not even on the same device, let
alone close to the rest of the file on the platter.

   I did work out, some time ago, a prototype chunk allocator (the 1
GiB-scale allocations) that would allow enough flexibility to control
where the next chunk to be allocated would go. However, that still
leaves the extent allocator to deal with, which is the second, and
much harder, part of the problem.

   Basically, don't assume any kind of structure to the location of
your data on the devices you have, and keep good, tested, regular
backups of anything you can't stand to lose and can't replace. There
are no guarantees that would let you assume easily that any one file
is on a single device, or that anything would survive the loss of a
device.

   I'm sure this is an FAQ entry somewhere... It's come up enough
times.

   Hugo.

> The simplest implementation would probably be something like: Always
> write files to the disk with the least amount of space used. I think
> this may be a valid software-raid use-case, as it combines RAID 0 (w/o
> some of the performance gains[2]) with recoverability of about half of
> the data/files (balanced by filled space or amount of files) in the
> event of a drive-failure[3] by using filesystem information a
> hardware-raid doesn't have. In the end this is more or less JBOD with
> balanced disk usage + filesystem intelligence.
> 
> Is there something like that already in btrfs or could this be something
> the btrfs-devs would consider?
> 
> 
> [2] Still can read/write multiple files from/to different disks, so less
> performance only for "single-file-reads/writes"
> [3] using two disks, otherwise (totalDisks-failedDisks)/totalDisks

-- 
Hugo Mills             | "How deep will this sub go?"
hugo@... carfax.org.uk | "Oh, she'll go all the way to the bottom if we don't
http://carfax.org.uk/  | stop her."
PGP: E2AB1DE4          |                                                  U571



* Re: Btrfs - distribute files equally across multiple devices
  2015-07-06 17:53 ` Hugo Mills
@ 2015-07-06 18:34   ` Johannes Pfrang
  0 siblings, 0 replies; 5+ messages in thread
From: Johannes Pfrang @ 2015-07-06 18:34 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs


Thank you. That's a very helpful explanation. I've just run balance
start -dconvert=single ;)
Fwiw, the best explanation of "single" I could find was in the
Glossary[1].
I don't have an account on the wiki, but your first paragraph would fit
great there!


[1] https://btrfs.wiki.kernel.org/index.php/Glossary


On 06.07.2015 19:53, Hugo Mills wrote:
> On Mon, Jul 06, 2015 at 06:22:52PM +0200, Johannes Pfrang wrote:
>    Not quite. In single mode, the FS will allocate linear chunks of
> space 1 GiB in size, and use those to write into (fitting many files
> into each chunk, potentially). The chunks are allocated as needed, and
> will go on the device with the most unallocated space.
>
>    So, with equal-sized devices, the first 1 GiB will go on the first
> device, the second 1 GiB on the second device, and so on.
>
>    With unequal devices, you'll put data on the largest device, until
> its free space reaches the size of the next largest, and then the
> chunks will be alternated between those two, until the free space on
> each of the two largest reaches the size of the third-largest, and so
> on.
>
>    (e.g. for devices sized 6 TB, 4 TB, 3 TB, the first 2 TB will go
> exclusively on the first device; the next 2 TB will go on the first
> two devices, alternating in 1 GiB chunks; the rest goes across all
> three devices, again alternating in 1 GiB chunks.)
>
>    This is all very well for an append-only filesystem, but if you're
> changing the files on the FS at all, there's no guarantee as to where
> the changed extents will end up -- not even on the same device, let
> alone close to the rest of the file on the platter.
>
>    I did work out, some time ago, a prototype chunk allocator (the 1
> GiB-scale allocations) that would allow enough flexibility to control
> where the next chunk to be allocated would go. However, that still
> leaves the extent allocator to deal with, which is the second, and
> much harder, part of the problem.
>
>    Basically, don't assume any kind of structure to the location of
> your data on the devices you have, and keep good, tested, regular
> backups of anything you can't stand to lose and can't replace. There
> are no guarantees that would let you assume easily that any one file
> is on a single device, or that anything would survive the loss of a
> device.
I promise I won't assume that.

Two 4TB data disks:
- 3TiB+3TiB data=single,meta=raid1 replaceable/unimportant
- 654GiB|654GiB data/meta=raid1 important with regular backups

efficient + safe enough (for my use-case)
>
>    I'm sure this is an FAQ entry somewhere... It's come up enough
> times.
>
>    Hugo.
>
>



