On 2014-05-19 22:07, Russell Coker wrote:
> On Mon, 19 May 2014 23:47:37 Brendan Hide wrote:
>> This is extremely difficult to measure objectively. Subjectively ... see
>> below.
>>
>>> [snip]
>>>
>>> *What other failure modes* should we guard against?
>>
>> I know I'd sleep a /little/ better at night knowing that a double disk
>> failure on a "raid5/1/10" configuration might ruin a ton of data along
>> with an obscure set of metadata in some "long" tree paths - but not the
>> entire filesystem.
>
> My experience is that most disk failures that don't involve extreme physical
> damage (EG dropping a drive on concrete) don't involve totally losing the
> disk. Much of the discussion about RAID failures concerns entirely failed
> disks, but I believe that is due to RAID implementations such as Linux
> software RAID that will entirely remove a disk when it gives errors.
>
> I have a disk which had ~14,000 errors of which ~2000 errors were corrected by
> duplicate metadata. If two disks with that problem were in a RAID-1 array
> then duplicate metadata would be a significant benefit.
>
>> The other use-case/failure mode - where you are somehow unlucky enough
>> to have sets of bad sectors/bitrot on multiple disks that simultaneously
>> affect the only copies of the tree roots - is an extremely unlikely
>> scenario. As unlikely as it may be, the scenario is a very painful
>> consequence in spite of VERY little corruption. That is where the
>> peace-of-mind/bragging rights come in.
>
> http://research.cs.wisc.edu/adsl/Publications/corruption-fast08.html
>
> The NetApp research on latent errors on drives is worth reading. On page 12
> they report latent sector errors on 9.5% of SATA disks per year. So if you
> lose one disk entirely the risk of having errors on a second disk is higher
> than you would want for RAID-5. While losing the root of the tree is
> unlikely, losing a directory in the middle that has lots of subdirectories is
> a risk.
>
> I can understand why people wouldn't want ditto blocks to be mandatory. But
> why are people arguing against them as an option?
>
>
> As an aside, I'd really like to be able to set RAID levels by subtree. I'd
> like to use RAID-1 with ditto blocks for my important data and RAID-0 for
> unimportant data.
>
But the proposed changes for n-way replication would already handle
this. They would just need the option of having more than one copy per
device (which theoretically shouldn't be too hard once you have n-way
replication). Also, BTRFS already has the option of replicating the
root tree across multiple devices (it is included in the System Data
subset), and in fact does so by default when using multiple devices.

There are also plans for per-subvolume or per-file RAID level
selection, but IIRC that is planned for after n-way replication (and,
of course, after RAID 5/6, as n-way replication isn't going to be
implemented until after RAID 5/6).
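
For a rough sense of scale on the latent sector error point quoted
above, here is a back-of-the-envelope sketch. It takes the 9.5% per-disk
figure from the quoted text at face value and assumes errors on
different disks are independent, which real arrays may well violate
(same batch, same enclosure, same age). The function name and the array
sizes are only illustrative.

```python
# Rough estimate only. Assumptions (not from the paper itself):
#   - each surviving disk independently has a 9.5% chance of carrying
#     at least one latent sector error (the figure quoted in this thread)
#   - any such error during a rebuild counts as "errors on a second disk"

def p_any_latent_error(surviving_disks: int, p_per_disk: float = 0.095) -> float:
    """Chance that at least one of the surviving disks has a latent error."""
    if surviving_disks < 0:
        raise ValueError("surviving_disks must be non-negative")
    if not 0.0 <= p_per_disk <= 1.0:
        raise ValueError("p_per_disk must be a probability")
    # Complement of "every surviving disk is clean".
    return 1.0 - (1.0 - p_per_disk) ** surviving_disks


if __name__ == "__main__":
    # Hypothetical RAID-5 array sizes; one disk is assumed lost outright.
    for total_disks in (3, 4, 6, 8):
        survivors = total_disks - 1
        risk = p_any_latent_error(survivors)
        print(f"{total_disks}-disk array, one disk lost: {risk:.1%}")
```

Under those assumptions the risk grows quickly with the number of
surviving disks, which is the scenario where a second copy of metadata
on the same device would help.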