On Mon, Nov 03, 2014 at 10:11:18AM -0700, Chris Murphy wrote:
>
> On Nov 2, 2014, at 8:43 PM, Zygo Blaxell wrote:
>
> > btrfs seems to assume the data is correct on both disks (the generation
> > numbers and checksums are OK) but gets confused by equally plausible but
> > different metadata on each disk. It doesn't take long before the
> > filesystem becomes data soup or crashes the kernel.
>
> This is a pretty significant problem to still be present, honestly. I
> can understand the "catchup" mechanism is probably not built yet,
> but clearly the two devices don't have the same generation. The lower
> generation device should probably be booted/ignored or declared missing
> in the meantime to prevent trashing the file system.

The problem with generation numbers is when both devices get divergent
generation numbers but we can't tell them apart, e.g.

	1. sda generation = 5, sdb generation = 5

	2. sdb temporarily disconnects, so we are degraded on just sda

	3. sda gets more generations 6..9

	4. sda temporarily disconnects, so we have no disks at all.

	5. the machine reboots, gets sdb back but not sda

If we allow degraded here, then:

	6. sdb gets more generations 6..9

	7. sdb disconnects, no disks so no filesystem

	8. the machine reboots again, this time with sda and sdb present

Now we have two disks with equal generation numbers. Generations 6..9
on sda are not the same as generations 6..9 on sdb, so if we mix the
two disks' metadata we get bad confusion.

It needs to be more than a sequential number. If one of the disks
disappears we need to record this fact on the surviving disks, and also
cope with _both_ disks claiming to be the "surviving" one.
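
To make the sequence above concrete, here is a toy model of it in Python.
This is only a sketch, not btrfs code: the list-of-tuples "disk", the
writer tags and the helper names are all made up for illustration, and
the writer tag simply stands in for whatever metadata was written in
that generation.

```python
# Toy model only: not btrfs code. Each "disk" is a list of
# (generation, writer) records; the writer tag stands in for the
# metadata contents written in that generation.

def write_generations(disk, writer, count):
    """Append `count` new generations to `disk`, tagged with `writer`."""
    for _ in range(count):
        next_gen = disk[-1][0] + 1
        disk.append((next_gen, writer))

def generation(disk):
    """Latest generation number recorded on the disk."""
    return disk[-1][0]

# Step 1: both disks agree at generation 5.
common = [(g, "both") for g in range(1, 6)]
sda = list(common)
sdb = list(common)

# Steps 2-3: sdb drops out; sda alone advances to generation 9.
write_generations(sda, "sda-only", 4)

# Steps 5-6: after a reboot only sdb is present and is mounted
# degraded; sdb alone also advances to generation 9.
write_generations(sdb, "sdb-only", 4)

# Step 8: both disks are back. Comparing generation numbers says
# they match, but the histories behind those numbers differ.
print("generations equal:", generation(sda) == generation(sdb))
print("contents equal:", sda == sdb)
```

In this model the generation comparison reports equal while the contents
comparison does not, which is the case a purely sequential number cannot
detect.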