On 2015-11-30 11:48, Chris Mason wrote:
> On Sat, Nov 28, 2015 at 01:46:34PM +0000, Hugo Mills wrote:
>> We've just had someone on IRC with a problem mounting their FS. The
>> main problem is that they've got a corrupt log tree. That isn't the
>> subject of this email, though.
>>
>> The issue I'd like to raise is that even with -oro as a mount
>> option, the FS is trying to replay the log tree. The dmesg output from
>> mount -oro is at the end of the email.
>>
>> Now, my memory, experience and understanding is that the FS
>> doesn't, and shouldn't replay the log tree on a RO mount, because the
>> FS should still be consistent even without the replay, and
>> RO-means-actually-RO is possible and desirable. (Compared to a
>> journalling FS, where journal replay is required for a consistent,
>> usable FS).
>>
>> So, this looks to me like a regression that's come in somewhere.
>>
>> (Just for completeness, the system in question usually runs 4.2.5,
>> but the live CD the OP is using is 4.2.3).
>
> We do need to replay the log tree, even on readonly mounts. Otherwise
> files created and fsunk before crashing may not even exist.

I would argue that if a user is trying to mount read-only after a crash
(that is, the user requests a read-only mount, as opposed to the kernel
forcing it), then the user probably has a specific reason for doing so
and doesn't want us writing to the filesystem at all. I understand
wanting consistency, but if your system just crashed and your FS won't
mount RW, then it's probably not a good idea to do anything that would
cause it to be written to until you've figured out what's wrong and
fixed it. Because of how BTRFS is designed, about half of the things
needed for recovery require a mounted filesystem. If you can't mount RW,
then something _is_ broken, and we shouldn't be doing anything to the FS
unless the user tells us to.

>
> We'll bail out of the log replay on readonly media, but otherwise the
> replay always happens.

We have the ability to make a RO mount truly RO, so we should have some
way to do that without needing to jump through hoops to make the media
read-only. Not needing to jump through hoops to do this is a BIG selling
point of a filesystem for some people (myself included). Perhaps we
should provide an option to control whether the log replay happens at
all (and then we wouldn't need btrfs-zero-log)? Or we could replay the
log in memory, and only write the changes to disk if the FS is mounted
RW.
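
To make that last idea a bit more concrete, here is a toy sketch in
Python. It has nothing to do with the actual kernel code: a dict stands
in for the committed trees, a list of tuples stands in for the log tree,
and every name in it (ToyFS, mount, the "write"/"unlink" ops) is made up
purely for illustration. The only point is the ordering: replay against
an in-memory copy first, and touch the "disk" only when the mount is RW.

```python
class ToyFS:
    """Toy model only; hypothetical names, not btrfs structures."""

    def __init__(self, disk, log, read_only):
        self.disk = disk            # committed state as found on "disk"
        self.log = log              # fsynced operations not yet committed
        self.read_only = read_only
        self.view = dict(disk)      # in-memory state shown to the user

    def mount(self):
        # Replay the log against the in-memory copy only.
        for op, name, data in self.log:
            if op == "write":
                self.view[name] = data
            elif op == "unlink":
                self.view.pop(name, None)
        # Persist the result and drop the log only on a read-write mount.
        if not self.read_only:
            self.disk.clear()
            self.disk.update(self.view)
            self.log.clear()
        return self.view
```

With something shaped like this, a RO mount would still see the files
that were fsynced before the crash, while the "disk" and the log stay
exactly as they were found, so nothing is lost for later recovery.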