Hello Wenruo (and all),

> Any log on `btrfs check` without --repair?

This was all after I reformatted the partition, so it might not be as
useful. But as you see, `dmesg` reports 14 corruption errors on
/dev/sda1 (which has been functioning correctly) but `btrfs scrub` does
not report any problems. I'll do a btrfs check when I boot from a live
USB.

> But normally, csum read shouldn't lead to RO, thus I believe there
> are more problems of that previous failure.

I think there are other problems indeed, not just csum mismatch. I got
lots of I/O errors, but now after reformatting my partition they just
disappeared. In particular, writing to the filesystem could randomly
crash it. It could be a hardware issue, but now it seems more likely to
be software-related.

Best,
Xuanrui

On Tue, 2020-06-02 at 09:18 +0800, Qu Wenruo wrote:
>
> On 2020/6/2 5:08 AM, Xuanrui Qi wrote:
> > Hello all,
> >
> > I have just recovered from a massive filesystem corruption problem
> > which turned out to be a total nightmare, and I have strong reason to
> > suspect that it is related to eCryptfs-encrypted folders on btrfs.
> >
> > I run Arch Linux and have my /home directory as a btrfs partition. My
> > user's home directory (/home/xuanrui) is encrypted using eCryptfs.
> >
> > I ran into a massive filesystem corruption issue a while ago. When
> > reading certain files or occasionally writing to files, I encounter
> > FS errors (mainly checksum errors, but also other I/O errors). Then my
> > file system becomes read-only because errors were encountered.
>
> It's a pity we won't get the dmesg of that incident, which would be
> super useful to debug.
>
> > A `btrfs scrub` identified a dozen checksum errors which were "not
> > correctable", and `btrfs check --repair` (and `btrfs check --repair
> > --init-csum-tree`)
>
> Not recommended, but the output may still help.
>
> > also failed to fix anything.
> > The former crashed in a segfault, and the latter refused to write
> > anything because of an "I/O error".
> >
> > Unfortunately, I don't have any logs because I had to nuke (wipe &
> > re-make) my filesystem as the solution. However, after the
> > reformatting I gave up using eCryptfs, and the file corruption bugs
> > have not reappeared since.
>
> That's a little strange. I guess there is some buffered IO mixed with
> direct IO, which is known to cause csum mismatch, while other fs just
> can't detect such data corruption and pretend nothing happened.
>
> But normally, csum read shouldn't lead to RO, thus I believe there are
> more problems of that previous failure.
>
> > Initially I suspected that it was a hardware issue, but I did a SMART
> > test and no errors were detected; I strongly suspect that it is
> > related to eCryptfs.
> >
> > System info:
> >
> > uname -a:
> >
> > Linux xuanruiwork 5.6.15-3-clear #1 SMP Sun, 31 May 2020 19:57:42 +0000 x86_64 GNU/Linux
> >
> > btrfs --version:
> > btrfs-progs v5.6.1
> >
> > (the rest is from after the reformat, but the setup is identical to
> > before the reformat sans eCryptfs)
> >
> > btrfs fi show:
> > Label: none uuid: 823961e1-6b9e-4ab8-b5a7-c17eb8c40d64
> >     Total devices 1 FS bytes used 57.58GiB
> >     devid 1 size 332.94GiB used 60.02GiB path /dev/sda3
> >
> > btrfs fi df /home:
> > Data, single: total=59.01GiB, used=57.26GiB
> > System, single: total=4.00MiB, used=16.00KiB
> > Metadata, single: total=1.01GiB, used=328.25MiB
> > GlobalReserve, single: total=75.17MiB, used=0.00B
> >
> > Some output from dmesg (note that /dev/sda1 is not the corrupted
> > filesystem; these corruptions seem to have been self-corrected by
> > btrfs):
> >
> > [ 3.434351] BTRFS: device fsid 823961e1-6b9e-4ab8-b5a7-c17eb8c40d64 devid 1 transid 79 /dev/sda3 scanned by systemd-udevd (519)
> > [ 3.440896] BTRFS: device fsid a3892669-1ad8-4ff3-9747-0f8c405c0e6a devid 1 transid 4769881
> > /dev/sda1 scanned by systemd-udevd (487)
> > [ 3.461539] BTRFS info (device sda1): disk space caching is enabled
> > [ 3.461540] BTRFS info (device sda1): has skinny extents
> > [ 3.464079] BTRFS info (device sda1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 14, gen 0
>
> Corruption count 14 doesn't seem good.
>
> > [ 3.510991] BTRFS info (device sda1): enabling ssd optimizations
> > [ 5.938153] BTRFS info (device sda1): disk space caching is enabled
> > [ 7.072974] BTRFS info (device sda3): enabling ssd optimizations
> > [ 7.072977] BTRFS info (device sda3): disk space caching is enabled
> > [ 7.072978] BTRFS info (device sda3): has skinny extents
> > [ 3710.968433] BTRFS warning (device sda3): qgroup rescan init failed, qgroup is not enabled
>
> And btrfs is trying to init qgroup rescan while qgroup is not enabled?
> That doesn't sound good either.
>
> > [ 7412.459332] BTRFS info (device sda1): scrub: started on devid 1
> > [ 7545.641724] BTRFS info (device sda1): scrub: finished on devid 1 with status: 0
> > [ 8244.846830] BTRFS info (device sda3): scrub: started on devid 1
> > [ 8369.651774] BTRFS info (device sda3): scrub: finished on devid 1 with status: 0
>
> Any log on `btrfs check` without --repair?
>
> Thanks,
> Qu
>
> > If anyone could look into the issue, it would be greatly
> > appreciated.
> >
> > Best,
> > Xuanrui
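
P.S. For anyone who wants to pull the per-device error counters out of
their own boot log, here is a minimal sketch in Python. It assumes each
kernel message sits on a single line and that the counter line has the
same "bdev ... errs: wr, rd, flush, corrupt, gen" layout as the one
quoted above. The names `parse_bdev_errs` and `devices_with_errors` are
made up for illustration and are not part of btrfs-progs.

```python
import re

# Matches the per-device counter line btrfs logs at mount time, e.g.
#   BTRFS info (device sda1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 14, gen 0
# Assumption: the line is not wrapped and uses exactly this field order.
BDEV_ERRS = re.compile(
    r"BTRFS info \(device [^)]+\): bdev (?P<path>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

FIELDS = ("wr", "rd", "flush", "corrupt", "gen")


def parse_bdev_errs(log_text):
    """Return {device_path: {counter_name: value}} for every counter line found."""
    stats = {}
    for match in BDEV_ERRS.finditer(log_text):
        stats[match.group("path")] = {name: int(match.group(name)) for name in FIELDS}
    return stats


def devices_with_errors(stats):
    """Keep only devices with at least one non-zero counter, and only those counters."""
    flagged = {}
    for path, counters in stats.items():
        nonzero = {name: value for name, value in counters.items() if value != 0}
        if nonzero:
            flagged[path] = nonzero
    return flagged
```

Feeding it saved `dmesg` output (for example the text of `dmesg > boot.log`)
gives one entry per device that logged a counter line. As far as I
understand, `btrfs device stats <mountpoint>` reports the same counters
directly, so this is only a convenience for logs collected after the
fact.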