On Fri, Feb 01, 2019 at 11:28:27PM -0500, Alan Hardman wrote:
> I have a Btrfs filesystem using 6 partitionless disks in RAID1 that's
> failing to mount. I've tried the common recommended safe check options,
> but I haven't gotten the disk to mount at all, even with -o ro,recovery.
> If necessary, I can try to use the recovery to another filesystem, but I
> have around 18 TB of data on the filesystem that won't mount, so I'd like
> to avoid that if there's some other way of recovering it.
>
> Versions:
> btrfs-progs v4.19.1
> Linux localhost 4.20.6-arch1-1-ARCH #1 SMP PREEMPT Thu Jan 31 08:22:01 UTC 2019 x86_64 GNU/Linux
>
> Based on my understanding of how RAID1 works with Btrfs, I would expect a
> single disk failure to not prevent the volume from mounting entirely, but
> I'm only seeing one disk with errors according to dmesg output, maybe I'm
> misinterpreting it:
>
> [ 534.519437] BTRFS warning (device sdd): 'recovery' is deprecated, use 'usebackuproot' instead
> [ 534.519441] BTRFS info (device sdd): trying to use backup root at mount time
> [ 534.519443] BTRFS info (device sdd): disk space caching is enabled
> [ 534.519446] BTRFS info (device sdd): has skinny extents
> [ 536.306194] BTRFS info (device sdd): bdev /dev/sdc errs: wr 23038942, rd 22208378, flush 1, corrupt 29486730, gen 2933
> [ 556.126928] BTRFS critical (device sdd): corrupt leaf: root=2 block=25540634836992 slot=45, unexpected item end, have 13882 expect 13898

   It's worth noting that 13898-13882 = 16, which is a power of
two. This means that you most likely have a single-bit error in your
metadata. That, plus the checksum not being warned about, would
strongly suggest that you have bad RAM. I would recommend that you
check your RAM first before trying anything else that would write to
your filesystem (including btrfs check --repair).

   Hugo.
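
   For anyone who wants to check that arithmetic, here is a minimal
Python sketch. The function name bit_flip_clues is made up for
illustration; only the two numbers come from the dmesg output. One
caveat: the two printed values themselves differ in three bit
positions (their XOR is 112), because adding 16 to 13882 carries. A
difference of exactly one power of two is what you would expect if a
single bit flipped in a field that gets added into the printed value
(I'm assuming here that the item end is computed from more than one
on-disk field), rather than in the printed number directly.

```python
def bit_flip_clues(have: int, expect: int) -> dict:
    """Summarise how two values differ, as a hint for single-bit corruption.

    Hypothetical helper for illustration only; not part of btrfs-progs.
    """
    diff = abs(expect - have)
    xor = have ^ expect
    # A non-zero integer is a power of two iff it has exactly one bit set.
    is_pow2 = diff != 0 and (diff & (diff - 1)) == 0
    return {
        "diff": diff,
        "diff_is_power_of_two": is_pow2,
        # Which bit the difference corresponds to (only meaningful if is_pow2).
        "bit_index": diff.bit_length() - 1 if is_pow2 else None,
        "xor": xor,
        "xor_bits_set": bin(xor).count("1"),
    }


# Values taken from the "corrupt leaf ... have 13882 expect 13898" line.
clues = bit_flip_clues(have=13882, expect=13898)
print(clues)
```

   This only shows that the numbers are consistent with a one-bit
flip; it says nothing about where the flip happened, which is why
testing the RAM comes first.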
> [ 556.134767] BTRFS critical (device sdd): corrupt leaf: root=2 block=25540634836992 slot=45, unexpected item end, have 13882 expect 13898
> [ 556.150278] BTRFS critical (device sdd): corrupt leaf: root=2 block=25540634836992 slot=45, unexpected item end, have 13882 expect 13898
> [ 556.150310] BTRFS error (device sdd): failed to read block groups: -5
> [ 556.216418] BTRFS error (device sdd): open_ctree failed
>
> If helpful, here is some lsblk output:
>
> NAME   TYPE   SIZE FSTYPE MOUNTPOINT UUID
> sda    disk 111.8G
> ├─sda1 part   1.9M
> └─sda2 part 111.8G ext4   /          c598dfdf-d6e7-47d3-888a-10f5f53fa338
> sdb    disk   7.3T btrfs             8f26ae2d-84b5-47d7-8f19-64b0ef5a481b
> sdc    disk   7.3T btrfs             8f26ae2d-84b5-47d7-8f19-64b0ef5a481b
> sdd    disk   7.3T btrfs             8f26ae2d-84b5-47d7-8f19-64b0ef5a481b
> sde    disk   7.3T btrfs             8f26ae2d-84b5-47d7-8f19-64b0ef5a481b
> sdf    disk   2.7T btrfs             8f26ae2d-84b5-47d7-8f19-64b0ef5a481b
> sdh    disk   2.7T btrfs             8f26ae2d-84b5-47d7-8f19-64b0ef5a481b
>
> My main system partition on sda mounts fine and is usable to work with
> the btrfs filesystem that's having issues.
>
> Running "btrfs check /dev/sdb" exits with this:
>
> Opening filesystem to check...
> Incorrect offsets 13898 13882
> ERROR: cannot open file system
>
> Also, "btrfs restore -Dv /dev/sdb /tmp" outputs some of the files on the
> filesystem but not all of them. I'm not sure if this is limited to the
> files on that physical disk, or if there's a bigger issue with the
> filesystem. I'm not sure what the best approach from here is, so any
> advice would be great.

-- 
Hugo Mills             | If it's December 1941 in Casablanca, what time is it
hugo@... carfax.org.uk | in New York?
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                               Rick Blaine, Casablanca