Hi Qu,

This was the output:

~# btrfs rescue chunk-recover /dev/sde
Scanning: DONE in dev0, DONE in dev1, DONE in dev2
parent transid verify failed on 4146261671936 wanted 1427658 found 1439315
parent transid verify failed on 4146261721088 wanted 1427658 found 1439317
parent transid verify failed on 4146236669952 wanted 1427658 found 1439310
parent transid verify failed on 4146174771200 wanted 1427600 found 1439310
parent transid verify failed on 4146258919424 wanted 1427656 found 1439317
parent transid verify failed on 4146238095360 wanted 1427658 found 1439317
parent transid verify failed on 4146260951040 wanted 1427656 found 1439317
parent transid verify failed on 4146266193920 wanted 1427656 found 1439065
parent transid verify failed on 4146067701760 wanted 1427599 found 1439304
parent transid verify failed on 4146246123520 wanted 1427599 found 1439316
parent transid verify failed on 4146246139904 wanted 1427599 found 1439312
parent transid verify failed on 4146246238208 wanted 1427599 found 1439317
parent transid verify failed on 4146246254592 wanted 1427599 found 1439317
parent transid verify failed on 4146246303744 wanted 1427599 found 1439317
parent transid verify failed on 4146246320128 wanted 1427599 found 1439317
parent transid verify failed on 4146246336512 wanted 1427599 found 1439317
parent transid verify failed on 4146246352896 wanted 1427599 found 1438647
parent transid verify failed on 4146246369280 wanted 1427599 found 1439312
parent transid verify failed on 4146237063168 wanted 1427604 found 1439314
parent transid verify failed on 4146236637184 wanted 1427603 found 1439316
parent transid verify failed on 4146260754432 wanted 1427604 found 1439317
parent transid verify failed on 4146246516736 wanted 1427599 found 1439317
parent transid verify failed on 4146246533120 wanted 1427599 found 1439065
parent transid verify failed on 4146268749824 wanted 1427602 found 1439316
parent transid verify failed on 5141904293888 wanted 1419828 found 1439215
parent transid verify failed on 5141904293888 wanted 1419828 found 1439215
parent transid verify failed on 5141904293888 wanted 1419828 found 1439215
Ignoring transid failure
leaf parent key incorrect 5141904293888
ERROR: failed to read block groups: Operation not permitted
open with broken chunk error
Chunk tree recovery failed

... and I have attached a couple of btrfs checks, run without any
dangerous options.

Thanks,
Bastiaan

On Fri, 16 Jul 2021 at 14:26, Qu Wenruo wrote:
>
>
>
> On 2021/7/16 下午7:12, bw wrote:
> > Hello,
> >
> > My raid1 with 3 hdd's that contains /home and /data cannot be mounted
> > after a freeze/restart on the 14th of July.
> >
> > My root / (ubuntu 20.10) is on a raid with 2 ssd's, so I can boot, but
> > I always end up in rescue mode atm. After disabling the two mounts
> > (/data and /home) in fstab I can log in as root. (I first had to
> > change the root password via a rescue usb in order to log in.)
>
> /dev/sde has a corrupted chunk root, which is pretty rare.
>
> [ 8.175417] BTRFS error (device sde): parent transid verify failed on
> 5028524228608 wanted 1427600 found 1429491
> [ 8.175729] BTRFS error (device sde): bad tree block start, want
> 5028524228608 have 0
> [ 8.175771] BTRFS error (device sde): failed to read chunk root
>
> The chunk tree is the most essential tree, handling the logical bytenr ->
> physical device mapping.
>
> If anything is wrong with it, it's a big problem.
>
> But normally, such a transid error indicates the HDD or the hardware RAID
> controller is mishandling the barrier/flush command.
> Mostly it means the disk or the hardware controller is lying about its
> FLUSH command.
>
>
> You can try "btrfs rescue chunk-recover", but I doubt the chance of
> success, as such transid errors never show up in just one tree.
>
>
> Now let's talk about the other device, /dev/sdb.
>
> This is more straightforward:
>
> [ 3.165790] ata2.00: exception Emask 0x10 SAct 0x10000 SErr 0x680101
> action 0x6 frozen
> [ 3.165846] ata2.00: irq_stat 0x08000000, interface fatal error
> [ 3.165892] ata2: SError: { RecovData UnrecovData 10B8B BadCRC Handshk }
> [ 3.165940] ata2.00: failed command: READ FPDMA QUEUED
> [ 3.165987] ata2.00: cmd 60/f8:80:08:01:00/00:00:00:00:00/40 tag 16
> ncq dma 126976 in
>          res 40/00:80:08:01:00/00:00:00:00:00/40 Emask
> 0x10 (ATA bus error)
> [ 3.166055] ata2.00: status: { DRDY }
>
> The read command simply failed, with the hardware reporting an internal
> checksum error.
>
> This definitely means the device is not working properly.
>
>
> And later we got an even stranger error message:
>
> [ 3.571793] sd 1:0:0:0: [sdb] tag#16 FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE cmd_age=0s
> [ 3.571848] sd 1:0:0:0: [sdb] tag#16 Sense Key : Illegal Request
> [current]
> [ 3.571895] sd 1:0:0:0: [sdb] tag#16 Add. Sense: Unaligned write command
> [ 3.571943] sd 1:0:0:0: [sdb] tag#16 CDB: Read(10) 28 00 00 00 01 08
> 00 00 f8 00
> [ 3.571996] blk_update_request: I/O error, dev sdb, sector 264 op
> 0x0:(READ) flags 0x80700 phys_seg 30 prio class 0
>
> The disk reports that it got an unaligned write, but the block layer says
> the failed operation was a READ.
>
> Not sure if the device is really sane.
>
>
> All these disks are the same model, ST2000DM008; I think that more or
> less indicates there is something wrong with the model...
>
> Recently at least two friends have reported various problems with their
> Seagate HDDs.
> Not sure if they are using the same model.
>
> >
> > smartctl seems to say the disks are ok, but I'm still unable to mount.
> > scrub doesn't see any errors.
>
> Well, you already have /dev/sdb reporting an internal checksum error,
> i.e. data corruption inside the disk, while your smartctl reports that
> everything is fine.
>
> Then I guess the disk is lying again in the SMART info.
> (Now I'm more convinced the disk is lying about FLUSH, or at least has
> something wrong doing FLUSH.)
>
> >
> > I have installed btrfsmaintenance btw.
> >
> > Can anyone advise me which steps to take in order to save the data?
> > There is no backup (yes, I'm a fool, but I was under the impression
> > that with a copy of each file on 2 different disks I'd survive).
>
> For /dev/sdb, I have no idea at all; since we failed to read something
> from the disk, it's a complete disk failure.
>
> For /dev/sde, as mentioned, you can try "btrfs rescue chunk-recover",
> and then let "btrfs check" find out what's wrong.
>
> And I'm pretty sure you won't buy the same disks from Seagate next time.
>
> Thanks,
> Qu
>
> >
> > Attached all(?) important files + a history of my attempts over the
> > past days. My attempts from the system rescue usb are not included,
> > though.
> >
> > kind regards,
> > Bastiaan
> >
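
P.S. For completeness, the kind of read-only, non-dangerous commands
involved here, roughly (a sketch rather than my exact invocations; the
device names are the ones from this thread, and /mnt/recovery is only a
placeholder destination):

  # Nothing below writes to the disks.
  smartctl -x /dev/sdb                  # full SMART attributes and device error log
  btrfs check --readonly /dev/sde       # default check mode, no repair attempted
  btrfs check --mode=lowmem /dev/sde    # lowmem mode, can report different details
  mkdir -p /mnt/recovery
  btrfs restore -D -v /dev/sde /mnt/recovery   # -D = dry run, only lists what could be copied out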