On 2019/10/21 6:47 PM, Christian Pernegger wrote:
> [Please CC me, I'm not on the list.]
>
> On Sun, 20 Oct 2019 at 12:28, Qu Wenruo wrote:
>>> Question: Can I work with the mounted backup image on the machine that
>>> also contains the original disc? I vaguely recall something about
>>> btrfs really not liking clones.
>>
>> If your fs only contains one device (single fs on single device), then
>> you should be mostly fine. [...] mostly OK.
>
> Should? Mostly? What a nightmare-inducing, yet pleasantly Adams-esque
> way of putting things ... :-)
>
> Anyway, I have an image of the whole disk on a server now and am
> feeling all the more adventurous for it. (The first try failed a
> couple of MB from completion due to spurious network issues, which is
> why I've taken so long to reply.)
>
>>> You wouldn't happen to know of a [suitable] bootable rescue image [...]?
>>
>> Archlinux iso at least has the latest btrfs-progs.
>
> I'm on the Ubuntu 19.10 live CD (btrfs-progs 5.2.1, kernel 5.3.0)
> until further notice. Exploring other options (incl. running your
> rescue kernel on another machine and serving the disk via nbd) in
> parallel.
>
>> I'd recommend the following safer methods before trying --init-extent-tree:
>>
>> - Dump backup roots first:
>>   # btrfs ins dump-super -f | grep backup_tree_root
>>   Then grab all big numbers.
>
> # btrfs inspect-internal dump-super -f /dev/nvme0n1p2 | grep backup_tree_root
> backup_tree_root: 284041969664 gen: 58600 level: 1
> backup_tree_root: 284041953280 gen: 58601 level: 1
> backup_tree_root: 284042706944 gen: 58602 level: 1
> backup_tree_root: 284045410304 gen: 58603 level: 1
>
>> - Try backup_extent_root numbers in btrfs check first
>>   # btrfs check -r
>>   Use the number with highest generation first.
>
> Assuming backup_extent_root == backup_tree_root ...
>
> # btrfs check --tree-root 284045410304 /dev/nvme0n1p2
> Opening filesystem to check...
> checksum verify failed on 284041084928 found E4E3BDB6 wanted 00000000
> checksum verify failed on 284041084928 found E4E3BDB6 wanted 00000000
> bad tree block 284041084928, bytenr mismatch, want=284041084928, have=0
> ERROR: cannot open file system
>
> # btrfs check --tree-root 284042706944 /dev/nvme0n1p2
> Opening filesystem to check...
> checksum verify failed on 284042706944 found E4E3BDB6 wanted 00000000
> checksum verify failed on 284042706944 found E4E3BDB6 wanted 00000000
> bad tree block 284042706944, bytenr mismatch, want=284042706944, have=0
> Couldn't read tree root
> ERROR: cannot open file system
>
> # btrfs check --tree-root 284041953280 /dev/nvme0n1p2
> Opening filesystem to check...
> checksum verify failed on 284041953280 found E4E3BDB6 wanted 00000000
> checksum verify failed on 284041953280 found E4E3BDB6 wanted 00000000
> bad tree block 284041953280, bytenr mismatch, want=284041953280, have=0
> Couldn't read tree root
> ERROR: cannot open file system
>
> # btrfs check --tree-root 284041969664 /dev/nvme0n1p2
> Opening filesystem to check...
> checksum verify failed on 284041969664 found E4E3BDB6 wanted 00000000
> checksum verify failed on 284041969664 found E4E3BDB6 wanted 00000000
> bad tree block 284041969664, bytenr mismatch, want=284041969664, have=0
> Couldn't read tree root
> ERROR: cannot open file system

This doesn't look good at all.

All 4 backup roots are wiped out, so this doesn't look like a bug in
btrfs, but rather some other problem that wiped out a whole range of
tree blocks.

>
>> If all backup fails to pass basic btrfs check, and all happen to have
>> the same "wanted 00000000" then it means a big range of tree blocks
>> get wiped out, not really related to btrfs but some hardware wipe.
>
> Doesn't look good, does it? Any further ideas at all or is this the
> end of the line? TBH, at this point, I don't mind having to re-install
> the box so much as the idea that the same thing might happen again --

I don't have any good ideas.
The result looks like something has wiped part of your tree blocks (not
a single one, but a range).

> either to this one, or to my work machine, which is very similar. If
> nothing else, I'd really appreciate knowing what exactly happened here
> and why -- a bug in the GPU and/or its driver shouldn't cause this --;
> and an avoidance strategy that goes beyond-upgrade-and-pray.

At this stage, I'm sorry that I have no idea at all.

If you're 100% sure that you haven't enabled discard for a while, then I
guess it doesn't look like btrfs at least.

Btrfs shouldn't cause so many tree blocks to be wiped at all, even with
a v5.0 kernel.

Thanks,
Qu

>
> Cheers,
> Christian
>
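
For anyone repeating the steps in this thread, the "highest generation first" ordering and the "all wanted 00000000" check can be sketched in Python. This is only an illustration, not part of btrfs-progs: the function names (parse_backup_roots, looks_zeroed) are hypothetical, and the regular expressions assume the output format quoted above from btrfs-progs 5.2.1. Other versions may print these lines differently.

```python
import re

# Matches lines such as "backup_tree_root: 284041969664 gen: 58600 level: 1".
# ASSUMPTION: the format is taken from the output quoted in this thread.
BACKUP_ROOT_RE = re.compile(
    r"backup_tree_root:\s*(\d+)\s+gen:\s*(\d+)\s+level:\s*(\d+)"
)

# Matches "checksum verify failed on <bytenr> found <hex> wanted <hex>".
# ASSUMPTION: likewise based only on the btrfs check output quoted above.
CSUM_FAIL_RE = re.compile(
    r"checksum verify failed on (\d+) found ([0-9A-Fa-f]+) wanted ([0-9A-Fa-f]+)"
)


def parse_backup_roots(dump_super_output):
    """Return (bytenr, generation) pairs, highest generation first."""
    roots = [
        (int(m.group(1)), int(m.group(2)))
        for m in BACKUP_ROOT_RE.finditer(dump_super_output)
    ]
    return sorted(roots, key=lambda root: root[1], reverse=True)


def looks_zeroed(check_output):
    """True if there is at least one checksum failure and every failure
    reports an all-zero "wanted" value, the pattern discussed in the thread."""
    wanted = [m.group(3) for m in CSUM_FAIL_RE.finditer(check_output)]
    return bool(wanted) and all(int(value, 16) == 0 for value in wanted)


# Sample data copied from the messages above.
SAMPLE_DUMP = """\
backup_tree_root: 284041969664 gen: 58600 level: 1
backup_tree_root: 284041953280 gen: 58601 level: 1
backup_tree_root: 284042706944 gen: 58602 level: 1
backup_tree_root: 284045410304 gen: 58603 level: 1
"""

SAMPLE_CHECK = """\
checksum verify failed on 284042706944 found E4E3BDB6 wanted 00000000
checksum verify failed on 284042706944 found E4E3BDB6 wanted 00000000
bad tree block 284042706944, bytenr mismatch, want=284042706944, have=0
Couldn't read tree root
ERROR: cannot open file system
"""

if __name__ == "__main__":
    # Print the roots in the order they would be tried with --tree-root.
    for bytenr, gen in parse_backup_roots(SAMPLE_DUMP):
        print(f"gen {gen}: btrfs check --tree-root {bytenr} <device>")
    print("all-zero 'wanted' checksums:", looks_zeroed(SAMPLE_CHECK))
```

The sketch only parses text that has already been captured. It does not run btrfs itself, so it is safe to try against saved output from a live CD session.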