On 1 May 2018 at 23:50, Michael Wade wrote:
> Hi Qu,
>
> Oh dear that is not good news!
>
> I have been running the find-root command since yesterday but it only
> seems to be outputting the following message:
>
> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096

It's mostly fine, as btrfs-find-root goes through all tree blocks and
tries to read each of them as a tree block.
Although btrfs-find-root suppresses csum error output, such basic tree
validation checks are not suppressed, thus you get these messages.
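
If the sheer amount of noise makes the output hard to read, it can be
filtered away (just a sketch: the grep pattern simply matches the
message you quoted, and /dev/md127 is assumed as the device):

# btrfs-find-root -o 3 /dev/md127 2>&1 | grep -v 'not aligned to sectorsize' | tee /tmp/find-root.log

Whatever survives in /tmp/find-root.log is the candidate root list we
are interested in.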

> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
>
> I tried with the latest btrfs tools compiled from source and the ones
> I have installed, with the same result. Is there a CLI utility I could
> use to determine if the log contains any other content?

Did it report any useful info at the end?

Thanks,
Qu

>
> Kind regards
> Michael
>
>
> On 30 April 2018 at 04:02, Qu Wenruo wrote:
>>
>>
>> On 29 April 2018 at 22:08, Michael Wade wrote:
>>> Hi Qu,
>>>
>>> Got this error message:
>>>
>>> ./btrfs inspect dump-tree -b 20800943685632 /dev/md127
>>> btrfs-progs v4.16.1
>>> bytenr mismatch, want=20800943685632, have=3118598835113619663
>>> ERROR: cannot read chunk root
>>> ERROR: unable to open /dev/md127
>>>
>>> I have attached the dumps for:
>>>
>>> dd if=/dev/md127 of=/tmp/chunk_root.copy1 bs=1 count=32K skip=266325721088
>>> dd if=/dev/md127 of=/tmp/chunk_root.copy2 bs=1 count=32K skip=266359275520
>>
>> Unfortunately, both dumps are corrupted and contain mostly garbage.
>> I think the underlying stack (mdraid) has something wrong or failed
>> to recover its data.
>>
>> This means your last chance will be btrfs-find-root.
>>
>> Please try:
>> # btrfs-find-root -o 3 <device>
>>
>> And provide all the output.
>>
>> But please keep in mind, the chunk root is a critical tree, and so far
>> it's already heavily damaged.
>> Although I could still continue trying to recover, the chance is
>> pretty low now.
>>
>> Thanks,
>> Qu
>>>
>>> Kind regards
>>> Michael
>>>
>>>
>>> On 29 April 2018 at 10:33, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 29 April 2018 at 16:59, Michael Wade wrote:
>>>>> Ok, will it be possible for me to install the new version of the tools
>>>>> on my current kernel without overriding the existing install? Hesitant
>>>>> to update kernel/btrfs as it might break the ReadyNAS interface /
>>>>> future firmware upgrades.
>>>>>
>>>>> Perhaps I could grab this:
>>>>> https://github.com/kdave/btrfs-progs/releases/tag/v4.16.1 and
>>>>> hopefully build from source and then run the binaries directly?
>>>>
>>>> Of course, that's how most of us test btrfs-progs builds.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>> Kind regards
>>>>>
>>>>> On 29 April 2018 at 09:33, Qu Wenruo wrote:
>>>>>>
>>>>>>
>>>>>> On 29 April 2018 at 16:11, Michael Wade wrote:
>>>>>>> Thanks Qu,
>>>>>>>
>>>>>>> Please find attached the log file for the chunk recover command.
>>>>>>
>>>>>> Strangely, btrfs chunk recovery found no extra chunk beyond the
>>>>>> current system chunk range.
>>>>>>
>>>>>> Which means it's the chunk tree that is corrupted.
>>>>>>
>>>>>> Please dump the chunk tree with the latest btrfs-progs (which
>>>>>> provides the new --follow option):
>>>>>>
>>>>>> # btrfs inspect dump-tree -b 20800943685632 <device>
>>>>>>
>>>>>> If it doesn't work, please provide the following binary dumps:
>>>>>>
>>>>>> # dd if=<device> of=/tmp/chunk_root.copy1 bs=1 count=32K skip=266325721088
>>>>>> # dd if=<device> of=/tmp/chunk_root.copy2 bs=1 count=32K skip=266359275520
>>>>>> (And we will need to repeat similar dumps several times according to
>>>>>> the above dump.)
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Kind regards
>>>>>>> Michael
>>>>>>>
>>>>>>> On 28 April 2018 at 12:38, Qu Wenruo wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 28 April 2018 at 17:37, Michael Wade wrote:
>>>>>>>>> Hi Qu,
>>>>>>>>>
>>>>>>>>> Thanks for your reply. I will investigate upgrading the kernel;
>>>>>>>>> however, I worry that future ReadyNAS firmware upgrades would fail
>>>>>>>>> on a newer kernel version (I don't have much linux experience so
>>>>>>>>> maybe my concerns are unfounded!?).
>>>>>>>>>
>>>>>>>>> I have attached the output of the dump super command.
>>>>>>>>>
>>>>>>>>> I did actually run chunk recover before, without the verbose
>>>>>>>>> option; it took around 24 hours to finish but did not resolve my
>>>>>>>>> issue. Happy to start that again if you need its output.
>>>>>>>>
>>>>>>>> The system chunk array only contains the following chunks:
>>>>>>>> [0, 4194304]: Initial temporary chunk, not used at all
>>>>>>>> [20971520, 29360128]: System chunk created by mkfs, should be fully
>>>>>>>> used up
>>>>>>>> [20800943685632, 20800977240064]:
>>>>>>>> The newly created large system chunk.
>>>>>>>>
>>>>>>>> The chunk root is still in the 2nd chunk and thus valid, but some
>>>>>>>> of its leaves are out of that range.
>>>>>>>>
>>>>>>>> If you can't wait 24h for chunk recovery to run, my advice would be
>>>>>>>> to move the disks to some other computer, and use the latest
>>>>>>>> btrfs-progs to execute the following command:
>>>>>>>>
>>>>>>>> # btrfs inspect dump-tree -b 20800943685632 --follow <device>
>>>>>>>>
>>>>>>>> If we're lucky enough, we may read out the tree leaf containing the
>>>>>>>> new system chunk and save the day.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Qu
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks so much for your help.
>>>>>>>>>
>>>>>>>>> Kind regards
>>>>>>>>> Michael
>>>>>>>>>
>>>>>>>>> On 28 April 2018 at 09:45, Qu Wenruo wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 28 April 2018 at 16:30, Michael Wade wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I was hoping that someone would be able to help me resolve the
>>>>>>>>>>> issues I am having with my ReadyNAS BTRFS volume. Basically my
>>>>>>>>>>> trouble started after a power cut; subsequently the volume would
>>>>>>>>>>> not mount. Here are the details of my setup as it is at the
>>>>>>>>>>> moment:
>>>>>>>>>>>
>>>>>>>>>>> uname -a
>>>>>>>>>>> Linux QAI 4.4.116.alpine.1 #1 SMP Mon Feb 19 21:58:38 PST 2018 armv7l GNU/Linux
>>>>>>>>>>
>>>>>>>>>> The kernel is pretty old for btrfs.
>>>>>>>>>> Upgrading is strongly recommended.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> btrfs --version
>>>>>>>>>>> btrfs-progs v4.12
>>>>>>>>>>
>>>>>>>>>> So are the user tools.
>>>>>>>>>>
>>>>>>>>>> Although I think it won't be a big problem, as the needed tools
>>>>>>>>>> should be there.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> btrfs fi show
>>>>>>>>>>> Label: '11baed92:data' uuid: 20628cda-d98f-4f85-955c-932a367f8821
>>>>>>>>>>> Total devices 1 FS bytes used 5.12TiB
>>>>>>>>>>> devid 1 size 7.27TiB used 6.24TiB path /dev/md127
>>>>>>>>>>
>>>>>>>>>> So, it's btrfs on mdraid.
>>>>>>>>>> That normally makes things harder to debug, so I can only provide
>>>>>>>>>> advice from the btrfs side.
>>>>>>>>>> For the mdraid part, I can't ensure anything.
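>>>>>>>>>>
>>>>>>>>>> (As a sketch only: the md layer itself can be sanity-checked
>>>>>>>>>> before trusting anything btrfs reports on top of it, e.g.:
>>>>>>>>>>
>>>>>>>>>> # cat /proc/mdstat
>>>>>>>>>> # mdadm --detail /dev/md127
>>>>>>>>>> # echo check > /sys/block/md127/md/sync_action
>>>>>>>>>> # cat /sys/block/md127/md/mismatch_cnt
>>>>>>>>>>
>>>>>>>>>> A non-zero mismatch_cnt after the check finishes would point at
>>>>>>>>>> the raid5 layer rather than at btrfs.)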
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here are the relevant dmesg logs for the current state of the device:
>>>>>>>>>>>
>>>>>>>>>>> [ 19.119391] md: md127 stopped.
>>>>>>>>>>> [ 19.120841] md: bind
>>>>>>>>>>> [ 19.121120] md: bind
>>>>>>>>>>> [ 19.121380] md: bind
>>>>>>>>>>> [ 19.125535] md/raid:md127: device sda3 operational as raid disk 0
>>>>>>>>>>> [ 19.125547] md/raid:md127: device sdc3 operational as raid disk 2
>>>>>>>>>>> [ 19.125554] md/raid:md127: device sdb3 operational as raid disk 1
>>>>>>>>>>> [ 19.126712] md/raid:md127: allocated 3240kB
>>>>>>>>>>> [ 19.126778] md/raid:md127: raid level 5 active with 3 out of 3
>>>>>>>>>>> devices, algorithm 2
>>>>>>>>>>> [ 19.126784] RAID conf printout:
>>>>>>>>>>> [ 19.126789] --- level:5 rd:3 wd:3
>>>>>>>>>>> [ 19.126794] disk 0, o:1, dev:sda3
>>>>>>>>>>> [ 19.126799] disk 1, o:1, dev:sdb3
>>>>>>>>>>> [ 19.126804] disk 2, o:1, dev:sdc3
>>>>>>>>>>> [ 19.128118] md127: detected capacity change from 0 to 7991637573632
>>>>>>>>>>> [ 19.395112] Adding 523708k swap on /dev/md1. Priority:-1 extents:1
>>>>>>>>>>> across:523708k
>>>>>>>>>>> [ 19.434956] BTRFS: device label 11baed92:data devid 1 transid
>>>>>>>>>>> 151800 /dev/md127
>>>>>>>>>>> [ 19.739276] BTRFS info (device md127): setting nodatasum
>>>>>>>>>>> [ 19.740440] BTRFS critical (device md127): unable to find logical
>>>>>>>>>>> 3208757641216 len 4096
>>>>>>>>>>> [ 19.740450] BTRFS critical (device md127): unable to find logical
>>>>>>>>>>> 3208757641216 len 4096
>>>>>>>>>>> [ 19.740498] BTRFS critical (device md127): unable to find logical
>>>>>>>>>>> 3208757641216 len 4096
>>>>>>>>>>> [ 19.740512] BTRFS critical (device md127): unable to find logical
>>>>>>>>>>> 3208757641216 len 4096
>>>>>>>>>>> [ 19.740552] BTRFS critical (device md127): unable to find logical
>>>>>>>>>>> 3208757641216 len 4096
>>>>>>>>>>> [ 19.740560] BTRFS critical (device md127): unable to find logical
>>>>>>>>>>> 3208757641216 len 4096
>>>>>>>>>>> [ 19.740576] BTRFS error (device md127): failed to read chunk root
>>>>>>>>>>
>>>>>>>>>> This shows it pretty clearly: btrfs fails to read the chunk root.
>>>>>>>>>> And according to the "len 4096" above, it's a pretty old fs, as
>>>>>>>>>> it's still using 4K nodesize rather than 16K nodesize.
>>>>>>>>>>
>>>>>>>>>> According to the above output, your superblock somehow lacks the
>>>>>>>>>> needed system chunk mapping, which is used to initialize the
>>>>>>>>>> chunk mapping at mount time.
>>>>>>>>>>
>>>>>>>>>> Please provide the following command output:
>>>>>>>>>>
>>>>>>>>>> # btrfs inspect dump-super -fFa /dev/md127
>>>>>>>>>>
>>>>>>>>>> Also, please consider running the following command and dumping
>>>>>>>>>> all of its output:
>>>>>>>>>>
>>>>>>>>>> # btrfs rescue chunk-recover -v /dev/md127
>>>>>>>>>>
>>>>>>>>>> Please note that the above command can take a long time to
>>>>>>>>>> finish, and if it works without problem, it may solve your
>>>>>>>>>> problem.
>>>>>>>>>> But if it doesn't work, the output could help me to manually
>>>>>>>>>> craft a fix for your super block.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Qu
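>>>>>>>>>>
>>>>>>>>>> (To keep the whole run in one place, something like the following
>>>>>>>>>> works; the log path is only an example:
>>>>>>>>>>
>>>>>>>>>> # btrfs rescue chunk-recover -v /dev/md127 2>&1 | tee /tmp/chunk-recover.log
>>>>>>>>>>
>>>>>>>>>> That way the full output survives even if the SSH session drops.)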
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> [ 19.783975] BTRFS error (device md127): open_ctree failed
>>>>>>>>>>>
>>>>>>>>>>> In an attempt to recover the volume myself I ran a few BTRFS
>>>>>>>>>>> commands, mostly using advice from here:
>>>>>>>>>>> https://lists.opensuse.org/opensuse/2017-02/msg00930.html.
>>>>>>>>>>> However that actually seems to have made things worse, as I can
>>>>>>>>>>> no longer mount the file system, not even in readonly mode.
>>>>>>>>>>>
>>>>>>>>>>> So starting from the beginning, here is a list of things I have
>>>>>>>>>>> done so far (hopefully I remembered the order in which I ran
>>>>>>>>>>> them!):
>>>>>>>>>>>
>>>>>>>>>>> 1. Noticed that my backups to the NAS were not running (didn't
>>>>>>>>>>> get notified that the volume had basically "died").
>>>>>>>>>>> 2. ReadyNAS UI indicated that the volume was inactive.
>>>>>>>>>>> 3. SSHed onto the box and found that the first drive was not
>>>>>>>>>>> marked as operational (log showed I/O errors / UNKOWN (0x2003)),
>>>>>>>>>>> so I replaced the disk and let the array resync.
>>>>>>>>>>> 4. After the resync the volume was still inaccessible, so I
>>>>>>>>>>> looked at the logs once more and saw something like the
>>>>>>>>>>> following, which seemed to indicate that the replay log had been
>>>>>>>>>>> corrupted when the power went out:
>>>>>>>>>>>
>>>>>>>>>>> BTRFS critical (device md127): corrupt leaf, non-root leaf's nritems
>>>>>>>>>>> is 0: block=232292352, root=7, slot=0
>>>>>>>>>>> BTRFS critical (device md127): corrupt leaf, non-root leaf's nritems
>>>>>>>>>>> is 0: block=232292352, root=7, slot=0
>>>>>>>>>>> BTRFS: error (device md127) in btrfs_replay_log:2524: errno=-5 IO
>>>>>>>>>>> failure (Failed to recover log tree)
>>>>>>>>>>> BTRFS error (device md127): pending csums is 155648
>>>>>>>>>>> BTRFS error (device md127): cleaner transaction attach returned -30
>>>>>>>>>>> BTRFS critical (device md127): corrupt leaf, non-root leaf's nritems
>>>>>>>>>>> is 0: block=232292352, root=7, slot=0
>>>>>>>>>>>
>>>>>>>>>>> 5. Then:
>>>>>>>>>>>
>>>>>>>>>>> btrfs rescue zero-log
>>>>>>>>>>>
>>>>>>>>>>> 6. Was then able to mount the volume in readonly mode, so I
>>>>>>>>>>> started:
>>>>>>>>>>>
>>>>>>>>>>> btrfs scrub start
>>>>>>>>>>>
>>>>>>>>>>> which fixed some errors but not all:
>>>>>>>>>>>
>>>>>>>>>>> scrub status for 20628cda-d98f-4f85-955c-932a367f8821
>>>>>>>>>>> scrub started at Tue Apr 24 17:27:44 2018, running for 04:00:34
>>>>>>>>>>> total bytes scrubbed: 224.26GiB with 6 errors
>>>>>>>>>>> error details: csum=6
>>>>>>>>>>> corrected errors: 0, uncorrectable errors: 6, unverified errors: 0
>>>>>>>>>>>
>>>>>>>>>>> scrub status for 20628cda-d98f-4f85-955c-932a367f8821
>>>>>>>>>>> scrub started at Tue Apr 24 17:27:44 2018, running for 04:34:43
>>>>>>>>>>> total bytes scrubbed: 224.26GiB with 6 errors
>>>>>>>>>>> error details: csum=6
>>>>>>>>>>> corrected errors: 0, uncorrectable errors: 6, unverified errors: 0
>>>>>>>>>>>
>>>>>>>>>>> 7. Seeing this hanging, I rebooted the NAS.
>>>>>>>>>>> 8. Think this is when the volume would not mount at all.
>>>>>>>>>>> 9. Seeing log entries like these:
>>>>>>>>>>>
>>>>>>>>>>> BTRFS warning (device md127): checksum error at logical 20800943685632
>>>>>>>>>>> on dev /dev/md127, sector 520167424: metadata node (level 1) in tree 3
>>>>>>>>>>>
>>>>>>>>>>> I ran:
>>>>>>>>>>>
>>>>>>>>>>> btrfs check --fix-crc
>>>>>>>>>>>
>>>>>>>>>>> And that brings us to where I am now: some seemingly corrupted
>>>>>>>>>>> BTRFS metadata, and I am unable to mount the drive even with the
>>>>>>>>>>> recovery option.
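>>>>>>>>>>>
>>>>>>>>>>> (For reference, the read-only recovery attempt was along the
>>>>>>>>>>> lines of the following; the mount point is just an example:
>>>>>>>>>>>
>>>>>>>>>>> mount -o ro,recovery /dev/md127 /mnt
>>>>>>>>>>>
>>>>>>>>>>> "recovery" being the pre-4.6 kernel spelling of what newer
>>>>>>>>>>> kernels call "usebackuproot".)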
>>>>>>>>>>>
>>>>>>>>>>> Any help you can give is much appreciated!
>>>>>>>>>>>
>>>>>>>>>>> Kind regards
>>>>>>>>>>> Michael
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>