From: "Tomáš Metelka" <tomas.metelka@metaliza.cz>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Broken chunk tree - Was: Mount issue, mount /dev/sdc2: can't read superblock
Date: Sun, 30 Dec 2018 01:48:23 +0100
Message-ID: <07b88bad-e1fa-7485-d410-ee261ace321c@metaliza.cz>
In-Reply-To: <8f59acfd-4d97-86d4-2063-25213e2770d0@gmx.com>

Ok, I've got it:-(

But just a few questions: I've tried (with btrfs-progs v4.19.1) to 
recover files through btrfs restore -s -m -S -v -i ... and the following 
events occurred:

1) Just 1 "hard" error:
ERROR: cannot map block logical 117058830336 length 1073741824: -2
Error copying data for /mnt/...
(a file whose absence really doesn't pain me :-))

2) For 24 files I got the "too many loops" warning (I mean this: "if 
(loops >= 0 && loops++ >= 1024) { ..."). I've always answered yes but 
I'm afraid these files are corrupted (at least 2 of them seem corrupted).
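
For reference, the behaviour of the guard quoted above can be sketched as follows. This is a Python paraphrase of the counting pattern only, not the btrfs-progs source: the function name, the `ask_user` callback and its three answers ("stop", "continue", "dont_ask") are assumptions made for illustration.

```python
from typing import Callable, Iterable

LOOP_LIMIT = 1024  # threshold taken from the quoted condition


def copy_with_loop_guard(items: Iterable[int], ask_user: Callable[[], str]) -> int:
    """Process items, prompting once the loop counter reaches LOOP_LIMIT.

    ask_user is a hypothetical callback returning "stop", "continue" or
    "dont_ask". Returns the number of items processed.
    """
    loops = 0
    processed = 0
    for _ in items:
        # Mirrors `loops >= 0 && loops++ >= 1024`: the old value is compared
        # first, and the counter is only touched while prompting is enabled.
        if loops >= 0:
            hit_limit = loops >= LOOP_LIMIT
            loops += 1
            if hit_limit:
                answer = ask_user()
                if answer == "stop":
                    break
                # A negative counter disables all further prompts.
                loops = -1 if answer == "dont_ask" else 0
        processed += 1
    return processed
```

Under these assumptions, answering yes only resets the counter, so a file that keeps looping triggers the prompt again after another 1024 iterations.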

How bad is this? Does the error mentioned in #1 mean that it's the 
only file which is totally lost? I can live without those 24 + 1 files 
so if #1 and #2 are the only errors then I could say the recovery 
was successful ... but I'm afraid things aren't that easy:-)

Thanks
M.


   Tomáš Metelka
   Business & IT Analyst

   Tel: +420 728 627 252
   Email: tomas.metelka@metaliza.cz



On 24. 12. 18 15:19, Qu Wenruo wrote:
> 
> 
> On 2018/12/24 9:52 PM, Tomáš Metelka wrote:
>> On 24. 12. 18 14:02, Qu Wenruo wrote:
>>> btrfs check --readonly output please.
>>>
>>> btrfs check --readonly is always the most reliable and detailed output
>>> for any possible recovery.
>>
>> This is very weird because it prints only:
>> ERROR: cannot open file system
> 
> A new place to enhance ;)
> 
>>
>> I've also tried "btrfs check -r 75152310272" but it only says:
>> parent transid verify failed on 75152310272 wanted 2488742 found 2488741
>> parent transid verify failed on 75152310272 wanted 2488742 found 2488741
>> Ignoring transid failure
>> ERROR: cannot open file system
>>
>> I've tried that because:
>>      backup 3:
>>   backup_tree_root:    75152310272    gen: 2488741 level: 1
>>
>>> Also kernel message for the mount failure could help.
>>
>> Sorry, my fault, I should have started from this point:
>>
>> Dec 23 21:59:07 tisc5 kernel: [10319.442615] BTRFS: device fsid
>> be557007-42c9-4079-be16-568997e94cd9 devid 1 transid 2488742 /dev/loop0
>> Dec 23 22:00:49 tisc5 kernel: [10421.167028] BTRFS info (device loop0):
>> disk space caching is enabled
>> Dec 23 22:00:49 tisc5 kernel: [10421.167034] BTRFS info (device loop0):
>> has skinny extents
>> Dec 23 22:00:50 tisc5 kernel: [10421.807564] BTRFS critical (device
>> loop0): corrupt node: root=1 block=75150311424 slot=245, invalid NULL
>> node pointer
> This explains the problem.
> 
> Your root tree has one node pointer which is not correct.
> A node pointer should never point to 0.
> 
> This is pretty weird, at least a corruption pattern I have never seen.
> 
> Since your tree root got corrupted, there isn't much we can do
> except try to use older tree roots.
> 
> You could try all backup roots, starting from the newest backup (with
> the highest generation), and check each backup root bytenr using:
> # btrfs check -r <backup root bytenr> <device>
> 
> to see which one gives the fewest errors, but normally the chance is near 0.
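
The "newest backup first" ordering described above can be sketched as below. This is only an illustration: the helper name is made up, the parsing assumes the `backup_tree_root: <bytenr> gen: <generation>` line shape shown in the dump-super snippet in this thread, and in the sample only the first line is real (the other two bytenr/gen pairs are placeholders).

```python
import re
from typing import List

# Assumes the line shape seen in this thread's dump-super snippet.
BACKUP_ROOT_RE = re.compile(r"backup_tree_root:\s+(\d+)\s+gen:\s+(\d+)")


def backup_root_commands(dump_super_text: str, device: str) -> List[str]:
    """Return `btrfs check -r` command lines, highest generation first."""
    roots = {(int(gen), int(bytenr))
             for bytenr, gen in BACKUP_ROOT_RE.findall(dump_super_text)}
    ordered = sorted(roots, reverse=True)  # sorts by generation, newest first
    return [f"btrfs check -r {bytenr} {device}" for _gen, bytenr in ordered]


sample = """
  backup_tree_root:    75152310272    gen: 2488741 level: 1
  backup_tree_root:    11111111111    gen: 2488739 level: 1
  backup_tree_root:    22222222222    gen: 2488740 level: 1
"""
commands = backup_root_commands(sample, "/dev/loop0")
for cmd in commands:
    print(cmd)
```

The sketch only prints the commands; running them and comparing their error output is left to the reader.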
> 
>> Dec 23 22:00:50 tisc5 kernel: [10421.807653] BTRFS error (device loop0):
>> failed to read block groups: -5
>> Dec 23 22:00:50 tisc5 kernel: [10421.877001] BTRFS error (device loop0):
>> open_ctree failed
>>
>>
>> So I tried:
>> 1) btrfs inspect-internal dump-super (with the snippet posted above)
>> 2) btrfs inspect-internal dump-tree -b 75150311424
>>
>> And it showed (header + snippet for items 243-248):
>> node 75150311424 level 1 items 249 free 244 generation 2488741 owner 2
>> fs uuid be557007-42c9-4079-be16-568997e94cd9
>> chunk uuid dbe69c7e-2d50-4001-af31-148c5475b48b
>> ...
>>    key (14799519744 EXTENT_ITEM 4096) block 233423224832 (14247023) gen
>> 2484894
>>    key (14811271168 EXTENT_ITEM 135168) block 656310272 (40058) gen 2488049
> 
> 
>>    key (1505328190277054464 UNKNOWN.4 366981796979539968) block 0 (0) gen 0
>>    key (0 UNKNOWN.0 1419267647995904) block 6468220747776 (394788864) gen
>> 7786775707648
> 
> Pretty obviously, these two nodes are garbage.
> Something corrupted the memory at runtime, and we don't have a runtime
> check against such corruption yet.
> 
> So IMHO, I think the problem is that some kernel code, either btrfs or
> other parts, corrupted the memory.
> Btrfs then failed to detect it and wrote it back to disk, and the
> kernel only caught the problem later, when it read the tree block back
> from disk.
> 
> I could add such a check for nodes, but normally it needs
> CONFIG_BTRFS_FS_CHECK_INTEGRITY, so it makes no sense for normal users.
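
To make the two garbage entries concrete, here is a rough sketch of the kind of sanity check that flags them. It is not the kernel's tree checker. The `KeyPtr` type and the function name are hypothetical, the slot numbers assume the six dumped lines are slots 243 to 248 in order, and the second rule (a child pointer's generation should not be newer than the node holding it) is stated as an assumption about copy-on-write trees rather than taken from this thread.

```python
from typing import List, NamedTuple


class KeyPtr(NamedTuple):
    slot: int   # assumed slot index within the node
    block: int  # child block bytenr from the dump-tree line
    gen: int    # generation recorded for the child pointer


def suspicious_pointers(node_gen: int, ptrs: List[KeyPtr]) -> List[str]:
    """Report NULL block pointers and child generations newer than the node."""
    problems = []
    for p in ptrs:
        if p.block == 0:
            problems.append(f"slot {p.slot}: NULL node pointer")
        if p.gen > node_gen:
            problems.append(
                f"slot {p.slot}: child gen {p.gen} newer than node gen {node_gen}")
    return problems


# Values copied from the dump-tree snippet for node 75150311424 (gen 2488741).
ptrs = [
    KeyPtr(243, 233423224832, 2484894),
    KeyPtr(244, 656310272, 2488049),
    KeyPtr(245, 0, 0),
    KeyPtr(246, 6468220747776, 7786775707648),
    KeyPtr(247, 816693248, 2484931),
    KeyPtr(248, 75135844352, 2488739),
]
report = suspicious_pointers(2488741, ptrs)
for line in report:
    print(line)
```

With these inputs the NULL pointer lands on slot 245, which matches the slot named in the kernel message.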
> 
>>    key (12884901888 EXTENT_ITEM 24576) block 816693248 (49847) gen 2484931
>>    key (14902849536 EXTENT_ITEM 131072) block 75135844352 (4585928) gen
>> 2488739
>>
>>
>> I looked at those numbers for quite a while (also in hex) trying to
>> figure out what had happened (bit flips (it was on an SSD), byte shifts
>> (I also suspected a bad CPU ... because it died 2 months after
>> that)) and tried to guess "correct" values for those items ... but no
>> idea:-(
> 
> I'm not so sure; unless you're super lucky (or unlucky in this case),
> that would normally get caught by csum first.
> 
>>
>> So this is why I asked about that log_root and whether there is a
>> chance to "log-replay things":-)
> 
> For your case, definitely not related to log replay.
> 
> Thanks,
> Qu
> 
>>
>>
>> Thanks
>> M.
> 
