* Btrfs check reports errors, filesystem seems fine
From: Filippe LeMarchand @ 2017-07-01 11:59 UTC
To: linux-btrfs
Hello everyone.
I have a btrfs root partition on an Intel 530 SSD, which mounts without errors and seems to work fine,
but `btrfs check` gives me the following output (and --repair doesn't remove the errors):
enabling repair mode
Checking filesystem on /dev/sda2
UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
checking csums
checking root refs
found 23421812736 bytes used err is 0
total csum bytes: 21531608
total tree bytes: 776650752
total fs tree bytes: 711278592
total extent tree bytes: 36798464
btree space waste bytes: 116002036
file data blocks allocated: 850546470912
referenced 27611987968
Is it dangerous and what should I do about it?
I also tried --clear-space-cache, but it just removes the line about space cache.
* Re: Btrfs check reports errors, filesystem seems fine
From: Qu Wenruo @ 2017-07-03 0:34 UTC
To: Filippe LeMarchand, linux-btrfs
At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
This means that the directory whose inode number is 79177 has a child inode
pointer pointing to deprecated.sxt.
But it has no dir index and no corresponding inode ref for that name, which
breaks the cross-reference rule of btrfs.
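As a rough illustration of that cross-reference rule (a simplified model added for clarity, not btrfs code): each name in a directory is expected to be recorded as a DIR_ITEM, as a DIR_INDEX, and as an INODE_REF on the child. The Python sketch below uses plain sets in place of the on-disk trees. The numeric error bits are an assumption, chosen only to be consistent with the "errors 6" and "errors 1" values in the check output.

```python
# Simplified model of btrfs directory cross references (illustration only).
# Assumed error bits, chosen to match the "errors N" values in the check output.
NO_DIR_ITEM = 1 << 0
NO_DIR_INDEX = 1 << 1
NO_INODE_REF = 1 << 2


def check_name(name, dir_items, dir_indexes, inode_refs):
    """Return the error bitmask for one name in one directory."""
    errors = 0
    if name not in dir_items:
        errors |= NO_DIR_ITEM
    if name not in dir_indexes:
        errors |= NO_DIR_INDEX
    if name not in inode_refs:
        errors |= NO_INODE_REF
    return errors


# State of directory 79177 as the check output describes it:
dir_items = {"deprecated.sxt"}    # the DIR_ITEM carries the odd name
dir_indexes = {"deprecated.txt"}  # the DIR_INDEX (index 417) has the expected name
inode_refs = {"deprecated.txt"}   # the INODE_REF on the child has the expected name

for name in ("deprecated.sxt", "deprecated.txt"):
    print(name, "errors", check_name(name, dir_items, dir_indexes, inode_refs))
```

Under these assumptions the sketch reports errors 6 for deprecated.sxt and errors 1 for deprecated.txt, mirroring the two lines printed for directory 79177.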
Would you please run the following commands to dump the info we need to
debug this?
# btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
and
# btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
and
# btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
Considering the output contains both .txt and .sxt, I think that's the problem.
But such a bit flip should have been detected by the tree block checksum,
so I'm not sure what went wrong here.
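For reference, the two names can be compared byte by byte. The small Python sketch below is an illustration added to the thread, not part of the original analysis. It shows that the names differ in a single byte; the two characters are adjacent in ASCII, but their XOR covers three bits, so "bit-flip" is probably best read loosely as "one corrupted byte".

```python
# Compare the two file names byte by byte and count the differing bits.
a = b"deprecated.txt"
b = b"deprecated.sxt"

# Each entry: (offset, byte in a, byte in b, XOR of the two bytes).
diffs = [(i, x, y, x ^ y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

for i, x, y, xor in diffs:
    print(f"offset {i}: {chr(x)!r} (0x{x:02x}) vs {chr(y)!r} (0x{y:02x}), "
          f"xor 0x{xor:02x}, {bin(xor).count('1')} bit(s) differ")
```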
Thanks,
Qu
* Re: Btrfs check reports errors, filesystem seems fine
From: Lu Fengqi @ 2017-07-04 13:16 UTC
To: Filippe LeMarchand; +Cc: linux-btrfs, Qu Wenruo
I'm afraid your mail may have been rejected because the attachment size
exceeds the limit (100 kB) allowed on the btrfs mailing list. Could you
share the attachment via Google Drive?
Also, since Qu's schedule is tight at the moment, I will assist you with this issue.
--
Thanks,
Lu
* Re: Btrfs check reports errors, filesystem seems fine
From: Filippe LeMarchand @ 2017-07-04 13:24 UTC
To: Lu Fengqi; +Cc: linux-btrfs, Qu Wenruo
Sure, here it is:
https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
* Re: Btrfs check reports errors, filesystem seems fine
From: Qu Wenruo @ 2017-07-12 7:15 UTC
To: Filippe LeMarchand, Lu Fengqi; +Cc: linux-btrfs, Qu Wenruo
Sorry for the late reply.
After investigating the dumps, I found the output quite strange.
1) Mismatching output
In "btrfs-debug-tree-grep-79177.txt", 79177 only shows up as the offset of
INODE_REF keys, while 79177 as the objectid of DIR_ITEM/DIR_INDEX keys is
missing entirely.
In "btrfs-debug-tree-grep-deprecated-txt.txt", however, the expected
79177 DIR_ITEM/DIR_INDEX keys are present.
Maybe something went wrong with grep, so that it skipped "(79177"?
2) Mismatched hash
The main problem I found concerns key (79177 DIR_ITEM 54846528). The
number 54846528 is the hash (crc32c) of the filename, and the item contains
2 entries, one for "deprecated.txt" and one for "deprecated.sxt".
But 54846528 only matches the hash of "deprecated.txt",
not that of "deprecated.sxt".
I think that's the main problem.
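The mismatch can in principle be checked outside the filesystem by hashing both names. The Python sketch below assumes that the btrfs name hash is a raw CRC-32C (no final inversion) seeded with ~1. That seed and the helper names `crc32c_raw` and `name_hash` are assumptions of this example, not something stated in the thread. If the assumption holds, only "deprecated.txt" should come out as 54846528.

```python
def crc32c_raw(seed: int, data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli, reflected polynomial 0x82F63B78).

    No initial or final inversion is applied; the caller supplies the seed.
    """
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc


def name_hash(name: bytes) -> int:
    # Assumption: the name hash is the raw CRC-32C seeded with ~1 (0xFFFFFFFE).
    return crc32c_raw(0xFFFFFFFE, name)


for name in (b"deprecated.txt", b"deprecated.sxt"):
    print(name.decode(), name_hash(name))
```

Whatever the exact seed, the two names cannot share a CRC-32C value: they have the same length and differ only within one byte, and a CRC detects every such short burst. Two names under one DIR_ITEM key are normally only expected when their hashes collide.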
BTW, would you please try "btrfs check --mode=lowmem" to see whether lowmem
mode reports a similar error (the output format may differ)?
If lowmem mode also reports an error on this DIR_ITEM, I'm pretty sure
that's the problem.
However, it may take some time before we can fix it in repair mode.
Thanks,
Qu
* Re: Btrfs check reports errors, filesystem seems fine
From: Filippe LeMarchand @ 2017-07-12 11:12 UTC
To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
> Maybe something wrong in grep happened which skip "(79177" ?
Yes, my bad. I have now used the pattern grep -E "\(79177| 79177" and updated the file on GDrive.
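To illustrate what this pattern selects, the Python sketch below applies the same alternation to a few hypothetical dump lines. The line contents are made up for the example; only the key layout "(objectid TYPE offset)" is taken from the discussion.

```python
import re

# Same alternation as the grep -E pattern quoted above.
pattern = re.compile(r"\(79177| 79177")

# Hypothetical dump lines, shaped like "key (objectid TYPE offset)".
sample = [
    "item 1 key (79177 DIR_ITEM 54846528) itemoff 3000 itemsize 44",
    "item 2 key (79177 DIR_INDEX 417) itemoff 2956 itemsize 44",
    "item 3 key (4222342 INODE_REF 79177) itemoff 2932 itemsize 24",
    "item 4 key (179177 INODE_ITEM 0) itemoff 2772 itemsize 160",
]

# Keep lines where 79177 appears as objectid ("(79177") or as offset (" 79177").
matched = [line for line in sample if pattern.search(line)]
for line in matched:
    print(line)
```

The first three lines match, and the unrelated objectid 179177 does not. Note that "\(79177" would still match a longer objectid such as 791770, so the pattern is a filter rather than an exact key match.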
And btrfs check --mode=lowmem gives this:
checking extents
ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
ERROR: errors found in fs roots
Checking filesystem on /dev/sda2
UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
found 153429872640 bytes used, error(s) found
total csum bytes: 121991672
total tree bytes: 1940160512
total fs tree bytes: 1683767296
total extent tree bytes: 103841792
btree space waste bytes: 310722480
file data blocks allocated: 842455031808
referenced 159286636544
* Re: Btrfs check reports errors, filesystem seems fine
From: Qu Wenruo @ 2017-07-12 12:44 UTC
To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
On 2017-07-12 19:12, Filippe LeMarchand wrote:
>> Maybe something wrong in grep happened which skip "(79177" ?
> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
It looks much better, thanks.
>
> And btrfs check --mode=lowmem gives this:
>
> checking extents
> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
> ERROR: errors found in extent allocation tree or chunk allocation
This looks very much like a lowmem mode bug being exposed.
Feel free to ignore these errors from the extent tree; they are just false
alerts.
> checking free space cache
> checking fs roots
> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
The error report is much better than in original mode, and that's what I need.
Now I can filter out all the other noise, as we know exactly which tree and
which DIR_ITEM/INODE_REF is causing the problem.
Would you please update the dump results with "-t 4546" passed to
btrfs-debug-tree, like this:
# btrfs-debug-tree -t 4546 <device> | grep 79177
Only "-t 4546" is added, so that only subvolume 4546 is dumped.
As before, all 3 grep results (2 for "deprecated" and one for 79177) need to be
updated.
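A plain grep for 79177 can also hit unrelated numbers that merely contain those digits, while a pattern that is too strict can drop key lines. As a rough sketch only (this is not part of btrfs-progs; the regex and the function name are hypothetical and assume the "key (objectid TYPE offset)" layout seen in the dumps in this thread), the relevant lines could be selected like this:

```python
import re

# Matches key lines such as:
#   item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
KEY_RE = re.compile(r"key \((\d+) (\w+) (\d+)\)")


def lines_for_inode(dump_lines, ino):
    """Yield dump lines whose key has `ino` as objectid or as offset."""
    for line in dump_lines:
        m = KEY_RE.search(line)
        if m and ino in (int(m.group(1)), int(m.group(3))):
            yield line


if __name__ == "__main__":
    sample = [
        "item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88",
        "item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24",
        "item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44",
        # Hypothetical line, added only to show that substrings are not matched.
        "item 1 key (1791770 INODE_ITEM 0) itemoff 100 itemsize 160",
    ]
    for line in lines_for_inode(sample, 79177):
        print(line)
```

Comparing whole numbers instead of substrings keeps both the DIR_ITEM/DIR_INDEX keys (79177 as objectid) and the INODE_REF keys (79177 as offset).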
And it seems that my previous assumption still holds for this case.
If it's caused by the kernel, your dump will definitely help us locate
the problem.
> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
Also for root 5134 please.
Thanks,
Qu
> ERROR: errors found in fs roots
> Checking filesystem on /dev/sda2
> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> found 153429872640 bytes used, error(s) found
> total csum bytes: 121991672
> total tree bytes: 1940160512
> total fs tree bytes: 1683767296
> total extent tree bytes: 103841792
> btree space waste bytes: 310722480
> file data blocks allocated: 842455031808
> referenced 159286636544
>
> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
>> Sorry for the late reply.
>>
>> After investigating the dumps, I found the output is quite strange.
>>
>> 1) Mismatching output.
>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for
>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not
>> here at all.
>>
>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is epected
>> 79177 DIR_ITEM/DIR_INDEX.
>>
>> Maybe something wrong in grep happened which skip "(79177" ?
>>
>> 2) Mismatched hash
>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
>> number 54846528 is the hash(crc32c) of filename, and it contains 2
>> items, one for "deprecated.txt" and one for "deprecated.sxt".
>>
>> But we found that 54846528 only matches the hash for "deprecated.txt",
>> not "deprecated.sxt".
>>
>> I think that's the main problem.
>>
>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
>> mode reports similar (well, output may differ) error?
>>
>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure
>> that's the problem.
>>
>> However it may take some time before we can fix it in repair mode.
>>
>> Thanks,
>> Qu
>>
>>
>>
>> On 2017-07-04 21:24, Filippe LeMarchand wrote:
>>> Sure, here it is:
>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
>>>
>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
>>>>>> Hello everyone.
>>>>>>
>>>>>> I have an btrfs root partition on Intel 530 ssd, which mounts without errors and seem to work fine,
>>>>>> but `btrfs check` gives me foloowing output (and --repair doesn't remove errors):
>>>>>>
>>>>>> enabling repair mode
>>>>>> Checking filesystem on /dev/sda2
>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
>>>>>> checking extents
>>>>>> Fixed 0 roots.
>>>>>> checking free space cache
>>>>>> cache and super generation don't match, space cache will be invalidated
>>>>>> checking fs roots
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>
>>>>> This means that in dir whose inode number is 79177, it has a child inode
>>>>> pointer pointing to depercated.sxt.
>>>>>
>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking
>>>>> the cross reference rule of btrfs.
>>>>>
>>>>> Would you please run the following command to dump needed info for us to
>>>>> debug?
>>>>>
>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
>>>>>
>>>>> and
>>>>>
>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
>>>>>
>>>>> and
>>>>>
>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
>>>>>
>>>>>
>>>>> Considering the output has both .txt and .sxt, I think that's the problem.
>>>>> But such bit-flip should be detected by tree block csum.
>>>>> I'm not sure what's wrong with it.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> checking csums
>>>>>> checking root refs
>>>>>> found 23421812736 bytes used err is 0
>>>>>> total csum bytes: 21531608
>>>>>> total tree bytes: 776650752
>>>>>> total fs tree bytes: 711278592
>>>>>> total extent tree bytes: 36798464
>>>>>> btree space waste bytes: 116002036
>>>>>> file data blocks allocated: 850546470912
>>>>>> referenced 27611987968
>>>>>>
>>>>>> Is it dangerous and what should I do about it?
>>>>>>
>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache.
>>>>>>
>>>>>
>>>>>
>>>>
>>>> I'm afraid that your mail may be rejected because the attachment size
>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you
>>>> share the attachment by google drive?
>>>>
>>>> Lastly, while Qu's timing is too tight, I will assist you on this issue.
>>>>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Btrfs check reports errors, filesystem seems fine
2017-07-12 12:44 ` Qu Wenruo
@ 2017-07-12 13:11 ` Filippe LeMarchand
2017-07-14 6:11 ` Qu Wenruo
0 siblings, 1 reply; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-12 13:11 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
[-- Attachment #1: Type: text/plain, Size: 9429 bytes --]
Done, the files have been added to the same GDrive folder with corresponding names.
If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.
In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
>
> On 2017-07-12 19:12, Filippe LeMarchand wrote:
> >> Maybe something wrong in grep happened which skip "(79177" ?
> > Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
>
> It looks much better, thanks.
>
> >
> > And btrfs check --mode=lowmem gives this:
> >
> > checking extents
> > ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
> > ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
> > ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
> > ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
> > ERROR: errors found in extent allocation tree or chunk allocation
>
> Looks much like an exposed lowmem mode bug.
> Feel free to ignore these error from extent tree, they are just false
> alerts.
>
> > checking free space cache
> > checking fs roots
> > ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
>
> The error report is much better than original mode, and that's what I need.
>
> Now I can wipe out all other noise as we know exactly which tree and
> which DIR_ITEM/INODE_REF is causing the problem.
>
> Would you please update the dump result with "-t 4546" passed to
> btrfs-debug-tree like:
>
> # btrfs-debug-tree -t 4546 <device>| grep 79177
>
> Only "-t 4546" is added, to only dump the result of subvolume 4546.
> As always, all 3 grep results (2 "deprecated" and one 79177) need to be
> updated.
>
> And it seems that my previous assumption is still right for this case.
> If it's caused by kernel, your dump would definitely help us to locate
> the problem.
>
> > ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
> > ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
>
> Also for root 5134 please.
>
> Thanks,
> Qu
>
> > ERROR: errors found in fs roots
> > Checking filesystem on /dev/sda2
> > UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> > found 153429872640 bytes used, error(s) found
> > total csum bytes: 121991672
> > total tree bytes: 1940160512
> > total fs tree bytes: 1683767296
> > total extent tree bytes: 103841792
> > btree space waste bytes: 310722480
> > file data blocks allocated: 842455031808
> > referenced 159286636544
> >
> > In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
> >> Sorry for the late reply.
> >>
> >> After investigating the dumps, I found the output is quite strange.
> >>
> >> 1) Mismatching output.
> >> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for
> >> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not
> >> here at all.
> >>
> >> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is epected
> >> 79177 DIR_ITEM/DIR_INDEX.
> >>
> >> Maybe something wrong in grep happened which skip "(79177" ?
> >>
> >> 2) Mismatched hash
> >> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
> >> number 54846528 is the hash(crc32c) of filename, and it contains 2
> >> items, one for "deprecated.txt" and one for "deprecated.sxt".
> >>
> >> But we found that 54846528 only matches the hash for "deprecated.txt",
> >> not "deprecated.sxt".
> >>
> >> I think that's the main problem.
> >>
> >> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
> >> mode reports similar (well, output may differ) error?
> >>
> >> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure
> >> that's the problem.
> >>
> >> However it may take some time before we can fix it in repair mode.
> >>
> >> Thanks,
> >> Qu
> >>
> >>
> >>
> >> On 2017-07-04 21:24, Filippe LeMarchand wrote:
> >>> Sure, here it is:
> >>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
> >>>
> >>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
> >>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
> >>>>>
> >>>>>
> >>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
> >>>>>> Hello everyone.
> >>>>>>
> >>>>>> I have an btrfs root partition on Intel 530 ssd, which mounts without errors and seem to work fine,
> >>>>>> but `btrfs check` gives me foloowing output (and --repair doesn't remove errors):
> >>>>>>
> >>>>>> enabling repair mode
> >>>>>> Checking filesystem on /dev/sda2
> >>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>>> checking extents
> >>>>>> Fixed 0 roots.
> >>>>>> checking free space cache
> >>>>>> cache and super generation don't match, space cache will be invalidated
> >>>>>> checking fs roots
> >>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>
> >>>>> This means that in dir whose inode number is 79177, it has a child inode
> >>>>> pointer pointing to depercated.sxt.
> >>>>>
> >>>>> But it doesn't have dir index and corresponding inode ref, which is breaking
> >>>>> the cross reference rule of btrfs.
> >>>>>
> >>>>> Would you please run the following command to dump needed info for us to
> >>>>> debug?
> >>>>>
> >>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
> >>>>>
> >>>>> and
> >>>>>
> >>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
> >>>>>
> >>>>> and
> >>>>>
> >>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
> >>>>>
> >>>>>
> >>>>> Considering the output has both .txt and .sxt, I think that's the problem.
> >>>>> But such bit-flip should be detected by tree block csum.
> >>>>> I'm not sure what's wrong with it.
> >>>>>
> >>>>> Thanks,
> >>>>> Qu
> >>>>>
> >>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>> checking csums
> >>>>>> checking root refs
> >>>>>> found 23421812736 bytes used err is 0
> >>>>>> total csum bytes: 21531608
> >>>>>> total tree bytes: 776650752
> >>>>>> total fs tree bytes: 711278592
> >>>>>> total extent tree bytes: 36798464
> >>>>>> btree space waste bytes: 116002036
> >>>>>> file data blocks allocated: 850546470912
> >>>>>> referenced 27611987968
> >>>>>>
> >>>>>> Is it dangerous and what should I do about it?
> >>>>>>
> >>>>>> I also tried --clear-space-cache, but it just removes the line about space cache.
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>> I'm afraid that your mail may be rejected because the attachment size
> >>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you
> >>>> share the attachment by google drive?
> >>>>
> >>>> Lastly, while Qu's timing is too tight, I will assist you on this issue.
> >>>>
>
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5037 bytes --]
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Btrfs check reports errors, filesystem seems fine
2017-07-12 13:11 ` Filippe LeMarchand
@ 2017-07-14 6:11 ` Qu Wenruo
2017-07-14 10:12 ` Filippe LeMarchand
0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2017-07-14 6:11 UTC (permalink / raw)
To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
Thanks for your dump.
The direct cause of the problem is now clear:
a single corrupted DIR_ITEM.
Furthermore, original mode btrfs check can't detect it, and we will
fix that soon.
The corrupted DIR_ITEM is as follows:
item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
location key (4222342 INODE_ITEM 0) type FILE
transid 170929 data_len 0 name_len 14
name: deprecated.sxt
location key (13590433 INODE_ITEM 0) type FILE
transid 796448 data_len 0 name_len 14
name: deprecated.txt
According to this DIR_ITEM, dir inode 79177 has 2 child inodes, named "deprecated.sxt"
(ino=4222342) and "deprecated.txt" (ino=13590433).
But something is wrong here:
1) Hash of "deprecated.sxt" doesn't match 54846528
2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt"
This is also captured by the dump:
item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24
inode ref index 417 namelen 14 name: deprecated.txt
3) DIR_INDEX also shows that filename for inode 4222342 should be
"deprecated.txt"
item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44
location key (4222342 INODE_ITEM 0) type FILE
transid 170929 data_len 0 name_len 14
name: deprecated.txt
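The hash mismatch in 1) can be checked outside the kernel. Below is a minimal sketch, assuming that the DIR_ITEM key offset is the raw CRC32C of the name seeded with (u32)~1, as the btrfs_name_hash() helper in the kernel sources does; the Python function names are only for illustration:

```python
CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial


def crc32c_raw(seed: int, data: bytes) -> int:
    """Bitwise CRC32C with no implicit initial or final inversion."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
    return crc


def btrfs_name_hash(name: bytes) -> int:
    # Assumption: seed is (u32)~1 == 0xFFFFFFFE, and the result is not inverted.
    return crc32c_raw(0xFFFFFFFE, name)


if __name__ == "__main__":
    for name in (b"deprecated.txt", b"deprecated.sxt"):
        print(name.decode(), btrfs_name_hash(name))
```

The two printed values can be compared against the key offset 54846528; two equal-length names that differ in only one byte can never produce the same CRC32C, so at most one of them can match.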
So generally speaking, it's the wrong DIR_ITEM that is causing the problem.
But the root cause is still unknown.
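To make the cross-reference rule concrete: every name should appear identically in the DIR_ITEM, the DIR_INDEX and the INODE_REF. The following is only an illustrative sketch, not how btrfs check is implemented; the NameRef type and cross_check() are hypothetical, and the data is limited to the three items for inode 4222342 quoted above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRef:
    parent: int  # directory inode number
    child: int   # inode the entry points to
    name: str


def cross_check(dir_items, dir_indexes, inode_refs):
    """Return (entry, missing_item_types) for entries not present in all three."""
    sets = {
        "DIR_ITEM": set(dir_items),
        "DIR_INDEX": set(dir_indexes),
        "INODE_REF": set(inode_refs),
    }
    everything = set().union(*sets.values())
    problems = []
    for ref in sorted(everything, key=lambda r: (r.child, r.name)):
        missing = [kind for kind, s in sets.items() if ref not in s]
        if missing:
            problems.append((ref, missing))
    return problems


if __name__ == "__main__":
    dir_items = [NameRef(79177, 4222342, "deprecated.sxt")]
    dir_indexes = [NameRef(79177, 4222342, "deprecated.txt")]
    inode_refs = [NameRef(79177, 4222342, "deprecated.txt")]
    for ref, missing in cross_check(dir_items, dir_indexes, inode_refs):
        print(ref.name, "missing:", ", ".join(missing))
```

With this input the sketch flags "deprecated.sxt" as lacking a DIR_INDEX and an INODE_REF, and "deprecated.txt" as lacking a DIR_ITEM, which lines up with the "no dir index, no inode ref" and "no dir item" messages from the original check output.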
What I can see is that the corrupted DIR_ITEM entry points to a very old inode,
whose mtime dates back to 2016-09-07,
while the good entry points to a newer inode, whose mtime is
2017-05-02.
Even stranger, there should not be two child inodes with the same
filename ("deprecated.txt"; I assume the sxt one is caused by memory
bit corruption).
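For reference, the distance between the two names at bit level can be measured directly. A small sketch (byte_diffs() is a hypothetical helper, not an existing tool):

```python
def byte_diffs(a: bytes, b: bytes):
    """Return (index, xor, flipped_bit_count) for every differing byte."""
    if len(a) != len(b):
        raise ValueError("names must have the same length")
    return [
        (i, x ^ y, bin(x ^ y).count("1"))
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


if __name__ == "__main__":
    # 't' is 0x74 and 's' is 0x73, so the XOR is 0x07.
    print(byte_diffs(b"deprecated.txt", b"deprecated.sxt"))  # [(11, 7, 3)]
```

The names differ only in byte 11, where the three lowest bits change (equivalently, the byte value drops by one), so this would not be a single-bit flip. That observation alone neither confirms nor rules out memory corruption.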
So any details on the operations performed on util-linux/deprecated.txt will
help us locate the root cause in the kernel.
Thanks,
Qu
On 2017-07-12 21:11, Filippe LeMarchand wrote:
> Done, files added to same GDrive folder with corresponding names.
> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.
>
> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
>>
>> On 2017-07-12 19:12, Filippe LeMarchand wrote:
>>>> Maybe something wrong in grep happened which skip "(79177" ?
>>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
>>
>> It looks much better, thanks.
>>
>>>
>>> And btrfs check --mode=lowmem gives this:
>>>
>>> checking extents
>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
>>> ERROR: errors found in extent allocation tree or chunk allocation
>>
>> Looks much like an exposed lowmem mode bug.
>> Feel free to ignore these error from extent tree, they are just false
>> alerts.
>>
>>> checking free space cache
>>> checking fs roots
>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
>>
>> The error report is much better than original mode, and that's what I need.
>>
>> Now I can wipe out all other noise as we know exactly which tree and
>> which DIR_ITEM/INODE_REF is causing the problem.
>>
>> Would you please update the dump result with "-t 4546" passed to
>> btrfs-debug-tree like:
>>
>> # btrfs-debug-tree -t 4546 <device>| grep 79177
>>
>> Only "-t 4546" is added, to only dump the result of subvolume 4546.
>> As always, all 3 grep results (2 "deprecated" and one 79177) need to be
>> updated.
>>
>> And it seems that my previous assumption is still right for this case.
>> If it's caused by kernel, your dump would definitely help us to locate
>> the problem.
>>
>>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
>>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
>>
>> Also for root 5134 please.
>>
>> Thanks,
>> Qu
>>
>>> ERROR: errors found in fs roots
>>> Checking filesystem on /dev/sda2
>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
>>> found 153429872640 bytes used, error(s) found
>>> total csum bytes: 121991672
>>> total tree bytes: 1940160512
>>> total fs tree bytes: 1683767296
>>> total extent tree bytes: 103841792
>>> btree space waste bytes: 310722480
>>> file data blocks allocated: 842455031808
>>> referenced 159286636544
>>>
>>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
>>>> Sorry for the late reply.
>>>>
>>>> After investigating the dumps, I found the output is quite strange.
>>>>
>>>> 1) Mismatching output.
>>>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for
>>>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not
>>>> here at all.
>>>>
>>>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is epected
>>>> 79177 DIR_ITEM/DIR_INDEX.
>>>>
>>>> Maybe something wrong in grep happened which skip "(79177" ?
>>>>
>>>> 2) Mismatched hash
>>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
>>>> number 54846528 is the hash(crc32c) of filename, and it contains 2
>>>> items, one for "deprecated.txt" and one for "deprecated.sxt".
>>>>
>>>> But we found that 54846528 only matches the hash for "deprecated.txt",
>>>> not "deprecated.sxt".
>>>>
>>>> I think that's the main problem.
>>>>
>>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
>>>> mode reports similar (well, output may differ) error?
>>>>
>>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure
>>>> that's the problem.
>>>>
>>>> However it may take some time before we can fix it in repair mode.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>
>>>>
>>>> On 2017-07-04 21:24, Filippe LeMarchand wrote:
>>>>> Sure, here it is:
>>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
>>>>>
>>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
>>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
>>>>>>>
>>>>>>>
>>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
>>>>>>>> Hello everyone.
>>>>>>>>
>>>>>>>> I have an btrfs root partition on Intel 530 ssd, which mounts without errors and seem to work fine,
>>>>>>>> but `btrfs check` gives me foloowing output (and --repair doesn't remove errors):
>>>>>>>>
>>>>>>>> enabling repair mode
>>>>>>>> Checking filesystem on /dev/sda2
>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
>>>>>>>> checking extents
>>>>>>>> Fixed 0 roots.
>>>>>>>> checking free space cache
>>>>>>>> cache and super generation don't match, space cache will be invalidated
>>>>>>>> checking fs roots
>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>
>>>>>>> This means that in dir whose inode number is 79177, it has a child inode
>>>>>>> pointer pointing to depercated.sxt.
>>>>>>>
>>>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking
>>>>>>> the cross reference rule of btrfs.
>>>>>>>
>>>>>>> Would you please run the following command to dump needed info for us to
>>>>>>> debug?
>>>>>>>
>>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
>>>>>>>
>>>>>>>
>>>>>>> Considering the output has both .txt and .sxt, I think that's the problem.
>>>>>>> But such bit-flip should be detected by tree block csum.
>>>>>>> I'm not sure what's wrong with it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Qu
>>>>>>>
>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>> checking csums
>>>>>>>> checking root refs
>>>>>>>> found 23421812736 bytes used err is 0
>>>>>>>> total csum bytes: 21531608
>>>>>>>> total tree bytes: 776650752
>>>>>>>> total fs tree bytes: 711278592
>>>>>>>> total extent tree bytes: 36798464
>>>>>>>> btree space waste bytes: 116002036
>>>>>>>> file data blocks allocated: 850546470912
>>>>>>>> referenced 27611987968
>>>>>>>>
>>>>>>>> Is it dangerous and what should I do about it?
>>>>>>>>
>>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> I'm afraid that your mail may be rejected because the attachment size
>>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you
>>>>>> share the attachment by google drive?
>>>>>>
>>>>>> Lastly, while Qu's timing is too tight, I will assist you on this issue.
>>>>>>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Btrfs check reports errors, filesystem seems fine
2017-07-14 6:11 ` Qu Wenruo
@ 2017-07-14 10:12 ` Filippe LeMarchand
2017-07-14 11:28 ` Qu Wenruo
0 siblings, 1 reply; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-14 10:12 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
[-- Attachment #1: Type: text/plain, Size: 13743 bytes --]
The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:
$ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
$ sudo rm -rf /usr/share/doc/packages/util-linux/
rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
$ sudo ls -l /usr/share/doc/packages/util-linux/
ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
total 0
-????????? ? ? ? ? ? deprecated.txt
Reinstalling the util-linux package gives me two copies of that file (two files are also present in the previous snapshot):
$ ls -l /usr/share/doc/packages/util-linux/
total 104
-rw-r--r-- 1 root root 18092 Jul 20 2016 COPYING
-rw-r--r-- 1 root root 1391 Jul 20 2016 COPYING.BSD-3
-rw-r--r-- 1 root root 26530 Jul 20 2016 COPYING.LGPLv2.1
-rw-r--r-- 1 root root 1824 Jul 20 2016 COPYING.UCB
-rw-r--r-- 1 root root 555 Jul 20 2016 README.licensing
-rw-r--r-- 1 root root 3257 Jul 20 2016 blkid.txt
-rw-r--r-- 1 root root 2264 Jul 20 2016 cal.txt
-rw-r--r-- 1 root root 1913 Jul 20 2016 col.txt
-rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
-rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
-rw-r--r-- 1 root root 992 Jul 20 2016 getopt.txt
-rw-r--r-- 1 root root 2437 Nov 2 2016 howto-debug.txt
-rw-r--r-- 1 root root 148 Jul 20 2016 hwclock.txt
-rw-r--r-- 1 root root 2617 Jul 20 2016 modems-with-agetty.txt
-rw-r--r-- 1 root root 522 Jul 20 2016 mount.txt
-rw-r--r-- 1 root root 448 Jul 20 2016 pg.txt
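Both symptoms above (a name that is listed but cannot be stat'ed, and the same name listed twice) can be detected from userspace without touching the filesystem structures. A minimal sketch, where dir_anomalies() is a hypothetical helper and only standard library calls are used:

```python
import os
from collections import Counter


def dir_anomalies(path):
    """Return (duplicate_names, unstatable_names) for one directory."""
    names = os.listdir(path)
    # A healthy directory never returns the same name twice.
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    unstatable = []
    for name in sorted(set(names)):
        try:
            os.lstat(os.path.join(path, name))
        except OSError:
            # Listed by readdir, but lookup fails ("No such file or directory").
            unstatable.append(name)
    return duplicates, unstatable


if __name__ == "__main__":
    print(dir_anomalies("/usr/share/doc/packages/util-linux"))
```

On a healthy directory both lists are empty; the script only reads directory entries and inode metadata, so it should be safe to run on the affected filesystem.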
So, is this situation actually dangerous? And what can I do to gather more information for you?
In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote:
> Thanks for your dump.
>
> We're clear what is the direct cause of the problem.
>
> It's one corrupted DIR_ITEM causing the problem.
> And further more, original mode btrfs check can't detect it, and we will
> fix it soon.
>
> The corrupted DIR_ITEM is as the following:
> item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
> location key (4222342 INODE_ITEM 0) type FILE
> transid 170929 data_len 0 name_len 14
> name: deprecated.sxt
> location key (13590433 INODE_ITEM 0) type FILE
> transid 796448 data_len 0 name_len 14
> name: deprecated.txt
>
> For dir inode 79177, it has 2 child inodes, with name "deprecated.txt"
> (ino=4222342) and "deprecated.sxt" (ino=13590433)
>
> But something goes wrong here:
>
> 1) Hash of "deprecated.sxt" doesn't match 54846528
>
> 2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt"
> Also captured by dump:
> item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24
> inode ref index 417 namelen 14 name: deprecated.txt
>
> 3) DIR_INDEX also shows that filename for inode 4222342 should be
> "deprecated.txt"
> item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44
> location key (4222342 INODE_ITEM 0) type FILE
> transid 170929 data_len 0 name_len 14
> name: deprecated.txt
>
> So generally speaking, the DIR_ITEM is wrong and is causing the problem.
>
> But the root reason is still unknown.
>
> What I can see is, the corrupted DIR_ITEM points to a very old inode,
> whose mtime dates back to 2016-09-07.
> While the good DIR_ITEM points to a newer inode, whose mtime is just
> 2017-05-02.
>
> But even weirder, there should not be two child inodes with the same
> filename ("deprecated.txt"; I assume the sxt one is caused by a memory
> bit corruption).
>
> So, any details on the operation with util-linux/deprecated.txt will
> help us to locate the root cause in kernel.
>
> Thanks,
> Qu
>
>
> On 2017-07-12 21:11, Filippe LeMarchand wrote:
> > Done, files added to same GDrive folder with corresponding names.
> > If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.
> >
> > In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
> >>
> >> On 2017-07-12 19:12, Filippe LeMarchand wrote:
> >>>> Maybe something wrong in grep happened which skip "(79177" ?
> >>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
> >>
> >> It looks much better, thanks.
> >>
> >>>
> >>> And btrfs check --mode=lowmem gives this:
> >>>
> >>> checking extents
> >>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
> >>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
> >>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
> >>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
> >>> ERROR: errors found in extent allocation tree or chunk allocation
> >>
> >> This looks very much like an exposed lowmem mode bug.
> >> Feel free to ignore these errors from the extent tree; they are just false
> >> alerts.
> >>
> >>> checking free space cache
> >>> checking fs roots
> >>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>
> >> The error report is much better than original mode, and that's what I need.
> >>
> >> Now I can wipe out all other noise as we know exactly which tree and
> >> which DIR_ITEM/INODE_REF is causing the problem.
> >>
> >> Would you please update the dump result with "-t 4546" passed to
> >> btrfs-debug-tree like:
> >>
> >> # btrfs-debug-tree -t 4546 <device>| grep 79177
> >>
> >> Only "-t 4546" is added, to only dump the result of subvolume 4546.
> >> As always, all 3 grep results (2 "deprecated" and one 79177) need to be
> >> updated.
> >>
> >> And it seems that my previous assumption is still right for this case.
> >> If it's caused by kernel, your dump would definitely help us to locate
> >> the problem.
> >>
> >>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
> >>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>
> >> Also for root 5134 please.
> >>
> >> Thanks,
> >> Qu
> >>
> >>> ERROR: errors found in fs roots
> >>> Checking filesystem on /dev/sda2
> >>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>> found 153429872640 bytes used, error(s) found
> >>> total csum bytes: 121991672
> >>> total tree bytes: 1940160512
> >>> total fs tree bytes: 1683767296
> >>> total extent tree bytes: 103841792
> >>> btree space waste bytes: 310722480
> >>> file data blocks allocated: 842455031808
> >>> referenced 159286636544
> >>>
> >>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
> >>>> Sorry for the late reply.
> >>>>
> >>>> After investigating the dumps, I found the output is quite strange.
> >>>>
> >>>> 1) Mismatching output.
> >>>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for
> >>>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not
> >>>> here at all.
> >>>>
> >>>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is the expected
> >>>> 79177 DIR_ITEM/DIR_INDEX.
> >>>>
> >>>> Maybe something wrong in grep happened which skip "(79177" ?
> >>>>
> >>>> 2) Mismatched hash
> >>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
> >>>> number 54846528 is the hash(crc32c) of filename, and it contains 2
> >>>> items, one for "deprecated.txt" and one for "deprecated.sxt".
> >>>>
> >>>> But we found that 54846528 only matches the hash for "deprecated.txt",
> >>>> not "deprecated.sxt".
> >>>>
> >>>> I think that's the main problem.
> >>>>
> >>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
> >>>> mode reports similar (well, output may differ) error?
> >>>>
> >>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure
> >>>> that's the problem.
> >>>>
> >>>> However it may take some time before we can fix it in repair mode.
> >>>>
> >>>> Thanks,
> >>>> Qu
> >>>>
> >>>>
> >>>>
> >>>> On 2017-07-04 21:24, Filippe LeMarchand wrote:
> >>>>> Sure, here it is:
> >>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
> >>>>>
> >>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
> >>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
> >>>>>>>> Hello everyone.
> >>>>>>>>
> >>>>>>>> I have a btrfs root partition on an Intel 530 SSD, which mounts without errors and seems to work fine,
> >>>>>>>> but `btrfs check` gives me the following output (and --repair doesn't remove the errors):
> >>>>>>>>
> >>>>>>>> enabling repair mode
> >>>>>>>> Checking filesystem on /dev/sda2
> >>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>>>>> checking extents
> >>>>>>>> Fixed 0 roots.
> >>>>>>>> checking free space cache
> >>>>>>>> cache and super generation don't match, space cache will be invalidated
> >>>>>>>> checking fs roots
> >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>
> >>>>>>> This means that in dir whose inode number is 79177, it has a child inode
> >>>>>>> pointer pointing to deprecated.sxt.
> >>>>>>>
> >>>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking
> >>>>>>> the cross reference rule of btrfs.
> >>>>>>>
> >>>>>>> Would you please run the following command to dump needed info for us to
> >>>>>>> debug?
> >>>>>>>
> >>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
> >>>>>>>
> >>>>>>> and
> >>>>>>>
> >>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
> >>>>>>>
> >>>>>>> and
> >>>>>>>
> >>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
> >>>>>>>
> >>>>>>>
> >>>>>>> Considering the output has both .txt and .sxt, I think that's the problem.
> >>>>>>> But such bit-flip should be detected by tree block csum.
> >>>>>>> I'm not sure what's wrong with it.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Qu
> >>>>>>>
> >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>> checking csums
> >>>>>>>> checking root refs
> >>>>>>>> found 23421812736 bytes used err is 0
> >>>>>>>> total csum bytes: 21531608
> >>>>>>>> total tree bytes: 776650752
> >>>>>>>> total fs tree bytes: 711278592
> >>>>>>>> total extent tree bytes: 36798464
> >>>>>>>> btree space waste bytes: 116002036
> >>>>>>>> file data blocks allocated: 850546470912
> >>>>>>>> referenced 27611987968
> >>>>>>>>
> >>>>>>>> Is it dangerous and what should I do about it?
> >>>>>>>>
> >>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache.
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> >>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>>>>
> >>>>>> I'm afraid that your mail may be rejected because the attachment size
> >>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you
> >>>>>> share the attachment by google drive?
> >>>>>>
> >>>>>> Lastly, since Qu is short on time, I will assist you with this issue.
> >>>>>>
>
* Re: Btrfs check reports errors, filesystem seems fine
2017-07-14 10:12 ` Filippe LeMarchand
@ 2017-07-14 11:28 ` Qu Wenruo
2017-07-14 12:04 ` Filippe LeMarchand
0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2017-07-14 11:28 UTC (permalink / raw)
To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
On 2017-07-14 18:12, Filippe LeMarchand wrote:
> The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:
>
> $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
> rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
>
> $ sudo rm -rf /usr/share/doc/packages/util-linux/
> rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
>
> $ sudo ls -l /usr/share/doc/packages/util-linux/
> ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> total 0
> -????????? ? ? ? ? ? deprecated.txt
We also see similar behavior with a manually crafted image in our
environment.
Su Yue has sent patches that enhance error detection and add a test case for
it, but repair is not supported.
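
To illustrate the kind of cross-reference check that such error detection relies on, here is a minimal sketch. It is a toy model and not btrfs-progs code: the NameRecord type and both helper functions are hypothetical, and the records are only the three items for inode 4222342 that were quoted from the dump.

```python
from typing import Dict, List, NamedTuple, Set


class NameRecord(NamedTuple):
    """One name-carrying item, reduced to the fields needed here (hypothetical type)."""
    item_type: str   # "DIR_ITEM", "DIR_INDEX" or "INODE_REF"
    parent_dir: int  # inode number of the directory
    inode: int       # inode number of the child
    name: str


# Only the items for inode 4222342 that appear in the quoted dump.
RECORDS: List[NameRecord] = [
    NameRecord("DIR_ITEM", 79177, 4222342, "deprecated.sxt"),
    NameRecord("DIR_INDEX", 79177, 4222342, "deprecated.txt"),
    NameRecord("INODE_REF", 79177, 4222342, "deprecated.txt"),
]

REQUIRED_TYPES = {"DIR_ITEM", "DIR_INDEX", "INODE_REF"}


def names_by_item_type(records: List[NameRecord], parent_dir: int,
                       inode: int) -> Dict[str, Set[str]]:
    """Collect the names each item type records for one (dir, inode) pair."""
    found: Dict[str, Set[str]] = {}
    for rec in records:
        if rec.parent_dir == parent_dir and rec.inode == inode:
            found.setdefault(rec.item_type, set()).add(rec.name)
    return found


def is_consistent(found: Dict[str, Set[str]]) -> bool:
    """True only if all three item types exist and agree on a single name."""
    if set(found) != REQUIRED_TYPES:
        return False
    all_names = set().union(*found.values())
    return len(all_names) == 1


if __name__ == "__main__":
    found = names_by_item_type(RECORDS, 79177, 4222342)
    for item_type in sorted(found):
        print(item_type, sorted(found[item_type]))
    print("consistent:", is_consistent(found))
```

In this model the three items for inode 4222342 disagree on the name, which is the inconsistency that check reports for the ".sxt" entry.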
>
> Reinstalling the util-linux package gives me two copies of that file (two files are also present in the previous snapshot):
>
> $ ls -l /usr/share/doc/packages/util-linux/
> total 104
> -rw-r--r-- 1 root root 18092 Jul 20 2016 COPYING
> -rw-r--r-- 1 root root 1391 Jul 20 2016 COPYING.BSD-3
> -rw-r--r-- 1 root root 26530 Jul 20 2016 COPYING.LGPLv2.1
> -rw-r--r-- 1 root root 1824 Jul 20 2016 COPYING.UCB
> -rw-r--r-- 1 root root 555 Jul 20 2016 README.licensing
> -rw-r--r-- 1 root root 3257 Jul 20 2016 blkid.txt
> -rw-r--r-- 1 root root 2264 Jul 20 2016 cal.txt
> -rw-r--r-- 1 root root 1913 Jul 20 2016 col.txt
> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
> -rw-r--r-- 1 root root 992 Jul 20 2016 getopt.txt
> -rw-r--r-- 1 root root 2437 Nov 2 2016 howto-debug.txt
> -rw-r--r-- 1 root root 148 Jul 20 2016 hwclock.txt
> -rw-r--r-- 1 root root 2617 Jul 20 2016 modems-with-agetty.txt
> -rw-r--r-- 1 root root 522 Jul 20 2016 mount.txt
> -rw-r--r-- 1 root root 448 Jul 20 2016 pg.txt
>
> So, is this situation actually dangerous? And what can I do to gather more information for you?
The situation won't get worse. I'd recommend not taking any snapshots of
those subvolumes (4546 and 5134), to limit the corruption to those
subvolumes.

However, there is no easy way to fix it yet.

Currently, one possible solution is deleting the whole subvolume.
If no further errors appear, that may fix it.

IIRC btrfs check --repair in original mode has a
DIR_ITEM/DIR_INDEX/INODE_REF repair function, but I'm not sure it can
handle this case well.
btrfs check --repair *MAY* fix it, or it may make things worse.
If you have a full backup, you could try it.
Otherwise, don't try it at all.

Another option is a specific repair program just for your case.
We can modify btrfs-corrupt-block to just delete the corrupted DIR_ITEM
(the ".sxt" one) and the related DIR_INDEX/INODE_REF.
But I'll only choose this if you really need to fix it as soon as possible.

At least we have a solution for it.
I'm more concerned about how this happened.

Any idea how to reproduce it? Or is it just random memory corruption?
Thanks,
Qu
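
For anyone who wants to check the hash mismatch from the earlier analysis offline, here is a minimal sketch. It assumes the DIR_ITEM key offset is the raw CRC-32C of the file name, seeded with ~1 (0xFFFFFFFE) and without a final inversion. That seed, and the function names crc32c_raw and btrfs_name_hash as written here, are assumptions for illustration and are not taken from this thread.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected form


def crc32c_raw(seed: int, data: bytes) -> int:
    """Bitwise CRC-32C update with no implicit initial or final inversion."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED
            else:
                crc >>= 1
    return crc


def btrfs_name_hash(name: bytes) -> int:
    """Name hash under the stated assumption: raw CRC-32C seeded with ~1."""
    return crc32c_raw(0xFFFFFFFE, name)


if __name__ == "__main__":
    key_offset = 54846528  # offset of key (79177 DIR_ITEM 54846528)
    for name in (b"deprecated.txt", b"deprecated.sxt"):
        digest = btrfs_name_hash(name)
        print(name.decode(), digest, "matches key" if digest == key_offset else "differs from key")
```

If the assumption about the seed holds, only the ".txt" name should hash to 54846528, as stated in the analysis quoted below.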
>
> In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote:
>> Thanks for your dump.
>>
>> We're clear what is the direct cause of the problem.
>>
>> It's one corrupted DIR_ITEM causing the problem.
>> And further more, original mode btrfs check can't detect it, and we will
>> fix it soon.
>>
>> The corrupted DIR_ITEM is as the following:
>> item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
>> location key (4222342 INODE_ITEM 0) type FILE
>> transid 170929 data_len 0 name_len 14
>> name: deprecated.sxt
>> location key (13590433 INODE_ITEM 0) type FILE
>> transid 796448 data_len 0 name_len 14
>> name: deprecated.txt
>>
>> For dir inode 79177, it has 2 child inodes, with name "deprecated.sxt"
>> (ino=4222342) and "deprecated.txt" (ino=13590433)
>>
>> But something goes wrong here:
>>
>> 1) Hash of "deprecated.sxt" doesn't match 54846528
>>
>> 2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt"
>> Also captured by dump:
>> item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24
>> inode ref index 417 namelen 14 name: deprecated.txt
>>
>> 3) DIR_INDEX also shows that filename for inode 4222342 should be
>> "deprecated.txt"
>> item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44
>> location key (4222342 INODE_ITEM 0) type FILE
>> transid 170929 data_len 0 name_len 14
>> name: deprecated.txt
>>
>> So generally speaking, the DIR_ITEM is wrong and is causing the problem.
>>
>> But the root reason is still unknown.
>>
>> What I can see is, the corrupted DIR_ITEM points to a very old inode,
>> whose mtime dates back to 2016-09-07.
>> While the good DIR_ITEM points to a newer inode, whose mtime is just
>> 2017-05-02.
>>
>> But even weirder, there should not be two child inodes with the same
>> filename ("deprecated.txt"; I assume the sxt one is caused by a memory
>> bit corruption).
>>
>> So, any details on the operation with util-linux/deprecated.txt will
>> help us to locate the root cause in kernel.
>>
>> Thanks,
>> Qu
>>
>>
>> On 2017-07-12 21:11, Filippe LeMarchand wrote:
>>> Done, files added to same GDrive folder with corresponding names.
>>> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.
>>>
>>> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
>>>>
>>>> On 2017-07-12 19:12, Filippe LeMarchand wrote:
>>>>>> Maybe something wrong in grep happened which skip "(79177" ?
>>>>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
>>>>
>>>> It looks much better, thanks.
>>>>
>>>>>
>>>>> And btrfs check --mode=lowmem gives this:
>>>>>
>>>>> checking extents
>>>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
>>>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
>>>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
>>>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
>>>>> ERROR: errors found in extent allocation tree or chunk allocation
>>>>
>>>> This looks very much like an exposed lowmem mode bug.
>>>> Feel free to ignore these errors from the extent tree; they are just false
>>>> alerts.
>>>>
>>>>> checking free space cache
>>>>> checking fs roots
>>>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
>>>>
>>>> The error report is much better than original mode, and that's what I need.
>>>>
>>>> Now I can wipe out all other noise as we know exactly which tree and
>>>> which DIR_ITEM/INODE_REF is causing the problem.
>>>>
>>>> Would you please update the dump result with "-t 4546" passed to
>>>> btrfs-debug-tree like:
>>>>
>>>> # btrfs-debug-tree -t 4546 <device>| grep 79177
>>>>
>>>> Only "-t 4546" is added, to only dump the result of subvolume 4546.
>>>> As always, all 3 grep results (2 "deprecated" and one 79177) need to be
>>>> updated.
>>>>
>>>> And it seems that my previous assumption is still right for this case.
>>>> If it's caused by kernel, your dump would definitely help us to locate
>>>> the problem.
>>>>
>>>>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
>>>>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
>>>>
>>>> Also for root 5134 please.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>> ERROR: errors found in fs roots
>>>>> Checking filesystem on /dev/sda2
>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
>>>>> found 153429872640 bytes used, error(s) found
>>>>> total csum bytes: 121991672
>>>>> total tree bytes: 1940160512
>>>>> total fs tree bytes: 1683767296
>>>>> total extent tree bytes: 103841792
>>>>> btree space waste bytes: 310722480
>>>>> file data blocks allocated: 842455031808
>>>>> referenced 159286636544
>>>>>
>>>>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
>>>>>> Sorry for the late reply.
>>>>>>
>>>>>> After investigating the dumps, I found the output is quite strange.
>>>>>>
>>>>>> 1) Mismatching output.
>>>>>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for
>>>>>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not
>>>>>> here at all.
>>>>>>
>>>>>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is the expected
>>>>>> 79177 DIR_ITEM/DIR_INDEX.
>>>>>>
>>>>>> Maybe something wrong in grep happened which skip "(79177" ?
>>>>>>
>>>>>> 2) Mismatched hash
>>>>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
>>>>>> number 54846528 is the hash(crc32c) of filename, and it contains 2
>>>>>> items, one for "deprecated.txt" and one for "deprecated.sxt".
>>>>>>
>>>>>> But we found that 54846528 only matches the hash for "deprecated.txt",
>>>>>> not "deprecated.sxt".
>>>>>>
>>>>>> I think that's the main problem.
>>>>>>
>>>>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
>>>>>> mode reports similar (well, output may differ) error?
>>>>>>
>>>>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure
>>>>>> that's the problem.
>>>>>>
>>>>>> However it may take some time before we can fix it in repair mode.
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2017-07-04 21:24, Filippe LeMarchand wrote:
>>>>>>> Sure, here it is:
>>>>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
>>>>>>>
>>>>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
>>>>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
>>>>>>>>>> Hello everyone.
>>>>>>>>>>
>>>>>>>>>> I have a btrfs root partition on an Intel 530 SSD, which mounts without errors and seems to work fine,
>>>>>>>>>> but `btrfs check` gives me the following output (and --repair doesn't remove the errors):
>>>>>>>>>>
>>>>>>>>>> enabling repair mode
>>>>>>>>>> Checking filesystem on /dev/sda2
>>>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
>>>>>>>>>> checking extents
>>>>>>>>>> Fixed 0 roots.
>>>>>>>>>> checking free space cache
>>>>>>>>>> cache and super generation don't match, space cache will be invalidated
>>>>>>>>>> checking fs roots
>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>>>
>>>>>>>>> This means that in dir whose inode number is 79177, it has a child inode
>>>>>>>>> pointer pointing to deprecated.sxt.
>>>>>>>>>
>>>>>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking
>>>>>>>>> the cross reference rule of btrfs.
>>>>>>>>>
>>>>>>>>> Would you please run the following command to dump needed info for us to
>>>>>>>>> debug?
>>>>>>>>>
>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
>>>>>>>>>
>>>>>>>>> and
>>>>>>>>>
>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
>>>>>>>>>
>>>>>>>>> and
>>>>>>>>>
>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Considering the output has both .txt and .sxt, I think that's the problem.
>>>>>>>>> But such bit-flip should be detected by tree block csum.
>>>>>>>>> I'm not sure what's wrong with it.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Qu
>>>>>>>>>
>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>>>>>> checking csums
>>>>>>>>>> checking root refs
>>>>>>>>>> found 23421812736 bytes used err is 0
>>>>>>>>>> total csum bytes: 21531608
>>>>>>>>>> total tree bytes: 776650752
>>>>>>>>>> total fs tree bytes: 711278592
>>>>>>>>>> total extent tree bytes: 36798464
>>>>>>>>>> btree space waste bytes: 116002036
>>>>>>>>>> file data blocks allocated: 850546470912
>>>>>>>>>> referenced 27611987968
>>>>>>>>>>
>>>>>>>>>> Is it dangerous and what should I do about it?
>>>>>>>>>>
>>>>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>>>>>
>>>>>>>> I'm afraid that your mail may be rejected because the attachment size
>>>>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you
>>>>>>>> share the attachment by google drive?
>>>>>>>>
>>>>>>>> Lastly, since Qu is short on time, I will assist you with this issue.
>>>>>>>>
* Re: Btrfs check reports errors, filesystem seems fine
2017-07-14 11:28 ` Qu Wenruo
@ 2017-07-14 12:04 ` Filippe LeMarchand
2017-07-14 12:11 ` Qu Wenruo
0 siblings, 1 reply; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-14 12:04 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
> Currently possible solution may be deleting the whole subvolume.
Can btrfs send (to an external drive) followed by btrfs receive back fix it? Or should I use plain cp/rsync?
> If you have full backup, then you could try it.
It is my root subvolume (sensitive data is on other ones), so it is expendable. Can btrfs check --repair damage other subvolumes?
> Any idea about the reproducer? Or just random memory corruption?
No idea why and no idea when. This partition is about a year and a half old, and I ran btrfs check for the first time only about a month ago.
Also I ran memtest recently and it didn't find any errors.
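
As a data point for the memory corruption question, the two names can be compared byte by byte. This is only a rough sketch with a hypothetical helper, differing_bits; it shows how the stored names differ and does not identify the cause.

```python
from typing import List, Tuple


def differing_bits(a: bytes, b: bytes) -> List[Tuple[int, int]]:
    """Return (byte offset, XOR mask) for every position where a and b differ."""
    if len(a) != len(b):
        raise ValueError("names must have the same length")
    return [(i, x ^ y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    for offset, mask in differing_bits(b"deprecated.txt", b"deprecated.sxt"):
        bits = bin(mask).count("1")
        print(f"byte {offset}: xor mask {mask:#04x}, {bits} bit(s) differ")
    # prints: byte 11: xor mask 0x07, 3 bit(s) differ
```

The names differ in one byte only ('t' is 0x74, 's' is 0x73), but in three bits of that byte, so a single flipped bit on its own would not turn one name into the other.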
In a letter from Friday, July 14, 2017 14:28:58 MSK user Qu Wenruo wrote:
>
> On 2017-07-14 18:12, Filippe LeMarchand wrote:
> > The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:
> >
> > $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
> > rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> >
> > $ sudo rm -rf /usr/share/doc/packages/util-linux/
> > rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
> >
> > $ sudo ls -l /usr/share/doc/packages/util-linux/
> > ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> > total 0
> > -????????? ? ? ? ? ? deprecated.txt
>
> Similar behavior is also detected using manually crafted image in our
> environment.
>
> Su Yue have sent patches to enhance error detection and test case for
> it, but repairing is not supported.
>
> >
> > Reinstalling the util-linux package gives me two copies of that file (two files are also present in the previous snapshot):
> >
> > $ ls -l /usr/share/doc/packages/util-linux/
> > total 104
> > -rw-r--r-- 1 root root 18092 Jul 20 2016 COPYING
> > -rw-r--r-- 1 root root 1391 Jul 20 2016 COPYING.BSD-3
> > -rw-r--r-- 1 root root 26530 Jul 20 2016 COPYING.LGPLv2.1
> > -rw-r--r-- 1 root root 1824 Jul 20 2016 COPYING.UCB
> > -rw-r--r-- 1 root root 555 Jul 20 2016 README.licensing
> > -rw-r--r-- 1 root root 3257 Jul 20 2016 blkid.txt
> > -rw-r--r-- 1 root root 2264 Jul 20 2016 cal.txt
> > -rw-r--r-- 1 root root 1913 Jul 20 2016 col.txt
> > -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
> > -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
> > -rw-r--r-- 1 root root 992 Jul 20 2016 getopt.txt
> > -rw-r--r-- 1 root root 2437 Nov 2 2016 howto-debug.txt
> > -rw-r--r-- 1 root root 148 Jul 20 2016 hwclock.txt
> > -rw-r--r-- 1 root root 2617 Jul 20 2016 modems-with-agetty.txt
> > -rw-r--r-- 1 root root 522 Jul 20 2016 mount.txt
> > -rw-r--r-- 1 root root 448 Jul 20 2016 pg.txt
> >
> > So, is this situation actually dangerous? And what can I do to gather more information for you?
>
> The situation won't be worse. I'd recommend not to take any snapshot of
> those subvolumes (4546 and 5134) to limit the corruption to those
> subvolumes.
>
> However there is also no easy way to fix it yet.
>
> Currently possible solution may be deleting the whole subvolume.
> If no further error happens, it may be fixed.
>
> IIRC btrfs check --repair in original mode has
> DIR_ITEM/DIR_INDEX/INODE_REF repair function, but I'm not sure if it can
> handle it well.
> Btrfs check --repair *MAY* fix it, or it may make things worse.
> If you have full backup, then you could try it.
> Otherwise, don't try it at all.
>
> Other solution includes a specific repair program just for your case.
> We can modify btrfs-corrupt-block to just delete the corrupted DIR_ITEM
> (".sxt" one) and related DIR_INDEX/INODE_REF.
> But I'll only choose this if you really need to fix it as soon as possible.
>
> At least we have solution for it.
> I'm more concerned about how this happened.
>
> Any idea about the reproducer? Or just random memory corruption?
>
> Thanks,
> Qu
> >
> > In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote:
> >> Thanks for your dump.
> >>
> >> We're clear what is the direct cause of the problem.
> >>
> >> It's one corrupted DIR_ITEM causing the problem.
> >> And further more, original mode btrfs check can't detect it, and we will
> >> fix it soon.
> >>
> >> The corrupted DIR_ITEM is as the following:
> >> item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
> >> location key (4222342 INODE_ITEM 0) type FILE
> >> transid 170929 data_len 0 name_len 14
> >> name: deprecated.sxt
> >> location key (13590433 INODE_ITEM 0) type FILE
> >> transid 796448 data_len 0 name_len 14
> >> name: deprecated.txt
> >>
> >> For dir inode 79177, it has 2 child inodes, with name "deprecated.sxt"
> >> (ino=4222342) and "deprecated.txt" (ino=13590433)
> >>
> >> But something goes wrong here:
> >>
> >> 1) Hash of "deprecated.sxt" doesn't match 54846528
> >>
> >> 2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt"
> >> Also captured by dump:
> >> item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24
> >> inode ref index 417 namelen 14 name: deprecated.txt
> >>
> >> 3) DIR_INDEX also shows that filename for inode 4222342 should be
> >> "deprecated.txt"
> >> item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44
> >> location key (4222342 INODE_ITEM 0) type FILE
> >> transid 170929 data_len 0 name_len 14
> >> name: deprecated.txt
> >>
> >> So generally speaking, the DIR_ITEM is wrong and is causing the problem.
> >>
> >> But the root reason is still unknown.
> >>
> >> What I can see is, the corrupted DIR_ITEM points to a very old inode,
> >> whose mtime dates back to 2016-09-07.
> >> While the good DIR_ITEM points to a newer inode, whose mtime is just
> >> 2017-05-02.
> >>
> >> But even weirder, there should not be two child inodes with the same
> >> filename ("deprecated.txt"; I assume the sxt one is caused by a memory
> >> bit corruption).
> >>
> >> So, any details on the operation with util-linux/deprecated.txt will
> >> help us to locate the root cause in kernel.
> >>
> >> Thanks,
> >> Qu
> >>
> >>
> >> On 2017-07-12 21:11, Filippe LeMarchand wrote:
> >>> Done, files added to same GDrive folder with corresponding names.
> >>> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.
> >>>
> >>> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
> >>>>
> >>>> On 2017-07-12 19:12, Filippe LeMarchand wrote:
> >>>>>> Maybe something went wrong with grep, which skipped "(79177"?
> >>>>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
> >>>>
> >>>> It looks much better, thanks.
> >>>>
> >>>>>
> >>>>> And btrfs check --mode=lowmem gives this:
> >>>>>
> >>>>> checking extents
> >>>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
> >>>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
> >>>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
> >>>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
> >>>>> ERROR: errors found in extent allocation tree or chunk allocation
> >>>>
> >>>> This looks very much like an exposed lowmem mode bug.
> >>>> Feel free to ignore these errors from the extent tree; they are just false
> >>>> alerts.
> >>>>
> >>>>> checking free space cache
> >>>>> checking fs roots
> >>>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>>>
> >>>> The error report is much better than original mode, and that's what I need.
> >>>>
> >>>> Now I can wipe out all other noise as we know exactly which tree and
> >>>> which DIR_ITEM/INODE_REF is causing the problem.
> >>>>
> >>>> Would you please update the dump result with "-t 4546" passed to
> >>>> btrfs-debug-tree like:
> >>>>
> >>>> # btrfs-debug-tree -t 4546 <device>| grep 79177
> >>>>
> >>>> Only "-t 4546" is added, to only dump the result of subvolume 4546.
> >>>> As always, all 3 grep results (2 "deprecated" and one 79177) need to be
> >>>> updated.
> >>>>
> >>>> And it seems that my previous assumption is still right for this case.
> >>>> If it's caused by kernel, your dump would definitely help us to locate
> >>>> the problem.
> >>>>
> >>>>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
> >>>>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>>>
> >>>> Also for root 5134 please.
> >>>>
> >>>> Thanks,
> >>>> Qu
> >>>>
> >>>>> ERROR: errors found in fs roots
> >>>>> Checking filesystem on /dev/sda2
> >>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>> found 153429872640 bytes used, error(s) found
> >>>>> total csum bytes: 121991672
> >>>>> total tree bytes: 1940160512
> >>>>> total fs tree bytes: 1683767296
> >>>>> total extent tree bytes: 103841792
> >>>>> btree space waste bytes: 310722480
> >>>>> file data blocks allocated: 842455031808
> >>>>> referenced 159286636544
> >>>>>
> >>>>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
> >>>>>> Sorry for the late reply.
> >>>>>>
> >>>>>> After investigating the dumps, I found the output is quite strange.
> >>>>>>
> >>>>>> 1) Mismatching output.
> >>>>>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for
> >>>>>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not
> >>>>>> here at all.
> >>>>>>
> >>>>>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is the expected
> >>>>>> 79177 DIR_ITEM/DIR_INDEX.
> >>>>>>
> >>>>>> Maybe something went wrong with grep, which skipped "(79177"?
> >>>>>>
> >>>>>> 2) Mismatched hash
> >>>>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
> >>>>>> number 54846528 is the hash(crc32c) of filename, and it contains 2
> >>>>>> items, one for "deprecated.txt" and one for "deprecated.sxt".
> >>>>>>
> >>>>>> But we found that 54846528 only matches the hash for "deprecated.txt",
> >>>>>> not "deprecated.sxt".
> >>>>>>
> >>>>>> I think that's the main problem.
> >>>>>>
> >>>>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
> >>>>>> mode reports similar (well, output may differ) error?
> >>>>>>
> >>>>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure
> >>>>>> that's the problem.
> >>>>>>
> >>>>>> However it may take some time before we can fix it in repair mode.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Qu
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 2017-07-04 21:24, Filippe LeMarchand wrote:
> >>>>>>> Sure, here it is:
> >>>>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
> >>>>>>>
> >>>>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
> >>>>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
> >>>>>>>>>> Hello everyone.
> >>>>>>>>>>
> >>>>>>>>>> I have a btrfs root partition on an Intel 530 SSD, which mounts without errors and seems to work fine,
> >>>>>>>>>> but `btrfs check` gives me the following output (and --repair doesn't remove the errors):
> >>>>>>>>>>
> >>>>>>>>>> enabling repair mode
> >>>>>>>>>> Checking filesystem on /dev/sda2
> >>>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>>>>>>> checking extents
> >>>>>>>>>> Fixed 0 roots.
> >>>>>>>>>> checking free space cache
> >>>>>>>>>> cache and super generation don't match, space cache will be invalidated
> >>>>>>>>>> checking fs roots
> >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>
> >>>>>>>>> This means that the dir whose inode number is 79177 has a child inode
> >>>>>>>>> pointer pointing to deprecated.sxt.
> >>>>>>>>>
> >>>>>>>>> But it doesn't have a dir index or a corresponding inode ref, which breaks
> >>>>>>>>> the cross reference rule of btrfs.
> >>>>>>>>>
> >>>>>>>>> Would you please run the following command to dump needed info for us to
> >>>>>>>>> debug?
> >>>>>>>>>
> >>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
> >>>>>>>>>
> >>>>>>>>> and
> >>>>>>>>>
> >>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
> >>>>>>>>>
> >>>>>>>>> and
> >>>>>>>>>
> >>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Considering the output has both .txt and .sxt, I think that's the problem.
> >>>>>>>>> But such bit-flip should be detected by tree block csum.
> >>>>>>>>> I'm not sure what's wrong with it.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Qu
> >>>>>>>>>
> >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>> checking csums
> >>>>>>>>>> checking root refs
> >>>>>>>>>> found 23421812736 bytes used err is 0
> >>>>>>>>>> total csum bytes: 21531608
> >>>>>>>>>> total tree bytes: 776650752
> >>>>>>>>>> total fs tree bytes: 711278592
> >>>>>>>>>> total extent tree bytes: 36798464
> >>>>>>>>>> btree space waste bytes: 116002036
> >>>>>>>>>> file data blocks allocated: 850546470912
> >>>>>>>>>> referenced 27611987968
> >>>>>>>>>>
> >>>>>>>>>> Is it dangerous and what should I do about it?
> >>>>>>>>>>
> >>>>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I'm afraid that your mail may be rejected because the attachment size
> >>>>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you
> >>>>>>>> share the attachment by google drive?
> >>>>>>>>
> >>>>>>>> Lastly, while Qu's schedule is too tight, I will assist you on this issue.
> >>>>>>>>
>
* Re: Btrfs check reports errors, filesystem seems fine
2017-07-14 12:04 ` Filippe LeMarchand
@ 2017-07-14 12:11 ` Qu Wenruo
2017-07-14 12:26 ` Filippe LeMarchand
0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2017-07-14 12:11 UTC (permalink / raw)
To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
On 2017-07-14 20:04, Filippe LeMarchand wrote:
>> Currently possible solution may be deleting the whole subvolume.
> Can btrfs send (to external drive) and then btrfs receive back fix it? Or should I use simple cp/rsync?
You could try it if you have a backup.
Personally, I'm not sure whether it will work or make things worse.
Such a hash and name mismatch is really rare, and I don't know how kernel send
will handle it.
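
For anyone who wants to check the hash mismatch by hand: the offset in key
(79177 DIR_ITEM 54846528) is the crc32c hash of the filename. The Python
sketch below assumes the kernel seeds that crc32c with (u32)~1 and applies no
final inversion; that seed is an assumption here, not something confirmed in
this thread. The helper names crc32c_raw and btrfs_name_hash are made up for
illustration and are not btrfs-progs functions.

```python
# Sketch only: bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
# ASSUMPTION: directory entry names are hashed with seed 0xFFFFFFFE ((u32)~1)
# and without the usual final XOR.


def crc32c_raw(seed: int, data: bytes) -> int:
    """CRC-32C over data, starting from seed, without the final inversion."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc


def btrfs_name_hash(name: bytes) -> int:
    """Hypothetical helper: the hash used as the DIR_ITEM key offset."""
    return crc32c_raw(0xFFFFFFFE, name)


if __name__ == "__main__":
    key_offset = 54846528  # from key (79177 DIR_ITEM 54846528) in the dump
    for name in (b"deprecated.txt", b"deprecated.sxt"):
        h = btrfs_name_hash(name)
        verdict = "matches" if h == key_offset else "does NOT match"
        print(f"{name.decode()}: hash {h} {verdict} key offset {key_offset}")
```

If the seed assumption holds, only one of the two names can match the key
offset, because both entries are stored under the same key but their hashes
necessarily differ.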
>
>> If you have full backup, then you could try it.
> It is my root subvolume (sensitive data is on other ones), thus it is expendable. Can btrfs check --repair damage other subvolumes?
Unfortunately, it may corrupt other subvolumes.
But judging from your fsck output, the possibility of corruption is not that
high AFAIK.
I recommend backing up the other good subvolumes/snapshots using send and
receive, just in case.
>
>> Any idea about the reproducer? Or just random memory corruption?
> No idea why and no idea when. This partition is about a year and a half old, and I ran btrfs check for the first time just about a month ago.
> Also I ran memtest recently and it didn't find any errors.
Well, that's common.
I'll focus on checking your dump result to make a special-purpose
btrfs-corrupt-block to fix your situation if no other method works for you.
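
As a small data point for the memory corruption theory, the two names can be
compared byte by byte to see how many bits actually differ. The sketch below
is only an illustration (differing_bits is a made-up helper), not a diagnosis.

```python
# Sketch only: list the bytes, and the bits within them, where two names differ.
from typing import List, Tuple


def differing_bits(a: bytes, b: bytes) -> List[Tuple[int, int]]:
    """Return (byte offset, XOR mask) for every position where a and b differ."""
    if len(a) != len(b):
        raise ValueError("names must have the same length")
    return [(i, x ^ y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    for offset, mask in differing_bits(b"deprecated.txt", b"deprecated.sxt"):
        flipped = bin(mask).count("1")
        print(f"byte {offset}: mask 0x{mask:02x}, {flipped} bit(s) differ")
```

't' is 0x74 and 's' is 0x73, so the names differ in the three low bits of a
single byte rather than in one bit. That does not rule out memory corruption,
but it is not a simple one-bit flip.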
Thanks,
Qu
* Re: Btrfs check reports errors, filesystem seems fine
2017-07-14 12:11 ` Qu Wenruo
@ 2017-07-14 12:26 ` Filippe LeMarchand
2017-07-14 12:41 ` Qu Wenruo
0 siblings, 1 reply; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-14 12:26 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
So, my options are:
a) Delete and re-create the subvolume
b) Try btrfs check --repair --mode original (if original mode is the default, it already didn't help)
c) Do nothing and wait for a further update
?
In a letter from Friday, July 14, 2017 15:11:05 MSK user Qu Wenruo wrote:
>
> On 2017-07-14 20:04, Filippe LeMarchand wrote:
> >> Currently possible solution may be deleting the whole subvolume.
> > Can btrfs send (to external drive) and then btrfs receive back fix it? Or should I use simple cp/rsync?
>
> You could try it if you have a backup.
>
> Personally, I'm not sure whether it will work or make things worse.
> Such a hash and name mismatch is really rare, and I don't know how kernel send
> will handle it.
>
> >
> >> If you have full backup, then you could try it.
> > It is my root subvolume (sensitive data is on other ones), thus it is expendable. Can btrfs check --repair damage other subvolumes?
>
> Unfortunately, it may corrupt other subvolumes.
> But judging from your fsck output, the possibility of corruption is not that
> high AFAIK.
>
> I recommend backing up the other good subvolumes/snapshots using send and
> receive, just in case.
>
> >
> >> Any idea about the reproducer? Or just random memory corruption?
> > No idea why and no idea when. This partition is about a year and a half old, and I ran btrfs check for the first time just about a month ago.
> > Also I ran memtest recently and it didn't find any errors.
>
> Well, that's common.
> I'll focus on checking your dump result to make a special-purpose
> btrfs-corrupt-block to fix your situation if no other method works for you.
>
> Thanks,
> Qu
>
> >
> > In a letter from Friday, July 14, 2017 14:28:58 MSK user Qu Wenruo wrote:
> >>
> >> On 2017-07-14 18:12, Filippe LeMarchand wrote:
> >>> The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:
> >>>
> >>> $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
> >>> rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> >>>
> >>> $ sudo rm -rf /usr/share/doc/packages/util-linux/
> >>> rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
> >>>
> >>> $ sudo ls -l /usr/share/doc/packages/util-linux/
> >>> ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> >>> total 0
> >>> -????????? ? ? ? ? ? deprecated.txt
> >>
> >> Similar behavior is also detected using manually crafted image in our
> >> environment.
> >>
> >> Su Yue has sent patches to enhance error detection and add a test case for
> >> it, but repairing is not supported.
> >>
> >>>
> >>> Reinstall of util-linux package gives me two of that file (and also two files present on previous snapshot):
> >>>
> >>> $ ls -l /usr/share/doc/packages/util-linux/
> >>> total 104
> >>> -rw-r--r-- 1 root root 18092 Jul 20 2016 COPYING
> >>> -rw-r--r-- 1 root root 1391 Jul 20 2016 COPYING.BSD-3
> >>> -rw-r--r-- 1 root root 26530 Jul 20 2016 COPYING.LGPLv2.1
> >>> -rw-r--r-- 1 root root 1824 Jul 20 2016 COPYING.UCB
> >>> -rw-r--r-- 1 root root 555 Jul 20 2016 README.licensing
> >>> -rw-r--r-- 1 root root 3257 Jul 20 2016 blkid.txt
> >>> -rw-r--r-- 1 root root 2264 Jul 20 2016 cal.txt
> >>> -rw-r--r-- 1 root root 1913 Jul 20 2016 col.txt
> >>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
> >>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
> >>> -rw-r--r-- 1 root root 992 Jul 20 2016 getopt.txt
> >>> -rw-r--r-- 1 root root 2437 Nov 2 2016 howto-debug.txt
> >>> -rw-r--r-- 1 root root 148 Jul 20 2016 hwclock.txt
> >>> -rw-r--r-- 1 root root 2617 Jul 20 2016 modems-with-agetty.txt
> >>> -rw-r--r-- 1 root root 522 Jul 20 2016 mount.txt
> >>> -rw-r--r-- 1 root root 448 Jul 20 2016 pg.txt
> >>>
> >>> So, is this situation actually dangerous? And what can I do to gather more information for you?
> >>
> >> The situation won't get worse. I'd recommend not taking any snapshots of
> >> those subvolumes (4546 and 5134), to limit the corruption to those
> >> subvolumes.
> >>
> >> However, there is also no easy way to fix it yet.
> >>
> >> Currently, a possible solution may be deleting the whole subvolume.
> >> If no further error happens, it may be fixed.
> >>
> >> IIRC btrfs check --repair in original mode has
> >> DIR_ITEM/DIR_INDEX/INODE_REF repair function, but I'm not sure if it can
> >> handle it well.
> >> Btrfs check --repair *MAY* fix it, or it may make things worse.
> >> If you have full backup, then you could try it.
> >> Otherwise, don't try it at all.
> >>
> >> Another solution is a specific repair program just for your case.
> >> We can modify btrfs-corrupt-block to just delete the corrupted DIR_ITEM
> >> (".sxt" one) and related DIR_INDEX/INODE_REF.
> >> But I'll only choose this if you really need to fix it as soon as possible.
> >>
> >> At least we have a solution for it.
> >> I'm more concerned about how this happened.
> >>
> >> Any idea about the reproducer? Or just random memory corruption?
> >>
> >> Thanks,
> >> Qu
> >>>
> >>> In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote:
> >>>> Thanks for your dump.
> >>>>
> >>>> We now know the direct cause of the problem.
> >>>>
> >>>> It's one corrupted DIR_ITEM causing the problem.
> >>>> Furthermore, original mode btrfs check can't detect it, and we will
> >>>> fix that soon.
> >>>>
> >>>> The corrupted DIR_ITEM is as the following:
> >>>> item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
> >>>> location key (4222342 INODE_ITEM 0) type FILE
> >>>> transid 170929 data_len 0 name_len 14
> >>>> name: deprecated.sxt
> >>>> location key (13590433 INODE_ITEM 0) type FILE
> >>>> transid 796448 data_len 0 name_len 14
> >>>> name: deprecated.txt
> >>>>
> >>>> Dir inode 79177 has 2 child inodes in this item, named "deprecated.sxt"
> >>>> (ino=4222342) and "deprecated.txt" (ino=13590433).
> >>>>
> >>>> But something goes wrong here:
> >>>>
> >>>> 1) Hash of "deprecated.sxt" doesn't match 54846528
> >>>>
> >>>> 2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt"
> >>>> Also captured by dump:
> >>>> item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24
> >>>> inode ref index 417 namelen 14 name: deprecated.txt
> >>>>
> >>>> 3) DIR_INDEX also shows that filename for inode 4222342 should be
> >>>> "deprecated.txt"
> >>>> item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44
> >>>> location key (4222342 INODE_ITEM 0) type FILE
> >>>> transid 170929 data_len 0 name_len 14
> >>>> name: deprecated.txt
> >>>>
> >>>> So, generally speaking, the DIR_ITEM is wrong and is causing the problem.
> >>>>
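To make the three observations above concrete, here is a minimal Python sketch of the cross-reference rule, using only the values from the dump quoted in this thread. The data layout and the check_inode() helper are made up for illustration and are not how btrfs check is implemented. The newer inode 13590433 is left out of the check because its DIR_INDEX and INODE_REF are not shown in the excerpt.

```python
from typing import NamedTuple


class DirEntry(NamedTuple):
    name: str
    ino: int


# Values copied from the dump of dir inode 79177 quoted in this thread.
# DIR_ITEM (79177 DIR_ITEM 54846528) holds two entries:
dir_item_entries = [
    DirEntry("deprecated.sxt", 4222342),
    DirEntry("deprecated.txt", 13590433),
]
# DIR_INDEX (79177 DIR_INDEX 417):
dir_index = {417: DirEntry("deprecated.txt", 4222342)}
# INODE_REF (4222342 INODE_REF 79177): inode -> (index, name)
inode_refs = {4222342: (417, "deprecated.txt")}


def check_inode(ino: int) -> list:
    """Hypothetical helper: report cross-reference mismatches for one inode."""
    problems = []
    ref = inode_refs.get(ino)
    # Every DIR_ITEM entry needs a matching INODE_REF and DIR_INDEX.
    for entry in dir_item_entries:
        if entry.ino != ino:
            continue
        if ref is None or ref[1] != entry.name:
            problems.append(f"{entry.name}: no inode ref")
        if entry not in dir_index.values():
            problems.append(f"{entry.name}: no dir index")
    # Every DIR_INDEX entry needs a matching DIR_ITEM entry.
    for index, entry in dir_index.items():
        if entry.ino == ino and entry not in dir_item_entries:
            problems.append(f"{entry.name} (index {index}): no dir item")
    return problems


if __name__ == "__main__":
    for problem in check_inode(4222342):
        print(problem)
```

The three findings for inode 4222342 line up with the "no dir index, no inode ref" and "no dir item" messages from the original btrfs check output.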
> >>>> But the root cause is still unknown.
> >>>>
> >>>> What I can see is that the corrupted DIR_ITEM points to a very old inode,
> >>>> whose mtime goes back to 2016-09-07,
> >>>> while the good DIR_ITEM points to a newer inode, whose mtime is just
> >>>> 2017-05-02.
> >>>>
> >>>> Even weirder, there should not be two child inodes with the same
> >>>> filename ("deprecated.txt"; I assume the sxt one is caused by memory
> >>>> bit corruption).
> >>>>
> >>>> So, any details on the operation with util-linux/deprecated.txt will
> >>>> help us to locate the root cause in kernel.
> >>>>
> >>>> Thanks,
> >>>> Qu
> >>>>
> >>>>
> >>>> On 2017-07-12 21:11, Filippe LeMarchand wrote:
> >>>>> Done, files added to same GDrive folder with corresponding names.
> >>>>> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.
> >>>>>
> >>>>> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
> >>>>>>
> >>>>>> On 2017-07-12 19:12, Filippe LeMarchand wrote:
> >>>>>>>> Maybe something went wrong with grep, which skipped "(79177"?
> >>>>>>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
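For anyone repeating this filtering step, the following Python sketch shows what the updated pattern selects. The first three sample lines are copied from the dumps in this thread; the last one is a hypothetical line added only to show that a longer number containing 79177 is not picked up.

```python
import re

# Same alternation as grep -E "\(79177| 79177": match 79177 either right
# after an opening parenthesis (objectid) or after a space (key offset).
PATTERN = re.compile(r"\(79177| 79177")

sample_lines = [
    "item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88",
    "item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24",
    "location key (4222342 INODE_ITEM 0) type FILE",
    # Hypothetical line, not from the real dump:
    "item 5 key (179177 INODE_ITEM 0) itemoff 100 itemsize 160",
]

matches = [line for line in sample_lines if PATTERN.search(line)]

if __name__ == "__main__":
    for line in matches:
        print(line)
```

Only the DIR_ITEM line and the INODE_REF line are kept, which are the two kinds of keys asked for.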
> >>>>>>
> >>>>>> It looks much better, thanks.
> >>>>>>
> >>>>>>>
> >>>>>>> And btrfs check --mode=lowmem gives this:
> >>>>>>>
> >>>>>>> checking extents
> >>>>>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
> >>>>>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
> >>>>>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
> >>>>>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
> >>>>>>> ERROR: errors found in extent allocation tree or chunk allocation
> >>>>>>
> >>>>>> Looks very much like an exposed lowmem mode bug.
> >>>>>> Feel free to ignore these errors from the extent tree; they are just false
> >>>>>> alerts.
> >>>>>>
> >>>>>>> checking free space cache
> >>>>>>> checking fs roots
> >>>>>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>>>>>
> >>>>>> The error report is much better than original mode, and that's what I need.
> >>>>>>
> >>>>>> Now I can wipe out all other noise as we know exactly which tree and
> >>>>>> which DIR_ITEM/INODE_REF is causing the problem.
> >>>>>>
> >>>>>> Would you please update the dump result with "-t 4546" passed to
> >>>>>> btrfs-debug-tree like:
> >>>>>>
> >>>>>> # btrfs-debug-tree -t 4546 <device>| grep 79177
> >>>>>>
> >>>>>> Only "-t 4546" is added, to only dump the result of subvolume 4546.
> >>>>>> As always, all 3 grep results (2 "deprecated" and one 79177) need to be
> >>>>>> updated.
> >>>>>>
> >>>>>> And it seems that my previous assumption is still right for this case.
> >>>>>> If it's caused by kernel, your dump would definitely help us to locate
> >>>>>> the problem.
> >>>>>>
> >>>>>>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
> >>>>>>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>>>>>
> >>>>>> Also for root 5134 please.
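As a small aid for this step, here is a Python sketch that pulls the affected root (subvolume) IDs out of the lowmem error lines and prints one dump command per root. The error lines are copied from the output in this thread; the affected_roots() helper is a made-up name, and the device is left as a placeholder.

```python
import re

# Error lines copied from the "btrfs check --mode=lowmem" output above.
check_output = """\
ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
ERROR: errors found in fs roots
"""


def affected_roots(text: str) -> list:
    """Hypothetical helper: collect the root IDs named in ERROR lines."""
    found = re.finditer(r"^ERROR: root (\d+) ", text, re.MULTILINE)
    return sorted({int(match.group(1)) for match in found})


if __name__ == "__main__":
    for root in affected_roots(check_output):
        # <device> is a placeholder, as in the command quoted above.
        print(f"btrfs-debug-tree -t {root} <device> | grep 79177")
```

For the output above this yields roots 4546 and 5134, the two subvolumes discussed in this thread.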
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Qu
> >>>>>>
> >>>>>>> ERROR: errors found in fs roots
> >>>>>>> Checking filesystem on /dev/sda2
> >>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>>>> found 153429872640 bytes used, error(s) found
> >>>>>>> total csum bytes: 121991672
> >>>>>>> total tree bytes: 1940160512
> >>>>>>> total fs tree bytes: 1683767296
> >>>>>>> total extent tree bytes: 103841792
> >>>>>>> btree space waste bytes: 310722480
> >>>>>>> file data blocks allocated: 842455031808
> >>>>>>> referenced 159286636544
> >>>>>>>
> >>>>>>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
> >>>>>>>> Sorry for the late reply.
> >>>>>>>>
> >>>>>>>> After investigating the dumps, I found the output is quite strange.
> >>>>>>>>
> >>>>>>>> 1) Mismatching output.
> >>>>>>>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for
> >>>>>>>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not
> >>>>>>>> here at all.
> >>>>>>>>
> >>>>>>>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is the expected
> >>>>>>>> 79177 DIR_ITEM/DIR_INDEX.
> >>>>>>>>
> >>>>>>>> Maybe something went wrong with grep, which skipped "(79177"?
> >>>>>>>>
> >>>>>>>> 2) Mismatched hash
> >>>>>>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
> >>>>>>>> number 54846528 is the hash(crc32c) of filename, and it contains 2
> >>>>>>>> items, one for "deprecated.txt" and one for "deprecated.sxt".
> >>>>>>>>
> >>>>>>>> But we found that 54846528 only matches the hash for "deprecated.txt",
> >>>>>>>> not "deprecated.sxt".
> >>>>>>>>
> >>>>>>>> I think that's the main problem.
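The hash comparison described above can be reproduced with a short Python sketch. It assumes, from my recollection of btrfs_name_hash(), that the DIR_ITEM key offset is a CRC-32C of the name seeded with ~1 (0xFFFFFFFE) and without the final bit inversion; please treat the seed and the missing final XOR as assumptions and compare the printed values against 54846528 yourself.

```python
# Reflected form of the CRC-32C (Castagnoli) polynomial.
CRC32C_POLY_REFLECTED = 0x82F63B78


def crc32c_raw(seed: int, data: bytes) -> int:
    """Bitwise CRC-32C update with no initial or final inversion."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED
            else:
                crc >>= 1
    return crc


def btrfs_name_hash(name: bytes) -> int:
    # Assumption: seed is (u32)~1 and the result is used as-is.
    return crc32c_raw(0xFFFFFFFE, name)


if __name__ == "__main__":
    for name in (b"deprecated.txt", b"deprecated.sxt"):
        print(name.decode(), btrfs_name_hash(name))
```

Whatever the exact values are, the two names cannot hash to the same offset: they have the same length and differ only within a single byte, and a CRC always detects such a short difference. So at most one of them belongs under key offset 54846528.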
> >>>>>>>>
> >>>>>>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
> >>>>>>>> mode reports similar (well, output may differ) error?
> >>>>>>>>
> >>>>>>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure
> >>>>>>>> that's the problem.
> >>>>>>>>
> >>>>>>>> However it may take some time before we can fix it in repair mode.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Qu
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2017-07-04 21:24, Filippe LeMarchand wrote:
> >>>>>>>>> Sure, here it is:
> >>>>>>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
> >>>>>>>>>
> >>>>>>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
> >>>>>>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
> >>>>>>>>>>>> Hello everyone.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I have a btrfs root partition on an Intel 530 SSD, which mounts without errors and seems to work fine,
> >>>>>>>>>>>> but `btrfs check` gives me the following output (and --repair doesn't remove errors):
> >>>>>>>>>>>>
> >>>>>>>>>>>> enabling repair mode
> >>>>>>>>>>>> Checking filesystem on /dev/sda2
> >>>>>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>>>>>>>>> checking extents
> >>>>>>>>>>>> Fixed 0 roots.
> >>>>>>>>>>>> checking free space cache
> >>>>>>>>>>>> cache and super generation don't match, space cache will be invalidated
> >>>>>>>>>>>> checking fs roots
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>
> >>>>>>>>>>> This means that the dir whose inode number is 79177 has a child inode
> >>>>>>>>>>> pointer pointing to deprecated.sxt.
> >>>>>>>>>>>
> >>>>>>>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking
> >>>>>>>>>>> the cross reference rule of btrfs.
> >>>>>>>>>>>
> >>>>>>>>>>> Would you please run the following command to dump needed info for us to
> >>>>>>>>>>> debug?
> >>>>>>>>>>>
> >>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
> >>>>>>>>>>>
> >>>>>>>>>>> and
> >>>>>>>>>>>
> >>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
> >>>>>>>>>>>
> >>>>>>>>>>> and
> >>>>>>>>>>>
> >>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Considering the output has both .txt and .sxt, I think that's the problem.
> >>>>>>>>>>> But such bit-flip should be detected by tree block csum.
> >>>>>>>>>>> I'm not sure what's wrong with it.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Qu
> >>>>>>>>>>>
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> checking csums
> >>>>>>>>>>>> checking root refs
> >>>>>>>>>>>> found 23421812736 bytes used err is 0
> >>>>>>>>>>>> total csum bytes: 21531608
> >>>>>>>>>>>> total tree bytes: 776650752
> >>>>>>>>>>>> total fs tree bytes: 711278592
> >>>>>>>>>>>> total extent tree bytes: 36798464
> >>>>>>>>>>>> btree space waste bytes: 116002036
> >>>>>>>>>>>> file data blocks allocated: 850546470912
> >>>>>>>>>>>> referenced 27611987968
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is it dangerous and what should I do about it?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> >>>>>>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>>>>>>>>
> >>>>>>>>>> I'm afraid that your mail may be rejected because the attachment size
> >>>>>>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you
> >>>>>>>>>> share the attachment by google drive?
> >>>>>>>>>>
> >>>>>>>>>> Lastly, since Qu's schedule is tight, I will assist you on this issue.
> >>>>>>>>>>
>
* Re: Btrfs check reports errors, filesystem seems fine
2017-07-14 12:26 ` Filippe LeMarchand
@ 2017-07-14 12:41 ` Qu Wenruo
2017-07-14 12:45 ` Filippe LeMarchand
0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2017-07-14 12:41 UTC (permalink / raw)
To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
On 2017-07-14 20:26, Filippe LeMarchand wrote:
> So, my options are
> a) Delete and re-create subvolume
> b) Try btrfs check --repair --mode original (if original mode is default, it already didn't help)
Then --repair doesn't help for now.
> c) Do nothing and wait for further update
The plan for further updates includes:
c) Update btrfs check --repair to handle your case.
This will take some time for us to test and for others to review.
d) Create a special-purpose btrfs-corrupt-block patch for your image.
This will fix your fs, but only your fs.
Not a generic solution, but at least it should work.
For now, it's recommended to back up important data, in case both c) and d)
fail.
Thanks,
Qu
> ?
>
> In a letter from Friday, July 14, 2017 15:11:05 MSK user Qu Wenruo wrote:
>>
>> On 2017-07-14 20:04, Filippe LeMarchand wrote:
>>>> Currently possible solution may be deleting the whole subvolume.
>>> Can btrfs send (to external drive) and then btrfs receive back fix it? Or should I use simple cp/rsync?
>>
>> You could try it if you have a backup.
>>
>> Personally speaking, I'm not sure if it will work or make things worse.
>> Such a hash and name mismatch is really rare; I don't know how kernel send
>> will handle it.
>>
>>>
>>>> If you have full backup, then you could try it.
>>> It is my root subvolume (sensitive data is on other ones), thus it is expendable. Can btrfs check --repair damage other subvolumes?
>>
>> Unfortunately, it may corrupt other subvolumes.
>> But from your fsck output, the possibility of corruption is not that
>> high AFAIK.
>>
>> I recommend backing up the other good subvolumes/snapshots using send and
>> receive, just in case.
>>
>>>
>>>> Any idea about the reproducer? Or just random memory corruption?
>>> No idea why and no idea when. This partition is about a year and a half old, and I did btrfs check for the first time just about a month ago.
>>> Also I ran memtest recently and it didn't find any errors.
>>
>> Well, that's common.
>> I'll focus on checking your dump result to make a special-purpose
>> btrfs-corrupt-block to fix your situation if no other method works for you.
>>
>> Thanks,
>> Qu
>>
* Re: Btrfs check reports errors, filesystem seems fine
2017-07-14 12:41 ` Qu Wenruo
@ 2017-07-14 12:45 ` Filippe LeMarchand
0 siblings, 0 replies; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-14 12:45 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo
[-- Attachment #1: Type: text/plain, Size: 19713 bytes --]
Ok then, many thanks.
In a letter from Friday, July 14, 2017 15:41:22 MSK user Qu Wenruo wrote:
>
> On 2017-07-14 20:26, Filippe LeMarchand wrote:
> > So, my options are
> > a) Delete and re-create the subvolume
> > b) Try btrfs check --repair --mode original (if original mode is default, it already didn't help)
>
> Then --repair doesn't help now.
>
> > c) Do nothing and wait for further update
>
> Further update plan includes:
> c) Update btrfs check --repair to handle your case.
> This will take some time for us to test and other guys to review.
>
> d) Create a special-purpose btrfs-corrupt-block patch for your image.
> This will fix your fs, but only for your fs.
> Not a generic solution, but at least it should work.
>
> For now, it's recommended to back up important data, in case both c) and d)
> fail.
>
> Thanks,
> Qu
> > ?
> >
> > In a letter from Friday, July 14, 2017 15:11:05 MSK user Qu Wenruo wrote:
> >>
> >> On 2017-07-14 20:04, Filippe LeMarchand wrote:
> >>>> Currently possible solution may be deleting the whole subvolume.
> >>> Can btrfs send (to external drive) and then btrfs receive back fix it? Or should I use simple cp/rsync?
> >>
> >> You could try it if you have a backup.
> >>
> >> Personally speaking, I'm not sure if it will work or make things worse.
> >> Such a hash and name mismatch is really rare, and I don't know how kernel
> >> send will handle it.
> >>
> >>>
> >>>> If you have full backup, then you could try it.
> >>> It is my root subvolume (sensitive data is on other ones), thus it is expendable. Can btrfs check --repair damage other subvolumes?
> >>
> >> Unfortunately, it may corrupt other subvolumes.
> >> But from your fsck output, the possibility of corruption is not that
> >> high AFAIK.
> >>
> >> I recommend backing up the other good subvolumes/snapshots using send and
> >> receive just in case.
> >>
> >>>
> >>>> Any idea about the reproducer? Or just random memory corruption?
> >>> No idea why and no idea when. This partition is about year and a half old, and I did btrfs check for the first time just about a month ago.
> >>> Also I ran memtest recently and it didn't find any errors.
> >>
> >> Well, that's common.
> >> I'll focus on checking your dump result to make a special-purpose
> >> btrfs-corrupt-block to fix your situation if no other method works for you.
> >>
> >> Thanks,
> >> Qu
> >>
> >>>
> >>> In a letter from Friday, July 14, 2017 14:28:58 MSK user Qu Wenruo wrote:
> >>>>
> >>>> On 2017-07-14 18:12, Filippe LeMarchand wrote:
> >>>>> The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:
> >>>>>
> >>>>> $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
> >>>>> rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> >>>>>
> >>>>> $ sudo rm -rf /usr/share/doc/packages/util-linux/
> >>>>> rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
> >>>>>
> >>>>> $ sudo ls -l /usr/share/doc/packages/util-linux/
> >>>>> ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> >>>>> total 0
> >>>>> -????????? ? ? ? ? ? deprecated.txt
> >>>>
> >>>> Similar behavior is also seen with a manually crafted image in our
> >>>> environment.
> >>>>
> >>>> Su Yue has sent patches to enhance error detection, along with a test
> >>>> case for it, but repairing is not supported.
> >>>>
> >>>>>
> >>>>> Reinstalling the util-linux package gives me two copies of that file (two files are also present in the previous snapshot):
> >>>>>
> >>>>> $ ls -l /usr/share/doc/packages/util-linux/
> >>>>> total 104
> >>>>> -rw-r--r-- 1 root root 18092 Jul 20 2016 COPYING
> >>>>> -rw-r--r-- 1 root root 1391 Jul 20 2016 COPYING.BSD-3
> >>>>> -rw-r--r-- 1 root root 26530 Jul 20 2016 COPYING.LGPLv2.1
> >>>>> -rw-r--r-- 1 root root 1824 Jul 20 2016 COPYING.UCB
> >>>>> -rw-r--r-- 1 root root 555 Jul 20 2016 README.licensing
> >>>>> -rw-r--r-- 1 root root 3257 Jul 20 2016 blkid.txt
> >>>>> -rw-r--r-- 1 root root 2264 Jul 20 2016 cal.txt
> >>>>> -rw-r--r-- 1 root root 1913 Jul 20 2016 col.txt
> >>>>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
> >>>>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
> >>>>> -rw-r--r-- 1 root root 992 Jul 20 2016 getopt.txt
> >>>>> -rw-r--r-- 1 root root 2437 Nov 2 2016 howto-debug.txt
> >>>>> -rw-r--r-- 1 root root 148 Jul 20 2016 hwclock.txt
> >>>>> -rw-r--r-- 1 root root 2617 Jul 20 2016 modems-with-agetty.txt
> >>>>> -rw-r--r-- 1 root root 522 Jul 20 2016 mount.txt
> >>>>> -rw-r--r-- 1 root root 448 Jul 20 2016 pg.txt
> >>>>>
> >>>>> So, is this situation actually dangerous? And what can I do to gather more information for you?
> >>>>
> >>>> The situation won't get worse. I'd recommend not taking any snapshots of
> >>>> those subvolumes (4546 and 5134), to limit the corruption to those
> >>>> subvolumes.
> >>>>
> >>>> However there is also no easy way to fix it yet.
> >>>>
> >>>> Currently possible solution may be deleting the whole subvolume.
> >>>> If no further error happens, it may be fixed.
> >>>>
> >>>> IIRC btrfs check --repair in original mode has
> >>>> DIR_ITEM/DIR_INDEX/INODE_REF repair function, but I'm not sure if it can
> >>>> handle it well.
> >>>> Btrfs check --repair *MAY* fix it, or it may make things worse.
> >>>> If you have full backup, then you could try it.
> >>>> Otherwise, don't try it at all.
> >>>>
> >>>> Another solution is a specific repair program just for your case.
> >>>> We can modify btrfs-corrupt-block to just delete the corrupted DIR_ITEM
> >>>> (the ".sxt" one) and the related DIR_INDEX/INODE_REF.
> >>>> But I'll only choose this if you really need to fix it as soon as possible.
> >>>>
> >>>> At least we have a solution for it.
> >>>> I'm more concerned about how this happened.
> >>>>
> >>>> Any idea about the reproducer? Or just random memory corruption?
> >>>>
> >>>> Thanks,
> >>>> Qu
> >>>>>
> >>>>> In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote:
> >>>>>> Thanks for your dump.
> >>>>>>
> >>>>>> The direct cause of the problem is now clear.
> >>>>>>
> >>>>>> One corrupted DIR_ITEM is causing the problem.
> >>>>>> Furthermore, original mode btrfs check can't detect it, and we will
> >>>>>> fix that soon.
> >>>>>>
> >>>>>> The corrupted DIR_ITEM is as the following:
> >>>>>> item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
> >>>>>> location key (4222342 INODE_ITEM 0) type FILE
> >>>>>> transid 170929 data_len 0 name_len 14
> >>>>>> name: deprecated.sxt
> >>>>>> location key (13590433 INODE_ITEM 0) type FILE
> >>>>>> transid 796448 data_len 0 name_len 14
> >>>>>> name: deprecated.txt
> >>>>>>
> >>>>>> Dir inode 79177 has 2 child inodes, named "deprecated.sxt"
> >>>>>> (ino=4222342) and "deprecated.txt" (ino=13590433)
> >>>>>>
> >>>>>> But something goes wrong here:
> >>>>>>
> >>>>>> 1) Hash of "deprecated.sxt" doesn't match 54846528
> >>>>>>
> >>>>>> 2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt"
> >>>>>> Also captured by dump:
> >>>>>> item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24
> >>>>>> inode ref index 417 namelen 14 name: deprecated.txt
> >>>>>>
> >>>>>> 3) DIR_INDEX also shows that filename for inode 4222342 should be
> >>>>>> "deprecated.txt"
> >>>>>> item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44
> >>>>>> location key (4222342 INODE_ITEM 0) type FILE
> >>>>>> transid 170929 data_len 0 name_len 14
> >>>>>> name: deprecated.txt
> >>>>>>
> >>>>>> So generally speaking, the wrong DIR_ITEM is causing the problem.
> >>>>>>
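The cross-check described in points 1) to 3) can be replayed mechanically. What follows is only a sketch: the record values are copied from the dump quoted above, while the `DirEntry` and `InodeRef` types and the `cross_check` helper are hypothetical names made up for illustration and are not btrfs-progs code. Only the records for inode 4222342 are listed, because the INODE_REF and DIR_INDEX of inode 13590433 are not quoted in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirEntry:
    """One name entry of a DIR_ITEM or DIR_INDEX item (hypothetical type)."""
    kind: str        # "DIR_ITEM" or "DIR_INDEX"
    dir_ino: int     # objectid of the parent directory
    child_ino: int   # inode number stored in the location key
    name: str


@dataclass(frozen=True)
class InodeRef:
    """One INODE_REF back reference (hypothetical type)."""
    child_ino: int
    dir_ino: int
    index: int
    name: str


def cross_check(entries, refs):
    """Return a list of problems found between directory entries and back refs.

    Each directory entry needs an INODE_REF with the same (child, dir, name),
    and each INODE_REF needs both a DIR_ITEM and a DIR_INDEX entry.
    """
    problems = []
    ref_keys = {(r.child_ino, r.dir_ino, r.name) for r in refs}
    for e in entries:
        if (e.child_ino, e.dir_ino, e.name) not in ref_keys:
            problems.append(
                f"{e.kind} {e.name!r} (ino {e.child_ino}): no matching INODE_REF")
    for r in refs:
        wanted = (r.child_ino, r.dir_ino, r.name)
        for kind in ("DIR_ITEM", "DIR_INDEX"):
            if not any(e.kind == kind and
                       (e.child_ino, e.dir_ino, e.name) == wanted
                       for e in entries):
                problems.append(
                    f"INODE_REF {r.name!r} (ino {r.child_ino}): no matching {kind}")
    return problems


# Values copied from the dump above (inode 4222342 only).
entries = [
    DirEntry("DIR_ITEM", 79177, 4222342, "deprecated.sxt"),
    DirEntry("DIR_INDEX", 79177, 4222342, "deprecated.txt"),
]
refs = [InodeRef(4222342, 79177, 417, "deprecated.txt")]

problems = cross_check(entries, refs)
for problem in problems:
    print(problem)
```

With these three records the sketch flags the ".sxt" DIR_ITEM as lacking an INODE_REF and the ".txt" INODE_REF as lacking a DIR_ITEM, which lines up with the "no inode ref" and "no dir item" messages from btrfs check.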
> >>>>>> But the root cause is still unknown.
> >>>>>>
> >>>>>> What I can see is that the corrupted DIR_ITEM points to a very old inode,
> >>>>>> whose mtime dates back to 2016-09-07,
> >>>>>> while the good DIR_ITEM points to a newer inode, whose mtime is
> >>>>>> 2017-05-02.
> >>>>>>
> >>>>>> Even stranger, there should not be two child inodes with the same
> >>>>>> filename ("deprecated.txt"; I assume the sxt one is caused by a memory
> >>>>>> bit corruption).
> >>>>>>
> >>>>>> So, any details on the operation with util-linux/deprecated.txt will
> >>>>>> help us to locate the root cause in kernel.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Qu
> >>>>>>
> >>>>>>
> >>>>>> On 2017-07-12 21:11, Filippe LeMarchand wrote:
> >>>>>>> Done, files added to same GDrive folder with corresponding names.
> >>>>>>> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.
> >>>>>>>
> >>>>>>> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
> >>>>>>>>
> >>>>>>>> On 2017-07-12 19:12, Filippe LeMarchand wrote:
> >>>>>>>>>> Maybe something went wrong in grep, which skipped "(79177"?
> >>>>>>>>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
> >>>>>>>>
> >>>>>>>> It looks much better, thanks.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> And btrfs check --mode=lowmem gives this:
> >>>>>>>>>
> >>>>>>>>> checking extents
> >>>>>>>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
> >>>>>>>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
> >>>>>>>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
> >>>>>>>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
> >>>>>>>>> ERROR: errors found in extent allocation tree or chunk allocation
> >>>>>>>>
> >>>>>>>> This looks very much like an exposed lowmem mode bug.
> >>>>>>>> Feel free to ignore these errors from the extent tree; they are just
> >>>>>>>> false alerts.
> >>>>>>>>
> >>>>>>>>> checking free space cache
> >>>>>>>>> checking fs roots
> >>>>>>>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>>>>>>>
> >>>>>>>> The error report is much better than original mode, and that's what I need.
> >>>>>>>>
> >>>>>>>> Now I can wipe out all other noise as we know exactly which tree and
> >>>>>>>> which DIR_ITEM/INODE_REF is causing the problem.
> >>>>>>>>
> >>>>>>>> Would you please update the dump result with "-t 4546" passed to
> >>>>>>>> btrfs-debug-tree like:
> >>>>>>>>
> >>>>>>>> # btrfs-debug-tree -t 4546 <device>| grep 79177
> >>>>>>>>
> >>>>>>>> Only "-t 4546" is added, to only dump the result of subvolume 4546.
> >>>>>>>> As always, all 3 grep results (2 "deprecated" and one 79177) need to be
> >>>>>>>> updated.
> >>>>>>>>
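As an alternative to tuning the grep pattern, the dump can be filtered on the parsed key instead of on raw text. This is a sketch that assumes item lines look like the `item N key (OBJECTID TYPE OFFSET) itemoff ... itemsize ...` excerpts quoted in this thread; `parse_key` and `mentions_inode` are hypothetical helper names. Keys whose objectid is printed symbolically rather than as a number are simply skipped.

```python
import re

# Matches the key triple in lines such as:
#   item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
KEY_RE = re.compile(r"key \((\d+) ([A-Z_]+) (\d+)\)")


def parse_key(line):
    """Return (objectid, item_type, offset), or None if no numeric key is found."""
    m = KEY_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2), int(m.group(3))


def mentions_inode(line, ino):
    """True if `ino` is either the objectid or the offset of the line's key."""
    key = parse_key(line)
    return key is not None and ino in (key[0], key[2])


# Sample lines copied from the excerpts in this thread.
sample = [
    "item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88",
    "item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24",
    "item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44",
    "location key (4222342 INODE_ITEM 0) type FILE",
]
for line in sample:
    if mentions_inode(line, 79177):
        print(line)
```

This keeps both cases apart: 79177 as objectid (DIR_ITEM/DIR_INDEX of the directory) and 79177 as offset (INODE_REF of a child pointing back to it).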
> >>>>>>>> And it seems that my previous assumption is still right for this case.
> >>>>>>>> If it's caused by kernel, your dump would definitely help us to locate
> >>>>>>>> the problem.
> >>>>>>>>
> >>>>>>>>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
> >>>>>>>>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>>>>>>>
> >>>>>>>> Also for root 5134 please.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Qu
> >>>>>>>>
> >>>>>>>>> ERROR: errors found in fs roots
> >>>>>>>>> Checking filesystem on /dev/sda2
> >>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>>>>>> found 153429872640 bytes used, error(s) found
> >>>>>>>>> total csum bytes: 121991672
> >>>>>>>>> total tree bytes: 1940160512
> >>>>>>>>> total fs tree bytes: 1683767296
> >>>>>>>>> total extent tree bytes: 103841792
> >>>>>>>>> btree space waste bytes: 310722480
> >>>>>>>>> file data blocks allocated: 842455031808
> >>>>>>>>> referenced 159286636544
> >>>>>>>>>
> >>>>>>>>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
> >>>>>>>>>> Sorry for the late reply.
> >>>>>>>>>>
> >>>>>>>>>> After investigating the dumps, I found the output is quite strange.
> >>>>>>>>>>
> >>>>>>>>>> 1) Mismatching output.
> >>>>>>>>>> In "btrfs-debug-tree-grep-79177.txt" I found 79177 only as the offset of
> >>>>>>>>>> INODE_REF items, while 79177 as the objectid of DIR_ITEM/DIR_INDEX is not
> >>>>>>>>>> there at all.
> >>>>>>>>>>
> >>>>>>>>>> In "btrfs-debug-tree-grep-deprecated-txt.txt", however, the expected
> >>>>>>>>>> 79177 DIR_ITEM/DIR_INDEX is present.
> >>>>>>>>>>
> >>>>>>>>>> Maybe something went wrong in grep, which skipped "(79177"?
> >>>>>>>>>>
> >>>>>>>>>> 2) Mismatched hash
> >>>>>>>>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
> >>>>>>>>>> number 54846528 is the hash(crc32c) of filename, and it contains 2
> >>>>>>>>>> items, one for "deprecated.txt" and one for "deprecated.sxt".
> >>>>>>>>>>
> >>>>>>>>>> But we found that 54846528 only matches the hash for "deprecated.txt",
> >>>>>>>>>> not "deprecated.sxt".
> >>>>>>>>>>
> >>>>>>>>>> I think that's the main problem.
> >>>>>>>>>>
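The hash comparison described above can be tried without any btrfs tooling. This is a sketch under one stated assumption: that the directory name hash is a raw CRC-32C of the name seeded with `(u32)~1`, which is how I recall the kernel's `btrfs_name_hash()` and is not confirmed anywhere in this thread. The helper names are hypothetical. The script only reports whether each name matches the key offset; it does not assume the outcome.

```python
def crc32c_update(crc, data):
    """Raw CRC-32C (Castagnoli) update, bit by bit.

    No inversion is applied to the CRC before or after the update.
    """
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc & 0xFFFFFFFF


def btrfs_name_hash(name):
    """Directory name hash, ASSUMING a seed of (u32)~1 == 0xFFFFFFFE."""
    return crc32c_update(0xFFFFFFFE, name)


# Sanity check of the CRC routine against the standard CRC-32C check value.
assert (crc32c_update(0xFFFFFFFF, b"123456789") ^ 0xFFFFFFFF) == 0xE3069283

key_offset = 54846528  # offset of key (79177 DIR_ITEM 54846528)
for name in (b"deprecated.txt", b"deprecated.sxt"):
    h = btrfs_name_hash(name)
    verdict = "matches" if h == key_offset else "does not match"
    print(f"{name.decode()}: hash {h} {verdict} key offset {key_offset}")
```

If the seed assumption holds, the output should agree with the statement above that only "deprecated.txt" hashes to 54846528. Whatever the seed, the two names cannot share a hash, because they differ in a single byte and a CRC always detects such a short difference.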
> >>>>>>>>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
> >>>>>>>>>> mode reports similar (well, output may differ) error?
> >>>>>>>>>>
> >>>>>>>>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure
> >>>>>>>>>> that's the problem.
> >>>>>>>>>>
> >>>>>>>>>> However it may take some time before we can fix it in repair mode.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Qu
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 2017-07-04 21:24, Filippe LeMarchand wrote:
> >>>>>>>>>>> Sure, here it is:
> >>>>>>>>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
> >>>>>>>>>>>
> >>>>>>>>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
> >>>>>>>>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
> >>>>>>>>>>>>>> Hello everyone.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I have a btrfs root partition on an Intel 530 SSD, which mounts without errors and seems to work fine,
> >>>>>>>>>>>>>> but `btrfs check` gives me the following output (and --repair doesn't remove the errors):
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> enabling repair mode
> >>>>>>>>>>>>>> Checking filesystem on /dev/sda2
> >>>>>>>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>>>>>>>>>>> checking extents
> >>>>>>>>>>>>>> Fixed 0 roots.
> >>>>>>>>>>>>>> checking free space cache
> >>>>>>>>>>>>>> cache and super generation don't match, space cache will be invalidated
> >>>>>>>>>>>>>> checking fs roots
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This means that the dir whose inode number is 79177 has a child inode
> >>>>>>>>>>>>> pointer pointing to deprecated.sxt.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> But it doesn't have a dir index or a corresponding inode ref, which breaks
> >>>>>>>>>>>>> the cross-reference rule of btrfs.
> >>>>>>>>>>>>>
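The `errors N` number in each "unresolved ref" line reads like a bitmask of the descriptions printed after it. The sketch below decodes it under that assumption. The bit values are inferred from the two lines in this report ("errors 1, no dir item" and "errors 6, no dir index, no inode ref"); which of the two bits in 6 belongs to which description is a guess based on the print order, and `decode_ref_errors` is a hypothetical helper, not part of btrfs-progs.

```python
# Bit values inferred from the check output in this thread:
#   "errors 1, no dir item"                -> bit 0
#   "errors 6, no dir index, no inode ref" -> bits 1 and 2 (order assumed)
REF_ERRORS = {
    1 << 0: "no dir item",
    1 << 1: "no dir index",
    1 << 2: "no inode ref",
}


def decode_ref_errors(mask):
    """Split an `errors N` value into the descriptions it stands for."""
    names = [text for bit, text in REF_ERRORS.items() if mask & bit]
    unknown = mask & ~sum(REF_ERRORS)  # bits this sketch has no name for
    if unknown:
        names.append(f"unknown bits {unknown:#x}")
    return names


print(decode_ref_errors(6))  # ['no dir index', 'no inode ref']
print(decode_ref_errors(1))  # ['no dir item']
```

Read this way, the ".sxt" name exists only as a DIR_ITEM, while the ".txt" name at index 417 has its DIR_INDEX and INODE_REF but no DIR_ITEM.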
> >>>>>>>>>>>>> Would you please run the following command to dump needed info for us to
> >>>>>>>>>>>>> debug?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Considering the output has both .txt and .sxt, I think that's the problem.
> >>>>>>>>>>>>> But such bit-flip should be detected by tree block csum.
> >>>>>>>>>>>>> I'm not sure what's wrong with it.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Qu
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>>>> checking csums
> >>>>>>>>>>>>>> checking root refs
> >>>>>>>>>>>>>> found 23421812736 bytes used err is 0
> >>>>>>>>>>>>>> total csum bytes: 21531608
> >>>>>>>>>>>>>> total tree bytes: 776650752
> >>>>>>>>>>>>>> total fs tree bytes: 711278592
> >>>>>>>>>>>>>> total extent tree bytes: 36798464
> >>>>>>>>>>>>>> btree space waste bytes: 116002036
> >>>>>>>>>>>>>> file data blocks allocated: 850546470912
> >>>>>>>>>>>>>> referenced 27611987968
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Is it dangerous and what should I do about it?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'm afraid that your mail may be rejected because the attachment size
> >>>>>>>>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you
> >>>>>>>>>>>> share the attachment by google drive?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Lastly, while Qu's schedule is tight, I will assist you with this issue.
> >>>>>>>>>>>>
>
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5037 bytes --]
end of thread, other threads:[~2017-07-14 12:46 UTC | newest]
Thread overview: 16+ messages
2017-07-01 11:59 Btrfs check reports errors, filesystem seems fine Filippe LeMarchand
2017-07-03 0:34 ` Qu Wenruo
2017-07-04 13:16 ` Lu Fengqi
2017-07-04 13:24 ` Filippe LeMarchand
2017-07-12 7:15 ` Qu Wenruo
2017-07-12 11:12 ` Filippe LeMarchand
2017-07-12 12:44 ` Qu Wenruo
2017-07-12 13:11 ` Filippe LeMarchand
2017-07-14 6:11 ` Qu Wenruo
2017-07-14 10:12 ` Filippe LeMarchand
2017-07-14 11:28 ` Qu Wenruo
2017-07-14 12:04 ` Filippe LeMarchand
2017-07-14 12:11 ` Qu Wenruo
2017-07-14 12:26 ` Filippe LeMarchand
2017-07-14 12:41 ` Qu Wenruo
2017-07-14 12:45 ` Filippe LeMarchand