* Btrfs check reports errors, filesystem seems fine
From: Filippe LeMarchand @ 2017-07-01 11:59 UTC
To: linux-btrfs

Hello everyone.

I have a btrfs root partition on an Intel 530 SSD, which mounts without errors and seems to work fine,
but `btrfs check` gives me the following output (and --repair doesn't remove the errors):

enabling repair mode
Checking filesystem on /dev/sda2
UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
checking csums
checking root refs
found 23421812736 bytes used err is 0
total csum bytes: 21531608
total tree bytes: 776650752
total fs tree bytes: 711278592
total extent tree bytes: 36798464
btree space waste bytes: 116002036
file data blocks allocated: 850546470912
 referenced 27611987968

Is it dangerous and what should I do about it?

I also tried --clear-space-cache, but it just removes the line about the space cache.

* Re: Btrfs check reports errors, filesystem seems fine
From: Qu Wenruo @ 2017-07-03 0:34 UTC
To: Filippe LeMarchand, linux-btrfs

At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
> [...]
> checking fs roots
> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref

This means that the directory whose inode number is 79177 has a child inode
pointer pointing to deprecated.sxt.

But it has no dir index and no corresponding inode ref, which breaks
the cross-reference rule of btrfs.

Would you please run the following commands to dump the info we need to
debug this?

# btrfs-debug-tree /dev/sda2 | grep 79177 -C 10

and

# btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10

and

# btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10

Considering that the output contains both .txt and .sxt, I think that mismatch is the problem.
But such a bit-flip should have been detected by the tree block csum,
so I'm not sure what's wrong with it.

Thanks,
Qu

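Editorial sketch (not part of the original mail): the "errors N" field in the check output reads like a bitmask, and the lines quoted in this thread are consistent with that reading ("errors 1" comes with "no dir item", "errors 6" with "no dir index, no inode ref"). The Python below is a minimal illustration under that assumption. The bit table, the function name `parse_unresolved_ref`, and the choice to parse the number as hexadecimal are all assumptions made here, not something confirmed in the thread.

```python
import re

# Assumed bit meanings, inferred only from the output quoted in this thread.
REF_ERR_BITS = {
    0x1: "no dir item",
    0x2: "no dir index",
    0x4: "no inode ref",
}

LINE_RE = re.compile(
    r"unresolved ref dir (?P<dir>\d+) index (?P<index>\d+) "
    r"namelen (?P<namelen>\d+) name (?P<name>\S+) "
    r"filetype (?P<filetype>\d+) errors (?P<errors>[0-9a-fA-F]+)"
)


def parse_unresolved_ref(line):
    """Parse one 'unresolved ref' line; return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    rec = {
        "dir": int(m.group("dir")),
        "index": int(m.group("index")),
        "name": m.group("name"),
        # Assumption: the value is hexadecimal. The thread only shows 1 and 6,
        # which read the same in decimal and hex.
        "errors": int(m.group("errors"), 16),
    }
    # Decode the bitmask into the human-readable reasons.
    rec["missing"] = [text for bit, text in REF_ERR_BITS.items()
                      if rec["errors"] & bit]
    return rec


if __name__ == "__main__":
    line = ("unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt "
            "filetype 1 errors 6, no dir index, no inode ref")
    print(parse_unresolved_ref(line))
```

Read this way, the two reported lines complement each other: the .sxt name is only known from a dir item, while the .txt name is known from a dir index and inode ref but has no matching dir item.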
* Re: Btrfs check reports errors, filesystem seems fine
From: Lu Fengqi @ 2017-07-04 13:16 UTC
To: Filippe LeMarchand; +Cc: linux-btrfs, Qu Wenruo

On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
>Would you please run the following commands to dump the info we need to
>debug this?
>
># btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
>
>and
>
># btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
>
>and
>
># btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
>[...]

I'm afraid that your mail may have been rejected because the attachment size
exceeds the limit (100kB) allowed on the btrfs mailing list. Could you
share the attachment via Google Drive?

Lastly, since Qu is short on time, I will assist you with this issue.

--
Thanks,
Lu

* Re: Btrfs check reports errors, filesystem seems fine
From: Filippe LeMarchand @ 2017-07-04 13:24 UTC
To: Lu Fengqi; +Cc: linux-btrfs, Qu Wenruo

Sure, here it is:
https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc

On Tuesday, July 4, 2017 16:16:36 MSK, Lu Fengqi wrote:
> I'm afraid that your mail may have been rejected because the attachment size
> exceeds the limit (100kB) allowed on the btrfs mailing list. Could you
> share the attachment via Google Drive?
> [...]

* Re: Btrfs check reports errors, filesystem seems fine
From: Qu Wenruo @ 2017-07-12 7:15 UTC
To: Filippe LeMarchand, Lu Fengqi; +Cc: linux-btrfs, Qu Wenruo

Sorry for the late reply.

After investigating the dumps, I found the output is quite strange.

1) Mismatching output
In "btrfs-debug-tree-grep-79177.txt" I only found 79177 as the offset of
INODE_REF keys, while 79177 as the objectid of DIR_ITEM/DIR_INDEX keys is
not there at all.

Yet "btrfs-debug-tree-grep-deprecated-txt.txt" does contain the expected
79177 DIR_ITEM/DIR_INDEX.

Maybe something went wrong with grep, so that it skipped "(79177"?

2) Mismatched hash
The main problem I found is in key (79177 DIR_ITEM 54846528). The
number 54846528 is the hash (crc32c) of the filename, and the item contains 2
entries, one for "deprecated.txt" and one for "deprecated.sxt".

But 54846528 only matches the hash of "deprecated.txt",
not that of "deprecated.sxt".

I think that's the main problem.

BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
mode reports a similar error (the output may differ)?

If lowmem mode also reports an error on that DIR_ITEM, I'm pretty sure
that's the problem.

However, it may take some time before we can fix it in repair mode.

Thanks,
Qu

On 2017-07-04 21:24, Filippe LeMarchand wrote:
> Sure, here it is:
> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
> [...]

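Editorial sketch (not part of the original mail): Qu describes the DIR_ITEM key offset as the crc32c hash of the filename. The Python below is one way to reproduce such a check by hand. It assumes the hash is a raw CRC-32C (Castagnoli polynomial, no final inversion) seeded with (u32)~1, which is how the kernel's name hash is commonly described; the seed, the helper names `crc32c_raw` and `btrfs_name_hash`, and the UTF-8 encoding of the name are assumptions, and the printed values have not been verified against this filesystem.

```python
def crc32c_raw(seed, data):
    """Bitwise CRC-32C (reflected polynomial 0x82F63B78).

    The seed is used as-is and no final XOR is applied, so callers decide
    on any inversion themselves.
    """
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit, folding in the polynomial when a 1 falls off.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc


def btrfs_name_hash(name):
    # Assumption: seed is (u32)~1 == 0xFFFFFFFE, with no final inversion.
    return crc32c_raw(0xFFFFFFFE, name.encode("utf-8"))


if __name__ == "__main__":
    for name in ("deprecated.txt", "deprecated.sxt"):
        print(name, btrfs_name_hash(name))
```

If the assumptions hold, the value printed for "deprecated.txt" can be compared with the key offset 54846528 from the dump. Two names of equal length that differ in a single byte can never share a CRC-32C value, so an entry named "deprecated.sxt" stored under that key cannot be the result of a normal insertion.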
* Re: Btrfs check reports errors, filesystem seems fine
From: Filippe LeMarchand @ 2017-07-12 11:12 UTC
To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

> Maybe something went wrong with grep, so that it skipped "(79177"?

Yes, my bad. This time I used the grep -E "\(79177| 79177" pattern; the file on GDrive is updated.

And btrfs check --mode=lowmem gives this:

checking extents
ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
ERROR: errors found in fs roots
Checking filesystem on /dev/sda2
UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
found 153429872640 bytes used, error(s) found
total csum bytes: 121991672
total tree bytes: 1940160512
total fs tree bytes: 1683767296
total extent tree bytes: 103841792
btree space waste bytes: 310722480
file data blocks allocated: 842455031808
 referenced 159286636544

On Wednesday, July 12, 2017 10:15:18 MSK, Qu Wenruo wrote:
> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
> mode reports a similar error (the output may differ)?
> [...]

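Editorial sketch (not part of the original mail): the grep -E pattern Filippe mentions selects lines where 79177 appears either right after an opening parenthesis (the objectid position of a key) or after a space (for example the offset position). The Python below applies the same alternation to a few made-up lines. The sample lines only imitate the general "key (objectid TYPE offset)" shape of a tree dump and are hypothetical, as is the helper name `select`.

```python
import re

# Same alternation as the quoted pattern: grep -E "\(79177| 79177"
PATTERN = re.compile(r"\(79177| 79177")

# Hypothetical sample lines, not taken from the actual dump.
SAMPLE = [
    "item 5 key (79177 DIR_ITEM 54846528) itemoff 3000 itemsize 88",
    "item 9 key (4222342 INODE_REF 79177) itemoff 2500 itemsize 24",
    "item 2 key (179177 INODE_ITEM 0) itemoff 3800 itemsize 160",
]


def select(lines):
    """Return the lines the pattern would keep, in their original order."""
    return [line for line in lines if PATTERN.search(line)]


if __name__ == "__main__":
    for line in select(SAMPLE):
        print(line)
```

With these samples the first two lines are kept and the third is dropped, because 179177 has neither "(" nor a space directly in front of the digits 79177. The pattern still matches longer numbers that merely start with 79177, so the result may need a manual look.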
* Re: Btrfs check reports errors, filesystem seems fine 2017-07-12 11:12 ` Filippe LeMarchand @ 2017-07-12 12:44 ` Qu Wenruo 2017-07-12 13:11 ` Filippe LeMarchand 0 siblings, 1 reply; 16+ messages in thread From: Qu Wenruo @ 2017-07-12 12:44 UTC (permalink / raw) To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo On 2017年07月12日 19:12, Filippe LeMarchand wrote: >> Maybe something wrong in grep happened which skip "(79177" ? > Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated. It looks much better, thanks. > > And btrfs check --mode=lowmem gives this: > > checking extents > ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5 > ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114 > ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5 > ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25 > ERROR: errors found in extent allocation tree or chunk allocation Looks much like an exposed lowmem mode bug. Feel free to ignore these error from extent tree, they are just false alerts. > checking free space cache > checking fs roots > ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1 The error report is much better than original mode, and that's what I need. Now I can wipe out all other noise as we know exactly which tree and which DIR_ITEM/INODE_REF is causing the problem. Would you please update the dump result with "-t 4546" passed to btrfs-debug-tree like: # btrfs-debug-tree -t 4546 <device>| grep 79177 Only "-t 4546" is added, to only dump the result of subvolume 4546. As always, all 3 grep results (2 "deprecated" and one 79177) need to be updated. 
And it seems that my previous assumption is still right for this case.
If it's caused by the kernel, your dump would definitely help us to locate
the problem.

> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1

Also for root 5134 please.

Thanks,
Qu

> ERROR: errors found in fs roots
> Checking filesystem on /dev/sda2
> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> found 153429872640 bytes used, error(s) found
> total csum bytes: 121991672
> total tree bytes: 1940160512
> total fs tree bytes: 1683767296
> total extent tree bytes: 103841792
> btree space waste bytes: 310722480
> file data blocks allocated: 842455031808
> referenced 159286636544
>
> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
>> Sorry for the late reply.
>>
>> After investigating the dumps, I found the output is quite strange.
>>
>> 1) Mismatching output.
>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for
>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not
>> here at all.
>>
>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is expected
>> 79177 DIR_ITEM/DIR_INDEX.
>>
>> Maybe something wrong in grep happened which skip "(79177" ?
>>
>> 2) Mismatched hash
>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
>> number 54846528 is the hash(crc32c) of filename, and it contains 2
>> items, one for "deprecated.txt" and one for "deprecated.sxt".
>>
>> But we found that 54846528 only matches the hash for "deprecated.txt",
>> not "deprecated.sxt".
>>
>> I think that's the main problem.
>>
>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
>> mode reports similar (well, output may differ) error?
>>
>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure
>> that's the problem.
>>
>> However it may take some time before we can fix it in repair mode.
>>
>> Thanks,
>> Qu
>>
>>
>>
>> On 2017-07-04 21:24, Filippe LeMarchand wrote:
>>> Sure, here it is:
>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
>>>
>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
>>>>>> Hello everyone.
>>>>>>
>>>>>> I have a btrfs root partition on Intel 530 ssd, which mounts without errors and seems to work fine,
>>>>>> but `btrfs check` gives me following output (and --repair doesn't remove errors):
>>>>>>
>>>>>> enabling repair mode
>>>>>> Checking filesystem on /dev/sda2
>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
>>>>>> checking extents
>>>>>> Fixed 0 roots.
>>>>>> checking free space cache
>>>>>> cache and super generation don't match, space cache will be invalidated
>>>>>> checking fs roots
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>
>>>>> This means that in dir whose inode number is 79177, it has a child inode
>>>>> pointer pointing to deprecated.sxt.
>>>>>
>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking
>>>>> the cross reference rule of btrfs.
>>>>>
>>>>> Would you please run the following command to dump needed info for us to
>>>>> debug?
>>>>>
>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
>>>>>
>>>>> and
>>>>>
>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
>>>>>
>>>>> and
>>>>>
>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
>>>>>
>>>>>
>>>>> Considering the output has both .txt and .sxt, I think that's the problem.
>>>>> But such bit-flip should be detected by tree block csum.
>>>>> I'm not sure what's wrong with it.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
>>>>>> checking csums
>>>>>> checking root refs
>>>>>> found 23421812736 bytes used err is 0
>>>>>> total csum bytes: 21531608
>>>>>> total tree bytes: 776650752
>>>>>> total fs tree bytes: 711278592
>>>>>> total extent tree bytes: 36798464
>>>>>> btree space waste bytes: 116002036
>>>>>> file data blocks allocated: 850546470912
>>>>>> referenced 27611987968
>>>>>>
>>>>>> Is it dangerous and what should I do about it?
>>>>>>
>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache.
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>
>>>> I'm afraid that your mail may be rejected because the attachment size
>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you
>>>> share the attachment by google drive?
>>>>
>>>> Lastly, while Qu's timing is too tight, I will assist you on this issue.
>>>>

^ permalink raw reply [flat|nested] 16+ messages in thread
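The grep step discussed in the message above can also be sketched in Python, which makes it easier to see what the `\(79177| 79177` pattern does and does not match. This is only an illustration: `filter_dump` is a hypothetical helper, the first three sample lines are item headers copied from dumps quoted later in this thread, and the last sample line is made up to show a non-match.

```python
import re

# Pattern from the thread: match 79177 either right after an opening
# parenthesis (key objectid) or after a space (key offset or other field).
PATTERN = re.compile(r"\(79177| 79177")


def filter_dump(lines):
    """Return the lines that mention inode 79177 (hypothetical helper)."""
    return [line for line in lines if PATTERN.search(line)]


sample = [
    "item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88",
    "item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24",
    "item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44",
    # Made-up line: 79177 appears only as the tail of a longer number.
    "item 1 key (179177 INODE_ITEM 0) itemoff 100 itemsize 160",
]

# In real use the lines would come from the output of
# `btrfs-debug-tree -t 4546 <device>` instead of this sample list.
matches = filter_dump(sample)
for line in matches:
    print(line)
```

The pattern rejects numbers that merely end in 79177, but it would still accept a longer number that starts with 79177, so the filtered dump may need a quick manual check.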
* Re: Btrfs check reports errors, filesystem seems fine
  2017-07-12 12:44 ` Qu Wenruo
@ 2017-07-12 13:11 ` Filippe LeMarchand
  2017-07-14 6:11 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-12 13:11 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

[-- Attachment #1: Type: text/plain, Size: 9429 bytes --]

Done, files added to the same GDrive folder with corresponding names.
If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.

In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
>
> On 2017-07-12 19:12, Filippe LeMarchand wrote:
> >> Maybe something wrong in grep happened which skip "(79177" ?
> > Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
>
> It looks much better, thanks.
>
> >
> > And btrfs check --mode=lowmem gives this:
> >
> > checking extents
> > ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
> > ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
> > ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
> > ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
> > ERROR: errors found in extent allocation tree or chunk allocation
>
> Looks much like an exposed lowmem mode bug.
> Feel free to ignore these errors from extent tree, they are just false
> alerts.
>
> > checking free space cache
> > checking fs roots
> > ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
>
> The error report is much better than original mode, and that's what I need.
* Re: Btrfs check reports errors, filesystem seems fine
  2017-07-12 13:11 ` Filippe LeMarchand
@ 2017-07-14 6:11 ` Qu Wenruo
  2017-07-14 10:12 ` Filippe LeMarchand
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2017-07-14 6:11 UTC (permalink / raw)
  To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

Thanks for your dump.

The direct cause of the problem is now clear.

It's one corrupted DIR_ITEM causing the problem.
Furthermore, original mode btrfs check can't detect it, and we will
fix that soon.

The corrupted DIR_ITEM is the following:
	item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
		location key (4222342 INODE_ITEM 0) type FILE
		transid 170929 data_len 0 name_len 14
		name: deprecated.sxt
		location key (13590433 INODE_ITEM 0) type FILE
		transid 796448 data_len 0 name_len 14
		name: deprecated.txt

Dir inode 79177 has 2 child inodes, named "deprecated.sxt"
(ino=4222342) and "deprecated.txt" (ino=13590433).

But something goes wrong here:

1) Hash of "deprecated.sxt" doesn't match 54846528

2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt"
   Also captured by dump:
	item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24
		inode ref index 417 namelen 14 name: deprecated.txt

3) DIR_INDEX also shows that filename for inode 4222342 should be
   "deprecated.txt"
	item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44
		location key (4222342 INODE_ITEM 0) type FILE
		transid 170929 data_len 0 name_len 14
		name: deprecated.txt

So generally speaking, it's the DIR_ITEM that is wrong and causing the problem.

But the root cause is still unknown.

What I can see is that the corrupted DIR_ITEM points to a very old inode,
whose mtime goes back to 2016-09-07.
The good DIR_ITEM points to a newer inode, whose mtime is just
2017-05-02.

Stranger still, there should not be two child inodes with the same
filename ("deprecated.txt"; I assume the sxt one is caused by a memory
bit corruption).

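Point 1) above can be checked by hand with a small script. What follows is a minimal sketch, not btrfs-progs code: it assumes that the DIR_ITEM key offset is a CRC-32C of the filename computed with the seed `(u32)~1` and without the usual final inversion, and `crc32c_raw`, `name_hash` and `bit_difference` are names made up for this illustration. Under that assumption, "deprecated.txt" should hash to 54846528 and "deprecated.sxt" should not; the sketch prints both values rather than asserting them.

```python
def crc32c_raw(crc, data):
    """Bitwise CRC-32C (Castagnoli) update with no pre- or post-inversion."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # 0x82F63B78 is the reflected CRC-32C polynomial.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc & 0xFFFFFFFF


def name_hash(name):
    # Assumption: the filename hash is seeded with (u32)~1 == 0xFFFFFFFE.
    return crc32c_raw(0xFFFFFFFE, name)


def bit_difference(a, b):
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


KEY_OFFSET = 54846528  # from key (79177 DIR_ITEM 54846528)

for name in (b"deprecated.txt", b"deprecated.sxt"):
    h = name_hash(name)
    verdict = "matches key offset" if h == KEY_OFFSET else "does not match key offset"
    print(name.decode(), h, verdict)

print("differing bits:", bit_difference(b"deprecated.txt", b"deprecated.sxt"))
```

The two names differ only in one byte ('t' is 0x74, 's' is 0x73), so the two hashes are guaranteed to differ, which is why both names cannot legitimately share one key offset unless a real hash collision occurred.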
So, any details on the operations performed on util-linux/deprecated.txt will
help us to locate the root cause in the kernel.

Thanks,
Qu

On 2017-07-12 21:11, Filippe LeMarchand wrote:
> Done, files added to the same GDrive folder with corresponding names.
> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.
>
> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
>>
>> On 2017-07-12 19:12, Filippe LeMarchand wrote:
>>>> Maybe something wrong in grep happened which skip "(79177" ?
>>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated.
>>
>> It looks much better, thanks.
>>
>>>
>>> And btrfs check --mode=lowmem gives this:
>>>
>>> checking extents
>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
>>> ERROR: errors found in extent allocation tree or chunk allocation
>>
>> Looks much like an exposed lowmem mode bug.
>> Feel free to ignore these errors from extent tree, they are just false
>> alerts.
>>
>>> checking free space cache
>>> checking fs roots
>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
>>
>> The error report is much better than original mode, and that's what I need.
>>
>> Now I can wipe out all other noise as we know exactly which tree and
>> which DIR_ITEM/INODE_REF is causing the problem.
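The cross-reference rule described in this thread (a directory entry should be described consistently by a DIR_ITEM, a DIR_INDEX and an INODE_REF) can be illustrated with a toy consistency check. This is a sketch under simplifying assumptions, not how btrfs check is implemented: the `Entry` record and `find_unresolved` are hypothetical, and the data is limited to the three items for inode 4222342 quoted in the message of 2017-07-14.

```python
from collections import namedtuple

# One record per on-disk item that names a (directory, child, filename) link.
Entry = namedtuple("Entry", "kind dir_ino child_ino name")

# Values copied from the dump quoted in the thread, inode 4222342 only.
entries = [
    Entry("DIR_ITEM", 79177, 4222342, "deprecated.sxt"),
    Entry("DIR_INDEX", 79177, 4222342, "deprecated.txt"),
    Entry("INODE_REF", 79177, 4222342, "deprecated.txt"),
]

KINDS = ("DIR_ITEM", "DIR_INDEX", "INODE_REF")


def find_unresolved(entries):
    """Group entries by link and report which item kinds each link lacks."""
    seen = {}
    for e in entries:
        seen.setdefault((e.dir_ino, e.child_ino, e.name), set()).add(e.kind)
    problems = []
    for link, kinds in seen.items():
        missing = [k for k in KINDS if k not in kinds]
        if missing:
            problems.append((link, missing))
    return problems


for (dir_ino, child_ino, name), missing in find_unresolved(entries):
    print(f"dir {dir_ino} inode {child_ino} name {name}: missing {', '.join(missing)}")
```

With this input the check reports the .sxt name as lacking a DIR_INDEX and an INODE_REF, and the .txt name as lacking a DIR_ITEM, which mirrors the pair of "unresolved ref" lines in the original report.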
* Re: Btrfs check reports errors, filesystem seems fine
  2017-07-14 6:11 ` Qu Wenruo
@ 2017-07-14 10:12 ` Filippe LeMarchand
  2017-07-14 11:28 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-14 10:12 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

[-- Attachment #1: Type: text/plain, Size: 13743 bytes --]

The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:

$ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
$ sudo rm -rf /usr/share/doc/packages/util-linux/
rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
$ sudo ls -l /usr/share/doc/packages/util-linux/
ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
total 0
-????????? ? ? ? ? ? deprecated.txt

Reinstalling the util-linux package gives me two copies of that file (and two such files are also present in the previous snapshot):

$ ls -l /usr/share/doc/packages/util-linux/
total 104
-rw-r--r-- 1 root root 18092 Jul 20  2016 COPYING
-rw-r--r-- 1 root root  1391 Jul 20  2016 COPYING.BSD-3
-rw-r--r-- 1 root root 26530 Jul 20  2016 COPYING.LGPLv2.1
-rw-r--r-- 1 root root  1824 Jul 20  2016 COPYING.UCB
-rw-r--r-- 1 root root   555 Jul 20  2016 README.licensing
-rw-r--r-- 1 root root  3257 Jul 20  2016 blkid.txt
-rw-r--r-- 1 root root  2264 Jul 20  2016 cal.txt
-rw-r--r-- 1 root root  1913 Jul 20  2016 col.txt
-rw-r--r-- 1 root root  2825 May  2 13:17 deprecated.txt
-rw-r--r-- 1 root root  2825 May  2 13:17 deprecated.txt
-rw-r--r-- 1 root root   992 Jul 20  2016 getopt.txt
-rw-r--r-- 1 root root  2437 Nov  2  2016 howto-debug.txt
-rw-r--r-- 1 root root   148 Jul 20  2016 hwclock.txt
-rw-r--r-- 1 root root  2617 Jul 20  2016 modems-with-agetty.txt
-rw-r--r-- 1 root root   522 Jul 20  2016 mount.txt
-rw-r--r-- 1 root root   448 Jul 20  2016 pg.txt

So, is this situation actually dangerous? And what can I do to gather more information for you?

In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote:
> Thanks for your dump.
>
> The direct cause of the problem is now clear.
>
> It's one corrupted DIR_ITEM causing the problem.
> Furthermore, original mode btrfs check can't detect it, and we will
> fix that soon.
>
> The corrupted DIR_ITEM is the following:
> 	item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
> 		location key (4222342 INODE_ITEM 0) type FILE
> 		transid 170929 data_len 0 name_len 14
> 		name: deprecated.sxt
> 		location key (13590433 INODE_ITEM 0) type FILE
> 		transid 796448 data_len 0 name_len 14
> 		name: deprecated.txt
>
> Dir inode 79177 has 2 child inodes, named "deprecated.sxt"
> (ino=4222342) and "deprecated.txt" (ino=13590433).
>
> But something goes wrong here:
>
> 1) Hash of "deprecated.sxt" doesn't match 54846528
>
> 2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt"
>    Also captured by dump:
> 	item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24
> 		inode ref index 417 namelen 14 name: deprecated.txt
>
> 3) DIR_INDEX also shows that filename for inode 4222342 should be
>    "deprecated.txt"
> 	item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44
> 		location key (4222342 INODE_ITEM 0) type FILE
> 		transid 170929 data_len 0 name_len 14
> 		name: deprecated.txt
>
> So generally speaking, it's the DIR_ITEM that is wrong and causing the problem.
>
> But the root cause is still unknown.
>
> What I can see is that the corrupted DIR_ITEM points to a very old inode,
> whose mtime goes back to 2016-09-07.
> The good DIR_ITEM points to a newer inode, whose mtime is just
> 2017-05-02.
>
> Stranger still, there should not be two child inodes with the same
> filename ("deprecated.txt"; I assume the sxt one is caused by a memory
> bit corruption).
>
> So, any details on the operations performed on util-linux/deprecated.txt will
> help us to locate the root cause in the kernel.
> > Thanks, > Qu > > > On 2017年07月12日 21:11, Filippe LeMarchand wrote: > > Done, files added to same GDrive folder with corresponding names. > > If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot. > > > > In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote: > >> > >> On 2017年07月12日 19:12, Filippe LeMarchand wrote: > >>>> Maybe something wrong in grep happened which skip "(79177" ? > >>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated. > >> > >> It looks much better, thanks. > >> > >>> > >>> And btrfs check --mode=lowmem gives this: > >>> > >>> checking extents > >>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5 > >>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114 > >>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5 > >>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25 > >>> ERROR: errors found in extent allocation tree or chunk allocation > >> > >> Looks much like an exposed lowmem mode bug. > >> Feel free to ignore these error from extent tree, they are just false > >> alerts. > >> > >>> checking free space cache > >>> checking fs roots > >>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1 > >> > >> The error report is much better than original mode, and that's what I need. > >> > >> Now I can wipe out all other noise as we know exactly which tree and > >> which DIR_ITEM/INODE_REF is causing the problem. 
> >> > >> Would you please update the dump result with "-t 4546" passed to > >> btrfs-debug-tree like: > >> > >> # btrfs-debug-tree -t 4546 <device>| grep 79177 > >> > >> Only "-t 4546" is added, to only dump the result of subvolume 4546. > >> As always, all 3 grep results (2 "deprecated" and one 79177) need to be > >> updated. > >> > >> And it seems that my previous assumption is still right for this case. > >> If it's caused by kernel, your dump would definitely help us to locate > >> the problem. > >> > >>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1 > >>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1 > >> > >> Also for root 5134 please. > >> > >> Thanks, > >> Qu > >> > >>> ERROR: errors found in fs roots > >>> Checking filesystem on /dev/sda2 > >>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e > >>> found 153429872640 bytes used, error(s) found > >>> total csum bytes: 121991672 > >>> total tree bytes: 1940160512 > >>> total fs tree bytes: 1683767296 > >>> total extent tree bytes: 103841792 > >>> btree space waste bytes: 310722480 > >>> file data blocks allocated: 842455031808 > >>> referenced 159286636544 > >>> > >>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote: > >>>> Sorry for the late reply. > >>>> > >>>> After investigating the dumps, I found the output is quite strange. > >>>> > >>>> 1) Mismatching output. > >>>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for > >>>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not > >>>> here at all. > >>>> > >>>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is epected > >>>> 79177 DIR_ITEM/DIR_INDEX. > >>>> > >>>> Maybe something wrong in grep happened which skip "(79177" ? 
> >>>> > >>>> 2) Mismatched hash > >>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the > >>>> number 54846528 is the hash(crc32c) of filename, and it contains 2 > >>>> items, one for "deprecated.txt" and one for "deprecated.sxt". > >>>> > >>>> But we found that 54846528 only matches the hash for "deprecated.txt", > >>>> not "deprecated.sxt". > >>>> > >>>> I think that's the main problem. > >>>> > >>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem > >>>> mode reports similar (well, output may differ) error? > >>>> > >>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure > >>>> that's the problem. > >>>> > >>>> However it may take some time before we can fix it in repair mode. > >>>> > >>>> Thanks, > >>>> Qu > >>>> > >>>> > >>>> > >>>> 在 2017年07月04日 21:24, Filippe LeMarchand 写道: > >>>>> Sure, here it is: > >>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc > >>>>> > >>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote: > >>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote: > >>>>>>> > >>>>>>> > >>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote: > >>>>>>>> Hello everyone. > >>>>>>>> > >>>>>>>> I have an btrfs root partition on Intel 530 ssd, which mounts without errors and seem to work fine, > >>>>>>>> but `btrfs check` gives me foloowing output (and --repair doesn't remove errors): > >>>>>>>> > >>>>>>>> enabling repair mode > >>>>>>>> Checking filesystem on /dev/sda2 > >>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e > >>>>>>>> checking extents > >>>>>>>> Fixed 0 roots. 
> >>>>>>>> checking free space cache > >>>>>>>> cache and super generation don't match, space cache will be invalidated > >>>>>>>> checking fs roots > >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>> > >>>>>>> This means that in dir whose inode number is 79177, it has a child inode > >>>>>>> pointer pointing to depercated.sxt. > >>>>>>> > >>>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking > >>>>>>> the cross reference rule of btrfs. > >>>>>>> > >>>>>>> Would you please run the following command to dump needed info for us to > >>>>>>> debug? > >>>>>>> > >>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10 > >>>>>>> > >>>>>>> and > >>>>>>> > >>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10 > >>>>>>> > >>>>>>> and > >>>>>>> > >>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10 > >>>>>>> > >>>>>>> > >>>>>>> Considering the output has both .txt and .sxt, I think that's the problem. > >>>>>>> But such bit-flip should be detected by tree block csum. > >>>>>>> I'm not sure what's wrong with it. 
> >>>>>>> > >>>>>>> Thanks, > >>>>>>> Qu > >>>>>>> > >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt 
filetype 1 errors 1, no dir item > >>>>>>>> checking csums > >>>>>>>> checking root refs > >>>>>>>> found 23421812736 bytes used err is 0 > >>>>>>>> total csum bytes: 21531608 > >>>>>>>> total tree bytes: 776650752 > >>>>>>>> total fs tree bytes: 711278592 > >>>>>>>> total extent tree bytes: 36798464 > >>>>>>>> btree space waste bytes: 116002036 > >>>>>>>> file data blocks allocated: 850546470912 > >>>>>>>> referenced 27611987968 > >>>>>>>> > >>>>>>>> Is it dangerous and what should I do about it? > >>>>>>>> > >>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache. > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > >>>>>>> the body of a message to majordomo@vger.kernel.org > >>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>>> > >>>>>> I'm afraid that your mail may be rejected because the attachment size > >>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you > >>>>>> share the attachment by google drive? > >>>>>> > >>>>>> Lastly, while Qu's timing is too tight, I will assist you on this issue. > >>>>>> > [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5037 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Btrfs check reports errors, filesystem seems fine
  2017-07-14 10:12 ` Filippe LeMarchand
@ 2017-07-14 11:28 ` Qu Wenruo
  2017-07-14 12:04 ` Filippe LeMarchand
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2017-07-14 11:28 UTC (permalink / raw)
  To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

On 2017-07-14 18:12, Filippe LeMarchand wrote:
> The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:
>
> $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
> rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
>
> $ sudo rm -rf /usr/share/doc/packages/util-linux/
> rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
>
> $ sudo ls -l /usr/share/doc/packages/util-linux/
> ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> total 0
> -????????? ? ? ? ? ? deprecated.txt

Similar behavior is also seen with a manually crafted image in our environment.

Su Yue has sent patches to enhance error detection, along with a test case for it, but repairing is not supported.
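
The symptom quoted above (a name that the directory listing returns but that cannot be stat'ed, shown by ls as "-?????????") can be probed from user space without touching the filesystem metadata. The following is only a minimal sketch under stated assumptions: `scan_directory` is a hypothetical helper name, not part of btrfs-progs, and it reports names, not the underlying DIR_ITEM/DIR_INDEX/INODE_REF items.

```python
import os
from collections import Counter


def scan_directory(path):
    """Return (duplicate_names, unstatable_names) for the entries of `path`.

    Hypothetical helper for illustration only; it reads metadata and
    never modifies anything.
    """
    # Names exactly as the directory listing (readdir) returns them.
    with os.scandir(path) as entries:
        names = [entry.name for entry in entries]

    # A healthy directory never returns the same name twice.
    counts = Counter(names)
    duplicates = sorted(name for name, count in counts.items() if count > 1)

    # A listed name that cannot be lstat'ed matches the "-?????????" row from ls.
    unstatable = []
    for name in sorted(counts):
        try:
            os.lstat(os.path.join(path, name))
        except OSError:
            unstatable.append(name)

    return duplicates, unstatable
```

Calling `scan_directory("/usr/share/doc/packages/util-linux")` on the affected subvolume would be expected to list deprecated.txt in one or both results, depending on which of the two states shown in the report the directory is in; on a healthy directory both lists should be empty.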
>
> Reinstalling the util-linux package gives me two copies of that file (and two such files are also present in the previous snapshot):
>
> $ ls -l /usr/share/doc/packages/util-linux/
> total 104
> -rw-r--r-- 1 root root 18092 Jul 20  2016 COPYING
> -rw-r--r-- 1 root root  1391 Jul 20  2016 COPYING.BSD-3
> -rw-r--r-- 1 root root 26530 Jul 20  2016 COPYING.LGPLv2.1
> -rw-r--r-- 1 root root  1824 Jul 20  2016 COPYING.UCB
> -rw-r--r-- 1 root root   555 Jul 20  2016 README.licensing
> -rw-r--r-- 1 root root  3257 Jul 20  2016 blkid.txt
> -rw-r--r-- 1 root root  2264 Jul 20  2016 cal.txt
> -rw-r--r-- 1 root root  1913 Jul 20  2016 col.txt
> -rw-r--r-- 1 root root  2825 May  2 13:17 deprecated.txt
> -rw-r--r-- 1 root root  2825 May  2 13:17 deprecated.txt
> -rw-r--r-- 1 root root   992 Jul 20  2016 getopt.txt
> -rw-r--r-- 1 root root  2437 Nov  2  2016 howto-debug.txt
> -rw-r--r-- 1 root root   148 Jul 20  2016 hwclock.txt
> -rw-r--r-- 1 root root  2617 Jul 20  2016 modems-with-agetty.txt
> -rw-r--r-- 1 root root   522 Jul 20  2016 mount.txt
> -rw-r--r-- 1 root root   448 Jul 20  2016 pg.txt
>
> So, is this situation actually dangerous? And what can I do to gather more information for you?

The situation won't get worse. I'd recommend not taking any snapshots of those subvolumes (4546 and 5134), to limit the corruption to those subvolumes.

However, there is no easy way to fix it yet.

Currently, one possible solution is deleting the whole subvolume.
If no further errors happen, that may fix it.

IIRC, btrfs check --repair in original mode has a DIR_ITEM/DIR_INDEX/INODE_REF repair function, but I'm not sure it can handle this case well.
Btrfs check --repair *MAY* fix it, or it may make things worse.
If you have a full backup, then you could try it.
Otherwise, don't try it at all.

Another solution would be a specific repair program just for your case.
We can modify btrfs-corrupt-block to just delete the corrupted DIR_ITEM (the ".sxt" one) and the related DIR_INDEX/INODE_REF.
But I'll only choose this if you really need to fix it as soon as possible.
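
Since any targeted repair would single out the ".sxt" entry, it may help to see exactly how that name relates to the expected one. The following is a minimal sketch, not btrfs-progs code: the helper names `raw_crc32c`, `name_hash` and `byte_diffs` are hypothetical, and the seed used in `name_hash` rests on the assumption that the directory name hash is CRC-32C seeded with ~1 and without a final inversion, as in the kernel's btrfs_name_hash() helper. That assumption has not been checked against this filesystem.

```python
def raw_crc32c(seed: int, data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli, reflected polynomial 0x82F63B78).

    No final XOR is applied, so the caller controls both seed and inversion.
    """
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc


def name_hash(name: bytes) -> int:
    # Assumption: same seed (~1) as the kernel's btrfs_name_hash().
    return raw_crc32c(0xFFFFFFFE, name)


def byte_diffs(a: bytes, b: bytes):
    """Return (index, xor_mask) for every position where a and b differ."""
    return [(i, x ^ y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


good = b"deprecated.txt"
bad = b"deprecated.sxt"

for index, mask in byte_diffs(good, bad):
    # Reports index 11 with mask 0x07, i.e. three differing bits.
    print(f"byte {index}: xor mask {mask:#04x}, {bin(mask).count('1')} bit(s) differ")

print("hash of", good.decode(), "=", name_hash(good))
print("hash of", bad.decode(), "=", name_hash(bad))
```

The two names differ only at byte index 11, where 't' (0x74) and 's' (0x73) differ by the mask 0x07. Whether `name_hash(b"deprecated.txt")` reproduces the offset 54846528 from the DIR_ITEM key depends on the seed assumption above; the two hashes are in any case different from each other.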

At least we have solution for it.
I'm more concerned about how this happened.

Any idea about the reproducer? Or just random memory corruption?

Thanks,
Qu

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Btrfs check reports errors, filesystem seems fine
  2017-07-14 11:28 ` Qu Wenruo
@ 2017-07-14 12:04 ` Filippe LeMarchand
  2017-07-14 12:11 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-14 12:04 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

[-- Attachment #1: Type: text/plain, Size: 16533 bytes --]

> Currently, one possible solution is deleting the whole subvolume.

Would a btrfs send (to an external drive) followed by a btrfs receive back fix it? Or should I use plain cp/rsync?

> If you have a full backup, then you could try it.

It is my root subvolume (sensitive data is on other ones), so it is expendable. Can btrfs check --repair damage other subvolumes?

> Any idea about the reproducer? Or just random memory corruption?

No idea why and no idea when. This partition is about a year and a half old, and I ran btrfs check for the first time only about a month ago. I also ran memtest recently, and it didn't find any errors.

On Friday, July 14, 2017 14:28:58 MSK, Qu Wenruo wrote:
>
> On 2017-07-14 18:12, Filippe LeMarchand wrote:
> > The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:
> >
> > $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
> > rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> >
> > $ sudo rm -rf /usr/share/doc/packages/util-linux/
> > rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
> >
> > $ sudo ls -l /usr/share/doc/packages/util-linux/
> > ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> > total 0
> > -????????? ? ? ? ? ? deprecated.txt
>
> Similar behavior is also seen with a manually crafted image in our
> environment.
>
> Su Yue has sent patches to enhance error detection, along with a test case for
> it, but repairing is not supported.
> > > > > Reinstall of util-linux package gives me two of that file (and also two files present on previous snapshot): > > > > $ ls -l /usr/share/doc/packages/util-linux/ > > total 104 > > -rw-r--r-- 1 root root 18092 Jul 20 2016 COPYING > > -rw-r--r-- 1 root root 1391 Jul 20 2016 COPYING.BSD-3 > > -rw-r--r-- 1 root root 26530 Jul 20 2016 COPYING.LGPLv2.1 > > -rw-r--r-- 1 root root 1824 Jul 20 2016 COPYING.UCB > > -rw-r--r-- 1 root root 555 Jul 20 2016 README.licensing > > -rw-r--r-- 1 root root 3257 Jul 20 2016 blkid.txt > > -rw-r--r-- 1 root root 2264 Jul 20 2016 cal.txt > > -rw-r--r-- 1 root root 1913 Jul 20 2016 col.txt > > -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt > > -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt > > -rw-r--r-- 1 root root 992 Jul 20 2016 getopt.txt > > -rw-r--r-- 1 root root 2437 Nov 2 2016 howto-debug.txt > > -rw-r--r-- 1 root root 148 Jul 20 2016 hwclock.txt > > -rw-r--r-- 1 root root 2617 Jul 20 2016 modems-with-agetty.txt > > -rw-r--r-- 1 root root 522 Jul 20 2016 mount.txt > > -rw-r--r-- 1 root root 448 Jul 20 2016 pg.txt > > > > So, is this situation actually dangerous? And what can I do to gather more information for you? > > The situation won't be worse. I'd recommend not to take any snapshot of > those subvolumes (4546 and 5134) to limit the corruption to those > subvolumes. > > However there is also no easy way to fix it yet. > > Currently possible solution may be deleting the whole subvolume. > If no further error happens, it may be fixed. > > IIRC btrfs check --repair in original mode has > DIR_ITEM/DIR_INDEX/INODE_REF repair function, but I'm not sure if it can > handle it well. > Btrfs check --repair *MAY* fix it, or it may make things worse. > If you have full backup, then you could try it. > Otherwise, don't try it at all. > > Other solution includes a specific repair program just for your case. 
> We can modify btrfs-corrupt-block to just delete the corrupted DIR_ITEM > (".sxt" one) and related DIR_INDEX/INODE_REF. > But I'll only choose this if you really need to fix it as soon as possible. > > At least we have solution for it. > I'm more concerned about how this happened. > > Any idea about the reproducer? Or just random memory corruption? > > Thanks, > Qu > > > > In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote: > >> Thanks for your dump. > >> > >> We're clear what is the direct cause of the problem. > >> > >> It's one corrupted DIR_ITEM causing the problem. > >> And further more, original mode btrfs check can't detect it, and we will > >> fix it soon. > >> > >> The corrupted DIR_ITEM is as the following: > >> item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88 > >> location key (4222342 INODE_ITEM 0) type FILE > >> transid 170929 data_len 0 name_len 14 > >> name: deprecated.sxt > >> location key (13590433 INODE_ITEM 0) type FILE > >> transid 796448 data_len 0 name_len 14 > >> name: deprecated.txt > >> > >> For dir inode 79177, it has 2 child inodes, with name "deprecated.txt" > >> (ino=4222342) and "deprecated.sxt" (ino=13590433) > >> > >> But something goes wrong here: > >> > >> 1) Hash of "deprecated.sxt" doesn't match 54846528 > >> > >> 2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt" > >> Also captured by dump: > >> item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24 > >> inode ref index 417 namelen 14 name: deprecated.txt > >> > >> 3) DIR_INDEX also shows that filename for inode 4222342 should be > >> "deprecated.txt" > >> item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44 > >> location key (4222342 INODE_ITEM 0) type FILE > >> transid 170929 data_len 0 name_len 14 > >> name: deprecated.txt > >> > >> So generic speaking, it's DIR_ITEM wrong and causing the problem. > >> > >> But the root reason is still unknown. 
> >> > >> What I can see is, the corrupted DIR_ITEM points to an very old inode, > >> its mtime is back to 2016-09-07. > >> While the good DIR_ITEM points to newer inode, whose mtime is just > >> 2017-05-02. > >> > >> But more weird, there should not be two child inodes with the same > >> filename ("depercated.txt", I assume the sxt one is caused by a memory > >> bit corruption). > >> > >> So, any details on the operation with util-linux/deprecated.txt will > >> help us to locate the root cause in kernel. > >> > >> Thanks, > >> Qu > >> > >> > >> On 2017年07月12日 21:11, Filippe LeMarchand wrote: > >>> Done, files added to same GDrive folder with corresponding names. > >>> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot. > >>> > >>> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote: > >>>> > >>>> On 2017年07月12日 19:12, Filippe LeMarchand wrote: > >>>>>> Maybe something wrong in grep happened which skip "(79177" ? > >>>>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated. > >>>> > >>>> It looks much better, thanks. > >>>> > >>>>> > >>>>> And btrfs check --mode=lowmem gives this: > >>>>> > >>>>> checking extents > >>>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5 > >>>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114 > >>>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5 > >>>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25 > >>>>> ERROR: errors found in extent allocation tree or chunk allocation > >>>> > >>>> Looks much like an exposed lowmem mode bug. 
> >>>> Feel free to ignore these error from extent tree, they are just false > >>>> alerts. > >>>> > >>>>> checking free space cache > >>>>> checking fs roots > >>>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1 > >>>> > >>>> The error report is much better than original mode, and that's what I need. > >>>> > >>>> Now I can wipe out all other noise as we know exactly which tree and > >>>> which DIR_ITEM/INODE_REF is causing the problem. > >>>> > >>>> Would you please update the dump result with "-t 4546" passed to > >>>> btrfs-debug-tree like: > >>>> > >>>> # btrfs-debug-tree -t 4546 <device>| grep 79177 > >>>> > >>>> Only "-t 4546" is added, to only dump the result of subvolume 4546. > >>>> As always, all 3 grep results (2 "deprecated" and one 79177) need to be > >>>> updated. > >>>> > >>>> And it seems that my previous assumption is still right for this case. > >>>> If it's caused by kernel, your dump would definitely help us to locate > >>>> the problem. > >>>> > >>>>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1 > >>>>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1 > >>>> > >>>> Also for root 5134 please. > >>>> > >>>> Thanks, > >>>> Qu > >>>> > >>>>> ERROR: errors found in fs roots > >>>>> Checking filesystem on /dev/sda2 > >>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e > >>>>> found 153429872640 bytes used, error(s) found > >>>>> total csum bytes: 121991672 > >>>>> total tree bytes: 1940160512 > >>>>> total fs tree bytes: 1683767296 > >>>>> total extent tree bytes: 103841792 > >>>>> btree space waste bytes: 310722480 > >>>>> file data blocks allocated: 842455031808 > >>>>> referenced 159286636544 > >>>>> > >>>>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote: > >>>>>> Sorry for the late reply. 
> >>>>>> > >>>>>> After investigating the dumps, I found the output is quite strange. > >>>>>> > >>>>>> 1) Mismatching output. > >>>>>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for > >>>>>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not > >>>>>> here at all. > >>>>>> > >>>>>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is epected > >>>>>> 79177 DIR_ITEM/DIR_INDEX. > >>>>>> > >>>>>> Maybe something wrong in grep happened which skip "(79177" ? > >>>>>> > >>>>>> 2) Mismatched hash > >>>>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the > >>>>>> number 54846528 is the hash(crc32c) of filename, and it contains 2 > >>>>>> items, one for "deprecated.txt" and one for "deprecated.sxt". > >>>>>> > >>>>>> But we found that 54846528 only matches the hash for "deprecated.txt", > >>>>>> not "deprecated.sxt". > >>>>>> > >>>>>> I think that's the main problem. > >>>>>> > >>>>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem > >>>>>> mode reports similar (well, output may differ) error? > >>>>>> > >>>>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure > >>>>>> that's the problem. > >>>>>> > >>>>>> However it may take some time before we can fix it in repair mode. > >>>>>> > >>>>>> Thanks, > >>>>>> Qu > >>>>>> > >>>>>> > >>>>>> > >>>>>> 在 2017年07月04日 21:24, Filippe LeMarchand 写道: > >>>>>>> Sure, here it is: > >>>>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc > >>>>>>> > >>>>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote: > >>>>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote: > >>>>>>>>>> Hello everyone. 
> >>>>>>>>>> > >>>>>>>>>> I have an btrfs root partition on Intel 530 ssd, which mounts without errors and seem to work fine, > >>>>>>>>>> but `btrfs check` gives me foloowing output (and --repair doesn't remove errors): > >>>>>>>>>> > >>>>>>>>>> enabling repair mode > >>>>>>>>>> Checking filesystem on /dev/sda2 > >>>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e > >>>>>>>>>> checking extents > >>>>>>>>>> Fixed 0 roots. > >>>>>>>>>> checking free space cache > >>>>>>>>>> cache and super generation don't match, space cache will be invalidated > >>>>>>>>>> checking fs roots > >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>> > >>>>>>>>> This means that in dir whose inode number is 79177, it has a child inode > >>>>>>>>> pointer pointing to depercated.sxt. > >>>>>>>>> > >>>>>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking > >>>>>>>>> the cross reference rule of btrfs. > >>>>>>>>> > >>>>>>>>> Would you please run the following command to dump needed info for us to > >>>>>>>>> debug? > >>>>>>>>> > >>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10 > >>>>>>>>> > >>>>>>>>> and > >>>>>>>>> > >>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10 > >>>>>>>>> > >>>>>>>>> and > >>>>>>>>> > >>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10 > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Considering the output has both .txt and .sxt, I think that's the problem. > >>>>>>>>> But such bit-flip should be detected by tree block csum. > >>>>>>>>> I'm not sure what's wrong with it. 
> >>>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> Qu > >>>>>>>>> > >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>> unresolved ref dir 79177 index 
417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>> checking csums > >>>>>>>>>> checking root refs > >>>>>>>>>> found 23421812736 bytes used err is 0 > >>>>>>>>>> total csum bytes: 21531608 > >>>>>>>>>> total tree bytes: 776650752 > >>>>>>>>>> total fs tree bytes: 711278592 > >>>>>>>>>> total extent tree bytes: 36798464 > >>>>>>>>>> btree space waste bytes: 116002036 > >>>>>>>>>> file data blocks allocated: 850546470912 > >>>>>>>>>> referenced 27611987968 > >>>>>>>>>> > >>>>>>>>>> Is it dangerous and what should I do about it? > >>>>>>>>>> > >>>>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache. > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > >>>>>>>>> the body of a message to majordomo@vger.kernel.org > >>>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>>>>> > >>>>>>>> I'm afraid that your mail may be rejected because the attachment size > >>>>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you > >>>>>>>> share the attachment by google drive? > >>>>>>>> > >>>>>>>> Lastly, while Qu's timing is too tight, I will assist you on this issue. > >>>>>>>> > [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5037 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
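The cross-reference rule that Qu describes in the quoted analysis above (a name link should appear consistently as a DIR_ITEM, a DIR_INDEX and an INODE_REF) can be sketched roughly as follows. This is only an illustrative model, not how btrfs check is implemented: the `DirEntry` type and the `check_cross_refs` helper are hypothetical names, and only the three items for inode 4222342 that appear in the dump are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirEntry:
    """One name -> inode link as recorded by a single item (simplified, hypothetical model)."""
    parent: int  # inode number of the directory
    ino: int     # inode number of the child
    name: str


def check_cross_refs(dir_items, dir_indexes, inode_refs):
    """Return (link, missing_item_types) for every link not present in all three sets."""
    problems = []
    all_links = set(dir_items) | set(dir_indexes) | set(inode_refs)
    for link in sorted(all_links, key=lambda e: (e.parent, e.ino, e.name)):
        missing = [
            label
            for label, items in (
                ("DIR_ITEM", dir_items),
                ("DIR_INDEX", dir_indexes),
                ("INODE_REF", inode_refs),
            )
            if link not in items
        ]
        if missing:
            problems.append((link, missing))
    return problems


# Values copied from the dump quoted in this thread (inode 4222342 only).
dir_items = {DirEntry(79177, 4222342, "deprecated.sxt")}
dir_indexes = {DirEntry(79177, 4222342, "deprecated.txt")}
inode_refs = {DirEntry(79177, 4222342, "deprecated.txt")}

problems = check_cross_refs(dir_items, dir_indexes, inode_refs)
for link, missing in problems:
    print(f"dir {link.parent} ino {link.ino} name {link.name}: no {', no '.join(missing)}")
```

Under these assumptions the model flags the ".sxt" name as lacking a DIR_INDEX and an INODE_REF, and the ".txt" name as lacking a DIR_ITEM, which is the same pair of complaints that original-mode btrfs check printed.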
* Re: Btrfs check reports errors, filesystem seems fine
  2017-07-14 12:04 ` Filippe LeMarchand
@ 2017-07-14 12:11 ` Qu Wenruo
  2017-07-14 12:26 ` Filippe LeMarchand
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2017-07-14 12:11 UTC (permalink / raw)
  To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

On 2017-07-14 20:04, Filippe LeMarchand wrote:
>> Currently possible solution may be deleting the whole subvolume.
> Can btrfs send (to external drive) and then btrfs receive back fix it? Or should I use simple cp/rsync?

You could try, if you have a backup.

Personally, I'm not sure whether it will work or make things worse.
Such a hash and name mismatch is really rare, and I don't know how kernel send
will handle it.

>
>> If you have full backup, then you could try it.
> It is my root subvolume (sensitive data is on other ones), thus it is expendable. Can btrfs check --repair damage other subvolumes?

Unfortunately, it may corrupt other subvolumes.
But judging from your fsck output, the possibility of corruption is not that
high, AFAIK.

I recommend backing up the other good subvolumes/snapshots using send and
receive, just in case.

>
>> Any idea about the reproducer? Or just random memory corruption?
> No idea why and no idea when. This partition is about a year and a half old, and I did btrfs check for the first time just about a month ago.
> Also I ran memtest recently and it didn't find any errors.

Well, that's common.
I'll focus on checking your dump result to make a special-purpose
btrfs-corrupt-block to fix your situation if no other method works for you.

Thanks,
Qu

>
> On Friday, July 14, 2017 14:28:58 MSK, Qu Wenruo wrote:
>>
>> On 2017-07-14 18:12, Filippe LeMarchand wrote:
>>> First "rm" on deprecated.txt worked, but file is still there.
Neither the file nor its parent directory can be deleted:
>>>
>>> $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
>>> rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
>>>
>>> $ sudo rm -rf /usr/share/doc/packages/util-linux/
>>> rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
>>>
>>> $ sudo ls -l /usr/share/doc/packages/util-linux/
>>> ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
>>> total 0
>>> -????????? ? ? ? ? ? deprecated.txt
>>
>> Similar behavior is also seen with a manually crafted image in our
>> environment.
>>
>> Su Yue has sent patches to enhance error detection and add a test case for
>> it, but repairing is not supported.
>>
>>>
>>> Reinstalling the util-linux package gives me two copies of that file (two files are also present on the previous snapshot):
>>>
>>> $ ls -l /usr/share/doc/packages/util-linux/
>>> total 104
>>> -rw-r--r-- 1 root root 18092 Jul 20 2016 COPYING
>>> -rw-r--r-- 1 root root 1391 Jul 20 2016 COPYING.BSD-3
>>> -rw-r--r-- 1 root root 26530 Jul 20 2016 COPYING.LGPLv2.1
>>> -rw-r--r-- 1 root root 1824 Jul 20 2016 COPYING.UCB
>>> -rw-r--r-- 1 root root 555 Jul 20 2016 README.licensing
>>> -rw-r--r-- 1 root root 3257 Jul 20 2016 blkid.txt
>>> -rw-r--r-- 1 root root 2264 Jul 20 2016 cal.txt
>>> -rw-r--r-- 1 root root 1913 Jul 20 2016 col.txt
>>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
>>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt
>>> -rw-r--r-- 1 root root 992 Jul 20 2016 getopt.txt
>>> -rw-r--r-- 1 root root 2437 Nov 2 2016 howto-debug.txt
>>> -rw-r--r-- 1 root root 148 Jul 20 2016 hwclock.txt
>>> -rw-r--r-- 1 root root 2617 Jul 20 2016 modems-with-agetty.txt
>>> -rw-r--r-- 1 root root 522 Jul 20 2016 mount.txt
>>> -rw-r--r-- 1 root root 448 Jul 20 2016 pg.txt
>>>
>>> So, is this situation actually dangerous? And what can I do to gather more information for you?
>> >> The situation won't be worse. I'd recommend not to take any snapshot of >> those subvolumes (4546 and 5134) to limit the corruption to those >> subvolumes. >> >> However there is also no easy way to fix it yet. >> >> Currently possible solution may be deleting the whole subvolume. >> If no further error happens, it may be fixed. >> >> IIRC btrfs check --repair in original mode has >> DIR_ITEM/DIR_INDEX/INODE_REF repair function, but I'm not sure if it can >> handle it well. >> Btrfs check --repair *MAY* fix it, or it may make things worse. >> If you have full backup, then you could try it. >> Otherwise, don't try it at all. >> >> Other solution includes a specific repair program just for your case. >> We can modify btrfs-corrupt-block to just delete the corrupted DIR_ITEM >> (".sxt" one) and related DIR_INDEX/INODE_REF. >> But I'll only choose this if you really need to fix it as soon as possible. >> >> At least we have solution for it. >> I'm more concerned about how this happened. >> >> Any idea about the reproducer? Or just random memory corruption? >> >> Thanks, >> Qu >>> >>> In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote: >>>> Thanks for your dump. >>>> >>>> We're clear what is the direct cause of the problem. >>>> >>>> It's one corrupted DIR_ITEM causing the problem. >>>> And further more, original mode btrfs check can't detect it, and we will >>>> fix it soon. 
>>>> >>>> The corrupted DIR_ITEM is as the following: >>>> item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88 >>>> location key (4222342 INODE_ITEM 0) type FILE >>>> transid 170929 data_len 0 name_len 14 >>>> name: deprecated.sxt >>>> location key (13590433 INODE_ITEM 0) type FILE >>>> transid 796448 data_len 0 name_len 14 >>>> name: deprecated.txt >>>> >>>> For dir inode 79177, it has 2 child inodes, with name "deprecated.txt" >>>> (ino=4222342) and "deprecated.sxt" (ino=13590433) >>>> >>>> But something goes wrong here: >>>> >>>> 1) Hash of "deprecated.sxt" doesn't match 54846528 >>>> >>>> 2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt" >>>> Also captured by dump: >>>> item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24 >>>> inode ref index 417 namelen 14 name: deprecated.txt >>>> >>>> 3) DIR_INDEX also shows that filename for inode 4222342 should be >>>> "deprecated.txt" >>>> item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44 >>>> location key (4222342 INODE_ITEM 0) type FILE >>>> transid 170929 data_len 0 name_len 14 >>>> name: deprecated.txt >>>> >>>> So generic speaking, it's DIR_ITEM wrong and causing the problem. >>>> >>>> But the root reason is still unknown. >>>> >>>> What I can see is, the corrupted DIR_ITEM points to an very old inode, >>>> its mtime is back to 2016-09-07. >>>> While the good DIR_ITEM points to newer inode, whose mtime is just >>>> 2017-05-02. >>>> >>>> But more weird, there should not be two child inodes with the same >>>> filename ("depercated.txt", I assume the sxt one is caused by a memory >>>> bit corruption). >>>> >>>> So, any details on the operation with util-linux/deprecated.txt will >>>> help us to locate the root cause in kernel. >>>> >>>> Thanks, >>>> Qu >>>> >>>> >>>> On 2017年07月12日 21:11, Filippe LeMarchand wrote: >>>>> Done, files added to same GDrive folder with corresponding names. 
>>>>> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot. >>>>> >>>>> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote: >>>>>> >>>>>> On 2017年07月12日 19:12, Filippe LeMarchand wrote: >>>>>>>> Maybe something wrong in grep happened which skip "(79177" ? >>>>>>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated. >>>>>> >>>>>> It looks much better, thanks. >>>>>> >>>>>>> >>>>>>> And btrfs check --mode=lowmem gives this: >>>>>>> >>>>>>> checking extents >>>>>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5 >>>>>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114 >>>>>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5 >>>>>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25 >>>>>>> ERROR: errors found in extent allocation tree or chunk allocation >>>>>> >>>>>> Looks much like an exposed lowmem mode bug. >>>>>> Feel free to ignore these error from extent tree, they are just false >>>>>> alerts. >>>>>> >>>>>>> checking free space cache >>>>>>> checking fs roots >>>>>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1 >>>>>> >>>>>> The error report is much better than original mode, and that's what I need. >>>>>> >>>>>> Now I can wipe out all other noise as we know exactly which tree and >>>>>> which DIR_ITEM/INODE_REF is causing the problem. 
>>>>>> >>>>>> Would you please update the dump result with "-t 4546" passed to >>>>>> btrfs-debug-tree like: >>>>>> >>>>>> # btrfs-debug-tree -t 4546 <device>| grep 79177 >>>>>> >>>>>> Only "-t 4546" is added, to only dump the result of subvolume 4546. >>>>>> As always, all 3 grep results (2 "deprecated" and one 79177) need to be >>>>>> updated. >>>>>> >>>>>> And it seems that my previous assumption is still right for this case. >>>>>> If it's caused by kernel, your dump would definitely help us to locate >>>>>> the problem. >>>>>> >>>>>>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1 >>>>>>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1 >>>>>> >>>>>> Also for root 5134 please. >>>>>> >>>>>> Thanks, >>>>>> Qu >>>>>> >>>>>>> ERROR: errors found in fs roots >>>>>>> Checking filesystem on /dev/sda2 >>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e >>>>>>> found 153429872640 bytes used, error(s) found >>>>>>> total csum bytes: 121991672 >>>>>>> total tree bytes: 1940160512 >>>>>>> total fs tree bytes: 1683767296 >>>>>>> total extent tree bytes: 103841792 >>>>>>> btree space waste bytes: 310722480 >>>>>>> file data blocks allocated: 842455031808 >>>>>>> referenced 159286636544 >>>>>>> >>>>>>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote: >>>>>>>> Sorry for the late reply. >>>>>>>> >>>>>>>> After investigating the dumps, I found the output is quite strange. >>>>>>>> >>>>>>>> 1) Mismatching output. >>>>>>>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for >>>>>>>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not >>>>>>>> here at all. >>>>>>>> >>>>>>>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is epected >>>>>>>> 79177 DIR_ITEM/DIR_INDEX. >>>>>>>> >>>>>>>> Maybe something wrong in grep happened which skip "(79177" ? 
>>>>>>>> >>>>>>>> 2) Mismatched hash >>>>>>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the >>>>>>>> number 54846528 is the hash(crc32c) of filename, and it contains 2 >>>>>>>> items, one for "deprecated.txt" and one for "deprecated.sxt". >>>>>>>> >>>>>>>> But we found that 54846528 only matches the hash for "deprecated.txt", >>>>>>>> not "deprecated.sxt". >>>>>>>> >>>>>>>> I think that's the main problem. >>>>>>>> >>>>>>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem >>>>>>>> mode reports similar (well, output may differ) error? >>>>>>>> >>>>>>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure >>>>>>>> that's the problem. >>>>>>>> >>>>>>>> However it may take some time before we can fix it in repair mode. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Qu >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 在 2017年07月04日 21:24, Filippe LeMarchand 写道: >>>>>>>>> Sure, here it is: >>>>>>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc >>>>>>>>> >>>>>>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote: >>>>>>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote: >>>>>>>>>>>> Hello everyone. >>>>>>>>>>>> >>>>>>>>>>>> I have an btrfs root partition on Intel 530 ssd, which mounts without errors and seem to work fine, >>>>>>>>>>>> but `btrfs check` gives me foloowing output (and --repair doesn't remove errors): >>>>>>>>>>>> >>>>>>>>>>>> enabling repair mode >>>>>>>>>>>> Checking filesystem on /dev/sda2 >>>>>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e >>>>>>>>>>>> checking extents >>>>>>>>>>>> Fixed 0 roots. 
>>>>>>>>>>>> checking free space cache >>>>>>>>>>>> cache and super generation don't match, space cache will be invalidated >>>>>>>>>>>> checking fs roots >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref >>>>>>>>>>> >>>>>>>>>>> This means that in dir whose inode number is 79177, it has a child inode >>>>>>>>>>> pointer pointing to depercated.sxt. >>>>>>>>>>> >>>>>>>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking >>>>>>>>>>> the cross reference rule of btrfs. >>>>>>>>>>> >>>>>>>>>>> Would you please run the following command to dump needed info for us to >>>>>>>>>>> debug? >>>>>>>>>>> >>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10 >>>>>>>>>>> >>>>>>>>>>> and >>>>>>>>>>> >>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10 >>>>>>>>>>> >>>>>>>>>>> and >>>>>>>>>>> >>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Considering the output has both .txt and .sxt, I think that's the problem. >>>>>>>>>>> But such bit-flip should be detected by tree block csum. >>>>>>>>>>> I'm not sure what's wrong with it. 
>>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Qu >>>>>>>>>>> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref >>>>>>>>>>>> unresolved ref dir 79177 index 
417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item >>>>>>>>>>>> checking csums >>>>>>>>>>>> checking root refs >>>>>>>>>>>> found 23421812736 bytes used err is 0 >>>>>>>>>>>> total csum bytes: 21531608 >>>>>>>>>>>> total tree bytes: 776650752 >>>>>>>>>>>> total fs tree bytes: 711278592 >>>>>>>>>>>> total extent tree bytes: 36798464 >>>>>>>>>>>> btree space waste bytes: 116002036 >>>>>>>>>>>> file data blocks allocated: 850546470912 >>>>>>>>>>>> referenced 27611987968 >>>>>>>>>>>> >>>>>>>>>>>> Is it dangerous and what should I do about it? >>>>>>>>>>>> >>>>>>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>>>>>>>>> the body of a message to majordomo@vger.kernel.org >>>>>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>>>>>> >>>>>>>>>> I'm afraid that your mail may be rejected because the attachment size >>>>>>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you >>>>>>>>>> share the attachment by google drive? >>>>>>>>>> >>>>>>>>>> Lastly, while Qu's timing is too tight, I will assist you on this issue. >>>>>>>>>> ^ permalink raw reply [flat|nested] 16+ messages in thread
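The hash check from the quoted analysis above (the offset of a DIR_ITEM key should equal the crc32c-based hash of the file name stored in it) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the btrfs-progs code: it assumes the name hash is a raw CRC-32C seeded with `(u32)~1` and without a final inversion, and the helper names `crc32c_raw`, `btrfs_name_hash` and `dir_item_name_matches` are illustrative only.

```python
def _make_crc32c_table():
    """Build the byte lookup table for CRC-32C (reflected polynomial 0x82F63B78)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c_raw(seed: int, data: bytes) -> int:
    """Run data through a CRC-32C register that starts at seed; no final inversion."""
    crc = seed & 0xFFFFFFFF
    for b in data:
        crc = _CRC32C_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc


def btrfs_name_hash(name: bytes) -> int:
    """Assumption: the directory name hash is crc32c seeded with (u32)~1."""
    return crc32c_raw(0xFFFFFFFE, name)


def dir_item_name_matches(key_offset: int, name: bytes) -> bool:
    """True if a DIR_ITEM key offset is consistent with the name stored in the item."""
    return btrfs_name_hash(name) == key_offset


if __name__ == "__main__":
    # Key offset and names taken from the dump quoted in this thread.
    for name in (b"deprecated.txt", b"deprecated.sxt"):
        print(name.decode(), btrfs_name_hash(name), dir_item_name_matches(54846528, name))
```

Because a CRC always separates two equal-length inputs that differ only within a single byte, at most one of the two names can be consistent with a given key offset, which is why one of the two entries sharing key (79177 DIR_ITEM 54846528) has to be wrong.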
* Re: Btrfs check reports errors, filesystem seems fine
  2017-07-14 12:11 ` Qu Wenruo
@ 2017-07-14 12:26 ` Filippe LeMarchand
  2017-07-14 12:41 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-14 12:26 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

[-- Attachment #1: Type: text/plain, Size: 18265 bytes --]

So, are my options the following?

a) Delete and re-create the subvolume
b) Try btrfs check --repair --mode original (if original mode is the default, it already didn't help)
c) Do nothing and wait for a further update

On Friday, July 14, 2017 15:11:05 MSK, Qu Wenruo wrote:
>
> On 2017-07-14 20:04, Filippe LeMarchand wrote:
> >> Currently possible solution may be deleting the whole subvolume.
> > Can btrfs send (to external drive) and then btrfs receive back fix it? Or should I use simple cp/rsync?
>
> You could try, if you have a backup.
>
> Personally, I'm not sure whether it will work or make things worse.
> Such a hash and name mismatch is really rare, and I don't know how kernel send
> will handle it.
>
> >
> >> If you have full backup, then you could try it.
> > It is my root subvolume (sensitive data is on other ones), thus it is expendable. Can btrfs check --repair damage other subvolumes?
>
> Unfortunately, it may corrupt other subvolumes.
> But judging from your fsck output, the possibility of corruption is not that
> high, AFAIK.
>
> I recommend backing up the other good subvolumes/snapshots using send and
> receive, just in case.
>
> >
> >> Any idea about the reproducer? Or just random memory corruption?
> > No idea why and no idea when. This partition is about a year and a half old, and I did btrfs check for the first time just about a month ago.
> > Also I ran memtest recently and it didn't find any errors.
>
> Well, that's common.
> I'll focus on checking your dump result to make a special-purpose
> btrfs-corrupt-block to fix your situation if no other method works for you.
> > Thanks, > Qu > > > > > In a letter from Friday, July 14, 2017 14:28:58 MSK user Qu Wenruo wrote: > >> > >> On 2017年07月14日 18:12, Filippe LeMarchand wrote: > >>> First "rm" on deprecated.txt worked, but file is still there. Neither the file, nor its parent directory cannot be deleted: > >>> > >>> $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt > >>> rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory > >>> > >>> $ sudo rm -rf /usr/share/doc/packages/util-linux/ > >>> rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty > >>> > >>> $ sudo ls -l /usr/share/doc/packages/util-linux/ > >>> ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory > >>> total 0 > >>> -????????? ? ? ? ? ? deprecated.txt > >> > >> Similar behavior is also detected using manually crafted image in our > >> environment. > >> > >> Su Yue have sent patches to enhance error detection and test case for > >> it, but repairing is not supported. 
> >> > >>> > >>> Reinstall of util-linux package gives me two of that file (and also two files present on previous snapshot): > >>> > >>> $ ls -l /usr/share/doc/packages/util-linux/ > >>> total 104 > >>> -rw-r--r-- 1 root root 18092 Jul 20 2016 COPYING > >>> -rw-r--r-- 1 root root 1391 Jul 20 2016 COPYING.BSD-3 > >>> -rw-r--r-- 1 root root 26530 Jul 20 2016 COPYING.LGPLv2.1 > >>> -rw-r--r-- 1 root root 1824 Jul 20 2016 COPYING.UCB > >>> -rw-r--r-- 1 root root 555 Jul 20 2016 README.licensing > >>> -rw-r--r-- 1 root root 3257 Jul 20 2016 blkid.txt > >>> -rw-r--r-- 1 root root 2264 Jul 20 2016 cal.txt > >>> -rw-r--r-- 1 root root 1913 Jul 20 2016 col.txt > >>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt > >>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt > >>> -rw-r--r-- 1 root root 992 Jul 20 2016 getopt.txt > >>> -rw-r--r-- 1 root root 2437 Nov 2 2016 howto-debug.txt > >>> -rw-r--r-- 1 root root 148 Jul 20 2016 hwclock.txt > >>> -rw-r--r-- 1 root root 2617 Jul 20 2016 modems-with-agetty.txt > >>> -rw-r--r-- 1 root root 522 Jul 20 2016 mount.txt > >>> -rw-r--r-- 1 root root 448 Jul 20 2016 pg.txt > >>> > >>> So, is this situation actually dangerous? And what can I do to gather more information for you? > >> > >> The situation won't be worse. I'd recommend not to take any snapshot of > >> those subvolumes (4546 and 5134) to limit the corruption to those > >> subvolumes. > >> > >> However there is also no easy way to fix it yet. > >> > >> Currently possible solution may be deleting the whole subvolume. > >> If no further error happens, it may be fixed. > >> > >> IIRC btrfs check --repair in original mode has > >> DIR_ITEM/DIR_INDEX/INODE_REF repair function, but I'm not sure if it can > >> handle it well. > >> Btrfs check --repair *MAY* fix it, or it may make things worse. > >> If you have full backup, then you could try it. > >> Otherwise, don't try it at all. > >> > >> Other solution includes a specific repair program just for your case. 
> >> We can modify btrfs-corrupt-block to just delete the corrupted DIR_ITEM
> >> (the ".sxt" one) and the related DIR_INDEX/INODE_REF.
> >> But I'll only choose this if you really need to fix it as soon as possible.
> >>
> >> At least we have a solution for it.
> >> I'm more concerned about how this happened.
> >>
> >> Any idea about the reproducer? Or just random memory corruption?
> >>
> >> Thanks,
> >> Qu
> >>>
> >>> In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote:
> >>>> Thanks for your dump.
> >>>>
> >>>> The direct cause of the problem is now clear.
> >>>>
> >>>> One corrupted DIR_ITEM is causing the problem.
> >>>> Furthermore, original mode btrfs check can't detect it, and we will
> >>>> fix that soon.
> >>>>
> >>>> The corrupted DIR_ITEM is as follows:
> >>>> item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88
> >>>>     location key (4222342 INODE_ITEM 0) type FILE
> >>>>     transid 170929 data_len 0 name_len 14
> >>>>     name: deprecated.sxt
> >>>>     location key (13590433 INODE_ITEM 0) type FILE
> >>>>     transid 796448 data_len 0 name_len 14
> >>>>     name: deprecated.txt
> >>>>
> >>>> According to this item, dir inode 79177 has 2 child inodes, named
> >>>> "deprecated.sxt" (ino=4222342) and "deprecated.txt" (ino=13590433).
> >>>>
> >>>> But something is wrong here:
> >>>>
> >>>> 1) The hash of "deprecated.sxt" doesn't match 54846528
> >>>>
> >>>> 2) The inode backref of inode 4222342 says its filename is "deprecated.txt"
> >>>> Also captured by the dump:
> >>>> item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24
> >>>>     inode ref index 417 namelen 14 name: deprecated.txt
> >>>>
> >>>> 3) DIR_INDEX also shows that the filename for inode 4222342 should be
> >>>> "deprecated.txt"
> >>>> item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44
> >>>>     location key (4222342 INODE_ITEM 0) type FILE
> >>>>     transid 170929 data_len 0 name_len 14
> >>>>     name: deprecated.txt
> >>>>
> >>>> So generally speaking, the DIR_ITEM is wrong and is causing the problem.
> >>>>
> >>>> But the root cause is still unknown.
> >>>>
> >>>> What I can see is that the corrupted DIR_ITEM points to a very old inode,
> >>>> whose mtime goes back to 2016-09-07,
> >>>> while the good DIR_ITEM points to a newer inode, whose mtime is just
> >>>> 2017-05-02.
> >>>>
> >>>> Even weirder, there should not be two child inodes with the same
> >>>> filename ("deprecated.txt"; I assume the sxt one is caused by a memory
> >>>> bit corruption).
> >>>>
> >>>> So any details on the operations done on util-linux/deprecated.txt will
> >>>> help us locate the root cause in the kernel.
> >>>>
> >>>> Thanks,
> >>>> Qu
> >>>>
> >>>>
> >>>> On 2017-07-12 21:11, Filippe LeMarchand wrote:
> >>>>> Done, files added to the same GDrive folder with corresponding names.
> >>>>> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot.
> >>>>>
> >>>>> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote:
> >>>>>>
> >>>>>> On 2017-07-12 19:12, Filippe LeMarchand wrote:
> >>>>>>>> Maybe something went wrong in grep so that it skipped "(79177"?
> >>>>>>> Yes, my bad. Now I used the grep -E "\(79177| 79177" pattern; the file on GDrive is updated.
> >>>>>>
> >>>>>> It looks much better, thanks.
> >>>>>>
> >>>>>>>
> >>>>>>> And btrfs check --mode=lowmem gives this:
> >>>>>>>
> >>>>>>> checking extents
> >>>>>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5
> >>>>>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114
> >>>>>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5
> >>>>>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25
> >>>>>>> ERROR: errors found in extent allocation tree or chunk allocation
> >>>>>>
> >>>>>> This looks very much like a lowmem mode bug being exposed.
> >>>>>> Feel free to ignore these errors from the extent tree; they are just false
> >>>>>> alerts.
> >>>>>>
> >>>>>>> checking free space cache
> >>>>>>> checking fs roots
> >>>>>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>>>>>
> >>>>>> The error report is much better than in original mode, and that's what I need.
> >>>>>>
> >>>>>> Now I can filter out all the other noise, as we know exactly which tree and
> >>>>>> which DIR_ITEM/INODE_REF is causing the problem.
> >>>>>>
> >>>>>> Would you please update the dump result with "-t 4546" passed to
> >>>>>> btrfs-debug-tree, like:
> >>>>>>
> >>>>>> # btrfs-debug-tree -t 4546 <device> | grep 79177
> >>>>>>
> >>>>>> Only "-t 4546" is added, so that only subvolume 4546 is dumped.
> >>>>>> As before, all 3 grep results (2 "deprecated" and one 79177) need to be
> >>>>>> updated.
> >>>>>>
> >>>>>> And it seems that my previous assumption is still right for this case.
> >>>>>> If it's caused by the kernel, your dump would definitely help us locate
> >>>>>> the problem.
> >>>>>>
> >>>>>>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1
> >>>>>>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1
> >>>>>>
> >>>>>> Also for root 5134 please.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Qu
> >>>>>>
> >>>>>>> ERROR: errors found in fs roots
> >>>>>>> Checking filesystem on /dev/sda2
> >>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>>>> found 153429872640 bytes used, error(s) found
> >>>>>>> total csum bytes: 121991672
> >>>>>>> total tree bytes: 1940160512
> >>>>>>> total fs tree bytes: 1683767296
> >>>>>>> total extent tree bytes: 103841792
> >>>>>>> btree space waste bytes: 310722480
> >>>>>>> file data blocks allocated: 842455031808
> >>>>>>> referenced 159286636544
> >>>>>>>
> >>>>>>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote:
> >>>>>>>> Sorry for the late reply.
> >>>>>>>>
> >>>>>>>> After investigating the dumps, I found the output is quite strange.
> >>>>>>>>
> >>>>>>>> 1) Mismatching output.
> >>>>>>>> In "btrfs-debug-tree-grep-79177.txt" I found 79177 only as the offset of
> >>>>>>>> INODE_REF keys, while 79177 as the objectid of DIR_ITEM/DIR_INDEX keys is
> >>>>>>>> not there at all.
> >>>>>>>>
> >>>>>>>> In "btrfs-debug-tree-grep-deprecated-txt.txt", however, the expected
> >>>>>>>> 79177 DIR_ITEM/DIR_INDEX is present.
> >>>>>>>>
> >>>>>>>> Maybe something went wrong in grep so that it skipped "(79177"?
> >>>>>>>>
> >>>>>>>> 2) Mismatched hash
> >>>>>>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the
> >>>>>>>> number 54846528 is the hash (crc32c) of the filename, and the item contains
> >>>>>>>> 2 entries, one for "deprecated.txt" and one for "deprecated.sxt".
> >>>>>>>>
> >>>>>>>> But we found that 54846528 only matches the hash for "deprecated.txt",
> >>>>>>>> not "deprecated.sxt".
> >>>>>>>>
> >>>>>>>> I think that's the main problem.
> >>>>>>>>
> >>>>>>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem
> >>>>>>>> mode reports a similar error (well, the output may differ)?
> >>>>>>>>
> >>>>>>>> If lowmem mode also reports an error on that DIR_ITEM, I'm pretty sure
> >>>>>>>> that's the problem.
> >>>>>>>>
> >>>>>>>> However, it may take some time before we can fix it in repair mode.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Qu
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2017-07-04 21:24, Filippe LeMarchand wrote:
> >>>>>>>>> Sure, here it is:
> >>>>>>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc
> >>>>>>>>>
> >>>>>>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote:
> >>>>>>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote:
> >>>>>>>>>>>> Hello everyone.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I have a btrfs root partition on an Intel 530 SSD, which mounts without errors and seems to work fine,
> >>>>>>>>>>>> but `btrfs check` gives me the following output (and --repair doesn't remove the errors):
> >>>>>>>>>>>>
> >>>>>>>>>>>> enabling repair mode
> >>>>>>>>>>>> Checking filesystem on /dev/sda2
> >>>>>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e
> >>>>>>>>>>>> checking extents
> >>>>>>>>>>>> Fixed 0 roots.
> >>>>>>>>>>>> checking free space cache
> >>>>>>>>>>>> cache and super generation don't match, space cache will be invalidated
> >>>>>>>>>>>> checking fs roots
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>
> >>>>>>>>>>> This means that the dir whose inode number is 79177 has a child inode
> >>>>>>>>>>> pointer pointing to deprecated.sxt.
> >>>>>>>>>>>
> >>>>>>>>>>> But it doesn't have a dir index or the corresponding inode ref, which breaks
> >>>>>>>>>>> the cross-reference rule of btrfs.
> >>>>>>>>>>>
> >>>>>>>>>>> Would you please run the following commands to dump the info we need to
> >>>>>>>>>>> debug this?
> >>>>>>>>>>>
> >>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10
> >>>>>>>>>>>
> >>>>>>>>>>> and
> >>>>>>>>>>>
> >>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10
> >>>>>>>>>>>
> >>>>>>>>>>> and
> >>>>>>>>>>>
> >>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Considering the output has both .txt and .sxt, I think that's the problem.
> >>>>>>>>>>> But such a bit flip should be detected by the tree block csum.
> >>>>>>>>>>> I'm not sure what's wrong with it.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Qu
> >>>>>>>>>>>
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref
> >>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item
> >>>>>>>>>>>> checking csums
> >>>>>>>>>>>> checking root refs
> >>>>>>>>>>>> found 23421812736 bytes used err is 0
> >>>>>>>>>>>> total csum bytes: 21531608
> >>>>>>>>>>>> total tree bytes: 776650752
> >>>>>>>>>>>> total fs tree bytes: 711278592
> >>>>>>>>>>>> total extent tree bytes: 36798464
> >>>>>>>>>>>> btree space waste bytes: 116002036
> >>>>>>>>>>>> file data blocks allocated: 850546470912
> >>>>>>>>>>>> referenced 27611987968
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is it dangerous and what should I do about it?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I also tried --clear-space-cache, but it just removes the line about the space cache.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> >>>>>>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>>>>>>>>
> >>>>>>>>>> I'm afraid that your mail may be rejected because the attachment size
> >>>>>>>>>> exceeds the allowable limit (100kB) of the btrfs mailing list. Could you
> >>>>>>>>>> share the attachment via Google Drive?
> >>>>>>>>>>
> >>>>>>>>>> Lastly, while Qu is short on time, I will assist you with this issue.
> >>>>>>>>>>
>

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5037 bytes --]

^ permalink raw reply [flat|nested] 16+ messages in thread
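The hash mismatch discussed in the quoted messages can be checked by hand. The following is a minimal sketch, not btrfs-progs code. It assumes that the DIR_ITEM key offset is a raw CRC32C (Castagnoli polynomial, reflected form 0x82F63B78) of the filename, seeded with 0xFFFFFFFE and without a final XOR. The helper names crc32c_raw and btrfs_name_hash are illustrative only. If the assumption holds, "deprecated.txt" should hash to the key offset 54846528 from the dump, and "deprecated.sxt" should not.

```python
# Sketch: recompute the directory-name hash that btrfs is assumed to use as
# the DIR_ITEM key offset. Assumptions (not taken from the thread): raw
# CRC32C, seed 0xFFFFFFFE, no final XOR.

def _make_table():
    """Build the 256-entry lookup table for reflected CRC32C."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table

_TABLE = _make_table()

def crc32c_raw(seed, data):
    """CRC32C over data, starting from seed, without pre/post inversion."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc

def btrfs_name_hash(name):
    """Hypothetical helper: name hash under the assumptions stated above."""
    return crc32c_raw(0xFFFFFFFE, name.encode())

if __name__ == "__main__":
    key_offset = 54846528  # from key (79177 DIR_ITEM 54846528) in the dump
    for name in ("deprecated.txt", "deprecated.sxt"):
        h = btrfs_name_hash(name)
        verdict = "matches key offset" if h == key_offset else "does not match key offset"
        print(name, h, verdict)
```

Run it with python3; it prints each name, its computed hash, and whether that hash equals the key offset. Because the two names differ in a single byte, CRC32C is guaranteed to give them different hashes, so at most one of them can belong under that key.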
* Re: Btrfs check reports errors, filesystem seems fine 2017-07-14 12:26 ` Filippe LeMarchand @ 2017-07-14 12:41 ` Qu Wenruo 2017-07-14 12:45 ` Filippe LeMarchand 0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2017-07-14 12:41 UTC (permalink / raw)
To: Filippe LeMarchand; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

On 2017-07-14 20:26, Filippe LeMarchand wrote:
> So, my options are
> a) Delete and re-create the subvolume
> b) Try btrfs check --repair --mode original (if original mode is default, it already didn't help)

Then --repair doesn't help for now.

> c) Do nothing and wait for further update

The plan for further updates includes:
c) Update btrfs check --repair to handle your case.
   This will take some time for us to test and for others to review.

d) Create a special-purpose btrfs-corrupt-block patch for your image.
   This will fix your fs, but only your fs.
   Not a generic solution, but at least it should work.

For now, it's recommended to back up important data, in case both c) and d)
fail.

Thanks,
Qu

> ?
>
> In a letter from Friday, July 14, 2017 15:11:05 MSK user Qu Wenruo wrote:
>>
>> On 2017-07-14 20:04, Filippe LeMarchand wrote:
>>>> A currently possible solution may be deleting the whole subvolume.
>>> Can btrfs send (to external drive) and then btrfs receive back fix it? Or should I use simple cp/rsync?
>>
>> You could try it if you have a backup.
>>
>> Personally speaking, I'm not sure if it will work or make things worse.
>> Such a hash and name mismatch is really rare; I don't know how kernel send
>> will handle it.
>>
>>>
>>>> If you have a full backup, then you could try it.
>>> It is my root subvolume (sensitive data is on other ones), thus it is expendable. Can btrfs check --repair damage other subvolumes?
>>
>> Unfortunately, it may corrupt other subvolumes.
>> But from your fsck output, the possibility of corruption is not that
>> high AFAIK.
>>
>> I recommend backing up the other good subvolumes/snapshots using send and
>> receive, just in case.
>>
>>>
>>>> Any idea about the reproducer? Or just random memory corruption?
>>> No idea why and no idea when. This partition is about a year and a half old, and I ran btrfs check for the first time just about a month ago.
>>> Also I ran memtest recently and it didn't find any errors.
>>
>> Well, that's common.
>> I'll focus on checking your dump result to make a special-purpose
>> btrfs-corrupt-block to fix your situation if no other method works for you.
>>
>> Thanks,
>> Qu
>>
>>>
>>> In a letter from Friday, July 14, 2017 14:28:58 MSK user Qu Wenruo wrote:
>>>>
>>>> On 2017-07-14 18:12, Filippe LeMarchand wrote:
>>>>> The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:
>>>>>
>>>>> $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
>>>>> rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
>>>>>
>>>>> $ sudo rm -rf /usr/share/doc/packages/util-linux/
>>>>> rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
>>>>>
>>>>> $ sudo ls -l /usr/share/doc/packages/util-linux/
>>>>> ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
>>>>> total 0
>>>>> -????????? ? ? ? ? ? deprecated.txt
>>>>
>>>> Similar behavior is also observed with a manually crafted image in our
>>>> environment.
>>>>
>>>> Su Yue has sent patches to enhance error detection, plus a test case for
>>>> it, but repairing is not supported.
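The cross-reference rule mentioned in this thread (a DIR_ITEM, a DIR_INDEX and an INODE_REF should all agree on directory, child inode and name) can be illustrated with a small model. This is a simplified sketch under stated assumptions, not how btrfs check is implemented: the record types and the cross_check function are hypothetical, and only the three records for inode 4222342 that are quoted in the dump are modelled.

```python
# Sketch: a toy consistency check between DIR_ITEM, DIR_INDEX and INODE_REF
# records. The record layouts are simplified for illustration and do not
# mirror the on-disk structures.
from collections import namedtuple

DirItem = namedtuple("DirItem", "dir_ino name child_ino")
DirIndex = namedtuple("DirIndex", "dir_ino index name child_ino")
InodeRef = namedtuple("InodeRef", "child_ino dir_ino index name")

def _key(rec):
    """Identity that all three record types must agree on."""
    return (rec.dir_ino, rec.child_ino, rec.name)

def cross_check(dir_items, dir_indexes, inode_refs):
    """Return a list of human-readable inconsistencies."""
    items = {_key(r) for r in dir_items}
    indexes = {_key(r) for r in dir_indexes}
    refs = {_key(r) for r in inode_refs}
    problems = []
    for r in dir_items:
        if _key(r) not in indexes:
            problems.append(f"DIR_ITEM {r.name} -> {r.child_ino}: no dir index")
        if _key(r) not in refs:
            problems.append(f"DIR_ITEM {r.name} -> {r.child_ino}: no inode ref")
    for r in dir_indexes:
        if _key(r) not in items:
            problems.append(f"DIR_INDEX {r.index} {r.name} -> {r.child_ino}: no dir item")
    for r in inode_refs:
        if _key(r) not in items:
            problems.append(f"INODE_REF {r.index} {r.name} ({r.child_ino}): no dir item")
    return problems

# Records for inode 4222342 as quoted from the dump in this thread.
DIR_ITEMS = [DirItem(79177, "deprecated.sxt", 4222342)]
DIR_INDEXES = [DirIndex(79177, 417, "deprecated.txt", 4222342)]
INODE_REFS = [InodeRef(4222342, 79177, 417, "deprecated.txt")]

if __name__ == "__main__":
    for line in cross_check(DIR_ITEMS, DIR_INDEXES, INODE_REFS):
        print(line)
```

With these inputs the model flags the ".sxt" DIR_ITEM entry as lacking both a dir index and an inode ref, and the index 417 ".txt" records as lacking a dir item, which is the same pattern the original-mode check output reports.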
* Re: Btrfs check reports errors, filesystem seems fine 2017-07-14 12:41 ` Qu Wenruo @ 2017-07-14 12:45 ` Filippe LeMarchand 0 siblings, 0 replies; 16+ messages in thread
From: Filippe LeMarchand @ 2017-07-14 12:45 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Lu Fengqi, linux-btrfs, Qu Wenruo

[-- Attachment #1: Type: text/plain, Size: 19713 bytes --]

Ok then, many thanks.

In a letter from Friday, July 14, 2017 15:41:22 MSK user Qu Wenruo wrote:
>
> On 2017-07-14 20:26, Filippe LeMarchand wrote:
> > So, my options are
> > a) Delete and re-create the subvolume
> > b) Try btrfs check --repair --mode original (if original mode is default, it already didn't help)
>
> Then --repair doesn't help for now.
>
> > c) Do nothing and wait for further update
>
> The plan for further updates includes:
> c) Update btrfs check --repair to handle your case.
>    This will take some time for us to test and for others to review.
>
> d) Create a special-purpose btrfs-corrupt-block patch for your image.
>    This will fix your fs, but only your fs.
>    Not a generic solution, but at least it should work.
>
> For now, it's recommended to back up important data, in case both c) and d)
> fail.
>
> Thanks,
> Qu
> > ?
> >
> > In a letter from Friday, July 14, 2017 15:11:05 MSK user Qu Wenruo wrote:
> >>
> >> On 2017-07-14 20:04, Filippe LeMarchand wrote:
> >>>> A currently possible solution may be deleting the whole subvolume.
> >>> Can btrfs send (to external drive) and then btrfs receive back fix it? Or should I use simple cp/rsync?
> >>
> >> You could try it if you have a backup.
> >>
> >> Personally speaking, I'm not sure if it will work or make things worse.
> >> Such a hash and name mismatch is really rare; I don't know how kernel send
> >> will handle it.
> >>
> >>>
> >>>> If you have a full backup, then you could try it.
> >>> It is my root subvolume (sensitive data is on other ones), thus it is expendable. Can btrfs check --repair damage other subvolumes?
> >>
> >> Unfortunately, it may corrupt other subvolumes.
> >> But from your fsck output, the possibility of corruption is not that
> >> high AFAIK.
> >>
> >> I recommend backing up the other good subvolumes/snapshots using send and
> >> receive, just in case.
> >>
> >>>
> >>>> Any idea about the reproducer? Or just random memory corruption?
> >>> No idea why and no idea when. This partition is about a year and a half old, and I ran btrfs check for the first time just about a month ago.
> >>> Also I ran memtest recently and it didn't find any errors.
> >>
> >> Well, that's common.
> >> I'll focus on checking your dump result to make a special-purpose
> >> btrfs-corrupt-block to fix your situation if no other method works for you.
> >>
> >> Thanks,
> >> Qu
> >>
> >>>
> >>> In a letter from Friday, July 14, 2017 14:28:58 MSK user Qu Wenruo wrote:
> >>>>
> >>>> On 2017-07-14 18:12, Filippe LeMarchand wrote:
> >>>>> The first "rm" on deprecated.txt worked, but the file is still there. Neither the file nor its parent directory can be deleted:
> >>>>>
> >>>>> $ sudo rm /usr/share/doc/packages/util-linux/deprecated.txt
> >>>>> rm: cannot remove '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> >>>>>
> >>>>> $ sudo rm -rf /usr/share/doc/packages/util-linux/
> >>>>> rm: cannot remove '/usr/share/doc/packages/util-linux/': Directory not empty
> >>>>>
> >>>>> $ sudo ls -l /usr/share/doc/packages/util-linux/
> >>>>> ls: cannot access '/usr/share/doc/packages/util-linux/deprecated.txt': No such file or directory
> >>>>> total 0
> >>>>> -????????? ? ? ? ? ? deprecated.txt
> >>>>
> >>>> Similar behavior is also observed with a manually crafted image in our
> >>>> environment.
> >>>>
> >>>> Su Yue has sent patches to enhance error detection, plus a test case for
> >>>> it, but repairing is not supported.
> >>>> > >>>>> > >>>>> Reinstall of util-linux package gives me two of that file (and also two files present on previous snapshot): > >>>>> > >>>>> $ ls -l /usr/share/doc/packages/util-linux/ > >>>>> total 104 > >>>>> -rw-r--r-- 1 root root 18092 Jul 20 2016 COPYING > >>>>> -rw-r--r-- 1 root root 1391 Jul 20 2016 COPYING.BSD-3 > >>>>> -rw-r--r-- 1 root root 26530 Jul 20 2016 COPYING.LGPLv2.1 > >>>>> -rw-r--r-- 1 root root 1824 Jul 20 2016 COPYING.UCB > >>>>> -rw-r--r-- 1 root root 555 Jul 20 2016 README.licensing > >>>>> -rw-r--r-- 1 root root 3257 Jul 20 2016 blkid.txt > >>>>> -rw-r--r-- 1 root root 2264 Jul 20 2016 cal.txt > >>>>> -rw-r--r-- 1 root root 1913 Jul 20 2016 col.txt > >>>>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt > >>>>> -rw-r--r-- 1 root root 2825 May 2 13:17 deprecated.txt > >>>>> -rw-r--r-- 1 root root 992 Jul 20 2016 getopt.txt > >>>>> -rw-r--r-- 1 root root 2437 Nov 2 2016 howto-debug.txt > >>>>> -rw-r--r-- 1 root root 148 Jul 20 2016 hwclock.txt > >>>>> -rw-r--r-- 1 root root 2617 Jul 20 2016 modems-with-agetty.txt > >>>>> -rw-r--r-- 1 root root 522 Jul 20 2016 mount.txt > >>>>> -rw-r--r-- 1 root root 448 Jul 20 2016 pg.txt > >>>>> > >>>>> So, is this situation actually dangerous? And what can I do to gather more information for you? > >>>> > >>>> The situation won't be worse. I'd recommend not to take any snapshot of > >>>> those subvolumes (4546 and 5134) to limit the corruption to those > >>>> subvolumes. > >>>> > >>>> However there is also no easy way to fix it yet. > >>>> > >>>> Currently possible solution may be deleting the whole subvolume. > >>>> If no further error happens, it may be fixed. > >>>> > >>>> IIRC btrfs check --repair in original mode has > >>>> DIR_ITEM/DIR_INDEX/INODE_REF repair function, but I'm not sure if it can > >>>> handle it well. > >>>> Btrfs check --repair *MAY* fix it, or it may make things worse. > >>>> If you have full backup, then you could try it. > >>>> Otherwise, don't try it at all. 
> >>>> > >>>> Other solution includes a specific repair program just for your case. > >>>> We can modify btrfs-corrupt-block to just delete the corrupted DIR_ITEM > >>>> (".sxt" one) and related DIR_INDEX/INODE_REF. > >>>> But I'll only choose this if you really need to fix it as soon as possible. > >>>> > >>>> At least we have solution for it. > >>>> I'm more concerned about how this happened. > >>>> > >>>> Any idea about the reproducer? Or just random memory corruption? > >>>> > >>>> Thanks, > >>>> Qu > >>>>> > >>>>> In a letter from Friday, July 14, 2017 9:11:06 MSK user Qu Wenruo wrote: > >>>>>> Thanks for your dump. > >>>>>> > >>>>>> We're clear what is the direct cause of the problem. > >>>>>> > >>>>>> It's one corrupted DIR_ITEM causing the problem. > >>>>>> And further more, original mode btrfs check can't detect it, and we will > >>>>>> fix it soon. > >>>>>> > >>>>>> The corrupted DIR_ITEM is as the following: > >>>>>> item 72 key (79177 DIR_ITEM 54846528) itemoff 12380 itemsize 88 > >>>>>> location key (4222342 INODE_ITEM 0) type FILE > >>>>>> transid 170929 data_len 0 name_len 14 > >>>>>> name: deprecated.sxt > >>>>>> location key (13590433 INODE_ITEM 0) type FILE > >>>>>> transid 796448 data_len 0 name_len 14 > >>>>>> name: deprecated.txt > >>>>>> > >>>>>> For dir inode 79177, it has 2 child inodes, with name "deprecated.txt" > >>>>>> (ino=4222342) and "deprecated.sxt" (ino=13590433) > >>>>>> > >>>>>> But something goes wrong here: > >>>>>> > >>>>>> 1) Hash of "deprecated.sxt" doesn't match 54846528 > >>>>>> > >>>>>> 2) Inode backref of inode 4222342 thinks its filename is "deprecated.txt" > >>>>>> Also captured by dump: > >>>>>> item 40 key (4222342 INODE_REF 79177) itemoff 7189 itemsize 24 > >>>>>> inode ref index 417 namelen 14 name: deprecated.txt > >>>>>> > >>>>>> 3) DIR_INDEX also shows that filename for inode 4222342 should be > >>>>>> "deprecated.txt" > >>>>>> item 87 key (79177 DIR_INDEX 417) itemoff 11757 itemsize 44 > >>>>>> location key 
(4222342 INODE_ITEM 0) type FILE > >>>>>> transid 170929 data_len 0 name_len 14 > >>>>>> name: deprecated.txt > >>>>>> > >>>>>> So generally speaking, the DIR_ITEM is wrong and is causing the problem. > >>>>>> > >>>>>> But the root cause is still unknown. > >>>>>> > >>>>>> What I can see is, the corrupted DIR_ITEM points to a very old inode, > >>>>>> its mtime dates back to 2016-09-07. > >>>>>> While the good DIR_ITEM points to a newer inode, whose mtime is just > >>>>>> 2017-05-02. > >>>>>> > >>>>>> But even weirder, there should not be two child inodes with the same > >>>>>> filename ("deprecated.txt", I assume the sxt one is caused by a memory > >>>>>> bit corruption). > >>>>>> > >>>>>> So, any details on the operation with util-linux/deprecated.txt will > >>>>>> help us to locate the root cause in kernel. > >>>>>> > >>>>>> Thanks, > >>>>>> Qu > >>>>>> > >>>>>> > >>>>>> On 2017-07-12 21:11, Filippe LeMarchand wrote: > >>>>>>> Done, files added to same GDrive folder with corresponding names. > >>>>>>> If it matters, subvol 4546 is my root filesystem (r/w snapshot created with snapper rollback), and 5134 is its snapshot. > >>>>>>> > >>>>>>> In a letter dated Wednesday, July 12, 2017 15:44:52 MSK user Qu Wenruo wrote: > >>>>>>>> > >>>>>>>> On 2017-07-12 19:12, Filippe LeMarchand wrote: > >>>>>>>>>> Maybe something wrong in grep happened which skip "(79177" ? > >>>>>>>>> Yes, my bad. Now I used grep -E "\(79177| 79177" pattern, file on GDrive updated. > >>>>>>>> > >>>>>>>> It looks much better, thanks. 
> >>>>>>>> > >>>>>>>>> > >>>>>>>>> And btrfs check --mode=lowmem gives this: > >>>>>>>>> > >>>>>>>>> checking extents > >>>>>>>>> ERROR: extent[1609877700608, 94208] referencer count mismatch (root: 260, owner: 61720, offset: 6742016) wanted: 2, have: 5 > >>>>>>>>> ERROR: extent[1630301675520, 39583744] referencer count mismatch (root: 260, owner: 5847554, offset: 0) wanted: 36, have: 114 > >>>>>>>>> ERROR: extent[1658646986752, 10551296] referencer count mismatch (root: 274, owner: 283675, offset: 0) wanted: 2, have: 5 > >>>>>>>>> ERROR: extent[1672239132672, 84381696] referencer count mismatch (root: 274, owner: 2521382, offset: 0) wanted: 21, have: 25 > >>>>>>>>> ERROR: errors found in extent allocation tree or chunk allocation > >>>>>>>> > >>>>>>>> Looks much like an exposed lowmem mode bug. > >>>>>>>> Feel free to ignore these error from extent tree, they are just false > >>>>>>>> alerts. > >>>>>>>> > >>>>>>>>> checking free space cache > >>>>>>>>> checking fs roots > >>>>>>>>> ERROR: root 4546 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1 > >>>>>>>> > >>>>>>>> The error report is much better than original mode, and that's what I need. > >>>>>>>> > >>>>>>>> Now I can wipe out all other noise as we know exactly which tree and > >>>>>>>> which DIR_ITEM/INODE_REF is causing the problem. > >>>>>>>> > >>>>>>>> Would you please update the dump result with "-t 4546" passed to > >>>>>>>> btrfs-debug-tree like: > >>>>>>>> > >>>>>>>> # btrfs-debug-tree -t 4546 <device>| grep 79177 > >>>>>>>> > >>>>>>>> Only "-t 4546" is added, to only dump the result of subvolume 4546. > >>>>>>>> As always, all 3 grep results (2 "deprecated" and one 79177) need to be > >>>>>>>> updated. > >>>>>>>> > >>>>>>>> And it seems that my previous assumption is still right for this case. > >>>>>>>> If it's caused by kernel, your dump would definitely help us to locate > >>>>>>>> the problem. 
> >>>>>>>> > >>>>>>>>> ERROR: root 4546 INODE REF[4222342 79177] and DIR_ITEM[79177 54846528] mismatch namelen 14 filename deprecated.txt filetype 1 > >>>>>>>>> ERROR: root 5134 DIR_ITEM[79177 54846528] relative INODE_REF missing namelen 14 filename deprecated.sxt filetype 1 > >>>>>>>> > >>>>>>>> Also for root 5134 please. > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Qu > >>>>>>>> > >>>>>>>>> ERROR: errors found in fs roots > >>>>>>>>> Checking filesystem on /dev/sda2 > >>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e > >>>>>>>>> found 153429872640 bytes used, error(s) found > >>>>>>>>> total csum bytes: 121991672 > >>>>>>>>> total tree bytes: 1940160512 > >>>>>>>>> total fs tree bytes: 1683767296 > >>>>>>>>> total extent tree bytes: 103841792 > >>>>>>>>> btree space waste bytes: 310722480 > >>>>>>>>> file data blocks allocated: 842455031808 > >>>>>>>>> referenced 159286636544 > >>>>>>>>> > >>>>>>>>> In a letter from Wednesday, July 12, 2017 10:15:18 MSK user Qu Wenruo wrote: > >>>>>>>>>> Sorry for the late reply. > >>>>>>>>>> > >>>>>>>>>> After investigating the dumps, I found the output is quite strange. > >>>>>>>>>> > >>>>>>>>>> 1) Mismatching output. > >>>>>>>>>> In "btrfs-debug-tree-grep-79177.txt" I found only 79177 as offset for > >>>>>>>>>> INODE_REF is here, while 79177 as objectid for DIR_ITEM/DIR_INDEX is not > >>>>>>>>>> here at all. > >>>>>>>>>> > >>>>>>>>>> While in "btrfs-debug-tree-grep-deprecated-txt.txt" there is the expected > >>>>>>>>>> 79177 DIR_ITEM/DIR_INDEX. > >>>>>>>>>> > >>>>>>>>>> Maybe something wrong in grep happened which skip "(79177" ? > >>>>>>>>>> > >>>>>>>>>> 2) Mismatched hash > >>>>>>>>>> The main problem I found is that, for key (79177 DIR_ITEM 54846528), the > >>>>>>>>>> number 54846528 is the hash(crc32c) of filename, and it contains 2 > >>>>>>>>>> items, one for "deprecated.txt" and one for "deprecated.sxt". 
> >>>>>>>>>> > >>>>>>>>>> But we found that 54846528 only matches the hash for "deprecated.txt", > >>>>>>>>>> not "deprecated.sxt". > >>>>>>>>>> > >>>>>>>>>> I think that's the main problem. > >>>>>>>>>> > >>>>>>>>>> BTW, would you please try "btrfs check --mode=lowmem" to see if lowmem > >>>>>>>>>> mode reports similar (well, output may differ) error? > >>>>>>>>>> > >>>>>>>>>> If lowmem mode also reports error on such DIR_ITEM, I'm pretty sure > >>>>>>>>>> that's the problem. > >>>>>>>>>> > >>>>>>>>>> However it may take some time before we can fix it in repair mode. > >>>>>>>>>> > >>>>>>>>>> Thanks, > >>>>>>>>>> Qu > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On 2017-07-04 21:24, Filippe LeMarchand wrote: > >>>>>>>>>>> Sure, here it is: > >>>>>>>>>>> https://drive.google.com/drive/folders/0B1ax9Am81gx9YjJBVVA0LXRHeGc > >>>>>>>>>>> > >>>>>>>>>>> In a letter dated Tuesday, July 4, 2017 16:16:36 MSK user Lu Fengqi wrote: > >>>>>>>>>>>> On Mon, Jul 03, 2017 at 08:34:52AM +0800, Qu Wenruo wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> At 07/01/2017 07:59 PM, Filippe LeMarchand wrote: > >>>>>>>>>>>>>> Hello everyone. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I have a btrfs root partition on Intel 530 ssd, which mounts without errors and seems to work fine, > >>>>>>>>>>>>>> but `btrfs check` gives me following output (and --repair doesn't remove errors): > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> enabling repair mode > >>>>>>>>>>>>>> Checking filesystem on /dev/sda2 > >>>>>>>>>>>>>> UUID: 12c84aa3-ce65-4390-807e-a72cc8a7445e > >>>>>>>>>>>>>> checking extents > >>>>>>>>>>>>>> Fixed 0 roots. 
> >>>>>>>>>>>>>> checking free space cache > >>>>>>>>>>>>>> cache and super generation don't match, space cache will be invalidated > >>>>>>>>>>>>>> checking fs roots > >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>>>>> > >>>>>>>>>>>>> This means that in dir whose inode number is 79177, it has a child inode > >>>>>>>>>>>>> pointer pointing to deprecated.sxt. > >>>>>>>>>>>>> > >>>>>>>>>>>>> But it doesn't have dir index and corresponding inode ref, which is breaking > >>>>>>>>>>>>> the cross reference rule of btrfs. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Would you please run the following command to dump needed info for us to > >>>>>>>>>>>>> debug? > >>>>>>>>>>>>> > >>>>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep 79177 -C 10 > >>>>>>>>>>>>> > >>>>>>>>>>>>> and > >>>>>>>>>>>>> > >>>>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.sxt -C 10 > >>>>>>>>>>>>> > >>>>>>>>>>>>> and > >>>>>>>>>>>>> > >>>>>>>>>>>>> # btrfs-debug-tree /dev/sda2 | grep deprecated.txt -C 10 > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> Considering the output has both .txt and .sxt, I think that's the problem. > >>>>>>>>>>>>> But such bit-flip should be detected by tree block csum. > >>>>>>>>>>>>> I'm not sure what's wrong with it. 
> >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>> Qu > >>>>>>>>>>>>> > >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 errors 6, no dir index, no inode ref > >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>>>>>> unresolved ref dir 79177 index 0 namelen 14 name deprecated.sxt filetype 1 
errors 6, no dir index, no inode ref > >>>>>>>>>>>>>> unresolved ref dir 79177 index 417 namelen 14 name deprecated.txt filetype 1 errors 1, no dir item > >>>>>>>>>>>>>> checking csums > >>>>>>>>>>>>>> checking root refs > >>>>>>>>>>>>>> found 23421812736 bytes used err is 0 > >>>>>>>>>>>>>> total csum bytes: 21531608 > >>>>>>>>>>>>>> total tree bytes: 776650752 > >>>>>>>>>>>>>> total fs tree bytes: 711278592 > >>>>>>>>>>>>>> total extent tree bytes: 36798464 > >>>>>>>>>>>>>> btree space waste bytes: 116002036 > >>>>>>>>>>>>>> file data blocks allocated: 850546470912 > >>>>>>>>>>>>>> referenced 27611987968 > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Is it dangerous and what should I do about it? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I also tried --clear-space-cache, but it just removes the line about space cache. > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> -- > >>>>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > >>>>>>>>>>>>> the body of a message to majordomo@vger.kernel.org > >>>>>>>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>>>>>>>>> > >>>>>>>>>>>> I'm afraid that your mail may be rejected because the attachment size > >>>>>>>>>>>> exceeds the allowable limit(100kB) of btrfs mailing list. Could you > >>>>>>>>>>>> share the attachment by google drive? > >>>>>>>>>>>> > >>>>>>>>>>>> Lastly, while Qu's timing is too tight, I will assist you on this issue. > >>>>>>>>>>>> >
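Note on the hash argument quoted above: the offset of a DIR_ITEM key (54846528 here) is described in the thread as the crc32c hash of the file name, so a name stored under that key should hash to the key's offset. Below is a minimal Python sketch of that check. It assumes the name hash is a raw CRC-32C (Castagnoli polynomial, no final inversion) seeded with 0xFFFFFFFE; that seed and the helper names `crc32c_raw` and `btrfs_name_hash` are assumptions of this sketch, not something stated in the thread. It only prints whether each name matches; the thread's own finding is that only "deprecated.txt" does.

```python
def crc32c_raw(seed: int, data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli, reflected polynomial 0x82F63B78).

    No pre- or post-inversion is applied: the caller supplies the seed
    and receives the raw register value.
    """
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial if a 1 fell out.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc


def btrfs_name_hash(name: bytes) -> int:
    # Assumption of this sketch: the seed is (u32)~1, i.e. 0xFFFFFFFE.
    return crc32c_raw(0xFFFFFFFE, name)


# Key offset taken from the dump: key (79177 DIR_ITEM 54846528)
DIR_ITEM_OFFSET = 54846528

for name in (b"deprecated.txt", b"deprecated.sxt"):
    h = btrfs_name_hash(name)
    verdict = "matches" if h == DIR_ITEM_OFFSET else "does not match"
    print(f"{name.decode()}: hash {h} {verdict} key offset {DIR_ITEM_OFFSET}")
```

Because a CRC detects any change confined to a single byte, the two names cannot share one hash value, so at most one of them can legitimately live under the key offset 54846528.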
Thread overview: 16+ messages
2017-07-01 11:59 Btrfs check reports errors, filesystem seems fine Filippe LeMarchand
2017-07-03 0:34 ` Qu Wenruo
2017-07-04 13:16 ` Lu Fengqi
2017-07-04 13:24 ` Filippe LeMarchand
2017-07-12 7:15 ` Qu Wenruo
2017-07-12 11:12 ` Filippe LeMarchand
2017-07-12 12:44 ` Qu Wenruo
2017-07-12 13:11 ` Filippe LeMarchand
2017-07-14 6:11 ` Qu Wenruo
2017-07-14 10:12 ` Filippe LeMarchand
2017-07-14 11:28 ` Qu Wenruo
2017-07-14 12:04 ` Filippe LeMarchand
2017-07-14 12:11 ` Qu Wenruo
2017-07-14 12:26 ` Filippe LeMarchand
2017-07-14 12:41 ` Qu Wenruo
2017-07-14 12:45 ` Filippe LeMarchand