* Next steps in recovery?
@ 2021-09-03  2:43 Robert Wyrick
  2021-09-03  2:47 ` Robert Wyrick
  2021-09-03  6:48 ` Qu Wenruo
  0 siblings, 2 replies; 20+ messages in thread
From: Robert Wyrick @ 2021-09-03  2:43 UTC
To: linux-btrfs

I cannot mount my btrfs filesystem.

$ uname -a
Linux bigbox 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ btrfs version
btrfs-progs v5.4.1

I'm seeing the following from check:

$ btrfs check -p /dev/sda
Opening filesystem to check...
Checking filesystem on /dev/sda
UUID: 75f1f45c-552e-4ae2-a56f-46e44b6647cf
[1/7] checking root items (0:00:59 elapsed, 2649102 items checked)
ERROR: invalid generation for extent 38179182174208, have 140737491486755 expect (0, 4057084]
[2/7] checking extents (0:02:17 elapsed, 1116143 items checked)
ERROR: errors found in extent allocation tree or chunk allocation
cache and super generation don't match, space cache will be invalidated
[3/7] checking free space cache (0:00:00 elapsed)
[4/7] chunresolved ref dir 8348950 index 3 namelen 7 name posters filetype 2 errors 2, no dir index
unresolved ref dir 8348950 index 3 namelen 7 name poSters filetype 2 errors 5, no dir item, no inode ref
[4/7] checking fs roots (0:00:42 elapsed, 108894 items checked)
ERROR: errors found in fs roots
found 15729059057664 bytes used, error(s) found
total csum bytes: 15313288548
total tree bytes: 18286739456
total fs tree bytes: 1791819776
total extent tree bytes: 229130240
btree space waste bytes: 1018844959
file data blocks allocated: 51587230502912
 referenced 15627926712320

I've tried everything I've found on the internet, but haven't attempted to repair based on the warnings...

What more info do you need to help me diagnose/fix this?

Thanks!
-Rob
* Re: Next steps in recovery?
From: Robert Wyrick @ 2021-09-03  2:47 UTC
To: linux-btrfs

One more thing of note. I was running linux kernel 4.15.? when things went bad. I upgraded hoping that the newer tools could fix things.
* Re: Next steps in recovery?
From: Qu Wenruo @ 2021-09-03  6:48 UTC
To: Robert Wyrick, linux-btrfs

On 2021/9/3 10:43 AM, Robert Wyrick wrote:
> I cannot mount my btrfs filesystem.
> $ uname -a
> Linux bigbox 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11
> 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
> $ btrfs version
> btrfs-progs v5.4.1

The tool is a little too old, so if you're going to repair, you'd better update btrfs-progs first.

> I'm seeing the following from check:
> $ btrfs check -p /dev/sda
> Opening filesystem to check...
> Checking filesystem on /dev/sda
> UUID: 75f1f45c-552e-4ae2-a56f-46e44b6647cf
> [1/7] checking root items (0:00:59 elapsed,
> 2649102 items checked)
> ERROR: invalid generation for extent 38179182174208, have
> 140737491486755 expect (0, 4057084]

This is a repairable problem.

We have a test case for exactly this situation in tests/fsck-test/044.

> [2/7] checking extents (0:02:17 elapsed,
> 1116143 items checked)
> ERROR: errors found in extent allocation tree or chunk allocation
> cache and super generation don't match, space cache will be invalidated
> [3/7] checking free space cache (0:00:00 elapsed)
> [4/7] chunresolved ref dir 8348950 index 3 namelen 7 name posters
> filetype 2 errors 2, no dir index

A missing dir index can also be repaired. The dir index will be added back.

> unresolved ref dir 8348950 index 3 namelen 7 name poSters filetype 2
> errors 5, no dir item, no inode ref

A missing dir item and inode ref can also be repaired, but with the dir item and inode ref removed.

But the problem here looks very strange.

It's the same dir and the same index, but a different name: posters vs poSters.

'S' is 0x53 and 's' is 0x73, so I'm wondering whether your system had bad memory that caused a bitflip and, with it, this problem.

Thus I would prefer that you do a full memtest before running btrfs check --repair.

Thanks,
Qu
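The observation about the two names can be checked mechanically. The following is only a small sketch, not part of btrfs-progs: the helper name single_bit_diffs is made up for illustration, and it simply XORs the two names from the check output byte by byte to list which bits differ.

```python
def single_bit_diffs(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return (byte_offset, bit_index) for every bit that differs between a and b.

    Hypothetical helper for illustration; it is not a btrfs-progs function.
    """
    if len(a) != len(b):
        raise ValueError("names must have the same length")
    diffs = []
    for offset, (x, y) in enumerate(zip(a, b)):
        delta = x ^ y  # bits set here are the bits that differ
        for bit in range(8):
            if delta & (1 << bit):
                diffs.append((offset, bit))
    return diffs


# The two names reported for dir 8348950, index 3.
print(single_bit_diffs(b"posters", b"poSters"))  # prints [(2, 5)]
```

Exactly one bit differs (bit 5 of the third byte, value 0x20), which is the pattern one would expect from a single flipped bit rather than from a deliberate rename.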
* Re: Next steps in recovery?
From: Qu Wenruo @ 2021-09-03  6:53 UTC
To: Robert Wyrick, linux-btrfs

On 2021/9/3 2:48 PM, Qu Wenruo wrote:
> On 2021/9/3 10:43 AM, Robert Wyrick wrote:
>> ERROR: invalid generation for extent 38179182174208, have
>> 140737491486755 expect (0, 4057084]
>
> This is a repairable problem.
>
> We have test case for exactly the same case in tests/fsck-test/044 for it.

Oh, this invalid extent generation is already a more direct indication of a memory bitflip.

140737491486755 = 0x8000002fc823

Without the high 0x8 bit, the remaining part, 0x2fc823, is a completely valid generation that falls inside the expected range.

So a memtest is a must before doing any repair.
You don't want another bitflip to ruin your perfectly repairable fs.

Thanks,
Qu
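The arithmetic behind the generation argument can be reproduced with a few lines. This is a sketch under stated assumptions: the function name single_bit_candidates is hypothetical, and the only inputs are the two numbers printed by btrfs check (the observed generation and the upper bound of the expected range). It flips each of the 64 bits in turn and keeps the results that land inside (0, 4057084].

```python
OBSERVED_GEN = 140737491486755  # "have" value from the check output
MAX_VALID_GEN = 4057084         # upper bound from "expect (0, 4057084]"


def single_bit_candidates(value: int, upper: int) -> list[tuple[int, int]]:
    """Return (bit, candidate) pairs where flipping one bit of value gives 0 < candidate <= upper.

    Hypothetical helper for illustration; it is not a btrfs-progs function.
    """
    results = []
    for bit in range(64):  # the generation is a 64-bit field
        candidate = value ^ (1 << bit)
        if 0 < candidate <= upper:
            results.append((bit, candidate))
    return results


for bit, gen in single_bit_candidates(OBSERVED_GEN, MAX_VALID_GEN):
    print(f"bit {bit}: {gen} ({hex(gen)})")  # prints "bit 47: 3131427 (0x2fc823)"
```

Only one single-bit flip yields a plausible generation: clearing bit 47 turns 0x8000002fc823 into 0x2fc823, the value given in the message above.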
[parent not found: <CAA_aC99-C8xOf7EAvJAMk2ZkYSaN2vyK7YFMw06utQ0T+tsh9A@mail.gmail.com>]
* Re: Next steps in recovery?
From: Qu Wenruo @ 2021-09-05 22:03 UTC
To: Robert Wyrick, linux-btrfs

On 2021/9/6 12:00 AM, Robert Wyrick wrote:
> Running memtest86+ now.... 20 hours in. No errors yet.
> Thanks for the analysis. I'll let this run for another day or so.

Just to mention, since 5.11 the btrfs kernel module has the ability to detect most high-bit bitflips before writing tree blocks to disk.

Thus even with less reliable RAM, it's still more reliable than nothing.

But still, given the existing errors, a RAM test is essential before doing anything else.

Thanks,
Qu
* Re: Next steps in recovery?
From: Robert Wyrick @ 2021-09-06 14:42 UTC
To: Qu Wenruo; +Cc: linux-btrfs

42+ hours of memtest86+, no errors detected. 4 passes complete.
Is that good enough?
* Re: Next steps in recovery?
From: Qu Wenruo @ 2021-09-06 23:26 UTC
To: Robert Wyrick, Qu Wenruo; +Cc: linux-btrfs

On 2021/9/6 10:42 PM, Robert Wyrick wrote:
> 42+ hours of memtest86+, no errors detected. 4 passes complete.
> Is that good enough?

That's strange; such an obvious bitflip should be easily detected.

Is the fs only mounted on that computer?

Anyway, you can go ahead and try the repair with the *latest* btrfs-progs.

Thanks,
Qu
* Re: Next steps in recovery? 2021-09-06 23:26 ` Qu Wenruo @ 2021-09-07 2:36 ` Robert Wyrick 2021-09-07 3:06 ` Anand Jain 0 siblings, 1 reply; 20+ messages in thread From: Robert Wyrick @ 2021-09-07 2:36 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs Trying to build latest btrfs-progs. I'm seeing errors in the configure script. $ cat /etc/os-release NAME="Linux Mint" VERSION="20.2 (Uma)" ID=linuxmint ID_LIKE=ubuntu PRETTY_NAME="Linux Mint 20.2" VERSION_ID="20.2" HOME_URL="https://www.linuxmint.com/" SUPPORT_URL="https://forums.linuxmint.com/" BUG_REPORT_URL="http://linuxmint-troubleshooting-guide.readthedocs.io/en/latest/" PRIVACY_POLICY_URL="https://www.linuxmint.com/" VERSION_CODENAME=uma UBUNTU_CODENAME=focal $ uname -a Linux bigbox 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux $ ./configure checking for gcc... gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... no checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking how to run the C preprocessor... gcc -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking minix/config.h usability... no checking minix/config.h presence... no checking for minix/config.h... no checking whether it is safe to define __EXTENSIONS__... yes checking for gcc... (cached) gcc checking whether we are using the GNU C compiler... 
(cached) yes checking whether gcc accepts -g... (cached) yes checking for gcc option to accept ISO C89... (cached) none needed checking whether C compiler accepts -std=gnu90... yes checking build system type... x86_64-pc-linux-gnu checking host system type... x86_64-pc-linux-gnu checking for an ANSI C-conforming const... yes checking for working volatile... yes checking whether byte ordering is bigendian... no checking for special C compiler options needed for large files... no checking for _FILE_OFFSET_BITS value needed for large files... no checking for a BSD-compatible install... /usr/bin/install -c checking whether ln -s works... yes checking for ar... ar checking for rm... /bin/rm checking for rmdir... /bin/rmdir checking for openat... yes checking for reallocarray... yes checking for clock_gettime... yes checking linux/perf_event.h usability... yes checking linux/perf_event.h presence... yes checking for linux/perf_event.h... yes checking linux/hw_breakpoint.h usability... yes checking linux/hw_breakpoint.h presence... yes checking for linux/hw_breakpoint.h... yes checking for pkg-config... /usr/bin/pkg-config checking pkg-config is at least version 0.9.0... yes checking execinfo.h usability... yes checking execinfo.h presence... yes checking for execinfo.h... yes checking for backtrace... yes checking for backtrace_symbols_fd... yes checking for xmlto... /usr/bin/xmlto checking for mv... /bin/mv checking for a sed that does not truncate output... /bin/sed checking for asciidoc... /usr/bin/asciidoc checking for asciidoctor... no checking for EXT2FS... yes checking for COM_ERR... yes checking for REISERFS... yes checking for FIEMAP_EXTENT_SHARED defined in linux/fiemap.h... yes checking for EXT4_EPOCH_MASK defined in ext2fs/ext2_fs.h... yes checking linux/blkzoned.h usability... yes checking linux/blkzoned.h presence... yes checking for linux/blkzoned.h... yes checking for struct blk_zone.capacity... no checking for BLKGETZONESZ defined in linux/blkzoned.h... 
yes configure: error: linux/blkzoned.h does not provide blk_zone.capacity --- Info on the file in question (linux/blkzoned.h): $ dpkg -S /usr/include/linux/blkzoned.h linux-libc-dev:amd64: /usr/include/linux/blkzoned.h $ dpkg -l linux-libc-dev Desired=Unknown/Install/Remove/Purge/Hold | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad) ||/ Name Version Architecture Description +++-====================-============-============-==================================== ii linux-libc-dev:amd64 5.4.0-81.91 amd64 Linux Kernel Headers for development So it appears that linux-libc-dev is way out-dated compared to my kernel. I don't know how to update it, though... there doesn't appear to be a newer version available. -Rob On Mon, Sep 6, 2021 at 5:26 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > On 2021/9/6 下午10:42, Robert Wyrick wrote: > > 42+ hours of memtest86+, no errors detected. 4 passes complete. > > Is that good enough? > > That's strange, such obvious bitflip should be easily detected. > > Is the fs only mounted on that computer? > > Anyway, you can continue try to repair with *latest* btrfs-progs. > > Thanks, > Qu > > > > On Sun, Sep 5, 2021 at 4:03 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > >> > >> > >> > >> On 2021/9/6 上午12:00, Robert Wyrick wrote: > >>> Running memtest86+ now.... 20 hours in. No errors yet. > >>> Thanks for the analysis. I'll let this run for another day or so. > >> > >> Just to mention, since 5.11 btrfs kernel module has the ability to > >> detect most high bitflip before writing tree blocks to disks. > >> > >> Thus even with less reliable RAM, it's still more reliable than nothing. > >> > >> But still, with the existing errors, the RAM test is still an essential > >> one before doing anything. 
> >> > >> Thanks, > >> Qu > >>> > >>> > >>> On Fri, Sep 3, 2021 at 12:53 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > >>>> > >>>> > >>>> > >>>> On 2021/9/3 下午2:48, Qu Wenruo wrote: > >>>>> > >>>>> > >>>>> On 2021/9/3 上午10:43, Robert Wyrick wrote: > >>>>>> I cannot mount my btrfs filesystem. > >>>>>> $ uname -a > >>>>>> Linux bigbox 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 > >>>>>> 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux > >>>>>> $ btrfs version > >>>>>> btrfs-progs v5.4.1 > >>>>> > >>>>> The tool is a little too old, thus if you're going to repair, you'd > >>>>> better to update the progs. > >>>>>> > >>>>>> I'm seeing the following from check: > >>>>>> $ btrfs check -p /dev/sda > >>>>>> Opening filesystem to check... > >>>>>> Checking filesystem on /dev/sda > >>>>>> UUID: 75f1f45c-552e-4ae2-a56f-46e44b6647cf > >>>>>> [1/7] checking root items (0:00:59 elapsed, > >>>>>> 2649102 items checked) > >>>>>> ERROR: invalid generation for extent 38179182174208, have > >>>>>> 140737491486755 expect (0, 4057084] > >>>>> > >>>>> This is a repairable problem. > >>>>> > >>>>> We have test case for exactly the same case in tests/fsck-test/044 for it. > >>>> > >>>> Oh, this invalid extent generation is already a more direct indication > >>>> of memory bitflip. > >>>> > >>>> 140737491486755 = 0x8000002fc823 > >>>> > >>>> Without the high 0x8 bit, the remaining part is completely valid > >>>> generation, 0x2fc823, which is inside the expectation. > >>>> > >>>> So, a memtest is a must before doing any repair. > >>>> You won't want another bitflip to ruin your perfectly repairable fs. 
> >>>> > >>>> Thanks, > >>>> Qu > >>>>> > >>>>> > >>>>>> [2/7] checking extents (0:02:17 elapsed, > >>>>>> 1116143 items checked) > >>>>>> ERROR: errors found in extent allocation tree or chunk allocation > >>>>>> cache and super generation don't match, space cache will be invalidated > >>>>>> [3/7] checking free space cache (0:00:00 elapsed) > >>>>>> [4/7] chunresolved ref dir 8348950 index 3 namelen 7 name posters > >>>>>> filetype 2 errors 2, no dir index > >>>>> > >>>>> No dir index can also be repaired. > >>>>> > >>>>> The dir index will be added back. > >>>>> > >>>>>> unresolved ref dir 8348950 index 3 namelen 7 name poSters filetype 2 > >>>>>> errors 5, no dir item, no inode ref > >>>>> > >>>>> No dir item nor inode ref can also be repaired, but with dir item and > >>>>> inode ref removed. > >>>>> > >>>>> But the problem here looks very strange. > >>>>> > >>>>> It's the same dir and the same index, but different name. > >>>>> posters vs poSters. > >>>>> > >>>>> 'S' is 0x53 and 's' is 0x73, I'm wondering if your system had a bad > >>>>> memory which caused a bitflip and the problem. > >>>>> > >>>>> Thus I prefer to do a full memtest before running btrfs check --repair. > >>>>> > >>>>> Thanks, > >>>>> Qu > >>>>> > >>>>>> [4/7] checking fs roots (0:00:42 elapsed, > >>>>>> 108894 items checked) > >>>>>> ERROR: errors found in fs roots > >>>>>> found 15729059057664 bytes used, error(s) found > >>>>>> total csum bytes: 15313288548 > >>>>>> total tree bytes: 18286739456 > >>>>>> total fs tree bytes: 1791819776 > >>>>>> total extent tree bytes: 229130240 > >>>>>> btree space waste bytes: 1018844959 > >>>>>> file data blocks allocated: 51587230502912 > >>>>>> referenced 15627926712320 > >>>>>> > >>>>>> I've tried everything I've found on the internet, but haven't > >>>>>> attempted to repair based on the warnings... > >>>>>> > >>>>>> What more info do you need to help me diagnose/fix this? > >>>>>> > >>>>>> Thanks! 
> >>>>>> -Rob > >>>>>> > >
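The bit-flip reasoning quoted in this message can be checked mechanically. What follows is a minimal Python sketch, not anything from btrfs-progs; the helper name `flipped_bits` is made up for illustration. It XORs the value reported by `btrfs check` with the candidate original value and lists the bit positions that differ.

```python
def flipped_bits(observed: int, expected: int) -> list[int]:
    """Return the bit positions (0 = least significant) where the two values differ."""
    diff = observed ^ expected
    return [pos for pos in range(diff.bit_length()) if (diff >> pos) & 1]


# Extent generation reported by btrfs check, and the value Qu suggests it
# should have been (0x2fc823). Both numbers are taken from the thread.
bad_generation = 140737491486755
candidate_generation = 0x2FC823
max_valid_generation = 4057084  # upper bound printed by btrfs check

print(hex(bad_generation))                                  # 0x8000002fc823
print(flipped_bits(bad_generation, candidate_generation))   # [47]
print(0 < candidate_generation <= max_valid_generation)     # True

# The two directory entry names differ in one character: 'S' vs 's'.
print(flipped_bits(ord("S"), ord("s")))                     # [5]
```

In both cases exactly one bit differs, which is consistent with (though not proof of) a memory bit flip.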
* Re: Next steps in recovery? 2021-09-07 2:36 ` Robert Wyrick @ 2021-09-07 3:06 ` Anand Jain 2021-09-07 4:36 ` Robert Wyrick 0 siblings, 1 reply; 20+ messages in thread From: Anand Jain @ 2021-09-07 3:06 UTC (permalink / raw) To: Robert Wyrick; +Cc: linux-btrfs, Qu Wenruo On 07/09/2021 10:36, Robert Wyrick wrote: > Trying to build latest btrfs-progs. I'm seeing errors in the configure script. > > $ cat /etc/os-release > NAME="Linux Mint" > VERSION="20.2 (Uma)" > ID=linuxmint > ID_LIKE=ubuntu > PRETTY_NAME="Linux Mint 20.2" > VERSION_ID="20.2" > HOME_URL="https://www.linuxmint.com/" > SUPPORT_URL="https://forums.linuxmint.com/" > BUG_REPORT_URL="http://linuxmint-troubleshooting-guide.readthedocs.io/en/latest/" > PRIVACY_POLICY_URL="https://www.linuxmint.com/" > VERSION_CODENAME=uma > UBUNTU_CODENAME=focal > > $ uname -a > Linux bigbox 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 > 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux > > $ ./configure > checking for gcc... gcc > checking whether the C compiler works... yes > checking for C compiler default output file name... a.out > checking for suffix of executables... > checking whether we are cross compiling... no > checking for suffix of object files... o > checking whether we are using the GNU C compiler... yes > checking whether gcc accepts -g... yes > checking for gcc option to accept ISO C89... none needed > checking how to run the C preprocessor... gcc -E > checking for grep that handles long lines and -e... /bin/grep > checking for egrep... /bin/grep -E > checking for ANSI C header files... yes > checking for sys/types.h... yes > checking for sys/stat.h... yes > checking for stdlib.h... yes > checking for string.h... yes > checking for memory.h... yes > checking for strings.h... yes > checking for inttypes.h... yes > checking for stdint.h... yes > checking for unistd.h... yes > checking minix/config.h usability... no > checking minix/config.h presence... no > checking for minix/config.h... 
no > checking whether it is safe to define __EXTENSIONS__... yes > checking for gcc... (cached) gcc > checking whether we are using the GNU C compiler... (cached) yes > checking whether gcc accepts -g... (cached) yes > checking for gcc option to accept ISO C89... (cached) none needed > checking whether C compiler accepts -std=gnu90... yes > checking build system type... x86_64-pc-linux-gnu > checking host system type... x86_64-pc-linux-gnu > checking for an ANSI C-conforming const... yes > checking for working volatile... yes > checking whether byte ordering is bigendian... no > checking for special C compiler options needed for large files... no > checking for _FILE_OFFSET_BITS value needed for large files... no > checking for a BSD-compatible install... /usr/bin/install -c > checking whether ln -s works... yes > checking for ar... ar > checking for rm... /bin/rm > checking for rmdir... /bin/rmdir > checking for openat... yes > checking for reallocarray... yes > checking for clock_gettime... yes > checking linux/perf_event.h usability... yes > checking linux/perf_event.h presence... yes > checking for linux/perf_event.h... yes > checking linux/hw_breakpoint.h usability... yes > checking linux/hw_breakpoint.h presence... yes > checking for linux/hw_breakpoint.h... yes > checking for pkg-config... /usr/bin/pkg-config > checking pkg-config is at least version 0.9.0... yes > checking execinfo.h usability... yes > checking execinfo.h presence... yes > checking for execinfo.h... yes > checking for backtrace... yes > checking for backtrace_symbols_fd... yes > checking for xmlto... /usr/bin/xmlto > checking for mv... /bin/mv > checking for a sed that does not truncate output... /bin/sed > checking for asciidoc... /usr/bin/asciidoc > checking for asciidoctor... no > checking for EXT2FS... yes > checking for COM_ERR... yes > checking for REISERFS... yes > checking for FIEMAP_EXTENT_SHARED defined in linux/fiemap.h... 
yes
> checking for EXT4_EPOCH_MASK defined in ext2fs/ext2_fs.h... yes
> checking linux/blkzoned.h usability... yes
> checking linux/blkzoned.h presence... yes
> checking for linux/blkzoned.h... yes
> checking for struct blk_zone.capacity... no
> checking for BLKGETZONESZ defined in linux/blkzoned.h... yes
> configure: error: linux/blkzoned.h does not provide blk_zone.capacity
>
> ---
>
> Info on the file in question (linux/blkzoned.h):
>
> $ dpkg -S /usr/include/linux/blkzoned.h
> linux-libc-dev:amd64: /usr/include/linux/blkzoned.h
>
> $ dpkg -l linux-libc-dev
> Desired=Unknown/Install/Remove/Purge/Hold
> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
> ||/ Name                 Version      Architecture Description
> +++-====================-============-============-====================================
> ii  linux-libc-dev:amd64 5.4.0-81.91  amd64        Linux Kernel Headers for development
>
>
> So it appears that linux-libc-dev is way out-dated compared to my
> kernel. I don't know how to update it, though... there doesn't appear
> to be a newer version available.

You could disable zoned support:

./configure --disable-zoned
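The configure failure above comes down to whether the installed `linux/blkzoned.h` declares a `capacity` member in `struct blk_zone`. As a rough illustration only (this is not how the configure script performs its check, and `header_has_zone_capacity` is a hypothetical helper), the same question can be asked of the header text directly:

```python
import re
from pathlib import Path


def header_has_zone_capacity(header_text: str) -> bool:
    """Return True if struct blk_zone in the given header text has a 'capacity' member."""
    # Match the body of 'struct blk_zone { ... }' only; names such as
    # blk_zone_report or blk_zone_range do not match this pattern.
    match = re.search(r"struct\s+blk_zone\s*\{(.*?)\}", header_text, re.DOTALL)
    if match is None:
        return False
    return re.search(r"\bcapacity\s*;", match.group(1)) is not None


if __name__ == "__main__":
    # Path taken from the dpkg -S output in the thread.
    header = Path("/usr/include/linux/blkzoned.h")
    if header.exists():
        print(header_has_zone_capacity(header.read_text()))
    else:
        print("header not found")
```

Given the configure result shown above (`checking for struct blk_zone.capacity... no`), the 5.4-era linux-libc-dev header on that system would be expected to report `False`. Building with `--disable-zoned` sidesteps the check.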
* Re: Next steps in recovery? 2021-09-07 3:06 ` Anand Jain @ 2021-09-07 4:36 ` Robert Wyrick 2021-09-07 4:53 ` Qu Wenruo 0 siblings, 1 reply; 20+ messages in thread From: Robert Wyrick @ 2021-09-07 4:36 UTC (permalink / raw) To: Anand Jain; +Cc: linux-btrfs, Qu Wenruo

What exactly would i be disabling? I don't know what zoned does.

On Mon, Sep 6, 2021, 9:07 PM Anand Jain <anand.jain@oracle.com> wrote:
> You could disable the zoned.
>
> ./configure --disable-zoned
* Re: Next steps in recovery? 2021-09-07 4:36 ` Robert Wyrick @ 2021-09-07 4:53 ` Qu Wenruo 2021-09-07 17:02 ` Robert Wyrick 0 siblings, 1 reply; 20+ messages in thread From: Qu Wenruo @ 2021-09-07 4:53 UTC (permalink / raw) To: Robert Wyrick, Anand Jain; +Cc: linux-btrfs

On 2021/9/7 12:36 PM, Robert Wyrick wrote:
> What exactly would i be disabling? I don't know what zoned does.

Support for zoned devices.

Unless you have a host-managed zoned device, there is no reason to
enable it.

https://zonedstorage.io/introduction/

Thanks,
Qu
* Re: Next steps in recovery? 2021-09-07 4:53 ` Qu Wenruo @ 2021-09-07 17:02 ` Robert Wyrick 2021-09-07 17:17 ` Robert Wyrick 2021-09-07 23:15 ` Qu Wenruo 0 siblings, 2 replies; 20+ messages in thread From: Robert Wyrick @ 2021-09-07 17:02 UTC (permalink / raw) To: Qu Wenruo; +Cc: Anand Jain, linux-btrfs

Ran a repair:

$ sudo ./btrfs check --repair -p /dev/sda # I did NOT make install, just ran from the compiled directory
enabling repair mode
WARNING:

Do not use --repair unless you are advised to do so by a developer
or an experienced user, and then only after having accepted that no
fsck can successfully repair all types of filesystem corruption. Eg.
some software or hardware bugs can fatally damage a volume.
The operation will start in 10 seconds.
Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/sda
UUID: 75f1f45c-552e-4ae2-a56f-46e44b6647cf
[1/7] checking root items (0:00:59 elapsed, 2649102 items checked)
Fixed 0 roots.
Reset extent item (38179182174208) generation to 4057084
No device size related problem found
[2/7] checking extents (0:02:23 elapsed, 1116143 items checked)
cache and super generation don't match, space cache will be invalidated
[3/7] checking free space cache (0:00:00 elapsed)
Deleting bad dir index [8348950,96,3] root 259
repairing missing dir index item for inode 8349224
[4/7] checking fs roots (0:01:04 elapsed, 217787 items checked)
[5/7] checking csums (without verifying data) (0:00:04 elapsed, 12350321 items checked)
[6/7] checking root refs (0:00:00 elapsed, 4 items checked)
[7/7] checking quota groups skipped (not enabled on this FS)
found 15729059057664 bytes used, no error found
total csum bytes: 15313288548
total tree bytes: 18286739456
total fs tree bytes: 1791819776
total extent tree bytes: 229130240
btree space waste bytes: 1018844959
file data blocks allocated: 51587230502912
 referenced 15627926712320

I can now mount the filesystem successfully! Thank you for your help.

I do have some additional questions if you don't mind...

I am already using RAID 1 to handle single disk outages. I assume
things could have gone much worse and I could have lost the whole
filesystem. Aside from backups (I know, I know), is there anything
else I can do to prevent such issues or make them easier to recover
from? Could this problem have been avoided/detected earlier? This
wasn't a disk failure and, according to memtest86+, it wasn't due to
bad memory either... I don't run scrubs very often. Should I? I
guess the more general question is: What are the best practices for
maintaining a healthy btrfs file system?

Thanks again!

On Mon, Sep 6, 2021 at 10:53 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
> On 2021/9/7 12:36 PM, Robert Wyrick wrote:
> > What exactly would i be disabling? I don't know what zoned does.
* Re: Next steps in recovery? 2021-09-07 17:02 ` Robert Wyrick @ 2021-09-07 17:17 ` Robert Wyrick 2021-09-07 20:47 ` Robert Wyrick 2021-09-07 23:15 ` Qu Wenruo 1 sibling, 1 reply; 20+ messages in thread From: Robert Wyrick @ 2021-09-07 17:17 UTC (permalink / raw) To: Qu Wenruo; +Cc: Anand Jain, linux-btrfs

Looks like I spoke too soon. I can now mount the FS readonly. I see
this error in dmesg:

[58995.896369] CPU: 10 PID: 83845 Comm: btrfs-transacti Tainted: P OE 5.11.0-27-generic #29~20.04.1-Ubuntu
[58995.896373] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 0515 03/30/2017
[58995.896376] RIP: 0010:btrfs_run_delayed_refs+0x1af/0x200 [btrfs]
[58995.896422] Code: 8b 55 50 f0 48 0f ba aa 48 0a 00 00 03 72 20 83 f8 fb 74 3c 83 f8 e2 74 37 89 c6 48 c7 c7 50 7e 77 c0 89 45 d0 e8 96 d4 4d d3 <0f> 0b 8b 45 d0 89 c1 ba 4c 08 00 00 4c 89 ef 89 45 d0 48 c7 c6 c0
[58995.896425] RSP: 0018:ffffb89a4a0dfdf8 EFLAGS: 00010282
[58995.896428] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000027
[58995.896430] RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff960fdf098ac8
[58995.896432] RBP: ffffb89a4a0dfe40 R08: ffff960fdf098ac0 R09: ffffb89a4a0dfbb8
[58995.896434] R10: 0000000000000001 R11: 0000000000000001 R12: ffff96036a7d5378
[58995.896435] R13: ffff960115028888 R14: ffff96036a7d5200 R15: 0000000000000000
[58995.896437] FS: 0000000000000000(0000) GS:ffff960fdf080000(0000) knlGS:0000000000000000
[58995.896439] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[58995.896441] CR2: 00005616936891e8 CR3: 000000010fe46000 CR4: 00000000003506e0
[58995.896444] Call Trace:
[58995.896450] btrfs_commit_transaction+0x2c3/0xa80 [btrfs]
[58995.896500] ? start_transaction+0xd5/0x590 [btrfs]
[58995.896549] transaction_kthread+0x138/0x1b0 [btrfs]
[58995.896596] kthread+0x114/0x150
[58995.896604] ? btrfs_cleanup_transaction+0x570/0x570 [btrfs]
[58995.896649] ? kthread_park+0x90/0x90
[58995.896653] ret_from_fork+0x22/0x30
[58995.896661] ---[ end trace c8ba04bdf2113cae ]---
[58995.896664] BTRFS: error (device sdf) in btrfs_run_delayed_refs:2124: errno=-17 Object already exists
[58995.896669] BTRFS info (device sdf): forced readonly
[58995.896672] BTRFS warning (device sdf): Skipping commit of aborted transaction.
[58995.896674] BTRFS: error (device sdf) in cleanup_transaction:1939: errno=-17 Object already exists

Read-only is better than nothing, but what would be my next steps?

On Tue, Sep 7, 2021 at 11:02 AM Robert Wyrick <rob@wyrick.org> wrote:
>
> Ran a repair:
>
> $ sudo ./btrfs check --repair -p /dev/sda # I did NOT make install,
> just ran from the compiled directory
> enabling repair mode
> WARNING:
>
> Do not use --repair unless you are advised to do so by a developer
> or an experienced user, and then only after having accepted that no
> fsck can successfully repair all types of filesystem corruption. Eg.
> some software or hardware bugs can fatally damage a volume.
> The operation will start in 10 seconds.
> Use Ctrl-C to stop it.
> 10 9 8 7 6 5 4 3 2 1
> Starting repair.
> Opening filesystem to check...
> Checking filesystem on /dev/sda
> UUID: 75f1f45c-552e-4ae2-a56f-46e44b6647cf
> [1/7] checking root items (0:00:59 elapsed,
> 2649102 items checked)
> Fixed 0 roots.
* Re: Next steps in recovery?
  2021-09-07 17:17 ` Robert Wyrick
@ 2021-09-07 20:47 ` Robert Wyrick
  2021-09-07 23:17 ` Qu Wenruo
  0 siblings, 1 reply; 20+ messages in thread
From: Robert Wyrick @ 2021-09-07 20:47 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Anand Jain, linux-btrfs

Re-running check now shows:

[1/7] checking root items (0:00:55 elapsed, 2649102 items checked)
[2/7] checking extents (0:02:13 elapsed, 1116141 items checked)
there is no free space entry for 18365358505984-18365358522368
there is no free space entry for 18365358505984-18366416814080
cache appears valid but isn't 18365343072256
there is no free space entry for 19764429062144-19764429078528
there is no free space entry for 19764429062144-19765502410752
cache appears valid but isn't 19764428668928
wanted bytes 49152, found 16384 for off 25401622167552
wanted bytes 1058373632, found 16384 for off 25401622167552
cache appears valid but isn't 25401606799360
there is no free space entry for 28659399229440-28659399245824
there is no free space entry for 28659399229440-28660413235200
cache appears valid but isn't 28659339493376
wanted offset 29154336178176, found 29154336161792
wanted offset 29154336178176, found 29154336161792
cache appears valid but isn't 29154334474240
there is no free space entry for 30899331825664-30899331842048
there is no free space entry for 30899331825664-30900272234496
cache appears valid but isn't 30899198492672
there is no free space entry for 32134011568128-32134011584512
there is no free space entry for 32134011568128-32135075332096
cache appears valid but isn't 32134001590272
wanted offset 33148689629184, found 33148689612800
wanted offset 33148689629184, found 33148689612800
cache appears valid but isn't 33148687613952
there is no free space entry for 34611225755648-34611225772032
there is no free space entry for 34611225755648-34612197720064
cache appears valid but isn't 34611123978240
there is no free space entry for 37374972723200-37374972739584
there is no free space entry for 37374972723200-37376042729472
cache appears valid but isn't 37374968987648
there is no free space entry for 37484494651392-37484494667776
there is no free space entry for 37484494651392-37485564395520
cache appears valid but isn't 37484490653696
wanted bytes 49152, found 32768 for off 37757229301760
wanted bytes 1065517056, found 32768 for off 37757229301760
cache appears valid but isn't 37757221076992
there is no free space entry for 38414356250624-38414356267008
there is no free space entry for 38414356250624-38415424815104
cache appears valid but isn't 38414351073280
there is no free space entry for 41509957402624-41509957419008
there is no free space entry for 41509957402624-41511022493696
cache appears valid but isn't 41509948751872
there is no free space entry for 42293815459840-42293815492608
there is no free space entry for 42293815459840-42294887579648
cache appears valid but isn't 42293813837824
[3/7] checking free space cache (0:04:18 elapsed, 14910 items checked)
[4/7] checking fs roots (0:00:26 elapsed, 108894 items checked)
[5/7] checking csums (without verifying data) (0:00:03 elapsed, 12350321 items checked)
[6/7] checking root refs (0:00:00 elapsed, 4 items checked)
[7/7] checking quota groups skipped (not enabled on this FS)
found 15729059287040 bytes used, error(s) found
total csum bytes: 15313288548
total tree bytes: 18286706688
total fs tree bytes: 1791819776
total extent tree bytes: 229097472
btree space waste bytes: 1018811836
file data blocks allocated: 51587230765056
 referenced 15627926974464
* Re: Next steps in recovery? 2021-09-07 20:47 ` Robert Wyrick @ 2021-09-07 23:17 ` Qu Wenruo 2021-09-07 23:20 ` Robert Wyrick 0 siblings, 1 reply; 20+ messages in thread From: Qu Wenruo @ 2021-09-07 23:17 UTC (permalink / raw) To: Robert Wyrick; +Cc: Anand Jain, linux-btrfs On 2021/9/8 上午4:47, Robert Wyrick wrote: > Re-running check now shows: > > [1/7] checking root items (0:00:55 elapsed, > 2649102 items checked) > [2/7] checking extents (0:02:13 elapsed, > 1116141 items checked) > there is no free space entry for 18365358505984-18365358522368d, 130 > items checked) > there is no free space entry for 18365358505984-18366416814080 > cache appears valid but isn't 18365343072256 > there is no free space entry for 19764429062144-19764429078528d, 348 > items checked) > there is no free space entry for 19764429062144-19765502410752 > cache appears valid but isn't 19764428668928 > wanted bytes 49152, found 16384 for off 254016221675521 elapsed, 1534 > items checked) > wanted bytes 1058373632, found 16384 for off 25401622167552 > cache appears valid but isn't 25401606799360 > there is no free space entry for 28659399229440-28659399245824d, 2636 > items checked) > there is no free space entry for 28659399229440-28660413235200 > cache appears valid but isn't 28659339493376 > wanted offset 29154336178176, found 2915433616179200:33 elapsed, 2792 > items checked) > wanted offset 29154336178176, found 29154336161792 > cache appears valid but isn't 29154334474240 > there is no free space entry for 30899331825664-30899331842048d, 3585 > items checked) > there is no free space entry for 30899331825664-30900272234496 > cache appears valid but isn't 30899198492672 > there is no free space entry for 32134011568128-32134011584512d, 4474 > items checked) > there is no free space entry for 32134011568128-32135075332096 > cache appears valid but isn't 32134001590272 > wanted offset 33148689629184, found 3314868961280000:59 elapsed, 4963 > items checked) > wanted offset 33148689629184, 
found 33148689612800 > cache appears valid but isn't 33148687613952 > there is no free space entry for 34611225755648-34611225772032d, 6036 > items checked) > there is no free space entry for 34611225755648-34612197720064 > cache appears valid but isn't 34611123978240 > there is no free space entry for 37374972723200-37374972739584d, 8051 > items checked) > there is no free space entry for 37374972723200-37376042729472 > cache appears valid but isn't 37374968987648 > there is no free space entry for 37484494651392-37484494667776d, 8172 > items checked) > there is no free space entry for 37484494651392-37485564395520 > cache appears valid but isn't 37484490653696 > wanted bytes 49152, found 32768 for off 377572293017606 elapsed, 8381 > items checked) > wanted bytes 1065517056, found 32768 for off 37757229301760 > cache appears valid but isn't 37757221076992 > there is no free space entry for 38414356250624-38414356267008d, 9004 > items checked) > there is no free space entry for 38414356250624-38415424815104 > cache appears valid but isn't 38414351073280 > there is no free space entry for 41509957402624-41509957419008d, 11792 > items checked) > there is no free space entry for 41509957402624-41511022493696 > cache appears valid but isn't 41509948751872 > there is no free space entry for 42293815459840-42293815492608d, 12469 > items checked) > there is no free space entry for 42293815459840-42294887579648 > cache appears valid but isn't 42293813837824 All free space cache related problems. You can just clear all v1 cache: $ btrfs check --clear-space-cache v1 <dev> Then it's also a good time to migrate to v2 cache, which is safer and faster. 
# mount <dev> -o space_cache=v2 <mnt> Thanks, Qu > [3/7] checking free space cache (0:04:18 elapsed, 14910 > items checked) > [4/7] checking fs roots (0:00:26 elapsed, > 108894 items checked) > [5/7] checking csums (without verifying data) (0:00:03 elapsed, > 12350321 items checked) > [6/7] checking root refs (0:00:00 elapsed, 4 > items checked) > [7/7] checking quota groups skipped (not enabled on this FS) > found 15729059287040 bytes used, error(s) found > total csum bytes: 15313288548 > total tree bytes: 18286706688 > total fs tree bytes: 1791819776 > total extent tree bytes: 229097472 > btree space waste bytes: 1018811836 > file data blocks allocated: 51587230765056 > referenced 15627926974464 > > On Tue, Sep 7, 2021 at 11:17 AM Robert Wyrick <rob@wyrick.org> wrote: >> >> Looks like I spoke too soon. >> >> I can now mount the FS readonly. >> >> I see this error in dmesg: >> [58995.896369] CPU: 10 PID: 83845 Comm: btrfs-transacti Tainted: P >> OE 5.11.0-27-generic #29~20.04.1-Ubuntu >> [58995.896373] Hardware name: System manufacturer System Product >> Name/PRIME X370-PRO, BIOS 0515 03/30/2017 >> [58995.896376] RIP: 0010:btrfs_run_delayed_refs+0x1af/0x200 [btrfs] >> [58995.896422] Code: 8b 55 50 f0 48 0f ba aa 48 0a 00 00 03 72 20 83 >> f8 fb 74 3c 83 f8 e2 74 37 89 c6 48 c7 c7 50 7e 77 c0 89 45 d0 e8 96 >> d4 4d d3 <0f> 0b 8b 45 d0 89 c1 ba 4c 08 00 00 4c 89 ef 89 45 d0 48 c7 >> c6 c0 >> [58995.896425] RSP: 0018:ffffb89a4a0dfdf8 EFLAGS: 00010282 >> [58995.896428] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000027 >> [58995.896430] RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff960fdf098ac8 >> [58995.896432] RBP: ffffb89a4a0dfe40 R08: ffff960fdf098ac0 R09: ffffb89a4a0dfbb8 >> [58995.896434] R10: 0000000000000001 R11: 0000000000000001 R12: ffff96036a7d5378 >> [58995.896435] R13: ffff960115028888 R14: ffff96036a7d5200 R15: 0000000000000000 >> [58995.896437] FS: 0000000000000000(0000) GS:ffff960fdf080000(0000) >> knlGS:0000000000000000 >> 
[58995.896439] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [58995.896441] CR2: 00005616936891e8 CR3: 000000010fe46000 CR4: 00000000003506e0 >> [58995.896444] Call Trace: >> [58995.896450] btrfs_commit_transaction+0x2c3/0xa80 [btrfs] >> [58995.896500] ? start_transaction+0xd5/0x590 [btrfs] >> [58995.896549] transaction_kthread+0x138/0x1b0 [btrfs] >> [58995.896596] kthread+0x114/0x150 >> [58995.896604] ? btrfs_cleanup_transaction+0x570/0x570 [btrfs] >> [58995.896649] ? kthread_park+0x90/0x90 >> [58995.896653] ret_from_fork+0x22/0x30 >> [58995.896661] ---[ end trace c8ba04bdf2113cae ]--- >> [58995.896664] BTRFS: error (device sdf) in >> btrfs_run_delayed_refs:2124: errno=-17 Object already exists >> [58995.896669] BTRFS info (device sdf): forced readonly >> [58995.896672] BTRFS warning (device sdf): Skipping commit of aborted >> transaction. >> [58995.896674] BTRFS: error (device sdf) in cleanup_transaction:1939: >> errno=-17 Object already exists >> >> Read-only is better than nothing, but what would be my next steps? >> >> On Tue, Sep 7, 2021 at 11:02 AM Robert Wyrick <rob@wyrick.org> wrote: >>> >>> Ran a repair: >>> >>> $ sudo ./btrfs check --repair -p /dev/sda # I did NOT make install, >>> just ran from the compiled directory >>> enabling repair mode >>> WARNING: >>> >>> Do not use --repair unless you are advised to do so by a developer >>> or an experienced user, and then only after having accepted that no >>> fsck can successfully repair all types of filesystem corruption. Eg. >>> some software or hardware bugs can fatally damage a volume. >>> The operation will start in 10 seconds. >>> Use Ctrl-C to stop it. >>> 10 9 8 7 6 5 4 3 2 1 >>> Starting repair. >>> Opening filesystem to check... >>> Checking filesystem on /dev/sda >>> UUID: 75f1f45c-552e-4ae2-a56f-46e44b6647cf >>> [1/7] checking root items (0:00:59 elapsed, >>> 2649102 items checked) >>> Fixed 0 roots. 
>>> Reset extent item (38179182174208) generation to 4057084elapsed, >>> 1116143 items checked) >>> No device size related problem found (0:02:22 elapsed, >>> 1116143 items checked) >>> [2/7] checking extents (0:02:23 elapsed, >>> 1116143 items checked) >>> cache and super generation don't match, space cache will be invalidated >>> [3/7] checking free space cache (0:00:00 elapsed) >>> Deleting bad dir index [8348950,96,3] root 259 (0:00:25 elapsed, >>> 106695 items checked) >>> repairing missing dir index item for inode 834922400:26 elapsed, >>> 108893 items checked) >>> [4/7] checking fs roots (0:01:04 elapsed, >>> 217787 items checked) >>> [5/7] checking csums (without verifying data) (0:00:04 elapsed, >>> 12350321 items checked) >>> [6/7] checking root refs (0:00:00 elapsed, 4 >>> items checked) >>> [7/7] checking quota groups skipped (not enabled on this FS) >>> found 15729059057664 bytes used, no error found >>> total csum bytes: 15313288548 >>> total tree bytes: 18286739456 >>> total fs tree bytes: 1791819776 >>> total extent tree bytes: 229130240 >>> btree space waste bytes: 1018844959 >>> file data blocks allocated: 51587230502912 >>> referenced 15627926712320 >>> >>> I can now mount the filesystem successfully! Thank you for your help. >>> >>> I do have some additional questions if you don't mind... >>> I am already using RAID 1 to handle single disk outages. I assume >>> things could have gone much worse and I could have lost the whole >>> filesystem. Aside from backups (I know, I know), is there anything >>> else I can do to prevent such issues or make them easier to recover >>> from? Could this problem have been avoided/detected earlier? This >>> wasn't a disk failure and according to memtest86+, it wasn't due to >>> bad memory either.... I don't run scrubs very often. Should I? I >>> guess the more general question is: What are the best practices for >>> maintaining a healthy btrfs file system? >>> >>> Thanks again! 
>>> >>> On Mon, Sep 6, 2021 at 10:53 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: >>>> >>>> >>>> >>>> On 2021/9/7 下午12:36, Robert Wyrick wrote: >>>>> What exactly would i be disabling? I don't know what zoned does. >>>> >>>> The zoned device support. >>>> >>>> If you don't have any host-managed zoned device, there is no reason you >>>> would like to enable it. >>>> >>>> https://zonedstorage.io/introduction/ >>>> >>>> Thanks, >>>> Qu >>>> >>>>> >>>>> On Mon, Sep 6, 2021, 9:07 PM Anand Jain <anand.jain@oracle.com> wrote: >>>>>> >>>>>> On 07/09/2021 10:36, Robert Wyrick wrote: >>>>>>> Trying to build latest btrfs-progs. I'm seeing errors in the configure script. >>>>>>> >>>>>>> $ cat /etc/os-release >>>>>>> NAME="Linux Mint" >>>>>>> VERSION="20.2 (Uma)" >>>>>>> ID=linuxmint >>>>>>> ID_LIKE=ubuntu >>>>>>> PRETTY_NAME="Linux Mint 20.2" >>>>>>> VERSION_ID="20.2" >>>>>>> HOME_URL="https://www.linuxmint.com/" >>>>>>> SUPPORT_URL="https://forums.linuxmint.com/" >>>>>>> BUG_REPORT_URL="http://linuxmint-troubleshooting-guide.readthedocs.io/en/latest/" >>>>>>> PRIVACY_POLICY_URL="https://www.linuxmint.com/" >>>>>>> VERSION_CODENAME=uma >>>>>>> UBUNTU_CODENAME=focal >>>>>>> >>>>>>> $ uname -a >>>>>>> Linux bigbox 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 >>>>>>> 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux >>>>>>> >>>>>>> $ ./configure >>>>>>> checking for gcc... gcc >>>>>>> checking whether the C compiler works... yes >>>>>>> checking for C compiler default output file name... a.out >>>>>>> checking for suffix of executables... >>>>>>> checking whether we are cross compiling... no >>>>>>> checking for suffix of object files... o >>>>>>> checking whether we are using the GNU C compiler... yes >>>>>>> checking whether gcc accepts -g... yes >>>>>>> checking for gcc option to accept ISO C89... none needed >>>>>>> checking how to run the C preprocessor... gcc -E >>>>>>> checking for grep that handles long lines and -e... /bin/grep >>>>>>> checking for egrep... 
/bin/grep -E >>>>>>> checking for ANSI C header files... yes >>>>>>> checking for sys/types.h... yes >>>>>>> checking for sys/stat.h... yes >>>>>>> checking for stdlib.h... yes >>>>>>> checking for string.h... yes >>>>>>> checking for memory.h... yes >>>>>>> checking for strings.h... yes >>>>>>> checking for inttypes.h... yes >>>>>>> checking for stdint.h... yes >>>>>>> checking for unistd.h... yes >>>>>>> checking minix/config.h usability... no >>>>>>> checking minix/config.h presence... no >>>>>>> checking for minix/config.h... no >>>>>>> checking whether it is safe to define __EXTENSIONS__... yes >>>>>>> checking for gcc... (cached) gcc >>>>>>> checking whether we are using the GNU C compiler... (cached) yes >>>>>>> checking whether gcc accepts -g... (cached) yes >>>>>>> checking for gcc option to accept ISO C89... (cached) none needed >>>>>>> checking whether C compiler accepts -std=gnu90... yes >>>>>>> checking build system type... x86_64-pc-linux-gnu >>>>>>> checking host system type... x86_64-pc-linux-gnu >>>>>>> checking for an ANSI C-conforming const... yes >>>>>>> checking for working volatile... yes >>>>>>> checking whether byte ordering is bigendian... no >>>>>>> checking for special C compiler options needed for large files... no >>>>>>> checking for _FILE_OFFSET_BITS value needed for large files... no >>>>>>> checking for a BSD-compatible install... /usr/bin/install -c >>>>>>> checking whether ln -s works... yes >>>>>>> checking for ar... ar >>>>>>> checking for rm... /bin/rm >>>>>>> checking for rmdir... /bin/rmdir >>>>>>> checking for openat... yes >>>>>>> checking for reallocarray... yes >>>>>>> checking for clock_gettime... yes >>>>>>> checking linux/perf_event.h usability... yes >>>>>>> checking linux/perf_event.h presence... yes >>>>>>> checking for linux/perf_event.h... yes >>>>>>> checking linux/hw_breakpoint.h usability... yes >>>>>>> checking linux/hw_breakpoint.h presence... yes >>>>>>> checking for linux/hw_breakpoint.h... 
yes >>>>>>> checking for pkg-config... /usr/bin/pkg-config >>>>>>> checking pkg-config is at least version 0.9.0... yes >>>>>>> checking execinfo.h usability... yes >>>>>>> checking execinfo.h presence... yes >>>>>>> checking for execinfo.h... yes >>>>>>> checking for backtrace... yes >>>>>>> checking for backtrace_symbols_fd... yes >>>>>>> checking for xmlto... /usr/bin/xmlto >>>>>>> checking for mv... /bin/mv >>>>>>> checking for a sed that does not truncate output... /bin/sed >>>>>>> checking for asciidoc... /usr/bin/asciidoc >>>>>>> checking for asciidoctor... no >>>>>>> checking for EXT2FS... yes >>>>>>> checking for COM_ERR... yes >>>>>>> checking for REISERFS... yes >>>>>>> checking for FIEMAP_EXTENT_SHARED defined in linux/fiemap.h... yes >>>>>>> checking for EXT4_EPOCH_MASK defined in ext2fs/ext2_fs.h... yes >>>>>>> checking linux/blkzoned.h usability... yes >>>>>>> checking linux/blkzoned.h presence... yes >>>>>>> checking for linux/blkzoned.h... yes >>>>>>> checking for struct blk_zone.capacity... no >>>>>>> checking for BLKGETZONESZ defined in linux/blkzoned.h... yes >>>>>> >>>>>>> configure: error: linux/blkzoned.h does not provide blk_zone.capacity >>>>>> >>>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> Info on the file in question (linux/blkzoned.h): >>>>>>> >>>>>>> $ dpkg -S /usr/include/linux/blkzoned.h >>>>>>> linux-libc-dev:amd64: /usr/include/linux/blkzoned.h >>>>>>> >>>>>>> $ dpkg -l linux-libc-dev >>>>>>> Desired=Unknown/Install/Remove/Purge/Hold >>>>>>> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend >>>>>>> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad) >>>>>>> ||/ Name Version Architecture Description >>>>>>> +++-====================-============-============-==================================== >>>>>>> ii linux-libc-dev:amd64 5.4.0-81.91 amd64 Linux Kernel >>>>>>> Headers for development >>>>>>> >>>>>>> >>>>>>> So it appears that linux-libc-dev is way out-dated compared to my >>>>>>> kernel. 
I don't know how to update it, though... there doesn't appear >>>>>>> to be a newer version available. >>>>>> >>>>>> You could disable the zoned. >>>>>> >>>>>> ./configure --disable-zoned >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> ^ permalink raw reply [flat|nested] 20+ messages in thread
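As a side note to the configure failure quoted above: one rough way to tell in advance whether --disable-zoned will be needed is to look at the installed copy of linux/blkzoned.h. The sketch below is only a heuristic that greps the header for the word "capacity"; configure runs its own check for struct blk_zone.capacity and its result is what counts. The helper name zoned_flag and the default header path are assumptions made for this example, not part of btrfs-progs.

```shell
#!/bin/sh
# Heuristic sketch: zoned_flag is a hypothetical helper, not part of btrfs-progs.
# It prints "--disable-zoned" when the given copy of linux/blkzoned.h does not
# mention a "capacity" field, and prints nothing otherwise.
zoned_flag() {
    header="${1:-/usr/include/linux/blkzoned.h}"
    if [ -r "$header" ] && grep -q 'capacity' "$header"; then
        # Header looks new enough; no extra flag is suggested.
        :
    else
        # Old or missing header: suggest building without zoned device support.
        echo "--disable-zoned"
    fi
}

# Example use from the btrfs-progs source tree (left commented out on purpose):
#   ./configure $(zoned_flag)
```

On a system like the one in this thread (linux-libc-dev 5.4 headers under a 5.11 kernel) the helper would be expected to print --disable-zoned, matching the advice given, but that has not been verified here.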
* Re: Next steps in recovery? 2021-09-07 23:17 ` Qu Wenruo @ 2021-09-07 23:20 ` Robert Wyrick 2021-09-07 23:28 ` Qu Wenruo 0 siblings, 1 reply; 20+ messages in thread From: Robert Wyrick @ 2021-09-07 23:20 UTC (permalink / raw) To: Qu Wenruo; +Cc: Anand Jain, linux-btrfs Anything specific to address the dmesg error? On Tue, Sep 7, 2021 at 5:17 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > On 2021/9/8 4:47 AM, Robert Wyrick wrote: > > Re-running check now shows: > > > > [1/7] checking root items (0:00:55 elapsed, > > 2649102 items checked) > > [2/7] checking extents (0:02:13 elapsed, > > 1116141 items checked) > > there is no free space entry for 18365358505984-18365358522368d, 130 > > items checked) > > there is no free space entry for 18365358505984-18366416814080 > > cache appears valid but isn't 18365343072256 > > there is no free space entry for 19764429062144-19764429078528d, 348 > > items checked) > > there is no free space entry for 19764429062144-19765502410752 > > cache appears valid but isn't 19764428668928 > > wanted bytes 49152, found 16384 for off 254016221675521 elapsed, 1534 > > items checked) > > wanted bytes 1058373632, found 16384 for off 25401622167552 > > cache appears valid but isn't 25401606799360 > > there is no free space entry for 28659399229440-28659399245824d, 2636 > > items checked) > > there is no free space entry for 28659399229440-28660413235200 > > cache appears valid but isn't 28659339493376 > > wanted offset 29154336178176, found 2915433616179200:33 elapsed, 2792 > > items checked) > > wanted offset 29154336178176, found 29154336161792 > > cache appears valid but isn't 29154334474240 > > there is no free space entry for 30899331825664-30899331842048d, 3585 > > items checked) > > there is no free space entry for 30899331825664-30900272234496 > > cache appears valid but isn't 30899198492672 > > there is no free space entry for 32134011568128-32134011584512d, 4474 > > items checked) > > there is no free space entry for
32134011568128-32135075332096 > > cache appears valid but isn't 32134001590272 > > wanted offset 33148689629184, found 3314868961280000:59 elapsed, 4963 > > items checked) > > wanted offset 33148689629184, found 33148689612800 > > cache appears valid but isn't 33148687613952 > > there is no free space entry for 34611225755648-34611225772032d, 6036 > > items checked) > > there is no free space entry for 34611225755648-34612197720064 > > cache appears valid but isn't 34611123978240 > > there is no free space entry for 37374972723200-37374972739584d, 8051 > > items checked) > > there is no free space entry for 37374972723200-37376042729472 > > cache appears valid but isn't 37374968987648 > > there is no free space entry for 37484494651392-37484494667776d, 8172 > > items checked) > > there is no free space entry for 37484494651392-37485564395520 > > cache appears valid but isn't 37484490653696 > > wanted bytes 49152, found 32768 for off 377572293017606 elapsed, 8381 > > items checked) > > wanted bytes 1065517056, found 32768 for off 37757229301760 > > cache appears valid but isn't 37757221076992 > > there is no free space entry for 38414356250624-38414356267008d, 9004 > > items checked) > > there is no free space entry for 38414356250624-38415424815104 > > cache appears valid but isn't 38414351073280 > > there is no free space entry for 41509957402624-41509957419008d, 11792 > > items checked) > > there is no free space entry for 41509957402624-41511022493696 > > cache appears valid but isn't 41509948751872 > > there is no free space entry for 42293815459840-42293815492608d, 12469 > > items checked) > > there is no free space entry for 42293815459840-42294887579648 > > cache appears valid but isn't 42293813837824 > > All free space cache related problems. > > You can just clear all v1 cache: > > $ btrfs check --clear-space-cache v1 <dev> > > Then it's also a good time to migrate to v2 cache, which is safer and > faster. 
> > # mount <dev> -o space_cache=v2 <mnt> > > Thanks, > Qu > > [3/7] checking free space cache (0:04:18 elapsed, 14910 > > items checked) > > [4/7] checking fs roots (0:00:26 elapsed, > > 108894 items checked) > > [5/7] checking csums (without verifying data) (0:00:03 elapsed, > > 12350321 items checked) > > [6/7] checking root refs (0:00:00 elapsed, 4 > > items checked) > > [7/7] checking quota groups skipped (not enabled on this FS) > > found 15729059287040 bytes used, error(s) found > > total csum bytes: 15313288548 > > total tree bytes: 18286706688 > > total fs tree bytes: 1791819776 > > total extent tree bytes: 229097472 > > btree space waste bytes: 1018811836 > > file data blocks allocated: 51587230765056 > > referenced 15627926974464 > > > > On Tue, Sep 7, 2021 at 11:17 AM Robert Wyrick <rob@wyrick.org> wrote: > >> > >> Looks like I spoke too soon. > >> > >> I can now mount the FS readonly. > >> > >> I see this error in dmesg: > >> [58995.896369] CPU: 10 PID: 83845 Comm: btrfs-transacti Tainted: P > >> OE 5.11.0-27-generic #29~20.04.1-Ubuntu > >> [58995.896373] Hardware name: System manufacturer System Product > >> Name/PRIME X370-PRO, BIOS 0515 03/30/2017 > >> [58995.896376] RIP: 0010:btrfs_run_delayed_refs+0x1af/0x200 [btrfs] > >> [58995.896422] Code: 8b 55 50 f0 48 0f ba aa 48 0a 00 00 03 72 20 83 > >> f8 fb 74 3c 83 f8 e2 74 37 89 c6 48 c7 c7 50 7e 77 c0 89 45 d0 e8 96 > >> d4 4d d3 <0f> 0b 8b 45 d0 89 c1 ba 4c 08 00 00 4c 89 ef 89 45 d0 48 c7 > >> c6 c0 > >> [58995.896425] RSP: 0018:ffffb89a4a0dfdf8 EFLAGS: 00010282 > >> [58995.896428] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000027 > >> [58995.896430] RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff960fdf098ac8 > >> [58995.896432] RBP: ffffb89a4a0dfe40 R08: ffff960fdf098ac0 R09: ffffb89a4a0dfbb8 > >> [58995.896434] R10: 0000000000000001 R11: 0000000000000001 R12: ffff96036a7d5378 > >> [58995.896435] R13: ffff960115028888 R14: ffff96036a7d5200 R15: 0000000000000000 > >> 
[58995.896437] FS: 0000000000000000(0000) GS:ffff960fdf080000(0000) > >> knlGS:0000000000000000 > >> [58995.896439] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> [58995.896441] CR2: 00005616936891e8 CR3: 000000010fe46000 CR4: 00000000003506e0 > >> [58995.896444] Call Trace: > >> [58995.896450] btrfs_commit_transaction+0x2c3/0xa80 [btrfs] > >> [58995.896500] ? start_transaction+0xd5/0x590 [btrfs] > >> [58995.896549] transaction_kthread+0x138/0x1b0 [btrfs] > >> [58995.896596] kthread+0x114/0x150 > >> [58995.896604] ? btrfs_cleanup_transaction+0x570/0x570 [btrfs] > >> [58995.896649] ? kthread_park+0x90/0x90 > >> [58995.896653] ret_from_fork+0x22/0x30 > >> [58995.896661] ---[ end trace c8ba04bdf2113cae ]--- > >> [58995.896664] BTRFS: error (device sdf) in > >> btrfs_run_delayed_refs:2124: errno=-17 Object already exists > >> [58995.896669] BTRFS info (device sdf): forced readonly > >> [58995.896672] BTRFS warning (device sdf): Skipping commit of aborted > >> transaction. > >> [58995.896674] BTRFS: error (device sdf) in cleanup_transaction:1939: > >> errno=-17 Object already exists > >> > >> Read-only is better than nothing, but what would be my next steps? > >> > >> On Tue, Sep 7, 2021 at 11:02 AM Robert Wyrick <rob@wyrick.org> wrote: > >>> > >>> Ran a repair: > >>> > >>> $ sudo ./btrfs check --repair -p /dev/sda # I did NOT make install, > >>> just ran from the compiled directory > >>> enabling repair mode > >>> WARNING: > >>> > >>> Do not use --repair unless you are advised to do so by a developer > >>> or an experienced user, and then only after having accepted that no > >>> fsck can successfully repair all types of filesystem corruption. Eg. > >>> some software or hardware bugs can fatally damage a volume. > >>> The operation will start in 10 seconds. > >>> Use Ctrl-C to stop it. > >>> 10 9 8 7 6 5 4 3 2 1 > >>> Starting repair. > >>> Opening filesystem to check... 
> >>> Checking filesystem on /dev/sda > >>> UUID: 75f1f45c-552e-4ae2-a56f-46e44b6647cf > >>> [1/7] checking root items (0:00:59 elapsed, > >>> 2649102 items checked) > >>> Fixed 0 roots. > >>> Reset extent item (38179182174208) generation to 4057084elapsed, > >>> 1116143 items checked) > >>> No device size related problem found (0:02:22 elapsed, > >>> 1116143 items checked) > >>> [2/7] checking extents (0:02:23 elapsed, > >>> 1116143 items checked) > >>> cache and super generation don't match, space cache will be invalidated > >>> [3/7] checking free space cache (0:00:00 elapsed) > >>> Deleting bad dir index [8348950,96,3] root 259 (0:00:25 elapsed, > >>> 106695 items checked) > >>> repairing missing dir index item for inode 834922400:26 elapsed, > >>> 108893 items checked) > >>> [4/7] checking fs roots (0:01:04 elapsed, > >>> 217787 items checked) > >>> [5/7] checking csums (without verifying data) (0:00:04 elapsed, > >>> 12350321 items checked) > >>> [6/7] checking root refs (0:00:00 elapsed, 4 > >>> items checked) > >>> [7/7] checking quota groups skipped (not enabled on this FS) > >>> found 15729059057664 bytes used, no error found > >>> total csum bytes: 15313288548 > >>> total tree bytes: 18286739456 > >>> total fs tree bytes: 1791819776 > >>> total extent tree bytes: 229130240 > >>> btree space waste bytes: 1018844959 > >>> file data blocks allocated: 51587230502912 > >>> referenced 15627926712320 > >>> > >>> I can now mount the filesystem successfully! Thank you for your help. > >>> > >>> I do have some additional questions if you don't mind... > >>> I am already using RAID 1 to handle single disk outages. I assume > >>> things could have gone much worse and I could have lost the whole > >>> filesystem. Aside from backups (I know, I know), is there anything > >>> else I can do to prevent such issues or make them easier to recover > >>> from? Could this problem have been avoided/detected earlier? 
This > >>> wasn't a disk failure and according to memtest86+, it wasn't due to > >>> bad memory either.... I don't run scrubs very often. Should I? I > >>> guess the more general question is: What are the best practices for > >>> maintaining a healthy btrfs file system? > >>> > >>> Thanks again! ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Next steps in recovery? 2021-09-07 23:20 ` Robert Wyrick @ 2021-09-07 23:28 ` Qu Wenruo 0 siblings, 0 replies; 20+ messages in thread From: Qu Wenruo @ 2021-09-07 23:28 UTC (permalink / raw) To: Robert Wyrick, Qu Wenruo; +Cc: Anand Jain, linux-btrfs

On 2021/9/8 7:20 AM, Robert Wyrick wrote:
> Anything specific to address the dmesg error?

The dmesg error seems to be caused by the corrupted free space cache.

So as long as it has not caused any lasting problems (i.e., committing bad data to disk), the filesystem should be safe once the v1 space cache is cleared.

But to be extra safe, you'd better run btrfs check again after clearing the v1 space cache.

Thanks,
Qu

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Next steps in recovery?
  2021-09-07 17:02 ` Robert Wyrick
  2021-09-07 17:17   ` Robert Wyrick
@ 2021-09-07 23:15   ` Qu Wenruo
  2021-09-08  1:59     ` Su Yue
  1 sibling, 1 reply; 20+ messages in thread
From: Qu Wenruo @ 2021-09-07 23:15 UTC (permalink / raw)
  To: Robert Wyrick; +Cc: Anand Jain, linux-btrfs

On 2021/9/8 1:02 AM, Robert Wyrick wrote:
> Ran a repair:
>
> $ sudo ./btrfs check --repair -p /dev/sda # I did NOT make install,
> just ran from the compiled directory
> enabling repair mode
> WARNING:
>
> Do not use --repair unless you are advised to do so by a developer
> or an experienced user, and then only after having accepted that no
> fsck can successfully repair all types of filesystem corruption. Eg.
> some software or hardware bugs can fatally damage a volume.
> The operation will start in 10 seconds.
> Use Ctrl-C to stop it.
> 10 9 8 7 6 5 4 3 2 1
> Starting repair.
> Opening filesystem to check...
> Checking filesystem on /dev/sda
> UUID: 75f1f45c-552e-4ae2-a56f-46e44b6647cf
> [1/7] checking root items (0:00:59 elapsed,
> 2649102 items checked)
> Fixed 0 roots.
> Reset extent item (38179182174208) generation to 4057084elapsed,
> 1116143 items checked)
> No device size related problem found (0:02:22 elapsed,
> 1116143 items checked)
> [2/7] checking extents (0:02:23 elapsed,
> 1116143 items checked)
> cache and super generation don't match, space cache will be invalidated
> [3/7] checking free space cache (0:00:00 elapsed)
> Deleting bad dir index [8348950,96,3] root 259 (0:00:25 elapsed,
> 106695 items checked)
> repairing missing dir index item for inode 834922400:26 elapsed,
> 108893 items checked)
> [4/7] checking fs roots (0:01:04 elapsed,
> 217787 items checked)
> [5/7] checking csums (without verifying data) (0:00:04 elapsed,
> 12350321 items checked)
> [6/7] checking root refs (0:00:00 elapsed, 4
> items checked)
> [7/7] checking quota groups skipped (not enabled on this FS)
> found 15729059057664 bytes used, no error found
> total csum bytes: 15313288548
> total tree bytes: 18286739456
> total fs tree bytes: 1791819776
> total extent tree bytes: 229130240
> btree space waste bytes: 1018844959
> file data blocks allocated: 51587230502912
> referenced 15627926712320
>
> I can now mount the filesystem successfully! Thank you for your help.
>
> I do have some additional questions if you don't mind...
> I am already using RAID 1 to handle single disk outages.

One thing to note: RAID is not perfect, and it is not even close to a
proper backup.

RAID is really only suitable for handling disk failures, nothing more
than that.

On the spectrum of backup options, RAID is only just better than
nothing.

In this particular case, all the corruption comes from bitflips, so
every copy is corrupted and no profile can save the day.

> I assume
> things could have gone much worse and I could have lost the whole
> filesystem.

If you had been running a newer kernel all along, it could have
detected the extent generation problem before writing the corrupted
data back to disk, which would have saved the day.

> Aside from backups (I know, I know), is there anything
> else I can do to prevent such issues or make them easier to recover
> from?

A newer kernel (v5.11 or later) can prevent it.
When such a rejection happens it will not feel comfortable, though, as
it will most likely force the fs to go read-only (RO).
But that is still far better than writing bad data to disk.

> Could this problem have been avoided/detected earlier?

Yes, with a newer kernel.

> This
> wasn't a disk failure and according to memtest86+, it wasn't due to
> bad memory either....

I'm still not convinced. Maybe you can try running memtester (which
runs in user space; since the kernel does the page mapping, it may put
a different workload on the memory controller than memtest86+ does).

I say that because the extent generation corruption is a super obvious
bitflip.

> I don't run scrubs very often. Should I?

For newer kernels, the corruption can be rejected in the first place,
so scrub is only going to detect problems that are already in the fs.

For older kernels, scrub won't detect this problem anyway.

So I guess you don't need to scrub that frequently, but it's still
recommended.
Maybe monthly?

> I
> guess the more general question is: What are the best practices for
> maintaining a healthy btrfs file system?

Well, healthy hardware and a kernel version that balances cutting edge
against stable.
Personally I lean more towards cutting edge, though.

Thanks,
Qu
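As a rough illustration of the kernel-version point in Qu's reply, here is a minimal Python sketch that compares a kernel release string against the v5.11 threshold he mentions. The function names and the `MIN_KERNEL` constant are made up for this example and are not part of btrfs-progs or the kernel; the threshold itself is taken on trust from the reply and distributions may backport checks differently.

```python
import platform
import re
from typing import Tuple

# Assumption: the threshold comes from the reply above ("v5.11 and newer").
MIN_KERNEL = (5, 11)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string like '5.11.0-27-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_new_enough(release: str,
                         minimum: Tuple[int, int] = MIN_KERNEL) -> bool:
    """Return True if the release is at or above the minimum (major, minor)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`.
    release = platform.release()
    verdict = "meets" if kernel_is_new_enough(release) else "is older than"
    print(f"kernel {release} {verdict} v{MIN_KERNEL[0]}.{MIN_KERNEL[1]}")
```

With the releases mentioned in this thread, `kernel_is_new_enough("5.11.0-27-generic")` is true, while a 4.15 release string would be reported as too old.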
* Re: Next steps in recovery?
  2021-09-07 23:15 ` Qu Wenruo
@ 2021-09-08  1:59   ` Su Yue
  2021-09-08  6:50     ` Robert Wyrick
  0 siblings, 1 reply; 20+ messages in thread
From: Su Yue @ 2021-09-08 1:59 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Robert Wyrick, Anand Jain, linux-btrfs

On Wed 08 Sep 2021 at 07:15, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:

> On 2021/9/8 1:02 AM, Robert Wyrick wrote:
>> This
>> wasn't a disk failure and according to memtest86+, it wasn't due to
>> bad memory either....
>
> I'm still not convinced. Maybe you can try running memtester (which
> runs in user space; since the kernel does the page mapping, it may put
> a different workload on the memory controller than memtest86+ does).
>
testmem5 with the anta777 config may also help. It's widely used to
test memory stability after overclocking, though it only runs on evil
Windows.
It won't take too much time: two cycles over 64G of memory take about
4~6 hours.

--
Su

> I say that because the extent generation corruption is a super obvious
> bitflip.
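The "super obvious bitflip" remark can be checked by hand. The sketch below, in Python, takes the two numbers from the `invalid generation` error in the original report (`have 140737491486755 expect (0, 4057084]`) and looks for single-bit flips that would bring the bad value back into the expected range. The helper name `single_bit_candidates` is invented for this example; it is not how `btrfs check` works internally, only a way to see why the value looks like one flipped bit.

```python
from typing import List, Tuple


def single_bit_candidates(bad: int, upper: int,
                          width: int = 64) -> List[Tuple[int, int]]:
    """Return (bit, value) pairs where flipping one bit of `bad`
    yields a value inside the half-open range (0, upper]."""
    hits = []
    for bit in range(width):
        candidate = bad ^ (1 << bit)  # flip exactly one bit
        if 0 < candidate <= upper:
            hits.append((bit, candidate))
    return hits


if __name__ == "__main__":
    # Values copied from the error message in the original report.
    bad_generation = 140737491486755
    max_generation = 4057084
    for bit, value in single_bit_candidates(bad_generation, max_generation):
        print(f"flipping bit {bit} gives generation {value}")
    # prints: flipping bit 47 gives generation 3131427
```

Exactly one candidate comes back: the bad value is 2^47 + 3131427, so clearing bit 47 gives a generation that sits comfortably inside the expected range. Note that the repair did not try to guess this value; according to its output it reset the generation to the upper bound, 4057084.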
* Re: Next steps in recovery? 2021-09-08 1:59 ` Su Yue @ 2021-09-08 6:50 ` Robert Wyrick 0 siblings, 0 replies; 20+ messages in thread From: Robert Wyrick @ 2021-09-08 6:50 UTC (permalink / raw) To: Su Yue; +Cc: Qu Wenruo, Anand Jain, linux-btrfs I ran memtester for 2 complete passes. No failures. I only tested 51G of my 64G, though. :/ That's how much was "free". On Tue, Sep 7, 2021 at 8:10 PM Su Yue <l@damenly.su> wrote: > > > On Wed 08 Sep 2021 at 07:15, Qu Wenruo <quwenruo.btrfs@gmx.com> > wrote: > > > On 2021/9/8 上午1:02, Robert Wyrick wrote: > >> Ran a repair: > >> > >> $ sudo ./btrfs check --repair -p /dev/sda # I did NOT make > >> install, > >> just ran from the compiled directory > >> enabling repair mode > >> WARNING: > >> > >> Do not use --repair unless you are advised to do so by a > >> developer > >> or an experienced user, and then only after having accepted > >> that no > >> fsck can successfully repair all types of filesystem > >> corruption. Eg. > >> some software or hardware bugs can fatally damage a volume. > >> The operation will start in 10 seconds. > >> Use Ctrl-C to stop it. > >> 10 9 8 7 6 5 4 3 2 1 > >> Starting repair. > >> Opening filesystem to check... > >> Checking filesystem on /dev/sda > >> UUID: 75f1f45c-552e-4ae2-a56f-46e44b6647cf > >> [1/7] checking root items (0:00:59 > >> elapsed, > >> 2649102 items checked) > >> Fixed 0 roots. 
> >> Reset extent item (38179182174208) generation to > >> 4057084elapsed, > >> 1116143 items checked) > >> No device size related problem found (0:02:22 > >> elapsed, > >> 1116143 items checked) > >> [2/7] checking extents (0:02:23 > >> elapsed, > >> 1116143 items checked) > >> cache and super generation don't match, space cache will be > >> invalidated > >> [3/7] checking free space cache (0:00:00 > >> elapsed) > >> Deleting bad dir index [8348950,96,3] root 259 (0:00:25 > >> elapsed, > >> 106695 items checked) > >> repairing missing dir index item for inode 834922400:26 > >> elapsed, > >> 108893 items checked) > >> [4/7] checking fs roots (0:01:04 > >> elapsed, > >> 217787 items checked) > >> [5/7] checking csums (without verifying data) (0:00:04 > >> elapsed, > >> 12350321 items checked) > >> [6/7] checking root refs (0:00:00 > >> elapsed, 4 > >> items checked) > >> [7/7] checking quota groups skipped (not enabled on this FS) > >> found 15729059057664 bytes used, no error found > >> total csum bytes: 15313288548 > >> total tree bytes: 18286739456 > >> total fs tree bytes: 1791819776 > >> total extent tree bytes: 229130240 > >> btree space waste bytes: 1018844959 > >> file data blocks allocated: 51587230502912 > >> referenced 15627926712320 > >> > >> I can now mount the filesystem successfully! Thank you for > >> your help. > >> > >> I do have some additional questions if you don't mind... > >> I am already using RAID 1 to handle single disk outages. > > > > One thing to note is, RAID is not perfect, not even close to > > proper backup. > > > > RAID is really only suitable to handle disk failures, nothing > > more than > > that. > > > > In a spectrum of backup, RAID is really just better than > > nothing. > > > > In this particular case, all the corruption is from bitflips, > > thus all > > copies are corrupted, no profile can save the day. > > > >> I assume > >> things could have gone much worse and I could have lost the > >> whole > >> filesystem. 
> > > > If you're using newer kernels all time, the kernel can detect > > the extent > > generation problem before writing the corrupted data back to > > disk, thus > > save the day. > > > >> Aside from backups (I know, I know), is there anything > >> else I can do to prevent such issues or make them easier to > >> recover > >> from? > > > > Newer kernel (v5.11 and newer) can prevent it. > > Although when such rejection happens, it will not feel that > > comfort > > though, as it would mostly result the fs to go RO. > > But still way better than writing bad data onto disks. > > > >> Could this problem have been avoided/detected earlier? > > > > Yes, newer kernel. > > > >> This > >> wasn't a disk failure and according to memtest86+, it wasn't > >> due to > >> bad memory either.... > > > > I still don't believe, maybe you can try to run memtester (which > > is ran > > in user space, and since we have kernel doing the page mapping, > > it may > > expose a different workload on the memory controller than > > memtest86+) > > > And testmem5 using config anta777 may help. It's widely used to > test > memory stability after overclocking but runs on evil Windows > though. > It won't take too much time. Two cicles test of 64G memory > consumes > about 4~6 hours. > > -- > Su > > Subject > > Since the extent generation corruption is a super obvious > > bitflip. > > > >> I don't run scrubs very often. Should I? > > > > For newer kernels, the corruption can be rejected in first > > place, thus > > the scrub is only going to detect problems already in the fs. > > > > For older kernels, scrub won't detect the problem anyw. > > > > So I guess you don't need that frequent scrub, but it's still > > recommended. > > Maybe monthly? > > > >> I > >> guess the more general question is: What are the best > >> practices for > >> maintaining a healthy btrfs file system? > > > > Well, healthy hardware, balanced kernel version between cutting > > edge and > > stable. 
> > Personally I'm more towards cutting edge though.
> >
> > Thanks,
> > Qu
> >
> >>
> >> Thanks again!
> >>
> >> On Mon, Sep 6, 2021 at 10:53 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> >>>
> >>>
> >>>
> >>> On 2021/9/7 12:36 PM, Robert Wyrick wrote:
> >>>> What exactly would I be disabling?  I don't know what zoned does.
> >>>
> >>> The zoned device support.
> >>>
> >>> If you don't have any host-managed zoned device, there is no reason you
> >>> would want to enable it.
> >>>
> >>> https://zonedstorage.io/introduction/
> >>>
> >>> Thanks,
> >>> Qu
> >>>
> >>>>
> >>>> On Mon, Sep 6, 2021, 9:07 PM Anand Jain <anand.jain@oracle.com> wrote:
> >>>>>
> >>>>> On 07/09/2021 10:36, Robert Wyrick wrote:
> >>>>>> Trying to build latest btrfs-progs.  I'm seeing errors in the configure script.
> >>>>>>
> >>>>>> $ cat /etc/os-release
> >>>>>> NAME="Linux Mint"
> >>>>>> VERSION="20.2 (Uma)"
> >>>>>> ID=linuxmint
> >>>>>> ID_LIKE=ubuntu
> >>>>>> PRETTY_NAME="Linux Mint 20.2"
> >>>>>> VERSION_ID="20.2"
> >>>>>> HOME_URL="https://www.linuxmint.com/"
> >>>>>> SUPPORT_URL="https://forums.linuxmint.com/"
> >>>>>> BUG_REPORT_URL="http://linuxmint-troubleshooting-guide.readthedocs.io/en/latest/"
> >>>>>> PRIVACY_POLICY_URL="https://www.linuxmint.com/"
> >>>>>> VERSION_CODENAME=uma
> >>>>>> UBUNTU_CODENAME=focal
> >>>>>>
> >>>>>> $ uname -a
> >>>>>> Linux bigbox 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
> >>>>>>
> >>>>>> $ ./configure
> >>>>>> checking for gcc... gcc
> >>>>>> checking whether the C compiler works... yes
> >>>>>> checking for C compiler default output file name... a.out
> >>>>>> checking for suffix of executables...
> >>>>>> checking whether we are cross compiling... no
> >>>>>> checking for suffix of object files... o
> >>>>>> checking whether we are using the GNU C compiler... yes
> >>>>>> checking whether gcc accepts -g... yes
> >>>>>> checking for gcc option to accept ISO C89... none needed
> >>>>>> checking how to run the C preprocessor... gcc -E
> >>>>>> checking for grep that handles long lines and -e... /bin/grep
> >>>>>> checking for egrep... /bin/grep -E
> >>>>>> checking for ANSI C header files... yes
> >>>>>> checking for sys/types.h... yes
> >>>>>> checking for sys/stat.h... yes
> >>>>>> checking for stdlib.h... yes
> >>>>>> checking for string.h... yes
> >>>>>> checking for memory.h... yes
> >>>>>> checking for strings.h... yes
> >>>>>> checking for inttypes.h... yes
> >>>>>> checking for stdint.h... yes
> >>>>>> checking for unistd.h... yes
> >>>>>> checking minix/config.h usability... no
> >>>>>> checking minix/config.h presence... no
> >>>>>> checking for minix/config.h... no
> >>>>>> checking whether it is safe to define __EXTENSIONS__... yes
> >>>>>> checking for gcc... (cached) gcc
> >>>>>> checking whether we are using the GNU C compiler... (cached) yes
> >>>>>> checking whether gcc accepts -g... (cached) yes
> >>>>>> checking for gcc option to accept ISO C89... (cached) none needed
> >>>>>> checking whether C compiler accepts -std=gnu90... yes
> >>>>>> checking build system type... x86_64-pc-linux-gnu
> >>>>>> checking host system type... x86_64-pc-linux-gnu
> >>>>>> checking for an ANSI C-conforming const... yes
> >>>>>> checking for working volatile... yes
> >>>>>> checking whether byte ordering is bigendian... no
> >>>>>> checking for special C compiler options needed for large files... no
> >>>>>> checking for _FILE_OFFSET_BITS value needed for large files... no
> >>>>>> checking for a BSD-compatible install... /usr/bin/install -c
> >>>>>> checking whether ln -s works... yes
> >>>>>> checking for ar... ar
> >>>>>> checking for rm... /bin/rm
> >>>>>> checking for rmdir... /bin/rmdir
> >>>>>> checking for openat... yes
> >>>>>> checking for reallocarray... yes
> >>>>>> checking for clock_gettime... yes
> >>>>>> checking linux/perf_event.h usability... yes
> >>>>>> checking linux/perf_event.h presence... yes
> >>>>>> checking for linux/perf_event.h... yes
> >>>>>> checking linux/hw_breakpoint.h usability... yes
> >>>>>> checking linux/hw_breakpoint.h presence... yes
> >>>>>> checking for linux/hw_breakpoint.h... yes
> >>>>>> checking for pkg-config... /usr/bin/pkg-config
> >>>>>> checking pkg-config is at least version 0.9.0... yes
> >>>>>> checking execinfo.h usability... yes
> >>>>>> checking execinfo.h presence... yes
> >>>>>> checking for execinfo.h... yes
> >>>>>> checking for backtrace... yes
> >>>>>> checking for backtrace_symbols_fd... yes
> >>>>>> checking for xmlto... /usr/bin/xmlto
> >>>>>> checking for mv... /bin/mv
> >>>>>> checking for a sed that does not truncate output... /bin/sed
> >>>>>> checking for asciidoc... /usr/bin/asciidoc
> >>>>>> checking for asciidoctor... no
> >>>>>> checking for EXT2FS... yes
> >>>>>> checking for COM_ERR... yes
> >>>>>> checking for REISERFS... yes
> >>>>>> checking for FIEMAP_EXTENT_SHARED defined in linux/fiemap.h... yes
> >>>>>> checking for EXT4_EPOCH_MASK defined in ext2fs/ext2_fs.h... yes
> >>>>>> checking linux/blkzoned.h usability... yes
> >>>>>> checking linux/blkzoned.h presence... yes
> >>>>>> checking for linux/blkzoned.h... yes
> >>>>>> checking for struct blk_zone.capacity... no
> >>>>>> checking for BLKGETZONESZ defined in linux/blkzoned.h... yes
> >>>>>
> >>>>>> configure: error: linux/blkzoned.h does not provide blk_zone.capacity
> >>>>>
> >>>>>
> >>>>>>
> >>>>>> ---
> >>>>>>
> >>>>>> Info on the file in question (linux/blkzoned.h):
> >>>>>>
> >>>>>> $ dpkg -S /usr/include/linux/blkzoned.h
> >>>>>> linux-libc-dev:amd64: /usr/include/linux/blkzoned.h
> >>>>>>
> >>>>>> $ dpkg -l linux-libc-dev
> >>>>>> Desired=Unknown/Install/Remove/Purge/Hold
> >>>>>> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
> >>>>>> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
> >>>>>> ||/ Name                 Version      Architecture Description
> >>>>>> +++-====================-============-============-====================================
> >>>>>> ii  linux-libc-dev:amd64 5.4.0-81.91  amd64        Linux Kernel Headers for development
> >>>>>>
> >>>>>>
> >>>>>> So it appears that linux-libc-dev is way out of date compared to my
> >>>>>> kernel.  I don't know how to update it, though... there doesn't appear
> >>>>>> to be a newer version available.
> >>>>>
> >>>>> You could disable zoned support.
> >>>>>
> >>>>>     ./configure --disable-zoned
> >>>>>
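The failing configure check and the suggested workaround can be combined into a small pre-flight step. What follows is only a sketch under stated assumptions: `zoned_flag` is a hypothetical helper name, not part of btrfs-progs, and it greps the header text for a `__u64 capacity;` member rather than compiling a test program the way configure does, so treat it as a rough heuristic. The only configure option it emits is the `--disable-zoned` flag quoted above.

```shell
#!/bin/sh
# zoned_flag: hypothetical helper (not part of btrfs-progs).
# Prints "--disable-zoned" when the given UAPI header is missing or does
# not appear to declare a "capacity" member; prints nothing otherwise,
# so that configure keeps its defaults.
# Assumption: the member is declared as "__u64 capacity;". This is a text
# match only, not the compile test that configure itself performs.
zoned_flag() {
    header="${1:-/usr/include/linux/blkzoned.h}"
    if [ -r "$header" ] && grep -Eq '__u64[[:space:]]+capacity;' "$header"; then
        :   # header looks new enough, no extra flag needed
    else
        echo "--disable-zoned"
    fi
}

# Possible usage from the btrfs-progs source tree (left unquoted on
# purpose so that an empty result adds no argument):
#   ./configure $(zoned_flag)
```

On a system like the one in the report, where linux-libc-dev 5.4 ships the header, this should end up passing `--disable-zoned`; with newer headers it would leave the configure defaults alone.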
Thread overview: 20+ messages
2021-09-03  2:43 Next steps in recovery? Robert Wyrick
2021-09-03  2:47 ` Robert Wyrick
2021-09-03  6:48 ` Qu Wenruo
2021-09-03  6:53 ` Qu Wenruo
[not found] ` <CAA_aC99-C8xOf7EAvJAMk2ZkYSaN2vyK7YFMw06utQ0T+tsh9A@mail.gmail.com>
2021-09-05 22:03 ` Qu Wenruo
2021-09-06 14:42 ` Robert Wyrick
2021-09-06 23:26 ` Qu Wenruo
2021-09-07  2:36 ` Robert Wyrick
2021-09-07  3:06 ` Anand Jain
2021-09-07  4:36 ` Robert Wyrick
2021-09-07  4:53 ` Qu Wenruo
2021-09-07 17:02 ` Robert Wyrick
2021-09-07 17:17 ` Robert Wyrick
2021-09-07 20:47 ` Robert Wyrick
2021-09-07 23:17 ` Qu Wenruo
2021-09-07 23:20 ` Robert Wyrick
2021-09-07 23:28 ` Qu Wenruo
2021-09-07 23:15 ` Qu Wenruo
2021-09-08  1:59 ` Su Yue
2021-09-08  6:50 ` Robert Wyrick