From: Larkin Lowrey <llowrey@nuclearwinter.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Chris Murphy <lists@colorremedies.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Scrub aborts due to corrupt leaf
Date: Wed, 10 Oct 2018 11:44:26 -0400
Message-ID: <dd08eb00-faf8-cd88-aeb6-e5ed75a2889f@nuclearwinter.com>
In-Reply-To: <9c7290ea-668d-c10a-9328-91adfac14d5a@nuclearwinter.com>

On 9/11/2018 11:23 AM, Larkin Lowrey wrote:
> On 8/29/2018 1:32 AM, Qu Wenruo wrote:
>>
>> On 2018/8/28 9:56 PM, Chris Murphy wrote:
>>> On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> 
>>> wrote:
>>>>
>>>> On 2018/8/28 9:29 PM, Larkin Lowrey wrote:
>>>>> On 8/27/2018 10:12 PM, Larkin Lowrey wrote:
>>>>>> On 8/27/2018 12:46 AM, Qu Wenruo wrote:
>>>>>>>> The system uses ECC memory and edac-util has not reported any 
>>>>>>>> errors.
>>>>>>>> However, I will run a memtest anyway.
>>>>>>> So it should not be the memory problem.
>>>>>>>
>>>>>>> BTW, what's the current generation of the fs?
>>>>>>>
>>>>>>> # btrfs inspect dump-super <device> | grep generation
>>>>>>>
>>>>>>> The corrupted leaf has generation 2862, so I'm not sure how
>>>>>>> recently the corruption happened.
>>>>>> generation              358392
>>>>>> chunk_root_generation   357256
>>>>>> cache_generation        358392
>>>>>> uuid_tree_generation    358392
>>>>>> dev_item.generation     0
>>>>>>
>>>>>> I don't recall the last time I ran a scrub but I doubt it has been
>>>>>> more than a year.
>>>>>>
>>>>>> I am running 'btrfs check --init-csum-tree' now. Hopefully that 
>>>>>> clears
>>>>>> everything up.
>>>>> No such luck:
>>>>>
>>>>> Creating a new CRC tree
>>>>> Checking filesystem on /dev/Cached/Backups
>>>>> UUID: acff5096-1128-4b24-a15e-4ba04261edc3
>>>>> Reinitialize checksum tree
>>>>> csum result is 0 for block 2412149436416
>>>>> extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, 
>>>>> value -28
>>>> It's ENOSPC, meaning btrfs can't find enough space for the new csum 
>>>> tree
>>>> blocks.
>>> Seems bogus, there's >4TiB unallocated.
>> What a shame.
>> Btrfs won't try to allocate new chunk if we're allocating new tree
>> blocks for metadata trees (extent, csum, etc).
>>
>> One quick (and dirty) way to avoid such limitation is to use the
>> following patch
>
>> <<patch removed>>
>
> No luck.
>
> # ./btrfs check --init-csum-tree /dev/Cached/Backups
> Creating a new CRC tree
> Opening filesystem to check...
> Checking filesystem on /dev/Cached/Backups
> UUID: acff5096-1128-4b24-a15e-4ba04261edc3
> Reinitialize checksum tree
> Segmentation fault (core dumped)
>
>  btrfs[16575]: segfault at 7ffc4f74ef60 ip 000000000040d4c3 sp 
> 00007ffc4f74ef50 error 6 in btrfs[400000+bf000]
>
> # ./btrfs --version
> btrfs-progs v4.17.1
>
> I cloned btrfs-progs from git and applied your patch.
>
> BTW, I've been having tons of trouble with two hosts after updating 
> from kernel 4.17.12 to 4.17.14 and beyond. The fs will become 
> unresponsive and all processes will end up stuck waiting on io. The 
> system will end up totally idle but unable to perform any io on the 
> filesystem. So far things have been stable after reverting back to 
> 4.17.12. It looks like there was a btrfs change in 4.17.13. Could that 
> be related to this csum tree corruption?

About once a week or so, I run into the situation described above, where 
the FS seems to deadlock: all IO to the FS blocks and there is no IO 
activity at all. I have to hard-reboot the system to recover. There are 
no error indications except for the following, which occurs well before 
the FS freezes up:

BTRFS warning (device dm-3): block group 78691883286528 has wrong amount of free space
BTRFS warning (device dm-3): failed to load free space cache for block group 78691883286528, rebuilding it now
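
To see how often these warnings show up ahead of a freeze, a saved kernel log can be filtered for them. The following is only a minimal sketch: it assumes the exact message wording quoted above (other kernel versions may phrase it differently), and the function name find_free_space_warnings is hypothetical, not part of any btrfs tool.

```python
import re

# Assumption: the warning wording matches the two messages quoted above.
WARNING_RE = re.compile(
    r"BTRFS warning \(device (?P<device>[^)]+)\): "
    r"(?:block group (?P<bg1>\d+) has wrong amount of free space"
    r"|failed to load free space cache for block group (?P<bg2>\d+))"
)


def find_free_space_warnings(log_text):
    """Return sorted, unique (device, block_group) pairs found in log_text."""
    # Collapse all whitespace so messages wrapped by a mail client still match.
    normalized = " ".join(log_text.split())
    hits = set()
    for match in WARNING_RE.finditer(normalized):
        block_group = match.group("bg1") or match.group("bg2")
        hits.add((match.group("device"), int(block_group)))
    return sorted(hits)
```

Feeding it the text of a saved log (for example, the output of dmesg redirected to a file) gives one entry per affected device and block group, which makes it easier to tell whether the same block group is involved each time.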

Do I have any options other than nuking the FS and starting over?

--Larkin
