linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Qu Wenruo <wqu@suse.de>,
	dsterba@suse.cz, Nikolay Borisov <nborisov@suse.com>,
	linux-btrfs@vger.kernel.org, Josef Bacik <josef@toxicpanda.com>
Subject: Re: [PATCH] btrfs: extent_io: Handle memory allocation failure in __clear_extent_bit()
Date: Mon, 22 Apr 2019 13:44:00 +0800	[thread overview]
Message-ID: <53b3998b-4bb0-6bd9-3a5f-0292f812ee81@gmx.com> (raw)
In-Reply-To: <a7d7c706-6e42-21bb-c42d-c60529bcd946@gmx.com>



On 2019/4/18 下午7:54, Qu Wenruo wrote:
>
>
> On 2019/4/18 下午7:51, Qu Wenruo wrote:
>>
>>
>> On 2019/4/18 下午7:38, David Sterba wrote:
>>> On Thu, Apr 18, 2019 at 03:30:20PM +0800, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2019/4/18 下午3:24, Nikolay Borisov wrote:
>>>>>
>>>>>
>>>>> On 18.04.19 г. 10:21 ч., Qu Wenruo wrote:
>>>>>> There is a BUG_ON() in __clear_extent_bit() for memory allocation
>>>>>> failure.
>>>>>>
>>>>>> While comment of __clear_extent_bit() says it can return error, but we
>>>>>> always return 0.
>>>>>>
>>>>>> Some __clear_extent_bit() callers just ignore the return value, while
>>>>>> some still expect error.
>>>>>>
>>>>>> Let's return proper error for this memory allocation anyway, to remove
>>>>>> that BUG_ON() as a first step, so at least we can continue test.
>>>>>
>>>>> I remember Josef did some changes into this code and said that prealloc
>>>>> shouldn't fail because this will cause mayhem down the road i.e. proper
>>>>> error handling is missing. If anything I think it should be added first
>>>>> and then remove the BUG_ONs.
>>>>
>>>> That's true, we could have some strange lockup due to
>>>> lock_extent_bits(), as if some clear_extent_bits() failed due to ENOMEM
>>>> and caller just ignore the error, we could have a lockup.
>>>
>>> Not only lockup but unhandled failed extent range locking totally breaks
>>> assumptions that the following code makes and this would lead to
>>> unpredictable corruptions. Just count how many lock_extent_bits calls
>>> are there. And any caller of __set_extent_bit. There are so many that
>>> the BUG_ON is the measure of last resort to prevent worse problems.
>>>
>>>> I'll try to pre-allocate certain amount of extent_state as the last
>>>> chance of redemption.
>>>
>>> This only lowers the chances to hit the allocation error but there's
>>> always a case when certain amount + 1 would be needed.
>>
>> Lower chance is already good enough (TM) for low possibility (0.001)
>> error injection.
>>
>> And, for real world low memory case, lower chance in btrfs means higher
>> chance in other subsystem, less chance user will blame btrfs. :)
>>
>>>
>>>> Anyway, such BUG_ON() right after kmalloc() is really a blockage for
>>>> error injection test.
>>>
>>> Maybe, but the code is not yet in the state to inject memory allocation
>>> faiulres to that particular path (ie. the state changes).
>>
>> With last-chance reservation, we can make state related memory
>> allocation almost always to success even memory allocation failure
>> injected (if the possibility is low and low concurrency)
>> And the last-chance reservation can be configured at compile/module load
>> time, making it flex enough for most cases.
>
> Forgot to mention, for that method, I'll definitely keep the BUG_ON() on
> @prealloc.
>
> Just make the allocation part fall back to use fs_info->last_chance[] to
> grab a valid memory slot.

This makes no sense in fact, for bcc inject tool, modify the predicts
can filter out the NOFAIL case, so all of my previous last-chance method
just makes no sense anyway.

I'm re-testing using the refined predicates: '(!(gfpflags &
__GFP_NOFAIL)) => submit_one_bio()'

And just now, I hit the same lock_extent_bits() hang with NOFAIL filter.
So this is definitely something wrong in the submit_one_bio() path.

Thanks,
Qu
>
> Thanks,
> Qu
>
>>
>> The main reason I'm doing such error injection test is to ensure write
>> time tree checker is not the cause of the lockup.
>>
>> Of course I can directly inject error into btrfs_check_leaf_full() and
>> btrfs_check_node(), and filter the stack to ensure it only happen in
>> write time, and that's already what I'm crafting, based on the bcc error
>> inject example and kprobe return value overriding.
>>
>> But it will never be a bad idea to explore what can go wrong.
>> And "always BUG_ON()" -> "good enough (TM)" already looks like a
>> improvement to me.
>>
>> Thanks,
>> Qu
>>

  parent reply	other threads:[~2019-04-22  5:44 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-18  7:21 [PATCH] btrfs: extent_io: Handle memory allocation failure in __clear_extent_bit() Qu Wenruo
2019-04-18  7:24 ` Nikolay Borisov
2019-04-18  7:30   ` Qu Wenruo
2019-04-18 11:38     ` David Sterba
2019-04-18 11:51       ` Qu Wenruo
2019-04-18 11:54         ` Qu Wenruo
2019-04-18 12:27           ` Nikolay Borisov
2019-04-18 12:44             ` Qu Wenruo
2019-04-22  5:44           ` Qu Wenruo [this message]
2019-04-18 14:10   ` Josef Bacik
2019-04-18 14:15     ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53b3998b-4bb0-6bd9-3a5f-0292f812ee81@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=dsterba@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nborisov@suse.com \
    --cc=wqu@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).