From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: dsterba@suse.cz, Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: statfs: Don't reset f_bavail if we're over committing metadata space
Date: Fri, 17 Jan 2020 22:22:46 +0800
Message-ID: <85585720-77de-b999-8d17-a17e86e1c181@gmx.com>
In-Reply-To: <20200117141037.GG3929@twin.jikos.cz>

On 2020/1/17 10:10 PM, David Sterba wrote:
> On Fri, Jan 17, 2020 at 09:32:49AM +0800, Qu Wenruo wrote:
>> On 2020/1/17 8:54 AM, Qu Wenruo wrote:
>>> On 2020/1/16 10:29 PM, David Sterba wrote:
>>>> On Wed, Jan 15, 2020 at 11:41:28AM +0800, Qu Wenruo wrote:
>>>>> [BUG]
>>>>> When there is a lot of metadata space reserved, e.g. after balancing a
>>>>> data block with many extents, vanilla df would report 0 available space.
>>>>>
>>>>> [CAUSE]
>>>>> btrfs_statfs() would report 0 available space if its metadata space is
>>>>> exhausted.
>>>>> And the calculation is based on currently reserved space vs on-disk
>>>>> available space, with a small headroom as buffer.
>>>>> When there is not enough headroom, btrfs_statfs() will report 0
>>>>> available space.
>>>>>
>>>>> The problem is, since commit ef1317a1b9a3 ("btrfs: do not allow
>>>>> reservations if we have pending tickets"), we allow btrfs to over commit
>>>>> metadata space, as long as we have enough space to allocate new metadata
>>>>> chunks.
>>>>>
>>>>> This makes the old calculation unreliable, so it falsely reports 0
>>>>> available space.
>>>>>
>>>>> [FIX]
>>>>> Don't do such a naive check anymore in btrfs_statfs().
>>>>> Also remove the comment about "0 available space when metadata is
>>>>> exhausted".
>>>>
>>>> This is intentional and was added to prevent a situation where 'df'
>>>> reports available space but exhausted metadata doesn't allow creating a
>>>> new inode.
>>>
>>> But this behavior itself is not accurate.
>>>
>>> We have the global reservation, which is normally larger than the
>>> hard-coded 4M.
>>>
>>> So that check will never really be triggered.
>>>
>>> Thus invalidating most of your argument.
>>>>
>>>> If it gets removed you are trading one bug for another. With the changed
>>>> logic in the referenced commit, the metadata exhaustion is more likely
>>>> but it's also temporary.
>>
>> Furthermore, the point of the patch is that the current check doesn't
>> play well with metadata over-commit.
> 
> The recent overcommit updates broke statfs in a new way and left us
> almost nothing to make it better.

In fact, it's not impossible to solve.

Exporting can_overcommit() can do pretty well in this particular case.
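
To make the idea concrete, here is a rough user-space sketch, not kernel
code. It models the old "free metadata vs. reservation plus 4M" check and
adds an over-commit test in front of it. Every name in it (struct
meta_space, can_overcommit_model(), report_bavail(), the chunk_size
parameter) is made up for illustration; the real can_overcommit() has a
different signature and more inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model only: none of these names are kernel APIs. */
#define HEADROOM_BYTES (4ULL * 1024 * 1024) /* the "4M" discussed above */

struct meta_space {
    uint64_t total;       /* bytes in allocated metadata chunks */
    uint64_t used;        /* bytes used inside those chunks */
    uint64_t reserved;    /* bytes currently reserved (e.g. global rsv) */
    uint64_t unallocated; /* device bytes not yet assigned to any chunk */
};

/* Old-style check: free metadata does not cover reservation + headroom. */
static bool meta_exhausted_naive(const struct meta_space *m)
{
    assert(m->used <= m->total);
    uint64_t free_bytes = m->total - m->used;
    return free_bytes < m->reserved + HEADROOM_BYTES;
}

/* Assumed stand-in for over-commit: room for one more metadata chunk. */
static bool can_overcommit_model(const struct meta_space *m,
                                 uint64_t chunk_size)
{
    return m->unallocated >= chunk_size;
}

/* Report 0 only when metadata is short AND no new chunk can be allocated. */
static uint64_t report_bavail(const struct meta_space *m,
                              uint64_t data_avail, uint64_t chunk_size)
{
    if (meta_exhausted_naive(m) && !can_overcommit_model(m, chunk_size))
        return 0;
    return data_avail;
}
```

With plenty of unallocated space, report_bavail() passes data_avail through
even when the naive check trips; it only returns 0 once no further metadata
chunk could be allocated.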

> 
>> If it's before v5.4, I won't touch the check considering it will never
>> hit anyway.
>>
>> But now for v5.4, either:
>> - We over-commit metadata
>>   Meaning we have unallocated space, nothing to worry
> 
> Can we estimate how much unallocated data are there? I don't know how,
> and "nothing to worry" always worries me.

Data never over-commits. We always ensure that enough data chunks are
allocated before we allocate data extents.

> 
>> - No more space for over-commit
>>   But in that case, we still have global rsv to update essential trees.
>>   Please note that, btrfs should never fall into a status where no files
>>   can be deleted.
> 
> Of course, the global reserve is there for last resort actions and will
> be used for deletion and updating essential trees. What statfs says is
> how much data is there left for the user. New files, writing more data
> etc.
> 
>> Considering all this, we're no longer able to really hit that case.
>>
>> So that's why I'm proposing deleting that. I see no reason why that
>> magic number 4M would still work nowadays.
> 
> So, the corner case that resulted in the guesswork needs to be
> reevaluated then, the space reservations and related updates clearly
> affect that. That's out of 5.5-rc timeframe though.

Although we can still solve the problem using only the facilities in v5.5,
I'm still not happy with the idea that "one exhausted resource results in
a different resource being reported as exhausted".

I still believe that we should report different things independently.
(Which obviously makes our lives easier in the statfs case.)

That's also why we require reporters to include 'btrfs fi df' output
rather than just vanilla 'df': btrfs has different internals.

Or can we reuse the f_files/f_ffree fields to report metadata space,
and forget all this mess?
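
As a purely illustrative sketch of that last idea (nothing here is an
existing btrfs interface): metadata bytes could be converted into
inode-style counts by dividing by an assumed per-item size. The struct,
the function name and the bytes_per_item parameter are all hypothetical;
picking a sensible divisor is exactly the open question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair mirroring the statfs f_files / f_ffree fields. */
struct inode_counts {
    uint64_t f_files; /* total "slots" the metadata space could hold */
    uint64_t f_ffree; /* "slots" still free */
};

/*
 * Map metadata space (in bytes) to inode-style counts.
 * bytes_per_item is an assumed cost per item, not a btrfs constant.
 */
static struct inode_counts meta_to_inode_counts(uint64_t meta_total,
                                                uint64_t meta_used,
                                                uint64_t bytes_per_item)
{
    assert(bytes_per_item > 0);
    assert(meta_used <= meta_total);

    struct inode_counts c;
    c.f_files = meta_total / bytes_per_item;
    c.f_ffree = (meta_total - meta_used) / bytes_per_item;
    return c;
}
```

With something like this, metadata exhaustion would show up in 'df -i'
style output, while f_bavail stays a pure data-space number.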

Thanks,
Qu