Linux-BTRFS Archive on lore.kernel.org
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: fdmanana@gmail.com
Cc: Qu Wenruo <wqu@suse.com>, linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v2] btrfs: transaction: Commit transaction more frequently for BPF
Date: Fri, 16 Aug 2019 18:06:16 +0800
Message-ID: <7ab52335-6a1c-46d9-5891-27fcd86db0e1@gmx.com> (raw)
In-Reply-To: <CAL3q7H4jM5ydhOazozLQR5kQnAi84WhPHu7uFm+k8zFy31-agQ@mail.gmail.com>


On 2019/8/16 6:03 PM, Filipe Manana wrote:
> On Fri, Aug 16, 2019 at 10:53 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2019/8/16 5:33 PM, Filipe Manana wrote:
>>> On Thu, Aug 15, 2019 at 9:36 AM Qu Wenruo <wqu@suse.com> wrote:
>>>>
>>>> Btrfs has btrfs_end_transaction_throttle() which could try to commit
>>>> transaction when needed.
>>>>
>>>> However under most cases btrfs_end_transaction_throttle() won't really
>>>> commit transaction, due to the hard timing requirement.
>>>>
>>>> Now introduce a new error injection point, btrfs_need_trans_pressure(),
>>>> to allow btrfs_should_end_transaction() to return 1 and
>>>> btrfs_end_transaction_throttle() to fallback to
>>>> btrfs_commit_transaction().
>>>>
>>>> With such more aggressive transaction commits, we can dig deeper into
>>>> cases like snapshot drop.
>>>> Now each reference drop of btrfs_drop_snapshot() will lead to a
>>>> transaction commit, allowing dm-log-writes to catch more details, rather
>>>> than one big transaction dropping everything.
>>>>
>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>> ---
>>>> Changelog:
>>>> v2:
>>>> - Add comment to explain why this function is needed
>>>> ---
>>>>  fs/btrfs/transaction.c | 18 ++++++++++++++++++
>>>>  1 file changed, 18 insertions(+)
>>>>
>>>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>>>> index 3f6811cdf803..8c5471b01d03 100644
>>>> --- a/fs/btrfs/transaction.c
>>>> +++ b/fs/btrfs/transaction.c
>>>> @@ -10,6 +10,7 @@
>>>>  #include <linux/pagemap.h>
>>>>  #include <linux/blkdev.h>
>>>>  #include <linux/uuid.h>
>>>> +#include <linux/error-injection.h>
>>>>  #include "ctree.h"
>>>>  #include "disk-io.h"
>>>>  #include "transaction.h"
>>>> @@ -749,10 +750,25 @@ void btrfs_throttle(struct btrfs_fs_info *fs_info)
>>>>         wait_current_trans(fs_info);
>>>>  }
>>>>
>>>> +/*
>>>> + * This function is to allow BPF to override the return value so that we can
>>>> + * make btrfs to commit transaction more aggressively.
>>>> + *
>>>> + * It's a debug only feature, mainly used with dm-log-writes to catch more details
>>>> + * of transient operations like balance and subvolume drop.
>>>
>>> Transient? I think you mean long running operations that can span
>>> multiple transactions.
>>
>> Nope, really transient details.
>>
>> E.g. catching the subvolume drop at each drop_progress update, while in
>> most cases one transaction can contain multiple drop_progress updates,
>> or even the whole tree gets dropped in one transaction.
>>
>>>
>>>> + */
>>>> +static noinline bool btrfs_need_trans_pressure(struct btrfs_trans_handle *trans)
>>>> +{
>>>> +       return false;
>>>> +}
>>>> +ALLOW_ERROR_INJECTION(btrfs_need_trans_pressure, TRUE);
>>>
>>> So, I'm not sure if it's really a good idea to have such specific
>>> things like this.
>>> Has this proven useful already? I.e., have you already found any bug using this?
>>
>> Not exactly.
>> I have observed a case where btrfs check gives false alert on missing a
>> backref of a dropped tree.
> 
> Wasn't this fixed by Josef in
> https://github.com/kdave/btrfs-progs/commit/42a1aaeec47bc34ae4a923e3e8b2e55b59c01711
> ?

Oh...

> That's normal, the first phase of dropping a tree removes the
> references in the extent tree, and then only in the second phase we
> drop the leafs/nodes that pointed to the extent.

Yes, that's just a false alert from btrfs check, so it has nothing
to do with the kernel.

> 
>>
>> I originally planned to use this feature to catch the exact update, but
>> the problem is that, with this pressure, we need an extra ioctl to wait
>> for the full subvolume drop to finish.
> 
> That, the ioctl to wait (or better, poll) for subvolume removal to
> complete (either all subvolumes or just a specific one), would be
> useful.

OK, I will work on that feature.

Thanks,
Qu

> 
> Thanks.
> 
>>
>> Otherwise the fs gets unmounted before we really reach the DROP_REFERENCE
>> phase, so dm-log-writes catches nothing interesting.
>>
>> Thanks,
>> Qu
>>
>>>
>>> I often add such similar things myself for testing and debugging, but
>>> because they are so specific, or ugly or verbose, I keep them to
>>> myself.
>>>
>>> Allowing the return value of should_end_transaction() to be
>>> overridden, using the same approach, would be more generic for
>>> example.
>>>
>>> Thanks.
>>>
>>>> +
>>>>  static int should_end_transaction(struct btrfs_trans_handle *trans)
>>>>  {
>>>>         struct btrfs_fs_info *fs_info = trans->fs_info;
>>>>
>>>> +       if (btrfs_need_trans_pressure(trans))
>>>> +               return 1;
>>>>         if (btrfs_check_space_for_delayed_refs(fs_info))
>>>>                 return 1;
>>>>
>>>> @@ -813,6 +829,8 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
>>>>
>>>>         btrfs_trans_release_chunk_metadata(trans);
>>>>
>>>> +       if (throttle && btrfs_need_trans_pressure(trans))
>>>> +               return btrfs_commit_transaction(trans);
>>>>         if (lock && READ_ONCE(cur_trans->state) == TRANS_STATE_BLOCKED) {
>>>>                 if (throttle)
>>>>                         return btrfs_commit_transaction(trans);
>>>> --
>>>> 2.22.0
>>>>
>>>
>>>
>>
> 
> 
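
The control flow discussed in this thread can be modelled in plain userspace C. The following is only a toy sketch, not kernel code: the pressure_override flag stands in for a BPF override of btrfs_need_trans_pressure(), and struct toy_fs, end_transaction_throttle() and toy_drop_snapshot() are hypothetical names invented for illustration. It assumes that, without the override, all pending work ends up in a single commit, which simplifies what the kernel really does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a BPF override of the injection point. */
static bool pressure_override = false;

/* Models btrfs_need_trans_pressure(): false unless overridden. */
static bool need_trans_pressure(void)
{
	return pressure_override;
}

/* Toy state: operations not yet committed, and commits done so far. */
struct toy_fs {
	int pending_ops;
	int commits;
};

static void commit_transaction(struct toy_fs *fs)
{
	fs->pending_ops = 0;
	fs->commits++;
}

/* Models the throttle path: commit right away when pressure is reported. */
static void end_transaction_throttle(struct toy_fs *fs)
{
	fs->pending_ops++;
	if (need_trans_pressure())
		commit_transaction(fs);
}

/*
 * Drop nr_refs references one by one, then commit whatever is still
 * pending. Returns the number of commits that happened.
 */
int toy_drop_snapshot(int nr_refs, bool override)
{
	struct toy_fs fs = { 0, 0 };

	assert(nr_refs >= 0);
	pressure_override = override;
	for (int i = 0; i < nr_refs; i++)
		end_transaction_throttle(&fs);
	if (fs.pending_ops > 0)
		commit_transaction(&fs);
	return fs.commits;
}
```

In this model, dropping five references without the override yields one commit, while the same drop with the override yields five commits. That is the finer granularity a tool like dm-log-writes would be able to observe.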



Thread overview: 8+ messages
2019-08-15  8:04 Qu Wenruo
2019-08-16  9:33 ` Filipe Manana
2019-08-16  9:53   ` Qu Wenruo
2019-08-16 10:03     ` Filipe Manana
2019-08-16 10:06       ` Qu Wenruo [this message]
2019-08-19 16:57       ` David Sterba
2019-08-20  0:34         ` Qu Wenruo
2019-08-19  5:14   ` Qu Wenruo
