All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: Nikolay Borisov <nborisov@suse.com>,
	linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 2/9] btrfs: improve preemptive background space flushing
Date: Thu, 1 Oct 2020 17:35:58 -0400	[thread overview]
Message-ID: <01e8925f-5b4b-74aa-f3e6-0faae52a4d3b@toxicpanda.com> (raw)
In-Reply-To: <efe49176-1eba-df6a-ffdf-47031c5acf36@suse.com>

On 10/1/20 9:19 AM, Nikolay Borisov wrote:
> 
> 
> On 30.09.20 г. 23:01 ч., Josef Bacik wrote:
> <snip>
> 
>> When I introduced the ticketed ENOSPC stuff this broke slightly in the
>> fact that we were using tickets to indicate if we were done flushing.
>> No tickets, no more flushing.  However this meant that we essentially
>> never preemptively flushed.  This caused a write performance regression
>> that Nikolay noticed in an unrelated patch that removed the committing
>> of the transaction during btrfs_end_transaction.
> 
> I see, so basically the patch which I biseceted this to was really
> papering over the initial bug since the logic in end_transaction, sort
> of, simulated pre-emptive flushing... how subtle!
> 
> <snip>
> 
>> +	spin_lock(&space_info->lock);
>> +	used = btrfs_space_info_used(space_info, true);
>> +	while (need_do_async_reclaim(fs_info, space_info, used)) {
>> +		enum btrfs_reserve_flush_enum flush;
>> +		u64 delalloc_size = 0;
>> +		u64 to_reclaim, block_rsv_size;
>> +		u64 global_rsv_size = global_rsv->reserved;
>> +
>> +		/*
>> +		 * If we're just full of pinned, commit the transaction.  We
>> +		 * don't call flush_space(COMMIT_TRANS) here because that has
>> +		 * logic to decide whether we need to commit the transaction to
>> +		 * satisfy the ticket to keep us from live locking the box by
>> +		 * committing over and over again.  Here we don't care about
>> +		 * that, we know we are using a lot of space and most of it is
>> +		 * pinned, just commit.
> 
> nit: That comment is a mouthful, I think what you are describing here is
> really this line in may_commit_transaction:
> 
> if (!bytes_needed) return 0;
> 
> Which triggers if we don't have a ticket, if so there simply say :
> 
> "We can't call flush_commit because it will flush iff there is a pending
> ticket".
> 

Yup I'll reword.

> <snip>
> 
>> +		/*
>> +		 * We don't have a precise counter for delalloc, so we'll
>> +		 * approximate it by subtracting out the block rsv's space from
>> +		 * the bytes_may_use.  If that amount is higher than the
>> +		 * individual reserves, then we can assume it's tied up in
>> +		 * delalloc reservations.
>> +		 */
>> +		block_rsv_size = global_rsv_size +
>> +			delayed_block_rsv->reserved +
>> +			delayed_refs_rsv->reserved +
>> +			trans_rsv->reserved;
>> +		if (block_rsv_size < space_info->bytes_may_use)
>> +			delalloc_size = space_info->bytes_may_use -
>> +				block_rsv_size;
> 
> What about  :
> 
> percpu_counter_add_batch(&fs_info->delalloc_bytes, len,
>                        fs_info->delalloc_batch);
> 

Because that's the total data bytes, not how much metadata is reserved for the 
data bytes, which can be very grossly different things.  For example, one 
contiguous MAX_EXTENT_SIZE data extent is only like 1mib of metadata 
reservations, but if we used delalloc_bytes here we'd think the delalloc size 
was dominating the metadata reservations.  Thanks,

Josef

  reply	other threads:[~2020-10-01 21:36 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-30 20:01 [PATCH 0/9] Improve preemptive ENOSPC flushing Josef Bacik
2020-09-30 20:01 ` [PATCH 1/9] btrfs: add a trace point for reserve tickets Josef Bacik
2020-10-01  5:54   ` Nikolay Borisov
2020-10-01 21:33     ` Josef Bacik
2020-09-30 20:01 ` [PATCH 2/9] btrfs: improve preemptive background space flushing Josef Bacik
2020-10-01 13:19   ` Nikolay Borisov
2020-10-01 21:35     ` Josef Bacik [this message]
2020-09-30 20:01 ` [PATCH 3/9] btrfs: rename need_do_async_reclaim Josef Bacik
2020-10-01 13:20   ` Nikolay Borisov
2020-10-01 13:24   ` Nikolay Borisov
2020-10-01 21:37     ` Josef Bacik
2020-09-30 20:01 ` [PATCH 4/9] btrfs: check reclaim_size in need_preemptive_reclaim Josef Bacik
2020-10-01 13:23   ` Nikolay Borisov
2020-10-01 21:36     ` Josef Bacik
2020-09-30 20:01 ` [PATCH 5/9] btrfs: rework btrfs_calc_reclaim_metadata_size Josef Bacik
2020-10-01 13:59   ` Nikolay Borisov
2020-10-01 21:38     ` Josef Bacik
2020-09-30 20:01 ` [PATCH 6/9] btrfs: simplify the logic in need_preemptive_flushing Josef Bacik
2020-10-01 14:09   ` Nikolay Borisov
2020-10-01 21:40     ` Josef Bacik
2020-10-02  7:13   ` Nikolay Borisov
2020-09-30 20:01 ` [PATCH 7/9] btrfs: implement space clamping for preemptive flushing Josef Bacik
2020-10-01 14:49   ` Nikolay Borisov
2020-10-01 21:41     ` Josef Bacik
2020-09-30 20:01 ` [PATCH 8/9] btrfs: adjust the flush trace point to include the source Josef Bacik
2020-10-01 15:32   ` Nikolay Borisov
2020-09-30 20:01 ` [PATCH 9/9] btrfs: add a trace class for dumping the current ENOSPC state Josef Bacik
2020-10-02  8:30   ` Nikolay Borisov
2020-10-02 13:45     ` Josef Bacik
2020-10-06 12:55 ` [PATCH 0/9] Improve preemptive ENOSPC flushing Nikolay Borisov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=01e8925f-5b4b-74aa-f3e6-0faae52a4d3b@toxicpanda.com \
    --to=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nborisov@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.