All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v3 3/3] btrfs: statfs: Use virtual chunk allocation to calculation available data space
Date: Mon, 6 Jan 2020 09:44:31 -0500	[thread overview]
Message-ID: <1c9b3644-4b77-a7c8-c47d-19743cad7f06@toxicpanda.com> (raw)
In-Reply-To: <20200106061343.18772-4-wqu@suse.com>

On 1/6/20 1:13 AM, Qu Wenruo wrote:
> Although btrfs_calc_avail_data_space() is trying to do an estimation
> on how many data chunks it can allocate, the estimation is far from
> perfect:
> 
> - Metadata over-commit is not considered at all
> - Chunk allocation doesn't take RAID5/6 into consideration
> 
> Although current per-profile available space itself is not able to
> handle metadata over-commit itself, the virtual chunk infrastructure can
> be re-used to address above problems.
> 
> This patch will change btrfs_calc_avail_data_space() to do the following
> things:
> - Do metadata virtual chunk allocation first
>    This is to address the over-commit behavior.
>    If current metadata chunks have enough free space, we can completely
>    skip this step.
> 
> - Allocate data virtual chunks as many as possible
>    Just like what we did in per-profile available space estimation.
>    Here we only need to calculate one profile, since statfs() call is
>    a relative cold path.
> 
> Now statfs() should be able to report near perfect estimation on
> available data space, and can handle RAID5/6 better.
> 
> [BENCHMARK]
> For the performance difference, here is the benchmark:
>   Disk layout:
>   - devid 1:	1G
>   - devid 2:	2G
>   - devid 3:	3G
>   - devid 4:	4G
>   - devid 5:	5G
>   metadata:	RAID1
>   data:		RAID5
> 
>   This layout should be the worst case for RAID5, as it can
>   from 5 disks raid5 to 2 disks raid 5 with unusable space.
> 
>   Then use ftrace to trace the execution time of btrfs_statfs() after
>   allocating 1G data chunk. Both have 12 samples.
> 				avg		std
>   Patched:			17.59 us	7.04
>   WIthout patch (v5.5-rc2):	14.98 us	6.16
> 
> When the fs is cold, there is a small performance for this particular
> case, as we need to do several more iterations to calculate correct
> RAID5 data space.
> But it's still pretty good, and won't block chunk allocator for any
> observable time.
> 
> When the fs is hot, the performance bottleneck is the chunk_mutex, where
> the most common and longest holder would be __btrfs_chunk_alloc().
> In that case, we may sleep much longer as __btrfs_chunk_alloc() can
> trigger IO.
> 
> Since the new implementation is not observable slower than the old one,
> and won't cause meaningful delay for chunk allocator, it should be more
> or less OK.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Sorry I should have been more specific.  Taking the chunk_mutex here means all 
the people running statfs at the same time are going to be serialized.  Can we 
just do what overcommit does and take the already calculated free space instead 
of doing the calculation again?  Thanks,

Josef

  reply	other threads:[~2020-01-06 14:44 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-06  6:13 [PATCH v3 0/3] Introduce per-profile available space array to avoid over-confident can_overcommit() Qu Wenruo
2020-01-06  6:13 ` [PATCH v3 1/3] btrfs: Introduce per-profile available space facility Qu Wenruo
2020-01-06 14:32   ` David Sterba
2020-01-07  2:13     ` Qu Wenruo
2020-01-08 15:04       ` David Sterba
2020-01-08 23:53         ` Qu WenRuo
2020-01-09  6:26           ` Qu Wenruo
2020-01-06 23:45   ` kbuild test robot
2020-01-06 23:45     ` kbuild test robot
2020-01-06  6:13 ` [PATCH v3 2/3] btrfs: space-info: Use per-profile available space in can_overcommit() Qu Wenruo
2020-01-06  6:13 ` [PATCH v3 3/3] btrfs: statfs: Use virtual chunk allocation to calculation available data space Qu Wenruo
2020-01-06 14:44   ` Josef Bacik [this message]
2020-01-07  1:03     ` Qu Wenruo
2020-01-06 14:06 ` [PATCH v3 0/3] Introduce per-profile available space array to avoid over-confident can_overcommit() David Sterba
2020-01-07  2:04   ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1c9b3644-4b77-a7c8-c47d-19743cad7f06@toxicpanda.com \
    --to=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.