From: Su Yue <suy.fnst@cn.fujitsu.com>
To: <linux-btrfs@vger.kernel.org>
Cc: <suy.fnst@cn.fujitsu.com>
Subject: [RFC PATCH 00/17] btrfs: implementation of priority aware allocator
Date: Wed, 28 Nov 2018 11:11:31 +0800
Message-ID: <20181128031148.357-1-suy.fnst@cn.fujitsu.com>
This patchset can be fetched from the repo:
https://github.com/Damenly/btrfs-devel/commits/priority_aware_allocator.
The patchset 'btrfs: Refactor find_free_extent()' does nice work
simplifying find_free_extent(), so this patchset depends on that refactor.
The base is the commit in kdave/misc-next:
commit fcaaa1dfa81f2f87ad88cbe0ab86a07f9f76073c (kdave/misc-next)
Author: Nikolay Borisov <nborisov@suse.com>
Date: Tue Nov 6 16:40:20 2018 +0200
btrfs: Always try all copies when reading extent buffers
This patchset introduces a new mount option named 'priority_alloc=%s',
where %s can currently be "usage" or "off". The mount option changes
how find_free_extent() searches block groups.
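As a rough illustration of the two accepted values, here is a minimal
userspace sketch. It is not the code from patch 17. The enum and the
function parse_priority_alloc() are hypothetical names chosen for this
example only.

```c
#include <string.h>

/* Hypothetical modes for illustration; not the names used in the patches. */
enum priority_alloc_mode {
	PRIORITY_ALLOC_OFF,
	PRIORITY_ALLOC_USAGE,
	PRIORITY_ALLOC_INVALID,
};

/* Map the string that follows "priority_alloc=" to a mode. */
static enum priority_alloc_mode parse_priority_alloc(const char *value)
{
	if (value == NULL)
		return PRIORITY_ALLOC_INVALID;
	if (strcmp(value, "usage") == 0)
		return PRIORITY_ALLOC_USAGE;
	if (strcmp(value, "off") == 0)
		return PRIORITY_ALLOC_OFF;
	/* Any other string is rejected. */
	return PRIORITY_ALLOC_INVALID;
}
```

Rejecting unknown strings instead of silently falling back to "off" is
an assumption of this sketch.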
Previously, block groups are stored in a list in btrfs_space_info,
ordered by start position. When find_free_extent() is called without
a hint, block groups are searched one by one.
Design of priority aware allocator:
Each block group has its own priority. Priorities are split into levels,
and block groups are placed in different trees according to their priorities.
Those trees are sorted by level and stored in space_info.
When find_free_extent() is called, it searches block groups in higher
priority levels first, then in lower levels. A block group with higher
priority is therefore more likely to be used.
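The search order can be sketched in plain C as below. This is a
simplified userspace model, not the kernel code. It uses fixed arrays
where the patches use trees, and every name in it (NR_LEVELS,
struct priority_level, find_block_group(), and so on) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_LEVELS	4	/* hypothetical number of priority levels */
#define MAX_PER_LEVEL	8	/* hypothetical capacity of one level */

struct block_group {
	uint64_t start;
	uint64_t free_bytes;
};

/* Stand-in for one priority tree: the block groups of a single level. */
struct priority_level {
	struct block_group *groups[MAX_PER_LEVEL];
	size_t nr;
};

struct space_info {
	/* levels[0] holds the highest priority block groups. */
	struct priority_level levels[NR_LEVELS];
};

/*
 * Walk the levels from highest to lowest priority and return the first
 * block group with enough free space, or NULL if none fits.
 */
static struct block_group *find_block_group(struct space_info *sinfo,
					    uint64_t num_bytes)
{
	for (size_t lvl = 0; lvl < NR_LEVELS; lvl++) {
		struct priority_level *pl = &sinfo->levels[lvl];

		for (size_t i = 0; i < pl->nr; i++) {
			if (pl->groups[i]->free_bytes >= num_bytes)
				return pl->groups[i];
		}
	}
	return NULL;
}
```

A lower level is only consulted when no block group in any higher
level can satisfy the request, which is what makes high priority block
groups more likely to be used.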
Pros:
1) Reduce the frequency of balance.
Block groups with a higher usage rate are preferred when allocating
extents. Empty block groups with non-zero pinned bytes can then be
freed.[1]
2) The priority of an empty block group with non-zero pinned bytes
is set to the lowest.
3) Support zoned block devices.[2]
For metadata allocation, block groups in conventional zones
will be used as much as possible regardless of usage rate.
This will be done in the future.
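One way to picture points 1) and 2) is a mapping from usage to a
priority level. The sketch below is only an assumption about how such
a mapping could look; it is not compute_block_group_priority() from
patch 03. NR_USAGE_LEVELS, usage_percent() and usage_to_level() are
hypothetical names, and level 0 is taken to be the highest priority.

```c
#include <stdint.h>

#define NR_USAGE_LEVELS	10			/* hypothetical level count */
#define LOWEST_LEVEL	(NR_USAGE_LEVELS - 1)	/* reserved, see below */

/* Usage of a block group as an integer percentage (0..100). */
static unsigned int usage_percent(uint64_t used, uint64_t length)
{
	if (length == 0)
		return 0;
	/* Fine for block-group sized values; would overflow near 2^57. */
	return (unsigned int)(used * 100 / length);
}

/*
 * Map a block group to a level: 0 is searched first. Higher usage gives
 * a smaller level number. An empty block group that still has pinned
 * bytes always gets the lowest level, so it is left alone.
 */
static unsigned int usage_to_level(uint64_t used, uint64_t pinned,
				   uint64_t length)
{
	unsigned int pct;

	if (used == 0 && pinned > 0)
		return LOWEST_LEVEL;

	pct = usage_percent(used, length);
	/* Spread 0..100% over levels (NR_USAGE_LEVELS - 2)..0. */
	return (100 - pct) * (NR_USAGE_LEVELS - 2) / 100;
}
```

Reserving the last level for the "empty with pinned bytes" case keeps
those block groups below even a completely empty block group without
pinned bytes. That separation is a design choice of this sketch.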
Cons:
1) Expected performance regression.
The extent of the regression is currently unknown.
Users can disable block group priority to get full performance.
TESTS:
If usage is used as the priority (the only available option), an empty
block group is much less likely to be reused.
About block group usage:
Disk: 4 x 1T HDD gathered in LVM.
Run a script that creates and deletes files randomly in a loop.
The number of files created is double the number deleted.
Default mount option result:
https://i.loli.net/2018/11/28/5bfdfdf08c760.png
Priority aware allocator(usage) result:
https://i.loli.net/2018/11/28/5bfdfdf0c1b11.png
The X axis is total disk usage, the Y axis is average block
group usage.
Due to extent fragmentation, the difference is not obvious:
only about 1% improvement.
Performance regression:
I have run sysbench on our machine with SSD in multiple combinations
and found no obvious regression.
However, in theory, the new allocator may cost more time in some
cases.
[1] https://www.spinics.net/lists/linux-btrfs/msg79508.html
[2] https://lkml.org/lkml/2018/8/16/174
---
For reasons including time and hardware, the use case is not
convincing enough. Some of the code is dirty, but I couldn't find
another way. So I marked it as RFC.
Any comments and suggestions are welcome.
Su Yue (17):
btrfs: priority alloc: prepare of priority aware allocator
btrfs: add mount definition BTRFS_MOUNT_PRIORITY_USAGE
btrfs: priority alloc: introduce compute_block_group_priority/usage
btrfs: priority alloc: add functions to create/remove priority trees
btrfs: priority alloc: introduce functions to add block group to
priority tree
btrfs: priority alloc: introduce three macros to mark block group
status
btrfs: priority alloc: add functions to remove block group from
priority tree
btrfs: priority alloc: add btrfs_update_block_group_priority()
btrfs: priority alloc: call create/remove_priority_trees in space_info
btrfs: priority alloc: call add_block_group_priority while reading or
making block group
btrfs: priority alloc: remove block group from priority tree while
removing block group
btrfs: priority alloc: introduce find_free_extent_search()
btrfs: priority alloc: modify find_free_extent() to fit priority
allocator
btrfs: priority alloc: introduce btrfs_set_bg_updating and call
btrfs_update_block_group_prioriy
btrfs: priority alloc: write bg->priority_groups_sem while waiting
reservation
btrfs: priority alloc: write bg->priority_tree->groups_sem to avoid
race in btrfs_delete_unused_bgs()
btrfs: add mount option "priority_alloc=%s"
fs/btrfs/ctree.h | 28 ++
fs/btrfs/extent-tree.c | 672 +++++++++++++++++++++++++++++++++---
fs/btrfs/free-space-cache.c | 3 +
fs/btrfs/super.c | 18 +
fs/btrfs/transaction.c | 1 +
5 files changed, 681 insertions(+), 41 deletions(-)
--
2.19.1