From: Naohiro Aota <naohiro.aota@wdc.com>
To: Josef Bacik <josef@toxicpanda.com>, David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org, Naohiro Aota <naohiro.aota@wdc.com>
Subject: [PATCH 00/17] ZNS Support for Btrfs
Date: Wed, 11 Aug 2021 23:16:24 +0900
Message-ID: <cover.1628690222.git.naohiro.aota@wdc.com>
This series extends zoned support for Zoned Namespace (ZNS) SSDs [1].
[1] https://zonedstorage.io/introduction/zns/
This series is available on GitHub at
v1 https://github.com/naota/linux/tree/btrfs-zns-v1
HEAD https://github.com/naota/linux/tree/btrfs-zns
The ZNS specification introduces extra functionalities listed below.
- No conventional zones
- Zone Append write command
- Zone Capacity
- Active Zones
The first two functionalities are already addressed in the current
zoned support on btrfs. We do not rely on conventional zones, and we
use the zone append write command to write data IOs.
This series implements support for the remaining two: zone capacity
and active zones.
The userland tools need some tweaks (e.g. using the capacity instead of
the length) to be precise, but they still work fine as they are.
* Zone Capacity Support
A zone capacity is an additional per-zone attribute that indicates the
number of usable logical blocks within each zone, starting from the
first logical block of each zone. It is always smaller than or equal to
the zone size.
We can naturally map the capacity to the newly introduced
"zone_capacity" of a block group. Allocations are bounded by the zone
capacity rather than by the block group's length.
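To illustrate the idea, here is a minimal userspace sketch, not the code
from the patches. struct bg_sketch and the bg_*() helpers are hypothetical
names made up for this example; only "zone_capacity" comes from the
series. It assumes a block group backed by a single zone and shows the
sequential allocation pointer being bounded by the capacity, with the
tail between capacity and length never usable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a zoned block group. */
struct bg_sketch {
	uint64_t length;        /* zone size in bytes */
	uint64_t zone_capacity; /* usable bytes from the zone start, <= length */
	uint64_t alloc_offset;  /* next sequential write position */
};

/* Bytes still allocatable: bounded by zone_capacity, not by length. */
static uint64_t bg_free_bytes(const struct bg_sketch *bg)
{
	assert(bg->zone_capacity <= bg->length);
	assert(bg->alloc_offset <= bg->zone_capacity);
	return bg->zone_capacity - bg->alloc_offset;
}

/* Bytes between the capacity and the zone size can never be written. */
static uint64_t bg_unusable_tail(const struct bg_sketch *bg)
{
	return bg->length - bg->zone_capacity;
}

/* Sequential allocation; fails if it would cross the zone capacity. */
static bool bg_alloc(struct bg_sketch *bg, uint64_t num_bytes,
		     uint64_t *offset)
{
	if (num_bytes > bg_free_bytes(bg))
		return false;
	*offset = bg->alloc_offset;
	bg->alloc_offset += num_bytes;
	return true;
}
```

With length = 256 and zone_capacity = 192, for instance, the last 64
bytes are reported by bg_unusable_tail() and bg_alloc() refuses any
request that would write past offset 192.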
* Active Zones Tracking
The ZNS specification defines a limit on the number of zones that can
be in the implicit open, explicit open or closed conditions. Any zone
in one of these conditions is defined as an active zone and corresponds
to a zone that is being written or that has been only partially
written. If the maximum number of active zones is reached, we must
either reset or finish some active zones before being able to choose
other zones for storing data.
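The definition above can be written down as a small predicate. This is
only a sketch: enum zone_cond_sketch and the helper names are
hypothetical and are defined locally so the example is self-contained;
they are not the kernel's zone condition constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical zone conditions, named after the ones in the text. */
enum zone_cond_sketch {
	ZC_EMPTY,
	ZC_IMP_OPEN,	/* implicit open */
	ZC_EXP_OPEN,	/* explicit open */
	ZC_CLOSED,
	ZC_FULL,
};

/* A zone is active if it is implicit open, explicit open or closed. */
static bool zone_is_active(enum zone_cond_sketch cond)
{
	return cond == ZC_IMP_OPEN || cond == ZC_EXP_OPEN ||
	       cond == ZC_CLOSED;
}

/* Count the active zones in an array of zone conditions. */
static unsigned int count_active_zones(const enum zone_cond_sketch *conds,
				       size_t nr_zones)
{
	unsigned int nr_active = 0;

	assert(conds != NULL || nr_zones == 0);
	for (size_t i = 0; i < nr_zones; i++)
		if (zone_is_active(conds[i]))
			nr_active++;
	return nr_active;
}

/* Another zone may become active only while we are below the limit. */
static bool can_activate_zone(const enum zone_cond_sketch *conds,
			      size_t nr_zones, unsigned int max_active)
{
	return count_active_zones(conds, nr_zones) < max_active;
}
```

Empty and full zones do not count against the limit, which is why
finishing or resetting an active zone frees a slot.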
In order not to exceed the maximum number of active zones, we need to
track which zones are active and how the active zones are related to
the block groups. We mark a block group as "active" if the
corresponding device zones are all active. Allocating an extent will
activate a block group, and allocation from an inactive block group is
prohibited. Such active block groups are tracked in a list. Once a
block group is fully written, we deactivate it and remove it from the
list.
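The life cycle described above can be sketched as follows. This is a
simplified model under stated assumptions, not the implementation in the
series: it assumes one device and one zone per block group, uses a fixed
array in place of the list, and all names (struct abg, struct
active_tracker, tracker_*()) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_TRACKED 8	/* hypothetical fixed bound for this sketch */

/* Hypothetical block group: one zone, sequentially written. */
struct abg {
	uint64_t zone_capacity;
	uint64_t alloc_offset;
	bool active;
};

/* Hypothetical tracker standing in for the active block group list. */
struct active_tracker {
	unsigned int max_active_zones;	/* device limit */
	unsigned int nr_active;
	struct abg *active[MAX_TRACKED];
};

/* Activate a block group if the active zone limit allows it. */
static bool tracker_activate(struct active_tracker *t, struct abg *bg)
{
	if (bg->active)
		return true;
	if (t->nr_active >= t->max_active_zones ||
	    t->nr_active >= MAX_TRACKED)
		return false;
	t->active[t->nr_active++] = bg;
	bg->active = true;
	return true;
}

/* Remove a block group from the active set. */
static void tracker_deactivate(struct active_tracker *t, struct abg *bg)
{
	for (unsigned int i = 0; i < t->nr_active; i++) {
		if (t->active[i] == bg) {
			t->active[i] = t->active[--t->nr_active];
			bg->active = false;
			return;
		}
	}
}

/*
 * Allocation activates the block group first and is refused if the
 * activation fails. A fully written block group is deactivated.
 */
static bool tracker_alloc(struct active_tracker *t, struct abg *bg,
			  uint64_t num_bytes)
{
	assert(bg->alloc_offset <= bg->zone_capacity);
	if (!tracker_activate(t, bg))
		return false;
	if (num_bytes > bg->zone_capacity - bg->alloc_offset)
		return false;
	bg->alloc_offset += num_bytes;
	if (bg->alloc_offset == bg->zone_capacity)
		tracker_deactivate(t, bg);
	return true;
}
```

With max_active_zones = 1, a second block group cannot be allocated from
until the first one has been written up to its capacity and dropped from
the active set.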
* Active Zone Aware Sequential Allocator
Handling the active zones will make the allocator complex. Here is a
summary of how find_free_extent_update_loop() behaves.
1. If enough space is available in an active block group
   - allocate from it (end, success)
2. If we can activate another zone on a device
   2.1 Try to allocate a new block group and activate it
   2.2 If the activation succeeds
       - allocation will be satisfied from it in the next iteration
   2.3 If the activation failed
       - Try the next cycle. Some writes may free up an active block group
3. If we cannot activate any zones
   3.1 Try to allocate in a small size by checking min_alloc_size
       - btrfs_reserve_extent() will halve the allocation size and
         restart the loop
   3.2 Nothing can be done anymore. Give up. ENOSPC
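The decision steps above can be condensed into a small state-to-action
function. This is a sketch of the summary only, not the body of
find_free_extent_update_loop(): enum ffe_action, struct ffe_state and
both helpers are hypothetical, and the halving in ffe_shrink() is a
simplified assumption about what the caller does in step 3.1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes, one per step in the summary. */
enum ffe_action {
	FFE_ALLOCATE,	/* 1:   allocate from an active block group */
	FFE_NEW_BG,	/* 2.2: use the newly activated block group */
	FFE_RETRY,	/* 2.3: activation failed, try the next cycle */
	FFE_SHRINK,	/* 3.1: retry with a smaller allocation size */
	FFE_ENOSPC,	/* 3.2: give up */
};

/* Hypothetical snapshot of what the allocator knows in one iteration. */
struct ffe_state {
	bool active_bg_has_space;
	bool can_activate_zone;
	bool activation_succeeded;
	uint64_t num_bytes;		/* requested size */
	uint64_t min_alloc_size;	/* smallest acceptable size */
};

/* Pick the next action, checking the steps in the order listed. */
static enum ffe_action ffe_next_action(const struct ffe_state *s)
{
	if (s->active_bg_has_space)
		return FFE_ALLOCATE;
	if (s->can_activate_zone)
		return s->activation_succeeded ? FFE_NEW_BG : FFE_RETRY;
	if (s->num_bytes > s->min_alloc_size)
		return FFE_SHRINK;
	return FFE_ENOSPC;
}

/* Step 3.1: halve the request, but never go below min_alloc_size. */
static uint64_t ffe_shrink(uint64_t num_bytes, uint64_t min_alloc_size)
{
	uint64_t half = num_bytes / 2;

	assert(num_bytes >= min_alloc_size);
	return half < min_alloc_size ? min_alloc_size : half;
}
```

Once the request has shrunk to min_alloc_size and no zone can be
activated, ffe_next_action() returns FFE_ENOSPC, matching step 3.2.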
* Patch series organization
Note: patches 2 and 14 are preparation patches and can be merged
independently.
Patches 1-6 implement zone capacity support.
Patch 7 implements finishing a superblock zone once there is no space
left for a new superblock.
Patches 8-13 implement the activation side of the active zone
tracking.
Patches 14 and 15 tweak the allocator to retry with a smaller size if
possible (step 3.1 in the above list).
Patches 16 and 17 implement the deactivation side of the active zone
tracking.
Naohiro Aota (17):
btrfs: zoned: load zone capacity information from devices
btrfs: zoned: move btrfs_free_excluded_extents out from
btrfs_calc_zone_unusable
btrfs: zoned: calculate free space from zone capacity
btrfs: zoned: tweak reclaim threshold for zone capacity
btrfs: zoned: consider zone as full when no more SB can be written
btrfs: zoned: locate superblock position using zone capacity
btrfs: zoned: finish superblock zone once no space left for new SB
btrfs: zoned: load active zone information from devices
btrfs: zoned: introduce physical_map to btrfs_block_group
btrfs: zoned: implement active zone tracking
btrfs: zoned: load active zone info for block group
btrfs: zoned: activate block group on allocation
btrfs: zoned: activate new block group
btrfs: move ffe_ctl one level up
btrfs: zoned: avoid chunk allocation if active block group has enough
space
btrfs: zoned: finish fully written block group
btrfs: zoned: finish relocating block group
fs/btrfs/block-group.c | 29 ++-
fs/btrfs/block-group.h | 4 +
fs/btrfs/ctree.h | 3 +
fs/btrfs/disk-io.c | 6 +-
fs/btrfs/extent-tree.c | 204 +++++++++------
fs/btrfs/extent_io.c | 11 +-
fs/btrfs/extent_io.h | 1 +
fs/btrfs/free-space-cache.c | 19 +-
fs/btrfs/inode.c | 6 +-
fs/btrfs/relocation.c | 4 +
fs/btrfs/zoned.c | 495 +++++++++++++++++++++++++++++++++---
fs/btrfs/zoned.h | 39 ++-
12 files changed, 692 insertions(+), 129 deletions(-)
--
2.32.0