* [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing
@ 2022-05-04  0:48 Naohiro Aota
  2022-05-04  0:48 ` [PATCH v2 1/5] btrfs: zoned: introduce btrfs_zoned_bg_is_full Naohiro Aota
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Naohiro Aota @ 2022-05-04  0:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

- Changes
 - v2
   - Rename some functions/variables.
   - Introduce btrfs_zoned_bg_is_full() to check if a block group is fully
     allocated or not.
   - Add some more comments.

* Note: this series depends on the "btrfs: zoned: fix zone activation logic"
  series (patch 1). I found the bug addressed in that series while
  introducing the helper.

Commit be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
introduced zone finishing of a block group when the IO reaches the end of
the block group. However, since the zone capacity may not be aligned to
16KB (the node size), we can leave un-allocatable space at the end of a
block group. Also, it turned out that metadata zone finishing never
actually works.

This series addresses these issues by rewriting metadata zone finishing
code to use a workqueue to process the finishing work.

Patch 1 introduces a helper to check if a block group is fully allocated.
The helper is used in patch 2.

Patch 2 is a clean-up patch to consolidate zone finishing function for
better maintainability.

Patch 3 changes the left region calculation so that it finishes a block
group when there is no more space left for allocation.

Patch 4 fixes metadata block group finishing which is not actually working.

Patch 5 implements zone finishing of an unused block group and fixes active
block group accounting. This patch is a bit unrelated to the other ones,
but it was tested with the previous patches applied, so let me keep it in
the same series.

Naohiro Aota (5):
  btrfs: zoned: introduce btrfs_zoned_bg_is_full
  btrfs: zoned: consolidate zone finish function
  btrfs: zoned: finish BG when there are no more allocatable bytes left
  btrfs: zoned: properly finish block group on metadata write
  btrfs: zoned: zone finish unused block group

 fs/btrfs/block-group.c |   8 ++
 fs/btrfs/block-group.h |   2 +
 fs/btrfs/extent-tree.c |   3 +-
 fs/btrfs/extent_io.c   |   6 +-
 fs/btrfs/extent_io.h   |   1 -
 fs/btrfs/zoned.c       | 176 ++++++++++++++++++++++++-----------------
 fs/btrfs/zoned.h       |  11 +++
 7 files changed, 128 insertions(+), 79 deletions(-)

-- 
2.35.1



* [PATCH v2 1/5] btrfs: zoned: introduce btrfs_zoned_bg_is_full
  2022-05-04  0:48 [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing Naohiro Aota
@ 2022-05-04  0:48 ` Naohiro Aota
  2022-05-04  0:48 ` [PATCH v2 2/5] btrfs: zoned: consolidate zone finish function Naohiro Aota
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Naohiro Aota @ 2022-05-04  0:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

Introduce a wrapper to check if all the space in a block group is allocated
or not.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 3 +--
 fs/btrfs/zoned.c       | 2 +-
 fs/btrfs/zoned.h       | 6 ++++++
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 1dc6b2014813..1a2959be579d 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3800,8 +3800,7 @@ static int do_allocation_zoned(struct btrfs_block_group *block_group,
 
 	/* Check RO and no space case before trying to activate it */
 	spin_lock(&block_group->lock);
-	if (block_group->ro ||
-	    block_group->alloc_offset == block_group->zone_capacity) {
+	if (block_group->ro || btrfs_zoned_bg_is_full(block_group)) {
 		ret = 1;
 		/*
 		 * May need to clear fs_info->{treelog,data_reloc}_bg.
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 6e91022ae9f6..cc0c5dd5a901 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1836,7 +1836,7 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 	}
 
 	/* No space left */
-	if (block_group->alloc_offset == block_group->zone_capacity) {
+	if (btrfs_zoned_bg_is_full(block_group)) {
 		ret = false;
 		goto out_unlock;
 	}
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index de923fc8449d..98f277ed5138 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -372,4 +372,10 @@ static inline void btrfs_zoned_data_reloc_unlock(struct btrfs_inode *inode)
 		mutex_unlock(&root->fs_info->zoned_data_reloc_io_lock);
 }
 
+static inline bool btrfs_zoned_bg_is_full(struct btrfs_block_group *bg)
+{
+	ASSERT(btrfs_is_zoned(bg->fs_info));
+	return bg->alloc_offset == bg->zone_capacity;
+}
+
 #endif
-- 
2.35.1



* [PATCH v2 2/5] btrfs: zoned: consolidate zone finish function
  2022-05-04  0:48 [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing Naohiro Aota
  2022-05-04  0:48 ` [PATCH v2 1/5] btrfs: zoned: introduce btrfs_zoned_bg_is_full Naohiro Aota
@ 2022-05-04  0:48 ` Naohiro Aota
  2022-05-04 16:00   ` Johannes Thumshirn
  2022-05-04  0:48 ` [PATCH v2 3/5] btrfs: zoned: finish BG when there are no more allocatable bytes left Naohiro Aota
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 9+ messages in thread
From: Naohiro Aota @ 2022-05-04  0:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

btrfs_zone_finish() and btrfs_zone_finish_endio() have similar code.
Introduce __btrfs_zone_finish() to consolidate them.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/zoned.c | 137 ++++++++++++++++++++++-------------------------
 1 file changed, 64 insertions(+), 73 deletions(-)

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index cc0c5dd5a901..0286fb1c63db 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1873,20 +1873,14 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 	return ret;
 }
 
-int btrfs_zone_finish(struct btrfs_block_group *block_group)
+static int do_zone_finish(struct btrfs_block_group *block_group, bool fully_written)
 {
 	struct btrfs_fs_info *fs_info = block_group->fs_info;
 	struct map_lookup *map;
-	struct btrfs_device *device;
-	u64 physical;
+	bool need_zone_finish;
 	int ret = 0;
 	int i;
 
-	if (!btrfs_is_zoned(fs_info))
-		return 0;
-
-	map = block_group->physical_map;
-
 	spin_lock(&block_group->lock);
 	if (!block_group->zone_is_active) {
 		spin_unlock(&block_group->lock);
@@ -1900,36 +1894,52 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
 		spin_unlock(&block_group->lock);
 		return -EAGAIN;
 	}
-	spin_unlock(&block_group->lock);
-
-	ret = btrfs_inc_block_group_ro(block_group, false);
-	if (ret)
-		return ret;
-
-	/* Ensure all writes in this block group finish */
-	btrfs_wait_block_group_reservations(block_group);
-	/* No need to wait for NOCOW writers. Zoned mode does not allow that. */
-	btrfs_wait_ordered_roots(fs_info, U64_MAX, block_group->start,
-				 block_group->length);
-
-	spin_lock(&block_group->lock);
 
 	/*
-	 * Bail out if someone already deactivated the block group, or
-	 * allocated space is left in the block group.
+	 * If we are sure that the block group is full (= no more room left for
+	 * new allocation) and the IO for the last usable block is completed, we
+	 * don't need to wait for the other IOs. This holds because we ensure
+	 * the sequential IO submissions using the ZONE_APPEND command for data
+	 * and block_group->meta_write_pointer for metadata.
 	 */
-	if (!block_group->zone_is_active) {
+	if (!fully_written) {
 		spin_unlock(&block_group->lock);
-		btrfs_dec_block_group_ro(block_group);
-		return 0;
-	}
 
-	if (block_group->reserved) {
-		spin_unlock(&block_group->lock);
-		btrfs_dec_block_group_ro(block_group);
-		return -EAGAIN;
+		ret = btrfs_inc_block_group_ro(block_group, false);
+		if (ret)
+			return ret;
+
+		/* Ensure all writes in this block group finish */
+		btrfs_wait_block_group_reservations(block_group);
+		/* No need to wait for NOCOW writers. Zoned mode does not allow that. */
+		btrfs_wait_ordered_roots(fs_info, U64_MAX, block_group->start,
+					 block_group->length);
+
+		spin_lock(&block_group->lock);
+
+		/*
+		 * Bail out if someone already deactivated the block group, or
+		 * allocated space is left in the block group.
+		 */
+		if (!block_group->zone_is_active) {
+			spin_unlock(&block_group->lock);
+			btrfs_dec_block_group_ro(block_group);
+			return 0;
+		}
+
+		if (block_group->reserved) {
+			spin_unlock(&block_group->lock);
+			btrfs_dec_block_group_ro(block_group);
+			return -EAGAIN;
+		}
 	}
 
+	/*
+	 * The block group is not fully allocated, so not fully written yet. We
+	 * need to send ZONE_FINISH command to free up an active zone.
+	 */
+	need_zone_finish = !btrfs_zoned_bg_is_full(block_group);
+
 	block_group->zone_is_active = 0;
 	block_group->alloc_offset = block_group->zone_capacity;
 	block_group->free_space_ctl->free_space = 0;
@@ -1937,24 +1947,29 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
 	btrfs_clear_data_reloc_bg(block_group);
 	spin_unlock(&block_group->lock);
 
+	map = block_group->physical_map;
 	for (i = 0; i < map->num_stripes; i++) {
-		device = map->stripes[i].dev;
-		physical = map->stripes[i].physical;
+		struct btrfs_device *device = map->stripes[i].dev;
+		const u64 physical = map->stripes[i].physical;
 
 		if (device->zone_info->max_active_zones == 0)
 			continue;
 
-		ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH,
-				       physical >> SECTOR_SHIFT,
-				       device->zone_info->zone_size >> SECTOR_SHIFT,
-				       GFP_NOFS);
+		if (need_zone_finish) {
+			ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH,
+					       physical >> SECTOR_SHIFT,
+					       device->zone_info->zone_size >> SECTOR_SHIFT,
+					       GFP_NOFS);
 
-		if (ret)
-			return ret;
+			if (ret)
+				return ret;
+		}
 
 		btrfs_dev_clear_active_zone(device, physical);
 	}
-	btrfs_dec_block_group_ro(block_group);
+
+	if (!fully_written)
+		btrfs_dec_block_group_ro(block_group);
 
 	spin_lock(&fs_info->zone_active_bgs_lock);
 	ASSERT(!list_empty(&block_group->active_bg_list));
@@ -1967,6 +1982,14 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
 	return 0;
 }
 
+int btrfs_zone_finish(struct btrfs_block_group *block_group)
+{
+	if (!btrfs_is_zoned(block_group->fs_info))
+		return 0;
+
+	return do_zone_finish(block_group, false);
+}
+
 bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 {
 	struct btrfs_fs_info *fs_info = fs_devices->fs_info;
@@ -1998,9 +2021,6 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 length)
 {
 	struct btrfs_block_group *block_group;
-	struct map_lookup *map;
-	struct btrfs_device *device;
-	u64 physical;
 
 	if (!btrfs_is_zoned(fs_info))
 		return;
@@ -2011,36 +2031,7 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
 	if (logical + length < block_group->start + block_group->zone_capacity)
 		goto out;
 
-	spin_lock(&block_group->lock);
-
-	if (!block_group->zone_is_active) {
-		spin_unlock(&block_group->lock);
-		goto out;
-	}
-
-	block_group->zone_is_active = 0;
-	/* We should have consumed all the free space */
-	ASSERT(block_group->alloc_offset == block_group->zone_capacity);
-	ASSERT(block_group->free_space_ctl->free_space == 0);
-	btrfs_clear_treelog_bg(block_group);
-	btrfs_clear_data_reloc_bg(block_group);
-	spin_unlock(&block_group->lock);
-
-	map = block_group->physical_map;
-	device = map->stripes[0].dev;
-	physical = map->stripes[0].physical;
-
-	if (!device->zone_info->max_active_zones)
-		goto out;
-
-	btrfs_dev_clear_active_zone(device, physical);
-
-	spin_lock(&fs_info->zone_active_bgs_lock);
-	ASSERT(!list_empty(&block_group->active_bg_list));
-	list_del_init(&block_group->active_bg_list);
-	spin_unlock(&fs_info->zone_active_bgs_lock);
-
-	btrfs_put_block_group(block_group);
+	do_zone_finish(block_group, true);
 
 out:
 	btrfs_put_block_group(block_group);
-- 
2.35.1



* [PATCH v2 3/5] btrfs: zoned: finish BG when there are no more allocatable bytes left
  2022-05-04  0:48 [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing Naohiro Aota
  2022-05-04  0:48 ` [PATCH v2 1/5] btrfs: zoned: introduce btrfs_zoned_bg_is_full Naohiro Aota
  2022-05-04  0:48 ` [PATCH v2 2/5] btrfs: zoned: consolidate zone finish function Naohiro Aota
@ 2022-05-04  0:48 ` Naohiro Aota
  2022-05-04  0:48 ` [PATCH v2 4/5] btrfs: zoned: properly finish block group on metadata write Naohiro Aota
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Naohiro Aota @ 2022-05-04  0:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota, stable, Pankaj Raghav

Currently, btrfs_zone_finish_endio() finishes a block group only when the
written region reaches the end of the block group. We can also finish the
block group when no more allocation is possible.

Cc: stable@vger.kernel.org # 5.16+
Fixes: be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/zoned.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 0286fb1c63db..320bb7ba1c49 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2021,6 +2021,7 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 length)
 {
 	struct btrfs_block_group *block_group;
+	u64 min_alloc_bytes;
 
 	if (!btrfs_is_zoned(fs_info))
 		return;
@@ -2028,7 +2029,15 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
 	block_group = btrfs_lookup_block_group(fs_info, logical);
 	ASSERT(block_group);
 
-	if (logical + length < block_group->start + block_group->zone_capacity)
+	/* No MIXED BG on zoned btrfs. */
+	if (block_group->flags & BTRFS_BLOCK_GROUP_DATA)
+		min_alloc_bytes = fs_info->sectorsize;
+	else
+		min_alloc_bytes = fs_info->nodesize;
+
+	/* Bail out if we can allocate more data from this BG. */
+	if (logical + length + min_alloc_bytes <=
+	    block_group->start + block_group->zone_capacity)
 		goto out;
 
 	do_zone_finish(block_group, true);
-- 
2.35.1



* [PATCH v2 4/5] btrfs: zoned: properly finish block group on metadata write
  2022-05-04  0:48 [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing Naohiro Aota
                   ` (2 preceding siblings ...)
  2022-05-04  0:48 ` [PATCH v2 3/5] btrfs: zoned: finish BG when there are no more allocatable bytes left Naohiro Aota
@ 2022-05-04  0:48 ` Naohiro Aota
  2022-05-04  0:48 ` [PATCH v2 5/5] btrfs: zoned: zone finish unused block group Naohiro Aota
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Naohiro Aota @ 2022-05-04  0:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota, stable

Commit be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
introduced zone finishing code both for data and metadata end_io path.
However, the metadata side is not working as it should. First, it
compares a logical address (eb->start + eb->len) with an offset within a
block group (cache->zone_capacity) in submit_eb_page(). That essentially
disabled zone finishing on the metadata end_io path.

Furthermore, fixing the issue above revealed that we cannot call
btrfs_zone_finish_endio() in end_extent_buffer_writeback(): we cannot
call btrfs_lookup_block_group(), which requires a spin lock, inside the
end_io context.

This commit introduces btrfs_schedule_zone_finish_bg() to wait for the
extent buffer writeback and do the zone finish IO in a workqueue.

Also, drop EXTENT_BUFFER_ZONE_FINISH as it is no longer used.

Cc: stable@vger.kernel.org # 5.16+
Fixes: be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/block-group.h |  2 ++
 fs/btrfs/extent_io.c   |  6 +-----
 fs/btrfs/extent_io.h   |  1 -
 fs/btrfs/zoned.c       | 34 ++++++++++++++++++++++++++++++++++
 fs/btrfs/zoned.h       |  5 +++++
 5 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
index c9bf01dd10e8..3ac668ace50a 100644
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -212,6 +212,8 @@ struct btrfs_block_group {
 	u64 meta_write_pointer;
 	struct map_lookup *physical_map;
 	struct list_head active_bg_list;
+	struct work_struct zone_finish_work;
+	struct extent_buffer *last_eb;
 };
 
 static inline u64 btrfs_block_group_end(struct btrfs_block_group *block_group)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1b1baeb0d76b..588c7c606a2c 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4251,9 +4251,6 @@ void wait_on_extent_buffer_writeback(struct extent_buffer *eb)
 
 static void end_extent_buffer_writeback(struct extent_buffer *eb)
 {
-	if (test_bit(EXTENT_BUFFER_ZONE_FINISH, &eb->bflags))
-		btrfs_zone_finish_endio(eb->fs_info, eb->start, eb->len);
-
 	clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
 	smp_mb__after_atomic();
 	wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK);
@@ -4843,8 +4840,7 @@ static int submit_eb_page(struct page *page, struct writeback_control *wbc,
 		/*
 		 * Implies write in zoned mode. Mark the last eb in a block group.
 		 */
-		if (cache->seq_zone && eb->start + eb->len == cache->zone_capacity)
-			set_bit(EXTENT_BUFFER_ZONE_FINISH, &eb->bflags);
+		btrfs_schedule_zone_finish_bg(cache, eb);
 		btrfs_put_block_group(cache);
 	}
 	ret = write_one_eb(eb, wbc, epd);
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 17674b7e699c..956fa434df43 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -26,7 +26,6 @@ enum {
 	/* write IO error */
 	EXTENT_BUFFER_WRITE_ERR,
 	EXTENT_BUFFER_NO_CHECK,
-	EXTENT_BUFFER_ZONE_FINISH,
 };
 
 /* these are flags for __process_pages_contig */
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 320bb7ba1c49..905ce5498ee0 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2046,6 +2046,40 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
 	btrfs_put_block_group(block_group);
 }
 
+static void btrfs_zone_finish_endio_workfn(struct work_struct *work)
+{
+	struct btrfs_block_group *bg =
+		container_of(work, struct btrfs_block_group, zone_finish_work);
+
+	wait_on_extent_buffer_writeback(bg->last_eb);
+	free_extent_buffer(bg->last_eb);
+	btrfs_zone_finish_endio(bg->fs_info, bg->start, bg->length);
+	btrfs_put_block_group(bg);
+}
+
+void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
+				   struct extent_buffer *eb)
+{
+	if (!bg->seq_zone ||
+	    eb->start + eb->len * 2 <= bg->start + bg->zone_capacity)
+		return;
+
+	if (WARN_ON(bg->zone_finish_work.func ==
+		    btrfs_zone_finish_endio_workfn)) {
+		btrfs_err(bg->fs_info,
+			  "double scheduling of BG %llu zone finishing",
+			  bg->start);
+		return;
+	}
+
+	/* For the work */
+	btrfs_get_block_group(bg);
+	atomic_inc(&eb->refs);
+	bg->last_eb = eb;
+	INIT_WORK(&bg->zone_finish_work, btrfs_zone_finish_endio_workfn);
+	queue_work(system_unbound_wq, &bg->zone_finish_work);
+}
+
 void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg)
 {
 	struct btrfs_fs_info *fs_info = bg->fs_info;
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index 98f277ed5138..a4126ec6b909 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -72,6 +72,8 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group);
 bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags);
 void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical,
 			     u64 length);
+void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
+				   struct extent_buffer *eb);
 void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
 void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info);
 bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info);
@@ -230,6 +232,9 @@ static inline bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices,
 static inline void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info,
 					   u64 logical, u64 length) { }
 
+static inline void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
+						 struct extent_buffer *eb) { }
+
 static inline void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg) { }
 
 static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { }
-- 
2.35.1



* [PATCH v2 5/5] btrfs: zoned: zone finish unused block group
  2022-05-04  0:48 [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing Naohiro Aota
                   ` (3 preceding siblings ...)
  2022-05-04  0:48 ` [PATCH v2 4/5] btrfs: zoned: properly finish block group on metadata write Naohiro Aota
@ 2022-05-04  0:48 ` Naohiro Aota
  2022-05-04 15:20 ` [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing David Sterba
  2022-05-04 16:02 ` Johannes Thumshirn
  6 siblings, 0 replies; 9+ messages in thread
From: Naohiro Aota @ 2022-05-04  0:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota, stable

While the active zones within an active block group are reset, and their
active resources are released, the block group itself is kept in the
active block group list and marked as active. As a result, the list will
contain more than max_active_zones block groups. That itself is not
fatal for the device, as the zones are properly reset.

However, that inflated list is, of course, strange. Also, an upcoming
patch series, which deactivates an active block group on demand, gets
confused by the wrong list.

So, fix the issue by finishing the unused block group once it becomes
read-only, so that we can release the active resource at an early stage.

Cc: stable@vger.kernel.org # 5.16+
Fixes: be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/block-group.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 9739f3e8230a..ede389f2602d 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1385,6 +1385,14 @@ void btrfs_delete_unused_bgs(struct btrfs_fs_info *fs_info)
 			goto next;
 		}
 
+		ret = btrfs_zone_finish(block_group);
+		if (ret < 0) {
+			btrfs_dec_block_group_ro(block_group);
+			if (ret == -EAGAIN)
+				ret = 0;
+			goto next;
+		}
+
 		/*
 		 * Want to do this before we do anything else so we can recover
 		 * properly if we fail to join the transaction.
-- 
2.35.1



* Re: [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing
  2022-05-04  0:48 [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing Naohiro Aota
                   ` (4 preceding siblings ...)
  2022-05-04  0:48 ` [PATCH v2 5/5] btrfs: zoned: zone finish unused block group Naohiro Aota
@ 2022-05-04 15:20 ` David Sterba
  2022-05-04 16:02 ` Johannes Thumshirn
  6 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2022-05-04 15:20 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs

On Tue, May 03, 2022 at 05:48:49PM -0700, Naohiro Aota wrote:
> [...]
>
> Naohiro Aota (5):
>   btrfs: zoned: introduce btrfs_zoned_bg_is_full
>   btrfs: zoned: consolidate zone finish function
>   btrfs: zoned: finish BG when there are no more allocatable bytes left
>   btrfs: zoned: properly finish block group on metadata write
>   btrfs: zoned: zone finish unused block group

Added to misc-next, thanks. I'll update patches with any followup
reviews.


* Re: [PATCH v2 2/5] btrfs: zoned: consolidate zone finish function
  2022-05-04  0:48 ` [PATCH v2 2/5] btrfs: zoned: consolidate zone finish function Naohiro Aota
@ 2022-05-04 16:00   ` Johannes Thumshirn
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2022-05-04 16:00 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs

On 03/05/2022 17:49, Naohiro Aota wrote:
> Introduce __btrfs_zone_finish() to consolidate them.

This still has __btrfs_zone_finish() instead of do_zone_finish(),
but I think David can fix that up when he applies the series.


* Re: [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing
  2022-05-04  0:48 [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing Naohiro Aota
                   ` (5 preceding siblings ...)
  2022-05-04 15:20 ` [PATCH v2 0/5] btrfs: zoned: fixes for zone finishing David Sterba
@ 2022-05-04 16:02 ` Johannes Thumshirn
  6 siblings, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2022-05-04 16:02 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs

Apart from the one small comment on patch 2,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

