All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] extent buffer dirty cleanups
@ 2022-12-07 22:28 Josef Bacik
  2022-12-07 22:28 ` [PATCH 1/8] btrfs: always lock the block before calling btrfs_clean_tree_block Josef Bacik
                   ` (9 more replies)
  0 siblings, 10 replies; 17+ messages in thread
From: Josef Bacik @ 2022-12-07 22:28 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Hello,

While sync'ing ctree.c to btrfs-progs I noticed we have some oddities when it
comes to how we deal with the extent buffer being dirty.  We have
btrfs_clean_tree_block, which is sort of meant to be run against extent buffers
we've modified in this transaction.  However we have some other places where
we've open coded the same work without the generation check.  This makes it kind
of confusing, and is inconsistent with how we deal with the
fs_info->dirty_metadata_bytes.

So clean this stuff up so we have one helper we use for setting the extent
buffer dirty (btrfs_mark_buffer_dirty) and one for clearing dirty
(btrfs_clear_buffer_dirty).  This makes everything more consistent and clean
across the board.  I've additionally cleaned up a random writeback thing we had
in tree-log that I noticed while doing these cleanups.  Thanks,

Josef

Josef Bacik (8):
  btrfs: always lock the block before calling btrfs_clean_tree_block
  btrfs: do not check header generation in btrfs_clean_tree_block
  btrfs: do not set the header generation before btrfs_clean_tree_block
  btrfs: replace clearing extent buffer dirty bit with btrfs_clean_block
  btrfs: do not increment dirty_metadata_bytes in set_btree_ioerr
  btrfs: rename btrfs_clean_tree_block => btrfs_clear_buffer_dirty
  btrfs: combine btrfs_clear_buffer_dirty and clear_extent_buffer_dirty
  btrfs: remove btrfs_wait_tree_block_writeback

 fs/btrfs/ctree.c           | 16 +++++++--------
 fs/btrfs/disk-io.c         | 25 +++++-------------------
 fs/btrfs/disk-io.h         |  2 +-
 fs/btrfs/extent-tree.c     | 12 ++++--------
 fs/btrfs/extent_io.c       | 18 +++++++++--------
 fs/btrfs/extent_io.h       |  2 +-
 fs/btrfs/free-space-tree.c |  2 +-
 fs/btrfs/ioctl.c           |  2 +-
 fs/btrfs/qgroup.c          |  2 +-
 fs/btrfs/tree-log.c        | 40 ++++++++++++++------------------------
 10 files changed, 47 insertions(+), 74 deletions(-)

-- 
2.26.3


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/8] btrfs: always lock the block before calling btrfs_clean_tree_block
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
@ 2022-12-07 22:28 ` Josef Bacik
  2022-12-07 22:28 ` [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block Josef Bacik
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Josef Bacik @ 2022-12-07 22:28 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We want to clean up the dirty handling for extent buffers so it's a
little more consistent, so skip the check for generation == transid and
simply always lock the extent buffer before calling
btrfs_clean_tree_block.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 876bea67f9a1..c85af644e353 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5534,8 +5534,7 @@ static noinline int walk_up_proc(struct btrfs_trans_handle *trans,
 			}
 		}
 		/* make block locked assertion in btrfs_clean_tree_block happy */
-		if (!path->locks[level] &&
-		    btrfs_header_generation(eb) == trans->transid) {
+		if (!path->locks[level]) {
 			btrfs_tree_lock(eb);
 			path->locks[level] = BTRFS_WRITE_LOCK;
 		}
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
  2022-12-07 22:28 ` [PATCH 1/8] btrfs: always lock the block before calling btrfs_clean_tree_block Josef Bacik
@ 2022-12-07 22:28 ` Josef Bacik
  2022-12-16  5:32   ` Qu Wenruo
  2022-12-07 22:28 ` [PATCH 3/8] btrfs: do not set the header generation before btrfs_clean_tree_block Josef Bacik
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Josef Bacik @ 2022-12-07 22:28 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This check is from an era where we didn't have a per-extent buffer dirty
flag, we just messed with the page bits.  All the places we call this
function are when we have a transaction open, so just skip this check
and rely on the dirty flag.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/disk-io.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d0ed52cab304..267163e546a5 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -968,16 +968,13 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
 void btrfs_clean_tree_block(struct extent_buffer *buf)
 {
 	struct btrfs_fs_info *fs_info = buf->fs_info;
-	if (btrfs_header_generation(buf) ==
-	    fs_info->running_transaction->transid) {
-		btrfs_assert_tree_write_locked(buf);
+	btrfs_assert_tree_write_locked(buf);
 
-		if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)) {
-			percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
-						 -buf->len,
-						 fs_info->dirty_metadata_batch);
-			clear_extent_buffer_dirty(buf);
-		}
+	if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)) {
+		percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
+					 -buf->len,
+					 fs_info->dirty_metadata_batch);
+		clear_extent_buffer_dirty(buf);
 	}
 }
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/8] btrfs: do not set the header generation before btrfs_clean_tree_block
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
  2022-12-07 22:28 ` [PATCH 1/8] btrfs: always lock the block before calling btrfs_clean_tree_block Josef Bacik
  2022-12-07 22:28 ` [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block Josef Bacik
@ 2022-12-07 22:28 ` Josef Bacik
  2022-12-07 22:28 ` [PATCH 4/8] btrfs: replace clearing extent buffer dirty bit with btrfs_clean_block Josef Bacik
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Josef Bacik @ 2022-12-07 22:28 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We were doing this before to deal with possible uninitialized pages
which was reported by syzbot.  Now that btrfs_clean_tree_block no longer
checks the header generation simply remove this extra init.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index c85af644e353..971b1de50d9e 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4897,9 +4897,6 @@ btrfs_init_new_buffer(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 	    !test_bit(BTRFS_ROOT_RESET_LOCKDEP_CLASS, &root->state))
 		lockdep_owner = BTRFS_FS_TREE_OBJECTID;
 
-	/* btrfs_clean_tree_block() accesses generation field. */
-	btrfs_set_header_generation(buf, trans->transid);
-
 	/*
 	 * This needs to stay, because we could allocate a freed block from an
 	 * old tree into a new tree, so we need to make sure this new block is
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/8] btrfs: replace clearing extent buffer dirty bit with btrfs_clean_block
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
                   ` (2 preceding siblings ...)
  2022-12-07 22:28 ` [PATCH 3/8] btrfs: do not set the header generation before btrfs_clean_tree_block Josef Bacik
@ 2022-12-07 22:28 ` Josef Bacik
  2022-12-07 22:28 ` [PATCH 5/8] btrfs: do not increment dirty_metadata_bytes in set_btree_ioerr Josef Bacik
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Josef Bacik @ 2022-12-07 22:28 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that btrfs_clean_block doesn't care about the transid, we can
replace all occurrences of

if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &eb->bflags))
	clear_extent_buffer_dirty(eb);

with

btrfs_tree_lock(eb);
btrfs_clean_tree_block(eb);
btrfs_tree_unlock(eb);

We need the lock because if we are actually dirty we need to make sure
we aren't racing with anything that's starting writeout currently.  This
also makes sure that we're accounting fs_info->dirty_metadata_bytes
appropriately.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/disk-io.c  |  9 +++++----
 fs/btrfs/tree-log.c | 34 +++++++++++++++-------------------
 2 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 267163e546a5..275ba1925eeb 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -5074,11 +5074,12 @@ static int btrfs_destroy_marked_extents(struct btrfs_fs_info *fs_info,
 			start += fs_info->nodesize;
 			if (!eb)
 				continue;
-			wait_on_extent_buffer_writeback(eb);
 
-			if (test_and_clear_bit(EXTENT_BUFFER_DIRTY,
-					       &eb->bflags))
-				clear_extent_buffer_dirty(eb);
+			btrfs_tree_lock(eb);
+			wait_on_extent_buffer_writeback(eb);
+			btrfs_clean_tree_block(eb);
+			btrfs_tree_unlock(eb);
+
 			free_extent_buffer_stale(eb);
 		}
 	}
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 8fcfaf015a70..73e621df32f7 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2635,11 +2635,12 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans,
 					return ret;
 				}
 
+				btrfs_tree_lock(next);
+				btrfs_clean_tree_block(next);
+				btrfs_wait_tree_block_writeback(next);
+				btrfs_tree_unlock(next);
+
 				if (trans) {
-					btrfs_tree_lock(next);
-					btrfs_clean_tree_block(next);
-					btrfs_wait_tree_block_writeback(next);
-					btrfs_tree_unlock(next);
 					ret = btrfs_pin_reserved_extent(trans,
 							bytenr, blocksize);
 					if (ret) {
@@ -2649,8 +2650,6 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans,
 					btrfs_redirty_list_add(
 						trans->transaction, next);
 				} else {
-					if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &next->bflags))
-						clear_extent_buffer_dirty(next);
 					unaccount_log_buffer(fs_info, bytenr);
 				}
 			}
@@ -2705,11 +2704,12 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans,
 
 				next = path->nodes[*level];
 
+				btrfs_tree_lock(next);
+				btrfs_clean_tree_block(next);
+				btrfs_wait_tree_block_writeback(next);
+				btrfs_tree_unlock(next);
+
 				if (trans) {
-					btrfs_tree_lock(next);
-					btrfs_clean_tree_block(next);
-					btrfs_wait_tree_block_writeback(next);
-					btrfs_tree_unlock(next);
 					ret = btrfs_pin_reserved_extent(trans,
 						     path->nodes[*level]->start,
 						     path->nodes[*level]->len);
@@ -2718,9 +2718,6 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans,
 					btrfs_redirty_list_add(trans->transaction,
 							       next);
 				} else {
-					if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &next->bflags))
-						clear_extent_buffer_dirty(next);
-
 					unaccount_log_buffer(fs_info,
 						path->nodes[*level]->start);
 				}
@@ -2788,19 +2785,18 @@ static int walk_log_tree(struct btrfs_trans_handle *trans,
 
 			next = path->nodes[orig_level];
 
+			btrfs_tree_lock(next);
+			btrfs_clean_tree_block(next);
+			btrfs_wait_tree_block_writeback(next);
+			btrfs_tree_unlock(next);
+
 			if (trans) {
-				btrfs_tree_lock(next);
-				btrfs_clean_tree_block(next);
-				btrfs_wait_tree_block_writeback(next);
-				btrfs_tree_unlock(next);
 				ret = btrfs_pin_reserved_extent(trans,
 						next->start, next->len);
 				if (ret)
 					goto out;
 				btrfs_redirty_list_add(trans->transaction, next);
 			} else {
-				if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &next->bflags))
-					clear_extent_buffer_dirty(next);
 				unaccount_log_buffer(fs_info, next->start);
 			}
 		}
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/8] btrfs: do not increment dirty_metadata_bytes in set_btree_ioerr
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
                   ` (3 preceding siblings ...)
  2022-12-07 22:28 ` [PATCH 4/8] btrfs: replace clearing extent buffer dirty bit with btrfs_clean_block Josef Bacik
@ 2022-12-07 22:28 ` Josef Bacik
  2022-12-07 22:28 ` [PATCH 6/8] btrfs: rename btrfs_clean_tree_block => btrfs_clear_buffer_dirty Josef Bacik
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Josef Bacik @ 2022-12-07 22:28 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We only add if we set the extent buffer dirty, and we subtract when we
clear the extent buffer dirty.  If we end up in set_btree_ioerr we have
already cleared the buffer dirty, and we aren't resetting dirty on the
extent buffer, so this is simply wrong.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent_io.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c87be46e0663..7cd4065acc82 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2467,13 +2467,6 @@ static void set_btree_ioerr(struct page *page, struct extent_buffer *eb)
 	 */
 	mapping_set_error(page->mapping, -EIO);
 
-	/*
-	 * If we error out, we should add back the dirty_metadata_bytes
-	 * to make it consistent.
-	 */
-	percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
-				 eb->len, fs_info->dirty_metadata_batch);
-
 	/*
 	 * If writeback for a btree extent that doesn't belong to a log tree
 	 * failed, increment the counter transaction->eb_write_errors.
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 6/8] btrfs: rename btrfs_clean_tree_block => btrfs_clear_buffer_dirty
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
                   ` (4 preceding siblings ...)
  2022-12-07 22:28 ` [PATCH 5/8] btrfs: do not increment dirty_metadata_bytes in set_btree_ioerr Josef Bacik
@ 2022-12-07 22:28 ` Josef Bacik
  2022-12-07 22:28 ` [PATCH 7/8] btrfs: combine btrfs_clear_buffer_dirty and clear_extent_buffer_dirty Josef Bacik
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Josef Bacik @ 2022-12-07 22:28 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_clean_tree_block is a misnomer, it's just
clear_extent_buffer_dirty with some extra accounting around it.  Rename
this to btrfs_clear_buffer_dirty to make it more clear it belongs with
it's setter, btrfs_mark_buffer_dirty.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.c           | 16 ++++++++--------
 fs/btrfs/disk-io.c         |  4 ++--
 fs/btrfs/disk-io.h         |  2 +-
 fs/btrfs/extent-tree.c     |  6 +++---
 fs/btrfs/free-space-tree.c |  2 +-
 fs/btrfs/ioctl.c           |  2 +-
 fs/btrfs/qgroup.c          |  2 +-
 fs/btrfs/tree-log.c        |  6 +++---
 8 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 5476d90a76ce..4b7cd5acdaf9 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -459,7 +459,7 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
 			if (ret)
 				return ret;
 		}
-		btrfs_clean_tree_block(buf);
+		btrfs_clear_buffer_dirty(buf);
 		*last_ref = 1;
 	}
 	return 0;
@@ -1026,7 +1026,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
 
 		path->locks[level] = 0;
 		path->nodes[level] = NULL;
-		btrfs_clean_tree_block(mid);
+		btrfs_clear_buffer_dirty(mid);
 		btrfs_tree_unlock(mid);
 		/* once for the path */
 		free_extent_buffer(mid);
@@ -1087,7 +1087,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
 		if (wret < 0 && wret != -ENOSPC)
 			ret = wret;
 		if (btrfs_header_nritems(right) == 0) {
-			btrfs_clean_tree_block(right);
+			btrfs_clear_buffer_dirty(right);
 			btrfs_tree_unlock(right);
 			del_ptr(root, path, level + 1, pslot + 1);
 			root_sub_used(root, right->len);
@@ -1133,7 +1133,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
 		BUG_ON(wret == 1);
 	}
 	if (btrfs_header_nritems(mid) == 0) {
-		btrfs_clean_tree_block(mid);
+		btrfs_clear_buffer_dirty(mid);
 		btrfs_tree_unlock(mid);
 		del_ptr(root, path, level + 1, pslot);
 		root_sub_used(root, mid->len);
@@ -3106,7 +3106,7 @@ static noinline int __push_leaf_right(struct btrfs_path *path,
 	if (left_nritems)
 		btrfs_mark_buffer_dirty(left);
 	else
-		btrfs_clean_tree_block(left);
+		btrfs_clear_buffer_dirty(left);
 
 	btrfs_mark_buffer_dirty(right);
 
@@ -3118,7 +3118,7 @@ static noinline int __push_leaf_right(struct btrfs_path *path,
 	if (path->slots[0] >= left_nritems) {
 		path->slots[0] -= left_nritems;
 		if (btrfs_header_nritems(path->nodes[0]) == 0)
-			btrfs_clean_tree_block(path->nodes[0]);
+			btrfs_clear_buffer_dirty(path->nodes[0]);
 		btrfs_tree_unlock(path->nodes[0]);
 		free_extent_buffer(path->nodes[0]);
 		path->nodes[0] = right;
@@ -3330,7 +3330,7 @@ static noinline int __push_leaf_left(struct btrfs_path *path, int data_size,
 	if (right_nritems)
 		btrfs_mark_buffer_dirty(right);
 	else
-		btrfs_clean_tree_block(right);
+		btrfs_clear_buffer_dirty(right);
 
 	btrfs_item_key(right, &disk_key, 0);
 	fixup_low_keys(path, &disk_key, 1);
@@ -4367,7 +4367,7 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 		if (leaf == root->node) {
 			btrfs_set_header_level(leaf, 0);
 		} else {
-			btrfs_clean_tree_block(leaf);
+			btrfs_clear_buffer_dirty(leaf);
 			btrfs_del_leaf(trans, root, path, leaf);
 		}
 	} else {
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 275ba1925eeb..b8f1ac54d10c 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -965,7 +965,7 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
 
 }
 
-void btrfs_clean_tree_block(struct extent_buffer *buf)
+void btrfs_clear_buffer_dirty(struct extent_buffer *buf)
 {
 	struct btrfs_fs_info *fs_info = buf->fs_info;
 	btrfs_assert_tree_write_locked(buf);
@@ -5077,7 +5077,7 @@ static int btrfs_destroy_marked_extents(struct btrfs_fs_info *fs_info,
 
 			btrfs_tree_lock(eb);
 			wait_on_extent_buffer_writeback(eb);
-			btrfs_clean_tree_block(eb);
+			btrfs_clear_buffer_dirty(eb);
 			btrfs_tree_unlock(eb);
 
 			free_extent_buffer_stale(eb);
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index f2c507fd0e04..8d23746c7660 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -39,7 +39,7 @@ struct extent_buffer *btrfs_find_create_tree_block(
 						struct btrfs_fs_info *fs_info,
 						u64 bytenr, u64 owner_root,
 						int level);
-void btrfs_clean_tree_block(struct extent_buffer *buf);
+void btrfs_clear_buffer_dirty(struct extent_buffer *buf);
 void btrfs_clear_oneshot_options(struct btrfs_fs_info *fs_info);
 int btrfs_start_pre_rw_mount(struct btrfs_fs_info *fs_info);
 int btrfs_check_super_csum(struct btrfs_fs_info *fs_info,
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 971b1de50d9e..3f123c3570a9 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4905,7 +4905,7 @@ btrfs_init_new_buffer(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 	btrfs_set_buffer_lockdep_class(lockdep_owner, buf, level);
 
 	__btrfs_tree_lock(buf, nest);
-	btrfs_clean_tree_block(buf);
+	btrfs_clear_buffer_dirty(buf);
 	clear_bit(EXTENT_BUFFER_STALE, &buf->bflags);
 	clear_bit(EXTENT_BUFFER_NO_CHECK, &buf->bflags);
 
@@ -5530,12 +5530,12 @@ static noinline int walk_up_proc(struct btrfs_trans_handle *trans,
 				}
 			}
 		}
-		/* make block locked assertion in btrfs_clean_tree_block happy */
+		/* make block locked assertion in btrfs_clear_buffer_dirty happy */
 		if (!path->locks[level]) {
 			btrfs_tree_lock(eb);
 			path->locks[level] = BTRFS_WRITE_LOCK;
 		}
-		btrfs_clean_tree_block(eb);
+		btrfs_clear_buffer_dirty(eb);
 	}
 
 	if (eb == root->node) {
diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
index c667e878ef1a..e03f58f43719 100644
--- a/fs/btrfs/free-space-tree.c
+++ b/fs/btrfs/free-space-tree.c
@@ -1283,7 +1283,7 @@ int btrfs_clear_free_space_tree(struct btrfs_fs_info *fs_info)
 	list_del(&free_space_root->dirty_list);
 
 	btrfs_tree_lock(free_space_root->node);
-	btrfs_clean_tree_block(free_space_root->node);
+	btrfs_clear_buffer_dirty(free_space_root->node);
 	btrfs_tree_unlock(free_space_root->node);
 	btrfs_free_tree_block(trans, btrfs_root_id(free_space_root),
 			      free_space_root->node, 0, 1);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a64a71d882dc..8048b536c682 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -707,7 +707,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 		 * exists).
 		 */
 		btrfs_tree_lock(leaf);
-		btrfs_clean_tree_block(leaf);
+		btrfs_clear_buffer_dirty(leaf);
 		btrfs_tree_unlock(leaf);
 		btrfs_free_tree_block(trans, objectid, leaf, 0, 1);
 		free_extent_buffer(leaf);
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 35856ea28e32..9e22698ea439 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1303,7 +1303,7 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 	list_del(&quota_root->dirty_list);
 
 	btrfs_tree_lock(quota_root->node);
-	btrfs_clean_tree_block(quota_root->node);
+	btrfs_clear_buffer_dirty(quota_root->node);
 	btrfs_tree_unlock(quota_root->node);
 	btrfs_free_tree_block(trans, btrfs_root_id(quota_root),
 			      quota_root->node, 0, 1);
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 73e621df32f7..15695f505f05 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2636,7 +2636,7 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans,
 				}
 
 				btrfs_tree_lock(next);
-				btrfs_clean_tree_block(next);
+				btrfs_clear_buffer_dirty(next);
 				btrfs_wait_tree_block_writeback(next);
 				btrfs_tree_unlock(next);
 
@@ -2705,7 +2705,7 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans,
 				next = path->nodes[*level];
 
 				btrfs_tree_lock(next);
-				btrfs_clean_tree_block(next);
+				btrfs_clear_buffer_dirty(next);
 				btrfs_wait_tree_block_writeback(next);
 				btrfs_tree_unlock(next);
 
@@ -2786,7 +2786,7 @@ static int walk_log_tree(struct btrfs_trans_handle *trans,
 			next = path->nodes[orig_level];
 
 			btrfs_tree_lock(next);
-			btrfs_clean_tree_block(next);
+			btrfs_clear_buffer_dirty(next);
 			btrfs_wait_tree_block_writeback(next);
 			btrfs_tree_unlock(next);
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 7/8] btrfs: combine btrfs_clear_buffer_dirty and clear_extent_buffer_dirty
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
                   ` (5 preceding siblings ...)
  2022-12-07 22:28 ` [PATCH 6/8] btrfs: rename btrfs_clean_tree_block => btrfs_clear_buffer_dirty Josef Bacik
@ 2022-12-07 22:28 ` Josef Bacik
  2022-12-07 22:28 ` [PATCH 8/8] btrfs: remove btrfs_wait_tree_block_writeback Josef Bacik
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Josef Bacik @ 2022-12-07 22:28 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

btrfs_clear_buffer_dirty just does the test_clear_bit() and then calls
clear_extent_buffer_dirty and does the dirty metadata accounting.
Combine this into clear_extent_buffer_dirty and make the result
btrfs_clear_buffer_dirty.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/disk-io.c   | 13 -------------
 fs/btrfs/extent_io.c | 11 ++++++++++-
 fs/btrfs/extent_io.h |  2 +-
 3 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b8f1ac54d10c..8edbf73d8707 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -965,19 +965,6 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
 
 }
 
-void btrfs_clear_buffer_dirty(struct extent_buffer *buf)
-{
-	struct btrfs_fs_info *fs_info = buf->fs_info;
-	btrfs_assert_tree_write_locked(buf);
-
-	if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)) {
-		percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
-					 -buf->len,
-					 fs_info->dirty_metadata_batch);
-		clear_extent_buffer_dirty(buf);
-	}
-}
-
 static void __setup_root(struct btrfs_root *root, struct btrfs_fs_info *fs_info,
 			 u64 objectid)
 {
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 7cd4065acc82..c242aabfa863 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4813,12 +4813,21 @@ static void clear_subpage_extent_buffer_dirty(const struct extent_buffer *eb)
 	WARN_ON(atomic_read(&eb->refs) == 0);
 }
 
-void clear_extent_buffer_dirty(const struct extent_buffer *eb)
+void btrfs_clear_buffer_dirty(struct extent_buffer *eb)
 {
+	struct btrfs_fs_info *fs_info = eb->fs_info;
 	int i;
 	int num_pages;
 	struct page *page;
 
+	btrfs_assert_tree_write_locked(eb);
+
+	if (!test_and_clear_bit(EXTENT_BUFFER_DIRTY, &eb->bflags))
+		return;
+
+	percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, -eb->len,
+				 fs_info->dirty_metadata_batch);
+
 	if (eb->fs_info->nodesize < PAGE_SIZE)
 		return clear_subpage_extent_buffer_dirty(eb);
 
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index a024655d4237..41fc887d6cfe 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -261,7 +261,6 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star
 void extent_buffer_bitmap_clear(const struct extent_buffer *eb,
 				unsigned long start, unsigned long pos,
 				unsigned long len);
-void clear_extent_buffer_dirty(const struct extent_buffer *eb);
 bool set_extent_buffer_dirty(struct extent_buffer *eb);
 void set_extent_buffer_uptodate(struct extent_buffer *eb);
 void clear_extent_buffer_uptodate(struct extent_buffer *eb);
@@ -273,6 +272,7 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
 				  u32 bits_to_clear, unsigned long page_ops);
 int extent_invalidate_folio(struct extent_io_tree *tree,
 			    struct folio *folio, size_t offset);
+void btrfs_clear_buffer_dirty(struct extent_buffer *buf);
 
 int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array);
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 8/8] btrfs: remove btrfs_wait_tree_block_writeback
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
                   ` (6 preceding siblings ...)
  2022-12-07 22:28 ` [PATCH 7/8] btrfs: combine btrfs_clear_buffer_dirty and clear_extent_buffer_dirty Josef Bacik
@ 2022-12-07 22:28 ` Josef Bacik
  2022-12-14 17:49 ` [PATCH 0/8] extent buffer dirty cleanups David Sterba
  2022-12-15  4:38 ` Wang Yugui
  9 siblings, 0 replies; 17+ messages in thread
From: Josef Bacik @ 2022-12-07 22:28 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This is used in the tree-log code and is a holdover from previous
iterations of extent buffer writeback.  We can simply use
wait_on_extent_buffer_writeback here, and remove
btrfs_wait_tree_block_writeback completely.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/tree-log.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 15695f505f05..15f8130d812c 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -278,12 +278,6 @@ void btrfs_end_log_trans(struct btrfs_root *root)
 	}
 }
 
-static void btrfs_wait_tree_block_writeback(struct extent_buffer *buf)
-{
-	filemap_fdatawait_range(buf->pages[0]->mapping,
-			        buf->start, buf->start + buf->len - 1);
-}
-
 /*
  * the walk control struct is used to pass state down the chain when
  * processing the log tree.  The stage field tells us which part
@@ -2637,7 +2631,7 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans,
 
 				btrfs_tree_lock(next);
 				btrfs_clear_buffer_dirty(next);
-				btrfs_wait_tree_block_writeback(next);
+				wait_on_extent_buffer_writeback(next);
 				btrfs_tree_unlock(next);
 
 				if (trans) {
@@ -2706,7 +2700,7 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans,
 
 				btrfs_tree_lock(next);
 				btrfs_clear_buffer_dirty(next);
-				btrfs_wait_tree_block_writeback(next);
+				wait_on_extent_buffer_writeback(next);
 				btrfs_tree_unlock(next);
 
 				if (trans) {
@@ -2787,7 +2781,7 @@ static int walk_log_tree(struct btrfs_trans_handle *trans,
 
 			btrfs_tree_lock(next);
 			btrfs_clear_buffer_dirty(next);
-			btrfs_wait_tree_block_writeback(next);
+			wait_on_extent_buffer_writeback(next);
 			btrfs_tree_unlock(next);
 
 			if (trans) {
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/8] extent buffer dirty cleanups
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
                   ` (7 preceding siblings ...)
  2022-12-07 22:28 ` [PATCH 8/8] btrfs: remove btrfs_wait_tree_block_writeback Josef Bacik
@ 2022-12-14 17:49 ` David Sterba
  2022-12-15  4:38 ` Wang Yugui
  9 siblings, 0 replies; 17+ messages in thread
From: David Sterba @ 2022-12-14 17:49 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Wed, Dec 07, 2022 at 05:28:03PM -0500, Josef Bacik wrote:
> Hello,
> 
> While sync'ing ctree.c to btrfs-progs I noticed we have some oddities when it
> comes to how we deal with the extent buffer being dirty.  We have
> btrfs_clean_tree_block, which is sort of meant to be run against extent buffers
> we've modified in this transaction.  However we have some other places where
> we've open coded the same work without the generation check.  This makes it kind
> of confusing, and is inconsistent with how we deal with the
> fs_info->dirty_metadata_bytes.
> 
> So clean this stuff up so we have one helper we use for setting the extent
> buffer dirty (btrfs_mark_buffer_dirty) and one for clearing dirty
> (btrfs_clear_buffer_dirty).  This makes everything more consistent and clean
> across the board.  I've additionally cleaned up a random writeback thing we had
> in tree-log that I noticed while doing these cleanups.  Thanks,
> 
> Josef
> 
> Josef Bacik (8):
>   btrfs: always lock the block before calling btrfs_clean_tree_block
>   btrfs: do not check header generation in btrfs_clean_tree_block
>   btrfs: do not set the header generation before btrfs_clean_tree_block
>   btrfs: replace clearing extent buffer dirty bit with btrfs_clean_block
>   btrfs: do not increment dirty_metadata_bytes in set_btree_ioerr
>   btrfs: rename btrfs_clean_tree_block => btrfs_clear_buffer_dirty
>   btrfs: combine btrfs_clear_buffer_dirty and clear_extent_buffer_dirty
>   btrfs: remove btrfs_wait_tree_block_writeback

Added to misc-next, thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/8] extent buffer dirty cleanups
  2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
                   ` (8 preceding siblings ...)
  2022-12-14 17:49 ` [PATCH 0/8] extent buffer dirty cleanups David Sterba
@ 2022-12-15  4:38 ` Wang Yugui
  9 siblings, 0 replies; 17+ messages in thread
From: Wang Yugui @ 2022-12-15  4:38 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

Hi,

> Hello,
> 
> While sync'ing ctree.c to btrfs-progs I noticed we have some oddities when it
> comes to how we deal with the extent buffer being dirty.  We have
> btrfs_clean_tree_block, which is sort of meant to be run against extent buffers
> we've modified in this transaction.  However we have some other places where
> we've open coded the same work without the generation check.  This makes it kind
> of confusing, and is inconsistent with how we deal with the
> fs_info->dirty_metadata_bytes.
> 
> So clean this stuff up so we have one helper we use for setting the extent
> buffer dirty (btrfs_mark_buffer_dirty) and one for clearing dirty
> (btrfs_clear_buffer_dirty).  This makes everything more consistent and clean
> across the board.  I've additionally cleaned up a random writeback thing we had
> in tree-log that I noticed while doing these cleanups.  Thanks,
> 
> Josef
> 
> Josef Bacik (8):
>   btrfs: always lock the block before calling btrfs_clean_tree_block
>   btrfs: do not check header generation in btrfs_clean_tree_block
>   btrfs: do not set the header generation before btrfs_clean_tree_block
>   btrfs: replace clearing extent buffer dirty bit with btrfs_clean_block
>   btrfs: do not increment dirty_metadata_bytes in set_btree_ioerr
>   btrfs: rename btrfs_clean_tree_block => btrfs_clear_buffer_dirty
>   btrfs: combine btrfs_clear_buffer_dirty and clear_extent_buffer_dirty
>   btrfs: remove btrfs_wait_tree_block_writeback

xfstests result with/without these patches.
with the patches [1-8]	btrfs/066 btrfs/072 btrfs/074 failed
with the patches [1-5]	btrfs/066 btrfs/072 btrfs/074 failed
without the patches 	OK. no failure.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/12/15

>  fs/btrfs/ctree.c           | 16 +++++++--------
>  fs/btrfs/disk-io.c         | 25 +++++-------------------
>  fs/btrfs/disk-io.h         |  2 +-
>  fs/btrfs/extent-tree.c     | 12 ++++--------
>  fs/btrfs/extent_io.c       | 18 +++++++++--------
>  fs/btrfs/extent_io.h       |  2 +-
>  fs/btrfs/free-space-tree.c |  2 +-
>  fs/btrfs/ioctl.c           |  2 +-
>  fs/btrfs/qgroup.c          |  2 +-
>  fs/btrfs/tree-log.c        | 40 ++++++++++++++------------------------
>  10 files changed, 47 insertions(+), 74 deletions(-)
> 
> -- 
> 2.26.3



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block
  2022-12-07 22:28 ` [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block Josef Bacik
@ 2022-12-16  5:32   ` Qu Wenruo
  2022-12-17  2:10     ` Qu Wenruo
  0 siblings, 1 reply; 17+ messages in thread
From: Qu Wenruo @ 2022-12-16  5:32 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 2022/12/8 06:28, Josef Bacik wrote:
> This check is from an era where we didn't have a per-extent buffer dirty
> flag, we just messed with the page bits.  All the places we call this
> function are when we have a transaction open, so just skip this check
> and rely on the dirty flag.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

This patch is failing a lot of test cases, mostly scrub related 
(btrfs/072, btrfs/074).

Now we will report all kinds of errors during scrub.
Which seems weird, as scrub doesn't use the regular extent buffer 
helpers at all...

Maybe it's related to the generation we got during the extent tree search?
As all the failures points to generation mismatch during scrub.

Thanks,
Qu

> ---
>   fs/btrfs/disk-io.c | 15 ++++++---------
>   1 file changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index d0ed52cab304..267163e546a5 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -968,16 +968,13 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
>   void btrfs_clean_tree_block(struct extent_buffer *buf)
>   {
>   	struct btrfs_fs_info *fs_info = buf->fs_info;
> -	if (btrfs_header_generation(buf) ==
> -	    fs_info->running_transaction->transid) {
> -		btrfs_assert_tree_write_locked(buf);
> +	btrfs_assert_tree_write_locked(buf);
>   
> -		if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)) {
> -			percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
> -						 -buf->len,
> -						 fs_info->dirty_metadata_batch);
> -			clear_extent_buffer_dirty(buf);
> -		}
> +	if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)) {
> +		percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
> +					 -buf->len,
> +					 fs_info->dirty_metadata_batch);
> +		clear_extent_buffer_dirty(buf);
>   	}
>   }
>   

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block
  2022-12-16  5:32   ` Qu Wenruo
@ 2022-12-17  2:10     ` Qu Wenruo
  2023-01-10 15:33       ` David Sterba
  2023-01-11 19:56       ` Josef Bacik
  0 siblings, 2 replies; 17+ messages in thread
From: Qu Wenruo @ 2022-12-17  2:10 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 2022/12/16 13:32, Qu Wenruo wrote:
> 
> 
> On 2022/12/8 06:28, Josef Bacik wrote:
>> This check is from an era where we didn't have a per-extent buffer dirty
>> flag, we just messed with the page bits.  All the places we call this
>> function are when we have a transaction open, so just skip this check
>> and rely on the dirty flag.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> This patch is failing a lot of test cases, mostly scrub related 
> (btrfs/072, btrfs/074).
> 
> Now we will report all kinds of errors during scrub.
> Which seems weird, as scrub doesn't use the regular extent buffer 
> helpers at all...
> 
> Maybe it's related to the generation we got during the extent tree search?
> As all the failures points to generation mismatch during scrub.

I got some extra digging, and it turns out that, if we unconditionally 
clear the EXTENT_BUFFER_DIRTY flags, we can miss some tree blocks 
writeback for commit root.

Here is some trace for one extent buffer:

     btrfs_init_new_buffer: eb 7110656 set generation 13
     btrfs_clean_tree_block: eb 7110656 gen 13 dirty 0 running trans 13
     __btrfs_cow_block: eb 7110656 set generation 13

Above 3 lines are where the eb 7110656 got created as a cowed tree block.

     update_ref_for_cow: root 1 buf 6930432 gen 12 cow 7110656 gen 13

The eb 7110656 is cowed from 6930432, and that's created at transid 13.

     update_ref_for_cow: root 1 buf 7110656 gen 13 cow 7192576 gen 14

But that eb 7110656 got CoWed again in transaction 14. Which means, eb 
7110656 is no longer referred in transid 14, but is still referred in 
transid 13.

     btrfs_clean_tree_block: eb 7110656 gen 13 dirty 1 running trans 14

Here inside update_ref_for_cow(), we clear the dirty flag for eb 7110656.

This has a consequence that, the tree block 7110656 will not be written 
back, even it's still referred in transid 13.

This is where the problem is, previously we will continue writing back 
eb 7110656, as its transid is not the running transid, meaning some 
commit root still needs it.

Especially note that, I have added trace output for any tree block write 
back (in btree_csum_one_bio()).
But there is no such trace, meaning the tree block is never properly 
written back.

This also exposed another problem, if we didn't properly writeback tree 
blocks in commit root, we break COW, thus no proper transactional 
protection for our metadata.

     scrub_simple_mirror: tree 7110656 mirror 1 wanted generation 13
                          running trans 14
     scrub_checksum_tree_block: tree generation mismatch, tree 7110656
                                mirror 1 running trans 14, has 15 want 13
     scrub_checksum_tree_block: tree generation mismatch, tree 7110656
                                mirror 0 running trans 14, has 15 want 13

The above lines just shows the scrub failure for it.
As the tree block is not properly written back, we just read out some 
garbage.

And unfortunately our scrub code only checks bytenr, then generation, 
not even checking the fsid, thus we got the generation mismatch error.

     ...
     btree_csum_one_bio: eb 7110656 gen 26

There is an example to prove that, I have added tree block writeback 
trace event.

Thus I'd prefer to have at least a comment explaining why we can not 
just clean the dirty bit for a dirty eb which is not dirtied during the 
current running transaction.

Thanks,
Qu


> 
> Thanks,
> Qu
> 
>> ---
>>   fs/btrfs/disk-io.c | 15 ++++++---------
>>   1 file changed, 6 insertions(+), 9 deletions(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index d0ed52cab304..267163e546a5 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -968,16 +968,13 @@ struct extent_buffer *read_tree_block(struct 
>> btrfs_fs_info *fs_info, u64 bytenr,
>>   void btrfs_clean_tree_block(struct extent_buffer *buf)
>>   {
>>       struct btrfs_fs_info *fs_info = buf->fs_info;
>> -    if (btrfs_header_generation(buf) ==
>> -        fs_info->running_transaction->transid) {
>> -        btrfs_assert_tree_write_locked(buf);
>> +    btrfs_assert_tree_write_locked(buf);
>> -        if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)) {
>> -            percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
>> -                         -buf->len,
>> -                         fs_info->dirty_metadata_batch);
>> -            clear_extent_buffer_dirty(buf);
>> -        }
>> +    if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)) {
>> +        percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
>> +                     -buf->len,
>> +                     fs_info->dirty_metadata_batch);
>> +        clear_extent_buffer_dirty(buf);
>>       }
>>   }

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block
  2022-12-17  2:10     ` Qu Wenruo
@ 2023-01-10 15:33       ` David Sterba
  2023-01-10 23:03         ` Qu Wenruo
  2023-01-11 19:56       ` Josef Bacik
  1 sibling, 1 reply; 17+ messages in thread
From: David Sterba @ 2023-01-10 15:33 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Josef Bacik, linux-btrfs, kernel-team

On Sat, Dec 17, 2022 at 10:10:24AM +0800, Qu Wenruo wrote:
> On 2022/12/16 13:32, Qu Wenruo wrote:
> > On 2022/12/8 06:28, Josef Bacik wrote:
> >> This check is from an era where we didn't have a per-extent buffer dirty
> >> flag, we just messed with the page bits.  All the places we call this
> >> function are when we have a transaction open, so just skip this check
> >> and rely on the dirty flag.
> >>
> >> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > 
> > This patch is failing a lot of test cases, mostly scrub related 
> > (btrfs/072, btrfs/074).
> > 
> > Now we will report all kinds of errors during scrub.
> > Which seems weird, as scrub doesn't use the regular extent buffer 
> > helpers at all...
> > 
> > Maybe it's related to the generation we got during the extent tree search?
> > As all the failures points to generation mismatch during scrub.
> 
> I got some extra digging, and it turns out that, if we unconditionally 
> clear the EXTENT_BUFFER_DIRTY flags, we can miss some tree blocks 
> writeback for commit root.
> 
> Here is some trace for one extent buffer:
> 
>      btrfs_init_new_buffer: eb 7110656 set generation 13
>      btrfs_clean_tree_block: eb 7110656 gen 13 dirty 0 running trans 13
>      __btrfs_cow_block: eb 7110656 set generation 13
> 
> Above 3 lines are where the eb 7110656 got created as a cowed tree block.
> 
>      update_ref_for_cow: root 1 buf 6930432 gen 12 cow 7110656 gen 13
> 
> The eb 7110656 is cowed from 6930432, and that's created at transid 13.
> 
>      update_ref_for_cow: root 1 buf 7110656 gen 13 cow 7192576 gen 14
> 
> But that eb 7110656 got CoWed again in transaction 14. Which means, eb 
> 7110656 is no longer referred in transid 14, but is still referred in 
> transid 13.
> 
>      btrfs_clean_tree_block: eb 7110656 gen 13 dirty 1 running trans 14
> 
> Here inside update_ref_for_cow(), we clear the dirty flag for eb 7110656.
> 
> This has a consequence that, the tree block 7110656 will not be written 
> back, even it's still referred in transid 13.
> 
> This is where the problem is, previously we will continue writing back 
> eb 7110656, as its transid is not the running transid, meaning some 
> commit root still needs it.
> 
> Especially note that, I have added trace output for any tree block write 
> back (in btree_csum_one_bio()).
> But there is no such trace, meaning the tree block is never properly 
> written back.
> 
> This also exposed another problem, if we didn't properly writeback tree 
> blocks in commit root, we break COW, thus no proper transactional 
> protection for our metadata.
> 
>      scrub_simple_mirror: tree 7110656 mirror 1 wanted generation 13
>                           running trans 14
>      scrub_checksum_tree_block: tree generation mismatch, tree 7110656
>                                 mirror 1 running trans 14, has 15 want 13
>      scrub_checksum_tree_block: tree generation mismatch, tree 7110656
>                                 mirror 0 running trans 14, has 15 want 13
> 
> The above lines just shows the scrub failure for it.
> As the tree block is not properly written back, we just read out some 
> garbage.
> 
> And unfortunately our scrub code only checks bytenr, then generation, 
> not even checking the fsid, thus we got the generation mismatch error.
> 
>      ...
>      btree_csum_one_bio: eb 7110656 gen 26
> 
> There is an example to prove that, I have added tree block writeback 
> trace event.
> 
> Thus I'd prefer to have at least a comment explaining why we can not 
> just clean the dirty bit for a dirty eb which is not dirtied during the 
> current running transaction.

I've temporarily removed the patchset from for-next so we don't have
test failures over the holidays, now it's time to add it back, but based
on the above there are changes needed, right? If yes, I'll wait for the
update.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block
  2023-01-10 15:33       ` David Sterba
@ 2023-01-10 23:03         ` Qu Wenruo
  2023-01-11 20:44           ` David Sterba
  0 siblings, 1 reply; 17+ messages in thread
From: Qu Wenruo @ 2023-01-10 23:03 UTC (permalink / raw)
  To: dsterba, Qu Wenruo; +Cc: Josef Bacik, linux-btrfs, kernel-team



On 2023/1/10 23:33, David Sterba wrote:
> On Sat, Dec 17, 2022 at 10:10:24AM +0800, Qu Wenruo wrote:
>> On 2022/12/16 13:32, Qu Wenruo wrote:
>>> On 2022/12/8 06:28, Josef Bacik wrote:
>>>> This check is from an era where we didn't have a per-extent buffer dirty
>>>> flag, we just messed with the page bits.  All the places we call this
>>>> function are when we have a transaction open, so just skip this check
>>>> and rely on the dirty flag.
>>>>
>>>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>>>
>>> This patch is failing a lot of test cases, mostly scrub related
>>> (btrfs/072, btrfs/074).
>>>
>>> Now we will report all kinds of errors during scrub.
>>> Which seems weird, as scrub doesn't use the regular extent buffer
>>> helpers at all...
>>>
>>> Maybe it's related to the generation we got during the extent tree search?
>>> As all the failures points to generation mismatch during scrub.
>>
>> I got some extra digging, and it turns out that, if we unconditionally
>> clear the EXTENT_BUFFER_DIRTY flags, we can miss some tree blocks
>> writeback for commit root.
>>
>> Here is some trace for one extent buffer:
>>
>>       btrfs_init_new_buffer: eb 7110656 set generation 13
>>       btrfs_clean_tree_block: eb 7110656 gen 13 dirty 0 running trans 13
>>       __btrfs_cow_block: eb 7110656 set generation 13
>>
>> Above 3 lines are where the eb 7110656 got created as a cowed tree block.
>>
>>       update_ref_for_cow: root 1 buf 6930432 gen 12 cow 7110656 gen 13
>>
>> The eb 7110656 is cowed from 6930432, and that's created at transid 13.
>>
>>       update_ref_for_cow: root 1 buf 7110656 gen 13 cow 7192576 gen 14
>>
>> But that eb 7110656 got CoWed again in transaction 14. Which means, eb
>> 7110656 is no longer referred in transid 14, but is still referred in
>> transid 13.
>>
>>       btrfs_clean_tree_block: eb 7110656 gen 13 dirty 1 running trans 14
>>
>> Here inside update_ref_for_cow(), we clear the dirty flag for eb 7110656.
>>
>> This has a consequence that, the tree block 7110656 will not be written
>> back, even it's still referred in transid 13.
>>
>> This is where the problem is, previously we will continue writing back
>> eb 7110656, as its transid is not the running transid, meaning some
>> commit root still needs it.
>>
>> Especially note that, I have added trace output for any tree block write
>> back (in btree_csum_one_bio()).
>> But there is no such trace, meaning the tree block is never properly
>> written back.
>>
>> This also exposed another problem, if we didn't properly writeback tree
>> blocks in commit root, we break COW, thus no proper transactional
>> protection for our metadata.
>>
>>       scrub_simple_mirror: tree 7110656 mirror 1 wanted generation 13
>>                            running trans 14
>>       scrub_checksum_tree_block: tree generation mismatch, tree 7110656
>>                                  mirror 1 running trans 14, has 15 want 13
>>       scrub_checksum_tree_block: tree generation mismatch, tree 7110656
>>                                  mirror 0 running trans 14, has 15 want 13
>>
>> The above lines just shows the scrub failure for it.
>> As the tree block is not properly written back, we just read out some
>> garbage.
>>
>> And unfortunately our scrub code only checks bytenr, then generation,
>> not even checking the fsid, thus we got the generation mismatch error.
>>
>>       ...
>>       btree_csum_one_bio: eb 7110656 gen 26
>>
>> There is an example to prove that, I have added tree block writeback
>> trace event.
>>
>> Thus I'd prefer to have at least a comment explaining why we can not
>> just clean the dirty bit for a dirty eb which is not dirtied during the
>> current running transaction.
> 
> I've temporarily removed the patchset from for-next so we don't have
> test failures over the holidays, now it's time to add it back, but based
> on the above there are changes needed, right? If yes, I'll wait for the
> update.

I believe we should drop this patch from the series.

But since it's at the very beginning of the series, and a lot of later 
patches are depending on it, it needs some work to resolve it.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block
  2022-12-17  2:10     ` Qu Wenruo
  2023-01-10 15:33       ` David Sterba
@ 2023-01-11 19:56       ` Josef Bacik
  1 sibling, 0 replies; 17+ messages in thread
From: Josef Bacik @ 2023-01-11 19:56 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, kernel-team

On Sat, Dec 17, 2022 at 10:10:24AM +0800, Qu Wenruo wrote:
> 
> 
> On 2022/12/16 13:32, Qu Wenruo wrote:
> > 
> > 
> > On 2022/12/8 06:28, Josef Bacik wrote:
> > > This check is from an era where we didn't have a per-extent buffer dirty
> > > flag, we just messed with the page bits.  All the places we call this
> > > function are when we have a transaction open, so just skip this check
> > > and rely on the dirty flag.
> > > 
> > > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > 
> > This patch is failing a lot of test cases, mostly scrub related
> > (btrfs/072, btrfs/074).
> > 
> > Now we will report all kinds of errors during scrub.
> > Which seems weird, as scrub doesn't use the regular extent buffer
> > helpers at all...
> > 
> > Maybe it's related to the generation we got during the extent tree search?
> > As all the failures points to generation mismatch during scrub.
> 
> I got some extra digging, and it turns out that, if we unconditionally clear
> the EXTENT_BUFFER_DIRTY flags, we can miss some tree blocks writeback for
> commit root.
> 
> Here is some trace for one extent buffer:
> 
>     btrfs_init_new_buffer: eb 7110656 set generation 13
>     btrfs_clean_tree_block: eb 7110656 gen 13 dirty 0 running trans 13
>     __btrfs_cow_block: eb 7110656 set generation 13
> 
> Above 3 lines are where the eb 7110656 got created as a cowed tree block.
> 
>     update_ref_for_cow: root 1 buf 6930432 gen 12 cow 7110656 gen 13
> 
> The eb 7110656 is cowed from 6930432, and that's created at transid 13.
> 
>     update_ref_for_cow: root 1 buf 7110656 gen 13 cow 7192576 gen 14
> 
> But that eb 7110656 got CoWed again in transaction 14. Which means, eb
> 7110656 is no longer referred in transid 14, but is still referred in
> transid 13.
> 
>     btrfs_clean_tree_block: eb 7110656 gen 13 dirty 1 running trans 14
> 
> Here inside update_ref_for_cow(), we clear the dirty flag for eb 7110656.
> 

Ooooh well that's not good, we shouldn't be calling btrfs_clean_tree_block in
this case, that's the real bug.  I'll fix that and update the series.  Nice
catch, thanks!

Josef

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block
  2023-01-10 23:03         ` Qu Wenruo
@ 2023-01-11 20:44           ` David Sterba
  0 siblings, 0 replies; 17+ messages in thread
From: David Sterba @ 2023-01-11 20:44 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, Qu Wenruo, Josef Bacik, linux-btrfs, kernel-team

On Wed, Jan 11, 2023 at 07:03:06AM +0800, Qu Wenruo wrote:
> > I've temporarily removed the patchset from for-next so we don't have
> > test failures over the holidays, now it's time to add it back, but based
> > on the above there are changes needed, right? If yes, I'll wait for the
> > update.
> 
> I believe we should drop this patch from the series.
> 
> But since it's at the very beginning of the series, and a lot of later 
> patches are depending on it, it needs some work to resolve it.

Ok, so it's for Josef, please have a look.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-01-11 20:49 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-07 22:28 [PATCH 0/8] extent buffer dirty cleanups Josef Bacik
2022-12-07 22:28 ` [PATCH 1/8] btrfs: always lock the block before calling btrfs_clean_tree_block Josef Bacik
2022-12-07 22:28 ` [PATCH 2/8] btrfs: do not check header generation in btrfs_clean_tree_block Josef Bacik
2022-12-16  5:32   ` Qu Wenruo
2022-12-17  2:10     ` Qu Wenruo
2023-01-10 15:33       ` David Sterba
2023-01-10 23:03         ` Qu Wenruo
2023-01-11 20:44           ` David Sterba
2023-01-11 19:56       ` Josef Bacik
2022-12-07 22:28 ` [PATCH 3/8] btrfs: do not set the header generation before btrfs_clean_tree_block Josef Bacik
2022-12-07 22:28 ` [PATCH 4/8] btrfs: replace clearing extent buffer dirty bit with btrfs_clean_block Josef Bacik
2022-12-07 22:28 ` [PATCH 5/8] btrfs: do not increment dirty_metadata_bytes in set_btree_ioerr Josef Bacik
2022-12-07 22:28 ` [PATCH 6/8] btrfs: rename btrfs_clean_tree_block => btrfs_clear_buffer_dirty Josef Bacik
2022-12-07 22:28 ` [PATCH 7/8] btrfs: combine btrfs_clear_buffer_dirty and clear_extent_buffer_dirty Josef Bacik
2022-12-07 22:28 ` [PATCH 8/8] btrfs: remove btrfs_wait_tree_block_writeback Josef Bacik
2022-12-14 17:49 ` [PATCH 0/8] extent buffer dirty cleanups David Sterba
2022-12-15  4:38 ` Wang Yugui

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.