All of lore.kernel.org
 help / color / mirror / Atom feed
* btrfs NOCOW fix and cleanups
@ 2023-07-24 14:22 Christoph Hellwig
  2023-07-24 14:22 ` [PATCH 1/6] btrfs: fix error handling when in a COW window in run_delalloc_nocow Christoph Hellwig
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 14:22 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba; +Cc: linux-btrfs

Hi all,

this series fixes a (found by code inspection) bug in the error handling
in btrfs_run_delalloc_nocow, and then cleans up a bunch of things in
btrfs_run_delalloc_nocow to allow me to actually undestand the logic
there, and in case of the last patch signigicantly simplifies it.

The series is on top of the for-next branch as that includes previous
work not in misc-next yet that the series relies on.

A git tree is also available here:

    git://git.infradead.org/users/hch/misc.git btrfs-nocow-cleanups

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/btrfs-nocow-cleanups

Diffstat:
 inode.c        |  143 +++++++++++++++++----------------------------------------
 ordered-data.c |   24 ++++++++-
 2 files changed, 67 insertions(+), 100 deletions(-)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/6] btrfs: fix error handling when in a COW window in run_delalloc_nocow
  2023-07-24 14:22 btrfs NOCOW fix and cleanups Christoph Hellwig
@ 2023-07-24 14:22 ` Christoph Hellwig
  2023-07-24 14:22 ` [PATCH 2/6] btrfs: cleanup the COW fallback logic " Christoph Hellwig
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 14:22 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba; +Cc: linux-btrfs

When run_delalloc_nocow has cow_start set to a value other than (u64)-1,
it has delayed COW writeback pending behind cur_offset.  When an error
occurs in such a window, the range going back to cow_start and not just
cur_offset needs to be unlocked, but only two error cases handle this
correctly  Move the code to handle unlock the COW range to the common
error handling label and document the logic.

To make things even more complicated, cow_file_range as called by
fallback_to_cow will unlock the range it is operating on when it fails as
well, so we need to reset cow_start right after caling fallback_to_cow
instead of only when it succeeded.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 640e7dd26f6132..434f82fa4957d3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2031,11 +2031,8 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 		leaf = path->nodes[0];
 		if (path->slots[0] >= btrfs_header_nritems(leaf)) {
 			ret = btrfs_next_leaf(root, path);
-			if (ret < 0) {
-				if (cow_start != (u64)-1)
-					cur_offset = cow_start;
+			if (ret < 0)
 				goto error;
-			}
 			if (ret > 0)
 				break;
 			leaf = path->nodes[0];
@@ -2098,13 +2095,10 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 
 		nocow_args.start = cur_offset;
 		ret = can_nocow_file_extent(path, &found_key, inode, &nocow_args);
-		if (ret < 0) {
-			if (cow_start != (u64)-1)
-				cur_offset = cow_start;
+		if (ret < 0)
 			goto error;
-		} else if (ret == 0) {
+		if (ret == 0)
 			goto out_check;
-		}
 
 		ret = 0;
 		bg = btrfs_inc_nocow_writers(fs_info, nocow_args.disk_bytenr);
@@ -2135,9 +2129,9 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 		if (cow_start != (u64)-1) {
 			ret = fallback_to_cow(inode, locked_page,
 					      cow_start, found_key.offset - 1);
+			cow_start = (u64)-1;
 			if (ret)
 				goto error;
-			cow_start = (u64)-1;
 		}
 
 		nocow_end = cur_offset + nocow_args.num_bytes - 1;
@@ -2216,6 +2210,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 	if (cow_start != (u64)-1) {
 		cur_offset = end;
 		ret = fallback_to_cow(inode, locked_page, cow_start, end);
+		cow_start = (u64)-1;
 		if (ret)
 			goto error;
 	}
@@ -2224,6 +2219,13 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 	if (nocow)
 		btrfs_dec_nocow_writers(bg);
 
+	/*
+	 * If an error happened while a COW region is outstanding, cur_offset
+	 * needs to be reset to cow_start to ensure the COW region is unlocked
+	 * as well.
+	 */
+	if (cow_start != (u64)-1)
+		cur_offset = cow_start;
 	if (ret && cur_offset < end)
 		extent_clear_unlock_delalloc(inode, cur_offset, end,
 					     locked_page, EXTENT_LOCKED |
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/6] btrfs: cleanup the COW fallback logic in run_delalloc_nocow
  2023-07-24 14:22 btrfs NOCOW fix and cleanups Christoph Hellwig
  2023-07-24 14:22 ` [PATCH 1/6] btrfs: fix error handling when in a COW window in run_delalloc_nocow Christoph Hellwig
@ 2023-07-24 14:22 ` Christoph Hellwig
  2023-07-24 14:22 ` [PATCH 3/6] btrfs: consolidate the error handling " Christoph Hellwig
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 14:22 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba; +Cc: linux-btrfs

Use the block group pointer used to track the outstanding NOCOW writes as
a boolean to remove the duplicate nocow variable, and keep it contained
in the main loop to simplify the logic.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 47 ++++++++++++++++++++++-------------------------
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 434f82fa4957d3..212aca4eea442b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1976,8 +1976,6 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 	int ret;
 	bool check_prev = true;
 	u64 ino = btrfs_ino(inode);
-	struct btrfs_block_group *bg;
-	bool nocow = false;
 	struct can_nocow_file_extent_args nocow_args = { 0 };
 
 	path = btrfs_alloc_path();
@@ -1995,6 +1993,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 	nocow_args.writeback_path = true;
 
 	while (1) {
+		struct btrfs_block_group *nocow_bg = NULL;
 		struct btrfs_ordered_extent *ordered;
 		struct btrfs_key found_key;
 		struct btrfs_file_extent_item *fi;
@@ -2005,8 +2004,6 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 		int extent_type;
 		bool is_prealloc;
 
-		nocow = false;
-
 		ret = btrfs_lookup_file_extent(NULL, root, path, ino,
 					       cur_offset, 0);
 		if (ret < 0)
@@ -2065,7 +2062,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 		if (found_key.offset > cur_offset) {
 			extent_end = found_key.offset;
 			extent_type = 0;
-			goto out_check;
+			goto must_cow;
 		}
 
 		/*
@@ -2098,18 +2095,19 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 		if (ret < 0)
 			goto error;
 		if (ret == 0)
-			goto out_check;
+			goto must_cow;
 
 		ret = 0;
-		bg = btrfs_inc_nocow_writers(fs_info, nocow_args.disk_bytenr);
-		if (bg)
-			nocow = true;
-out_check:
-		/*
-		 * If nocow is false then record the beginning of the range
-		 * that needs to be COWed
-		 */
-		if (!nocow) {
+		nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.disk_bytenr);
+		if (!nocow_bg) {
+must_cow:
+			/*
+			 * If we can't perform NOCOW writeback for the range,
+			 * then record the beginning of the range that needs to
+			 * be COWed.  It will be written out before the next
+			 * NOCOW range if we find one, or when exiting this
+			 * loop.
+			 */
 			if (cow_start == (u64)-1)
 				cow_start = cur_offset;
 			cur_offset = extent_end;
@@ -2130,8 +2128,10 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 			ret = fallback_to_cow(inode, locked_page,
 					      cow_start, found_key.offset - 1);
 			cow_start = (u64)-1;
-			if (ret)
+			if (ret) {
+				btrfs_dec_nocow_writers(nocow_bg);
 				goto error;
+			}
 		}
 
 		nocow_end = cur_offset + nocow_args.num_bytes - 1;
@@ -2148,6 +2148,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 					  ram_bytes, BTRFS_COMPRESS_NONE,
 					  BTRFS_ORDERED_PREALLOC);
 			if (IS_ERR(em)) {
+				btrfs_dec_nocow_writers(nocow_bg);
 				ret = PTR_ERR(em);
 				goto error;
 			}
@@ -2161,6 +2162,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 				? (1 << BTRFS_ORDERED_PREALLOC)
 				: (1 << BTRFS_ORDERED_NOCOW),
 				BTRFS_COMPRESS_NONE);
+		btrfs_dec_nocow_writers(nocow_bg);
 		if (IS_ERR(ordered)) {
 			if (is_prealloc) {
 				btrfs_drop_extent_map_range(inode, cur_offset,
@@ -2170,11 +2172,6 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 			goto error;
 		}
 
-		if (nocow) {
-			btrfs_dec_nocow_writers(bg);
-			nocow = false;
-		}
-
 		if (btrfs_is_data_reloc_root(root))
 			/*
 			 * Error handled later, as we must prevent
@@ -2215,10 +2212,10 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 			goto error;
 	}
 
-error:
-	if (nocow)
-		btrfs_dec_nocow_writers(bg);
+	btrfs_free_path(path);
+	return 0;
 
+error:
 	/*
 	 * If an error happened while a COW region is outstanding, cur_offset
 	 * needs to be reset to cow_start to ensure the COW region is unlocked
@@ -2226,7 +2223,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 	 */
 	if (cow_start != (u64)-1)
 		cur_offset = cow_start;
-	if (ret && cur_offset < end)
+	if (cur_offset < end)
 		extent_clear_unlock_delalloc(inode, cur_offset, end,
 					     locked_page, EXTENT_LOCKED |
 					     EXTENT_DELALLOC | EXTENT_DEFRAG |
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/6] btrfs: consolidate the error handling in run_delalloc_nocow
  2023-07-24 14:22 btrfs NOCOW fix and cleanups Christoph Hellwig
  2023-07-24 14:22 ` [PATCH 1/6] btrfs: fix error handling when in a COW window in run_delalloc_nocow Christoph Hellwig
  2023-07-24 14:22 ` [PATCH 2/6] btrfs: cleanup the COW fallback logic " Christoph Hellwig
@ 2023-07-24 14:22 ` Christoph Hellwig
  2023-07-24 18:27   ` Boris Burkov
  2023-07-24 14:22 ` [PATCH 4/6] btrfs: move the !zoned assert into run_delalloc_cow Christoph Hellwig
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 14:22 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba; +Cc: linux-btrfs

Share the calls to extent_clear_unlock_delalloc for btrfs_path allocation
failure handling and the normal exit path.

This relies on btrfs_free_path ignoring a NULL pointer, and the
initialization of cur_offset to start at the beginning of the function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 212aca4eea442b..2d465b50c228ed 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1973,21 +1973,14 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 	struct btrfs_path *path;
 	u64 cow_start = (u64)-1;
 	u64 cur_offset = start;
-	int ret;
+	int ret = -ENOMEM;
 	bool check_prev = true;
 	u64 ino = btrfs_ino(inode);
 	struct can_nocow_file_extent_args nocow_args = { 0 };
 
 	path = btrfs_alloc_path();
-	if (!path) {
-		extent_clear_unlock_delalloc(inode, start, end, locked_page,
-					     EXTENT_LOCKED | EXTENT_DELALLOC |
-					     EXTENT_DO_ACCOUNTING |
-					     EXTENT_DEFRAG, PAGE_UNLOCK |
-					     PAGE_START_WRITEBACK |
-					     PAGE_END_WRITEBACK);
-		return -ENOMEM;
-	}
+	if (!path)
+		goto error;
 
 	nocow_args.end = end;
 	nocow_args.writeback_path = true;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/6] btrfs: move the !zoned assert into run_delalloc_cow
  2023-07-24 14:22 btrfs NOCOW fix and cleanups Christoph Hellwig
                   ` (2 preceding siblings ...)
  2023-07-24 14:22 ` [PATCH 3/6] btrfs: consolidate the error handling " Christoph Hellwig
@ 2023-07-24 14:22 ` Christoph Hellwig
  2023-07-24 14:22 ` [PATCH 5/6] btrfs: use nocow_end for the loop iteration in run_delalloc_cow Christoph Hellwig
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 14:22 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba; +Cc: linux-btrfs

Having the assert in the actual helper documents the pre-conditions
much better than having it in the caller, so move it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2d465b50c228ed..92182e0d27fdb5 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1978,6 +1978,13 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 	u64 ino = btrfs_ino(inode);
 	struct can_nocow_file_extent_args nocow_args = { 0 };
 
+	/*
+	 * Normally on a zoned device we're only doing COW writes, but in case
+	 * of relocation on a zoned filesystem serializes I/O so that we're only
+	 * writing sequentially and can end up here as well.
+	 */
+	ASSERT(!btrfs_is_zoned(fs_info) || btrfs_is_data_reloc_root(root));
+
 	path = btrfs_alloc_path();
 	if (!path)
 		goto error;
@@ -2257,14 +2264,6 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
 		 start >= page_offset(locked_page) + PAGE_SIZE));
 
 	if (should_nocow(inode, start, end)) {
-		/*
-		 * Normally on a zoned device we're only doing COW writes, but
-		 * in case of relocation on a zoned filesystem we have taken
-		 * precaution, that we're only writing sequentially. It's safe
-		 * to use run_delalloc_nocow() here, like for  regular
-		 * preallocated inodes.
-		 */
-		ASSERT(!zoned || btrfs_is_data_reloc_root(inode->root));
 		ret = run_delalloc_nocow(inode, locked_page, start, end);
 		goto out;
 	}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/6] btrfs: use nocow_end for the loop iteration in run_delalloc_cow
  2023-07-24 14:22 btrfs NOCOW fix and cleanups Christoph Hellwig
                   ` (3 preceding siblings ...)
  2023-07-24 14:22 ` [PATCH 4/6] btrfs: move the !zoned assert into run_delalloc_cow Christoph Hellwig
@ 2023-07-24 14:22 ` Christoph Hellwig
  2023-08-10 17:00   ` David Sterba
  2023-07-24 14:22 ` [PATCH 6/6] btrfs: clone relocation checksums in btrfs_alloc_ordered_extent Christoph Hellwig
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 14:22 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba; +Cc: linux-btrfs

When run_delalloc_cow allocates an ordered extent for an actual
NOCOW range, it uses the nocow_end variable calculated based on
the current offset and the nocow_args.num_bytes value returned
from can_nocow_file_extent for all the actual I/O, but the loop
iteration then resets cur_offset to extent_end, which caused
me a lot of confusion.  It turns out that nocow_end is based
of the minimum of the extent end and the range end, and thus
actually works perfectly fine for the loop iteration, but
using a different variable here from the actual I/O submission
is horribly confusing and wasted some of my precious brain
cells when train to understand the logic.  Switch to using
nocow_end adjusted by the the off by one to make it an exclusive
range.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 92182e0d27fdb5..caaf2c002d795d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2187,7 +2187,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 					     EXTENT_CLEAR_DATA_RESV,
 					     PAGE_UNLOCK | PAGE_SET_ORDERED);
 
-		cur_offset = extent_end;
+		cur_offset = nocow_end + 1;
 
 		/*
 		 * btrfs_reloc_clone_csums() error, now we're OK to call error
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 6/6] btrfs: clone relocation checksums in btrfs_alloc_ordered_extent
  2023-07-24 14:22 btrfs NOCOW fix and cleanups Christoph Hellwig
                   ` (4 preceding siblings ...)
  2023-07-24 14:22 ` [PATCH 5/6] btrfs: use nocow_end for the loop iteration in run_delalloc_cow Christoph Hellwig
@ 2023-07-24 14:22 ` Christoph Hellwig
  2023-08-10 17:07   ` David Sterba
  2023-07-24 18:30 ` btrfs NOCOW fix and cleanups Boris Burkov
  2023-08-10 16:56 ` David Sterba
  7 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 14:22 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba; +Cc: linux-btrfs

Error handling for btrfs_reloc_clone_csums is a royal pain.  That is
not because btrfs_reloc_clone_csums does something particularly nasty:
the only failure can come from looking up the csums in the csum tree -
instead it is because btrfs_reloc_clone_csums is called after
btrfs_alloc_ordered_extent because it adds those founds csums to the
list in the ordred extent, and cleaning up after an ordered extent
has been added to the lookup data structures is very cumbersome.

Fix this by calling btrfs_reloc_clone_csums in
btrfs_alloc_ordered_extent after allocting the ordered extents, but
before making it reachable by the wider word and thus bypassing the
complex ordered extent removal path entirely.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c        | 44 -----------------------------------------
 fs/btrfs/ordered-data.c | 24 ++++++++++++++++++++--
 2 files changed, 22 insertions(+), 46 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index caaf2c002d795d..7864cae3ee2094 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1435,26 +1435,6 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 			ret = PTR_ERR(ordered);
 			goto out_drop_extent_cache;
 		}
-
-		if (btrfs_is_data_reloc_root(root)) {
-			ret = btrfs_reloc_clone_csums(ordered);
-
-			/*
-			 * Only drop cache here, and process as normal.
-			 *
-			 * We must not allow extent_clear_unlock_delalloc()
-			 * at out_unlock label to free meta of this ordered
-			 * extent, as its meta should be freed by
-			 * btrfs_finish_ordered_io().
-			 *
-			 * So we must continue until @start is increased to
-			 * skip current ordered extent.
-			 */
-			if (ret)
-				btrfs_drop_extent_map_range(inode, start,
-							    start + ram_size - 1,
-							    false);
-		}
 		btrfs_put_ordered_extent(ordered);
 
 		btrfs_dec_block_group_reservations(fs_info, ins.objectid);
@@ -1481,14 +1461,6 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 		alloc_hint = ins.objectid + ins.offset;
 		start += cur_alloc_size;
 		extent_reserved = false;
-
-		/*
-		 * btrfs_reloc_clone_csums() error, since start is increased
-		 * extent_clear_unlock_delalloc() at out_unlock label won't
-		 * free metadata of current ordered extent, we're OK to exit.
-		 */
-		if (ret)
-			goto out_unlock;
 	}
 done:
 	if (done_offset)
@@ -2171,14 +2143,6 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 			ret = PTR_ERR(ordered);
 			goto error;
 		}
-
-		if (btrfs_is_data_reloc_root(root))
-			/*
-			 * Error handled later, as we must prevent
-			 * extent_clear_unlock_delalloc() in error handler
-			 * from freeing metadata of created ordered extent.
-			 */
-			ret = btrfs_reloc_clone_csums(ordered);
 		btrfs_put_ordered_extent(ordered);
 
 		extent_clear_unlock_delalloc(inode, cur_offset, nocow_end,
@@ -2188,14 +2152,6 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 					     PAGE_UNLOCK | PAGE_SET_ORDERED);
 
 		cur_offset = nocow_end + 1;
-
-		/*
-		 * btrfs_reloc_clone_csums() error, now we're OK to call error
-		 * handler, as metadata for created ordered extent will only
-		 * be freed by btrfs_finish_ordered_io().
-		 */
-		if (ret)
-			goto error;
 		if (cur_offset > end)
 			break;
 	}
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 109e80ed25b669..caa5dbf48db572 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -18,6 +18,7 @@
 #include "delalloc-space.h"
 #include "qgroup.h"
 #include "subpage.h"
+#include "relocation.h"
 #include "file.h"
 #include "super.h"
 
@@ -274,8 +275,27 @@ struct btrfs_ordered_extent *btrfs_alloc_ordered_extent(
 	entry = alloc_ordered_extent(inode, file_offset, num_bytes, ram_bytes,
 				     disk_bytenr, disk_num_bytes, offset, flags,
 				     compress_type);
-	if (!IS_ERR(entry))
-		insert_ordered_extent(entry);
+	if (IS_ERR(entry))
+		return entry;
+
+	/*
+	 * Writes to relocation roots are special, and clones the existing csums
+	 * from the csum tree instead of calculating them.
+	 *
+	 * Clone the csums here so that the ordered extent never gets inserted
+	 * into the per-inode ordered extent tree and per-root list on failure.
+	 */
+	if (btrfs_is_data_reloc_root(inode->root)) {
+		int ret;
+
+		ret = btrfs_reloc_clone_csums(entry);
+		if (ret) {
+			kmem_cache_free(btrfs_ordered_extent_cache, entry);
+			return ERR_PTR(ret);
+		}
+	}
+
+	insert_ordered_extent(entry);
 	return entry;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] btrfs: consolidate the error handling in run_delalloc_nocow
  2023-07-24 14:22 ` [PATCH 3/6] btrfs: consolidate the error handling " Christoph Hellwig
@ 2023-07-24 18:27   ` Boris Burkov
  2023-07-24 19:48     ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: Boris Burkov @ 2023-07-24 18:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 07:22:40AM -0700, Christoph Hellwig wrote:
> Share the calls to extent_clear_unlock_delalloc for btrfs_path allocation
> failure handling and the normal exit path.
> 
> This relies on btrfs_free_path ignoring a NULL pointer, and the
> initialization of cur_offset to start at the beginning of the function.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/btrfs/inode.c | 13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 212aca4eea442b..2d465b50c228ed 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -1973,21 +1973,14 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
>  	struct btrfs_path *path;
>  	u64 cow_start = (u64)-1;
>  	u64 cur_offset = start;
> -	int ret;
> +	int ret = -ENOMEM;
>  	bool check_prev = true;
>  	u64 ino = btrfs_ino(inode);
>  	struct can_nocow_file_extent_args nocow_args = { 0 };
>  
>  	path = btrfs_alloc_path();
> -	if (!path) {
> -		extent_clear_unlock_delalloc(inode, start, end, locked_page,
> -					     EXTENT_LOCKED | EXTENT_DELALLOC |
> -					     EXTENT_DO_ACCOUNTING |
> -					     EXTENT_DEFRAG, PAGE_UNLOCK |
> -					     PAGE_START_WRITEBACK |
> -					     PAGE_END_WRITEBACK);
> -		return -ENOMEM;
> -	}
> +	if (!path)
> +		goto error;

nit: I think it's nicer to do ret = -ENOMEM here rather than relying on
initializion. It makes it less likely for a different change to
accidentally disrupt the implicit assumption that ret == -ENOMEM.

>  
>  	nocow_args.end = end;
>  	nocow_args.writeback_path = true;
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: btrfs NOCOW fix and cleanups
  2023-07-24 14:22 btrfs NOCOW fix and cleanups Christoph Hellwig
                   ` (5 preceding siblings ...)
  2023-07-24 14:22 ` [PATCH 6/6] btrfs: clone relocation checksums in btrfs_alloc_ordered_extent Christoph Hellwig
@ 2023-07-24 18:30 ` Boris Burkov
  2023-07-24 19:49   ` Christoph Hellwig
  2023-08-10 16:56 ` David Sterba
  7 siblings, 1 reply; 19+ messages in thread
From: Boris Burkov @ 2023-07-24 18:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 07:22:37AM -0700, Christoph Hellwig wrote:
> Hi all,
> 
> this series fixes a (found by code inspection) bug in the error handling
> in btrfs_run_delalloc_nocow, and then cleans up a bunch of things in
> btrfs_run_delalloc_nocow to allow me to actually undestand the logic
> there, and in case of the last patch signigicantly simplifies it.
> 
> The series is on top of the for-next branch as that includes previous
> work not in misc-next yet that the series relies on.

This doesn't apply on the latest for-next, but I reviewed it from your
git tree. It all LGTM there, thanks.

You can add:
Reviewed-by: Boris Burkov <boris@bur.io>

> 
> A git tree is also available here:
> 
>     git://git.infradead.org/users/hch/misc.git btrfs-nocow-cleanups
> 
> Gitweb:
> 
>     http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/btrfs-nocow-cleanups
> 
> Diffstat:
>  inode.c        |  143 +++++++++++++++++----------------------------------------
>  ordered-data.c |   24 ++++++++-
>  2 files changed, 67 insertions(+), 100 deletions(-)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] btrfs: consolidate the error handling in run_delalloc_nocow
  2023-07-24 18:27   ` Boris Burkov
@ 2023-07-24 19:48     ` Christoph Hellwig
  2023-07-25 21:36       ` David Sterba
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 19:48 UTC (permalink / raw)
  To: Boris Burkov
  Cc: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 11:27:37AM -0700, Boris Burkov wrote:
> > -		return -ENOMEM;
> > -	}
> > +	if (!path)
> > +		goto error;
> 
> nit: I think it's nicer to do ret = -ENOMEM here rather than relying on
> initializion. It makes it less likely for a different change to
> accidentally disrupt the implicit assumption that ret == -ENOMEM.

I kinda like the early initialization version, but I can live with
either variant.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: btrfs NOCOW fix and cleanups
  2023-07-24 18:30 ` btrfs NOCOW fix and cleanups Boris Burkov
@ 2023-07-24 19:49   ` Christoph Hellwig
  2023-07-24 19:58     ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 19:49 UTC (permalink / raw)
  To: Boris Burkov
  Cc: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 11:30:33AM -0700, Boris Burkov wrote:
> On Mon, Jul 24, 2023 at 07:22:37AM -0700, Christoph Hellwig wrote:
> > Hi all,
> > 
> > this series fixes a (found by code inspection) bug in the error handling
> > in btrfs_run_delalloc_nocow, and then cleans up a bunch of things in
> > btrfs_run_delalloc_nocow to allow me to actually undestand the logic
> > there, and in case of the last patch signigicantly simplifies it.
> > 
> > The series is on top of the for-next branch as that includes previous
> > work not in misc-next yet that the series relies on.
> 
> This doesn't apply on the latest for-next, but I reviewed it from your
> git tree. It all LGTM there, thanks.

Yeah, looks like for-next got rebased again today.  I'll rebase and
push it out to the git tree later today and can resend as needed.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: btrfs NOCOW fix and cleanups
  2023-07-24 19:49   ` Christoph Hellwig
@ 2023-07-24 19:58     ` Christoph Hellwig
  2023-07-25 21:42       ` David Sterba
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-24 19:58 UTC (permalink / raw)
  To: Boris Burkov
  Cc: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 09:49:23PM +0200, Christoph Hellwig wrote:
> Yeah, looks like for-next got rebased again today.  I'll rebase and
> push it out to the git tree later today and can resend as needed.

Looks like for-next has in fact pulled this series in already.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] btrfs: consolidate the error handling in run_delalloc_nocow
  2023-07-24 19:48     ` Christoph Hellwig
@ 2023-07-25 21:36       ` David Sterba
  0 siblings, 0 replies; 19+ messages in thread
From: David Sterba @ 2023-07-25 21:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Boris Burkov, Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 09:48:44PM +0200, Christoph Hellwig wrote:
> On Mon, Jul 24, 2023 at 11:27:37AM -0700, Boris Burkov wrote:
> > > -		return -ENOMEM;
> > > -	}
> > > +	if (!path)
> > > +		goto error;
> > 
> > nit: I think it's nicer to do ret = -ENOMEM here rather than relying on
> > initializion. It makes it less likely for a different change to
> > accidentally disrupt the implicit assumption that ret == -ENOMEM.
> 
> I kinda like the early initialization version, but I can live with
> either variant.

We use the style where the ret is initialized to 0 and then either
there's a direct return of error code or 'ret = -ERROR' followed by a
goto. There are a few remaining to convert but please don't add new
ones.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: btrfs NOCOW fix and cleanups
  2023-07-24 19:58     ` Christoph Hellwig
@ 2023-07-25 21:42       ` David Sterba
  2023-07-26 12:56         ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: David Sterba @ 2023-07-25 21:42 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Boris Burkov, Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 09:58:24PM +0200, Christoph Hellwig wrote:
> On Mon, Jul 24, 2023 at 09:49:23PM +0200, Christoph Hellwig wrote:
> > Yeah, looks like for-next got rebased again today.  I'll rebase and
> > push it out to the git tree later today and can resend as needed.
> 
> Looks like for-next has in fact pulled this series in already.

Please note that for-next is a preview branch and for early testing. It
will be rebased regularly, patches may appear and disappear or get moved
to misc-next. Currently there are some problems in misc-next so I'm
bisecting and trying to find a stable base for development branches
again.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: btrfs NOCOW fix and cleanups
  2023-07-25 21:42       ` David Sterba
@ 2023-07-26 12:56         ` Christoph Hellwig
  2023-07-27 11:50           ` David Sterba
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-07-26 12:56 UTC (permalink / raw)
  To: David Sterba
  Cc: Christoph Hellwig, Boris Burkov, Chris Mason, Josef Bacik,
	David Sterba, linux-btrfs

On Tue, Jul 25, 2023 at 11:42:25PM +0200, David Sterba wrote:
> On Mon, Jul 24, 2023 at 09:58:24PM +0200, Christoph Hellwig wrote:
> > On Mon, Jul 24, 2023 at 09:49:23PM +0200, Christoph Hellwig wrote:
> > > Yeah, looks like for-next got rebased again today.  I'll rebase and
> > > push it out to the git tree later today and can resend as needed.
> > 
> > Looks like for-next has in fact pulled this series in already.
> 
> Please note that for-next is a preview branch and for early testing.

I know (by now), just wanted to put out the explanation why Boris
saw the reject.

If at some point there is a good time to tweak the btrfs process, it
would be really nice to name the branch that ends up in linux-next
for-next like in every other subsystem, and to not use two different
git trees, which both are things that confused even me as a long time
kernel contributor horribly.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: btrfs NOCOW fix and cleanups
  2023-07-26 12:56         ` Christoph Hellwig
@ 2023-07-27 11:50           ` David Sterba
  0 siblings, 0 replies; 19+ messages in thread
From: David Sterba @ 2023-07-27 11:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: David Sterba, Christoph Hellwig, Boris Burkov, Chris Mason,
	Josef Bacik, David Sterba, linux-btrfs

On Wed, Jul 26, 2023 at 05:56:46AM -0700, Christoph Hellwig wrote:
> On Tue, Jul 25, 2023 at 11:42:25PM +0200, David Sterba wrote:
> > On Mon, Jul 24, 2023 at 09:58:24PM +0200, Christoph Hellwig wrote:
> > > On Mon, Jul 24, 2023 at 09:49:23PM +0200, Christoph Hellwig wrote:
> > > > Yeah, looks like for-next got rebased again today.  I'll rebase and
> > > > push it out to the git tree later today and can resend as needed.
> > > 
> > > Looks like for-next has in fact pulled this series in already.
> > 
> > Please note that for-next is a preview branch and for early testing.
> 
> I know (by now), just wanted to put out the explanation why Boris
> saw the reject.
> 
> If at some point there is a good time to tweak the btrfs process, it
> would be really nice to name the branch that ends up in linux-next
> for-next like in every other subsystem, and to not use two different
> git trees, which both are things that confused even me as a long time
> kernel contributor horribly.

You should not care about my kernel.org git tree for development, that's
just for linux-next and pull requests, at no point I refer people there
nor mention it. The repo github.com/kdave/btrfs-devel is for development
and contains misc-next as stable development base as well as the
for-next snapshots, the branch 'for-next'. I don't see anything
confusing.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: btrfs NOCOW fix and cleanups
  2023-07-24 14:22 btrfs NOCOW fix and cleanups Christoph Hellwig
                   ` (6 preceding siblings ...)
  2023-07-24 18:30 ` btrfs NOCOW fix and cleanups Boris Burkov
@ 2023-08-10 16:56 ` David Sterba
  7 siblings, 0 replies; 19+ messages in thread
From: David Sterba @ 2023-08-10 16:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 07:22:37AM -0700, Christoph Hellwig wrote:
> Hi all,
> 
> this series fixes a (found by code inspection) bug in the error handling
> in btrfs_run_delalloc_nocow, and then cleans up a bunch of things in
> btrfs_run_delalloc_nocow to allow me to actually undestand the logic
> there, and in case of the last patch signigicantly simplifies it.
> 
> The series is on top of the for-next branch as that includes previous
> work not in misc-next yet that the series relies on.
> 
> A git tree is also available here:
> 
>     git://git.infradead.org/users/hch/misc.git btrfs-nocow-cleanups
> 
> Gitweb:
> 
>     http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/btrfs-nocow-cleanups
> 
> Diffstat:
>  inode.c        |  143 +++++++++++++++++----------------------------------------
>  ordered-data.c |   24 ++++++++-
>  2 files changed, 67 insertions(+), 100 deletions(-)

Patches 1-4 added to misc-next, the remaining two will be in for-next.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/6] btrfs: use nocow_end for the loop iteration in run_delalloc_cow
  2023-07-24 14:22 ` [PATCH 5/6] btrfs: use nocow_end for the loop iteration in run_delalloc_cow Christoph Hellwig
@ 2023-08-10 17:00   ` David Sterba
  0 siblings, 0 replies; 19+ messages in thread
From: David Sterba @ 2023-08-10 17:00 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 07:22:42AM -0700, Christoph Hellwig wrote:
> When run_delalloc_cow allocates an ordered extent for an actual
> NOCOW range, it uses the nocow_end variable calculated based on
> the current offset and the nocow_args.num_bytes value returned
> from can_nocow_file_extent for all the actual I/O, but the loop
> iteration then resets cur_offset to extent_end, which caused
> me a lot of confusion.  It turns out that nocow_end is based
> of the minimum of the extent end and the range end, and thus
> actually works perfectly fine for the loop iteration, but
> using a different variable here from the actual I/O submission
> is horribly confusing and wasted some of my precious brain
> cells when train to understand the logic.

I'm editing out such personal notes from changelogs or rephrase them to
be relevant to the code regarding readability. In the long term nobody
cares how a developer felt while understanding or writing the code, we
always know better in the hindsight.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] btrfs: clone relocation checksums in btrfs_alloc_ordered_extent
  2023-07-24 14:22 ` [PATCH 6/6] btrfs: clone relocation checksums in btrfs_alloc_ordered_extent Christoph Hellwig
@ 2023-08-10 17:07   ` David Sterba
  0 siblings, 0 replies; 19+ messages in thread
From: David Sterba @ 2023-08-10 17:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Mon, Jul 24, 2023 at 07:22:43AM -0700, Christoph Hellwig wrote:
> Error handling for btrfs_reloc_clone_csums is a royal pain.  That is
> not because btrfs_reloc_clone_csums does something particularly nasty:
> the only failure can come from looking up the csums in the csum tree -
> instead it is because btrfs_reloc_clone_csums is called after
> btrfs_alloc_ordered_extent because it adds those founds csums to the
> list in the ordred extent, and cleaning up after an ordered extent
> has been added to the lookup data structures is very cumbersome.
> 
> Fix this by calling btrfs_reloc_clone_csums in
> btrfs_alloc_ordered_extent after allocting the ordered extents, but
> before making it reachable by the wider word and thus bypassing the
> complex ordered extent removal path entirely.

Please rephrase the changelog and fix the typos. Relocation code is
difficult and nobody wants to break it so there are things to fix but
instead of venting I'd rather see a more concise description that helps
to understand the problem and solution. We don't have much code
documentation and the efforts put into writing good changelogs for such
areas of code is worthwhile. Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2023-08-10 17:14 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-24 14:22 btrfs NOCOW fix and cleanups Christoph Hellwig
2023-07-24 14:22 ` [PATCH 1/6] btrfs: fix error handling when in a COW window in run_delalloc_nocow Christoph Hellwig
2023-07-24 14:22 ` [PATCH 2/6] btrfs: cleanup the COW fallback logic " Christoph Hellwig
2023-07-24 14:22 ` [PATCH 3/6] btrfs: consolidate the error handling " Christoph Hellwig
2023-07-24 18:27   ` Boris Burkov
2023-07-24 19:48     ` Christoph Hellwig
2023-07-25 21:36       ` David Sterba
2023-07-24 14:22 ` [PATCH 4/6] btrfs: move the !zoned assert into run_delalloc_cow Christoph Hellwig
2023-07-24 14:22 ` [PATCH 5/6] btrfs: use nocow_end for the loop iteration in run_delalloc_cow Christoph Hellwig
2023-08-10 17:00   ` David Sterba
2023-07-24 14:22 ` [PATCH 6/6] btrfs: clone relocation checksums in btrfs_alloc_ordered_extent Christoph Hellwig
2023-08-10 17:07   ` David Sterba
2023-07-24 18:30 ` btrfs NOCOW fix and cleanups Boris Burkov
2023-07-24 19:49   ` Christoph Hellwig
2023-07-24 19:58     ` Christoph Hellwig
2023-07-25 21:42       ` David Sterba
2023-07-26 12:56         ` Christoph Hellwig
2023-07-27 11:50           ` David Sterba
2023-08-10 16:56 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.