All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] btrfs: vairous bug fixes related to generic/475 failure with subpage cases
@ 2022-04-11  6:12 Qu Wenruo
  2022-04-11  6:12 ` [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed Qu Wenruo
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11  6:12 UTC (permalink / raw)
  To: linux-btrfs

When testing my subpage raid56 support, generic/475 is always hanging
with some data page unable to be unlocked.

It turns out that, the hang is not related to the raid56 subpage
support (obviously, as the test case is not utilizing RAID56 at all),
but a recent commit 1784b7d502a9 ("btrfs: handle csum lookup errors
properly on reads") introduced a new error path, and it caught us by
surprise.

The new error path is from btrfs_lookup_bio_sums(), which can now return
error if the csum search failed.

This new error path exposed several problems:

- Double cleanup for submit_one_bio() and its callers
  Bio submission hooks, btrfs_submit_data_bio() and
  btrfs_submit_metadata_bio() will call endio to cleanup on errors.

  But those bio submission hooks will also return error, and
  finally callers of submit_extent_page() will also try to do
  cleanup.

  This will be fixed by the first patch, by always returning 0 for
  submit_one_bio().
  This fix is kept as minimal as possible, to make backport easier.
  The proper conversion to return void will be done in the last patch.

- btrfs_do_readpage() can leave page locked on error
  If submit_extent_page() failed in btrfs_do_readpage(), we only
  cleanup the current range, and leaving the remaining subpage
  range locked.

  This bug is subpage specific, and will not affect regular cases.

  Fix it by cleaning up all the remaining subpage range before
  exiting.

- __extent_writepage_io() can return 0 even it hit some error
  Although we continue writing the remaining ranges, we didn't save
  the first error, causing @ret to be overwritten.
 
  This bug is subpage specific, as for regular cases we only have one
  sector inside the page.

  Fix it by introducing @has_error and @saved_ret.

I manually checked all other submit_extent_page() callers, they all
look fine and won't cause problems like the above.

Finally since submit_one_bio() will always return 0, the final patch
will make it return void, which greatly makes our code cleaner.

But that patch is introducing quite some modifications, not a candidate
for backport, unlike the first 3 patches.

Special thanks to Josef, as my initial bisection points to his patch and
I have no clue why it can cause problems at all.
His hints immediately solved all my questions, and lead to this
patchset.

Qu Wenruo (4):
  btrfs: avoid double clean up when submit_one_bio() failed
  btrfs: fix the error handling for submit_extent_page() for
    btrfs_do_readpage()
  btrfs: return correct error number for __extent_writepage_io()
  btrfs: make submit_one_bio() to return void

 fs/btrfs/extent_io.c | 139 ++++++++++++++++++-------------------------
 fs/btrfs/extent_io.h |   3 +-
 fs/btrfs/inode.c     |   9 +--
 3 files changed, 61 insertions(+), 90 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed
  2022-04-11  6:12 [PATCH 0/4] btrfs: vairous bug fixes related to generic/475 failure with subpage cases Qu Wenruo
@ 2022-04-11  6:12 ` Qu Wenruo
  2022-04-11  7:14   ` Christoph Hellwig
  2022-04-11  6:12 ` [PATCH 2/4] btrfs: fix the error handling for submit_extent_page() for btrfs_do_readpage() Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11  6:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: stable

[BUG]
When running generic/475 with 64K page size and 4K sector size, it has a
very high chance (almost 100%) to hang, with mostly data page locked but
no one is going to unlock it.

[CAUSE]
With commit 1784b7d502a9 ("btrfs: handle csum lookup errors properly on
reads"), if we failed to lookup checksum due to metadata IO error, we
will return error for btrfs_submit_data_bio().

This will cause the page to be unlocked twice in btrfs_do_readpage():

 btrfs_do_readpage()
 |- submit_extent_page()
 |  |- submit_one_bio()
 |     |- btrfs_submit_data_bio()
 |        |- if (ret) {
 |        |-     bio->bi_status = ret;
 |        |-     bio_endio(bio); }
 |               In the endio function, we will call end_page_read()
 |               and unlock_extent() to cleanup the subpage range.
 |
 |- if (ret) {
 |-        unlock_extent(); end_page_read() }
           Here we unlock the extent and cleanup the subpage range
           again.

For unlock_extent(), it's mostly double unlock safe.

But for end_page_read(), it's not, especially for subpage case,
as for subpage case we will call btrfs_subpage_end_reader() to reduce
the reader number, and use that to number to determine if we need to
unlock the full page.

If double accounted, it can underflow the number and leave the page
locked without anyone to unlock it.

[FIX]
The commit 1784b7d502a9 ("btrfs: handle csum lookup errors properly on
reads") itself is completely fine, it's our existing code not properly
handling the error from bio submission hook properly.

Since one and only one of the endio function or the submit_extent_page()
caller should do the cleanup, let's just ignore the return value from
bio submission hook.

Before the hook, it's callers' responsibility to do cleanup, after the
hook (including inside the hook), it's the endio doing the cleanup.

This patch is makes submit_one_bio() always return 0, but still keep the
old int return value to make minimal change for backport.

CC: stable@vger.kernel.org # 5.18+
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_io.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 53b59944013f..34073b0ed6ca 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -165,24 +165,28 @@ static int add_extent_changeset(struct extent_state *state, u32 bits,
 	return ret;
 }
 
-int __must_check submit_one_bio(struct bio *bio, int mirror_num,
-				unsigned long bio_flags)
+int submit_one_bio(struct bio *bio, int mirror_num, unsigned long bio_flags)
 {
-	blk_status_t ret = 0;
 	struct extent_io_tree *tree = bio->bi_private;
 
 	bio->bi_private = NULL;
 
 	/* Caller should ensure the bio has at least some range added */
 	ASSERT(bio->bi_iter.bi_size);
+
 	if (is_data_inode(tree->private_data))
-		ret = btrfs_submit_data_bio(tree->private_data, bio, mirror_num,
+		btrfs_submit_data_bio(tree->private_data, bio, mirror_num,
 					    bio_flags);
 	else
-		ret = btrfs_submit_metadata_bio(tree->private_data, bio,
+		btrfs_submit_metadata_bio(tree->private_data, bio,
 						mirror_num, bio_flags);
-
-	return blk_status_to_errno(ret);
+	/*
+	 * Above submission hooks will handle the error by ending the bio,
+	 * which will do the cleanup properly.
+	 * So here we should not return any error, or the caller of
+	 * submit_extent_page() will do cleanup again, causing problems.
+	 */
+	return 0;
 }
 
 /* Cleanup unsubmitted bios */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] btrfs: fix the error handling for submit_extent_page() for btrfs_do_readpage()
  2022-04-11  6:12 [PATCH 0/4] btrfs: vairous bug fixes related to generic/475 failure with subpage cases Qu Wenruo
  2022-04-11  6:12 ` [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed Qu Wenruo
@ 2022-04-11  6:12 ` Qu Wenruo
  2022-04-11  7:18   ` Christoph Hellwig
  2022-04-11  6:12 ` [PATCH 3/4] btrfs: return correct error number for __extent_writepage_io() Qu Wenruo
  2022-04-11  6:12 ` [PATCH 4/4] btrfs: make submit_one_bio() to return void Qu Wenruo
  3 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11  6:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: stable

[BUG]
Test case generic/475 have a very high chance (almost 100%) to hit a fs
hang, where a data page will never be unlocked and hang all later
operations.

[CAUSE]
In btrfs_do_readpage(), if we hit an error from submit_extent_page() we
will try to do the cleanup for our current io range, and exit.

This works fine for PAGE_SIZE == sectorsize cases, but not for subpage.

For subpage btrfs_do_readpage() will lock the full page first, which can
contain several different sectors and extents:

 btrfs_do_readpage()
 |- begin_page_read()
 |  |- btrfs_subpage_start_reader();
 |     Now the page will hage PAGE_SIZE / sectorsize reader pending,
 |     and the page is locked.
 |
 |- end_page_read() for different branches
 |  This function will reduce subpage readers, and when readers
 |  reach 0, it will unlock the page.

But when submit_extent_page() failed, we only cleanup the current
io range, while the remaining io range will never be cleaned up, and the
page remains locked forever.

[FIX]
Update the error handling of submit_extent_page() to cleanup all the
remaining subpage range before exiting the loop.

CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_io.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 34073b0ed6ca..8de25ce05606 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3735,8 +3735,12 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 					 this_bio_flag,
 					 force_bio_submit);
 		if (ret) {
-			unlock_extent(tree, cur, cur + iosize - 1);
-			end_page_read(page, false, cur, iosize);
+			/*
+			 * We have to unlock the remaining range, or the page
+			 * will never be unlocked.
+			 */
+			unlock_extent(tree, cur, end);
+			end_page_read(page, false, cur, end + 1 - cur);
 			goto out;
 		}
 		cur = cur + iosize;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] btrfs: return correct error number for __extent_writepage_io()
  2022-04-11  6:12 [PATCH 0/4] btrfs: vairous bug fixes related to generic/475 failure with subpage cases Qu Wenruo
  2022-04-11  6:12 ` [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed Qu Wenruo
  2022-04-11  6:12 ` [PATCH 2/4] btrfs: fix the error handling for submit_extent_page() for btrfs_do_readpage() Qu Wenruo
@ 2022-04-11  6:12 ` Qu Wenruo
  2022-04-11  6:12 ` [PATCH 4/4] btrfs: make submit_one_bio() to return void Qu Wenruo
  3 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11  6:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: stable

[BUG]
If we hit an error from submit_extent_page() inside
__extent_writepage_io(), we could still return 0 to the caller, and
even trigger the warning in btrfs_page_assert_not_dirty().

[CAUSE]
In __extent_writepage_io(), if we hit an error from
submit_extent_page(), we will just clean up the range and continue.

This is completely fine for regular PAGE_SIZE == sectorsize, as we can
only hit one sector in one page, thus after the error we're ensured to
exit and @ret will be saved.

But for subpage case, we may have other dirty subpage range in the page,
and in the next loop, we may succeeded submitting the next range.

In that case, @ret will be overwritten, and we return 0 to the caller,
while we have hit some error.

[FIX]
Introduce @has_error and @saved_ret to record the first error we hit, so
we will never forget what error we hit.

CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_io.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 8de25ce05606..b40bb544d301 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3916,10 +3916,12 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 	u64 extent_offset;
 	u64 block_start;
 	struct extent_map *em;
+	int saved_ret = 0;
 	int ret = 0;
 	int nr = 0;
 	u32 opf = REQ_OP_WRITE;
 	const unsigned int write_flags = wbc_to_write_flags(wbc);
+	bool has_error = false;
 	bool compressed;
 
 	ret = btrfs_writepage_cow_fixup(page);
@@ -3969,6 +3971,9 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 		if (IS_ERR(em)) {
 			btrfs_page_set_error(fs_info, page, cur, end - cur + 1);
 			ret = PTR_ERR_OR_ZERO(em);
+			has_error = true;
+			if (!saved_ret)
+				saved_ret = ret;
 			break;
 		}
 
@@ -4032,6 +4037,10 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 					 end_bio_extent_writepage,
 					 0, 0, false);
 		if (ret) {
+			has_error = true;
+			if (!saved_ret)
+				saved_ret = ret;
+
 			btrfs_page_set_error(fs_info, page, cur, iosize);
 			if (PageWriteback(page))
 				btrfs_page_clear_writeback(fs_info, page, cur,
@@ -4045,8 +4054,10 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 	 * If we finish without problem, we should not only clear page dirty,
 	 * but also empty subpage dirty bits
 	 */
-	if (!ret)
+	if (!has_error)
 		btrfs_page_assert_not_dirty(fs_info, page);
+	else
+		ret = saved_ret;
 	*nr_ret = nr;
 	return ret;
 }
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] btrfs: make submit_one_bio() to return void
  2022-04-11  6:12 [PATCH 0/4] btrfs: vairous bug fixes related to generic/475 failure with subpage cases Qu Wenruo
                   ` (2 preceding siblings ...)
  2022-04-11  6:12 ` [PATCH 3/4] btrfs: return correct error number for __extent_writepage_io() Qu Wenruo
@ 2022-04-11  6:12 ` Qu Wenruo
  2022-04-11  7:18   ` Christoph Hellwig
  3 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11  6:12 UTC (permalink / raw)
  To: linux-btrfs

Since commit "btrfs: avoid double clean up when submit_one_bio()
failed", submit_one_bio() will never return any error, as if we hit any
error, endio function will be responsible to cleanup the mess.

So here we're completely fine to make submit_one_bio() to return void.

NOTE: not all bio submission hooks should return void.

Hooks for async submission, like btrfs_submit_bio_start_direct_io() or
btrfs_submit_bio_start(), they should still return error, so
run_one_async_done() can detect the error properly.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_io.c | 104 +++++++++++++------------------------------
 fs/btrfs/extent_io.h |   3 +-
 fs/btrfs/inode.c     |   9 +---
 3 files changed, 34 insertions(+), 82 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index b40bb544d301..186836fad012 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -165,7 +165,7 @@ static int add_extent_changeset(struct extent_state *state, u32 bits,
 	return ret;
 }
 
-int submit_one_bio(struct bio *bio, int mirror_num, unsigned long bio_flags)
+void submit_one_bio(struct bio *bio, int mirror_num, unsigned long bio_flags)
 {
 	struct extent_io_tree *tree = bio->bi_private;
 
@@ -186,7 +186,6 @@ int submit_one_bio(struct bio *bio, int mirror_num, unsigned long bio_flags)
 	 * So here we should not return any error, or the caller of
 	 * submit_extent_page() will do cleanup again, causing problems.
 	 */
-	return 0;
 }
 
 /* Cleanup unsubmitted bios */
@@ -207,13 +206,12 @@ static void end_write_bio(struct extent_page_data *epd, int ret)
  * Return 0 if everything is OK.
  * Return <0 for error.
  */
-static int __must_check flush_write_bio(struct extent_page_data *epd)
+static void flush_write_bio(struct extent_page_data *epd)
 {
-	int ret = 0;
 	struct bio *bio = epd->bio_ctrl.bio;
 
 	if (bio) {
-		ret = submit_one_bio(bio, 0, 0);
+		submit_one_bio(bio, 0, 0);
 		/*
 		 * Clean up of epd->bio is handled by its endio function.
 		 * And endio is either triggered by successful bio execution
@@ -223,7 +221,6 @@ static int __must_check flush_write_bio(struct extent_page_data *epd)
 		 */
 		epd->bio_ctrl.bio = NULL;
 	}
-	return ret;
 }
 
 int __init extent_state_cache_init(void)
@@ -3399,10 +3396,8 @@ static int submit_extent_page(unsigned int opf,
 	ASSERT(pg_offset < PAGE_SIZE && size <= PAGE_SIZE &&
 	       pg_offset + size <= PAGE_SIZE);
 	if (force_bio_submit && bio_ctrl->bio) {
-		ret = submit_one_bio(bio_ctrl->bio, mirror_num, bio_ctrl->bio_flags);
+		submit_one_bio(bio_ctrl->bio, mirror_num, bio_ctrl->bio_flags);
 		bio_ctrl->bio = NULL;
-		if (ret < 0)
-			return ret;
 	}
 
 	while (cur < pg_offset + size) {
@@ -3443,11 +3438,9 @@ static int submit_extent_page(unsigned int opf,
 		if (added < size - offset) {
 			/* The bio should contain some page(s) */
 			ASSERT(bio_ctrl->bio->bi_iter.bi_size);
-			ret = submit_one_bio(bio_ctrl->bio, mirror_num,
-					bio_ctrl->bio_flags);
+			submit_one_bio(bio_ctrl->bio, mirror_num,
+				       bio_ctrl->bio_flags);
 			bio_ctrl->bio = NULL;
-			if (ret < 0)
-				return ret;
 		}
 		cur += added;
 	}
@@ -4209,14 +4202,12 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb
 			  struct extent_page_data *epd)
 {
 	struct btrfs_fs_info *fs_info = eb->fs_info;
-	int i, num_pages, failed_page_nr;
+	int i, num_pages;
 	int flush = 0;
 	int ret = 0;
 
 	if (!btrfs_try_tree_write_lock(eb)) {
-		ret = flush_write_bio(epd);
-		if (ret < 0)
-			return ret;
+		flush_write_bio(epd);
 		flush = 1;
 		btrfs_tree_lock(eb);
 	}
@@ -4226,9 +4217,7 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb
 		if (!epd->sync_io)
 			return 0;
 		if (!flush) {
-			ret = flush_write_bio(epd);
-			if (ret < 0)
-				return ret;
+			flush_write_bio(epd);
 			flush = 1;
 		}
 		while (1) {
@@ -4275,39 +4264,13 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb
 
 		if (!trylock_page(p)) {
 			if (!flush) {
-				int err;
-
-				err = flush_write_bio(epd);
-				if (err < 0) {
-					ret = err;
-					failed_page_nr = i;
-					goto err_unlock;
-				}
+				flush_write_bio(epd);
 				flush = 1;
 			}
 			lock_page(p);
 		}
 	}
 
-	return ret;
-err_unlock:
-	/* Unlock already locked pages */
-	for (i = 0; i < failed_page_nr; i++)
-		unlock_page(eb->pages[i]);
-	/*
-	 * Clear EXTENT_BUFFER_WRITEBACK and wake up anyone waiting on it.
-	 * Also set back EXTENT_BUFFER_DIRTY so future attempts to this eb can
-	 * be made and undo everything done before.
-	 */
-	btrfs_tree_lock(eb);
-	spin_lock(&eb->refs_lock);
-	set_bit(EXTENT_BUFFER_DIRTY, &eb->bflags);
-	end_extent_buffer_writeback(eb);
-	spin_unlock(&eb->refs_lock);
-	percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, eb->len,
-				 fs_info->dirty_metadata_batch);
-	btrfs_clear_header_flag(eb, BTRFS_HEADER_FLAG_WRITTEN);
-	btrfs_tree_unlock(eb);
 	return ret;
 }
 
@@ -4929,13 +4892,21 @@ int btree_write_cache_pages(struct address_space *mapping,
 	 *   if the fs already has error.
 	 */
 	if (!BTRFS_FS_ERROR(fs_info)) {
-		ret = flush_write_bio(&epd);
+		flush_write_bio(&epd);
 	} else {
 		ret = -EROFS;
 		end_write_bio(&epd, ret);
 	}
 out:
 	btrfs_zoned_meta_io_unlock(fs_info);
+	/*
+	 * Without the return value from flush_write_bio(), we can have
+	 * ret > 0 from submit_eb_page() which indicates how many
+	 * eb we submitted from that page.
+	 * It's just an advisable value, reset it to 0 to alert callers.
+	 */
+	if (ret > 0)
+		ret = 0;
 	return ret;
 }
 
@@ -5037,8 +5008,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
 			 * tmpfs file mapping
 			 */
 			if (!trylock_page(page)) {
-				ret = flush_write_bio(epd);
-				BUG_ON(ret < 0);
+				flush_write_bio(epd);
 				lock_page(page);
 			}
 
@@ -5048,10 +5018,8 @@ static int extent_write_cache_pages(struct address_space *mapping,
 			}
 
 			if (wbc->sync_mode != WB_SYNC_NONE) {
-				if (PageWriteback(page)) {
-					ret = flush_write_bio(epd);
-					BUG_ON(ret < 0);
-				}
+				if (PageWriteback(page))
+					flush_write_bio(epd);
 				wait_on_page_writeback(page);
 			}
 
@@ -5091,9 +5059,8 @@ static int extent_write_cache_pages(struct address_space *mapping,
 		 * page in our current bio, and thus deadlock, so flush the
 		 * write bio here.
 		 */
-		ret = flush_write_bio(epd);
-		if (!ret)
-			goto retry;
+		flush_write_bio(epd);
+		goto retry;
 	}
 
 	if (wbc->range_cyclic || (wbc->nr_to_write > 0 && range_whole))
@@ -5119,8 +5086,7 @@ int extent_write_full_page(struct page *page, struct writeback_control *wbc)
 		return ret;
 	}
 
-	ret = flush_write_bio(&epd);
-	ASSERT(ret <= 0);
+	flush_write_bio(&epd);
 	return ret;
 }
 
@@ -5182,7 +5148,7 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end)
 	}
 
 	if (!found_error)
-		ret = flush_write_bio(&epd);
+		flush_write_bio(&epd);
 	else
 		end_write_bio(&epd, ret);
 
@@ -5215,7 +5181,7 @@ int extent_writepages(struct address_space *mapping,
 		end_write_bio(&epd, ret);
 		return ret;
 	}
-	ret = flush_write_bio(&epd);
+	flush_write_bio(&epd);
 	return ret;
 }
 
@@ -5238,10 +5204,8 @@ void extent_readahead(struct readahead_control *rac)
 	if (em_cached)
 		free_extent_map(em_cached);
 
-	if (bio_ctrl.bio) {
-		if (submit_one_bio(bio_ctrl.bio, 0, bio_ctrl.bio_flags))
-			return;
-	}
+	if (bio_ctrl.bio)
+		submit_one_bio(bio_ctrl.bio, 0, bio_ctrl.bio_flags);
 }
 
 /*
@@ -6583,12 +6547,8 @@ static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait,
 		atomic_dec(&eb->io_pages);
 	}
 	if (bio_ctrl.bio) {
-		int tmp;
-
-		tmp = submit_one_bio(bio_ctrl.bio, mirror_num, 0);
+		submit_one_bio(bio_ctrl.bio, mirror_num, 0);
 		bio_ctrl.bio = NULL;
-		if (tmp < 0)
-			return tmp;
 	}
 	if (ret || wait != WAIT_COMPLETE)
 		return ret;
@@ -6701,10 +6661,8 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num)
 	}
 
 	if (bio_ctrl.bio) {
-		err = submit_one_bio(bio_ctrl.bio, mirror_num, bio_ctrl.bio_flags);
+		submit_one_bio(bio_ctrl.bio, mirror_num, bio_ctrl.bio_flags);
 		bio_ctrl.bio = NULL;
-		if (err)
-			return err;
 	}
 
 	if (ret || wait != WAIT_COMPLETE)
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 151e9da5da2d..2cfb49bf003e 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -178,8 +178,7 @@ typedef struct extent_map *(get_extent_t)(struct btrfs_inode *inode,
 int try_release_extent_mapping(struct page *page, gfp_t mask);
 int try_release_extent_buffer(struct page *page);
 
-int __must_check submit_one_bio(struct bio *bio, int mirror_num,
-				unsigned long bio_flags);
+void submit_one_bio(struct bio *bio, int mirror_num, unsigned long bio_flags);
 int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 		      struct btrfs_bio_ctrl *bio_ctrl,
 		      unsigned int read_flags, u64 *prev_em_start);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 78a5145353e1..43eebb1cf76d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8098,13 +8098,8 @@ int btrfs_readpage(struct file *file, struct page *page)
 	btrfs_lock_and_flush_ordered_range(inode, start, end, NULL);
 
 	ret = btrfs_do_readpage(page, NULL, &bio_ctrl, 0, NULL);
-	if (bio_ctrl.bio) {
-		int ret2;
-
-		ret2 = submit_one_bio(bio_ctrl.bio, 0, bio_ctrl.bio_flags);
-		if (ret == 0)
-			ret = ret2;
-	}
+	if (bio_ctrl.bio)
+		submit_one_bio(bio_ctrl.bio, 0, bio_ctrl.bio_flags);
 	return ret;
 }
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed
  2022-04-11  6:12 ` [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed Qu Wenruo
@ 2022-04-11  7:14   ` Christoph Hellwig
  2022-04-11  7:15     ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-04-11  7:14 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, stable

On Mon, Apr 11, 2022 at 02:12:49PM +0800, Qu Wenruo wrote:
> +	/*
> +	 * Above submission hooks will handle the error by ending the bio,
> +	 * which will do the cleanup properly.
> +	 * So here we should not return any error, or the caller of
> +	 * submit_extent_page() will do cleanup again, causing problems.
> +	 */
> +	return 0;

This should not return anything.  Similar to how e.g. submit_bio
works it needs to be a void return.  And yes, that will properly
fix all the double completion issues.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed
  2022-04-11  7:14   ` Christoph Hellwig
@ 2022-04-11  7:15     ` Qu Wenruo
  2022-04-11  7:19       ` Christoph Hellwig
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11  7:15 UTC (permalink / raw)
  To: Christoph Hellwig, Qu Wenruo; +Cc: linux-btrfs, stable



On 2022/4/11 15:14, Christoph Hellwig wrote:
> On Mon, Apr 11, 2022 at 02:12:49PM +0800, Qu Wenruo wrote:
>> +	/*
>> +	 * Above submission hooks will handle the error by ending the bio,
>> +	 * which will do the cleanup properly.
>> +	 * So here we should not return any error, or the caller of
>> +	 * submit_extent_page() will do cleanup again, causing problems.
>> +	 */
>> +	return 0;
>
> This should not return anything.  Similar to how e.g. submit_bio
> works it needs to be a void return.  And yes, that will properly
> fix all the double completion issues.

Yes, check the last patch.

This patch itself is for backport, thus I didn't change the return type
to make backport easier.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] btrfs: fix the error handling for submit_extent_page() for btrfs_do_readpage()
  2022-04-11  6:12 ` [PATCH 2/4] btrfs: fix the error handling for submit_extent_page() for btrfs_do_readpage() Qu Wenruo
@ 2022-04-11  7:18   ` Christoph Hellwig
  2022-04-11  7:31     ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-04-11  7:18 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, stable

On Mon, Apr 11, 2022 at 02:12:50PM +0800, Qu Wenruo wrote:
> [BUG]
> Test case generic/475 have a very high chance (almost 100%) to hit a fs
> hang, where a data page will never be unlocked and hang all later
> operations.

Question:  how can we even get an error?  The submission already stopped
return errors with patch 1.  btrfs_get_chunk_map called from
btrfs_zoned_get_device and calc_bio_boundaries really just has sanity
checks that should be fatal if not met, same for btrfs_get_io_geometry.

So yes, we could fix the nasty error handling here.  Or just remove it
entirely, which would reduce the possibility of bugs even more.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] btrfs: make submit_one_bio() to return void
  2022-04-11  6:12 ` [PATCH 4/4] btrfs: make submit_one_bio() to return void Qu Wenruo
@ 2022-04-11  7:18   ` Christoph Hellwig
  2022-04-11  7:20     ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-04-11  7:18 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Mon, Apr 11, 2022 at 02:12:52PM +0800, Qu Wenruo wrote:
> Since commit "btrfs: avoid double clean up when submit_one_bio()
> failed", submit_one_bio() will never return any error, as if we hit any
> error, endio function will be responsible to cleanup the mess.
> 
> So here we're completely fine to make submit_one_bio() to return void.
> 
> NOTE: not all bio submission hooks should return void.
> 
> Hooks for async submission, like btrfs_submit_bio_start_direct_io() or
> btrfs_submit_bio_start(), they should still return error, so
> run_one_async_done() can detect the error properly.

This really should go into patch 1.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed
  2022-04-11  7:15     ` Qu Wenruo
@ 2022-04-11  7:19       ` Christoph Hellwig
  2022-04-11  7:35         ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-04-11  7:19 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Christoph Hellwig, Qu Wenruo, linux-btrfs, stable

On Mon, Apr 11, 2022 at 03:15:23PM +0800, Qu Wenruo wrote:
> This patch itself is for backport, thus I didn't change the return type
> to make backport easier.

Umm, if you don't remove all that buggy error handling code in a
backport you'll have to fix it.  So do the right thing here and just
get rid of it ASAP including for the backport.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] btrfs: make submit_one_bio() to return void
  2022-04-11  7:18   ` Christoph Hellwig
@ 2022-04-11  7:20     ` Qu Wenruo
  2022-04-11 12:15       ` Christoph Hellwig
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11  7:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-btrfs



On 2022/4/11 15:18, Christoph Hellwig wrote:
> On Mon, Apr 11, 2022 at 02:12:52PM +0800, Qu Wenruo wrote:
>> Since commit "btrfs: avoid double clean up when submit_one_bio()
>> failed", submit_one_bio() will never return any error, as if we hit any
>> error, endio function will be responsible to cleanup the mess.
>>
>> So here we're completely fine to make submit_one_bio() to return void.
>>
>> NOTE: not all bio submission hooks should return void.
>>
>> Hooks for async submission, like btrfs_submit_bio_start_direct_io() or
>> btrfs_submit_bio_start(), they should still return error, so
>> run_one_async_done() can detect the error properly.
> 
> This really should go into patch 1.
> 
Stable tree won't be happy about the size.

Thanks,
Qu


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] btrfs: fix the error handling for submit_extent_page() for btrfs_do_readpage()
  2022-04-11  7:18   ` Christoph Hellwig
@ 2022-04-11  7:31     ` Qu Wenruo
  0 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11  7:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-btrfs, stable



On 2022/4/11 15:18, Christoph Hellwig wrote:
> On Mon, Apr 11, 2022 at 02:12:50PM +0800, Qu Wenruo wrote:
>> [BUG]
>> Test case generic/475 have a very high chance (almost 100%) to hit a fs
>> hang, where a data page will never be unlocked and hang all later
>> operations.
> 
> Question:  how can we even get an error?  The submission already stopped
> return errors with patch 1.  btrfs_get_chunk_map called from
> btrfs_zoned_get_device and calc_bio_boundaries really just has sanity
> checks that should be fatal if not met, same for btrfs_get_io_geometry.

Yep, btrfs_get_chunk_map() related call sites can only get error due to 
sanity check.

But the sanity check still makes sense, and those can not be easily 
rejected by our existing tree-checkers.

So we still need to do the error handling.

> 
> So yes, we could fix the nasty error handling here.  Or just remove it
> entirely, which would reduce the possibility of bugs even more.
> 

I don't see a super good way to remove the sanity check.

The get_chunk_map() one is here to catch possible bad bio which wants to 
write into non-existing chunk.
To completely get rid that, we need a bullet proof solution to make sure 
all of our bio will never point to some non-existing logical bytenr.

Especially considering we have extra mechanisms to make chunk 
allocation/deletion more dynamic and harder to catch such situation at 
other locations.

Which can be more challenging than the error handling.

Anyway, we still need to handle the error for __get_extent_map(), we 
just need to do the same one here.

Thanks,
Qu


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed
  2022-04-11  7:19       ` Christoph Hellwig
@ 2022-04-11  7:35         ` Qu Wenruo
  0 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11  7:35 UTC (permalink / raw)
  To: Christoph Hellwig, Qu Wenruo; +Cc: linux-btrfs, stable



On 2022/4/11 15:19, Christoph Hellwig wrote:
> On Mon, Apr 11, 2022 at 03:15:23PM +0800, Qu Wenruo wrote:
>> This patch itself is for backport, thus I didn't change the return type
>> to make backport easier.
> 
> Umm, if you don't remove all that buggy error handling code in a
> backport you'll have to fix it.  So do the right thing here and just
> get rid of it ASAP including for the backport.
> 

I don't think there is much difference by making submit_one_bio() to 
always return 0, and backporting the full type change.

For btrfs development, no one will base their code on stable, thus what 
they see is the one returning void.

For vendor backports, I have left a comment on why we return 0 manually, 
it should be clear enough to solve any future conflicts.

I'll let David to determine the correct way, as small backport + proper 
cleanup policy for bug fixes is pretty common here AFAIK.

Thanks,
Qu


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] btrfs: make submit_one_bio() to return void
  2022-04-11  7:20     ` Qu Wenruo
@ 2022-04-11 12:15       ` Christoph Hellwig
  2022-04-11 12:23         ` Greg KH
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-04-11 12:15 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Christoph Hellwig, linux-btrfs, gregkh

On Mon, Apr 11, 2022 at 03:20:47PM +0800, Qu Wenruo wrote:
> > This really should go into patch 1.
> > 
> Stable tree won't be happy about the size.

Really?  I've never really seen stable maintainers complain about
the size of a patch.  Especially if it is almost 100% removal of buggy
code.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] btrfs: make submit_one_bio() to return void
  2022-04-11 12:15       ` Christoph Hellwig
@ 2022-04-11 12:23         ` Greg KH
  2022-04-11 12:29           ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Greg KH @ 2022-04-11 12:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Qu Wenruo, linux-btrfs

On Mon, Apr 11, 2022 at 05:15:50AM -0700, Christoph Hellwig wrote:
> On Mon, Apr 11, 2022 at 03:20:47PM +0800, Qu Wenruo wrote:
> > > This really should go into patch 1.
> > > 
> > Stable tree won't be happy about the size.
> 
> Really?  I've never really seen stable maintainers complain about
> the size of a patch.  Especially if it is almost 100% removal of buggy
> code.

That sounds like a great patch to take for stable kernels :)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] btrfs: make submit_one_bio() to return void
  2022-04-11 12:23         ` Greg KH
@ 2022-04-11 12:29           ` Qu Wenruo
  0 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2022-04-11 12:29 UTC (permalink / raw)
  To: Greg KH, Christoph Hellwig; +Cc: Qu Wenruo, linux-btrfs



On 2022/4/11 20:23, Greg KH wrote:
> On Mon, Apr 11, 2022 at 05:15:50AM -0700, Christoph Hellwig wrote:
>> On Mon, Apr 11, 2022 at 03:20:47PM +0800, Qu Wenruo wrote:
>>>> This really should go into patch 1.
>>>>
>>> Stable tree won't be happy about the size.
>>
>> Really?  I've never really seen stable maintainers complain about
>> the size of a patch.  Especially if it is almost 100% removal of buggy
>> code.
>
> That sounds like a great patch to take for stable kernels :)

OK, I'll merge patch 1 and patch 4 in this case.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-04-11 12:30 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-11  6:12 [PATCH 0/4] btrfs: vairous bug fixes related to generic/475 failure with subpage cases Qu Wenruo
2022-04-11  6:12 ` [PATCH 1/4] btrfs: avoid double clean up when submit_one_bio() failed Qu Wenruo
2022-04-11  7:14   ` Christoph Hellwig
2022-04-11  7:15     ` Qu Wenruo
2022-04-11  7:19       ` Christoph Hellwig
2022-04-11  7:35         ` Qu Wenruo
2022-04-11  6:12 ` [PATCH 2/4] btrfs: fix the error handling for submit_extent_page() for btrfs_do_readpage() Qu Wenruo
2022-04-11  7:18   ` Christoph Hellwig
2022-04-11  7:31     ` Qu Wenruo
2022-04-11  6:12 ` [PATCH 3/4] btrfs: return correct error number for __extent_writepage_io() Qu Wenruo
2022-04-11  6:12 ` [PATCH 4/4] btrfs: make submit_one_bio() to return void Qu Wenruo
2022-04-11  7:18   ` Christoph Hellwig
2022-04-11  7:20     ` Qu Wenruo
2022-04-11 12:15       ` Christoph Hellwig
2022-04-11 12:23         ` Greg KH
2022-04-11 12:29           ` Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.