linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback
@ 2022-09-23 11:59 Qu Wenruo
  2022-09-23 11:59 ` [PATCH 1/3] btrfs-progs: remove unused function extent_io_tree_init_cache_max() Qu Wenruo
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Qu Wenruo @ 2022-09-23 11:59 UTC (permalink / raw)
  To: linux-btrfs

Christoph Anton Mitterer reported a crash if we try to call "btrfs check
--clear-space-cache v2" on a block device which is set read-only by
"blockdev --setro".

For such blockdevice, open() with O_RDWR won't report error immediately,
but only return error when we write to do any writes.

So what we can do is to enhance the error handling of metadata writeback
during transaction commit.

The first 2 patches are cleanups/fixes I exposed during the development.
The last one is the main disk for the fix.

Qu Wenruo (3):
  btrfs-progs: remove unused function extent_io_tree_init_cache_max()
  btrfs-progs: remove duplicated leakde extent buffer reporst
  btrfs-progs: properly handle write error when writing back tree blocks

 kernel-shared/extent_io.c   | 14 ++++++--------
 kernel-shared/extent_io.h   |  2 --
 kernel-shared/transaction.c | 33 +++++++++++++++++++++++++++++++--
 3 files changed, 37 insertions(+), 12 deletions(-)

-- 
2.37.3


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] btrfs-progs: remove unused function extent_io_tree_init_cache_max()
  2022-09-23 11:59 [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback Qu Wenruo
@ 2022-09-23 11:59 ` Qu Wenruo
  2022-09-23 11:59 ` [PATCH 2/3] btrfs-progs: remove duplicated leakde extent buffer reporst Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2022-09-23 11:59 UTC (permalink / raw)
  To: linux-btrfs

The function is introduced by commit a5ce5d219822 ("btrfs-progs:
extent-cache: actually cache extent buffers") but never got utilized.

Thus we can just remove it.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 kernel-shared/extent_io.c | 7 -------
 kernel-shared/extent_io.h | 2 --
 2 files changed, 9 deletions(-)

diff --git a/kernel-shared/extent_io.c b/kernel-shared/extent_io.c
index 48bcf2cf2f96..a34616c9e783 100644
--- a/kernel-shared/extent_io.c
+++ b/kernel-shared/extent_io.c
@@ -43,13 +43,6 @@ void extent_io_tree_init(struct extent_io_tree *tree)
 	tree->max_cache_size = (u64)total_memory() / 4;
 }
 
-void extent_io_tree_init_cache_max(struct extent_io_tree *tree,
-				   u64 max_cache_size)
-{
-	extent_io_tree_init(tree);
-	tree->max_cache_size = max_cache_size;
-}
-
 static struct extent_state *alloc_extent_state(void)
 {
 	struct extent_state *state;
diff --git a/kernel-shared/extent_io.h b/kernel-shared/extent_io.h
index 2148a8112428..ccdf768c1e5d 100644
--- a/kernel-shared/extent_io.h
+++ b/kernel-shared/extent_io.h
@@ -97,8 +97,6 @@ static inline void extent_buffer_get(struct extent_buffer *eb)
 }
 
 void extent_io_tree_init(struct extent_io_tree *tree);
-void extent_io_tree_init_cache_max(struct extent_io_tree *tree,
-				   u64 max_cache_size);
 void extent_io_tree_cleanup(struct extent_io_tree *tree);
 int set_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, int bits);
 int clear_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, int bits);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] btrfs-progs: remove duplicated leakde extent buffer reporst
  2022-09-23 11:59 [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback Qu Wenruo
  2022-09-23 11:59 ` [PATCH 1/3] btrfs-progs: remove unused function extent_io_tree_init_cache_max() Qu Wenruo
@ 2022-09-23 11:59 ` Qu Wenruo
  2022-09-23 11:59 ` [PATCH 3/3] btrfs-progs: properly handle write error when writing back tree blocks Qu Wenruo
  2022-09-27 15:13 ` [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback David Sterba
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2022-09-23 11:59 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
When transaction is aborted halfway, we can have extent buffer leaked,
and in that case, the same leaked extent buffer can be reported for
multiple times:

  ERROR: failed to clear free space cache v2: -1
  extent buffer leak: start 30441472 len 16384
  WARNING: dirty eb leak (aborted trans): start 30441472 len 16384
  extent buffer leak: start 30720000 len 16384
  extent buffer leak: start 30425088 len 16384
  extent buffer leak: start 30425088 len 16384 << Duplicated
  WARNING: dirty eb leak (aborted trans): start 30425088 len 16384

Note that 30425088 line is reported twice (not accounting the "dirty eb
leak" line).

[CAUSE]
When we detected a leaked eb, we call free_extent_buffer_nocache(), but
free_extent_buffer_nocache() can only remove the eb when its reduced
refs is 0.

If the eb has refs 2, it will need two free_extent_buffer_nocache()
calls to remove it from the cache.

[FIX]
Just reset the eb->refs to 1 so that free_extent_buffer_nocache() can
remove it from cache for sure.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 kernel-shared/extent_io.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel-shared/extent_io.c b/kernel-shared/extent_io.c
index a34616c9e783..f10acc3595c3 100644
--- a/kernel-shared/extent_io.c
+++ b/kernel-shared/extent_io.c
@@ -81,6 +81,11 @@ void extent_io_tree_cleanup(struct extent_io_tree *tree)
 	while(!list_empty(&tree->lru)) {
 		eb = list_entry(tree->lru.next, struct extent_buffer, lru);
 		if (eb->refs) {
+			/*
+			 * Reset extent buffer refs to 1, so the
+			 * free_extent_buffer_nocache() can free it for sure.
+			 */
+			eb->refs = 1;
 			fprintf(stderr,
 				"extent buffer leak: start %llu len %u\n",
 				(unsigned long long)eb->start, eb->len);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] btrfs-progs: properly handle write error when writing back tree blocks
  2022-09-23 11:59 [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback Qu Wenruo
  2022-09-23 11:59 ` [PATCH 1/3] btrfs-progs: remove unused function extent_io_tree_init_cache_max() Qu Wenruo
  2022-09-23 11:59 ` [PATCH 2/3] btrfs-progs: remove duplicated leakde extent buffer reporst Qu Wenruo
@ 2022-09-23 11:59 ` Qu Wenruo
  2022-09-27 15:13 ` [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback David Sterba
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2022-09-23 11:59 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Anton Mitterer

[BUG]
If we emulate a write error during commit transaction, by setting the
block device read-only, then we can easily have the following crash
using "btrfs check --clear-space-cache v2":

  Opening filesystem to check...
  Checking filesystem on /dev/test/scratch1
  UUID: 5945915b-37f1-4bfa-9f64-684b318b8f73
  Clear free space cache v2
  Error writing to device 1
  kernel-shared/transaction.c:156: __commit_transaction: BUG_ON `ret` triggered, value 1
  ./btrfs(+0x570c9)[0x562ec894f0c9]
  ./btrfs(+0x57167)[0x562ec894f167]
  ./btrfs(__commit_transaction+0x13b)[0x562ec894f7f2]
  ./btrfs(btrfs_commit_transaction+0x214)[0x562ec894fa64]
  ./btrfs(btrfs_clear_free_space_tree+0x177)[0x562ec8941ae6]
  ./btrfs(+0xc8958)[0x562ec89c0958]
  ./btrfs(+0xc9d53)[0x562ec89c1d53]
  ./btrfs(+0x17ec7)[0x562ec890fec7]
  ./btrfs(main+0x12f)[0x562ec8910908]
  /usr/lib/libc.so.6(+0x232d0)[0x7ff917ee82d0]
  /usr/lib/libc.so.6(__libc_start_main+0x8a)[0x7ff917ee838a]
  ./btrfs(_start+0x25)[0x562ec890fdc5]
  Aborted (core dumped)

[CAUSE]
The call trace has shown it's a BUG_ON(), and it's from
__commit_transaction(), which is writing tree blocks back.

[FIX]
The fix is pretty simple, just return error.

In fact we even have an error value check in btrfs_commit_transaction()
just after __commit_transaction() call (although not catching the return
value from it).

And since we're here, also call btrfs_abort_transaction() to prevent
newer transactions from being started.

Now we won't have a full crash:

  Opening filesystem to check...
  Checking filesystem on /dev/test/scratch1
  UUID: 5945915b-37f1-4bfa-9f64-684b318b8f73
  Clear free space cache v2
  Error writing to device 1
  ERROR: failed to write bytenr 30425088 length 16384: Operation not permitted
  ERROR: failed to write tree block 30425088: Operation not permitted
  ERROR: failed to clear free space cache v2: -1
  extent buffer leak: start 30720000 len 16384

Reported-by: Christoph Anton Mitterer <calestyo@scientia.org>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 kernel-shared/extent_io.c   |  2 +-
 kernel-shared/transaction.c | 33 +++++++++++++++++++++++++++++++--
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/kernel-shared/extent_io.c b/kernel-shared/extent_io.c
index f10acc3595c3..3875b8f61242 100644
--- a/kernel-shared/extent_io.c
+++ b/kernel-shared/extent_io.c
@@ -1011,7 +1011,7 @@ int write_data_to_disk(struct btrfs_fs_info *info, void *buf, u64 offset,
 				if (ret < 0) {
 					fprintf(stderr, "Error writing to "
 						"device %d\n", errno);
-					ret = errno;
+					ret = -errno;
 					kfree(multi);
 					return ret;
 				} else {
diff --git a/kernel-shared/transaction.c b/kernel-shared/transaction.c
index 8d74016e21d2..28b1684828ee 100644
--- a/kernel-shared/transaction.c
+++ b/kernel-shared/transaction.c
@@ -153,13 +153,41 @@ again:
 			eb = find_first_extent_buffer(tree, start);
 			BUG_ON(!eb || eb->start != start);
 			ret = write_tree_block(trans, fs_info, eb);
-			BUG_ON(ret);
+			if (ret < 0) {
+				free_extent_buffer(eb);
+				errno = -ret;
+				error("failed to write tree block %llu: %m",
+				      eb->start);
+				goto cleanup;
+			}
 			start += eb->len;
 			clear_extent_buffer_dirty(eb);
 			free_extent_buffer(eb);
 		}
 	}
 	return 0;
+cleanup:
+	/*
+	 * Mark all remaining dirty ebs clean, as they have no chance to be written
+	 * back anymore.
+	 */
+	while (1) {
+		int find_ret;
+
+		find_ret = find_first_extent_bit(tree, 0, &start, &end, EXTENT_DIRTY);
+
+		if (find_ret)
+			break;
+
+		while (start <= end) {
+			eb = find_first_extent_buffer(tree, start);
+			BUG_ON(!eb || eb->start != start);
+			start += eb->len;
+			clear_extent_buffer_dirty(eb);
+			free_extent_buffer(eb);
+		}
+	}
+	return ret;
 }
 
 int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
@@ -219,7 +247,7 @@ commit_tree:
 		if (ret < 0)
 			goto error;
 	}
-	__commit_transaction(trans, root);
+	ret = __commit_transaction(trans, root);
 	if (ret < 0)
 		goto error;
 
@@ -244,6 +272,7 @@ commit_tree:
 	}
 	return ret;
 error:
+	btrfs_abort_transaction(trans, ret);
 	btrfs_destroy_delayed_refs(trans);
 	free(trans);
 	return ret;
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback
  2022-09-23 11:59 [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback Qu Wenruo
                   ` (2 preceding siblings ...)
  2022-09-23 11:59 ` [PATCH 3/3] btrfs-progs: properly handle write error when writing back tree blocks Qu Wenruo
@ 2022-09-27 15:13 ` David Sterba
  3 siblings, 0 replies; 5+ messages in thread
From: David Sterba @ 2022-09-27 15:13 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Sep 23, 2022 at 07:59:43PM +0800, Qu Wenruo wrote:
> Christoph Anton Mitterer reported a crash if we try to call "btrfs check
> --clear-space-cache v2" on a block device which is set read-only by
> "blockdev --setro".
> 
> For such blockdevice, open() with O_RDWR won't report error immediately,
> but only return error when we write to do any writes.
> 
> So what we can do is to enhance the error handling of metadata writeback
> during transaction commit.
> 
> The first 2 patches are cleanups/fixes I exposed during the development.
> The last one is the main disk for the fix.
> 
> Qu Wenruo (3):
>   btrfs-progs: remove unused function extent_io_tree_init_cache_max()
>   btrfs-progs: remove duplicated leakde extent buffer reporst
>   btrfs-progs: properly handle write error when writing back tree blocks

Added to devel, thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-09-27 15:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-23 11:59 [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback Qu Wenruo
2022-09-23 11:59 ` [PATCH 1/3] btrfs-progs: remove unused function extent_io_tree_init_cache_max() Qu Wenruo
2022-09-23 11:59 ` [PATCH 2/3] btrfs-progs: remove duplicated leakde extent buffer reporst Qu Wenruo
2022-09-23 11:59 ` [PATCH 3/3] btrfs-progs: properly handle write error when writing back tree blocks Qu Wenruo
2022-09-27 15:13 ` [PATCH 0/3] btrfs-progs: enhance error handling for metadata writeback David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).