All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Rework fs error handling
@ 2021-05-20 15:21 Josef Bacik
  2021-05-20 15:21 ` [PATCH 1/3] btrfs: always abort the transaction if we abort a trans handle Josef Bacik
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Josef Bacik @ 2021-05-20 15:21 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Hello,

This series brings together 3 patches I had sent separately, because they mostly
depend on eachother.  The first is

	btrfs: always abort the transaction if we abort a trans handle

Which changes our behavior for trans handles that haven't modified any metadata.
Our dependency chain can be complex, and thus we may rely on a particular async
action happening and rely on the transaction actually aborting if it fails.
Previously if nothing occurred we'd just let the transaction conitue, but this
could put us in a bad state.  We need to simply always abort the transaction for
safety reasons.

That patch allows us to use the next patch

	btrfs: add a btrfs_has_fs_error helper

Now that we can rely on FS_STATE_ERROR always being set if something is wrong,
we can wrap that check in a helper and convert all the open coded checks for
FS_STATE_ERROR and FS_STATE_ABORTED to the new helper.

And finally an actual fix

	btrfs: do not infinite loop in data reclaim if we aborted

The async data path could infinite loop if we aborted because it only broke in
the case of space_info->full, which would never happen if we weren't full and
had aborted the transaction.  This is fine stand alone, but I took advantage of
the helpers here so I want to make sure the fix goes along with its
dependencies.  Thanks,

Josef

Josef Bacik (3):
  btrfs: always abort the transaction if we abort a trans handle
  btrfs: add a btrfs_has_fs_error helper
  btrfs: do not infinite loop in data reclaim if we aborted

 fs/btrfs/ctree.c       |  5 +----
 fs/btrfs/ctree.h       |  5 +++++
 fs/btrfs/disk-io.c     |  8 +++-----
 fs/btrfs/extent-tree.c |  1 -
 fs/btrfs/extent_io.c   |  2 +-
 fs/btrfs/file.c        |  2 +-
 fs/btrfs/inode.c       |  6 +++---
 fs/btrfs/scrub.c       |  2 +-
 fs/btrfs/space-info.c  | 29 ++++++++++++++++++++++++-----
 fs/btrfs/super.c       | 13 +------------
 fs/btrfs/transaction.c | 19 +++++--------------
 fs/btrfs/transaction.h |  1 -
 fs/btrfs/tree-log.c    |  2 +-
 13 files changed, 46 insertions(+), 49 deletions(-)

-- 
2.26.3


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] btrfs: always abort the transaction if we abort a trans handle
  2021-05-20 15:21 [PATCH 0/3] Rework fs error handling Josef Bacik
@ 2021-05-20 15:21 ` Josef Bacik
  2021-05-25 15:27   ` David Sterba
  2021-05-20 15:21 ` [PATCH 2/3] btrfs: add a btrfs_has_fs_error helper Josef Bacik
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2021-05-20 15:21 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While stress testing our error handling I noticed that sometimes we
would still commit the transaction even though we had aborted the
transaction.

Currently we track if a trans handle has dirtied any metadata, and if it
hasn't we mark the FS as having an error (so no new transactions can be
started), but we will allow the current transaction to complete as we do
not mark the transaction itself as having been aborted.

This sounds good in theory, but we were not properly tracking IO errors
in btrfs_finish_ordered_io, and thus committing the transaction with
bogus free space data.  This isn't necessarily a problem per-se with the
free space cache, as the other guards in place would have kept us from
accepting the free space cache as valid, but hi-lights a real world case
where we had a bug and could have corrupted the filesystem because of
it.

This "skip abort on empty trans handle" is nice in theory, but assumes
we have perfect error handling everywhere, which we clearly do not.
Also we do not allow further transactions to be started, so all this
does is save the last transaction that was happening, which doesn't
necessarily gain us anything other than the potential for real
corruption.

Remove this particular bit of code, if we decide we need to abort the
transaction then abort the current one and keep us from doing real harm
to the file system, regardless of whether this specific trans handle
dirtied anything or not.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.c       |  5 +----
 fs/btrfs/extent-tree.c |  1 -
 fs/btrfs/super.c       | 11 -----------
 fs/btrfs/transaction.c |  8 --------
 fs/btrfs/transaction.h |  1 -
 5 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index a484fb72a01f..4bc3ca2cbd7d 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -596,7 +596,6 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
 		       trans->transid, fs_info->generation);
 
 	if (!should_cow_block(trans, root, buf)) {
-		trans->dirty = true;
 		*cow_ret = buf;
 		return 0;
 	}
@@ -1788,10 +1787,8 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 			 * then we don't want to set the path blocking,
 			 * so we test it here
 			 */
-			if (!should_cow_block(trans, root, b)) {
-				trans->dirty = true;
+			if (!should_cow_block(trans, root, b))
 				goto cow_done;
-			}
 
 			/*
 			 * must have write locks on this node and the
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 24b5e54935a9..790de24576ac 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4784,7 +4784,6 @@ btrfs_init_new_buffer(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 		set_extent_dirty(&trans->transaction->dirty_pages, buf->start,
 			 buf->start + buf->len - 1, GFP_NOFS);
 	}
-	trans->dirty = true;
 	/* this returns a buffer locked for blocking */
 	return buf;
 }
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 4a396c1147f1..bc613218c8c5 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -299,17 +299,6 @@ void __btrfs_abort_transaction(struct btrfs_trans_handle *trans,
 	struct btrfs_fs_info *fs_info = trans->fs_info;
 
 	WRITE_ONCE(trans->aborted, errno);
-	/* Nothing used. The other threads that have joined this
-	 * transaction may be able to continue. */
-	if (!trans->dirty && list_empty(&trans->new_bgs)) {
-		const char *errstr;
-
-		errstr = btrfs_decode_error(errno);
-		btrfs_warn(fs_info,
-		           "%s:%d: Aborting unused transaction(%s).",
-		           function, line, errstr);
-		return;
-	}
 	WRITE_ONCE(trans->transaction->aborted, errno);
 	/* Wake up anybody who may be waiting on this transaction */
 	wake_up(&fs_info->transaction_wait);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index f75de9f6c0ad..e0a82aa7da89 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -2074,14 +2074,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 
 	ASSERT(refcount_read(&trans->use_count) == 1);
 
-	/*
-	 * Some places just start a transaction to commit it.  We need to make
-	 * sure that if this commit fails that the abort code actually marks the
-	 * transaction as failed, so set trans->dirty to make the abort code do
-	 * the right thing.
-	 */
-	trans->dirty = true;
-
 	/* Stop the commit early if ->aborted is set */
 	if (TRANS_ABORTED(cur_trans)) {
 		ret = cur_trans->aborted;
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 364cfbb4c5c5..c49e2266b28b 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -143,7 +143,6 @@ struct btrfs_trans_handle {
 	bool allocating_chunk;
 	bool can_flush_pending_bgs;
 	bool reloc_reserved;
-	bool dirty;
 	bool in_fsync;
 	struct btrfs_root *root;
 	struct btrfs_fs_info *fs_info;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] btrfs: add a btrfs_has_fs_error helper
  2021-05-20 15:21 [PATCH 0/3] Rework fs error handling Josef Bacik
  2021-05-20 15:21 ` [PATCH 1/3] btrfs: always abort the transaction if we abort a trans handle Josef Bacik
@ 2021-05-20 15:21 ` Josef Bacik
  2021-05-25 15:34   ` David Sterba
  2021-05-20 15:21 ` [PATCH 3/3] btrfs: do not infinite loop in data reclaim if we aborted Josef Bacik
  2021-05-25 15:39 ` [PATCH 0/3] Rework fs error handling David Sterba
  3 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2021-05-20 15:21 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We have a few flags that are inconsistently used to describe the fs in
different states of failure.  As of

	btrfs: always abort the transaction if we abort a trans handle

we will always set BTRFS_FS_STATE_ERROR if we abort, so we don't have to
check both ABORTED and ERROR to see if things have gone wrong.  Add a
helper to check BTRFS_FS_STATE_HELPER and then convert all checkers of
FS_STATE_ERROR to use the helper.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h       |  5 +++++
 fs/btrfs/disk-io.c     |  8 +++-----
 fs/btrfs/extent_io.c   |  2 +-
 fs/btrfs/file.c        |  2 +-
 fs/btrfs/inode.c       |  6 +++---
 fs/btrfs/scrub.c       |  2 +-
 fs/btrfs/super.c       |  2 +-
 fs/btrfs/transaction.c | 11 +++++------
 fs/btrfs/tree-log.c    |  2 +-
 9 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 938d8ebf4cf3..3c22c3308667 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3531,6 +3531,11 @@ do {								\
 			  (errno), fmt, ##args);		\
 } while (0)
 
+static inline bool btrfs_has_fs_error(struct btrfs_fs_info *fs_info)
+{
+	return test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state);
+}
+
 __printf(5, 6)
 __cold
 void __btrfs_panic(struct btrfs_fs_info *fs_info, const char *function,
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8c3db9076988..ab1d0b9f90e7 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1969,8 +1969,7 @@ static int transaction_kthread(void *arg)
 		wake_up_process(fs_info->cleaner_kthread);
 		mutex_unlock(&fs_info->transaction_kthread_mutex);
 
-		if (unlikely(test_bit(BTRFS_FS_STATE_ERROR,
-				      &fs_info->fs_state)))
+		if (unlikely(btrfs_has_fs_error(fs_info)))
 			btrfs_cleanup_transaction(fs_info);
 		if (!kthread_should_stop() &&
 				(!btrfs_transaction_blocked(fs_info) ||
@@ -4221,7 +4220,7 @@ void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
 		drop_ref = true;
 	spin_unlock(&fs_info->fs_roots_radix_lock);
 
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
+	if (btrfs_has_fs_error(fs_info)) {
 		ASSERT(root->log_root == NULL);
 		if (root->reloc_root) {
 			btrfs_put_root(root->reloc_root);
@@ -4372,8 +4371,7 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
 			btrfs_err(fs_info, "commit super ret %d", ret);
 	}
 
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state) ||
-	    test_bit(BTRFS_FS_STATE_TRANS_ABORTED, &fs_info->fs_state))
+	if (btrfs_has_fs_error(fs_info))
 		btrfs_error_commit_super(fs_info);
 
 	kthread_stop(fs_info->transaction_kthread);
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 78d3f2ec90e0..c89871eabef8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4613,7 +4613,7 @@ int btree_write_cache_pages(struct address_space *mapping,
 	 *   extent io tree. Thus we don't want to submit such wild eb
 	 *   if the fs already has error.
 	 */
-	if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
+	if (!btrfs_has_fs_error(fs_info)) {
 		ret = flush_write_bio(&epd);
 	} else {
 		ret = -EROFS;
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 3b10d98b4ebb..3a68d55c1870 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1999,7 +1999,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	 * have opened a file as writable, we have to stop this write operation
 	 * to ensure consistency.
 	 */
-	if (test_bit(BTRFS_FS_STATE_ERROR, &inode->root->fs_info->fs_state))
+	if (btrfs_has_fs_error(inode->root->fs_info))
 		return -EROFS;
 
 	if (!(iocb->ki_flags & IOCB_DIRECT) &&
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ca835ee61045..351b28366aa1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4221,7 +4221,7 @@ static void btrfs_prune_dentries(struct btrfs_root *root)
 	struct inode *inode;
 	u64 objectid = 0;
 
-	if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
+	if (!btrfs_has_fs_error(fs_info))
 		WARN_ON(btrfs_root_refs(&root->root_item) != 0);
 
 	spin_lock(&root->inode_lock);
@@ -9703,7 +9703,7 @@ int btrfs_start_delalloc_snapshot(struct btrfs_root *root, bool in_reclaim_conte
 	};
 	struct btrfs_fs_info *fs_info = root->fs_info;
 
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
+	if (btrfs_has_fs_error(fs_info))
 		return -EROFS;
 
 	return start_delalloc_inodes(root, &wbc, true, in_reclaim_context);
@@ -9722,7 +9722,7 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, long nr,
 	struct list_head splice;
 	int ret;
 
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
+	if (btrfs_has_fs_error(fs_info))
 		return -EROFS;
 
 	INIT_LIST_HEAD(&splice);
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 485cda3eb8d7..4db4286069b0 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3893,7 +3893,7 @@ static noinline_for_stack int scrub_supers(struct scrub_ctx *sctx,
 	int	ret;
 	struct btrfs_fs_info *fs_info = sctx->fs_info;
 
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
+	if (btrfs_has_fs_error(fs_info))
 		return -EROFS;
 
 	/* Seed devices of a new filesystem has their own generation. */
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index bc613218c8c5..eb6a48f0b236 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2018,7 +2018,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		if (ret)
 			goto restore;
 	} else {
-		if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
+		if (btrfs_has_fs_error(fs_info)) {
 			btrfs_err(fs_info,
 				"Remounting read-write after error is not allowed");
 			ret = -EINVAL;
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index e0a82aa7da89..f74ba158c0c3 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -285,7 +285,7 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info,
 	spin_lock(&fs_info->trans_lock);
 loop:
 	/* The file system has been taken offline. No new transactions. */
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
+	if (btrfs_has_fs_error(fs_info)) {
 		spin_unlock(&fs_info->trans_lock);
 		return -EROFS;
 	}
@@ -333,7 +333,7 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info,
 		 */
 		kfree(cur_trans);
 		goto loop;
-	} else if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
+	} else if (btrfs_has_fs_error(fs_info)) {
 		spin_unlock(&fs_info->trans_lock);
 		kfree(cur_trans);
 		return -EROFS;
@@ -586,7 +586,7 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
 	/* Send isn't supposed to start transactions. */
 	ASSERT(current->journal_info != BTRFS_SEND_TRANS_STUB);
 
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
+	if (btrfs_has_fs_error(fs_info))
 		return ERR_PTR(-EROFS);
 
 	if (current->journal_info) {
@@ -999,8 +999,7 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
 	if (throttle)
 		btrfs_run_delayed_iputs(info);
 
-	if (TRANS_ABORTED(trans) ||
-	    test_bit(BTRFS_FS_STATE_ERROR, &info->fs_state)) {
+	if (TRANS_ABORTED(trans) || btrfs_has_fs_error(info)) {
 		wake_up_process(info->transaction_kthread);
 		if (TRANS_ABORTED(trans))
 			err = trans->aborted;
@@ -2187,7 +2186,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 		 * abort to prevent writing a new superblock that reflects a
 		 * corrupt state (pointing to trees with unwritten nodes/leafs).
 		 */
-		if (test_bit(BTRFS_FS_STATE_TRANS_ABORTED, &fs_info->fs_state)) {
+		if (btrfs_has_fs_error(fs_info)) {
 			ret = -EROFS;
 			goto cleanup_transaction;
 		}
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 16eca7d091fd..83d0b796d5d4 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3310,7 +3310,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
 	 * here would result in transid mismatches.  If there is an error here
 	 * just bail.
 	 */
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
+	if (btrfs_has_fs_error(fs_info)) {
 		ret = -EIO;
 		btrfs_set_log_full_commit(trans);
 		btrfs_abort_transaction(trans, ret);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] btrfs: do not infinite loop in data reclaim if we aborted
  2021-05-20 15:21 [PATCH 0/3] Rework fs error handling Josef Bacik
  2021-05-20 15:21 ` [PATCH 1/3] btrfs: always abort the transaction if we abort a trans handle Josef Bacik
  2021-05-20 15:21 ` [PATCH 2/3] btrfs: add a btrfs_has_fs_error helper Josef Bacik
@ 2021-05-20 15:21 ` Josef Bacik
  2021-05-25 15:39 ` [PATCH 0/3] Rework fs error handling David Sterba
  3 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2021-05-20 15:21 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Error injection stressing uncovered a busy loop in our data reclaim
loop.  There are two cases here, one where we loop creating block groups
until space_info->full is set, or in the main loop we will skip erroring
out any tickets if space_info->full == 0.  Unfortunately if we aborted
the transaction then we will never allocate chunks or reclaim any space
and thus never get ->full, and you'll see stack traces like this

watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [kworker/u4:4:139]
CPU: 0 PID: 139 Comm: kworker/u4:4 Tainted: G        W         5.13.0-rc1+ #328
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Workqueue: events_unbound btrfs_async_reclaim_data_space
RIP: 0010:btrfs_join_transaction+0x12/0x20
RSP: 0018:ffffb2b780b77de0 EFLAGS: 00000246
RAX: ffffb2b781863d58 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000801 RSI: ffff987952b57400 RDI: ffff987940aa3000
RBP: ffff987954d55000 R08: 0000000000000001 R09: ffff98795539e8f0
R10: 000000000000000f R11: 000000000000000f R12: ffffffffffffffff
R13: ffff987952b574c8 R14: ffff987952b57400 R15: 0000000000000008
FS:  0000000000000000(0000) GS:ffff9879bbc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0703da4000 CR3: 0000000113398004 CR4: 0000000000370ef0
Call Trace:
 flush_space+0x4a8/0x660
 btrfs_async_reclaim_data_space+0x55/0x130
 process_one_work+0x1e9/0x380
 worker_thread+0x53/0x3e0
 ? process_one_work+0x380/0x380
 kthread+0x118/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x1f/0x30

Fix this by checking to see if we have a btrfs fs error in either of the
reclaim loops, and if so fail the tickets and bail.  In addition to
this, fix maybe_fail_all_tickets() to not try to grant tickets if we've
aborted, simply fail everything.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 42d0fa2092d4..077e54cdc29f 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -941,6 +941,7 @@ static bool maybe_fail_all_tickets(struct btrfs_fs_info *fs_info,
 	struct reserve_ticket *ticket;
 	u64 tickets_id = space_info->tickets_id;
 	u64 first_ticket_bytes = 0;
+	const bool aborted = btrfs_has_fs_error(fs_info);
 
 	if (btrfs_test_opt(fs_info, ENOSPC_DEBUG)) {
 		btrfs_info(fs_info, "cannot satisfy tickets, dumping space info");
@@ -952,7 +953,7 @@ static bool maybe_fail_all_tickets(struct btrfs_fs_info *fs_info,
 		ticket = list_first_entry(&space_info->tickets,
 					  struct reserve_ticket, list);
 
-		if (ticket->steal &&
+		if (!aborted && ticket->steal &&
 		    steal_from_global_rsv(fs_info, space_info, ticket))
 			return true;
 
@@ -968,15 +969,18 @@ static bool maybe_fail_all_tickets(struct btrfs_fs_info *fs_info,
 		 */
 		if (first_ticket_bytes == 0)
 			first_ticket_bytes = ticket->bytes;
-		else if (first_ticket_bytes > ticket->bytes)
+		else if (!aborted && first_ticket_bytes > ticket->bytes)
 			return true;
 
-		if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
+		if (!aborted && btrfs_test_opt(fs_info, ENOSPC_DEBUG))
 			btrfs_info(fs_info, "failing ticket with %llu bytes",
 				   ticket->bytes);
 
 		remove_ticket(space_info, ticket);
-		ticket->error = -ENOSPC;
+		if (aborted)
+			ticket->error = -EIO;
+		else
+			ticket->error = -ENOSPC;
 		wake_up(&ticket->wait);
 
 		/*
@@ -985,7 +989,8 @@ static bool maybe_fail_all_tickets(struct btrfs_fs_info *fs_info,
 		 * here to see if we can make progress with the next ticket in
 		 * the list.
 		 */
-		btrfs_try_granting_tickets(fs_info, space_info);
+		if (!aborted)
+			btrfs_try_granting_tickets(fs_info, space_info);
 	}
 	return (tickets_id != space_info->tickets_id);
 }
@@ -1253,6 +1258,10 @@ static void btrfs_async_reclaim_data_space(struct work_struct *work)
 			spin_unlock(&space_info->lock);
 			return;
 		}
+
+		/* Something happened, fail everything and bail. */
+		if (btrfs_has_fs_error(fs_info))
+			goto aborted_fs;
 		last_tickets_id = space_info->tickets_id;
 		spin_unlock(&space_info->lock);
 	}
@@ -1283,9 +1292,19 @@ static void btrfs_async_reclaim_data_space(struct work_struct *work)
 			} else {
 				flush_state = 0;
 			}
+
+			/* Something happened, fail everything and bail. */
+			if (btrfs_has_fs_error(fs_info))
+				goto aborted_fs;
+
 		}
 		spin_unlock(&space_info->lock);
 	}
+	return;
+aborted_fs:
+	maybe_fail_all_tickets(fs_info, space_info);
+	space_info->flush = 0;
+	spin_unlock(&space_info->lock);
 }
 
 void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info)
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] btrfs: always abort the transaction if we abort a trans handle
  2021-05-20 15:21 ` [PATCH 1/3] btrfs: always abort the transaction if we abort a trans handle Josef Bacik
@ 2021-05-25 15:27   ` David Sterba
  0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2021-05-25 15:27 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, May 20, 2021 at 11:21:31AM -0400, Josef Bacik wrote:
> While stress testing our error handling I noticed that sometimes we
> would still commit the transaction even though we had aborted the
> transaction.
> 
> Currently we track if a trans handle has dirtied any metadata, and if it
> hasn't we mark the FS as having an error (so no new transactions can be
> started), but we will allow the current transaction to complete as we do
> not mark the transaction itself as having been aborted.
> 
> This sounds good in theory, but we were not properly tracking IO errors
> in btrfs_finish_ordered_io, and thus committing the transaction with
> bogus free space data.  This isn't necessarily a problem per-se with the
> free space cache, as the other guards in place would have kept us from
> accepting the free space cache as valid, but hi-lights a real world case
> where we had a bug and could have corrupted the filesystem because of
> it.
> 
> This "skip abort on empty trans handle" is nice in theory, but assumes
> we have perfect error handling everywhere, which we clearly do not.
> Also we do not allow further transactions to be started, so all this
> does is save the last transaction that was happening, which doesn't
> necessarily gain us anything other than the potential for real
> corruption.
> 
> Remove this particular bit of code, if we decide we need to abort the
> transaction then abort the current one and keep us from doing real harm
> to the file system, regardless of whether this specific trans handle
> dirtied anything or not.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

I've checked logs what would be the effects of leaving out the message
printed by the unused transaction. Getting rid of it sounds like an
improvement as it's not adding any information, the first transaction is
noisy and that's where the problem happens. Additional messages about
the abort are confusing, so yeah. Besides, the updates to trans->dirty
lack any serialization so it's quite unreliable anyway.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] btrfs: add a btrfs_has_fs_error helper
  2021-05-20 15:21 ` [PATCH 2/3] btrfs: add a btrfs_has_fs_error helper Josef Bacik
@ 2021-05-25 15:34   ` David Sterba
  2021-09-27 17:52     ` Josef Bacik
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2021-05-25 15:34 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, May 20, 2021 at 11:21:32AM -0400, Josef Bacik wrote:
> We have a few flags that are inconsistently used to describe the fs in
> different states of failure.  As of
> 
> 	btrfs: always abort the transaction if we abort a trans handle
> 
> we will always set BTRFS_FS_STATE_ERROR if we abort, so we don't have to
> check both ABORTED and ERROR to see if things have gone wrong.  Add a
> helper to check BTRFS_FS_STATE_HELPER and then convert all checkers of
> FS_STATE_ERROR to use the helper.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/ctree.h       |  5 +++++
>  fs/btrfs/disk-io.c     |  8 +++-----
>  fs/btrfs/extent_io.c   |  2 +-
>  fs/btrfs/file.c        |  2 +-
>  fs/btrfs/inode.c       |  6 +++---
>  fs/btrfs/scrub.c       |  2 +-
>  fs/btrfs/super.c       |  2 +-
>  fs/btrfs/transaction.c | 11 +++++------
>  fs/btrfs/tree-log.c    |  2 +-
>  9 files changed, 21 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 938d8ebf4cf3..3c22c3308667 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3531,6 +3531,11 @@ do {								\
>  			  (errno), fmt, ##args);		\
>  } while (0)
>  
> +static inline bool btrfs_has_fs_error(struct btrfs_fs_info *fs_info)
> +{
> +	return test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state);

This could also get the unlikely() annotattion, similar to what
TRANS_ABORTED does. Which means this inline would have to be a macro eg.
FS_ERROR or similar.


> @@ -4372,8 +4371,7 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
>  			btrfs_err(fs_info, "commit super ret %d", ret);
>  	}
>  
> -	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state) ||
> -	    test_bit(BTRFS_FS_STATE_TRANS_ABORTED, &fs_info->fs_state))
> +	if (btrfs_has_fs_error(fs_info))
>  		btrfs_error_commit_super(fs_info);

This is not equivalent change, previously it also checks for
STATE_TRANS_ABORTED, this was added in af7227338135 ("Btrfs: clean up
resources during umount after trans is aborted") . So it should be kept
in place or there may be a bigger problem when the filesystem and
transaction error bits get out of sync, I haven't checked.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] Rework fs error handling
  2021-05-20 15:21 [PATCH 0/3] Rework fs error handling Josef Bacik
                   ` (2 preceding siblings ...)
  2021-05-20 15:21 ` [PATCH 3/3] btrfs: do not infinite loop in data reclaim if we aborted Josef Bacik
@ 2021-05-25 15:39 ` David Sterba
  3 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2021-05-25 15:39 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, May 20, 2021 at 11:21:30AM -0400, Josef Bacik wrote:
> Josef Bacik (3):
>   btrfs: always abort the transaction if we abort a trans handle

This one now added to misc-next.

>   btrfs: add a btrfs_has_fs_error helper
>   btrfs: do not infinite loop in data reclaim if we aborted

I can fix up the 2nd patch locally, it's basically only a rename but the
fs and transaction bits getting out of sync as mentioned could be a
problem (or not). Patch 3 depends on the 2nd so I'll keep both in
for-next for the time being.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] btrfs: add a btrfs_has_fs_error helper
  2021-05-25 15:34   ` David Sterba
@ 2021-09-27 17:52     ` Josef Bacik
  0 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2021-09-27 17:52 UTC (permalink / raw)
  To: dsterba, linux-btrfs, kernel-team

On 5/25/21 11:34 AM, David Sterba wrote:
> On Thu, May 20, 2021 at 11:21:32AM -0400, Josef Bacik wrote:
>> We have a few flags that are inconsistently used to describe the fs in
>> different states of failure.  As of
>>
>> 	btrfs: always abort the transaction if we abort a trans handle
>>
>> we will always set BTRFS_FS_STATE_ERROR if we abort, so we don't have to
>> check both ABORTED and ERROR to see if things have gone wrong.  Add a
>> helper to check BTRFS_FS_STATE_HELPER and then convert all checkers of
>> FS_STATE_ERROR to use the helper.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/ctree.h       |  5 +++++
>>   fs/btrfs/disk-io.c     |  8 +++-----
>>   fs/btrfs/extent_io.c   |  2 +-
>>   fs/btrfs/file.c        |  2 +-
>>   fs/btrfs/inode.c       |  6 +++---
>>   fs/btrfs/scrub.c       |  2 +-
>>   fs/btrfs/super.c       |  2 +-
>>   fs/btrfs/transaction.c | 11 +++++------
>>   fs/btrfs/tree-log.c    |  2 +-
>>   9 files changed, 21 insertions(+), 19 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 938d8ebf4cf3..3c22c3308667 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -3531,6 +3531,11 @@ do {								\
>>   			  (errno), fmt, ##args);		\
>>   } while (0)
>>   
>> +static inline bool btrfs_has_fs_error(struct btrfs_fs_info *fs_info)
>> +{
>> +	return test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state);
> 
> This could also get the unlikely() annotattion, similar to what
> TRANS_ABORTED does. Which means this inline would have to be a macro eg.
> FS_ERROR or similar.
> 
> 

Yup I'll do that.

>> @@ -4372,8 +4371,7 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
>>   			btrfs_err(fs_info, "commit super ret %d", ret);
>>   	}
>>   
>> -	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state) ||
>> -	    test_bit(BTRFS_FS_STATE_TRANS_ABORTED, &fs_info->fs_state))
>> +	if (btrfs_has_fs_error(fs_info))
>>   		btrfs_error_commit_super(fs_info);
> 
> This is not equivalent change, previously it also checks for
> STATE_TRANS_ABORTED, this was added in af7227338135 ("Btrfs: clean up
> resources during umount after trans is aborted") . So it should be kept
> in place or there may be a bigger problem when the filesystem and
> transaction error bits get out of sync, I haven't checked.
> 

We had this because sometimes an empty transaction would be aborted and we 
wouldn't set FS_STATE_ERROR.  With the patch

   btrfs: always abort the transaction if we abort a trans handle

we no longer have the case where we have TRANS_ABORTED but not FS_STATE_ERROR 
set.  Now TRANS_ABORTED simply keeps us from printing multiple transaction abort 
messages, while FS_STATE_ERROR tells us there's been a problem.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-09-27 17:52 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-20 15:21 [PATCH 0/3] Rework fs error handling Josef Bacik
2021-05-20 15:21 ` [PATCH 1/3] btrfs: always abort the transaction if we abort a trans handle Josef Bacik
2021-05-25 15:27   ` David Sterba
2021-05-20 15:21 ` [PATCH 2/3] btrfs: add a btrfs_has_fs_error helper Josef Bacik
2021-05-25 15:34   ` David Sterba
2021-09-27 17:52     ` Josef Bacik
2021-05-20 15:21 ` [PATCH 3/3] btrfs: do not infinite loop in data reclaim if we aborted Josef Bacik
2021-05-25 15:39 ` [PATCH 0/3] Rework fs error handling David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.