All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: do not write supers if we have an fs error
@ 2021-05-19 21:15 Josef Bacik
  2021-05-24 10:48 ` Filipe Manana
  2021-05-25 15:10 ` David Sterba
  0 siblings, 2 replies; 3+ messages in thread
From: Josef Bacik @ 2021-05-19 21:15 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: stable

Error injection testing uncovered a pretty severe problem where we could
end up committing a super that pointed to the wrong tree roots,
resulting in transid mismatch errors.

The way we commit the transaction is we update the super copy with the
current generations and bytenrs of the important roots, and then copy
that into our super_for_commit.  Then we allow transactions to continue
again, we write out the dirty pages for the transaction, and then we
write the super.  If the write out fails we'll bail and skip writing the
supers.

However since we've allowed a new transaction to start, we can have a
log attempting to sync at this point, which would be blocked on
fs_info->tree_log_mutex.  Once the commit fails we're allowed to do the
log tree commit, which uses super_for_commit, which now points at fs
tree's that were not written out.

Fix this by checking BTRFS_FS_STATE_ERROR once we acquire the
tree_log_mutex.  This way if the transaction commit fails we're sure to
see this bit set and we can skip writing the super out.  This patch
fixes this specific transid mismatch error I was seeing with this
particular error path.

cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/tree-log.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 0278f489e7d9..83e8f105869b 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3302,6 +3302,22 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
 	 *    begins and releases it only after writing its superblock.
 	 */
 	mutex_lock(&fs_info->tree_log_mutex);
+
+	/*
+	 * The transaction writeout phase could have failed, and thus marked the
+	 * fs in an error state.  We must not commit here, as we could have
+	 * updated our generation in the super_for_commit and writing the super
+	 * here would result in transid mismatches.  If there is an error here
+	 * just bail.
+	 */
+	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
+		ret = -EIO;
+		btrfs_set_log_full_commit(trans);
+		btrfs_abort_transaction(trans, ret);
+		mutex_unlock(&fs_info->tree_log_mutex);
+		goto out_wake_log_root;
+	}
+
 	btrfs_set_super_log_root(fs_info->super_for_commit, log_root_start);
 	btrfs_set_super_log_root_level(fs_info->super_for_commit, log_root_level);
 	ret = write_all_supers(fs_info, 1);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: do not write supers if we have an fs error
  2021-05-19 21:15 [PATCH] btrfs: do not write supers if we have an fs error Josef Bacik
@ 2021-05-24 10:48 ` Filipe Manana
  2021-05-25 15:10 ` David Sterba
  1 sibling, 0 replies; 3+ messages in thread
From: Filipe Manana @ 2021-05-24 10:48 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team, stable

On Wed, May 19, 2021 at 10:20 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> Error injection testing uncovered a pretty severe problem where we could
> end up committing a super that pointed to the wrong tree roots,
> resulting in transid mismatch errors.
>
> The way we commit the transaction is we update the super copy with the
> current generations and bytenrs of the important roots, and then copy
> that into our super_for_commit.  Then we allow transactions to continue
> again, we write out the dirty pages for the transaction, and then we
> write the super.  If the write out fails we'll bail and skip writing the
> supers.
>
> However since we've allowed a new transaction to start, we can have a
> log attempting to sync at this point, which would be blocked on
> fs_info->tree_log_mutex.  Once the commit fails we're allowed to do the
> log tree commit, which uses super_for_commit, which now points at fs
> tree's that were not written out.
>
> Fix this by checking BTRFS_FS_STATE_ERROR once we acquire the
> tree_log_mutex.  This way if the transaction commit fails we're sure to
> see this bit set and we can skip writing the super out.  This patch
> fixes this specific transid mismatch error I was seeing with this
> particular error path.
>
> cc: stable@vger.kernel.org
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/tree-log.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 0278f489e7d9..83e8f105869b 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -3302,6 +3302,22 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
>          *    begins and releases it only after writing its superblock.
>          */
>         mutex_lock(&fs_info->tree_log_mutex);
> +
> +       /*
> +        * The transaction writeout phase could have failed, and thus marked the

To be more clear, it would be better to say "The previous
transaction", as otherwise it will be
confusing when reading the comment.

Other than that, it looks good, great catch.

Reviewed-by: Filipe Manana <fdmanana@suse.com>

> +        * fs in an error state.  We must not commit here, as we could have
> +        * updated our generation in the super_for_commit and writing the super
> +        * here would result in transid mismatches.  If there is an error here
> +        * just bail.
> +        */
> +       if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
> +               ret = -EIO;
> +               btrfs_set_log_full_commit(trans);
> +               btrfs_abort_transaction(trans, ret);
> +               mutex_unlock(&fs_info->tree_log_mutex);
> +               goto out_wake_log_root;
> +       }
> +
>         btrfs_set_super_log_root(fs_info->super_for_commit, log_root_start);
>         btrfs_set_super_log_root_level(fs_info->super_for_commit, log_root_level);
>         ret = write_all_supers(fs_info, 1);
> --
> 2.26.3
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: do not write supers if we have an fs error
  2021-05-19 21:15 [PATCH] btrfs: do not write supers if we have an fs error Josef Bacik
  2021-05-24 10:48 ` Filipe Manana
@ 2021-05-25 15:10 ` David Sterba
  1 sibling, 0 replies; 3+ messages in thread
From: David Sterba @ 2021-05-25 15:10 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team, stable

On Wed, May 19, 2021 at 05:15:53PM -0400, Josef Bacik wrote:
> Error injection testing uncovered a pretty severe problem where we could
> end up committing a super that pointed to the wrong tree roots,
> resulting in transid mismatch errors.
> 
> The way we commit the transaction is we update the super copy with the
> current generations and bytenrs of the important roots, and then copy
> that into our super_for_commit.  Then we allow transactions to continue
> again, we write out the dirty pages for the transaction, and then we
> write the super.  If the write out fails we'll bail and skip writing the
> supers.
> 
> However since we've allowed a new transaction to start, we can have a
> log attempting to sync at this point, which would be blocked on
> fs_info->tree_log_mutex.  Once the commit fails we're allowed to do the
> log tree commit, which uses super_for_commit, which now points at fs
> tree's that were not written out.
> 
> Fix this by checking BTRFS_FS_STATE_ERROR once we acquire the
> tree_log_mutex.  This way if the transaction commit fails we're sure to
> see this bit set and we can skip writing the super out.  This patch
> fixes this specific transid mismatch error I was seeing with this
> particular error path.
> 
> cc: stable@vger.kernel.org
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Added to misc-next, with the suggested comment update. Thanks.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-05-25 15:13 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-19 21:15 [PATCH] btrfs: do not write supers if we have an fs error Josef Bacik
2021-05-24 10:48 ` Filipe Manana
2021-05-25 15:10 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.