From: David Sterba <dsterba@suse.cz>
To: fdmanana@kernel.org
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] Btrfs: fix race leading to fs corruption after transaction abortion
Date: Fri, 26 Jul 2019 16:19:23 +0200
Message-ID: <20190726141922.GC2868@twin.jikos.cz>
In-Reply-To: <20190725102704.11404-1-fdmanana@kernel.org>

On Thu, Jul 25, 2019 at 11:27:04AM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> When one transaction is finishing its commit, it is possible for another
> transaction to start and enter its initial commit phase as well. If the
> first ends up getting aborted, we have a small time window where the second
> transaction commit does not notice that the previous transaction aborted
> and ends up committing, writing a superblock that points to btrees that
> reference extent buffers (nodes and leafs) that were not persisted to disk.
> The consequence is that after mounting the filesystem again, we will be
> unable to load some btree nodes/leafs, because the content on disk is
> either garbage (or just zeroes) or corresponds to the old content of a
> previously COWed or deleted node/leaf, resulting in the well known error
> messages "parent transid verify failed on ...".
>
> The following sequence diagram illustrates how this can happen.
>
>         CPU 1                                     CPU 2
>
>  <at transaction N>
>
>  btrfs_commit_transaction()
>    (...)
>    --> sets transaction state to
>        TRANS_STATE_UNBLOCKED
>    --> sets fs_info->running_transaction
>        to NULL
>
>                                            (...)
>                                            btrfs_start_transaction()
>                                              start_transaction()
>                                                wait_current_trans()
>                                                  --> returns immediately
>                                                      because
>                                                      fs_info->running_transaction
>                                                      is NULL
>                                                join_transaction()
>                                                  --> creates transaction N + 1
>                                                  --> sets
>                                                      fs_info->running_transaction
>                                                      to transaction N + 1
>                                                  --> adds transaction N + 1 to
>                                                      the fs_info->trans_list list
>                                                  --> returns transaction handle
>                                                      pointing to the new
>                                                      transaction N + 1
>                                            (...)
>
>                                            btrfs_sync_file()
>                                              btrfs_start_transaction()
>                                                --> returns handle to
>                                                    transaction N + 1
>                                              (...)
>
>    btrfs_write_and_wait_transaction()
>      --> writeback of some extent
>          buffer fails, returns an
>          error
>    btrfs_handle_fs_error()
>      --> sets BTRFS_FS_STATE_ERROR in
>          fs_info->fs_state
>    --> jumps to label "scrub_continue"
>    cleanup_transaction()
>      btrfs_abort_transaction(N)
>        --> sets BTRFS_FS_STATE_TRANS_ABORTED
>            flag in fs_info->fs_state
>        --> sets aborted field in the
>            transaction and transaction
>            handle structures, for
>            transaction N only
>      --> removes transaction from the
>          list fs_info->trans_list
>
>                                              btrfs_commit_transaction(N + 1)
>                                                --> transaction N + 1 was not
>                                                    aborted, so it proceeds
>                                                (...)
>                                                --> sets the transaction's state
>                                                    to TRANS_STATE_COMMIT_START
>                                                --> does not find the previous
>                                                    transaction (N) in the
>                                                    fs_info->trans_list, so it
>                                                    doesn't know that transaction
>                                                    was aborted, and the commit
>                                                    of transaction N + 1 proceeds
>                                                (...)
>                                                --> sets transaction N + 1 state
>                                                    to TRANS_STATE_UNBLOCKED
>                                                btrfs_write_and_wait_transaction()
>                                                  --> succeeds writing all extent
>                                                      buffers created in the
>                                                      transaction N + 1
>                                                write_all_supers()
>                                                  --> succeeds
>                                                  --> we now have a superblock on
>                                                      disk that points to trees
>                                                      that refer to at least one
>                                                      extent buffer that was
>                                                      never persisted
>
> So fix this by updating the transaction commit path to check whether the
> flag BTRFS_FS_STATE_TRANS_ABORTED is set in fs_info->fs_state when, after
> setting the transaction state to TRANS_STATE_COMMIT_START, we do not find
> any previous transaction in fs_info->trans_list. If the flag is set, just
> fail the transaction commit with -EROFS, as we do in other places. The
> exact error code for the previous transaction abort was already logged
> and reported.
>
> Fixes: 49b25e0540904b ("btrfs: enhance transaction abort infrastructure")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: David Sterba <dsterba@suse.com>

Queued for 5.3, thanks.
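
For readers following the thread, the check described in the quoted commit
message can be sketched as a small userspace model. This is not the kernel
patch itself: struct model_trans, struct model_fs_info, model_abort_and_remove()
and model_commit_start() are hypothetical names invented for illustration, the
"prev" pointer stands in for the fs_info->trans_list lookup, and a plain bool
stands in for the BTRFS_FS_STATE_TRANS_ABORTED bit. Locking and waiting for
the previous commit are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a transaction; not the real btrfs structure. */
struct model_trans {
    int aborted;               /* 0, or the negative errno of the abort */
    struct model_trans *prev;  /* older transaction still on the list, or NULL */
};

/* Hypothetical stand-in for fs_info; not the real btrfs structure. */
struct model_fs_info {
    bool trans_aborted;        /* models the BTRFS_FS_STATE_TRANS_ABORTED bit */
};

/*
 * Models the abort steps from the diagram: set the filesystem-wide flag,
 * mark only the aborted transaction, then drop it from the list so that
 * the next transaction no longer sees it.
 */
static void model_abort_and_remove(struct model_fs_info *fs,
                                   struct model_trans *t,
                                   struct model_trans *next, int err)
{
    fs->trans_aborted = true;
    t->aborted = err;
    if (next != NULL && next->prev == t)
        next->prev = NULL;
}

/*
 * Models the decision taken once the commit has entered its start state.
 * With check_fs_flag == false this is the racy behaviour: an empty "prev"
 * is taken to mean nothing went wrong. With check_fs_flag == true the
 * filesystem-wide flag is consulted as well and the commit fails with
 * -EROFS.
 */
static int model_commit_start(const struct model_fs_info *fs,
                              const struct model_trans *cur,
                              bool check_fs_flag)
{
    assert(fs != NULL && cur != NULL);

    if (cur->aborted)
        return cur->aborted;

    if (cur->prev != NULL)
        /* Previous transaction still listed: its abort state is visible. */
        return cur->prev->aborted;

    /* Previous transaction is gone from the list; only the flag remains. */
    if (check_fs_flag && fs->trans_aborted)
        return -EROFS;

    return 0;
}
```

In this model, aborting transaction N with -EIO and then starting the commit
of N + 1 returns 0 when the flag is ignored, which corresponds to the
superblock being written, and -EROFS when the flag is checked.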