linux-xfs.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Subject: [PATCH 0/9 v3] xfs: shutdown is a racy mess
Date: Tue, 10 Aug 2021 15:18:16 +1000
Message-ID: <20210810051825.40715-1-david@fromorbit.com>

With the recent log problems we've uncovered, it's clear that the
way we shut down filesystems and the log is a chaotic mess. We can
have multiple filesystem shutdowns in progress at once, all
competing to run shutdown processing and emit log messages saying
the filesystem has been shut down and why. Further, shutdown
changes the log state and runs log IO completion callbacks without
any co-ordination with ongoing log operations.

This results in shutdowns running unpredictably, running multiple
times, racing with the iclog state machine transitions and exposing
us to use-after-free situations and unexpected state changes within
the log itself.

This patch series tries to address the chaotic nature of shutdowns
by making shutdown execution consistent and predictable. This is
achieved by:

- making the mount shutdown state transition atomic and not
  dependent on log state.
- making operational log state transitions atomic
- making the log shutdown check be based entirely on the operational
  XLOG_IO_ERROR log state rather than a combination of log flags and
  iclog XLOG_STATE_IOERROR checks.
- getting rid of XLOG_STATE_IOERROR so that shutdown doesn't perturb
  iclog state in the middle of operations that are expecting iclogs
  to be in specific state(s).
- not processing iclogs that are actively referenced during shutdown.
  This avoids use-after-free situations where shutdown runs
  callbacks and frees objects that own the reference to the iclog
  and are still in use by the iclog reference owner.
- running shutdown processing when the last active reference to an
  iclog goes away. This guarantees that shutdown processing occurs on
  all iclogs, but it only occurs when it is safe to do so.
- acknowledging that log state is not consistent once shutdown has
  been entered, and so not trying to apply consistency checking during
  a shutdown...
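
As a rough illustration of the "atomic state transition" idea in the list
above, here is a minimal userspace sketch using C11 atomics. It is not the
code from the patches: struct demo_mount, DEMO_OPSTATE_SHUTDOWN and both
functions are hypothetical names made up for this example. The only point
it makes is that an atomic test-and-set on a state bit lets exactly one
caller win the race and run shutdown processing.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical operational-state bit; not the kernel's definition. */
#define DEMO_OPSTATE_SHUTDOWN (1u << 0)

struct demo_mount {
	atomic_uint opstate;	/* operational state flags */
	int shutdown_runs;	/* times shutdown processing has run */
};

/*
 * Atomically set the shutdown bit. Only the caller that observes the
 * bit as previously clear runs shutdown processing and emits the
 * message; every later or racing caller returns false immediately.
 */
static bool demo_force_shutdown(struct demo_mount *mp, const char *why)
{
	unsigned int old;

	old = atomic_fetch_or(&mp->opstate, DEMO_OPSTATE_SHUTDOWN);
	if (old & DEMO_OPSTATE_SHUTDOWN)
		return false;	/* somebody else already shut us down */

	mp->shutdown_runs++;	/* only the single winner gets here */
	fprintf(stderr, "demo: filesystem shut down: %s\n", why);
	return true;
}

/* The shutdown check depends on one state bit and nothing else. */
static bool demo_is_shutdown(struct demo_mount *mp)
{
	return (atomic_load(&mp->opstate) & DEMO_OPSTATE_SHUTDOWN) != 0;
}
```

With a zero-initialised struct demo_mount, the first call to
demo_force_shutdown() returns true and any further call returns false, so
the shutdown message is emitted once no matter how many triggers fire.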

At the end of this patch series, shutdown runs once and once only at
the first trigger, iclog state is not modified by shutdown, and
iclog callbacks and wakeups are not processed until all active
references to the iclog(s) are dropped. Hence we now have
deterministic shutdown behaviour for both the mount and the log and
a consistent iclog lifecycle framework that we can build more
complex functionality on top of safely.
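
The "callbacks are not processed until all active references are dropped"
behaviour described above can be sketched in the same spirit. Again, this is
a simplified single-threaded model and not the code from the series: struct
demo_iclog and the demo_* functions are hypothetical, and the sketch assumes
the caller holds whatever lock serialises reference counting against
shutdown, so no locking is shown.

```c
#include <stdbool.h>

struct demo_iclog {
	int refcount;		/* active references to this iclog */
	bool log_shutdown;	/* stand-in for the log-wide shutdown state */
	int callbacks_run;	/* times shutdown callbacks have run */
};

static void demo_run_shutdown_callbacks(struct demo_iclog *ic)
{
	ic->callbacks_run++;
}

/*
 * Shutdown marks the log as shut down but leaves referenced iclogs
 * alone; whoever holds a reference may still be using objects that
 * the callbacks would free.
 */
static void demo_log_shutdown(struct demo_iclog *ic)
{
	ic->log_shutdown = true;
	if (ic->refcount == 0)
		demo_run_shutdown_callbacks(ic);
}

/*
 * Dropping the last reference is the point at which it becomes safe
 * to run the deferred shutdown processing.
 */
static void demo_iclog_release(struct demo_iclog *ic)
{
	if (--ic->refcount > 0)
		return;
	if (ic->log_shutdown)
		demo_run_shutdown_callbacks(ic);
}
```

In this model an iclog with two references sees no callbacks at shutdown or
after the first release, and exactly one callback pass after the second
release. An unreferenced iclog is processed directly by shutdown.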

Version 3:
- rebase on 5.14-rc4 + for-next @ 130916145229
- fixed typos in commit messages

Version 2:
- https://lore.kernel.org/linux-xfs/20210714031958.2614411-1-david@fromorbit.com/
- rebase on 5.14-rc1
- added comment about XFS_FORCED_SHUTDOWN -> xlog_is_shutdown in commit message.
- fix spurious semi-colon at end of for loop.
- fixed typos in commit messages
- undid the do {} while -> for {} conversion in xlog_state_do_callbacks()
- removed spurious blank lines in xfs_do_force_shutdown()
- added comment to commit description explaining the unconditional stack dump on
  shutdown if the error level is high enough.
- added comment about iclog IO completion avoiding shutdown races with
  referenced iclogs that haven't yet been submitted to commit description.
- cleaned up xlog_state_release_iclog() structure for better readability.
- cleaned up xlog_space_left() structure for better readability.

Version 1:
- https://lore.kernel.org/linux-xfs/20210630063813.1751007-1-david@fromorbit.com/


Thread overview: 10+ messages
2021-08-10  5:18 Dave Chinner [this message]
2021-08-10  5:18 ` [PATCH 1/9] xfs: convert XLOG_FORCED_SHUTDOWN() to xlog_is_shutdown() Dave Chinner
2021-08-10  5:18 ` [PATCH 2/9] xfs: XLOG_STATE_IOERROR must die Dave Chinner
2021-08-10  5:18 ` [PATCH 3/9] xfs: move recovery needed state updates to xfs_log_mount_finish Dave Chinner
2021-08-10  5:18 ` [PATCH 4/9] xfs: convert log flags to an operational state field Dave Chinner
2021-08-10  5:18 ` [PATCH 5/9] xfs: make forced shutdown processing atomic Dave Chinner
2021-08-10  5:18 ` [PATCH 6/9] xfs: rework xlog_state_do_callback() Dave Chinner
2021-08-10  5:18 ` [PATCH 7/9] xfs: separate out log shutdown callback processing Dave Chinner
2021-08-10  5:18 ` [PATCH 8/9] xfs: don't run shutdown callbacks on active iclogs Dave Chinner
2021-08-10  5:18 ` [PATCH 9/9] xfs: log head and tail aren't reliable during shutdown Dave Chinner
