* [PATCH 0/6 v2] xfs: more shutdown/recovery fixes
@ 2022-03-24  0:20 Dave Chinner
From: Dave Chinner @ 2022-03-24  0:20 UTC
  To: linux-xfs

Hi folks,

V2 of this patchset has blown out from 2 to 6 patches because
everyone has suddenly started reporting new problems with
shutdown/recovery behaviour. Patches 3-6 are new in this version of
the series.

Patch 3 addresses the shutdown log force wakeup failure Brian
reported here:

https://lore.kernel.org/linux-xfs/YjneHEoFRDXu+EcA@bfoster/

Patches 4-6 fix a long standing shutdown race where
xfs_trans_commit() can abort modified log items and leave them
unpinned and dirty in memory while the log is still running,
allowing unjournalled, incomplete changes to be written back to disk
before the log is shut down. This race condition has been around
for a long time - it looks to be a zero-day bug in the original
shutdown code introduced in January 1997.

Fixing this requires three things: the log must be able to shut down
independently of the mount (i.e. from log IO completion context),
mount shutdowns must be forced to wait until the log shutdown is
complete, and log shutdowns must also shut down the mount. Without
that last part, shit just breaks all over the place: random stuff
errors out on log shutdown while xfs_is_shutdown() is not set, so
those errors are not handled appropriately by high level code. Or we
just assert fail because the mount isn't shut down.
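
As a rough illustration of the ordering described above, here is a
toy userspace model in Python. It is not the XFS code: ToyShutdown,
force_shutdown() and run_racing_shutdowns() are hypothetical names
made up for this sketch. It only shows the shape of the rule that
one caller performs the log shutdown, the log shutdown also marks
the mount as shut down, and every racing caller blocks until the
log shutdown has completed.

```python
import threading


class ToyShutdown:
    """Hypothetical model of the shutdown ordering; not kernel code."""

    def __init__(self):
        self._lock = threading.Lock()
        self._started = False
        self.log_done = threading.Event()  # set once the log is fully down
        self.mount_shutdown = False

    def _log_shutdown(self):
        # Only the first caller gets to run the log shutdown.
        with self._lock:
            if self._started:
                return False
            self._started = True
        # Shutting down the log also shuts down the mount.
        self.mount_shutdown = True
        self.log_done.set()
        return True

    def force_shutdown(self):
        won = self._log_shutdown()
        # Racing callers must not return before the log is down.
        self.log_done.wait()
        return won


def run_racing_shutdowns(n):
    """Race n shutdown callers; return (number of winners, mount state)."""
    state = ToyShutdown()
    results = []

    def worker():
        results.append(state.force_shutdown())

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results), state.mount_shutdown
```

In this model, however many callers race, exactly one performs the
shutdown and none of them returns before log_done is set.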

Once all that is done, we can fix xfs_trans_commit() and
xfs_trans_cancel() to not leak aborted items into memory until the
log is fully shut down.
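
To make the race window concrete, here is a toy model in Python. It
is a sketch under stated assumptions, not the XFS implementation:
Item, ToyFs and race_window() are hypothetical names, and the
assumption that a commit which still sees a running log simply
proceeds and leaves the item pinned is mine, inferred from the
description above. The model compares a commit path that keys its
abort decision off the mount shutdown state with one that keys it
off the log shutdown state, in the window where the mount is marked
shut down but the log is still running.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    dirty: bool = True
    pinned: bool = True  # pinned items are never written back


@dataclass
class ToyFs:
    mount_shutdown: bool = False
    log_shutdown: bool = False
    on_disk: list = field(default_factory=list)

    def commit(self, item, check_log):
        # check_log=False models the old check (mount state),
        # check_log=True models checking the log state instead.
        shut = self.log_shutdown if check_log else self.mount_shutdown
        if shut:
            item.pinned = False  # aborted: unpinned but still dirty
            return "aborted"
        return "committed"  # assumed: stays pinned until journalled

    def writeback(self, item):
        # Assumed: writeback only stops once the log is shut down.
        if self.log_shutdown:
            return False
        if item.dirty and not item.pinned:
            self.on_disk.append(item.name)  # unjournalled change hits disk
            item.dirty = False
            return True
        return False


def race_window(check_log):
    """Mount marked shut down, log still running: what reaches disk?"""
    fs = ToyFs(mount_shutdown=True, log_shutdown=False)
    item = Item("inode")
    fs.commit(item, check_log)
    fs.writeback(item)
    return fs.on_disk
```

With check_log=False the aborted item is written back while the log
is still running; with check_log=True nothing reaches disk in that
window.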

This now makes recoveryloop largely stable on my test machines. I am
still seeing failures, but they are one-off, whacky things (like
weird udev/netlink memory freeing warnings) that I'm unable to
reproduce in any way.

-Dave.

Version 2:
- Rework inode cluster buffer checks in inode item pushing (patch 1)
- Clean up comments and separation of inode abort behaviour (patch 1)
- Fix shutdown callback/log force wakeup ordering issue (patch 3)
- Fix writeback of aborted, incomplete, unlogged changes during
  shutdown races (patches 4-6)

Version 1:
- https://lore.kernel.org/linux-xfs/20220321012329.376307-1-david@fromorbit.com/


Thread overview: 20+ messages
2022-03-24  0:20 [PATCH 0/6 v2] xfs: more shutdown/recovery fixes Dave Chinner
2022-03-24  0:20 ` [PATCH 1/6] xfs: aborting inodes on shutdown may need buffer lock Dave Chinner
2022-03-28 22:44   ` Darrick J. Wong
2022-03-28 23:11     ` Dave Chinner
2022-03-30  1:20       ` Darrick J. Wong
2022-03-24  0:20 ` [PATCH 2/6] xfs: shutdown in intent recovery has non-intent items in the AIL Dave Chinner
2022-03-28 22:46   ` Darrick J. Wong
2022-03-24  0:21 ` [PATCH 3/6] xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks Dave Chinner
2022-03-28 23:05   ` Darrick J. Wong
2022-03-28 23:13     ` Dave Chinner
2022-03-28 23:36       ` Darrick J. Wong
2022-03-24  0:21 ` [PATCH 4/6] xfs: log shutdown triggers should only shut down the log Dave Chinner
2022-03-29  0:14   ` Darrick J. Wong
2022-03-24  0:21 ` [PATCH 5/6] xfs: xfs_do_force_shutdown needs to block racing shutdowns Dave Chinner
2022-03-29  0:19   ` Darrick J. Wong
2022-03-24  0:21 ` [PATCH 6/6] xfs: xfs_trans_commit() path must check for log shutdown Dave Chinner
2022-03-29  0:36   ` Darrick J. Wong
2022-03-29  3:08     ` Dave Chinner
2022-03-27 22:55 ` [PATCH 7/6] xfs: xfs: shutdown during log recovery needs to mark the " Dave Chinner
2022-03-29  0:37   ` Darrick J. Wong
