* [PATCHSET 0/2] xfs: fix corruption of free rt extent count
@ 2022-04-07 20:46 Darrick J. Wong
  2022-04-07 20:46 ` [PATCH 1/2] xfs: recalculate free rt extents after log recovery Darrick J. Wong
  2022-04-07 20:47 ` [PATCH 2/2] xfs: use a separate frextents counter for rt extent reservations Darrick J. Wong
  0 siblings, 2 replies; 10+ messages in thread
From: Darrick J. Wong @ 2022-04-07 20:46 UTC
  To: djwong; +Cc: linux-xfs, david

Hi all,

I've been noticing sporadic failures of xfs/141, a looping log recovery
test, on djwong-dev.  The primary symptom is that online fsck reports
incorrect free rt extent counts after some number of recovery loops.
The root cause seems to be the use of sb_frextents in the xfs_mount for
incore reservations to transactions -- if someone calls xfs_log_sb while
there's a transaction with an rtx reservation running, the artificially
low value will then get logged to disk!  If that's the /last/ time
anyone logs the superblock before the log goes down, then recovery will
replay the incorrect value into the live superblock.  Effectively, we
leak the rt extents.
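
To make the race concrete, here is a small userspace sketch.  It is a
toy model only, not the kernel code: struct toy_mount and every toy_*
function are hypothetical names made up for illustration, and a plain
integer stands in for the real counters.  It compares the value that
gets logged when a reservation is carved out of sb_frextents itself
with the value logged when the reservation comes out of a separate
incore counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant bits of the mount. */
struct toy_mount {
	uint64_t sb_frextents;	/* ondisk-format count; this is what gets logged */
	uint64_t m_frextents;	/* separate incore count used for reservations */
};

/* Models logging the superblock: snapshot the ondisk-format field. */
uint64_t toy_log_sb(const struct toy_mount *mp)
{
	return mp->sb_frextents;
}

/*
 * Shared-counter scheme: the reservation is subtracted from
 * sb_frextents, so a superblock logged mid-transaction carries the
 * artificially low value.  Returns what recovery would replay.
 */
uint64_t toy_logged_with_shared_counter(uint64_t free_rtx, uint64_t resv)
{
	struct toy_mount mp = { .sb_frextents = free_rtx,
				.m_frextents = free_rtx };
	uint64_t logged;

	assert(resv <= free_rtx);
	mp.sb_frextents -= resv;	/* transaction reserves rt extents */
	logged = toy_log_sb(&mp);	/* someone logs the sb right now */
	mp.sb_frextents += resv;	/* transaction cancels, returns them */
	return logged;
}

/*
 * Separate-counter scheme: the reservation only touches the incore
 * counter, so the logged value is unaffected by it.
 */
uint64_t toy_logged_with_separate_counter(uint64_t free_rtx, uint64_t resv)
{
	struct toy_mount mp = { .sb_frextents = free_rtx,
				.m_frextents = free_rtx };
	uint64_t logged;

	assert(resv <= free_rtx);
	mp.m_frextents -= resv;		/* reservation hits incore count only */
	logged = toy_log_sb(&mp);
	mp.m_frextents += resv;
	return logged;
}
```

With 100 free rt extents and an 8-extent reservation in flight, the
shared-counter variant logs 92 even though nothing was allocated, which
is the leak described above; the separate-counter variant logs 100.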

So, the first thing to do is to fix log recovery to recompute frextents
from the rt bitmap so that we can catch and correct ondisk metadata; and
the second fix is to create a percpu counter to track both ondisk and
incore rtextent counts, similar to how m_fdblocks relates to
sb_fdblocks.
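
The recovery-side fix can be sketched the same way.  This is not the
patch itself: the toy_* helpers are hypothetical, the bitmap is a flat
in-memory array rather than the rt bitmap file, and it assumes one bit
per rt extent with a set bit meaning "free" and bits numbered from the
low end of each 32-bit word.  The idea is only to show recomputing the
count from the bitmap and overwriting a stale superblock value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the free rt extents among the first nr_rtx bits of the bitmap.
 * Assumption: bit set == extent free.  Bits past nr_rtx are ignored.
 */
uint64_t toy_count_free_rtx(const uint32_t *bitmap, uint64_t nr_rtx)
{
	uint64_t free_rtx = 0;
	uint64_t i;

	assert(bitmap != NULL || nr_rtx == 0);
	for (i = 0; i < nr_rtx; i++) {
		uint32_t word = bitmap[i / 32];

		if (word & (UINT32_C(1) << (i % 32)))
			free_rtx++;
	}
	return free_rtx;
}

/*
 * After "recovery", replace the replayed counter with the value
 * derived from the bitmap.  Returns 1 if a correction was needed,
 * 0 if the counter already matched.
 */
int toy_fix_frextents(uint64_t *sb_frextents, const uint32_t *bitmap,
		      uint64_t nr_rtx)
{
	uint64_t counted = toy_count_free_rtx(bitmap, nr_rtx);

	if (*sb_frextents == counted)
		return 0;
	*sb_frextents = counted;
	return 1;
}
```

Walking the bitmap one bit at a time is deliberately naive; it keeps
the sketch obvious at the cost of speed.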

The next thing to do after this is to fix xfs_repair to check frextents
and the rt bitmap/summary files, since it doesn't check the ondisk
values against its own observations.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=frextents-fixes-5.18
---
 fs/xfs/xfs_fsops.c   |    5 +---
 fs/xfs/xfs_icache.c  |    9 ++++---
 fs/xfs/xfs_mount.c   |   38 +++++++++++++++++++++-------
 fs/xfs/xfs_mount.h   |    2 +
 fs/xfs/xfs_rtalloc.c |   69 +++++++++++++++++++++++++++++++++++++++++++++++---
 fs/xfs/xfs_super.c   |   14 +++++++++-
 fs/xfs/xfs_trans.c   |   37 ++++++++++++++++++++++-----
 7 files changed, 146 insertions(+), 28 deletions(-)



Thread overview: 10+ messages
2022-04-07 20:46 [PATCHSET 0/2] xfs: fix corruption of free rt extent count Darrick J. Wong
2022-04-07 20:46 ` [PATCH 1/2] xfs: recalculate free rt extents after log recovery Darrick J. Wong
2022-04-07 21:56   ` Dave Chinner
2022-04-07 23:39     ` Darrick J. Wong
2022-04-08  0:06       ` Dave Chinner
2022-04-08 17:42         ` Darrick J. Wong
2022-04-07 20:47 ` [PATCH 2/2] xfs: use a separate frextents counter for rt extent reservations Darrick J. Wong
2022-04-07 23:17   ` Dave Chinner
2022-04-07 23:45     ` Darrick J. Wong
2022-04-08  0:12       ` Dave Chinner
