* [PATCH v16 00/21] xfs-4.19: online repair support
@ 2018-06-24 19:23 Darrick J. Wong
  2018-06-24 19:23 ` [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap Darrick J. Wong
                   ` (20 more replies)
  0 siblings, 21 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:23 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

Hi all,

This is the sixteenth revision of a patchset that adds support to the
XFS kernel for online metadata scrubbing and repair.  There aren't any
on-disk format changes.

New for this version of the patch series are: a renaming of the 'repair
freeze' code to 'scrub freeze' so that scrubbers can freeze the
filesystem to check or repair metadata; a new ioctl flag with which
userspace grants the kernel permission to freeze the filesystem to do
work; a centralized iput helper that delays iput processing until after
the scrub freeze is lifted; and a new fs summary counter scrub/repair
function that can check the global icount/ifree/fdblocks counters.

The first patch fixes a soon-to-be-invalid assumption in the rmap code
that the rmapbt can never be empty.  This won't be true at all for
realtime rmap and is briefly untrue for rmap repairs, so we might as
well restructure that assumption out of the code.

The next two patches create two predicates that decide if an inode has
CoW staging blocks that need to be freed or post-eof blocks that need to
be freed.
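Both predicates hinge on one detail: delayed-allocation extents carry a
sentinel "null" startblock, so only extents with a real startblock
represent allocated space that would need a transaction to unwind at
iput time.  A standalone sketch of that test (the struct, the sentinel
encoding, and the function name below are simplified stand-ins for the
kernel's xfs_bmbt_irec, isnullstartblock(), and the new helper, not the
actual patch code):

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-in for the kernel's CoW-fork scan.  NULL_STARTBLOCK
 * models isnullstartblock(); the kernel iterates the fork with
 * for_each_xfs_iext() instead of an array, but the loop logic matches.
 */
#define NULL_STARTBLOCK		(~0ULL)

struct extent {
	unsigned long long	startblock;	/* NULL_STARTBLOCK = delalloc */
	unsigned long long	blockcount;
};

/* Report whether any extent holds real (or unwritten) blocks. */
static bool
fork_has_allocated_blocks(
	const struct extent	*ext,
	size_t			nextents)
{
	size_t			i;

	for (i = 0; i < nextents; i++) {
		/* Delayed allocations own no blocks yet; skip them. */
		if (ext[i].startblock != NULL_STARTBLOCK)
			return true;
	}
	return false;
}
```

An empty fork or one containing only delalloc extents reports false, so
such an inode is safe to iput without starting a transaction.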

Patches 4-7 implement reconstruction of the AGF/AGI/AGFL headers, the
free space btrees, and the inode btrees.  These are the same patches as
v15.

Patches 8-9 implement the deferred iput code -- first the scrub-specific
iput helper that enables us to delay iputting inodes that have post-eof
or CoW blocks that need to be freed until we no longer have a
transaction and can let the iput work proceed.  The second patch fleshes
out the scrub iget/iput tracepoints for easier debugging.

Next comes the scrub freeze code, which enables us to pause all other
work in the filesystem, and the new FREEZE_OK ioctl flag that lets
userspace control freeze policy.

Patches 11-19 implement online rmap, refcount, inode, ifork, bmap,
symlink, extended attribute, and quota repairs.  Patch 20 implements
online quotacheck via the scrub freezer.

Patch 21 implements the filesystem summary counter check and repair
code.  This turned out to be easier to implement than I had thought --
we gather the icount, ifree, and fdblocks counts from each of the AGs.
Next we adjust fdblocks by all the in-core reservations: resblks, per-AG
reservations, and all delayed allocation blocks of all in-core inodes.
Then we can compare our counts against the superblock's counts and
adjust accordingly.
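As a toy model of that reconciliation (the struct and function names
below are invented for illustration; the kernel gathers these values
from the real AGI/AGF headers and in-core state, not from arrays):

```c
#include <stdint.h>

/*
 * Toy model of the summary-counter check: sum each AG's contribution,
 * then back out the in-core reservations so the result is comparable
 * to the superblock's counters.
 */
struct counters {
	uint64_t	icount;		/* allocated inodes */
	uint64_t	ifree;		/* free inodes */
	uint64_t	fdblocks;	/* free data blocks */
};

static struct counters
compute_expected(
	const struct counters	*ags,
	int			agcount,
	uint64_t		resblks,
	uint64_t		perag_resv,
	uint64_t		delalloc_blks)
{
	struct counters		exp = { 0, 0, 0 };
	int			i;

	for (i = 0; i < agcount; i++) {
		exp.icount += ags[i].icount;
		exp.ifree += ags[i].ifree;
		exp.fdblocks += ags[i].fdblocks;
	}
	/* Reserved and delalloc blocks aren't free from the fs's view. */
	exp.fdblocks -= resblks + perag_resv + delalloc_blks;
	return exp;
}
```

If the computed values disagree with the superblock's counters, repair
writes the computed values back.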

If you're going to start using this mess, you probably ought to just
pull from my git trees.  The kernel patches[1] should apply against
4.18-rc2.  xfsprogs[2] and xfstests[3] can be found in their usual
places.  The git trees contain all four series' worth of changes.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

[1] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=djwong-devel
[2] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=djwong-devel
[3] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=djwong-devel

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
@ 2018-06-24 19:23 ` Darrick J. Wong
  2018-06-27  0:54   ` Dave Chinner
  2018-06-28 21:11   ` Allison Henderson
  2018-06-24 19:23 ` [PATCH 02/21] xfs: add helper to decide if an inode has allocated cow blocks Darrick J. Wong
                   ` (19 subsequent siblings)
  20 siblings, 2 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:23 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

The original rmap code assumed that there would always be at least one
rmap in the rmapbt (the AG sb/agf/agi) and so errored out if it didn't
find one.  This assumption isn't true for the rmapbt repair function
(and it won't be true for realtime rmap either), so remove the check and
just deal with the situation.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_rmap.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
index d4460b0d2d81..8b2a2f81d110 100644
--- a/fs/xfs/libxfs/xfs_rmap.c
+++ b/fs/xfs/libxfs/xfs_rmap.c
@@ -753,19 +753,19 @@ xfs_rmap_map(
 			&have_lt);
 	if (error)
 		goto out_error;
-	XFS_WANT_CORRUPTED_GOTO(mp, have_lt == 1, out_error);
-
-	error = xfs_rmap_get_rec(cur, &ltrec, &have_lt);
-	if (error)
-		goto out_error;
-	XFS_WANT_CORRUPTED_GOTO(mp, have_lt == 1, out_error);
-	trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
-			cur->bc_private.a.agno, ltrec.rm_startblock,
-			ltrec.rm_blockcount, ltrec.rm_owner,
-			ltrec.rm_offset, ltrec.rm_flags);
+	if (have_lt) {
+		error = xfs_rmap_get_rec(cur, &ltrec, &have_lt);
+		if (error)
+			goto out_error;
+		XFS_WANT_CORRUPTED_GOTO(mp, have_lt == 1, out_error);
+		trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
+				cur->bc_private.a.agno, ltrec.rm_startblock,
+				ltrec.rm_blockcount, ltrec.rm_owner,
+				ltrec.rm_offset, ltrec.rm_flags);
 
-	if (!xfs_rmap_is_mergeable(&ltrec, owner, flags))
-		have_lt = 0;
+		if (!xfs_rmap_is_mergeable(&ltrec, owner, flags))
+			have_lt = 0;
+	}
 
 	XFS_WANT_CORRUPTED_GOTO(mp,
 		have_lt == 0 ||



* [PATCH 02/21] xfs: add helper to decide if an inode has allocated cow blocks
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
  2018-06-24 19:23 ` [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap Darrick J. Wong
@ 2018-06-24 19:23 ` Darrick J. Wong
  2018-06-27  1:02   ` Dave Chinner
  2018-06-28 21:12   ` Allison Henderson
  2018-06-24 19:23 ` [PATCH 03/21] xfs: refactor part of xfs_free_eofblocks Darrick J. Wong
                   ` (18 subsequent siblings)
  20 siblings, 2 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:23 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Add a helper to decide if an inode has real or unwritten extents in the
CoW fork.  The upcoming repair freeze functionality will have to know if
it's safe to iput an inode -- if the inode has any incore state that
would require a transaction to unwind during iput, we'll have to defer
the iput.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_inode.c |   19 +++++++++++++++++++
 fs/xfs/xfs_inode.h |    1 +
 2 files changed, 20 insertions(+)


diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 7a96c4e0ab5c..e6859dfc29af 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3689,3 +3689,22 @@ xfs_iflush_int(
 corrupt_out:
 	return -EFSCORRUPTED;
 }
+
+/* Decide if there are real or unwritten extents in the CoW fork. */
+bool
+xfs_inode_has_cow_blocks(
+	struct xfs_inode		*ip)
+{
+	struct xfs_iext_cursor		icur;
+	struct xfs_bmbt_irec		irec;
+	struct xfs_ifork		*ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
+
+	if (!ifp)
+		return false;
+
+	for_each_xfs_iext(ifp, &icur, &irec) {
+		if (!isnullstartblock(irec.br_startblock))
+			return true;
+	}
+	return false;
+}
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 2ed63a49e890..735d0788bfdb 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -503,5 +503,6 @@ extern struct kmem_zone	*xfs_inode_zone;
 #define XFS_DEFAULT_COWEXTSZ_HINT 32
 
 bool xfs_inode_verify_forks(struct xfs_inode *ip);
+bool xfs_inode_has_cow_blocks(struct xfs_inode *ip);
 
 #endif	/* __XFS_INODE_H__ */



* [PATCH 03/21] xfs: refactor part of xfs_free_eofblocks
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
  2018-06-24 19:23 ` [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap Darrick J. Wong
  2018-06-24 19:23 ` [PATCH 02/21] xfs: add helper to decide if an inode has allocated cow blocks Darrick J. Wong
@ 2018-06-24 19:23 ` Darrick J. Wong
  2018-06-28 21:13   ` Allison Henderson
  2018-06-24 19:23 ` [PATCH 04/21] xfs: repair the AGF and AGFL Darrick J. Wong
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:23 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Refactor the part of _free_eofblocks that decides if it's really going
to truncate post-EOF blocks into a separate helper function.  The
upcoming repair freeze patch requires us to defer iput of an inode if
disposing of that inode would have to start another transaction to
unwind incore state.  No functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_bmap_util.c |  101 ++++++++++++++++++++----------------------------
 fs/xfs/xfs_inode.c     |   32 +++++++++++++++
 fs/xfs/xfs_inode.h     |    1 
 3 files changed, 75 insertions(+), 59 deletions(-)


diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index c94d376e4152..0f38acbb200f 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -805,78 +805,61 @@ xfs_free_eofblocks(
 	struct xfs_inode	*ip)
 {
 	struct xfs_trans	*tp;
-	int			error;
-	xfs_fileoff_t		end_fsb;
-	xfs_fileoff_t		last_fsb;
-	xfs_filblks_t		map_len;
-	int			nimaps;
-	struct xfs_bmbt_irec	imap;
 	struct xfs_mount	*mp = ip->i_mount;
+	int			error;
 
 	/*
-	 * Figure out if there are any blocks beyond the end
-	 * of the file.  If not, then there is nothing to do.
+	 * If there are blocks after the end of file, truncate the file to its
+	 * current size to free them up.
 	 */
-	end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)XFS_ISIZE(ip));
-	last_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
-	if (last_fsb <= end_fsb)
+	if (!xfs_inode_has_posteof_blocks(ip))
 		return 0;
-	map_len = last_fsb - end_fsb;
-
-	nimaps = 1;
-	xfs_ilock(ip, XFS_ILOCK_SHARED);
-	error = xfs_bmapi_read(ip, end_fsb, map_len, &imap, &nimaps, 0);
-	xfs_iunlock(ip, XFS_ILOCK_SHARED);
 
 	/*
-	 * If there are blocks after the end of file, truncate the file to its
-	 * current size to free them up.
+	 * Attach the dquots to the inode up front.
 	 */
-	if (!error && (nimaps != 0) &&
-	    (imap.br_startblock != HOLESTARTBLOCK ||
-	     ip->i_delayed_blks)) {
-		/*
-		 * Attach the dquots to the inode up front.
-		 */
-		error = xfs_qm_dqattach(ip);
-		if (error)
-			return error;
+	error = xfs_qm_dqattach(ip);
+	if (error)
+		return error;
 
-		/* wait on dio to ensure i_size has settled */
-		inode_dio_wait(VFS_I(ip));
+	/* wait on dio to ensure i_size has settled */
+	inode_dio_wait(VFS_I(ip));
 
-		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0,
-				&tp);
-		if (error) {
-			ASSERT(XFS_FORCED_SHUTDOWN(mp));
-			return error;
-		}
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
+	if (error) {
+		ASSERT(XFS_FORCED_SHUTDOWN(mp));
+		return error;
+	}
 
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_trans_ijoin(tp, ip, 0);
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_trans_ijoin(tp, ip, 0);
 
-		/*
-		 * Do not update the on-disk file size.  If we update the
-		 * on-disk file size and then the system crashes before the
-		 * contents of the file are flushed to disk then the files
-		 * may be full of holes (ie NULL files bug).
-		 */
-		error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
-					XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
-		if (error) {
-			/*
-			 * If we get an error at this point we simply don't
-			 * bother truncating the file.
-			 */
-			xfs_trans_cancel(tp);
-		} else {
-			error = xfs_trans_commit(tp);
-			if (!error)
-				xfs_inode_clear_eofblocks_tag(ip);
-		}
+	/*
+	 * Do not update the on-disk file size.  If we update the
+	 * on-disk file size and then the system crashes before the
+	 * contents of the file are flushed to disk then the files
+	 * may be full of holes (ie NULL files bug).
+	 */
+	error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
+				XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
+	if (error)
+		goto err_cancel;
 
-		xfs_iunlock(ip, XFS_ILOCK_EXCL);
-	}
+	error = xfs_trans_commit(tp);
+	if (error)
+		goto out_unlock;
+
+	xfs_inode_clear_eofblocks_tag(ip);
+	goto out_unlock;
+
+err_cancel:
+	/*
+	 * If we get an error at this point we simply don't
+	 * bother truncating the file.
+	 */
+	xfs_trans_cancel(tp);
+out_unlock:
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 }
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index e6859dfc29af..368ac0528727 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3708,3 +3708,35 @@ xfs_inode_has_cow_blocks(
 	}
 	return false;
 }
+
+/*
+ * Decide if this inode has post-EOF blocks.  The caller is responsible
+ * for knowing / caring about the PREALLOC/APPEND flags.
+ */
+bool
+xfs_inode_has_posteof_blocks(
+	struct xfs_inode	*ip)
+{
+	struct xfs_bmbt_irec	imap;
+	struct xfs_mount	*mp = ip->i_mount;
+	xfs_fileoff_t		end_fsb;
+	xfs_fileoff_t		last_fsb;
+	xfs_filblks_t		map_len;
+	int			nimaps;
+	int			error;
+
+	end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)XFS_ISIZE(ip));
+	last_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
+	if (last_fsb <= end_fsb)
+		return false;
+	map_len = last_fsb - end_fsb;
+
+	nimaps = 1;
+	xfs_ilock(ip, XFS_ILOCK_SHARED);
+	error = xfs_bmapi_read(ip, end_fsb, map_len, &imap, &nimaps, 0);
+	xfs_iunlock(ip, XFS_ILOCK_SHARED);
+
+	return !error && (nimaps != 0) &&
+	       (imap.br_startblock != HOLESTARTBLOCK ||
+	        ip->i_delayed_blks);
+}
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 735d0788bfdb..a041fffa1b33 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -504,5 +504,6 @@ extern struct kmem_zone	*xfs_inode_zone;
 
 bool xfs_inode_verify_forks(struct xfs_inode *ip);
 bool xfs_inode_has_cow_blocks(struct xfs_inode *ip);
+bool xfs_inode_has_posteof_blocks(struct xfs_inode *ip);
 
 #endif	/* __XFS_INODE_H__ */



* [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (2 preceding siblings ...)
  2018-06-24 19:23 ` [PATCH 03/21] xfs: refactor part of xfs_free_eofblocks Darrick J. Wong
@ 2018-06-24 19:23 ` Darrick J. Wong
  2018-06-27  2:19   ` Dave Chinner
  2018-06-28 21:14   ` Allison Henderson
  2018-06-24 19:24 ` [PATCH 05/21] xfs: repair the AGI Darrick J. Wong
                   ` (16 subsequent siblings)
  20 siblings, 2 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:23 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Regenerate the AGF and AGFL from the rmap data.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/agheader_repair.c |  644 ++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/repair.c          |  106 ++++++-
 fs/xfs/scrub/repair.h          |   15 +
 fs/xfs/scrub/scrub.c           |    4 
 fs/xfs/xfs_trans.c             |   54 +++
 fs/xfs/xfs_trans.h             |    2 
 6 files changed, 814 insertions(+), 11 deletions(-)


diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c
index 117eedac53df..90e5e6cbc911 100644
--- a/fs/xfs/scrub/agheader_repair.c
+++ b/fs/xfs/scrub/agheader_repair.c
@@ -17,12 +17,18 @@
 #include "xfs_sb.h"
 #include "xfs_inode.h"
 #include "xfs_alloc.h"
+#include "xfs_alloc_btree.h"
 #include "xfs_ialloc.h"
+#include "xfs_ialloc_btree.h"
 #include "xfs_rmap.h"
+#include "xfs_rmap_btree.h"
+#include "xfs_refcount.h"
+#include "xfs_refcount_btree.h"
 #include "scrub/xfs_scrub.h"
 #include "scrub/scrub.h"
 #include "scrub/common.h"
 #include "scrub/trace.h"
+#include "scrub/repair.h"
 
 /* Superblock */
 
@@ -54,3 +60,641 @@ xfs_repair_superblock(
 	xfs_trans_log_buf(sc->tp, bp, 0, BBTOB(bp->b_length) - 1);
 	return error;
 }
+
+/* AGF */
+
+struct xfs_repair_agf_allocbt {
+	struct xfs_scrub_context	*sc;
+	xfs_agblock_t			freeblks;
+	xfs_agblock_t			longest;
+};
+
+/* Record free space shape information. */
+STATIC int
+xfs_repair_agf_walk_allocbt(
+	struct xfs_btree_cur		*cur,
+	struct xfs_alloc_rec_incore	*rec,
+	void				*priv)
+{
+	struct xfs_repair_agf_allocbt	*raa = priv;
+	int				error = 0;
+
+	if (xfs_scrub_should_terminate(raa->sc, &error))
+		return error;
+
+	raa->freeblks += rec->ar_blockcount;
+	if (rec->ar_blockcount > raa->longest)
+		raa->longest = rec->ar_blockcount;
+	return error;
+}
+
+/* Does this AGFL block look sane? */
+STATIC int
+xfs_repair_agf_check_agfl_block(
+	struct xfs_mount		*mp,
+	xfs_agblock_t			agbno,
+	void				*priv)
+{
+	struct xfs_scrub_context	*sc = priv;
+
+	if (!xfs_verify_agbno(mp, sc->sa.agno, agbno))
+		return -EFSCORRUPTED;
+	return 0;
+}
+
+/* Information for finding AGF-rooted btrees */
+enum {
+	REPAIR_AGF_BNOBT = 0,
+	REPAIR_AGF_CNTBT,
+	REPAIR_AGF_RMAPBT,
+	REPAIR_AGF_REFCOUNTBT,
+	REPAIR_AGF_END,
+	REPAIR_AGF_MAX
+};
+
+static const struct xfs_repair_find_ag_btree repair_agf[] = {
+	[REPAIR_AGF_BNOBT] = {
+		.rmap_owner = XFS_RMAP_OWN_AG,
+		.buf_ops = &xfs_allocbt_buf_ops,
+		.magic = XFS_ABTB_CRC_MAGIC,
+	},
+	[REPAIR_AGF_CNTBT] = {
+		.rmap_owner = XFS_RMAP_OWN_AG,
+		.buf_ops = &xfs_allocbt_buf_ops,
+		.magic = XFS_ABTC_CRC_MAGIC,
+	},
+	[REPAIR_AGF_RMAPBT] = {
+		.rmap_owner = XFS_RMAP_OWN_AG,
+		.buf_ops = &xfs_rmapbt_buf_ops,
+		.magic = XFS_RMAP_CRC_MAGIC,
+	},
+	[REPAIR_AGF_REFCOUNTBT] = {
+		.rmap_owner = XFS_RMAP_OWN_REFC,
+		.buf_ops = &xfs_refcountbt_buf_ops,
+		.magic = XFS_REFC_CRC_MAGIC,
+	},
+	[REPAIR_AGF_END] = {
+		.buf_ops = NULL,
+	},
+};
+
+/*
+ * Find the btree roots.  This is /also/ a chicken and egg problem because we
+ * have to use the rmapbt (rooted in the AGF) to find the btrees rooted in the
+ * AGF.  We also have no idea if the btrees make any sense.  If we hit obvious
+ * corruptions in those btrees we'll bail out.
+ */
+STATIC int
+xfs_repair_agf_find_btrees(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*agf_bp,
+	struct xfs_repair_find_ag_btree	*fab,
+	struct xfs_buf			*agfl_bp)
+{
+	struct xfs_agf			*old_agf = XFS_BUF_TO_AGF(agf_bp);
+	int				error;
+
+	/* Go find the root data. */
+	memcpy(fab, repair_agf, sizeof(repair_agf));
+	error = xfs_repair_find_ag_btree_roots(sc, agf_bp, fab, agfl_bp);
+	if (error)
+		return error;
+
+	/* We must find the bnobt, cntbt, and rmapbt roots. */
+	if (fab[REPAIR_AGF_BNOBT].root == NULLAGBLOCK ||
+	    fab[REPAIR_AGF_BNOBT].height > XFS_BTREE_MAXLEVELS ||
+	    fab[REPAIR_AGF_CNTBT].root == NULLAGBLOCK ||
+	    fab[REPAIR_AGF_CNTBT].height > XFS_BTREE_MAXLEVELS ||
+	    fab[REPAIR_AGF_RMAPBT].root == NULLAGBLOCK ||
+	    fab[REPAIR_AGF_RMAPBT].height > XFS_BTREE_MAXLEVELS)
+		return -EFSCORRUPTED;
+
+	/*
+	 * We relied on the rmapbt to reconstruct the AGF.  If we get a
+	 * different root then something's seriously wrong.
+	 */
+	if (fab[REPAIR_AGF_RMAPBT].root !=
+	    be32_to_cpu(old_agf->agf_roots[XFS_BTNUM_RMAPi]))
+		return -EFSCORRUPTED;
+
+	/* We must find the refcountbt root if that feature is enabled. */
+	if (xfs_sb_version_hasreflink(&sc->mp->m_sb) &&
+	    (fab[REPAIR_AGF_REFCOUNTBT].root == NULLAGBLOCK ||
+	     fab[REPAIR_AGF_REFCOUNTBT].height > XFS_BTREE_MAXLEVELS))
+		return -EFSCORRUPTED;
+
+	return 0;
+}
+
+/* Set btree root information in an AGF. */
+STATIC void
+xfs_repair_agf_set_roots(
+	struct xfs_scrub_context	*sc,
+	struct xfs_agf			*agf,
+	struct xfs_repair_find_ag_btree	*fab)
+{
+	agf->agf_roots[XFS_BTNUM_BNOi] =
+			cpu_to_be32(fab[REPAIR_AGF_BNOBT].root);
+	agf->agf_levels[XFS_BTNUM_BNOi] =
+			cpu_to_be32(fab[REPAIR_AGF_BNOBT].height);
+
+	agf->agf_roots[XFS_BTNUM_CNTi] =
+			cpu_to_be32(fab[REPAIR_AGF_CNTBT].root);
+	agf->agf_levels[XFS_BTNUM_CNTi] =
+			cpu_to_be32(fab[REPAIR_AGF_CNTBT].height);
+
+	agf->agf_roots[XFS_BTNUM_RMAPi] =
+			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].root);
+	agf->agf_levels[XFS_BTNUM_RMAPi] =
+			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].height);
+
+	if (xfs_sb_version_hasreflink(&sc->mp->m_sb)) {
+		agf->agf_refcount_root =
+				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].root);
+		agf->agf_refcount_level =
+				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].height);
+	}
+}
+
+/*
+ * Reinitialize the AGF header, making an in-core copy of the old contents so
+ * that we know which in-core state needs to be reinitialized.
+ */
+STATIC void
+xfs_repair_agf_init_header(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*agf_bp,
+	struct xfs_agf			*old_agf)
+{
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
+
+	memcpy(old_agf, agf, sizeof(*old_agf));
+	memset(agf, 0, BBTOB(agf_bp->b_length));
+	agf->agf_magicnum = cpu_to_be32(XFS_AGF_MAGIC);
+	agf->agf_versionnum = cpu_to_be32(XFS_AGF_VERSION);
+	agf->agf_seqno = cpu_to_be32(sc->sa.agno);
+	agf->agf_length = cpu_to_be32(xfs_ag_block_count(mp, sc->sa.agno));
+	agf->agf_flfirst = old_agf->agf_flfirst;
+	agf->agf_fllast = old_agf->agf_fllast;
+	agf->agf_flcount = old_agf->agf_flcount;
+	if (xfs_sb_version_hascrc(&mp->m_sb))
+		uuid_copy(&agf->agf_uuid, &mp->m_sb.sb_meta_uuid);
+}
+
+/* Update the AGF btree counters by walking the btrees. */
+STATIC int
+xfs_repair_agf_update_btree_counters(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*agf_bp)
+{
+	struct xfs_repair_agf_allocbt	raa = { .sc = sc };
+	struct xfs_btree_cur		*cur = NULL;
+	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
+	struct xfs_mount		*mp = sc->mp;
+	xfs_agblock_t			btreeblks;
+	xfs_agblock_t			blocks;
+	int				error;
+
+	/* Update the AGF counters from the bnobt. */
+	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
+			XFS_BTNUM_BNO);
+	error = xfs_alloc_query_all(cur, xfs_repair_agf_walk_allocbt, &raa);
+	if (error)
+		goto err;
+	error = xfs_btree_count_blocks(cur, &blocks);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	btreeblks = blocks - 1;
+	agf->agf_freeblks = cpu_to_be32(raa.freeblks);
+	agf->agf_longest = cpu_to_be32(raa.longest);
+
+	/* Update the AGF counters from the cntbt. */
+	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
+			XFS_BTNUM_CNT);
+	error = xfs_btree_count_blocks(cur, &blocks);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	btreeblks += blocks - 1;
+
+	/* Update the AGF counters from the rmapbt. */
+	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
+	error = xfs_btree_count_blocks(cur, &blocks);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	agf->agf_rmap_blocks = cpu_to_be32(blocks);
+	btreeblks += blocks - 1;
+
+	agf->agf_btreeblks = cpu_to_be32(btreeblks);
+
+	/* Update the AGF counters from the refcountbt. */
+	if (xfs_sb_version_hasreflink(&mp->m_sb)) {
+		cur = xfs_refcountbt_init_cursor(mp, sc->tp, agf_bp,
+				sc->sa.agno, NULL);
+		error = xfs_btree_count_blocks(cur, &blocks);
+		if (error)
+			goto err;
+		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+		agf->agf_refcount_blocks = cpu_to_be32(blocks);
+	}
+
+	return 0;
+err:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/* Trigger reinitialization of the in-core data. */
+STATIC int
+xfs_repair_agf_reinit_incore(
+	struct xfs_scrub_context	*sc,
+	struct xfs_agf			*agf,
+	const struct xfs_agf		*old_agf)
+{
+	struct xfs_perag		*pag;
+
+	/* XXX: trigger fdblocks recalculation */
+
+	/* Now reinitialize the in-core counters if necessary. */
+	pag = sc->sa.pag;
+	if (!pag->pagf_init)
+		return 0;
+
+	pag->pagf_btreeblks = be32_to_cpu(agf->agf_btreeblks);
+	pag->pagf_freeblks = be32_to_cpu(agf->agf_freeblks);
+	pag->pagf_longest = be32_to_cpu(agf->agf_longest);
+	pag->pagf_levels[XFS_BTNUM_BNOi] =
+			be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNOi]);
+	pag->pagf_levels[XFS_BTNUM_CNTi] =
+			be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNTi]);
+	pag->pagf_levels[XFS_BTNUM_RMAPi] =
+			be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAPi]);
+	pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
+
+	return 0;
+}
+
+/* Repair the AGF. */
+int
+xfs_repair_agf(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_repair_find_ag_btree	fab[REPAIR_AGF_MAX];
+	struct xfs_agf			old_agf;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_buf			*agf_bp;
+	struct xfs_buf			*agfl_bp;
+	struct xfs_agf			*agf;
+	int				error;
+
+	/* We require the rmapbt to rebuild anything. */
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
+		return -EOPNOTSUPP;
+
+	xfs_scrub_perag_get(sc->mp, &sc->sa);
+	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
+			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGF_DADDR(mp)),
+			XFS_FSS_TO_BB(mp, 1), 0, &agf_bp, NULL);
+	if (error)
+		return error;
+	agf_bp->b_ops = &xfs_agf_buf_ops;
+	agf = XFS_BUF_TO_AGF(agf_bp);
+
+	/*
+	 * Load the AGFL so that we can screen out OWN_AG blocks that are on
+	 * the AGFL now; these blocks might have once been part of the
+	 * bno/cnt/rmap btrees but are not now.  This is a chicken and egg
+	 * problem: the AGF is corrupt, so we have to trust the AGFL contents
+	 * because we can't do any serious cross-referencing with any of the
+	 * btrees rooted in the AGF.  If the AGFL contents are obviously bad
+	 * then we'll bail out.
+	 */
+	error = xfs_alloc_read_agfl(mp, sc->tp, sc->sa.agno, &agfl_bp);
+	if (error)
+		return error;
+
+	/*
+	 * Spot-check the AGFL blocks; if they're obviously corrupt then
+	 * there's nothing we can do but bail out.
+	 */
+	error = xfs_agfl_walk(sc->mp, XFS_BUF_TO_AGF(agf_bp), agfl_bp,
+			xfs_repair_agf_check_agfl_block, sc);
+	if (error)
+		return error;
+
+	/*
+	 * Find the AGF btree roots.  See the comment for this function for
+	 * more information about the limitations of this repairer; this is
+	 * also a chicken-and-egg situation.
+	 */
+	error = xfs_repair_agf_find_btrees(sc, agf_bp, fab, agfl_bp);
+	if (error)
+		return error;
+
+	/* Start rewriting the header and implant the btrees we found. */
+	xfs_repair_agf_init_header(sc, agf_bp, &old_agf);
+	xfs_repair_agf_set_roots(sc, agf, fab);
+	error = xfs_repair_agf_update_btree_counters(sc, agf_bp);
+	if (error)
+		goto out_revert;
+
+	/* Reinitialize in-core state. */
+	error = xfs_repair_agf_reinit_incore(sc, agf, &old_agf);
+	if (error)
+		goto out_revert;
+
+	/* Write this to disk. */
+	xfs_trans_buf_set_type(sc->tp, agf_bp, XFS_BLFT_AGF_BUF);
+	xfs_trans_log_buf(sc->tp, agf_bp, 0, BBTOB(agf_bp->b_length) - 1);
+	return 0;
+
+out_revert:
+	memcpy(agf, &old_agf, sizeof(old_agf));
+	return error;
+}
+
+/* AGFL */
+
+struct xfs_repair_agfl {
+	struct xfs_repair_extent_list	agmeta_list;
+	struct xfs_repair_extent_list	*freesp_list;
+	struct xfs_scrub_context	*sc;
+};
+
+/* Record all freespace information. */
+STATIC int
+xfs_repair_agfl_rmap_fn(
+	struct xfs_btree_cur		*cur,
+	struct xfs_rmap_irec		*rec,
+	void				*priv)
+{
+	struct xfs_repair_agfl		*ra = priv;
+	xfs_fsblock_t			fsb;
+	int				error = 0;
+
+	if (xfs_scrub_should_terminate(ra->sc, &error))
+		return error;
+
+	/* Record all the OWN_AG blocks. */
+	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
+		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
+				rec->rm_startblock);
+		error = xfs_repair_collect_btree_extent(ra->sc,
+				ra->freesp_list, fsb, rec->rm_blockcount);
+		if (error)
+			return error;
+	}
+
+	return xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
+			xfs_repair_collect_btree_cur_blocks_in_extent_list,
+			&ra->agmeta_list);
+}
+
+/* Add a btree block to the agmeta list. */
+STATIC int
+xfs_repair_agfl_visit_btblock(
+	struct xfs_btree_cur		*cur,
+	int				level,
+	void				*priv)
+{
+	struct xfs_repair_agfl		*ra = priv;
+	struct xfs_buf			*bp;
+	xfs_fsblock_t			fsb;
+	int				error = 0;
+
+	if (xfs_scrub_should_terminate(ra->sc, &error))
+		return error;
+
+	xfs_btree_get_block(cur, level, &bp);
+	if (!bp)
+		return 0;
+
+	fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
+	return xfs_repair_collect_btree_extent(ra->sc, &ra->agmeta_list,
+			fsb, 1);
+}
+
+/*
+ * Map out all the non-AGFL OWN_AG space in this AG so that we can deduce
+ * which blocks belong to the AGFL.
+ */
+STATIC int
+xfs_repair_agfl_find_extents(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*agf_bp,
+	struct xfs_repair_extent_list	*agfl_extents,
+	xfs_agblock_t			*flcount)
+{
+	struct xfs_repair_agfl		ra;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_btree_cur		*cur;
+	struct xfs_repair_extent	*rae;
+	int				error;
+
+	ra.sc = sc;
+	ra.freesp_list = agfl_extents;
+	xfs_repair_init_extent_list(&ra.agmeta_list);
+
+	/* Find all space used by the free space btrees & rmapbt. */
+	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
+	error = xfs_rmap_query_all(cur, xfs_repair_agfl_rmap_fn, &ra);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/* Find all space used by bnobt. */
+	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
+			XFS_BTNUM_BNO);
+	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/* Find all space used by cntbt. */
+	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
+			XFS_BTNUM_CNT);
+	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
+	if (error)
+		goto err;
+
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/*
+	 * Drop the freesp meta blocks that are in use by btrees.
+	 * The remaining blocks /should/ be AGFL blocks.
+	 */
+	error = xfs_repair_subtract_extents(sc, agfl_extents, &ra.agmeta_list);
+	xfs_repair_cancel_btree_extents(sc, &ra.agmeta_list);
+	if (error)
+		return error;
+
+	/* Calculate the new AGFL size. */
+	*flcount = 0;
+	for_each_xfs_repair_extent(rae, agfl_extents) {
+		*flcount += rae->len;
+		if (*flcount > xfs_agfl_size(mp))
+			break;
+	}
+	if (*flcount > xfs_agfl_size(mp))
+		*flcount = xfs_agfl_size(mp);
+	return 0;
+
+err:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/* Update the AGF and reset the in-core state. */
+STATIC int
+xfs_repair_agfl_update_agf(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*agf_bp,
+	xfs_agblock_t			flcount)
+{
+	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
+
+	/* XXX: trigger fdblocks recalculation */
+
+	/* Update the AGF counters. */
+	if (sc->sa.pag->pagf_init)
+		sc->sa.pag->pagf_flcount = flcount;
+	agf->agf_flfirst = cpu_to_be32(0);
+	agf->agf_flcount = cpu_to_be32(flcount);
+	agf->agf_fllast = cpu_to_be32(flcount - 1);
+
+	xfs_alloc_log_agf(sc->tp, agf_bp,
+			XFS_AGF_FLFIRST | XFS_AGF_FLLAST | XFS_AGF_FLCOUNT);
+	return 0;
+}
+
+/* Write out a totally new AGFL. */
+STATIC void
+xfs_repair_agfl_init_header(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*agfl_bp,
+	struct xfs_repair_extent_list	*agfl_extents,
+	xfs_agblock_t			flcount)
+{
+	struct xfs_mount		*mp = sc->mp;
+	__be32				*agfl_bno;
+	struct xfs_repair_extent	*rae;
+	struct xfs_repair_extent	*n;
+	struct xfs_agfl			*agfl;
+	xfs_agblock_t			agbno;
+	unsigned int			fl_off;
+
+	/* Start rewriting the header. */
+	agfl = XFS_BUF_TO_AGFL(agfl_bp);
+	memset(agfl, 0xFF, BBTOB(agfl_bp->b_length));
+	agfl->agfl_magicnum = cpu_to_be32(XFS_AGFL_MAGIC);
+	agfl->agfl_seqno = cpu_to_be32(sc->sa.agno);
+	uuid_copy(&agfl->agfl_uuid, &mp->m_sb.sb_meta_uuid);
+
+	/* Fill the AGFL with the remaining blocks. */
+	fl_off = 0;
+	agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agfl_bp);
+	for_each_xfs_repair_extent_safe(rae, n, agfl_extents) {
+		agbno = XFS_FSB_TO_AGBNO(mp, rae->fsbno);
+
+		trace_xfs_repair_agfl_insert(mp, sc->sa.agno, agbno, rae->len);
+
+		while (rae->len > 0 && fl_off < flcount) {
+			agfl_bno[fl_off] = cpu_to_be32(agbno);
+			fl_off++;
+			agbno++;
+			rae->fsbno++;
+			rae->len--;
+		}
+
+		if (rae->len)
+			break;
+		list_del(&rae->list);
+		kmem_free(rae);
+	}
+
+	/* Log the new AGFL to disk. */
+	xfs_trans_buf_set_type(sc->tp, agfl_bp, XFS_BLFT_AGFL_BUF);
+	xfs_trans_log_buf(sc->tp, agfl_bp, 0, BBTOB(agfl_bp->b_length) - 1);
+}
+
+/* Repair the AGFL. */
+int
+xfs_repair_agfl(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_owner_info		oinfo;
+	struct xfs_repair_extent_list	agfl_extents;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_buf			*agf_bp;
+	struct xfs_buf			*agfl_bp;
+	xfs_agblock_t			flcount;
+	int				error;
+
+	/* We require the rmapbt to rebuild anything. */
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
+		return -EOPNOTSUPP;
+
+	xfs_scrub_perag_get(sc->mp, &sc->sa);
+	xfs_repair_init_extent_list(&agfl_extents);
+
+	/*
+	 * Read the AGF so that we can query the rmapbt.  We hope that there's
+	 * nothing wrong with the AGF, but all the AG header repair functions
+	 * have this chicken-and-egg problem.
+	 */
+	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
+	if (error)
+		return error;
+	if (!agf_bp)
+		return -ENOMEM;
+
+	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
+			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGFL_DADDR(mp)),
+			XFS_FSS_TO_BB(mp, 1), 0, &agfl_bp, NULL);
+	if (error)
+		return error;
+	agfl_bp->b_ops = &xfs_agfl_buf_ops;
+
+	/*
+	 * Compute the set of old AGFL blocks by subtracting from the list of
+	 * OWN_AG blocks the list of blocks owned by all other OWN_AG metadata
+	 * (bnobt, cntbt, rmapbt).  These are the old AGFL blocks, so return
+	 * that list and the number of blocks we're actually going to put back
+	 * on the AGFL.
+	 */
+	error = xfs_repair_agfl_find_extents(sc, agf_bp, &agfl_extents,
+			&flcount);
+	if (error)
+		goto err;
+
+	/*
+	 * Update AGF and AGFL.  We reset the global free block counter when
+	 * we adjust the AGF flcount (which can fail), so avoid updating any
+	 * buffers until we know that part works.
+	 */
+	error = xfs_repair_agfl_update_agf(sc, agf_bp, flcount);
+	if (error)
+		goto err;
+	xfs_repair_agfl_init_header(sc, agfl_bp, &agfl_extents, flcount);
+
+	/*
+	 * Ok, the AGFL should be ready to go now.  Roll the transaction so
+	 * that we can free any AGFL overflow.
+	 */
+	sc->sa.agf_bp = agf_bp;
+	sc->sa.agfl_bp = agfl_bp;
+	error = xfs_repair_roll_ag_trans(sc);
+	if (error)
+		goto err;
+
+	/* Dump any AGFL overflow. */
+	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
+	return xfs_repair_reap_btree_extents(sc, &agfl_extents, &oinfo,
+			XFS_AG_RESV_AGFL);
+err:
+	xfs_repair_cancel_btree_extents(sc, &agfl_extents);
+	return error;
+}
diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index 326be4e8b71e..bcdaa8df18f6 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -127,9 +127,12 @@ xfs_repair_roll_ag_trans(
 	int				error;
 
 	/* Keep the AG header buffers locked so we can keep going. */
-	xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
-	xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
-	xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
+	if (sc->sa.agi_bp)
+		xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
+	if (sc->sa.agf_bp)
+		xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
+	if (sc->sa.agfl_bp)
+		xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
 
 	/* Roll the transaction. */
 	error = xfs_trans_roll(&sc->tp);
@@ -137,9 +140,12 @@ xfs_repair_roll_ag_trans(
 		goto out_release;
 
 	/* Join AG headers to the new transaction. */
-	xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
-	xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
-	xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
+	if (sc->sa.agi_bp)
+		xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
+	if (sc->sa.agf_bp)
+		xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
+	if (sc->sa.agfl_bp)
+		xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
 
 	return 0;
 
@@ -149,9 +155,12 @@ xfs_repair_roll_ag_trans(
 	 * buffers will be released during teardown on our way out
 	 * of the kernel.
 	 */
-	xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
-	xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
-	xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
+	if (sc->sa.agi_bp)
+		xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
+	if (sc->sa.agf_bp)
+		xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
+	if (sc->sa.agfl_bp)
+		xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
 
 	return error;
 }
@@ -408,6 +417,85 @@ xfs_repair_collect_btree_extent(
 	return 0;
 }
 
+/*
+ * Help record all btree blocks seen while iterating all records of a btree.
+ *
+ * We know that the btree query_all function starts at the left edge and walks
+ * towards the right edge of the tree.  Therefore, we know that we can walk up
+ * the btree cursor towards the root; if the pointer for a given level points
+ * to the first record/key in that block, we haven't seen this block before;
+ * and therefore we need to remember that we saw this block in the btree.
+ *
+ * So if our btree is:
+ *
+ *    4
+ *  / | \
+ * 1  2  3
+ *
+ * Pretend for this example that each leaf block has 100 btree records.  For
+ * the first btree record, we'll observe that bc_ptrs[0] == 1, so we record
+ * that we saw block 1.  Then we observe that bc_ptrs[1] == 1, so we record
+ * block 4.  The list is [1, 4].
+ *
+ * For the second btree record, we see that bc_ptrs[0] == 2, so we exit the
+ * loop.  The list remains [1, 4].
+ *
+ * For the 101st btree record, we've moved onto leaf block 2.  Now
+ * bc_ptrs[0] == 1 again, so we record that we saw block 2.  We see that
+ * bc_ptrs[1] == 2, so we exit the loop.  The list is now [1, 4, 2].
+ *
+ * For the 102nd record, bc_ptrs[0] == 2, so we continue.
+ *
+ * For the 201st record, we've moved on to leaf block 3.  bc_ptrs[0] == 1, so
+ * we add 3 to the list.  Now it is [1, 4, 2, 3].
+ *
+ * For the 300th record we just exit, with the list being [1, 4, 2, 3].
+ *
+ * The *iter_fn can return XFS_BTREE_QUERY_RANGE_ABORT to stop, 0 to keep
+ * iterating, or the usual negative error code.
+ */
+int
+xfs_repair_collect_btree_cur_blocks(
+	struct xfs_scrub_context	*sc,
+	struct xfs_btree_cur		*cur,
+	int				(*iter_fn)(struct xfs_scrub_context *sc,
+						   xfs_fsblock_t fsbno,
+						   xfs_fsblock_t len,
+						   void *priv),
+	void				*priv)
+{
+	struct xfs_buf			*bp;
+	xfs_fsblock_t			fsb;
+	int				i;
+	int				error;
+
+	for (i = 0; i < cur->bc_nlevels && cur->bc_ptrs[i] == 1; i++) {
+		xfs_btree_get_block(cur, i, &bp);
+		if (!bp)
+			continue;
+		fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
+		error = iter_fn(sc, fsb, 1, priv);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
+/*
+ * Simple adapter to connect xfs_repair_collect_btree_extent to
+ * xfs_repair_collect_btree_cur_blocks.
+ */
+int
+xfs_repair_collect_btree_cur_blocks_in_extent_list(
+	struct xfs_scrub_context	*sc,
+	xfs_fsblock_t			fsbno,
+	xfs_fsblock_t			len,
+	void				*priv)
+{
+	return xfs_repair_collect_btree_extent(sc, priv, fsbno, len);
+}
+
 /*
  * An error happened during the rebuild so the transaction will be cancelled.
  * The fs will shut down, and the administrator has to unmount and run repair.
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index ef47826b6725..f2af5923aa75 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -48,9 +48,20 @@ xfs_repair_init_extent_list(
 
 #define for_each_xfs_repair_extent_safe(rbe, n, exlist) \
 	list_for_each_entry_safe((rbe), (n), &(exlist)->list, list)
+#define for_each_xfs_repair_extent(rbe, exlist) \
+	list_for_each_entry((rbe), &(exlist)->list, list)
 int xfs_repair_collect_btree_extent(struct xfs_scrub_context *sc,
 		struct xfs_repair_extent_list *btlist, xfs_fsblock_t fsbno,
 		xfs_extlen_t len);
+int xfs_repair_collect_btree_cur_blocks(struct xfs_scrub_context *sc,
+		struct xfs_btree_cur *cur,
+		int (*iter_fn)(struct xfs_scrub_context *sc,
+			       xfs_fsblock_t fsbno, xfs_fsblock_t len,
+			       void *priv),
+		void *priv);
+int xfs_repair_collect_btree_cur_blocks_in_extent_list(
+		struct xfs_scrub_context *sc, xfs_fsblock_t fsbno,
+		xfs_fsblock_t len, void *priv);
 void xfs_repair_cancel_btree_extents(struct xfs_scrub_context *sc,
 		struct xfs_repair_extent_list *btlist);
 int xfs_repair_subtract_extents(struct xfs_scrub_context *sc,
@@ -89,6 +100,8 @@ int xfs_repair_ino_dqattach(struct xfs_scrub_context *sc);
 
 int xfs_repair_probe(struct xfs_scrub_context *sc);
 int xfs_repair_superblock(struct xfs_scrub_context *sc);
+int xfs_repair_agf(struct xfs_scrub_context *sc);
+int xfs_repair_agfl(struct xfs_scrub_context *sc);
 
 #else
 
@@ -112,6 +125,8 @@ xfs_repair_calc_ag_resblks(
 
 #define xfs_repair_probe		xfs_repair_notsupported
 #define xfs_repair_superblock		xfs_repair_notsupported
+#define xfs_repair_agf			xfs_repair_notsupported
+#define xfs_repair_agfl			xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 58ae76b3a421..8e11c3c699fb 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -208,13 +208,13 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.type	= ST_PERAG,
 		.setup	= xfs_scrub_setup_fs,
 		.scrub	= xfs_scrub_agf,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_agf,
 	},
 	[XFS_SCRUB_TYPE_AGFL]= {	/* agfl */
 		.type	= ST_PERAG,
 		.setup	= xfs_scrub_setup_fs,
 		.scrub	= xfs_scrub_agfl,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_agfl,
 	},
 	[XFS_SCRUB_TYPE_AGI] = {	/* agi */
 		.type	= ST_PERAG,
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 524f543c5b82..c08785cf83a9 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -126,6 +126,60 @@ xfs_trans_dup(
 	return ntp;
 }
 
+/*
+ * Try to reserve more blocks for a transaction.  The single use case we
+ * support is for online repair -- use a transaction to gather data without
+ * fear of btree cycle deadlocks; calculate how many blocks we really need
+ * from that data; and only then start modifying data.  This can fail due to
+ * ENOSPC, so we have to be able to cancel the transaction.
+ */
+int
+xfs_trans_reserve_more(
+	struct xfs_trans	*tp,
+	uint			blocks,
+	uint			rtextents)
+{
+	struct xfs_mount	*mp = tp->t_mountp;
+	bool			rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
+	int			error = 0;
+
+	ASSERT(!(tp->t_flags & XFS_TRANS_DIRTY));
+
+	/*
+	 * Attempt to reserve the needed disk blocks by decrementing
+	 * the number needed from the number available.  This will
+	 * fail if the count would go below zero.
+	 */
+	if (blocks > 0) {
+		error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
+		if (error)
+			return -ENOSPC;
+		tp->t_blk_res += blocks;
+	}
+
+	/*
+	 * Attempt to reserve the needed realtime extents by decrementing
+	 * the number needed from the number available.  This will
+	 * fail if the count would go below zero.
+	 */
+	if (rtextents > 0) {
+		error = xfs_mod_frextents(mp, -((int64_t)rtextents));
+		if (error) {
+			error = -ENOSPC;
+			goto out_blocks;
+		}
+		tp->t_rtx_res += rtextents;
+	}
+
+	return 0;
+out_blocks:
+	if (blocks > 0) {
+		xfs_mod_fdblocks(mp, (int64_t)blocks, rsvd);
+		tp->t_blk_res -= blocks;
+	}
+	return error;
+}
+
 /*
  * This is called to reserve free disk blocks and log space for the
  * given transaction.  This must be done before allocating any resources
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 6526314f0b8f..bdbd3d5fd7b0 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -153,6 +153,8 @@ typedef struct xfs_trans {
 int		xfs_trans_alloc(struct xfs_mount *mp, struct xfs_trans_res *resp,
 			uint blocks, uint rtextents, uint flags,
 			struct xfs_trans **tpp);
+int		xfs_trans_reserve_more(struct xfs_trans *tp, uint blocks,
+			uint rtextents);
 int		xfs_trans_alloc_empty(struct xfs_mount *mp,
 			struct xfs_trans **tpp);
 void		xfs_trans_mod_sb(xfs_trans_t *, uint, int64_t);


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH 05/21] xfs: repair the AGI
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (3 preceding siblings ...)
  2018-06-24 19:23 ` [PATCH 04/21] xfs: repair the AGF and AGFL Darrick J. Wong
@ 2018-06-24 19:24 ` Darrick J. Wong
  2018-06-27  2:22   ` Dave Chinner
  2018-06-28 21:15   ` Allison Henderson
  2018-06-24 19:24 ` [PATCH 06/21] xfs: repair free space btrees Darrick J. Wong
                   ` (15 subsequent siblings)
  20 siblings, 2 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Rebuild the AGI header items with some help from the rmapbt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/agheader_repair.c |  211 ++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/repair.h          |    2 
 fs/xfs/scrub/scrub.c           |    2 
 3 files changed, 214 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c
index 90e5e6cbc911..61e0134f6f9f 100644
--- a/fs/xfs/scrub/agheader_repair.c
+++ b/fs/xfs/scrub/agheader_repair.c
@@ -698,3 +698,214 @@ xfs_repair_agfl(
 	xfs_repair_cancel_btree_extents(sc, &agfl_extents);
 	return error;
 }
+
+/* AGI */
+
+enum {
+	REPAIR_AGI_INOBT = 0,
+	REPAIR_AGI_FINOBT,
+	REPAIR_AGI_END,
+	REPAIR_AGI_MAX
+};
+
+static const struct xfs_repair_find_ag_btree repair_agi[] = {
+	[REPAIR_AGI_INOBT] = {
+		.rmap_owner = XFS_RMAP_OWN_INOBT,
+		.buf_ops = &xfs_inobt_buf_ops,
+		.magic = XFS_IBT_CRC_MAGIC,
+	},
+	[REPAIR_AGI_FINOBT] = {
+		.rmap_owner = XFS_RMAP_OWN_INOBT,
+		.buf_ops = &xfs_inobt_buf_ops,
+		.magic = XFS_FIBT_CRC_MAGIC,
+	},
+	[REPAIR_AGI_END] = {
+		.buf_ops = NULL
+	},
+};
+
+/* Find the inode btree roots from the rmap data. */
+STATIC int
+xfs_repair_agi_find_btrees(
+	struct xfs_scrub_context	*sc,
+	struct xfs_repair_find_ag_btree	*fab)
+{
+	struct xfs_buf			*agf_bp;
+	struct xfs_mount		*mp = sc->mp;
+	int				error;
+
+	memcpy(fab, repair_agi, sizeof(repair_agi));
+
+	/* Read the AGF. */
+	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
+	if (error)
+		return error;
+	if (!agf_bp)
+		return -ENOMEM;
+
+	/* Find the btree roots. */
+	error = xfs_repair_find_ag_btree_roots(sc, agf_bp, fab, NULL);
+	if (error)
+		return error;
+
+	/* We must find the inobt root. */
+	if (fab[REPAIR_AGI_INOBT].root == NULLAGBLOCK ||
+	    fab[REPAIR_AGI_INOBT].height > XFS_BTREE_MAXLEVELS)
+		return -EFSCORRUPTED;
+
+	/* We must find the finobt root if that feature is enabled. */
+	if (xfs_sb_version_hasfinobt(&mp->m_sb) &&
+	    (fab[REPAIR_AGI_FINOBT].root == NULLAGBLOCK ||
+	     fab[REPAIR_AGI_FINOBT].height > XFS_BTREE_MAXLEVELS))
+		return -EFSCORRUPTED;
+
+	return 0;
+}
+
+/*
+ * Reinitialize the AGI header, making an in-core copy of the old contents so
+ * that we know which in-core state needs to be reinitialized.
+ */
+STATIC void
+xfs_repair_agi_init_header(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*agi_bp,
+	struct xfs_agi			*old_agi)
+{
+	struct xfs_agi			*agi = XFS_BUF_TO_AGI(agi_bp);
+	struct xfs_mount		*mp = sc->mp;
+
+	memcpy(old_agi, agi, sizeof(*old_agi));
+	memset(agi, 0, BBTOB(agi_bp->b_length));
+	agi->agi_magicnum = cpu_to_be32(XFS_AGI_MAGIC);
+	agi->agi_versionnum = cpu_to_be32(XFS_AGI_VERSION);
+	agi->agi_seqno = cpu_to_be32(sc->sa.agno);
+	agi->agi_length = cpu_to_be32(xfs_ag_block_count(mp, sc->sa.agno));
+	agi->agi_newino = cpu_to_be32(NULLAGINO);
+	agi->agi_dirino = cpu_to_be32(NULLAGINO);
+	if (xfs_sb_version_hascrc(&mp->m_sb))
+		uuid_copy(&agi->agi_uuid, &mp->m_sb.sb_meta_uuid);
+
+	/* We don't know how to fix the unlinked list yet. */
+	memcpy(&agi->agi_unlinked, &old_agi->agi_unlinked,
+			sizeof(agi->agi_unlinked));
+}
+
+/* Set btree root information in an AGI. */
+STATIC void
+xfs_repair_agi_set_roots(
+	struct xfs_scrub_context	*sc,
+	struct xfs_agi			*agi,
+	struct xfs_repair_find_ag_btree	*fab)
+{
+	agi->agi_root = cpu_to_be32(fab[REPAIR_AGI_INOBT].root);
+	agi->agi_level = cpu_to_be32(fab[REPAIR_AGI_INOBT].height);
+
+	if (xfs_sb_version_hasfinobt(&sc->mp->m_sb)) {
+		agi->agi_free_root = cpu_to_be32(fab[REPAIR_AGI_FINOBT].root);
+		agi->agi_free_level =
+				cpu_to_be32(fab[REPAIR_AGI_FINOBT].height);
+	}
+}
+
+/* Update the AGI counters. */
+STATIC int
+xfs_repair_agi_update_btree_counters(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*agi_bp)
+{
+	struct xfs_btree_cur		*cur;
+	struct xfs_agi			*agi = XFS_BUF_TO_AGI(agi_bp);
+	struct xfs_mount		*mp = sc->mp;
+	xfs_agino_t			count;
+	xfs_agino_t			freecount;
+	int				error;
+
+	cur = xfs_inobt_init_cursor(mp, sc->tp, agi_bp, sc->sa.agno,
+			XFS_BTNUM_INO);
+	error = xfs_ialloc_count_inodes(cur, &count, &freecount);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	agi->agi_count = cpu_to_be32(count);
+	agi->agi_freecount = cpu_to_be32(freecount);
+	return 0;
+err:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/* Trigger reinitialization of the in-core data. */
+STATIC int
+xfs_repair_agi_reinit_incore(
+	struct xfs_scrub_context	*sc,
+	struct xfs_agi			*agi,
+	const struct xfs_agi		*old_agi)
+{
+	struct xfs_perag		*pag;
+
+	/* XXX: trigger inode count recalculation */
+
+	/* Now reinitialize the in-core counters if necessary. */
+	pag = sc->sa.pag;
+	if (!pag->pagi_init)
+		return 0;
+
+	pag->pagi_count = be32_to_cpu(agi->agi_count);
+	pag->pagi_freecount = be32_to_cpu(agi->agi_freecount);
+
+	return 0;
+}
+
+/* Repair the AGI. */
+int
+xfs_repair_agi(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_repair_find_ag_btree	fab[REPAIR_AGI_MAX];
+	struct xfs_agi			old_agi;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_buf			*agi_bp;
+	struct xfs_agi			*agi;
+	int				error;
+
+	/* We require the rmapbt to rebuild anything. */
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
+		return -EOPNOTSUPP;
+
+	xfs_scrub_perag_get(sc->mp, &sc->sa);
+	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
+			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGI_DADDR(mp)),
+			XFS_FSS_TO_BB(mp, 1), 0, &agi_bp, NULL);
+	if (error)
+		return error;
+	agi_bp->b_ops = &xfs_agi_buf_ops;
+	agi = XFS_BUF_TO_AGI(agi_bp);
+
+	/* Find the AGI btree roots. */
+	error = xfs_repair_agi_find_btrees(sc, fab);
+	if (error)
+		return error;
+
+	/* Start rewriting the header and implant the btrees we found. */
+	xfs_repair_agi_init_header(sc, agi_bp, &old_agi);
+	xfs_repair_agi_set_roots(sc, agi, fab);
+	error = xfs_repair_agi_update_btree_counters(sc, agi_bp);
+	if (error)
+		goto out_revert;
+
+	/* Reinitialize in-core state. */
+	error = xfs_repair_agi_reinit_incore(sc, agi, &old_agi);
+	if (error)
+		goto out_revert;
+
+	/* Write this to disk. */
+	xfs_trans_buf_set_type(sc->tp, agi_bp, XFS_BLFT_AGI_BUF);
+	xfs_trans_log_buf(sc->tp, agi_bp, 0, BBTOB(agi_bp->b_length) - 1);
+	return error;
+
+out_revert:
+	memcpy(agi, &old_agi, sizeof(old_agi));
+	return error;
+}
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index f2af5923aa75..d541c1586d0a 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -102,6 +102,7 @@ int xfs_repair_probe(struct xfs_scrub_context *sc);
 int xfs_repair_superblock(struct xfs_scrub_context *sc);
 int xfs_repair_agf(struct xfs_scrub_context *sc);
 int xfs_repair_agfl(struct xfs_scrub_context *sc);
+int xfs_repair_agi(struct xfs_scrub_context *sc);
 
 #else
 
@@ -127,6 +128,7 @@ xfs_repair_calc_ag_resblks(
 #define xfs_repair_superblock		xfs_repair_notsupported
 #define xfs_repair_agf			xfs_repair_notsupported
 #define xfs_repair_agfl			xfs_repair_notsupported
+#define xfs_repair_agi			xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 8e11c3c699fb..0f036aab2551 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -220,7 +220,7 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.type	= ST_PERAG,
 		.setup	= xfs_scrub_setup_fs,
 		.scrub	= xfs_scrub_agi,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_agi,
 	},
 	[XFS_SCRUB_TYPE_BNOBT] = {	/* bnobt */
 		.type	= ST_PERAG,


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH 06/21] xfs: repair free space btrees
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (4 preceding siblings ...)
  2018-06-24 19:24 ` [PATCH 05/21] xfs: repair the AGI Darrick J. Wong
@ 2018-06-24 19:24 ` Darrick J. Wong
  2018-06-27  3:21   ` Dave Chinner
  2018-06-30 17:36   ` Allison Henderson
  2018-06-24 19:24 ` [PATCH 07/21] xfs: repair inode btrees Darrick J. Wong
                   ` (14 subsequent siblings)
  20 siblings, 2 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Rebuild the free space btrees from the gaps in the rmap btree.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile             |    1 
 fs/xfs/scrub/alloc.c        |    1 
 fs/xfs/scrub/alloc_repair.c |  561 +++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/common.c       |    8 +
 fs/xfs/scrub/repair.h       |    2 
 fs/xfs/scrub/scrub.c        |    4 
 fs/xfs/xfs_extent_busy.c    |   14 +
 fs/xfs/xfs_extent_busy.h    |    4 
 8 files changed, 591 insertions(+), 4 deletions(-)
 create mode 100644 fs/xfs/scrub/alloc_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index a36cccbec169..841e0824eeb6 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -164,6 +164,7 @@ xfs-$(CONFIG_XFS_QUOTA)		+= scrub/quota.o
 ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
 xfs-y				+= $(addprefix scrub/, \
 				   agheader_repair.o \
+				   alloc_repair.o \
 				   repair.o \
 				   )
 endif
diff --git a/fs/xfs/scrub/alloc.c b/fs/xfs/scrub/alloc.c
index 50e4f7fa06f0..e2514c84cb7a 100644
--- a/fs/xfs/scrub/alloc.c
+++ b/fs/xfs/scrub/alloc.c
@@ -15,7 +15,6 @@
 #include "xfs_log_format.h"
 #include "xfs_trans.h"
 #include "xfs_sb.h"
-#include "xfs_alloc.h"
 #include "xfs_rmap.h"
 #include "xfs_alloc.h"
 #include "scrub/xfs_scrub.h"
diff --git a/fs/xfs/scrub/alloc_repair.c b/fs/xfs/scrub/alloc_repair.c
new file mode 100644
index 000000000000..c25a2b0d71f1
--- /dev/null
+++ b/fs/xfs/scrub/alloc_repair.c
@@ -0,0 +1,561 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_alloc.h"
+#include "xfs_alloc_btree.h"
+#include "xfs_rmap.h"
+#include "xfs_rmap_btree.h"
+#include "xfs_inode.h"
+#include "xfs_refcount.h"
+#include "xfs_extent_busy.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/btree.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/*
+ * Free Space Btree Repair
+ * =======================
+ *
+ * The reverse mappings are supposed to record all space usage for the entire
+ * AG.  Therefore, we can recalculate the free extents in an AG by looking for
+ * gaps in the physical extents recorded in the rmapbt.  On a reflink
+ * filesystem this is a little more tricky in that we have to be aware that
+ * the rmap records are allowed to overlap.
+ *
+ * We derive which blocks belonged to the old bnobt/cntbt by recording all the
+ * OWN_AG extents and subtracting out the blocks owned by all other OWN_AG
+ * metadata: the rmapbt blocks visited while iterating the reverse mappings
+ * and the AGFL blocks.
+ *
+ * Once we have both of those pieces, we can reconstruct the bnobt and cntbt
+ * by blowing out the free block state and freeing all the extents that we
+ * found.  This adds the requirement that we can't have any busy extents in
+ * the AG because the busy code cannot handle duplicate records.
+ *
+ * Note that we can only rebuild both free space btrees at the same time
+ * because the regular extent freeing infrastructure loads both btrees at the
+ * same time.
+ */
+
+struct xfs_repair_alloc_extent {
+	struct list_head		list;
+	xfs_agblock_t			bno;
+	xfs_extlen_t			len;
+};
+
+struct xfs_repair_alloc {
+	struct xfs_repair_extent_list	nobtlist; /* rmapbt/agfl blocks */
+	struct xfs_repair_extent_list	*btlist;  /* OWN_AG blocks */
+	struct list_head		*extlist; /* free extents */
+	struct xfs_scrub_context	*sc;
+	uint64_t			nr_records; /* length of extlist */
+	xfs_agblock_t			next_bno; /* next bno we want to see */
+	xfs_agblock_t			nr_blocks; /* free blocks in extlist */
+};
+
+/* Record extents that aren't in use from gaps in the rmap records. */
+STATIC int
+xfs_repair_alloc_extent_fn(
+	struct xfs_btree_cur		*cur,
+	struct xfs_rmap_irec		*rec,
+	void				*priv)
+{
+	struct xfs_repair_alloc		*ra = priv;
+	struct xfs_repair_alloc_extent	*rae;
+	xfs_fsblock_t			fsb;
+	int				error;
+
+	/* Record all the OWN_AG blocks... */
+	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
+		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
+				rec->rm_startblock);
+		error = xfs_repair_collect_btree_extent(ra->sc,
+				ra->btlist, fsb, rec->rm_blockcount);
+		if (error)
+			return error;
+	}
+
+	/* ...and all the rmapbt blocks... */
+	error = xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
+			xfs_repair_collect_btree_cur_blocks_in_extent_list,
+			&ra->nobtlist);
+	if (error)
+		return error;
+
+	/* ...and all the free space. */
+	if (rec->rm_startblock > ra->next_bno) {
+		trace_xfs_repair_alloc_extent_fn(cur->bc_mp,
+				cur->bc_private.a.agno,
+				ra->next_bno, rec->rm_startblock - ra->next_bno,
+				XFS_RMAP_OWN_NULL, 0, 0);
+
+		rae = kmem_alloc(sizeof(struct xfs_repair_alloc_extent),
+				KM_MAYFAIL);
+		if (!rae)
+			return -ENOMEM;
+		INIT_LIST_HEAD(&rae->list);
+		rae->bno = ra->next_bno;
+		rae->len = rec->rm_startblock - ra->next_bno;
+		list_add_tail(&rae->list, ra->extlist);
+		ra->nr_records++;
+		ra->nr_blocks += rae->len;
+	}
+	ra->next_bno = max_t(xfs_agblock_t, ra->next_bno,
+			rec->rm_startblock + rec->rm_blockcount);
+	return 0;
+}
+
+/* Collect an AGFL block for the not-to-release list. */
+static int
+xfs_repair_collect_agfl_block(
+	struct xfs_mount		*mp,
+	xfs_agblock_t			bno,
+	void				*priv)
+{
+	struct xfs_repair_alloc		*ra = priv;
+	xfs_fsblock_t			fsb;
+
+	fsb = XFS_AGB_TO_FSB(mp, ra->sc->sa.agno, bno);
+	return xfs_repair_collect_btree_extent(ra->sc, &ra->nobtlist, fsb, 1);
+}
+
+/* Compare two btree extents. */
+static int
+xfs_repair_allocbt_extent_cmp(
+	void				*priv,
+	struct list_head		*a,
+	struct list_head		*b)
+{
+	struct xfs_repair_alloc_extent	*ap;
+	struct xfs_repair_alloc_extent	*bp;
+
+	ap = container_of(a, struct xfs_repair_alloc_extent, list);
+	bp = container_of(b, struct xfs_repair_alloc_extent, list);
+
+	if (ap->bno > bp->bno)
+		return 1;
+	else if (ap->bno < bp->bno)
+		return -1;
+	return 0;
+}
+
+/* Put an extent onto the free list. */
+STATIC int
+xfs_repair_allocbt_free_extent(
+	struct xfs_scrub_context	*sc,
+	xfs_fsblock_t			fsbno,
+	xfs_extlen_t			len,
+	struct xfs_owner_info		*oinfo)
+{
+	int				error;
+
+	error = xfs_free_extent(sc->tp, fsbno, len, oinfo, 0);
+	if (error)
+		return error;
+	error = xfs_repair_roll_ag_trans(sc);
+	if (error)
+		return error;
+	return xfs_mod_fdblocks(sc->mp, -(int64_t)len, false);
+}
+
+/* Find the longest free extent in the list. */
+static struct xfs_repair_alloc_extent *
+xfs_repair_allocbt_get_longest(
+	struct list_head		*free_extents)
+{
+	struct xfs_repair_alloc_extent	*rae;
+	struct xfs_repair_alloc_extent	*res = NULL;
+
+	list_for_each_entry(rae, free_extents, list) {
+		if (!res || rae->len > res->len)
+			res = rae;
+	}
+	return res;
+}
+
+/* Find the shortest free extent in the list. */
+static struct xfs_repair_alloc_extent *
+xfs_repair_allocbt_get_shortest(
+	struct list_head		*free_extents)
+{
+	struct xfs_repair_alloc_extent	*rae;
+	struct xfs_repair_alloc_extent	*res = NULL;
+
+	list_for_each_entry(rae, free_extents, list) {
+		if (!res || rae->len < res->len)
+			res = rae;
+		if (res->len == 1)
+			break;
+	}
+	return res;
+}
+
+/*
+ * Allocate a block from the (cached) shortest extent in the AG.  In theory
+ * this should never fail, since we already checked that there was enough
+ * space to handle the new btrees.
+ */
+STATIC xfs_fsblock_t
+xfs_repair_allocbt_alloc_block(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*free_extents,
+	struct xfs_repair_alloc_extent	**cached_result)
+{
+	struct xfs_repair_alloc_extent	*ext = *cached_result;
+	xfs_fsblock_t			fsb;
+
+	/* No cached result, see if we can find another. */
+	if (!ext) {
+		ext = xfs_repair_allocbt_get_shortest(free_extents);
+		ASSERT(ext);
+		if (!ext)
+			return NULLFSBLOCK;
+	}
+
+	/* Subtract one block. */
+	fsb = XFS_AGB_TO_FSB(sc->mp, sc->sa.agno, ext->bno);
+	ext->bno++;
+	ext->len--;
+	if (ext->len == 0) {
+		list_del(&ext->list);
+		kmem_free(ext);
+		ext = NULL;
+	}
+
+	*cached_result = ext;
+	return fsb;
+}
+
+/* Free every record in the extent list. */
+STATIC void
+xfs_repair_allocbt_cancel_freelist(
+	struct list_head		*extlist)
+{
+	struct xfs_repair_alloc_extent	*rae;
+	struct xfs_repair_alloc_extent	*n;
+
+	list_for_each_entry_safe(rae, n, extlist, list) {
+		list_del(&rae->list);
+		kmem_free(rae);
+	}
+}
+
+/*
+ * Iterate all reverse mappings to find (1) the free extents, (2) the OWN_AG
+ * extents, (3) the rmapbt blocks, and (4) the AGFL blocks.  The free space is
+ * (1) + (2) - (3) - (4).  Figure out if we have enough free space to
+ * reconstruct the free space btrees.  Caller must clean up the input lists
+ * if something goes wrong.
+ */
+STATIC int
+xfs_repair_allocbt_find_freespace(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*free_extents,
+	struct xfs_repair_extent_list	*old_allocbt_blocks)
+{
+	struct xfs_repair_alloc		ra;
+	struct xfs_repair_alloc_extent	*rae;
+	struct xfs_btree_cur		*cur;
+	struct xfs_mount		*mp = sc->mp;
+	xfs_agblock_t			agend;
+	xfs_agblock_t			nr_blocks;
+	int				error;
+
+	ra.extlist = free_extents;
+	ra.btlist = old_allocbt_blocks;
+	xfs_repair_init_extent_list(&ra.nobtlist);
+	ra.next_bno = 0;
+	ra.nr_records = 0;
+	ra.nr_blocks = 0;
+	ra.sc = sc;
+
+	/*
+	 * Iterate all the reverse mappings to find gaps in the physical
+	 * mappings, all the OWN_AG blocks, and all the rmapbt extents.
+	 */
+	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
+	error = xfs_rmap_query_all(cur, xfs_repair_alloc_extent_fn, &ra);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	cur = NULL;
+
+	/* Insert a record for space between the last rmap and EOAG. */
+	agend = be32_to_cpu(XFS_BUF_TO_AGF(sc->sa.agf_bp)->agf_length);
+	if (ra.next_bno < agend) {
+		rae = kmem_alloc(sizeof(struct xfs_repair_alloc_extent),
+				KM_MAYFAIL);
+		if (!rae) {
+			error = -ENOMEM;
+			goto err;
+		}
+		INIT_LIST_HEAD(&rae->list);
+		rae->bno = ra.next_bno;
+		rae->len = agend - ra.next_bno;
+		list_add_tail(&rae->list, free_extents);
+		ra.nr_records++;
+	}
+
+	/* Collect all the AGFL blocks. */
+	error = xfs_agfl_walk(mp, XFS_BUF_TO_AGF(sc->sa.agf_bp),
+			sc->sa.agfl_bp, xfs_repair_collect_agfl_block, &ra);
+	if (error)
+		goto err;
+
+	/* Do we actually have enough space to do this? */
+	nr_blocks = 2 * xfs_allocbt_calc_size(mp, ra.nr_records);
+	if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE) ||
+	    ra.nr_blocks < nr_blocks) {
+		error = -ENOSPC;
+		goto err;
+	}
+
+	/* Compute the old bnobt/cntbt blocks. */
+	error = xfs_repair_subtract_extents(sc, old_allocbt_blocks,
+			&ra.nobtlist);
+	if (error)
+		goto err;
+	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
+	return 0;
+
+err:
+	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
+	if (cur)
+		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/*
+ * Reset the global free block counter and the per-AG counters to make it look
+ * like this AG has no free space.
+ */
+STATIC int
+xfs_repair_allocbt_reset_counters(
+	struct xfs_scrub_context	*sc,
+	int				*log_flags)
+{
+	struct xfs_perag		*pag = sc->sa.pag;
+	struct xfs_agf			*agf;
+	xfs_extlen_t			oldf;
+	xfs_agblock_t			rmap_blocks;
+	int				error;
+
+	/*
+	 * Since we're abandoning the old bnobt/cntbt, we have to
+	 * decrease fdblocks by the # of blocks in those trees.
+	 * btreeblks counts the non-root blocks of the free space
+	 * and rmap btrees.  Do this before resetting the AGF counters.
+	 */
+	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
+	rmap_blocks = be32_to_cpu(agf->agf_rmap_blocks) - 1;
+	oldf = pag->pagf_btreeblks + 2;
+	oldf -= rmap_blocks;
+	error = xfs_mod_fdblocks(sc->mp, -(int64_t)oldf, false);
+	if (error)
+		return error;
+
+	/* Reset the per-AG info, both incore and ondisk. */
+	pag->pagf_btreeblks = rmap_blocks;
+	pag->pagf_freeblks = 0;
+	pag->pagf_longest = 0;
+
+	agf->agf_btreeblks = cpu_to_be32(pag->pagf_btreeblks);
+	agf->agf_freeblks = 0;
+	agf->agf_longest = 0;
+	*log_flags |= XFS_AGF_BTREEBLKS | XFS_AGF_LONGEST | XFS_AGF_FREEBLKS;
+
+	return 0;
+}
+
+/* Initialize new bnobt/cntbt roots and implant them into the AGF. */
+STATIC int
+xfs_repair_allocbt_reset_btrees(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*free_extents,
+	int				*log_flags)
+{
+	struct xfs_owner_info		oinfo;
+	struct xfs_repair_alloc_extent	*cached = NULL;
+	struct xfs_buf			*bp;
+	struct xfs_perag		*pag = sc->sa.pag;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_agf			*agf;
+	xfs_fsblock_t			bnofsb;
+	xfs_fsblock_t			cntfsb;
+	int				error;
+
+	/* Allocate new bnobt root. */
+	bnofsb = xfs_repair_allocbt_alloc_block(sc, free_extents, &cached);
+	if (bnofsb == NULLFSBLOCK)
+		return -ENOSPC;
+
+	/* Allocate new cntbt root. */
+	cntfsb = xfs_repair_allocbt_alloc_block(sc, free_extents, &cached);
+	if (cntfsb == NULLFSBLOCK)
+		return -ENOSPC;
+
+	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
+	/* Initialize new bnobt root. */
+	error = xfs_repair_init_btblock(sc, bnofsb, &bp, XFS_BTNUM_BNO,
+			&xfs_allocbt_buf_ops);
+	if (error)
+		return error;
+	agf->agf_roots[XFS_BTNUM_BNOi] =
+			cpu_to_be32(XFS_FSB_TO_AGBNO(mp, bnofsb));
+	agf->agf_levels[XFS_BTNUM_BNOi] = cpu_to_be32(1);
+
+	/* Initialize new cntbt root. */
+	error = xfs_repair_init_btblock(sc, cntfsb, &bp, XFS_BTNUM_CNT,
+			&xfs_allocbt_buf_ops);
+	if (error)
+		return error;
+	agf->agf_roots[XFS_BTNUM_CNTi] =
+			cpu_to_be32(XFS_FSB_TO_AGBNO(mp, cntfsb));
+	agf->agf_levels[XFS_BTNUM_CNTi] = cpu_to_be32(1);
+
+	/* Add rmap records for the btree roots */
+	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
+	error = xfs_rmap_alloc(sc->tp, sc->sa.agf_bp, sc->sa.agno,
+			XFS_FSB_TO_AGBNO(mp, bnofsb), 1, &oinfo);
+	if (error)
+		return error;
+	error = xfs_rmap_alloc(sc->tp, sc->sa.agf_bp, sc->sa.agno,
+			XFS_FSB_TO_AGBNO(mp, cntfsb), 1, &oinfo);
+	if (error)
+		return error;
+
+	/* Reset the incore state. */
+	pag->pagf_levels[XFS_BTNUM_BNOi] = 1;
+	pag->pagf_levels[XFS_BTNUM_CNTi] = 1;
+
+	*log_flags |=  XFS_AGF_ROOTS | XFS_AGF_LEVELS;
+	return 0;
+}
+
+/* Build new free space btrees and dispose of the old one. */
+STATIC int
+xfs_repair_allocbt_rebuild_trees(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*free_extents,
+	struct xfs_repair_extent_list	*old_allocbt_blocks)
+{
+	struct xfs_owner_info		oinfo;
+	struct xfs_repair_alloc_extent	*rae;
+	struct xfs_repair_alloc_extent	*n;
+	struct xfs_repair_alloc_extent	*longest;
+	int				error;
+
+	xfs_rmap_skip_owner_update(&oinfo);
+
+	/*
+	 * Insert the longest free extent in case it's necessary to
+	 * refresh the AGFL with multiple blocks.  If there is no longest
+	 * extent, we had exactly the free space we needed; we're done.
+	 */
+	longest = xfs_repair_allocbt_get_longest(free_extents);
+	if (!longest)
+		goto done;
+	error = xfs_repair_allocbt_free_extent(sc,
+			XFS_AGB_TO_FSB(sc->mp, sc->sa.agno, longest->bno),
+			longest->len, &oinfo);
+	list_del(&longest->list);
+	kmem_free(longest);
+	if (error)
+		return error;
+
+	/* Insert records into the new btrees. */
+	list_sort(NULL, free_extents, xfs_repair_allocbt_extent_cmp);
+	list_for_each_entry_safe(rae, n, free_extents, list) {
+		error = xfs_repair_allocbt_free_extent(sc,
+				XFS_AGB_TO_FSB(sc->mp, sc->sa.agno, rae->bno),
+				rae->len, &oinfo);
+		if (error)
+			return error;
+		list_del(&rae->list);
+		kmem_free(rae);
+	}
+
+done:
+	/* Free all the OWN_AG blocks that are not in the rmapbt/agfl. */
+	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
+	return xfs_repair_reap_btree_extents(sc, old_allocbt_blocks, &oinfo,
+			XFS_AG_RESV_NONE);
+}
+
+/* Repair the freespace btrees for some AG. */
+int
+xfs_repair_allocbt(
+	struct xfs_scrub_context	*sc)
+{
+	struct list_head		free_extents;
+	struct xfs_repair_extent_list	old_allocbt_blocks;
+	struct xfs_mount		*mp = sc->mp;
+	int				log_flags = 0;
+	int				error;
+
+	/* We require the rmapbt to rebuild anything. */
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
+		return -EOPNOTSUPP;
+
+	xfs_scrub_perag_get(sc->mp, &sc->sa);
+
+	/*
+	 * Make sure the busy extent list is clear because we can't put
+	 * extents on there twice.
+	 */
+	if (!xfs_extent_busy_list_empty(sc->sa.pag))
+		return -EDEADLOCK;
+
+	/* Collect the free space data and find the old btree blocks. */
+	INIT_LIST_HEAD(&free_extents);
+	xfs_repair_init_extent_list(&old_allocbt_blocks);
+	error = xfs_repair_allocbt_find_freespace(sc, &free_extents,
+			&old_allocbt_blocks);
+	if (error)
+		goto out;
+
+	/*
+	 * Blow out the old free space btrees.  This is the point at which
+	 * we are no longer able to bail out gracefully.
+	 */
+	error = xfs_repair_allocbt_reset_counters(sc, &log_flags);
+	if (error)
+		goto out;
+	error = xfs_repair_allocbt_reset_btrees(sc, &free_extents, &log_flags);
+	if (error)
+		goto out;
+	xfs_alloc_log_agf(sc->tp, sc->sa.agf_bp, log_flags);
+
+	/* Invalidate the old freespace btree blocks and commit. */
+	error = xfs_repair_invalidate_blocks(sc, &old_allocbt_blocks);
+	if (error)
+		goto out;
+	error = xfs_repair_roll_ag_trans(sc);
+	if (error)
+		goto out;
+
+	/* Now rebuild the freespace information. */
+	error = xfs_repair_allocbt_rebuild_trees(sc, &free_extents,
+			&old_allocbt_blocks);
+out:
+	xfs_repair_allocbt_cancel_freelist(&free_extents);
+	xfs_repair_cancel_btree_extents(sc, &old_allocbt_blocks);
+	return error;
+}
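The free space reconstruction above derives free extents as the gaps between successive reverse mappings, plus a final record for the space between the last rmap and the end of the AG. A minimal userspace sketch of that gap-walking logic (the `rmap_rec`/`find_free_extents` names are illustrative, not the kernel API, and allocation-group details are elided):

```c
#include <assert.h>
#include <stddef.h>

/* A simplified rmap record: blocks [startblock, startblock + blockcount). */
struct rmap_rec {
	unsigned int startblock;
	unsigned int blockcount;
};

/*
 * Walk sorted rmap records and emit free extents into out[]: the gap
 * between next_bno and each record's start is free, as is the space
 * between the last record and the end of the AG (agend).  Returns the
 * number of free extents found.
 */
static size_t find_free_extents(const struct rmap_rec *recs, size_t nrecs,
				unsigned int agend,
				struct rmap_rec *out, size_t outsz)
{
	unsigned int next_bno = 0;
	size_t n = 0;
	size_t i;

	for (i = 0; i < nrecs; i++) {
		if (recs[i].startblock > next_bno && n < outsz) {
			out[n].startblock = next_bno;
			out[n].blockcount = recs[i].startblock - next_bno;
			n++;
		}
		if (recs[i].startblock + recs[i].blockcount > next_bno)
			next_bno = recs[i].startblock + recs[i].blockcount;
	}
	/* Record for the space between the last rmap and EOAG. */
	if (next_bno < agend && n < outsz) {
		out[n].startblock = next_bno;
		out[n].blockcount = agend - next_bno;
		n++;
	}
	return n;
}
```

The real code must also tolerate overlapping rmaps (shared blocks) and separate out the OWN_AG-owned extents, which this sketch ignores.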
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 70e70c69f83f..c1132a40a366 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -623,8 +623,14 @@ xfs_scrub_setup_ag_btree(
 	 * expensive operation should be performed infrequently and only
 	 * as a last resort.  Any caller that sets force_log should
 	 * document why they need to do so.
+	 *
+	 * Force everything in memory out to disk if we're repairing.
+	 * This ensures we won't get tripped up by btree blocks sitting
+	 * in memory waiting to have LSNs stamped in.  The AGF/AGI repair
+	 * routines use any available rmap data to try to find a btree
+	 * root that also passes the read verifiers.
 	 */
-	if (force_log) {
+	if (force_log || (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR)) {
 		error = xfs_scrub_checkpoint_log(mp);
 		if (error)
 			return error;
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index d541c1586d0a..e5f67fc68e9a 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -103,6 +103,7 @@ int xfs_repair_superblock(struct xfs_scrub_context *sc);
 int xfs_repair_agf(struct xfs_scrub_context *sc);
 int xfs_repair_agfl(struct xfs_scrub_context *sc);
 int xfs_repair_agi(struct xfs_scrub_context *sc);
+int xfs_repair_allocbt(struct xfs_scrub_context *sc);
 
 #else
 
@@ -129,6 +130,7 @@ xfs_repair_calc_ag_resblks(
 #define xfs_repair_agf			xfs_repair_notsupported
 #define xfs_repair_agfl			xfs_repair_notsupported
 #define xfs_repair_agi			xfs_repair_notsupported
+#define xfs_repair_allocbt		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
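The repair.h hunk above follows the kernel's usual pattern of mapping every repair entry point to a common "not supported" stub when the feature is compiled out, so callers need no #ifdefs of their own. A generic sketch of the pattern (the `repair_*` names and `CONFIG_TOY_REPAIR` macro are illustrative):

```c
#include <assert.h>
#include <errno.h>

/* Common stub used for every entry point when repair is compiled out. */
static int repair_notsupported(void)
{
	return -EOPNOTSUPP;
}

#ifdef CONFIG_TOY_REPAIR
static int repair_allocbt(void)
{
	return 0;	/* real implementation would go here */
}
#else
/* Feature compiled out: alias the entry point to the stub. */
# define repair_allocbt	repair_notsupported
#endif
```

Callers simply invoke `repair_allocbt()` either way; the configuration decides at compile time which body runs.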
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 0f036aab2551..7a55b20b7e4e 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -226,13 +226,13 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.type	= ST_PERAG,
 		.setup	= xfs_scrub_setup_ag_allocbt,
 		.scrub	= xfs_scrub_bnobt,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_allocbt,
 	},
 	[XFS_SCRUB_TYPE_CNTBT] = {	/* cntbt */
 		.type	= ST_PERAG,
 		.setup	= xfs_scrub_setup_ag_allocbt,
 		.scrub	= xfs_scrub_cntbt,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_allocbt,
 	},
 	[XFS_SCRUB_TYPE_INOBT] = {	/* inobt */
 		.type	= ST_PERAG,
diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
index 0ed68379e551..82f99633a597 100644
--- a/fs/xfs/xfs_extent_busy.c
+++ b/fs/xfs/xfs_extent_busy.c
@@ -657,3 +657,17 @@ xfs_extent_busy_ag_cmp(
 		diff = b1->bno - b2->bno;
 	return diff;
 }
+
+/* Are there any busy extents in this AG? */
+bool
+xfs_extent_busy_list_empty(
+	struct xfs_perag	*pag)
+{
+	spin_lock(&pag->pagb_lock);
+	if (pag->pagb_tree.rb_node) {
+		spin_unlock(&pag->pagb_lock);
+		return false;
+	}
+	spin_unlock(&pag->pagb_lock);
+	return true;
+}
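The new `xfs_extent_busy_list_empty()` helper just samples the busy-extent rbtree root under the per-AG lock. A hedged userspace sketch of the same idea, using a pthread mutex in place of the kernel spinlock (`toy_*` names are ours, not kernel API):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Toy stand-ins for the per-AG busy extent tree and its lock. */
struct toy_perag {
	pthread_mutex_t	pagb_lock;
	void		*pagb_tree_root;	/* NULL when no busy extents */
};

/*
 * Sample the tree root under the lock so the read is coherent, then
 * drop the lock.  The answer is only a point-in-time snapshot: the
 * list can become non-empty the moment the lock is released, which is
 * why the repair caller treats a non-empty list as "back out and try
 * again" (-EDEADLOCK) rather than waiting on it.
 */
static int toy_busy_list_empty(struct toy_perag *pag)
{
	int empty;

	pthread_mutex_lock(&pag->pagb_lock);
	empty = (pag->pagb_tree_root == NULL);
	pthread_mutex_unlock(&pag->pagb_lock);
	return empty;
}
```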
diff --git a/fs/xfs/xfs_extent_busy.h b/fs/xfs/xfs_extent_busy.h
index 990ab3891971..df1ea61df16e 100644
--- a/fs/xfs/xfs_extent_busy.h
+++ b/fs/xfs/xfs_extent_busy.h
@@ -65,4 +65,8 @@ static inline void xfs_extent_busy_sort(struct list_head *list)
 	list_sort(NULL, list, xfs_extent_busy_ag_cmp);
 }
 
+bool
+xfs_extent_busy_list_empty(
+	struct xfs_perag	*pag);
+
 #endif /* __XFS_EXTENT_BUSY_H__ */



* [PATCH 07/21] xfs: repair inode btrees
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (5 preceding siblings ...)
  2018-06-24 19:24 ` [PATCH 06/21] xfs: repair free space btrees Darrick J. Wong
@ 2018-06-24 19:24 ` Darrick J. Wong
  2018-06-28  0:55   ` Dave Chinner
  2018-06-30 17:36   ` Allison Henderson
  2018-06-24 19:24 ` [PATCH 08/21] xfs: defer iput on certain inodes while scrub / repair are running Darrick J. Wong
                   ` (13 subsequent siblings)
  20 siblings, 2 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Use the rmapbt to find inode chunks, query the chunks to compute
hole and free masks, and with that information rebuild the inobt
and finobt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile              |    1 
 fs/xfs/scrub/ialloc_repair.c |  585 ++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/repair.h        |    2 
 fs/xfs/scrub/scrub.c         |    4 
 4 files changed, 590 insertions(+), 2 deletions(-)
 create mode 100644 fs/xfs/scrub/ialloc_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 841e0824eeb6..837fd4a95f6f 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -165,6 +165,7 @@ ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
 xfs-y				+= $(addprefix scrub/, \
 				   agheader_repair.o \
 				   alloc_repair.o \
+				   ialloc_repair.o \
 				   repair.o \
 				   )
 endif
diff --git a/fs/xfs/scrub/ialloc_repair.c b/fs/xfs/scrub/ialloc_repair.c
new file mode 100644
index 000000000000..29c736466bba
--- /dev/null
+++ b/fs/xfs/scrub/ialloc_repair.c
@@ -0,0 +1,585 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_inode.h"
+#include "xfs_alloc.h"
+#include "xfs_ialloc.h"
+#include "xfs_ialloc_btree.h"
+#include "xfs_icache.h"
+#include "xfs_rmap.h"
+#include "xfs_rmap_btree.h"
+#include "xfs_log.h"
+#include "xfs_trans_priv.h"
+#include "xfs_error.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/btree.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/*
+ * Inode Btree Repair
+ * ==================
+ *
+ * Iterate the reverse mapping records looking for OWN_INODES and OWN_INOBT
+ * records.  The OWN_INOBT records are the old inode btree blocks and will be
+ * cleared out after we've rebuilt the tree.  Each possible inode chunk within
+ * an OWN_INODES record will be read in and the freemask calculated from the
+ * i_mode data in the inode chunk.  For sparse inodes the holemask will be
+ * calculated by creating the properly aligned inobt record and punching out
+ * any chunk that's missing.  Inode allocations and frees grab the AGI first,
+ * so repair protects itself from concurrent access by locking the AGI.
+ *
+ * Once we've reconstructed all the inode records, we can create new inode
+ * btree roots and reload the btrees.  We rebuild both inode trees at the same
+ * time because they have the same rmap owner and it would be more complex to
+ * figure out if the other tree isn't in need of a rebuild and which OWN_INOBT
+ * blocks it owns.  We have all the data we need to build both, so dump
+ * everything and start over.
+ */
+
+struct xfs_repair_ialloc_extent {
+	struct list_head		list;
+	xfs_inofree_t			freemask;
+	xfs_agino_t			startino;
+	unsigned int			count;
+	unsigned int			usedcount;
+	uint16_t			holemask;
+};
+
+struct xfs_repair_ialloc {
+	struct list_head		*extlist;
+	struct xfs_repair_extent_list	*btlist;
+	struct xfs_scrub_context	*sc;
+	uint64_t			nr_records;
+};
+
+/*
+ * Is this inode in use?  If the inode is in memory we can tell from i_mode,
+ * otherwise we have to check di_mode in the on-disk buffer.  We only care
+ * that the high (i.e. non-permission) bits of the mode are zero.  This should
+ * be safe because repair keeps all AG headers locked until the end, and any
+ * process trying to perform an inode allocation/free must lock the AGI.
+ */
+STATIC int
+xfs_repair_ialloc_check_free(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*bp,
+	xfs_ino_t			fsino,
+	xfs_agino_t			bpino,
+	bool				*inuse)
+{
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_dinode		*dip;
+	int				error;
+
+	/* Will the in-core inode tell us if it's in use? */
+	error = xfs_icache_inode_is_allocated(mp, sc->tp, fsino, inuse);
+	if (!error)
+		return 0;
+
+	/* Inode uncached or half assembled, read disk buffer */
+	dip = xfs_buf_offset(bp, bpino * mp->m_sb.sb_inodesize);
+	if (be16_to_cpu(dip->di_magic) != XFS_DINODE_MAGIC)
+		return -EFSCORRUPTED;
+
+	if (dip->di_version >= 3 && be64_to_cpu(dip->di_ino) != fsino)
+		return -EFSCORRUPTED;
+
+	*inuse = dip->di_mode != 0;
+	return 0;
+}
+
+/*
+ * For each cluster in this blob of inodes, we must calculate the
+ * properly aligned startino of that cluster, then iterate each
+ * cluster to fill in used and filled masks appropriately.  We
+ * then use the (startino, used, filled) information to construct
+ * the appropriate inode records.
+ */
+STATIC int
+xfs_repair_ialloc_process_cluster(
+	struct xfs_repair_ialloc	*ri,
+	xfs_agblock_t			agbno,
+	int				blks_per_cluster,
+	xfs_agino_t			rec_agino)
+{
+	struct xfs_imap			imap;
+	struct xfs_repair_ialloc_extent	*rie;
+	struct xfs_dinode		*dip;
+	struct xfs_buf			*bp;
+	struct xfs_scrub_context	*sc = ri->sc;
+	struct xfs_mount		*mp = sc->mp;
+	xfs_ino_t			fsino;
+	xfs_inofree_t			usedmask;
+	xfs_agino_t			nr_inodes;
+	xfs_agino_t			startino;
+	xfs_agino_t			clusterino;
+	xfs_agino_t			clusteroff;
+	xfs_agino_t			agino;
+	uint16_t			fillmask;
+	bool				inuse;
+	int				usedcount;
+	int				error;
+
+	/* The per-AG inum of this inode cluster. */
+	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
+
+	/* The per-AG inum of the inobt record. */
+	startino = rec_agino + rounddown(agino - rec_agino,
+			XFS_INODES_PER_CHUNK);
+
+	/* The per-AG inum of the cluster within the inobt record. */
+	clusteroff = agino - startino;
+
+	/* Every inode in this holemask slot is filled. */
+	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
+	fillmask = xfs_inobt_maskn(clusteroff / XFS_INODES_PER_HOLEMASK_BIT,
+			nr_inodes / XFS_INODES_PER_HOLEMASK_BIT);
+
+	/* Grab the inode cluster buffer. */
+	imap.im_blkno = XFS_AGB_TO_DADDR(mp, sc->sa.agno, agbno);
+	imap.im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
+	imap.im_boffset = 0;
+
+	error = xfs_imap_to_bp(mp, sc->tp, &imap, &dip, &bp, 0,
+			XFS_IGET_UNTRUSTED);
+	if (error)
+		return error;
+
+	usedmask = 0;
+	usedcount = 0;
+	/* Which inodes within this cluster are free? */
+	for (clusterino = 0; clusterino < nr_inodes; clusterino++) {
+		fsino = XFS_AGINO_TO_INO(mp, sc->sa.agno, agino + clusterino);
+		error = xfs_repair_ialloc_check_free(sc, bp, fsino,
+				clusterino, &inuse);
+		if (error) {
+			xfs_trans_brelse(sc->tp, bp);
+			return error;
+		}
+		if (inuse) {
+			usedcount++;
+			usedmask |= XFS_INOBT_MASK(clusteroff + clusterino);
+		}
+	}
+	xfs_trans_brelse(sc->tp, bp);
+
+	/*
+	 * If the last item in the list is our chunk record,
+	 * update that.
+	 */
+	if (!list_empty(ri->extlist)) {
+		rie = list_last_entry(ri->extlist,
+				struct xfs_repair_ialloc_extent, list);
+		if (rie->startino + XFS_INODES_PER_CHUNK > startino) {
+			rie->freemask &= ~usedmask;
+			rie->holemask &= ~fillmask;
+			rie->count += nr_inodes;
+			rie->usedcount += usedcount;
+			return 0;
+		}
+	}
+
+	/* New inode chunk; add to the list. */
+	rie = kmem_alloc(sizeof(struct xfs_repair_ialloc_extent), KM_MAYFAIL);
+	if (!rie)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&rie->list);
+	rie->startino = startino;
+	rie->freemask = XFS_INOBT_ALL_FREE & ~usedmask;
+	rie->holemask = XFS_INOBT_ALL_FREE & ~fillmask;
+	rie->count = nr_inodes;
+	rie->usedcount = usedcount;
+	list_add_tail(&rie->list, ri->extlist);
+	ri->nr_records++;
+
+	return 0;
+}
+
+/* Record extents that belong to inode btrees. */
+STATIC int
+xfs_repair_ialloc_extent_fn(
+	struct xfs_btree_cur		*cur,
+	struct xfs_rmap_irec		*rec,
+	void				*priv)
+{
+	struct xfs_repair_ialloc	*ri = priv;
+	struct xfs_mount		*mp = cur->bc_mp;
+	xfs_fsblock_t			fsbno;
+	xfs_agblock_t			agbno = rec->rm_startblock;
+	xfs_agino_t			inoalign;
+	xfs_agino_t			agino;
+	xfs_agino_t			rec_agino;
+	int				blks_per_cluster;
+	int				error = 0;
+
+	if (xfs_scrub_should_terminate(ri->sc, &error))
+		return error;
+
+	/* Fragment of the old btrees; dispose of them later. */
+	if (rec->rm_owner == XFS_RMAP_OWN_INOBT) {
+		fsbno = XFS_AGB_TO_FSB(mp, ri->sc->sa.agno, agbno);
+		return xfs_repair_collect_btree_extent(ri->sc, ri->btlist,
+				fsbno, rec->rm_blockcount);
+	}
+
+	/* Skip extents which are not owned by this inode and fork. */
+	if (rec->rm_owner != XFS_RMAP_OWN_INODES)
+		return 0;
+
+	blks_per_cluster = xfs_icluster_size_fsb(mp);
+
+	if (agbno % blks_per_cluster != 0)
+		return -EFSCORRUPTED;
+
+	trace_xfs_repair_ialloc_extent_fn(mp, ri->sc->sa.agno,
+			rec->rm_startblock, rec->rm_blockcount, rec->rm_owner,
+			rec->rm_offset, rec->rm_flags);
+
+	/*
+	 * Determine the inode block alignment, and where the block
+	 * ought to start if it's aligned properly.  On a sparse inode
+	 * system the rmap doesn't have to start on an alignment boundary,
+	 * but the record does.  On pre-sparse filesystems, we /must/
+	 * start both rmap and inobt on an alignment boundary.
+	 */
+	inoalign = xfs_ialloc_cluster_alignment(mp);
+	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
+	rec_agino = XFS_OFFBNO_TO_AGINO(mp, rounddown(agbno, inoalign), 0);
+	if (!xfs_sb_version_hassparseinodes(&mp->m_sb) && agino != rec_agino)
+		return -EFSCORRUPTED;
+
+	/* Set up the free/hole masks for each cluster in this inode chunk. */
+	for (;
+	     agbno < rec->rm_startblock + rec->rm_blockcount;
+	     agbno += blks_per_cluster) {
+		error = xfs_repair_ialloc_process_cluster(ri, agbno,
+				blks_per_cluster, rec_agino);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
+/* Compare two ialloc extents. */
+static int
+xfs_repair_ialloc_extent_cmp(
+	void				*priv,
+	struct list_head		*a,
+	struct list_head		*b)
+{
+	struct xfs_repair_ialloc_extent	*ap;
+	struct xfs_repair_ialloc_extent	*bp;
+
+	ap = container_of(a, struct xfs_repair_ialloc_extent, list);
+	bp = container_of(b, struct xfs_repair_ialloc_extent, list);
+
+	if (ap->startino > bp->startino)
+		return 1;
+	else if (ap->startino < bp->startino)
+		return -1;
+	return 0;
+}
+
+/* Insert an inode chunk record into a given btree. */
+static int
+xfs_repair_iallocbt_insert_btrec(
+	struct xfs_btree_cur		*cur,
+	struct xfs_repair_ialloc_extent	*rie)
+{
+	int				stat;
+	int				error;
+
+	error = xfs_inobt_lookup(cur, rie->startino, XFS_LOOKUP_EQ, &stat);
+	if (error)
+		return error;
+	XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 0);
+	error = xfs_inobt_insert_rec(cur, rie->holemask, rie->count,
+			rie->count - rie->usedcount, rie->freemask, &stat);
+	if (error)
+		return error;
+	XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 1);
+	return error;
+}
+
+/* Insert an inode chunk record into both inode btrees. */
+static int
+xfs_repair_iallocbt_insert_rec(
+	struct xfs_scrub_context	*sc,
+	struct xfs_repair_ialloc_extent	*rie)
+{
+	struct xfs_btree_cur		*cur;
+	int				error;
+
+	trace_xfs_repair_ialloc_insert(sc->mp, sc->sa.agno, rie->startino,
+			rie->holemask, rie->count, rie->count - rie->usedcount,
+			rie->freemask);
+
+	/* Insert into the inobt. */
+	cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp, sc->sa.agno,
+			XFS_BTNUM_INO);
+	error = xfs_repair_iallocbt_insert_btrec(cur, rie);
+	if (error)
+		goto out_cur;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/* Insert into the finobt if chunk has free inodes. */
+	if (xfs_sb_version_hasfinobt(&sc->mp->m_sb) &&
+	    rie->count != rie->usedcount) {
+		cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp,
+				sc->sa.agno, XFS_BTNUM_FINO);
+		error = xfs_repair_iallocbt_insert_btrec(cur, rie);
+		if (error)
+			goto out_cur;
+		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	}
+
+	return xfs_repair_roll_ag_trans(sc);
+out_cur:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/* Free every record in the inode list. */
+STATIC void
+xfs_repair_iallocbt_cancel_inorecs(
+	struct list_head		*reclist)
+{
+	struct xfs_repair_ialloc_extent	*rie;
+	struct xfs_repair_ialloc_extent	*n;
+
+	list_for_each_entry_safe(rie, n, reclist, list) {
+		list_del(&rie->list);
+		kmem_free(rie);
+	}
+}
+
+/*
+ * Iterate all reverse mappings to find the inodes (OWN_INODES) and the inode
+ * btrees (OWN_INOBT).  Figure out if we have enough free space to reconstruct
+ * the inode btrees.  The caller must clean up the lists if anything goes
+ * wrong.
+ */
+STATIC int
+xfs_repair_iallocbt_find_inodes(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*inode_records,
+	struct xfs_repair_extent_list	*old_iallocbt_blocks)
+{
+	struct xfs_repair_ialloc	ri;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_btree_cur		*cur;
+	xfs_agblock_t			nr_blocks;
+	int				error;
+
+	/* Collect all reverse mappings for inode blocks. */
+	ri.extlist = inode_records;
+	ri.btlist = old_iallocbt_blocks;
+	ri.nr_records = 0;
+	ri.sc = sc;
+
+	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
+	error = xfs_rmap_query_all(cur, xfs_repair_ialloc_extent_fn, &ri);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/* Do we actually have enough space to do this? */
+	nr_blocks = xfs_iallocbt_calc_size(mp, ri.nr_records);
+	if (xfs_sb_version_hasfinobt(&mp->m_sb))
+		nr_blocks *= 2;
+	if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE))
+		return -ENOSPC;
+
+	return 0;
+
+err:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/* Update the AGI counters. */
+STATIC int
+xfs_repair_iallocbt_reset_counters(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*inode_records,
+	int				*log_flags)
+{
+	struct xfs_agi			*agi;
+	struct xfs_repair_ialloc_extent	*rie;
+	unsigned int			count = 0;
+	unsigned int			usedcount = 0;
+	unsigned int			freecount;
+
+	/* Figure out the new counters. */
+	list_for_each_entry(rie, inode_records, list) {
+		count += rie->count;
+		usedcount += rie->usedcount;
+	}
+
+	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
+	freecount = count - usedcount;
+
+	/* XXX: trigger inode count recalculation */
+
+	/* Reset the per-AG info, both incore and ondisk. */
+	sc->sa.pag->pagi_count = count;
+	sc->sa.pag->pagi_freecount = freecount;
+	agi->agi_count = cpu_to_be32(count);
+	agi->agi_freecount = cpu_to_be32(freecount);
+	*log_flags |= XFS_AGI_COUNT | XFS_AGI_FREECOUNT;
+
+	return 0;
+}
+
+/* Initialize new inobt/finobt roots and implant them into the AGI. */
+STATIC int
+xfs_repair_iallocbt_reset_btrees(
+	struct xfs_scrub_context	*sc,
+	struct xfs_owner_info		*oinfo,
+	int				*log_flags)
+{
+	struct xfs_agi			*agi;
+	struct xfs_buf			*bp;
+	struct xfs_mount		*mp = sc->mp;
+	xfs_fsblock_t			inofsb;
+	xfs_fsblock_t			finofsb;
+	enum xfs_ag_resv_type		resv;
+	int				error;
+
+	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
+
+	/* Initialize new inobt root. */
+	resv = XFS_AG_RESV_NONE;
+	error = xfs_repair_alloc_ag_block(sc, oinfo, &inofsb, resv);
+	if (error)
+		return error;
+	error = xfs_repair_init_btblock(sc, inofsb, &bp, XFS_BTNUM_INO,
+			&xfs_inobt_buf_ops);
+	if (error)
+		return error;
+	agi->agi_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, inofsb));
+	agi->agi_level = cpu_to_be32(1);
+	*log_flags |= XFS_AGI_ROOT | XFS_AGI_LEVEL;
+
+	/* Initialize new finobt root. */
+	if (!xfs_sb_version_hasfinobt(&mp->m_sb))
+		return 0;
+
+	resv = mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA;
+	error = xfs_repair_alloc_ag_block(sc, oinfo, &finofsb, resv);
+	if (error)
+		return error;
+	error = xfs_repair_init_btblock(sc, finofsb, &bp, XFS_BTNUM_FINO,
+			&xfs_inobt_buf_ops);
+	if (error)
+		return error;
+	agi->agi_free_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, finofsb));
+	agi->agi_free_level = cpu_to_be32(1);
+	*log_flags |= XFS_AGI_FREE_ROOT | XFS_AGI_FREE_LEVEL;
+
+	return 0;
+}
+
+/* Build new inode btrees and dispose of the old one. */
+STATIC int
+xfs_repair_iallocbt_rebuild_trees(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*inode_records,
+	struct xfs_owner_info		*oinfo,
+	struct xfs_repair_extent_list	*old_iallocbt_blocks)
+{
+	struct xfs_repair_ialloc_extent	*rie;
+	struct xfs_repair_ialloc_extent	*n;
+	int				error;
+
+	/* Add all records. */
+	list_sort(NULL, inode_records, xfs_repair_ialloc_extent_cmp);
+	list_for_each_entry_safe(rie, n, inode_records, list) {
+		error = xfs_repair_iallocbt_insert_rec(sc, rie);
+		if (error)
+			return error;
+
+		list_del(&rie->list);
+		kmem_free(rie);
+	}
+
+	/* Free the old inode btree blocks if they're not in use. */
+	return xfs_repair_reap_btree_extents(sc, old_iallocbt_blocks, oinfo,
+			XFS_AG_RESV_NONE);
+}
+
+/* Repair both inode btrees. */
+int
+xfs_repair_iallocbt(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_owner_info		oinfo;
+	struct list_head		inode_records;
+	struct xfs_repair_extent_list	old_iallocbt_blocks;
+	struct xfs_mount		*mp = sc->mp;
+	int				log_flags = 0;
+	int				error = 0;
+
+	/* We require the rmapbt to rebuild anything. */
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
+		return -EOPNOTSUPP;
+
+	xfs_scrub_perag_get(sc->mp, &sc->sa);
+
+	/* Collect the free space data and find the old btree blocks. */
+	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
+	INIT_LIST_HEAD(&inode_records);
+	xfs_repair_init_extent_list(&old_iallocbt_blocks);
+	error = xfs_repair_iallocbt_find_inodes(sc, &inode_records,
+			&old_iallocbt_blocks);
+	if (error)
+		goto out;
+
+	/*
+	 * Blow out the old inode btrees.  This is the point at which
+	 * we are no longer able to bail out gracefully.
+	 */
+	error = xfs_repair_iallocbt_reset_counters(sc, &inode_records,
+			&log_flags);
+	if (error)
+		goto out;
+	error = xfs_repair_iallocbt_reset_btrees(sc, &oinfo, &log_flags);
+	if (error)
+		goto out;
+	xfs_ialloc_log_agi(sc->tp, sc->sa.agi_bp, log_flags);
+
+	/* Invalidate all the inobt/finobt blocks in btlist. */
+	error = xfs_repair_invalidate_blocks(sc, &old_iallocbt_blocks);
+	if (error)
+		goto out;
+	error = xfs_repair_roll_ag_trans(sc);
+	if (error)
+		goto out;
+
+	/* Now rebuild the inode information. */
+	error = xfs_repair_iallocbt_rebuild_trees(sc, &inode_records, &oinfo,
+			&old_iallocbt_blocks);
+out:
+	xfs_repair_cancel_btree_extents(sc, &old_iallocbt_blocks);
+	xfs_repair_iallocbt_cancel_inorecs(&inode_records);
+	return error;
+}
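The cluster-folding logic above starts each reconstructed record fully free and fully sparse, then clears freemask bits for in-use inodes and holemask bits for physically present clusters as each cluster buffer is examined. A compact userspace model of that mask arithmetic, assuming the usual 64 inodes per chunk and 16 holemask bits (so 4 inodes per holemask bit); the `toy_*` names are illustrative, not the kernel API:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INODES_PER_HOLEMASK_BIT	4	/* 64 inodes / 16 holemask bits */
#define TOY_ALL_FREE			((uint64_t)-1)

/* n contiguous bits starting at bit off, like xfs_inobt_maskn(). */
static uint16_t toy_maskn(unsigned int off, unsigned int n)
{
	return (uint16_t)(((1u << n) - 1) << off);
}

/* One reconstructed inobt record. */
struct toy_irec {
	uint64_t	freemask;	/* bit set => inode is free */
	uint16_t	holemask;	/* bit set => 4-inode span missing */
	unsigned int	count;
	unsigned int	usedcount;
};

/* Start a record with every inode free and every span a hole. */
static void toy_irec_init(struct toy_irec *rie)
{
	rie->freemask = TOY_ALL_FREE;
	rie->holemask = (uint16_t)TOY_ALL_FREE;
	rie->count = 0;
	rie->usedcount = 0;
}

/*
 * Fold one cluster's results into the record: clusteroff is the first
 * inode of the cluster within the chunk, nr_inodes its size, and
 * usedmask has a bit set (chunk-relative) for each in-use inode.
 */
static void toy_irec_add_cluster(struct toy_irec *rie,
				 unsigned int clusteroff,
				 unsigned int nr_inodes,
				 uint64_t usedmask,
				 unsigned int usedcount)
{
	uint16_t fillmask;

	fillmask = toy_maskn(clusteroff / TOY_INODES_PER_HOLEMASK_BIT,
			     nr_inodes / TOY_INODES_PER_HOLEMASK_BIT);
	rie->freemask &= ~usedmask;	/* in-use inodes are not free */
	rie->holemask &= ~fillmask;	/* present clusters are not holes */
	rie->count += nr_inodes;
	rie->usedcount += usedcount;
}
```

After every cluster of a chunk has been folded in, whatever holemask bits remain set correspond to sparse regions that were never observed on disk.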
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index e5f67fc68e9a..dcfa5eb18940 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -104,6 +104,7 @@ int xfs_repair_agf(struct xfs_scrub_context *sc);
 int xfs_repair_agfl(struct xfs_scrub_context *sc);
 int xfs_repair_agi(struct xfs_scrub_context *sc);
 int xfs_repair_allocbt(struct xfs_scrub_context *sc);
+int xfs_repair_iallocbt(struct xfs_scrub_context *sc);
 
 #else
 
@@ -131,6 +132,7 @@ xfs_repair_calc_ag_resblks(
 #define xfs_repair_agfl			xfs_repair_notsupported
 #define xfs_repair_agi			xfs_repair_notsupported
 #define xfs_repair_allocbt		xfs_repair_notsupported
+#define xfs_repair_iallocbt		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 7a55b20b7e4e..fec0e130f19e 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -238,14 +238,14 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.type	= ST_PERAG,
 		.setup	= xfs_scrub_setup_ag_iallocbt,
 		.scrub	= xfs_scrub_inobt,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_iallocbt,
 	},
 	[XFS_SCRUB_TYPE_FINOBT] = {	/* finobt */
 		.type	= ST_PERAG,
 		.setup	= xfs_scrub_setup_ag_iallocbt,
 		.scrub	= xfs_scrub_finobt,
 		.has	= xfs_sb_version_hasfinobt,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_iallocbt,
 	},
 	[XFS_SCRUB_TYPE_RMAPBT] = {	/* rmapbt */
 		.type	= ST_PERAG,



* [PATCH 08/21] xfs: defer iput on certain inodes while scrub / repair are running
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (6 preceding siblings ...)
  2018-06-24 19:24 ` [PATCH 07/21] xfs: repair inode btrees Darrick J. Wong
@ 2018-06-24 19:24 ` Darrick J. Wong
  2018-06-28 23:37   ` Dave Chinner
  2018-06-24 19:24 ` [PATCH 09/21] xfs: finish our set of inode get/put tracepoints for scrub Darrick J. Wong
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Destroying an incore inode sometimes requires some work to be done on
the inode.  For example, post-EOF blocks on a non-PREALLOC inode are
trimmed, and copy-on-write staging extents are freed.  This work is done
in separate transactions, which is bad for scrub and repair because (a)
we already have a transaction and can't nest them, and (b) if we've
frozen the filesystem for scrub/repair work, that (regular) transaction
allocation will block on the freeze.

Therefore, if we detect that work has to be done to destroy the incore
inode, we'll just hang on to the reference until after the scrub is
finished.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/common.c |   52 +++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/common.h |    1 +
 fs/xfs/scrub/dir.c    |    2 +-
 fs/xfs/scrub/parent.c |    6 +++---
 fs/xfs/scrub/scrub.c  |   20 +++++++++++++++++++
 fs/xfs/scrub/scrub.h  |    9 ++++++++
 fs/xfs/scrub/trace.h  |   30 ++++++++++++++++++++++++++++
 7 files changed, 116 insertions(+), 4 deletions(-)


diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index c1132a40a366..9740c28384b6 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -22,6 +22,7 @@
 #include "xfs_alloc_btree.h"
 #include "xfs_bmap.h"
 #include "xfs_bmap_btree.h"
+#include "xfs_bmap_util.h"
 #include "xfs_ialloc.h"
 #include "xfs_ialloc_btree.h"
 #include "xfs_refcount.h"
@@ -890,3 +891,54 @@ xfs_scrub_ilock_inverted(
 	}
 	return -EDEADLOCK;
 }
+
+/*
+ * Release a reference to an inode while the fs is running a scrub or repair.
+ * If we anticipate that destroying the incore inode will require work to be
+ * done, we'll defer the iput until after the scrub/repair releases the
+ * transaction.
+ */
+void
+xfs_scrub_iput(
+	struct xfs_scrub_context	*sc,
+	struct xfs_inode		*ip)
+{
+	/*
+	 * If this file doesn't have any blocks to be freed at release time,
+	 * go straight to iput.
+	 */
+	if (!xfs_can_free_eofblocks(ip, true))
+		goto iput;
+
+	/*
+	 * Any real/unwritten extents in the CoW fork will have to be freed,
+	 * so go straight to iput if there aren't any.
+	 */
+	if (!xfs_inode_has_cow_blocks(ip))
+		goto iput;
+
+	/*
+	 * Any blocks after the end of the file will have to be freed, so go
+	 * straight to iput if there aren't any.
+	 */
+	if (!xfs_inode_has_posteof_blocks(ip))
+		goto iput;
+
+	/*
+	 * There are no other users of i_private in XFS so if it's non-NULL
+	 * this inode is already on the deferred iput list and we can release
+	 * this reference.
+	 */
+	if (VFS_I(ip)->i_private)
+		goto iput;
+
+	/* Otherwise, add it to the deferred iput list. */
+	trace_xfs_scrub_iput_defer(ip, __return_address);
+	VFS_I(ip)->i_private = sc->deferred_iput_list;
+	sc->deferred_iput_list = VFS_I(ip);
+	return;
+
+iput:
+	trace_xfs_scrub_iput_now(ip, __return_address);
+	iput(VFS_I(ip));
+}
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index 2172bd5361e2..ca9e15af2a4f 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -140,5 +140,6 @@ static inline bool xfs_scrub_skip_xref(struct xfs_scrub_metadata *sm)
 
 int xfs_scrub_metadata_inode_forks(struct xfs_scrub_context *sc);
 int xfs_scrub_ilock_inverted(struct xfs_inode *ip, uint lock_mode);
+void xfs_scrub_iput(struct xfs_scrub_context *sc, struct xfs_inode *ip);
 
 #endif	/* __XFS_SCRUB_COMMON_H__ */
diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c
index 86324775fc9b..5cb371576732 100644
--- a/fs/xfs/scrub/dir.c
+++ b/fs/xfs/scrub/dir.c
@@ -87,7 +87,7 @@ xfs_scrub_dir_check_ftype(
 			xfs_mode_to_ftype(VFS_I(ip)->i_mode));
 	if (ino_dtype != dtype)
 		xfs_scrub_fblock_set_corrupt(sdc->sc, XFS_DATA_FORK, offset);
-	iput(VFS_I(ip));
+	xfs_scrub_iput(sdc->sc, ip);
 out:
 	return error;
 }
diff --git a/fs/xfs/scrub/parent.c b/fs/xfs/scrub/parent.c
index e2bda58c32f0..fd0b2bfb8f18 100644
--- a/fs/xfs/scrub/parent.c
+++ b/fs/xfs/scrub/parent.c
@@ -230,11 +230,11 @@ xfs_scrub_parent_validate(
 
 	/* Drat, parent changed.  Try again! */
 	if (dnum != dp->i_ino) {
-		iput(VFS_I(dp));
+		xfs_scrub_iput(sc, dp);
 		*try_again = true;
 		return 0;
 	}
-	iput(VFS_I(dp));
+	xfs_scrub_iput(sc, dp);
 
 	/*
 	 * '..' didn't change, so check that there was only one entry
@@ -247,7 +247,7 @@ xfs_scrub_parent_validate(
 out_unlock:
 	xfs_iunlock(dp, XFS_IOLOCK_SHARED);
 out_rele:
-	iput(VFS_I(dp));
+	xfs_scrub_iput(sc, dp);
 out:
 	return error;
 }
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index fec0e130f19e..b66cfbc56a34 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -157,6 +157,24 @@ xfs_scrub_probe(
 
 /* Scrub setup and teardown */
 
+/* Release all references to inodes we encountered needing deferred iput. */
+STATIC void
+xfs_scrub_iput_deferred(
+	struct xfs_scrub_context	*sc)
+{
+	struct inode			*inode, *next;
+
+	inode = sc->deferred_iput_list;
+	while (inode != (struct inode *)sc) {
+		next = inode->i_private;
+		inode->i_private = NULL;
+		trace_xfs_scrub_iput_deferred(XFS_I(inode), __return_address);
+		iput(inode);
+		inode = next;
+	}
+	sc->deferred_iput_list = sc;
+}
+
 /* Free all the resources and finish the transactions. */
 STATIC int
 xfs_scrub_teardown(
@@ -180,6 +198,7 @@ xfs_scrub_teardown(
 			iput(VFS_I(sc->ip));
 		sc->ip = NULL;
 	}
+	xfs_scrub_iput_deferred(sc);
 	if (sc->has_quotaofflock)
 		mutex_unlock(&sc->mp->m_quotainfo->qi_quotaofflock);
 	if (sc->buf) {
@@ -506,6 +525,7 @@ xfs_scrub_metadata(
 	sc.ops = &meta_scrub_ops[sm->sm_type];
 	sc.try_harder = try_harder;
 	sc.sa.agno = NULLAGNUMBER;
+	sc.deferred_iput_list = &sc;
 	error = sc.ops->setup(&sc, ip);
 	if (error)
 		goto out_teardown;
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index b295edd5fc0e..69eee2ffed29 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -65,6 +65,15 @@ struct xfs_scrub_context {
 	bool				try_harder;
 	bool				has_quotaofflock;
 
+	/*
+	 * List of inodes which cannot be released (by scrub) until after the
+	 * scrub operation concludes because we'd have to do some work to the
+	 * inode to destroy its incore representation (cow blocks, posteof
+	 * blocks, etc.).  Each inode's i_private points to the next inode, or
+	 * to the scrub context as a sentinel for the end of the list.
+	 */
+	void				*deferred_iput_list;
+
 	/* State tracking for single-AG operations. */
 	struct xfs_scrub_ag		sa;
 };
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index cec3e5ece5a1..a050a00fc258 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -480,6 +480,36 @@ TRACE_EVENT(xfs_scrub_xref_error,
 		  __entry->ret_ip)
 );
 
+DECLARE_EVENT_CLASS(xfs_scrub_iref_class,
+	TP_PROTO(struct xfs_inode *ip, xfs_failaddr_t caller_ip),
+	TP_ARGS(ip, caller_ip),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_ino_t, ino)
+		__field(int, count)
+		__field(xfs_failaddr_t, caller_ip)
+	),
+	TP_fast_assign(
+		__entry->dev = VFS_I(ip)->i_sb->s_dev;
+		__entry->ino = ip->i_ino;
+		__entry->count = atomic_read(&VFS_I(ip)->i_count);
+		__entry->caller_ip = caller_ip;
+	),
+	TP_printk("dev %d:%d ino 0x%llx count %d caller %pS",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ino,
+		  __entry->count,
+		  __entry->caller_ip)
+)
+
+#define DEFINE_SCRUB_IREF_EVENT(name) \
+DEFINE_EVENT(xfs_scrub_iref_class, name, \
+	TP_PROTO(struct xfs_inode *ip, xfs_failaddr_t caller_ip), \
+	TP_ARGS(ip, caller_ip))
+DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iput_deferred);
+DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iput_defer);
+DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iput_now);
+
 /* repair tracepoints */
 #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR)
 


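The deferred-iput list in the patch above is a singly linked list threaded through i_private, terminated by the scrub context pointer itself rather than NULL, so that a non-NULL i_private doubles as an "already queued" flag. A minimal userspace sketch of that pattern (the struct names and helpers here are illustrative stand-ins, not the kernel structures):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for struct inode / struct xfs_scrub_context. */
struct fake_inode {
	unsigned long long	ino;
	void			*i_private;	/* next inode, or the context sentinel */
};

struct fake_scrub_ctx {
	void			*deferred_iput_list;	/* == ctx itself when empty */
};

static void ctx_init(struct fake_scrub_ctx *sc)
{
	sc->deferred_iput_list = sc;	/* sentinel: list is empty */
}

/* Queue an inode unless it is already on the list (non-NULL i_private).
 * Returns 1 if queued, 0 if the caller should drop its reference now. */
static int defer_iput(struct fake_scrub_ctx *sc, struct fake_inode *ip)
{
	if (ip->i_private)
		return 0;	/* already queued */
	ip->i_private = sc->deferred_iput_list;
	sc->deferred_iput_list = ip;
	return 1;
}

/* Walk the list until we hit the context sentinel, releasing each inode;
 * returns the number of deferred references released. */
static int drain_deferred(struct fake_scrub_ctx *sc)
{
	struct fake_inode *ip = sc->deferred_iput_list;
	int released = 0;

	while (ip != (struct fake_inode *)(void *)sc) {
		struct fake_inode *next = ip->i_private;

		ip->i_private = NULL;
		released++;	/* real code: iput(VFS_I(ip)) */
		ip = next;
	}
	sc->deferred_iput_list = sc;
	return released;
}
```

Using the context as the list terminator (instead of NULL) is what lets xfs_scrub_iput treat any non-NULL i_private as "already deferred" without a separate flag.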

* [PATCH 09/21] xfs: finish our set of inode get/put tracepoints for scrub
@ 2018-06-24 19:24 ` Darrick J. Wong
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Finish setting up tracepoints to track inode get/put operations when
running them through the scrub code.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/common.c |    2 ++
 fs/xfs/scrub/dir.c    |    1 +
 fs/xfs/scrub/parent.c |    1 +
 fs/xfs/scrub/scrub.c  |    1 +
 fs/xfs/scrub/trace.h  |    3 +++
 5 files changed, 8 insertions(+)


diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 9740c28384b6..6dcd83944ab6 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -717,7 +717,9 @@ xfs_scrub_get_inode(
 				error, __return_address);
 		return error;
 	}
+	trace_xfs_scrub_iget_target(ip, __this_address);
 	if (VFS_I(ip)->i_generation != sc->sm->sm_gen) {
+		trace_xfs_scrub_iput_target(ip, __this_address);
 		iput(VFS_I(ip));
 		return -ENOENT;
 	}
diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c
index 5cb371576732..a11dde63535c 100644
--- a/fs/xfs/scrub/dir.c
+++ b/fs/xfs/scrub/dir.c
@@ -81,6 +81,7 @@ xfs_scrub_dir_check_ftype(
 	if (!xfs_scrub_fblock_xref_process_error(sdc->sc, XFS_DATA_FORK, offset,
 			&error))
 		goto out;
+	trace_xfs_scrub_iget(ip, __this_address);
 
 	/* Convert mode to the DT_* values that dir_emit uses. */
 	ino_dtype = xfs_dir3_get_dtype(mp,
diff --git a/fs/xfs/scrub/parent.c b/fs/xfs/scrub/parent.c
index fd0b2bfb8f18..c5ce6622d4f9 100644
--- a/fs/xfs/scrub/parent.c
+++ b/fs/xfs/scrub/parent.c
@@ -170,6 +170,7 @@ xfs_scrub_parent_validate(
 	}
 	if (!xfs_scrub_fblock_xref_process_error(sc, XFS_DATA_FORK, 0, &error))
 		goto out;
+	trace_xfs_scrub_iget(dp, __this_address);
 	if (dp == sc->ip || !S_ISDIR(VFS_I(dp)->i_mode)) {
 		xfs_scrub_fblock_set_corrupt(sc, XFS_DATA_FORK, 0);
 		goto out_rele;
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index b66cfbc56a34..b24b37b34d85 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -191,6 +191,7 @@ xfs_scrub_teardown(
 		sc->tp = NULL;
 	}
 	if (sc->ip) {
+		trace_xfs_scrub_iput_target(sc->ip, __this_address);
 		if (sc->ilock_flags)
 			xfs_iunlock(sc->ip, sc->ilock_flags);
 		if (sc->ip != ip_in &&
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index a050a00fc258..2a561689cecb 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -509,6 +509,9 @@ DEFINE_EVENT(xfs_scrub_iref_class, name, \
 DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iput_deferred);
 DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iput_defer);
 DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iput_now);
+DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iget);
+DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iget_target);
+DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iput_target);
 
 /* repair tracepoints */
 #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR)


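The DEFINE_SCRUB_IREF_EVENT lines above follow the usual ftrace idiom: one event class defines the record layout and formatting, and each named event is a thin macro-generated instance of it. A rough userspace analog of that macro pattern (illustrative only; real tracepoints emit into the trace ring buffer, not a string):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One "class" does the formatting, mirroring DECLARE_EVENT_CLASS. */
static int iref_class_format(char *buf, size_t len, const char *event,
		unsigned long long ino, int count)
{
	return snprintf(buf, len, "%s: ino 0x%llx count %d", event, ino, count);
}

/* Each named event is a thin wrapper, mirroring DEFINE_EVENT. */
#define DEFINE_SCRUB_IREF_EVENT(name)					\
static int trace_##name(char *buf, size_t len,				\
		unsigned long long ino, int count)			\
{									\
	return iref_class_format(buf, len, #name, ino, count);		\
}

DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iget)
DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iput_target)
```

The payoff is the same as in the patch: adding a new get/put tracepoint is a one-line DEFINE, and all events in the class share one format string and field layout.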

* [PATCH 10/21] xfs: introduce online scrub freeze
@ 2018-06-24 19:24 ` Darrick J. Wong
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Introduce a new 'online scrub freeze' that we can use to lock out all
filesystem modifications and background activity so that we can perform
global scans in order to rebuild metadata.  This introduces a new IFLAG
to the scrub ioctl to indicate that userspace is willing to allow a
freeze.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_fs.h |    6 +++
 fs/xfs/scrub/common.c  |   87 +++++++++++++++++++++++++++++++++++++++++++++++-
 fs/xfs/scrub/common.h  |    2 +
 fs/xfs/scrub/repair.c  |   21 ++++++++++++
 fs/xfs/scrub/repair.h  |    1 +
 fs/xfs/scrub/scrub.c   |    8 ++++
 fs/xfs/scrub/scrub.h   |    6 +++
 fs/xfs/xfs_mount.h     |    6 +++
 fs/xfs/xfs_super.c     |   53 +++++++++++++++++++++++++++++
 fs/xfs/xfs_trans.c     |    5 ++-
 10 files changed, 192 insertions(+), 3 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index f3aa59302fef..e93f9432d2a6 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -536,7 +536,11 @@ struct xfs_scrub_metadata {
  */
 #define XFS_SCRUB_OFLAG_NO_REPAIR_NEEDED (1 << 7)
 
-#define XFS_SCRUB_FLAGS_IN	(XFS_SCRUB_IFLAG_REPAIR)
+/* i: Allow scrub to freeze the filesystem to perform global scans. */
+#define XFS_SCRUB_IFLAG_FREEZE_OK	(1 << 8)
+
+#define XFS_SCRUB_FLAGS_IN	(XFS_SCRUB_IFLAG_REPAIR | \
+				 XFS_SCRUB_IFLAG_FREEZE_OK)
 #define XFS_SCRUB_FLAGS_OUT	(XFS_SCRUB_OFLAG_CORRUPT | \
 				 XFS_SCRUB_OFLAG_PREEN | \
 				 XFS_SCRUB_OFLAG_XFAIL | \
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 6dcd83944ab6..257cb13d36e3 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -590,9 +590,13 @@ xfs_scrub_trans_alloc(
 	struct xfs_scrub_context	*sc,
 	uint				resblks)
 {
+	uint				flags = 0;
+
+	if (sc->fs_frozen)
+		flags |= XFS_TRANS_NO_WRITECOUNT;
 	if (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR)
 		return xfs_trans_alloc(sc->mp, &M_RES(sc->mp)->tr_itruncate,
-				resblks, 0, 0, &sc->tp);
+				resblks, 0, flags, &sc->tp);
 
 	return xfs_trans_alloc_empty(sc->mp, &sc->tp);
 }
@@ -944,3 +948,84 @@ xfs_scrub_iput(
 	trace_xfs_scrub_iput_now(ip, __return_address);
 	iput(VFS_I(ip));
 }
+
+/*
+ * Exclusive Filesystem Access During Scrub and Repair
+ * ===================================================
+ *
+ * While most scrub activity can occur while the filesystem is live, there
+ * are certain scenarios where we cannot tolerate concurrent metadata updates.
+ * We therefore must freeze the filesystem against all other changes.
+ *
+ * The typical scenarios envisioned for scrub freezes are (a) to lock out
+ * all other filesystem changes in order to check the global summary
+ * counters, and (b) anything else that requires unusually broad exclusion.
+ *
+ * The typical scenarios envisioned for repair freezes are (a) to avoid ABBA
+ * deadlocks when we need to take locks in an unusual order; or (b) to update
+ * global filesystem state.  For example, reconstruction of a damaged reverse
+ * mapping btree requires us to hold the AG header locks while scanning
+ * inodes, which goes against the usual inode -> AG header locking order.
+ *
+ * A note about inode reclaim: when we freeze the filesystem, users can't
+ * modify things and periodic background reclaim of speculative preallocations
+ * and copy-on-write staging extents is stopped.  However, the scrub/repair
+ * thread must be careful about evicting an inode from memory -- if the
+ * eviction would require a transaction, we must defer the iput until after
+ * the scrub freeze.  The reasons for this are twofold: first, scrub/repair
+ * already have a transaction and xfs can't nest transactions; and second, we
+ * froze the fs to prevent modifications that we can't control directly.
+ *
+ * Userspace is prevented from freezing or thawing the filesystem during a
+ * repair freeze by the ->freeze_super and ->thaw_super superblock operations,
+ * which block any changes to the freeze state while a repair freeze is
+ * running through the use of the m_scrub_freeze mutex.  It only makes sense
+ * to run one scrub/repair freeze at a time, so the mutex is fine.
+ *
+ * Scrub/repair freezes cannot be initiated during a regular freeze because
+ * freeze_super does not allow nested freeze.  Repair activity that does not
+ * require a repair freeze is also prevented from running during a regular
+ * freeze because transaction allocation blocks on the regular freeze.  We
+ * assume that the only other users of XFS_TRANS_NO_WRITECOUNT transactions
+ * either aren't modifying space metadata in a way that would affect repair,
+ * or that we can inhibit any of the ones that do.
+ *
+ * Note that thaw_super and freeze_super can call deactivate_locked_super
+ * which can free the xfs_mount.  This can happen if someone freezes the block
+ * device, unmounts the filesystem, and thaws the block device.  Therefore, we
+ * must be careful about who gets to unlock the repair freeze mutex.  See the
+ * comments in xfs_fs_put_super.
+ */
+
+/* Start a scrub/repair freeze. */
+int
+xfs_scrub_fs_freeze(
+	struct xfs_scrub_context	*sc)
+{
+	int				error;
+
+	if (!(sc->sm->sm_flags & XFS_SCRUB_IFLAG_FREEZE_OK))
+		return -EUSERS;
+
+	mutex_lock(&sc->mp->m_scrub_freeze);
+	error = freeze_super(sc->mp->m_super);
+	if (error) {
+		mutex_unlock(&sc->mp->m_scrub_freeze);
+		return error;
+	}
+	sc->fs_frozen = true;
+	return 0;
+}
+
+/* Release a scrub/repair freeze and iput all the deferred inodes. */
+int
+xfs_scrub_fs_thaw(
+	struct xfs_scrub_context	*sc)
+{
+	int				error;
+
+	sc->fs_frozen = false;
+	error = thaw_super(sc->mp->m_super);
+	mutex_unlock(&sc->mp->m_scrub_freeze);
+	return error;
+}
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index ca9e15af2a4f..e8c4e41139ca 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -141,5 +141,7 @@ static inline bool xfs_scrub_skip_xref(struct xfs_scrub_metadata *sm)
 int xfs_scrub_metadata_inode_forks(struct xfs_scrub_context *sc);
 int xfs_scrub_ilock_inverted(struct xfs_inode *ip, uint lock_mode);
 void xfs_scrub_iput(struct xfs_scrub_context *sc, struct xfs_inode *ip);
+int xfs_scrub_fs_freeze(struct xfs_scrub_context *sc);
+int xfs_scrub_fs_thaw(struct xfs_scrub_context *sc);
 
 #endif	/* __XFS_SCRUB_COMMON_H__ */
diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index bcdaa8df18f6..85ec872093e6 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -1161,3 +1161,24 @@ xfs_repair_ino_dqattach(
 
 	return error;
 }
+
+/* Read all AG headers and attach to this transaction. */
+int
+xfs_repair_grab_all_ag_headers(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_buf			*agi;
+	struct xfs_buf			*agf;
+	struct xfs_buf			*agfl;
+	xfs_agnumber_t			agno;
+	int				error = 0;
+
+	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
+		error = xfs_scrub_ag_read_headers(sc, agno, &agi, &agf, &agfl);
+		if (error)
+			break;
+	}
+
+	return error;
+}
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index dcfa5eb18940..1cdf457e41da 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -95,6 +95,7 @@ int xfs_repair_find_ag_btree_roots(struct xfs_scrub_context *sc,
 		struct xfs_buf *agfl_bp);
 void xfs_repair_force_quotacheck(struct xfs_scrub_context *sc, uint dqtype);
 int xfs_repair_ino_dqattach(struct xfs_scrub_context *sc);
+int xfs_repair_grab_all_ag_headers(struct xfs_scrub_context *sc);
 
 /* Metadata repairers */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index b24b37b34d85..424f01130f14 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -182,6 +182,8 @@ xfs_scrub_teardown(
 	struct xfs_inode		*ip_in,
 	int				error)
 {
+	int				err2;
+
 	xfs_scrub_ag_free(sc, &sc->sa);
 	if (sc->tp) {
 		if (error == 0 && (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR))
@@ -199,6 +201,12 @@ xfs_scrub_teardown(
 			iput(VFS_I(sc->ip));
 		sc->ip = NULL;
 	}
+	if (sc->fs_frozen) {
+		err2 = xfs_scrub_fs_thaw(sc);
+		if (!error && err2)
+			error = err2;
+		sc->fs_frozen = false;
+	}
 	xfs_scrub_iput_deferred(sc);
 	if (sc->has_quotaofflock)
 		mutex_unlock(&sc->mp->m_quotainfo->qi_quotaofflock);
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index 69eee2ffed29..93a4a0b22273 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -65,6 +65,12 @@ struct xfs_scrub_context {
 	bool				try_harder;
 	bool				has_quotaofflock;
 
+	/*
+	 * Do we own the current scrub freeze?  It is critical that we
+	 * release it before exiting to userspace!
+	 */
+	bool				fs_frozen;
+
 	/*
 	 * List of inodes which cannot be released (by scrub) until after the
 	 * scrub operation concludes because we'd have to do some work to the
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 245349d1e23f..b2b947a0e44a 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -193,6 +193,12 @@ typedef struct xfs_mount {
 	unsigned int		*m_errortag;
 	struct xfs_kobj		m_errortag_kobj;
 #endif
+	/*
+	 * Only allow one thread to initiate a repair freeze at a time.  We
+	 * also use this to block userspace from changing the freeze state
+	 * while a repair freeze is in progress.
+	 */
+	struct mutex		m_scrub_freeze;
 } xfs_mount_t;
 
 /*
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 9d791f158dfe..c446d800bb79 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1445,6 +1445,42 @@ xfs_fs_unfreeze(
 	return 0;
 }
 
+/*
+ * Don't let userspace freeze while scrub has the filesystem frozen.  Note
+ * that freeze_super can free the xfs_mount, so we must be careful to recheck
+ * XFS_M before trying to access anything in the xfs_mount afterwards.
+ */
+STATIC int
+xfs_fs_freeze_super(
+	struct super_block	*sb)
+{
+	int			error;
+
+	mutex_lock(&XFS_M(sb)->m_scrub_freeze);
+	error = freeze_super(sb);
+	if (XFS_M(sb))
+		mutex_unlock(&XFS_M(sb)->m_scrub_freeze);
+	return error;
+}
+
+/*
+ * Don't let userspace thaw while scrub has the filesystem frozen.  Note that
+ * thaw_super can free the xfs_mount, so we must be careful to recheck XFS_M
+ * before trying to access anything in the xfs_mount afterwards.
+ */
+STATIC int
+xfs_fs_thaw_super(
+	struct super_block	*sb)
+{
+	int			error;
+
+	mutex_lock(&XFS_M(sb)->m_scrub_freeze);
+	error = thaw_super(sb);
+	if (XFS_M(sb))
+		mutex_unlock(&XFS_M(sb)->m_scrub_freeze);
+	return error;
+}
+
 STATIC int
 xfs_fs_show_options(
 	struct seq_file		*m,
@@ -1582,6 +1618,7 @@ xfs_mount_alloc(
 	INIT_RADIX_TREE(&mp->m_perag_tree, GFP_ATOMIC);
 	spin_lock_init(&mp->m_perag_lock);
 	mutex_init(&mp->m_growlock);
+	mutex_init(&mp->m_scrub_freeze);
 	atomic_set(&mp->m_active_trans, 0);
 	INIT_DELAYED_WORK(&mp->m_reclaim_work, xfs_reclaim_worker);
 	INIT_DELAYED_WORK(&mp->m_eofblocks_work, xfs_eofblocks_worker);
@@ -1768,6 +1805,7 @@ xfs_fs_fill_super(
  out_free_fsname:
 	sb->s_fs_info = NULL;
 	xfs_free_fsname(mp);
+	mutex_destroy(&mp->m_scrub_freeze);
 	kfree(mp);
  out:
 	return error;
@@ -1800,6 +1838,19 @@ xfs_fs_put_super(
 
 	sb->s_fs_info = NULL;
 	xfs_free_fsname(mp);
+	/*
+	 * fs freeze takes an active reference to the filesystem and fs thaw
+	 * drops it.  If a filesystem on a frozen (dm) block device is
+	 * unmounted before the block device is thawed, we can end up tearing
+	 * down the super from within thaw_super when the device is thawed.
+	 * xfs_fs_thaw_super grabbed the scrub repair mutex before calling
+	 * thaw_super, so we must avoid freeing a locked mutex.  At this point
+	 * we know we're the only user of the filesystem, so we can safely
+	 * unlock the scrub/repair mutex if it's locked.
+	 */
+	if (mutex_is_locked(&mp->m_scrub_freeze))
+		mutex_unlock(&mp->m_scrub_freeze);
+	mutex_destroy(&mp->m_scrub_freeze);
 	kfree(mp);
 }
 
@@ -1846,6 +1897,8 @@ static const struct super_operations xfs_super_operations = {
 	.show_options		= xfs_fs_show_options,
 	.nr_cached_objects	= xfs_fs_nr_cached_objects,
 	.free_cached_objects	= xfs_fs_free_cached_objects,
+	.freeze_super		= xfs_fs_freeze_super,
+	.thaw_super		= xfs_fs_thaw_super,
 };
 
 static struct file_system_type xfs_fs_type = {
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index c08785cf83a9..1f0ae57d1e8c 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -314,9 +314,12 @@ xfs_trans_alloc(
 
 	/*
 	 * Zero-reservation ("empty") transactions can't modify anything, so
-	 * they're allowed to run while we're frozen.
+	 * they're allowed to run while we're frozen.  Scrub is also allowed
+	 * to freeze the filesystem in order to obtain exclusive access, so
+	 * its transactions may run during a scrub freeze.
 	 */
 	WARN_ON(resp->tr_logres > 0 &&
+	        !mutex_is_locked(&mp->m_scrub_freeze) &&
 		mp->m_super->s_writers.frozen == SB_FREEZE_COMPLETE);
 	atomic_inc(&mp->m_active_trans);
 


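The freeze ownership rules in the patch above boil down to: one scrub freeze at a time, the m_scrub_freeze mutex held for the entire freeze, and userspace freeze/thaw forced to serialize on the same mutex. A toy single-threaded model of that protocol (purely illustrative; the real code uses freeze_super/thaw_super, a real mutex, and blocks instead of returning -EBUSY):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of the scrub-freeze protocol; not the kernel implementation. */
struct toy_mount {
	bool	scrub_mutex_held;	/* models mp->m_scrub_freeze */
	bool	frozen;			/* models sb->s_writers freeze state */
};

static int toy_scrub_freeze(struct toy_mount *mp, bool freeze_ok)
{
	if (!freeze_ok)
		return -EUSERS;		/* no XFS_SCRUB_IFLAG_FREEZE_OK */
	if (mp->scrub_mutex_held || mp->frozen)
		return -EBUSY;		/* one freeze at a time */
	mp->scrub_mutex_held = true;	/* held until toy_scrub_thaw */
	mp->frozen = true;
	return 0;
}

static int toy_scrub_thaw(struct toy_mount *mp)
{
	mp->frozen = false;
	mp->scrub_mutex_held = false;	/* scrub owned it, scrub releases it */
	return 0;
}

/* Userspace freeze must get the mutex first, so it cannot change the
 * freeze state while a scrub freeze is in progress. */
static int toy_user_freeze(struct toy_mount *mp)
{
	if (mp->scrub_mutex_held || mp->frozen)
		return -EBUSY;		/* real code blocks on the mutex here */
	mp->frozen = true;
	return 0;
}
```

Note the asymmetry the patch comments call out: userspace takes and drops the mutex around each freeze_super/thaw_super call, but scrub holds it across the whole frozen interval, which is why teardown (and xfs_fs_put_super) must be careful about who unlocks it.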

* [PATCH 11/21] xfs: repair the rmapbt
@ 2018-06-24 19:24 ` Darrick J. Wong
  2018-07-03  5:32   ` Dave Chinner
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Rebuild the reverse mapping btree from all primary metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile            |    1 
 fs/xfs/scrub/repair.h      |   11 
 fs/xfs/scrub/rmap.c        |    6 
 fs/xfs/scrub/rmap_repair.c | 1036 ++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/scrub.c       |    2 
 5 files changed, 1054 insertions(+), 2 deletions(-)
 create mode 100644 fs/xfs/scrub/rmap_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 837fd4a95f6f..c71c5deef4c9 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -167,6 +167,7 @@ xfs-y				+= $(addprefix scrub/, \
 				   alloc_repair.o \
 				   ialloc_repair.o \
 				   repair.o \
+				   rmap_repair.o \
 				   )
 endif
 endif
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index 1cdf457e41da..3d9e064147ec 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -96,6 +96,7 @@ int xfs_repair_find_ag_btree_roots(struct xfs_scrub_context *sc,
 void xfs_repair_force_quotacheck(struct xfs_scrub_context *sc, uint dqtype);
 int xfs_repair_ino_dqattach(struct xfs_scrub_context *sc);
 int xfs_repair_grab_all_ag_headers(struct xfs_scrub_context *sc);
+int xfs_repair_rmapbt_setup(struct xfs_scrub_context *sc, struct xfs_inode *ip);
 
 /* Metadata repairers */
 
@@ -106,6 +107,7 @@ int xfs_repair_agfl(struct xfs_scrub_context *sc);
 int xfs_repair_agi(struct xfs_scrub_context *sc);
 int xfs_repair_allocbt(struct xfs_scrub_context *sc);
 int xfs_repair_iallocbt(struct xfs_scrub_context *sc);
+int xfs_repair_rmapbt(struct xfs_scrub_context *sc);
 
 #else
 
@@ -127,6 +129,14 @@ xfs_repair_calc_ag_resblks(
 	return 0;
 }
 
+static inline int xfs_repair_rmapbt_setup(
+	struct xfs_scrub_context	*sc,
+	struct xfs_inode		*ip)
+{
+	/* We don't support rmap repair, but we can still do a scan. */
+	return xfs_scrub_setup_ag_btree(sc, ip, false);
+}
+
 #define xfs_repair_probe		xfs_repair_notsupported
 #define xfs_repair_superblock		xfs_repair_notsupported
 #define xfs_repair_agf			xfs_repair_notsupported
@@ -134,6 +144,7 @@ xfs_repair_calc_ag_resblks(
 #define xfs_repair_agi			xfs_repair_notsupported
 #define xfs_repair_allocbt		xfs_repair_notsupported
 #define xfs_repair_iallocbt		xfs_repair_notsupported
+#define xfs_repair_rmapbt		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/rmap.c b/fs/xfs/scrub/rmap.c
index c6d763236ba7..dd1cccfbb31a 100644
--- a/fs/xfs/scrub/rmap.c
+++ b/fs/xfs/scrub/rmap.c
@@ -24,6 +24,7 @@
 #include "scrub/common.h"
 #include "scrub/btree.h"
 #include "scrub/trace.h"
+#include "scrub/repair.h"
 
 /*
  * Set us up to scrub reverse mapping btrees.
@@ -33,7 +34,10 @@ xfs_scrub_setup_ag_rmapbt(
 	struct xfs_scrub_context	*sc,
 	struct xfs_inode		*ip)
 {
-	return xfs_scrub_setup_ag_btree(sc, ip, false);
+	if (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR)
+		return xfs_repair_rmapbt_setup(sc, ip);
+	else
+		return xfs_scrub_setup_ag_btree(sc, ip, false);
 }
 
 /* Reverse-mapping scrubber. */
diff --git a/fs/xfs/scrub/rmap_repair.c b/fs/xfs/scrub/rmap_repair.c
new file mode 100644
index 000000000000..2ade606060c8
--- /dev/null
+++ b/fs/xfs/scrub/rmap_repair.c
@@ -0,0 +1,1036 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_alloc.h"
+#include "xfs_alloc_btree.h"
+#include "xfs_ialloc.h"
+#include "xfs_ialloc_btree.h"
+#include "xfs_rmap.h"
+#include "xfs_rmap_btree.h"
+#include "xfs_inode.h"
+#include "xfs_icache.h"
+#include "xfs_bmap.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_refcount.h"
+#include "xfs_refcount_btree.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/btree.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/*
+ * Reverse Mapping Btree Repair
+ * ============================
+ *
+ * This is the most involved of all the AG space btree rebuilds.  Everywhere
+ * else in XFS we lock inodes and then AG data structures, but generating the
+ * list of rmap records requires that we be able to scan both block mapping
+ * btrees of every inode in the filesystem to see if it owns any extents in
+ * this AG.  We can't tolerate any inode updates while we do this, so we
+ * freeze the filesystem to lock everyone else out, and grant ourselves
+ * special privileges to run transactions with regular background reclamation
+ * turned off.
+ *
+ * We also have to be very careful not to allow inode reclaim to start a
+ * transaction because all transactions (other than our own) will block.
+ *
+ * So basically we scan all primary per-AG metadata and all block maps of all
+ * inodes to generate a huge list of reverse map records.  Next we look for
+ * gaps in the rmap records to calculate all the unclaimed free space (1).
+ * Next, we scan all other OWN_AG metadata (bnobt, cntbt, agfl) and subtract
+ * the space used by those btrees from (1), and also subtract the free space
+ * listed in the bnobt from (1).  What's left are the gaps in assigned space
+ * that the new rmapbt knows about but the existing bnobt doesn't; these are
+ * the blocks from the old rmapbt and they can be freed.
+ */
+
+/* Set us up to repair reverse mapping btrees. */
+int
+xfs_repair_rmapbt_setup(
+	struct xfs_scrub_context	*sc,
+	struct xfs_inode		*ip)
+{
+	int				error;
+
+	/*
+	 * Freeze out anything that can lock an inode.  We reconstruct
+	 * the rmapbt by reading inode bmaps with the AGF held, which is
+	 * only safe w.r.t. ABBA deadlocks if we're the only ones locking
+	 * inodes.
+	 */
+	error = xfs_scrub_fs_freeze(sc);
+	if (error)
+		return error;
+
+	/* Check the AG number and set up the scrub context. */
+	error = xfs_scrub_setup_fs(sc, ip);
+	if (error)
+		return error;
+
+	/*
+	 * Lock all the AG header buffers so that we can read all the
+	 * per-AG metadata too.
+	 */
+	error = xfs_repair_grab_all_ag_headers(sc);
+	if (error)
+		return error;
+
+	return xfs_scrub_ag_init(sc, sc->sm->sm_agno, &sc->sa);
+}
+
+struct xfs_repair_rmapbt_extent {
+	struct list_head		list;
+	struct xfs_rmap_irec		rmap;
+};
+
+/* Context for collecting rmaps */
+struct xfs_repair_rmapbt {
+	struct list_head		*rmaplist;
+	struct xfs_scrub_context	*sc;
+	uint64_t			owner;
+	xfs_agblock_t			btblocks;
+	uint64_t			nr_records;
+};
+
+/* Context for calculating old rmapbt blocks */
+struct xfs_repair_rmapbt_freesp {
+	struct xfs_repair_extent_list	rmap_freelist;
+	struct xfs_repair_extent_list	bno_freelist;
+	struct xfs_scrub_context	*sc;
+	xfs_agblock_t			next_bno;
+};
+
+/* Initialize an rmap. */
+static inline int
+xfs_repair_rmapbt_new_rmap(
+	struct xfs_repair_rmapbt	*rr,
+	xfs_agblock_t			startblock,
+	xfs_extlen_t			blockcount,
+	uint64_t			owner,
+	uint64_t			offset,
+	unsigned int			flags)
+{
+	struct xfs_repair_rmapbt_extent	*rre;
+	int				error = 0;
+
+	trace_xfs_repair_rmap_extent_fn(rr->sc->mp, rr->sc->sa.agno,
+			startblock, blockcount, owner, offset, flags);
+
+	if (xfs_scrub_should_terminate(rr->sc, &error))
+		return error;
+
+	rre = kmem_alloc(sizeof(struct xfs_repair_rmapbt_extent), KM_MAYFAIL);
+	if (!rre)
+		return -ENOMEM;
+	INIT_LIST_HEAD(&rre->list);
+	rre->rmap.rm_startblock = startblock;
+	rre->rmap.rm_blockcount = blockcount;
+	rre->rmap.rm_owner = owner;
+	rre->rmap.rm_offset = offset;
+	rre->rmap.rm_flags = flags;
+	list_add_tail(&rre->list, rr->rmaplist);
+	rr->nr_records++;
+
+	return 0;
+}
+
+/* Add an AGFL block to the rmap list. */
+STATIC int
+xfs_repair_rmapbt_walk_agfl(
+	struct xfs_mount		*mp,
+	xfs_agblock_t			bno,
+	void				*priv)
+{
+	struct xfs_repair_rmapbt	*rr = priv;
+
+	return xfs_repair_rmapbt_new_rmap(rr, bno, 1, XFS_RMAP_OWN_AG, 0, 0);
+}
+
+/* Add a btree block to the rmap list. */
+STATIC int
+xfs_repair_rmapbt_visit_btblock(
+	struct xfs_btree_cur		*cur,
+	int				level,
+	void				*priv)
+{
+	struct xfs_repair_rmapbt	*rr = priv;
+	struct xfs_buf			*bp;
+	xfs_fsblock_t			fsb;
+
+	xfs_btree_get_block(cur, level, &bp);
+	if (!bp)
+		return 0;
+
+	rr->btblocks++;
+	fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
+	return xfs_repair_rmapbt_new_rmap(rr, XFS_FSB_TO_AGBNO(cur->bc_mp, fsb),
+			1, rr->owner, 0, 0);
+}
+
+STATIC int
+xfs_repair_rmapbt_stash_btree_rmap(
+	struct xfs_scrub_context	*sc,
+	xfs_fsblock_t			fsbno,
+	xfs_fsblock_t			len,
+	void				*priv)
+{
+	return xfs_repair_rmapbt_new_rmap(priv, XFS_FSB_TO_AGBNO(sc->mp, fsbno),
+			len, XFS_RMAP_OWN_INOBT, 0, 0);
+}
+
+/* Record inode btree rmaps. */
+STATIC int
+xfs_repair_rmapbt_inodes(
+	struct xfs_btree_cur		*cur,
+	union xfs_btree_rec		*rec,
+	void				*priv)
+{
+	struct xfs_inobt_rec_incore	irec;
+	struct xfs_repair_rmapbt	*rr = priv;
+	struct xfs_mount		*mp = cur->bc_mp;
+	xfs_agino_t			agino;
+	xfs_agino_t			iperhole;
+	unsigned int			i;
+	int				error;
+
+	/* Record the inobt blocks. */
+	error = xfs_repair_collect_btree_cur_blocks(rr->sc, cur,
+			xfs_repair_rmapbt_stash_btree_rmap, rr);
+	if (error)
+		return error;
+
+	xfs_inobt_btrec_to_irec(mp, rec, &irec);
+
+	/* Record a non-sparse inode chunk. */
+	if (irec.ir_holemask == XFS_INOBT_HOLEMASK_FULL)
+		return xfs_repair_rmapbt_new_rmap(rr,
+				XFS_AGINO_TO_AGBNO(mp, irec.ir_startino),
+				XFS_INODES_PER_CHUNK / mp->m_sb.sb_inopblock,
+				XFS_RMAP_OWN_INODES, 0, 0);
+
+	/* Iterate each chunk. */
+	iperhole = max_t(xfs_agino_t, mp->m_sb.sb_inopblock,
+			XFS_INODES_PER_HOLEMASK_BIT);
+	for (i = 0, agino = irec.ir_startino;
+	     i < XFS_INOBT_HOLEMASK_BITS;
+	     i += iperhole / XFS_INODES_PER_HOLEMASK_BIT, agino += iperhole) {
+		/* Skip holes. */
+		if (irec.ir_holemask & (1 << i))
+			continue;
+
+		/* Record the inode chunk otherwise. */
+		error = xfs_repair_rmapbt_new_rmap(rr,
+				XFS_AGINO_TO_AGBNO(mp, agino),
+				iperhole / mp->m_sb.sb_inopblock,
+				XFS_RMAP_OWN_INODES, 0, 0);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
+/* Record a CoW staging extent. */
+STATIC int
+xfs_repair_rmapbt_refcount(
+	struct xfs_btree_cur		*cur,
+	union xfs_btree_rec		*rec,
+	void				*priv)
+{
+	struct xfs_repair_rmapbt	*rr = priv;
+	struct xfs_refcount_irec	refc;
+
+	xfs_refcount_btrec_to_irec(rec, &refc);
+	if (refc.rc_refcount != 1)
+		return -EFSCORRUPTED;
+
+	return xfs_repair_rmapbt_new_rmap(rr,
+			refc.rc_startblock - XFS_REFC_COW_START,
+			refc.rc_blockcount, XFS_RMAP_OWN_COW, 0, 0);
+}
+
+/* Add a bmbt block to the rmap list. */
+STATIC int
+xfs_repair_rmapbt_visit_bmbt(
+	struct xfs_btree_cur		*cur,
+	int				level,
+	void				*priv)
+{
+	struct xfs_repair_rmapbt	*rr = priv;
+	struct xfs_buf			*bp;
+	xfs_fsblock_t			fsb;
+	unsigned int			flags = XFS_RMAP_BMBT_BLOCK;
+
+	xfs_btree_get_block(cur, level, &bp);
+	if (!bp)
+		return 0;
+
+	fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
+	if (XFS_FSB_TO_AGNO(cur->bc_mp, fsb) != rr->sc->sa.agno)
+		return 0;
+
+	if (cur->bc_private.b.whichfork == XFS_ATTR_FORK)
+		flags |= XFS_RMAP_ATTR_FORK;
+	return xfs_repair_rmapbt_new_rmap(rr,
+			XFS_FSB_TO_AGBNO(cur->bc_mp, fsb), 1,
+			cur->bc_private.b.ip->i_ino, 0, flags);
+}
+
+/* Determine rmap flags from fork and bmbt state. */
+static inline unsigned int
+xfs_repair_rmapbt_bmap_flags(
+	int			whichfork,
+	xfs_exntst_t		state)
+{
+	return  (whichfork == XFS_ATTR_FORK ? XFS_RMAP_ATTR_FORK : 0) |
+		(state == XFS_EXT_UNWRITTEN ? XFS_RMAP_UNWRITTEN : 0);
+}
+
+/* Find all the extents from a given AG in an inode fork. */
+STATIC int
+xfs_repair_rmapbt_scan_ifork(
+	struct xfs_repair_rmapbt	*rr,
+	struct xfs_inode		*ip,
+	int				whichfork)
+{
+	struct xfs_bmbt_irec		rec;
+	struct xfs_iext_cursor		icur;
+	struct xfs_mount		*mp = rr->sc->mp;
+	struct xfs_btree_cur		*cur = NULL;
+	struct xfs_ifork		*ifp;
+	unsigned int			rflags;
+	int				fmt;
+	int				error = 0;
+
+	/* Do we even have data mapping extents? */
+	fmt = XFS_IFORK_FORMAT(ip, whichfork);
+	ifp = XFS_IFORK_PTR(ip, whichfork);
+	switch (fmt) {
+	case XFS_DINODE_FMT_BTREE:
+		if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+			error = xfs_iread_extents(rr->sc->tp, ip, whichfork);
+			if (error)
+				return error;
+		}
+		break;
+	case XFS_DINODE_FMT_EXTENTS:
+		break;
+	default:
+		return 0;
+	}
+	if (!ifp)
+		return 0;
+
+	/* Find all the BMBT blocks in the AG. */
+	if (fmt == XFS_DINODE_FMT_BTREE) {
+		cur = xfs_bmbt_init_cursor(mp, rr->sc->tp, ip, whichfork);
+		error = xfs_btree_visit_blocks(cur,
+				xfs_repair_rmapbt_visit_bmbt, rr);
+		if (error)
+			goto out;
+		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+		cur = NULL;
+	}
+
+	/* We're done if this is an rt inode's data fork. */
+	if (whichfork == XFS_DATA_FORK && XFS_IS_REALTIME_INODE(ip))
+		return 0;
+
+	/* Find all the extents in the AG. */
+	for_each_xfs_iext(ifp, &icur, &rec) {
+		if (isnullstartblock(rec.br_startblock))
+			continue;
+		/* Stash non-hole extent. */
+		if (XFS_FSB_TO_AGNO(mp, rec.br_startblock) == rr->sc->sa.agno) {
+			rflags = xfs_repair_rmapbt_bmap_flags(whichfork,
+					rec.br_state);
+			error = xfs_repair_rmapbt_new_rmap(rr,
+					XFS_FSB_TO_AGBNO(mp, rec.br_startblock),
+					rec.br_blockcount, ip->i_ino,
+					rec.br_startoff, rflags);
+			if (error)
+				goto out;
+		}
+	}
+out:
+	if (cur)
+		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/* Iterate all the inodes in an AG. */
+STATIC int
+xfs_repair_rmapbt_scan_inobt(
+	struct xfs_btree_cur		*cur,
+	union xfs_btree_rec		*rec,
+	void				*priv)
+{
+	struct xfs_inobt_rec_incore	irec;
+	struct xfs_repair_rmapbt	*rr = priv;
+	struct xfs_mount		*mp = cur->bc_mp;
+	struct xfs_inode		*ip = NULL;
+	xfs_ino_t			ino;
+	xfs_agino_t			agino;
+	int				chunkidx;
+	int				lock_mode = 0;
+	int				error = 0;
+
+	xfs_inobt_btrec_to_irec(mp, rec, &irec);
+
+	for (chunkidx = 0, agino = irec.ir_startino;
+	     chunkidx < XFS_INODES_PER_CHUNK;
+	     chunkidx++, agino++) {
+		bool	inuse;
+
+		/* Skip if this inode is free */
+		if (XFS_INOBT_MASK(chunkidx) & irec.ir_free)
+			continue;
+		ino = XFS_AGINO_TO_INO(mp, cur->bc_private.a.agno, agino);
+
+		/* Back off and try again if an inode is being reclaimed */
+		error = xfs_icache_inode_is_allocated(mp, cur->bc_tp, ino,
+				&inuse);
+		if (error == -EAGAIN)
+			return -EDEADLOCK;
+
+		/*
+		 * Grab inode for scanning.  We cannot use DONTCACHE here
+		 * because we already have a transaction, so the iput must not
+		 * trigger inode reclaim (which might allocate a transaction
+		 * to clean up posteof blocks).
+		 */
+		error = xfs_iget(mp, cur->bc_tp, ino, 0, 0, &ip);
+		if (error)
+			return error;
+		trace_xfs_scrub_iget(ip, __this_address);
+
+		if ((ip->i_d.di_format == XFS_DINODE_FMT_BTREE &&
+		     !(ip->i_df.if_flags & XFS_IFEXTENTS)) ||
+		    (ip->i_d.di_aformat == XFS_DINODE_FMT_BTREE &&
+		     !(ip->i_afp->if_flags & XFS_IFEXTENTS)))
+			lock_mode = XFS_ILOCK_EXCL;
+		else
+			lock_mode = XFS_ILOCK_SHARED;
+		if (!xfs_ilock_nowait(ip, lock_mode)) {
+			error = -EBUSY;
+			goto out_rele;
+		}
+
+		/* Check the data fork. */
+		error = xfs_repair_rmapbt_scan_ifork(rr, ip, XFS_DATA_FORK);
+		if (error)
+			goto out_unlock;
+
+		/* Check the attr fork. */
+		error = xfs_repair_rmapbt_scan_ifork(rr, ip, XFS_ATTR_FORK);
+		if (error)
+			goto out_unlock;
+
+		xfs_iunlock(ip, lock_mode);
+		xfs_scrub_iput(rr->sc, ip);
+		ip = NULL;
+	}
+
+	return error;
+out_unlock:
+	xfs_iunlock(ip, lock_mode);
+out_rele:
+	xfs_scrub_iput(rr->sc, ip);
+	return error;
+}
+
+/* Record extents that aren't in use from gaps in the rmap records. */
+STATIC int
+xfs_repair_rmapbt_record_rmap_freesp(
+	struct xfs_btree_cur		*cur,
+	struct xfs_rmap_irec		*rec,
+	void				*priv)
+{
+	struct xfs_repair_rmapbt_freesp	*rrf = priv;
+	xfs_fsblock_t			fsb;
+	int				error;
+
+	/* Record the free space we find. */
+	if (rec->rm_startblock > rrf->next_bno) {
+		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
+				rrf->next_bno);
+		error = xfs_repair_collect_btree_extent(rrf->sc,
+				&rrf->rmap_freelist, fsb,
+				rec->rm_startblock - rrf->next_bno);
+		if (error)
+			return error;
+	}
+	rrf->next_bno = max_t(xfs_agblock_t, rrf->next_bno,
+			rec->rm_startblock + rec->rm_blockcount);
+	return 0;
+}
+
+/* Record extents that aren't in use from the bnobt records. */
+STATIC int
+xfs_repair_rmapbt_record_bno_freesp(
+	struct xfs_btree_cur		*cur,
+	struct xfs_alloc_rec_incore	*rec,
+	void				*priv)
+{
+	struct xfs_repair_rmapbt_freesp	*rrf = priv;
+	xfs_fsblock_t			fsb;
+
+	/* Record the free space we find. */
+	fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
+			rec->ar_startblock);
+	return xfs_repair_collect_btree_extent(rrf->sc, &rrf->bno_freelist,
+			fsb, rec->ar_blockcount);
+}
+
+/* Compare two rmapbt extents. */
+static int
+xfs_repair_rmapbt_extent_cmp(
+	void				*priv,
+	struct list_head		*a,
+	struct list_head		*b)
+{
+	struct xfs_repair_rmapbt_extent	*ap;
+	struct xfs_repair_rmapbt_extent	*bp;
+
+	ap = container_of(a, struct xfs_repair_rmapbt_extent, list);
+	bp = container_of(b, struct xfs_repair_rmapbt_extent, list);
+	return xfs_rmap_compare(&ap->rmap, &bp->rmap);
+}
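For reference, the comparator above just defers to xfs_rmap_compare so that list_sort() can put the collected records into btree key order before reinsertion.  The same idea can be sketched in plain userspace C with qsort(); the three-level key (startblock, then owner, then offset) is an assumption about the key precedence here, and the struct below is illustrative, not the on-disk record:

```c
/*
 * Userspace sketch of ordering collected rmap records before bulk
 * reinsertion.  The key precedence (startblock, owner, offset) is an
 * assumption; struct rmap_rec is illustrative only.
 */
struct rmap_rec {
	unsigned long long	startblock;
	unsigned long long	owner;
	unsigned long long	offset;
};

static int
rmap_rec_cmp(const void *pa, const void *pb)
{
	const struct rmap_rec	*a = pa;
	const struct rmap_rec	*b = pb;

	/* Compare each key field in precedence order. */
	if (a->startblock != b->startblock)
		return a->startblock < b->startblock ? -1 : 1;
	if (a->owner != b->owner)
		return a->owner < b->owner ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}
```

A stable total order like this is what lets the rebuild loop feed records to the btree in strictly ascending key order.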
+
+/* Generate rmaps for the AG headers (AGI/AGF/AGFL) */
+STATIC int
+xfs_repair_rmapbt_generate_agheader_rmaps(
+	struct xfs_repair_rmapbt	*rr)
+{
+	struct xfs_scrub_context	*sc = rr->sc;
+	int				error;
+
+	/* Create a record for the AG header blocks, sb through agfl. */
+	error = xfs_repair_rmapbt_new_rmap(rr, XFS_SB_BLOCK(sc->mp),
+			XFS_AGFL_BLOCK(sc->mp) - XFS_SB_BLOCK(sc->mp) + 1,
+			XFS_RMAP_OWN_FS, 0, 0);
+	if (error)
+		return error;
+
+	/* Generate rmaps for the blocks in the AGFL. */
+	return xfs_agfl_walk(sc->mp, XFS_BUF_TO_AGF(sc->sa.agf_bp),
+			sc->sa.agfl_bp, xfs_repair_rmapbt_walk_agfl, rr);
+}
+
+/* Generate rmaps for the log, if it's in this AG. */
+STATIC int
+xfs_repair_rmapbt_generate_log_rmaps(
+	struct xfs_repair_rmapbt	*rr)
+{
+	struct xfs_scrub_context	*sc = rr->sc;
+
+	if (sc->mp->m_sb.sb_logstart == 0 ||
+	    XFS_FSB_TO_AGNO(sc->mp, sc->mp->m_sb.sb_logstart) != sc->sa.agno)
+		return 0;
+
+	return xfs_repair_rmapbt_new_rmap(rr,
+			XFS_FSB_TO_AGBNO(sc->mp, sc->mp->m_sb.sb_logstart),
+			sc->mp->m_sb.sb_logblocks, XFS_RMAP_OWN_LOG, 0, 0);
+}
+
+/* Collect rmaps for the blocks containing the free space btrees. */
+STATIC int
+xfs_repair_rmapbt_generate_freesp_rmaps(
+	struct xfs_repair_rmapbt	*rr,
+	xfs_agblock_t			*new_btreeblks)
+{
+	struct xfs_scrub_context	*sc = rr->sc;
+	struct xfs_btree_cur		*cur;
+	int				error;
+
+	rr->owner = XFS_RMAP_OWN_AG;
+	rr->btblocks = 0;
+
+	/* bnobt */
+	cur = xfs_allocbt_init_cursor(sc->mp, sc->tp, sc->sa.agf_bp,
+			sc->sa.agno, XFS_BTNUM_BNO);
+	error = xfs_btree_visit_blocks(cur, xfs_repair_rmapbt_visit_btblock,
+			rr);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/* cntbt */
+	cur = xfs_allocbt_init_cursor(sc->mp, sc->tp, sc->sa.agf_bp,
+			sc->sa.agno, XFS_BTNUM_CNT);
+	error = xfs_btree_visit_blocks(cur, xfs_repair_rmapbt_visit_btblock,
+			rr);
+	if (error)
+		goto err;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/* btreeblks doesn't include the bnobt/cntbt btree roots */
+	*new_btreeblks = rr->btblocks - 2;
+	return 0;
+err:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/* Collect rmaps for the blocks containing inode btrees and the inode chunks. */
+STATIC int
+xfs_repair_rmapbt_generate_inobt_rmaps(
+	struct xfs_repair_rmapbt	*rr)
+{
+	struct xfs_scrub_context	*sc = rr->sc;
+	struct xfs_btree_cur		*cur;
+	int				error;
+
+	rr->owner = XFS_RMAP_OWN_INOBT;
+
+	/*
+	 * Iterate every record in the inobt so we can capture all the inode
+	 * chunks and the blocks in the inobt itself.  Note that if there are
+	 * zero records in the inobt then query_all does nothing and we have
+	 * to account for the empty inobt root manually.
+	 */
+	if (sc->sa.pag->pagi_count > 0) {
+		cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp,
+				sc->sa.agno, XFS_BTNUM_INO);
+		error = xfs_btree_query_all(cur, xfs_repair_rmapbt_inodes, rr);
+		if (error)
+			goto err_cur;
+		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	} else {
+		struct xfs_agi		*agi;
+
+		agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
+		error = xfs_repair_rmapbt_new_rmap(rr,
+				be32_to_cpu(agi->agi_root), 1,
+				XFS_RMAP_OWN_INOBT, 0, 0);
+		if (error)
+			goto err;
+	}
+
+	/* finobt */
+	if (!xfs_sb_version_hasfinobt(&sc->mp->m_sb))
+		return 0;
+
+	cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp, sc->sa.agno,
+			XFS_BTNUM_FINO);
+	error = xfs_btree_visit_blocks(cur, xfs_repair_rmapbt_visit_btblock,
+			rr);
+	if (error)
+		goto err_cur;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	return 0;
+err_cur:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+err:
+	return error;
+}
+
+/*
+ * Collect rmaps for the blocks containing the refcount btree, and all CoW
+ * staging extents.
+ */
+STATIC int
+xfs_repair_rmapbt_generate_refcountbt_rmaps(
+	struct xfs_repair_rmapbt	*rr)
+{
+	union xfs_btree_irec		low;
+	union xfs_btree_irec		high;
+	struct xfs_scrub_context	*sc = rr->sc;
+	struct xfs_btree_cur		*cur;
+	int				error;
+
+	if (!xfs_sb_version_hasreflink(&sc->mp->m_sb))
+		return 0;
+
+	rr->owner = XFS_RMAP_OWN_REFC;
+
+	/* refcountbt */
+	cur = xfs_refcountbt_init_cursor(sc->mp, sc->tp, sc->sa.agf_bp,
+			sc->sa.agno, NULL);
+	error = xfs_btree_visit_blocks(cur, xfs_repair_rmapbt_visit_btblock,
+			rr);
+	if (error)
+		goto err_cur;
+
+	/* Collect rmaps for CoW staging extents. */
+	memset(&low, 0, sizeof(low));
+	low.rc.rc_startblock = XFS_REFC_COW_START;
+	memset(&high, 0xFF, sizeof(high));
+	error = xfs_btree_query_range(cur, &low, &high,
+			xfs_repair_rmapbt_refcount, rr);
+	if (error)
+		goto err_cur;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	return 0;
+err_cur:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/* Collect rmaps for all block mappings for every inode in this AG. */
+STATIC int
+xfs_repair_rmapbt_generate_aginode_rmaps(
+	struct xfs_repair_rmapbt	*rr,
+	xfs_agnumber_t			agno)
+{
+	struct xfs_scrub_context	*sc = rr->sc;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_btree_cur		*cur;
+	struct xfs_buf			*agi_bp;
+	int				error;
+
+	error = xfs_ialloc_read_agi(mp, sc->tp, agno, &agi_bp);
+	if (error)
+		return error;
+	cur = xfs_inobt_init_cursor(mp, sc->tp, agi_bp, agno, XFS_BTNUM_INO);
+	error = xfs_btree_query_all(cur, xfs_repair_rmapbt_scan_inobt, rr);
+	xfs_btree_del_cursor(cur, error ? XFS_BTREE_ERROR : XFS_BTREE_NOERROR);
+	xfs_trans_brelse(sc->tp, agi_bp);
+	return error;
+}
+
+/*
+ * Generate all the reverse-mappings for this AG, a list of the old rmapbt
+ * blocks, and the new btreeblks count.  Figure out if we have enough free
+ * space to reconstruct the inode btrees.  The caller must clean up the lists
+ * if anything goes wrong.
+ */
+STATIC int
+xfs_repair_rmapbt_find_rmaps(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*rmap_records,
+	xfs_agblock_t			*new_btreeblks)
+{
+	struct xfs_repair_rmapbt	rr;
+	xfs_agnumber_t			agno;
+	int				error;
+
+	rr.rmaplist = rmap_records;
+	rr.sc = sc;
+	rr.nr_records = 0;
+
+	/* Generate rmaps for AG space metadata */
+	error = xfs_repair_rmapbt_generate_agheader_rmaps(&rr);
+	if (error)
+		return error;
+	error = xfs_repair_rmapbt_generate_log_rmaps(&rr);
+	if (error)
+		return error;
+	error = xfs_repair_rmapbt_generate_freesp_rmaps(&rr, new_btreeblks);
+	if (error)
+		return error;
+	error = xfs_repair_rmapbt_generate_inobt_rmaps(&rr);
+	if (error)
+		return error;
+	error = xfs_repair_rmapbt_generate_refcountbt_rmaps(&rr);
+	if (error)
+		return error;
+
+	/* Iterate all AGs for inode rmaps. */
+	for (agno = 0; agno < sc->mp->m_sb.sb_agcount; agno++) {
+		error = xfs_repair_rmapbt_generate_aginode_rmaps(&rr, agno);
+		if (error)
+			return error;
+	}
+
+	/* Do we actually have enough space to do this? */
+	if (!xfs_repair_ag_has_space(sc->sa.pag,
+			xfs_rmapbt_calc_size(sc->mp, rr.nr_records),
+			XFS_AG_RESV_RMAPBT))
+		return -ENOSPC;
+
+	return 0;
+}
+
+/* Update the AGF counters. */
+STATIC int
+xfs_repair_rmapbt_reset_counters(
+	struct xfs_scrub_context	*sc,
+	xfs_agblock_t			new_btreeblks,
+	int				*log_flags)
+{
+	struct xfs_agf			*agf;
+	struct xfs_perag		*pag = sc->sa.pag;
+
+	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
+	pag->pagf_btreeblks = new_btreeblks;
+	agf->agf_btreeblks = cpu_to_be32(new_btreeblks);
+	*log_flags |= XFS_AGF_BTREEBLKS;
+
+	return 0;
+}
+
+/* Initialize a new rmapbt root and implant it into the AGF. */
+STATIC int
+xfs_repair_rmapbt_reset_btree(
+	struct xfs_scrub_context	*sc,
+	struct xfs_owner_info		*oinfo,
+	int				*log_flags)
+{
+	struct xfs_buf			*bp;
+	struct xfs_agf			*agf;
+	struct xfs_perag		*pag = sc->sa.pag;
+	xfs_fsblock_t			btfsb;
+	int				error;
+
+	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
+
+	/* Initialize a new rmapbt root. */
+	error = xfs_repair_alloc_ag_block(sc, oinfo, &btfsb,
+			XFS_AG_RESV_RMAPBT);
+	if (error)
+		return error;
+
+	/* The root block is not a btreeblks block. */
+	be32_add_cpu(&agf->agf_btreeblks, -1);
+	pag->pagf_btreeblks--;
+	*log_flags |= XFS_AGF_BTREEBLKS;
+
+	error = xfs_repair_init_btblock(sc, btfsb, &bp, XFS_BTNUM_RMAP,
+			&xfs_rmapbt_buf_ops);
+	if (error)
+		return error;
+
+	agf->agf_roots[XFS_BTNUM_RMAPi] =
+			cpu_to_be32(XFS_FSB_TO_AGBNO(sc->mp, btfsb));
+	agf->agf_levels[XFS_BTNUM_RMAPi] = cpu_to_be32(1);
+	agf->agf_rmap_blocks = cpu_to_be32(1);
+	pag->pagf_levels[XFS_BTNUM_RMAPi] = 1;
+	*log_flags |= XFS_AGF_ROOTS | XFS_AGF_LEVELS | XFS_AGF_RMAP_BLOCKS;
+
+	return 0;
+}
+
+/*
+ * Roll and fix the free list while reloading the rmapbt.  Do not shrink the
+ * freelist because the rmapbt is not fully set up yet.
+ */
+STATIC int
+xfs_repair_rmapbt_fix_freelist(
+	struct xfs_scrub_context	*sc)
+{
+	int				error;
+
+	error = xfs_repair_roll_ag_trans(sc);
+	if (error)
+		return error;
+	return xfs_repair_fix_freelist(sc, false);
+}
+
+/* Insert all the rmaps we collected. */
+STATIC int
+xfs_repair_rmapbt_rebuild_tree(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*rmap_records)
+{
+	struct xfs_repair_rmapbt_extent	*rre;
+	struct xfs_repair_rmapbt_extent	*n;
+	struct xfs_btree_cur		*cur;
+	struct xfs_mount		*mp = sc->mp;
+	uint32_t			old_flcount;
+	int				error;
+
+	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
+	old_flcount = sc->sa.pag->pagf_flcount;
+
+	list_sort(NULL, rmap_records, xfs_repair_rmapbt_extent_cmp);
+	list_for_each_entry_safe(rre, n, rmap_records, list) {
+		/* Add the rmap. */
+		error = xfs_rmap_map_raw(cur, &rre->rmap);
+		if (error)
+			goto err_cur;
+		list_del(&rre->list);
+		kmem_free(rre);
+
+		/*
+		 * If the flcount changed because the rmap btree changed shape
+		 * then we need to fix the freelist to keep it full enough to
+		 * handle a total btree split.  We'll roll this transaction to
+		 * get it out of the way and then fix the freelist in a fresh
+		 * transaction.
+		 *
+		 * However, we must be careful about two things: (1) fixing
+		 * the freelist changes the rmapbt, so we must drop the rmapbt
+		 * cursor first; and (2) we can't let the freelist shrink.
+		 * The rmapbt isn't fully set up yet, which means that the
+		 * current AGFL blocks
+		 * might not be reflected in the rmapbt, which is a problem if
+		 * we want to unmap blocks from the AGFL.
+		 */
+		if (sc->sa.pag->pagf_flcount == old_flcount)
+			continue;
+		if (list_empty(rmap_records))
+			break;
+
+		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+		error = xfs_repair_rmapbt_fix_freelist(sc);
+		if (error)
+			goto err;
+		old_flcount = sc->sa.pag->pagf_flcount;
+		cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp,
+				sc->sa.agno);
+	}
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/* Fix the freelist once more, if necessary. */
+	if (sc->sa.pag->pagf_flcount != old_flcount) {
+		error = xfs_repair_rmapbt_fix_freelist(sc);
+		if (error)
+			goto err;
+	}
+	return 0;
+err_cur:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+err:
+	return error;
+}
+
+/* Cancel every rmapbt record. */
+STATIC void
+xfs_repair_rmapbt_cancel_rmaps(
+	struct list_head	*reclist)
+{
+	struct xfs_repair_rmapbt_extent	*rre;
+	struct xfs_repair_rmapbt_extent	*n;
+
+	list_for_each_entry_safe(rre, n, reclist, list) {
+		list_del(&rre->list);
+		kmem_free(rre);
+	}
+}
+
+/*
+ * Reap the old rmapbt blocks.  Now that the rmapbt is fully rebuilt, we make
+ * a list of gaps in the rmap records and a list of the extents mentioned in
+ * the bnobt.  Any block that's in the new rmapbt gap list but not mentioned
+ * in the bnobt is a block from the old rmapbt and can be removed.
+ */
+STATIC int
+xfs_repair_rmapbt_reap_old_blocks(
+	struct xfs_scrub_context	*sc,
+	struct xfs_owner_info		*oinfo)
+{
+	struct xfs_repair_rmapbt_freesp	rrf;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_agf			*agf;
+	struct xfs_btree_cur		*cur;
+	xfs_fsblock_t			btfsb;
+	xfs_agblock_t			agend;
+	int				error;
+
+	xfs_repair_init_extent_list(&rrf.rmap_freelist);
+	xfs_repair_init_extent_list(&rrf.bno_freelist);
+	rrf.next_bno = 0;
+	rrf.sc = sc;
+
+	/* Compute free space from the new rmapbt. */
+	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
+	error = xfs_rmap_query_all(cur, xfs_repair_rmapbt_record_rmap_freesp,
+			&rrf);
+	if (error)
+		goto err_cur;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/* Insert a record for space between the last rmap and EOAG. */
+	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
+	agend = be32_to_cpu(agf->agf_length);
+	if (rrf.next_bno < agend) {
+		btfsb = XFS_AGB_TO_FSB(mp, sc->sa.agno, rrf.next_bno);
+		error = xfs_repair_collect_btree_extent(sc, &rrf.rmap_freelist,
+				btfsb, agend - rrf.next_bno);
+		if (error)
+			goto err;
+	}
+
+	/* Compute free space from the existing bnobt. */
+	cur = xfs_allocbt_init_cursor(sc->mp, sc->tp, sc->sa.agf_bp,
+			sc->sa.agno, XFS_BTNUM_BNO);
+	error = xfs_alloc_query_all(cur, xfs_repair_rmapbt_record_bno_freesp,
+			&rrf);
+	if (error)
+		goto err_lists;
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	/*
+	 * Free the "free" blocks that the new rmapbt knows about but
+	 * the old bnobt doesn't.  These are the old rmapbt blocks.
+	 */
+	error = xfs_repair_subtract_extents(sc, &rrf.rmap_freelist,
+			&rrf.bno_freelist);
+	xfs_repair_cancel_btree_extents(sc, &rrf.bno_freelist);
+	if (error)
+		goto err;
+	error = xfs_repair_invalidate_blocks(sc, &rrf.rmap_freelist);
+	if (error)
+		goto err;
+	return xfs_repair_reap_btree_extents(sc, &rrf.rmap_freelist, oinfo,
+			XFS_AG_RESV_RMAPBT);
+err_lists:
+	xfs_repair_cancel_btree_extents(sc, &rrf.bno_freelist);
+err_cur:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+err:
+	return error;
+}
+
+/* Repair the rmap btree for some AG. */
+int
+xfs_repair_rmapbt(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_owner_info		oinfo;
+	struct list_head		rmap_records;
+	xfs_extlen_t			new_btreeblks;
+	int				log_flags = 0;
+	int				error;
+
+	xfs_scrub_perag_get(sc->mp, &sc->sa);
+
+	/* Collect all the reverse mappings for this AG. */
+	INIT_LIST_HEAD(&rmap_records);
+	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_UNKNOWN);
+	error = xfs_repair_rmapbt_find_rmaps(sc, &rmap_records, &new_btreeblks);
+	if (error)
+		goto out;
+
+	/*
+	 * Blow out the old rmap btrees.  This is the point at which
+	 * we are no longer able to bail out gracefully.
+	 */
+	error = xfs_repair_rmapbt_reset_counters(sc, new_btreeblks, &log_flags);
+	if (error)
+		goto out;
+	error = xfs_repair_rmapbt_reset_btree(sc, &oinfo, &log_flags);
+	if (error)
+		goto out;
+	xfs_alloc_log_agf(sc->tp, sc->sa.agf_bp, log_flags);
+	error = xfs_repair_roll_ag_trans(sc);
+	if (error)
+		goto out;
+
+	/* Now rebuild the rmap information. */
+	error = xfs_repair_rmapbt_rebuild_tree(sc, &rmap_records);
+	if (error)
+		goto out;
+
+	/* Find and destroy the blocks from the old rmapbt. */
+	error = xfs_repair_rmapbt_reap_old_blocks(sc, &oinfo);
+out:
+	xfs_repair_rmapbt_cancel_rmaps(&rmap_records);
+	return error;
+}
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 424f01130f14..3f8036ee3971 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -280,7 +280,7 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.setup	= xfs_scrub_setup_ag_rmapbt,
 		.scrub	= xfs_scrub_rmapbt,
 		.has	= xfs_sb_version_hasrmapbt,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_rmapbt,
 	},
 	[XFS_SCRUB_TYPE_REFCNTBT] = {	/* refcountbt */
 		.type	= ST_PERAG,



* [PATCH 12/21] xfs: repair refcount btrees
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (10 preceding siblings ...)
  2018-06-24 19:24 ` [PATCH 11/21] xfs: repair the rmapbt Darrick J. Wong
@ 2018-06-24 19:24 ` Darrick J. Wong
  2018-07-03  5:50   ` Dave Chinner
  2018-06-24 19:24 ` [PATCH 13/21] xfs: repair inode records Darrick J. Wong
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Reconstruct the refcount data from the rmap btree.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile                |    1 
 fs/xfs/scrub/refcount_repair.c |  592 ++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/repair.h          |    2 
 fs/xfs/scrub/scrub.c           |    2 
 4 files changed, 596 insertions(+), 1 deletion(-)
 create mode 100644 fs/xfs/scrub/refcount_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index c71c5deef4c9..eeb03b9f30f6 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -166,6 +166,7 @@ xfs-y				+= $(addprefix scrub/, \
 				   agheader_repair.o \
 				   alloc_repair.o \
 				   ialloc_repair.o \
+				   refcount_repair.o \
 				   repair.o \
 				   rmap_repair.o \
 				   )
diff --git a/fs/xfs/scrub/refcount_repair.c b/fs/xfs/scrub/refcount_repair.c
new file mode 100644
index 000000000000..63fcd1d76789
--- /dev/null
+++ b/fs/xfs/scrub/refcount_repair.c
@@ -0,0 +1,592 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_itable.h"
+#include "xfs_alloc.h"
+#include "xfs_ialloc.h"
+#include "xfs_rmap.h"
+#include "xfs_rmap_btree.h"
+#include "xfs_refcount.h"
+#include "xfs_refcount_btree.h"
+#include "xfs_error.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/btree.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/*
+ * Rebuilding the Reference Count Btree
+ * ====================================
+ *
+ * This algorithm is "borrowed" from xfs_repair.  Imagine the rmap
+ * entries as rectangles representing extents of physical blocks, and
+ * that the rectangles can be laid down to allow them to overlap each
+ * other; then we know that we must emit a refcnt btree entry wherever
+ * the amount of overlap changes, i.e. the emission stimulus is
+ * level-triggered:
+ *
+ *                 -    ---
+ *       --      ----- ----   ---        ------
+ * --   ----     ----------- ----     ---------
+ * -------------------------------- -----------
+ * ^ ^  ^^ ^^    ^ ^^ ^^^  ^^^^  ^ ^^ ^  ^     ^
+ * 2 1  23 21    3 43 234  2123  1 01 2  3     0
+ *
+ * For our purposes, an rmap is a tuple (startblock, len, fileoff, owner).
+ *
+ * Note that in the actual refcnt btree we don't store the refcount < 2
+ * cases because the bnobt tells us which blocks are free; single-use
+ * blocks aren't recorded in the bnobt or the refcntbt.  If the rmapbt
+ * supports storing multiple entries covering a given block we could
+ * theoretically dispense with the refcntbt and simply count rmaps, but
+ * that's inefficient in the (hot) write path, so we'll take the cost of
+ * the extra tree to save time.  Also there's no guarantee that rmap
+ * will be enabled.
+ *
+ * Given an array of rmaps sorted by physical block number, a starting
+ * physical block (sp), a bag to hold rmaps that cover sp, and the next
+ * physical block where the level changes (np), we can reconstruct the
+ * refcount btree as follows:
+ *
+ * While there are still unprocessed rmaps in the array,
+ *  - Set sp to the physical block (pblk) of the next unprocessed rmap.
+ *  - Add to the bag all rmaps in the array where startblock == sp.
+ *  - Set np to the physical block where the bag size will change.  This
+ *    is the minimum of (the pblk of the next unprocessed rmap) and
+ *    (startblock + len of each rmap in the bag).
+ *  - Record the bag size as old_bag_size.
+ *
+ *  - While the bag isn't empty,
+ *     - Remove from the bag all rmaps where startblock + len == np.
+ *     - Add to the bag all rmaps in the array where startblock == np.
+ *     - If the bag size isn't old_bag_size, store the refcount entry
+ *       (sp, np - sp, bag_size) in the refcnt btree.
+ *     - If the bag is empty, break out of the inner loop.
+ *     - Set old_bag_size to the bag size
+ *     - Set sp = np.
+ *     - Set np to the physical block where the bag size will change.
+ *       This is the minimum of (the pblk of the next unprocessed rmap)
+ *       and (startblock + len of each rmap in the bag).
+ *
+ * Like all the other repairers, we make a list of all the refcount
+ * records we need, then reinitialize the refcount btree root and
+ * insert all the records.
+ */
+
+struct xfs_repair_refc_rmap {
+	struct list_head		list;
+	struct xfs_rmap_irec		rmap;
+};
+
+struct xfs_repair_refc_extent {
+	struct list_head		list;
+	struct xfs_refcount_irec	refc;
+};
+
+struct xfs_repair_refc {
+	struct list_head		rmap_bag;  /* rmaps we're tracking */
+	struct list_head		rmap_idle; /* idle rmaps */
+	struct list_head		*extlist;  /* refcount extents */
+	struct xfs_repair_extent_list	*btlist;   /* old refcountbt blocks */
+	struct xfs_scrub_context	*sc;
+	unsigned long			nr_records;/* nr refcount extents */
+	xfs_extlen_t			btblocks;  /* # of refcountbt blocks */
+};
+
+/* Grab the next record from the rmapbt. */
+STATIC int
+xfs_repair_refcountbt_next_rmap(
+	struct xfs_btree_cur		*cur,
+	struct xfs_repair_refc		*rr,
+	struct xfs_rmap_irec		*rec,
+	bool				*have_rec)
+{
+	struct xfs_rmap_irec		rmap;
+	struct xfs_mount		*mp = cur->bc_mp;
+	struct xfs_repair_refc_extent	*rre;
+	xfs_fsblock_t			fsbno;
+	int				have_gt;
+	int				error = 0;
+
+	*have_rec = false;
+	/*
+	 * Loop through the remaining rmaps.  Remember CoW staging
+	 * extents and the refcountbt blocks from the old tree for later
+	 * disposal.  We can only share written data fork extents, so
+	 * keep looping until we find an rmap for one.
+	 */
+	do {
+		if (xfs_scrub_should_terminate(rr->sc, &error))
+			goto out_error;
+
+		error = xfs_btree_increment(cur, 0, &have_gt);
+		if (error)
+			goto out_error;
+		if (!have_gt)
+			return 0;
+
+		error = xfs_rmap_get_rec(cur, &rmap, &have_gt);
+		if (error)
+			goto out_error;
+		XFS_WANT_CORRUPTED_GOTO(mp, have_gt == 1, out_error);
+
+		if (rmap.rm_owner == XFS_RMAP_OWN_COW) {
+			/* Pass CoW staging extents right through. */
+			rre = kmem_alloc(sizeof(struct xfs_repair_refc_extent),
+					KM_MAYFAIL);
+			if (!rre) {
+				error = -ENOMEM;
+				goto out_error;
+			}
+
+			INIT_LIST_HEAD(&rre->list);
+			rre->refc.rc_startblock = rmap.rm_startblock +
+					XFS_REFC_COW_START;
+			rre->refc.rc_blockcount = rmap.rm_blockcount;
+			rre->refc.rc_refcount = 1;
+			list_add_tail(&rre->list, rr->extlist);
+		} else if (rmap.rm_owner == XFS_RMAP_OWN_REFC) {
+			/* refcountbt block, dump it when we're done. */
+			rr->btblocks += rmap.rm_blockcount;
+			fsbno = XFS_AGB_TO_FSB(cur->bc_mp,
+					cur->bc_private.a.agno,
+					rmap.rm_startblock);
+			error = xfs_repair_collect_btree_extent(rr->sc,
+					rr->btlist, fsbno, rmap.rm_blockcount);
+			if (error)
+				goto out_error;
+		}
+	} while (XFS_RMAP_NON_INODE_OWNER(rmap.rm_owner) ||
+		 xfs_internal_inum(mp, rmap.rm_owner) ||
+		 (rmap.rm_flags & (XFS_RMAP_ATTR_FORK | XFS_RMAP_BMBT_BLOCK |
+				   XFS_RMAP_UNWRITTEN)));
+
+	*rec = rmap;
+	*have_rec = true;
+	return 0;
+
+out_error:
+	return error;
+}
+
+/* Recycle an idle rmap or allocate a new one. */
+static struct xfs_repair_refc_rmap *
+xfs_repair_refcountbt_get_rmap(
+	struct xfs_repair_refc		*rr)
+{
+	struct xfs_repair_refc_rmap	*rrm;
+
+	if (list_empty(&rr->rmap_idle)) {
+		rrm = kmem_alloc(sizeof(struct xfs_repair_refc_rmap),
+				KM_MAYFAIL);
+		if (!rrm)
+			return NULL;
+		INIT_LIST_HEAD(&rrm->list);
+		return rrm;
+	}
+
+	rrm = list_first_entry(&rr->rmap_idle, struct xfs_repair_refc_rmap,
+			list);
+	list_del_init(&rrm->list);
+	return rrm;
+}
+
+/* Compare two btree extents. */
+static int
+xfs_repair_refcount_extent_cmp(
+	void				*priv,
+	struct list_head		*a,
+	struct list_head		*b)
+{
+	struct xfs_repair_refc_extent	*ap;
+	struct xfs_repair_refc_extent	*bp;
+
+	ap = container_of(a, struct xfs_repair_refc_extent, list);
+	bp = container_of(b, struct xfs_repair_refc_extent, list);
+
+	if (ap->refc.rc_startblock > bp->refc.rc_startblock)
+		return 1;
+	else if (ap->refc.rc_startblock < bp->refc.rc_startblock)
+		return -1;
+	return 0;
+}
+
+/* Record a reference count extent. */
+STATIC int
+xfs_repair_refcountbt_new_refc(
+	struct xfs_scrub_context	*sc,
+	struct xfs_repair_refc		*rr,
+	xfs_agblock_t			agbno,
+	xfs_extlen_t			len,
+	xfs_nlink_t			refcount)
+{
+	struct xfs_repair_refc_extent	*rre;
+	struct xfs_refcount_irec	irec;
+
+	irec.rc_startblock = agbno;
+	irec.rc_blockcount = len;
+	irec.rc_refcount = refcount;
+
+	trace_xfs_repair_refcount_extent_fn(sc->mp, sc->sa.agno,
+			&irec);
+
+	rre = kmem_alloc(sizeof(struct xfs_repair_refc_extent),
+			KM_MAYFAIL);
+	if (!rre)
+		return -ENOMEM;
+	INIT_LIST_HEAD(&rre->list);
+	rre->refc = irec;
+	list_add_tail(&rre->list, rr->extlist);
+
+	return 0;
+}
+
+/* Iterate all the rmap records to generate reference count data. */
+#define RMAP_NEXT(r)	((r).rm_startblock + (r).rm_blockcount)
+STATIC int
+xfs_repair_refcountbt_generate_refcounts(
+	struct xfs_scrub_context	*sc,
+	struct xfs_repair_refc		*rr)
+{
+	struct xfs_rmap_irec		rmap;
+	struct xfs_btree_cur		*cur;
+	struct xfs_repair_refc_rmap	*rrm;
+	struct xfs_repair_refc_rmap	*n;
+	xfs_agblock_t			sbno;
+	xfs_agblock_t			cbno;
+	xfs_agblock_t			nbno;
+	size_t				old_stack_sz;
+	size_t				stack_sz = 0;
+	bool				have;
+	int				have_gt;
+	int				error;
+
+	/* Start the rmapbt cursor to the left of all records. */
+	cur = xfs_rmapbt_init_cursor(sc->mp, sc->tp, sc->sa.agf_bp,
+			sc->sa.agno);
+	error = xfs_rmap_lookup_le(cur, 0, 0, 0, 0, 0, &have_gt);
+	if (error)
+		goto out;
+	ASSERT(have_gt == 0);
+
+	/* Process reverse mappings into refcount data. */
+	while (xfs_btree_has_more_records(cur)) {
+		/* Push all rmaps with pblk == sbno onto the stack */
+		error = xfs_repair_refcountbt_next_rmap(cur, rr, &rmap, &have);
+		if (error)
+			goto out;
+		if (!have)
+			break;
+		sbno = cbno = rmap.rm_startblock;
+		while (have && rmap.rm_startblock == sbno) {
+			rrm = xfs_repair_refcountbt_get_rmap(rr);
+			if (!rrm) {
+				error = -ENOMEM;
+				goto out;
+			}
+			rrm->rmap = rmap;
+			list_add_tail(&rrm->list, &rr->rmap_bag);
+			stack_sz++;
+			error = xfs_repair_refcountbt_next_rmap(cur, rr, &rmap,
+					&have);
+			if (error)
+				goto out;
+		}
+		error = xfs_btree_decrement(cur, 0, &have_gt);
+		if (error)
+			goto out;
+		XFS_WANT_CORRUPTED_GOTO(sc->mp, have_gt, out);
+
+		/* Set nbno to the bno of the next refcount change */
+		nbno = have ? rmap.rm_startblock : NULLAGBLOCK;
+		list_for_each_entry(rrm, &rr->rmap_bag, list)
+			nbno = min_t(xfs_agblock_t, nbno, RMAP_NEXT(rrm->rmap));
+
+		ASSERT(nbno > sbno);
+		old_stack_sz = stack_sz;
+
+		/* While stack isn't empty... */
+		while (stack_sz) {
+			/* Pop all rmaps that end at nbno */
+			list_for_each_entry_safe(rrm, n, &rr->rmap_bag, list) {
+				if (RMAP_NEXT(rrm->rmap) != nbno)
+					continue;
+				stack_sz--;
+				list_move(&rrm->list, &rr->rmap_idle);
+			}
+
+			/* Push array items that start at nbno */
+			error = xfs_repair_refcountbt_next_rmap(cur, rr, &rmap,
+					&have);
+			if (error)
+				goto out;
+			while (have && rmap.rm_startblock == nbno) {
+				rrm = xfs_repair_refcountbt_get_rmap(rr);
+				if (!rrm) {
+					error = -ENOMEM;
+					goto out;
+				}
+				rrm->rmap = rmap;
+				list_add_tail(&rrm->list, &rr->rmap_bag);
+				stack_sz++;
+				error = xfs_repair_refcountbt_next_rmap(cur,
+						rr, &rmap, &have);
+				if (error)
+					goto out;
+			}
+			error = xfs_btree_decrement(cur, 0, &have_gt);
+			if (error)
+				goto out;
+			XFS_WANT_CORRUPTED_GOTO(sc->mp, have_gt, out);
+
+			/* Emit refcount if necessary */
+			ASSERT(nbno > cbno);
+			if (stack_sz != old_stack_sz) {
+				if (old_stack_sz > 1) {
+					error = xfs_repair_refcountbt_new_refc(
+							sc, rr, cbno,
+							nbno - cbno,
+							old_stack_sz);
+					if (error)
+						goto out;
+					rr->nr_records++;
+				}
+				cbno = nbno;
+			}
+
+			/* Stack empty, go find the next rmap */
+			if (stack_sz == 0)
+				break;
+			old_stack_sz = stack_sz;
+			sbno = nbno;
+
+			/* Set nbno to the bno of the next refcount change */
+			nbno = have ? rmap.rm_startblock : NULLAGBLOCK;
+			list_for_each_entry(rrm, &rr->rmap_bag, list)
+				nbno = min_t(xfs_agblock_t, nbno,
+						RMAP_NEXT(rrm->rmap));
+
+			ASSERT(nbno > sbno);
+		}
+	}
+
+	/* Free all the leftover rmap records. */
+	list_for_each_entry_safe(rrm, n, &rr->rmap_idle, list) {
+		list_del(&rrm->list);
+		kmem_free(rrm);
+	}
+
+	ASSERT(list_empty(&rr->rmap_bag));
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	return 0;
+out:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+#undef RMAP_NEXT
+
+/*
+ * Generate all the reference counts for this AG and a list of the old
+ * refcount btree blocks.  Figure out if we have enough free space to
+ * reconstruct the refcount btree.  The caller must clean up the lists if
+ * anything goes wrong.
+ */
+STATIC int
+xfs_repair_refcountbt_find_refcounts(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*refcount_records,
+	struct xfs_repair_extent_list	*old_refcountbt_blocks)
+{
+	struct xfs_repair_refc		rr;
+	struct xfs_repair_refc_rmap	*rrm;
+	struct xfs_repair_refc_rmap	*n;
+	struct xfs_mount		*mp = sc->mp;
+	int				error;
+
+	INIT_LIST_HEAD(&rr.rmap_bag);
+	INIT_LIST_HEAD(&rr.rmap_idle);
+	rr.extlist = refcount_records;
+	rr.btlist = old_refcountbt_blocks;
+	rr.btblocks = 0;
+	rr.sc = sc;
+	rr.nr_records = 0;
+
+	/* Generate all the refcount records. */
+	error = xfs_repair_refcountbt_generate_refcounts(sc, &rr);
+	if (error)
+		goto out;
+
+	/* Do we actually have enough space to do this? */
+	if (!xfs_repair_ag_has_space(sc->sa.pag,
+			xfs_refcountbt_calc_size(mp, rr.nr_records),
+			XFS_AG_RESV_METADATA)) {
+		error = -ENOSPC;
+		goto out;
+	}
+
+out:
+	list_for_each_entry_safe(rrm, n, &rr.rmap_idle, list) {
+		list_del(&rrm->list);
+		kmem_free(rrm);
+	}
+	list_for_each_entry_safe(rrm, n, &rr.rmap_bag, list) {
+		list_del(&rrm->list);
+		kmem_free(rrm);
+	}
+	return error;
+}
+
+/* Initialize new refcountbt root and implant it into the AGF. */
+STATIC int
+xfs_repair_refcountbt_reset_btree(
+	struct xfs_scrub_context	*sc,
+	struct xfs_owner_info		*oinfo,
+	int				*log_flags)
+{
+	struct xfs_buf			*bp;
+	struct xfs_agf			*agf;
+	xfs_fsblock_t			btfsb;
+	int				error;
+
+	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
+
+	/* Initialize a new refcountbt root. */
+	error = xfs_repair_alloc_ag_block(sc, oinfo, &btfsb,
+			XFS_AG_RESV_METADATA);
+	if (error)
+		return error;
+	error = xfs_repair_init_btblock(sc, btfsb, &bp, XFS_BTNUM_REFC,
+			&xfs_refcountbt_buf_ops);
+	if (error)
+		return error;
+	agf->agf_refcount_root = cpu_to_be32(XFS_FSB_TO_AGBNO(sc->mp, btfsb));
+	agf->agf_refcount_level = cpu_to_be32(1);
+	agf->agf_refcount_blocks = cpu_to_be32(1);
+	*log_flags |= XFS_AGF_REFCOUNT_BLOCKS | XFS_AGF_REFCOUNT_ROOT |
+		      XFS_AGF_REFCOUNT_LEVEL;
+
+	return 0;
+}
+
+/* Build new refcount btree and dispose of the old one. */
+STATIC int
+xfs_repair_refcountbt_rebuild_tree(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*refcount_records,
+	struct xfs_owner_info		*oinfo,
+	struct xfs_repair_extent_list	*old_refcountbt_blocks)
+{
+	struct xfs_repair_refc_extent	*rre;
+	struct xfs_repair_refc_extent	*n;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_btree_cur		*cur;
+	int				have_gt;
+	int				error;
+
+	/* Add all records. */
+	list_sort(NULL, refcount_records, xfs_repair_refcount_extent_cmp);
+	list_for_each_entry_safe(rre, n, refcount_records, list) {
+		/* Insert into the refcountbt. */
+		cur = xfs_refcountbt_init_cursor(mp, sc->tp, sc->sa.agf_bp,
+				sc->sa.agno, NULL);
+		error = xfs_refcount_lookup_eq(cur, rre->refc.rc_startblock,
+				&have_gt);
+		if (error)
+			return error;
+		XFS_WANT_CORRUPTED_RETURN(mp, have_gt == 0);
+		error = xfs_refcount_insert(cur, &rre->refc, &have_gt);
+		if (error)
+			return error;
+		XFS_WANT_CORRUPTED_RETURN(mp, have_gt == 1);
+		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+		cur = NULL;
+
+		error = xfs_repair_roll_ag_trans(sc);
+		if (error)
+			return error;
+
+		list_del(&rre->list);
+		kmem_free(rre);
+	}
+
+	/* Free the old refcountbt blocks if they're not in use. */
+	return xfs_repair_reap_btree_extents(sc, old_refcountbt_blocks, oinfo,
+			XFS_AG_RESV_METADATA);
+}
+
+/* Free every record in the refcount list. */
+STATIC void
+xfs_repair_refcountbt_cancel_refcrecs(
+	struct list_head	*recs)
+{
+	struct xfs_repair_refc_extent	*rre;
+	struct xfs_repair_refc_extent	*n;
+
+	list_for_each_entry_safe(rre, n, recs, list) {
+		list_del(&rre->list);
+		kmem_free(rre);
+	}
+}
+
+/* Rebuild the refcount btree. */
+int
+xfs_repair_refcountbt(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_owner_info		oinfo;
+	struct list_head		refcount_records;
+	struct xfs_repair_extent_list	old_refcountbt_blocks;
+	struct xfs_mount		*mp = sc->mp;
+	int				log_flags = 0;
+	int				error;
+
+	/* We require the rmapbt to rebuild anything. */
+	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
+		return -EOPNOTSUPP;
+
+	xfs_scrub_perag_get(sc->mp, &sc->sa);
+
+	/* Collect all reference counts. */
+	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_REFC);
+	INIT_LIST_HEAD(&refcount_records);
+	xfs_repair_init_extent_list(&old_refcountbt_blocks);
+	error = xfs_repair_refcountbt_find_refcounts(sc, &refcount_records,
+			&old_refcountbt_blocks);
+	if (error)
+		goto out;
+
+	/*
+	 * Blow out the old refcount btrees.  This is the point at which
+	 * we are no longer able to bail out gracefully.
+	 */
+	error = xfs_repair_refcountbt_reset_btree(sc, &oinfo, &log_flags);
+	if (error)
+		goto out;
+	xfs_alloc_log_agf(sc->tp, sc->sa.agf_bp, log_flags);
+
+	/* Invalidate all the old refcountbt blocks in btlist. */
+	error = xfs_repair_invalidate_blocks(sc, &old_refcountbt_blocks);
+	if (error)
+		goto out;
+	error = xfs_repair_roll_ag_trans(sc);
+	if (error)
+		goto out;
+
+	/* Now rebuild the refcount information. */
+	return xfs_repair_refcountbt_rebuild_tree(sc, &refcount_records,
+			&oinfo, &old_refcountbt_blocks);
+out:
+	xfs_repair_cancel_btree_extents(sc, &old_refcountbt_blocks);
+	xfs_repair_refcountbt_cancel_refcrecs(&refcount_records);
+	return error;
+}
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index 3d9e064147ec..23c160558892 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -108,6 +108,7 @@ int xfs_repair_agi(struct xfs_scrub_context *sc);
 int xfs_repair_allocbt(struct xfs_scrub_context *sc);
 int xfs_repair_iallocbt(struct xfs_scrub_context *sc);
 int xfs_repair_rmapbt(struct xfs_scrub_context *sc);
+int xfs_repair_refcountbt(struct xfs_scrub_context *sc);
 
 #else
 
@@ -145,6 +146,7 @@ static inline int xfs_repair_rmapbt_setup(
 #define xfs_repair_allocbt		xfs_repair_notsupported
 #define xfs_repair_iallocbt		xfs_repair_notsupported
 #define xfs_repair_rmapbt		xfs_repair_notsupported
+#define xfs_repair_refcountbt		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 3f8036ee3971..e4801fb6e632 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -287,7 +287,7 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.setup	= xfs_scrub_setup_ag_refcountbt,
 		.scrub	= xfs_scrub_refcountbt,
 		.has	= xfs_sb_version_hasreflink,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_refcountbt,
 	},
 	[XFS_SCRUB_TYPE_INODE] = {	/* inode record */
 		.type	= ST_INODE,

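The core of xfs_repair_refcountbt_generate_refcounts() above is a sweep over rmap extents sorted by start block: push extents that begin at the current block onto a bag, pop extents that end at the next change point, and emit a refcount record whenever the "stack depth" (number of overlapping mappings) changes.  A minimal userspace C sketch of the same idea follows; the types and the O(n^2) per-block counting are illustrative stand-ins, not the kernel data structures, which keep a live bag of rmaps instead of rescanning:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A simplified reverse mapping covering blocks [start, start + len). */
struct rmap { uint32_t start, len; };

/* An emitted refcount record, analogous to struct xfs_refcount_irec. */
struct refc { uint32_t start, len, count; };

/*
 * Sweep over the mapped range and emit one record for every run of
 * blocks shared by two or more extents (runs with refcount 1 are not
 * recorded, matching the old_stack_sz > 1 test in the repair code).
 * Returns the number of records written to out[].
 */
static size_t
gen_refcounts(const struct rmap *r, size_t nr, struct refc *out)
{
	size_t nout = 0;
	if (nr == 0)
		return 0;

	/* Find the block range covered by any rmap. */
	uint32_t lo = r[0].start, hi = 0;
	for (size_t i = 0; i < nr; i++)
		if (r[i].start + r[i].len > hi)
			hi = r[i].start + r[i].len;

	uint32_t run_start = 0, run_count = 0;
	for (uint32_t b = lo; b <= hi; b++) {
		/* Count the rmaps covering block b (the stack depth). */
		uint32_t count = 0;
		if (b < hi)
			for (size_t i = 0; i < nr; i++)
				if (r[i].start <= b &&
				    b < r[i].start + r[i].len)
					count++;
		if (count == run_count)
			continue;
		/* Depth changed: emit the finished run if it was shared. */
		if (run_count > 1)
			out[nout++] = (struct refc){ run_start,
						     b - run_start,
						     run_count };
		run_start = b;
		run_count = count;
	}
	return nout;
}
```

Two extents overlapping in the middle produce exactly one shared record for the overlap, which is the invariant the rebuilt refcountbt encodes.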

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH 13/21] xfs: repair inode records
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (11 preceding siblings ...)
  2018-06-24 19:24 ` [PATCH 12/21] xfs: repair refcount btrees Darrick J. Wong
@ 2018-06-24 19:24 ` Darrick J. Wong
  2018-07-03  6:17   ` Dave Chinner
  2018-06-24 19:24 ` [PATCH 14/21] xfs: zap broken inode forks Darrick J. Wong
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Try to reinitialize corrupt inodes, or clear the reflink flag
if it's not needed.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile             |    1 
 fs/xfs/scrub/inode_repair.c |  483 +++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/repair.h       |    2 
 fs/xfs/scrub/scrub.c        |    2 
 4 files changed, 487 insertions(+), 1 deletion(-)
 create mode 100644 fs/xfs/scrub/inode_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index eeb03b9f30f6..f47f0fe0e70a 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -166,6 +166,7 @@ xfs-y				+= $(addprefix scrub/, \
 				   agheader_repair.o \
 				   alloc_repair.o \
 				   ialloc_repair.o \
+				   inode_repair.o \
 				   refcount_repair.o \
 				   repair.o \
 				   rmap_repair.o \
diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c
new file mode 100644
index 000000000000..4ac43c1b1eb0
--- /dev/null
+++ b/fs/xfs/scrub/inode_repair.c
@@ -0,0 +1,483 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_inode.h"
+#include "xfs_icache.h"
+#include "xfs_inode_buf.h"
+#include "xfs_inode_fork.h"
+#include "xfs_ialloc.h"
+#include "xfs_da_format.h"
+#include "xfs_reflink.h"
+#include "xfs_rmap.h"
+#include "xfs_bmap.h"
+#include "xfs_bmap_util.h"
+#include "xfs_dir2.h"
+#include "xfs_quota_defs.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/btree.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/* Make sure this buffer can pass the inode buffer verifier. */
+STATIC void
+xfs_repair_inode_buf(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*bp)
+{
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_trans		*tp = sc->tp;
+	struct xfs_dinode		*dip;
+	xfs_agnumber_t			agno;
+	xfs_agino_t			agino;
+	int				ioff;
+	int				i;
+	int				ni;
+	int				di_ok;
+	bool				unlinked_ok;
+
+	ni = XFS_BB_TO_FSB(mp, bp->b_length) * mp->m_sb.sb_inopblock;
+	agno = xfs_daddr_to_agno(mp, XFS_BUF_ADDR(bp));
+	for (i = 0; i < ni; i++) {
+		ioff = i << mp->m_sb.sb_inodelog;
+		dip = xfs_buf_offset(bp, ioff);
+		agino = be32_to_cpu(dip->di_next_unlinked);
+		unlinked_ok = (agino == NULLAGINO ||
+			       xfs_verify_agino(sc->mp, agno, agino));
+		di_ok = dip->di_magic == cpu_to_be16(XFS_DINODE_MAGIC) &&
+			xfs_dinode_good_version(mp, dip->di_version);
+		if (di_ok && unlinked_ok)
+			continue;
+		dip->di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
+		dip->di_version = 3;
+		if (!unlinked_ok)
+			dip->di_next_unlinked = cpu_to_be32(NULLAGINO);
+		xfs_dinode_calc_crc(mp, dip);
+		xfs_trans_buf_set_type(tp, bp, XFS_BLFT_DINO_BUF);
+		xfs_trans_log_buf(tp, bp, ioff, ioff + sizeof(*dip) - 1);
+	}
+}
+
+/* Reinitialize things that never change in an inode. */
+STATIC void
+xfs_repair_inode_header(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip)
+{
+	dip->di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
+	if (!xfs_dinode_good_version(sc->mp, dip->di_version))
+		dip->di_version = 3;
+	dip->di_ino = cpu_to_be64(sc->sm->sm_ino);
+	uuid_copy(&dip->di_uuid, &sc->mp->m_sb.sb_meta_uuid);
+	dip->di_gen = cpu_to_be32(sc->sm->sm_gen);
+}
+
+/*
+ * Turn di_mode into /something/ recognizable.
+ *
+ * XXX: Ideally we'd try to read data block 0 to see if it's a directory.
+ */
+STATIC void
+xfs_repair_inode_mode(
+	struct xfs_dinode	*dip)
+{
+	uint16_t		mode;
+
+	mode = be16_to_cpu(dip->di_mode);
+	if (mode == 0 || xfs_mode_to_ftype(mode) != XFS_DIR3_FT_UNKNOWN)
+		return;
+
+	/* bad mode, so we set it to a file that only root can read */
+	mode = S_IFREG;
+	dip->di_mode = cpu_to_be16(mode);
+	dip->di_uid = 0;
+	dip->di_gid = 0;
+}
+
+/* Fix any conflicting flags that the verifiers complain about. */
+STATIC void
+xfs_repair_inode_flags(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip)
+{
+	struct xfs_mount		*mp = sc->mp;
+	uint64_t			flags2;
+	uint16_t			mode;
+	uint16_t			flags;
+
+	mode = be16_to_cpu(dip->di_mode);
+	flags = be16_to_cpu(dip->di_flags);
+	flags2 = be64_to_cpu(dip->di_flags2);
+
+	if (xfs_sb_version_hasreflink(&mp->m_sb) && S_ISREG(mode))
+		flags2 |= XFS_DIFLAG2_REFLINK;
+	else
+		flags2 &= ~(XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE);
+	if (flags & XFS_DIFLAG_REALTIME)
+		flags2 &= ~XFS_DIFLAG2_REFLINK;
+	if (flags2 & XFS_DIFLAG2_REFLINK)
+		flags2 &= ~XFS_DIFLAG2_DAX;
+	dip->di_flags = cpu_to_be16(flags);
+	dip->di_flags2 = cpu_to_be64(flags2);
+}
+
+/* Make sure we don't have a garbage file size. */
+STATIC void
+xfs_repair_inode_size(
+	struct xfs_dinode	*dip)
+{
+	uint64_t		size;
+	uint16_t		mode;
+
+	mode = be16_to_cpu(dip->di_mode);
+	size = be64_to_cpu(dip->di_size);
+	switch (mode & S_IFMT) {
+	case S_IFIFO:
+	case S_IFCHR:
+	case S_IFBLK:
+	case S_IFSOCK:
+		/* di_size can't be nonzero for special files */
+		dip->di_size = 0;
+		break;
+	case S_IFREG:
+		/* Regular files can't be larger than 2^63-1 bytes. */
+		dip->di_size = cpu_to_be64(size & ~(1ULL << 63));
+		break;
+	case S_IFLNK:
+		/* Catch over- or under-sized symlinks. */
+		if (size > XFS_SYMLINK_MAXLEN)
+			dip->di_size = cpu_to_be64(XFS_SYMLINK_MAXLEN);
+		else if (size == 0)
+			dip->di_size = cpu_to_be64(1);
+		break;
+	case S_IFDIR:
+		/* Directories can't have a size larger than 32G. */
+		if (size > XFS_DIR2_SPACE_SIZE)
+			dip->di_size = cpu_to_be64(XFS_DIR2_SPACE_SIZE);
+		else if (size == 0)
+			dip->di_size = cpu_to_be64(1);
+		break;
+	}
+}
+
+/* Fix extent size hints. */
+STATIC void
+xfs_repair_inode_extsize_hints(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip)
+{
+	struct xfs_mount		*mp = sc->mp;
+	uint64_t			flags2;
+	uint16_t			flags;
+	uint16_t			mode;
+	xfs_failaddr_t			fa;
+
+	mode = be16_to_cpu(dip->di_mode);
+	flags = be16_to_cpu(dip->di_flags);
+	flags2 = be64_to_cpu(dip->di_flags2);
+
+	fa = xfs_inode_validate_extsize(mp, be32_to_cpu(dip->di_extsize),
+			mode, flags);
+	if (fa) {
+		dip->di_extsize = 0;
+		dip->di_flags &= ~cpu_to_be16(XFS_DIFLAG_EXTSIZE |
+					      XFS_DIFLAG_EXTSZINHERIT);
+	}
+
+	if (dip->di_version < 3)
+		return;
+
+	fa = xfs_inode_validate_cowextsize(mp, be32_to_cpu(dip->di_cowextsize),
+			mode, flags, flags2);
+	if (fa) {
+		dip->di_cowextsize = 0;
+		dip->di_flags2 &= ~cpu_to_be64(XFS_DIFLAG2_COWEXTSIZE);
+	}
+}
+
+/* Inode didn't pass verifiers, so fix the raw buffer and retry iget. */
+STATIC int
+xfs_repair_inode_core(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_imap			imap;
+	struct xfs_buf			*bp;
+	struct xfs_dinode		*dip;
+	xfs_ino_t			ino;
+	int				error;
+
+	/* Map & read inode. */
+	ino = sc->sm->sm_ino;
+	error = xfs_imap(sc->mp, sc->tp, ino, &imap, XFS_IGET_UNTRUSTED);
+	if (error)
+		return error;
+
+	error = xfs_trans_read_buf(sc->mp, sc->tp, sc->mp->m_ddev_targp,
+			imap.im_blkno, imap.im_len, XBF_UNMAPPED, &bp, NULL);
+	if (error)
+		return error;
+
+	/* Make sure we can pass the inode buffer verifier. */
+	xfs_repair_inode_buf(sc, bp);
+	bp->b_ops = &xfs_inode_buf_ops;
+
+	/* Fix everything the verifier will complain about. */
+	dip = xfs_buf_offset(bp, imap.im_boffset);
+	xfs_repair_inode_header(sc, dip);
+	xfs_repair_inode_mode(dip);
+	xfs_repair_inode_flags(sc, dip);
+	xfs_repair_inode_size(dip);
+	xfs_repair_inode_extsize_hints(sc, dip);
+
+	/* Write out the inode... */
+	xfs_dinode_calc_crc(sc->mp, dip);
+	xfs_trans_buf_set_type(sc->tp, bp, XFS_BLFT_DINO_BUF);
+	xfs_trans_log_buf(sc->tp, bp, imap.im_boffset,
+			imap.im_boffset + sc->mp->m_sb.sb_inodesize - 1);
+	error = xfs_trans_commit(sc->tp);
+	if (error)
+		return error;
+	sc->tp = NULL;
+
+	/* ...and reload the incore inode. */
+	error = xfs_iget(sc->mp, sc->tp, ino,
+			XFS_IGET_UNTRUSTED | XFS_IGET_DONTCACHE, 0, &sc->ip);
+	if (error)
+		return error;
+	sc->ilock_flags = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
+	xfs_ilock(sc->ip, sc->ilock_flags);
+	error = xfs_scrub_trans_alloc(sc, 0);
+	if (error)
+		return error;
+	sc->ilock_flags |= XFS_ILOCK_EXCL;
+	xfs_ilock(sc->ip, XFS_ILOCK_EXCL);
+
+	return 0;
+}
+
+/* Fix everything xfs_dinode_verify cares about. */
+STATIC int
+xfs_repair_inode_problems(
+	struct xfs_scrub_context	*sc)
+{
+	int				error;
+
+	error = xfs_repair_inode_core(sc);
+	if (error)
+		return error;
+
+	/* We had to fix a totally busted inode, so schedule quotacheck. */
+	if (XFS_IS_UQUOTA_ON(sc->mp))
+		xfs_repair_force_quotacheck(sc, XFS_DQ_USER);
+	if (XFS_IS_GQUOTA_ON(sc->mp))
+		xfs_repair_force_quotacheck(sc, XFS_DQ_GROUP);
+	if (XFS_IS_PQUOTA_ON(sc->mp))
+		xfs_repair_force_quotacheck(sc, XFS_DQ_PROJ);
+
+	return 0;
+}
+
+/*
+ * Fix problems that the verifiers don't care about.  In general these are
+ * errors that don't cause problems elsewhere in the kernel that we can easily
+ * detect, so we don't check them all that rigorously.
+ */
+
+/* Make sure block and extent counts are ok. */
+STATIC int
+xfs_repair_inode_unchecked_blockcounts(
+	struct xfs_scrub_context	*sc)
+{
+	xfs_filblks_t			count;
+	xfs_filblks_t			acount;
+	xfs_extnum_t			nextents;
+	int				error;
+
+	/* di_nblocks/di_nextents/di_anextents don't match up? */
+	error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_DATA_FORK,
+			&nextents, &count);
+	if (error)
+		return error;
+	sc->ip->i_d.di_nextents = nextents;
+
+	error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_ATTR_FORK,
+			&nextents, &acount);
+	if (error)
+		return error;
+	sc->ip->i_d.di_anextents = nextents;
+
+	sc->ip->i_d.di_nblocks = count + acount;
+	if (sc->ip->i_d.di_anextents != 0 && sc->ip->i_d.di_forkoff == 0)
+		sc->ip->i_d.di_anextents = 0;
+	return 0;
+}
+
+/* Check for invalid uid/gid.  Note that a -1U projid is allowed. */
+STATIC void
+xfs_repair_inode_unchecked_ids(
+	struct xfs_scrub_context	*sc)
+{
+	if (sc->ip->i_d.di_uid == -1U) {
+		sc->ip->i_d.di_uid = 0;
+		VFS_I(sc->ip)->i_mode &= ~(S_ISUID | S_ISGID);
+		if (XFS_IS_UQUOTA_ON(sc->mp))
+			xfs_repair_force_quotacheck(sc, XFS_DQ_USER);
+	}
+
+	if (sc->ip->i_d.di_gid == -1U) {
+		sc->ip->i_d.di_gid = 0;
+		VFS_I(sc->ip)->i_mode &= ~(S_ISUID | S_ISGID);
+		if (XFS_IS_GQUOTA_ON(sc->mp))
+			xfs_repair_force_quotacheck(sc, XFS_DQ_GROUP);
+	}
+}
+
+/* Nanosecond counters can't have more than 1 billion. */
+STATIC void
+xfs_repair_inode_unchecked_timestamps(
+	struct xfs_inode		*ip)
+{
+	if ((unsigned long)VFS_I(ip)->i_atime.tv_nsec >= NSEC_PER_SEC)
+		VFS_I(ip)->i_atime.tv_nsec = 0;
+	if ((unsigned long)VFS_I(ip)->i_mtime.tv_nsec >= NSEC_PER_SEC)
+		VFS_I(ip)->i_mtime.tv_nsec = 0;
+	if ((unsigned long)VFS_I(ip)->i_ctime.tv_nsec >= NSEC_PER_SEC)
+		VFS_I(ip)->i_ctime.tv_nsec = 0;
+	if (ip->i_d.di_version > 2 &&
+	    (unsigned long)ip->i_d.di_crtime.t_nsec >= NSEC_PER_SEC)
+		ip->i_d.di_crtime.t_nsec = 0;
+}
+
+/* Fix inode flags that don't make sense together. */
+STATIC void
+xfs_repair_inode_unchecked_flags(
+	struct xfs_scrub_context	*sc)
+{
+	uint16_t			mode;
+
+	mode = VFS_I(sc->ip)->i_mode;
+
+	/* Clear junk flags */
+	if (sc->ip->i_d.di_flags & ~XFS_DIFLAG_ANY)
+		sc->ip->i_d.di_flags &= ~XFS_DIFLAG_ANY;
+
+	/* NEWRTBM only applies to realtime bitmaps */
+	if (sc->ip->i_ino == sc->mp->m_sb.sb_rbmino)
+		sc->ip->i_d.di_flags |= XFS_DIFLAG_NEWRTBM;
+	else
+		sc->ip->i_d.di_flags &= ~XFS_DIFLAG_NEWRTBM;
+
+	/* These only make sense for directories. */
+	if (!S_ISDIR(mode))
+		sc->ip->i_d.di_flags &= ~(XFS_DIFLAG_RTINHERIT |
+					  XFS_DIFLAG_EXTSZINHERIT |
+					  XFS_DIFLAG_PROJINHERIT |
+					  XFS_DIFLAG_NOSYMLINKS);
+
+	/* These only make sense for files. */
+	if (!S_ISREG(mode))
+		sc->ip->i_d.di_flags &= ~(XFS_DIFLAG_REALTIME |
+					  XFS_DIFLAG_EXTSIZE);
+
+	/* These only make sense for non-rt files. */
+	if (sc->ip->i_d.di_flags & XFS_DIFLAG_REALTIME)
+		sc->ip->i_d.di_flags &= ~XFS_DIFLAG_FILESTREAM;
+
+	/* Immutable and append only?  Drop the append. */
+	if ((sc->ip->i_d.di_flags & XFS_DIFLAG_IMMUTABLE) &&
+	    (sc->ip->i_d.di_flags & XFS_DIFLAG_APPEND))
+		sc->ip->i_d.di_flags &= ~XFS_DIFLAG_APPEND;
+
+	if (sc->ip->i_d.di_version < 3)
+		return;
+
+	/* Clear junk flags. */
+	if (sc->ip->i_d.di_flags2 & ~XFS_DIFLAG2_ANY)
+		sc->ip->i_d.di_flags2 &= ~XFS_DIFLAG2_ANY;
+
+	/* No reflink flag unless we support it and it's a file. */
+	if (!xfs_sb_version_hasreflink(&sc->mp->m_sb) ||
+	    !S_ISREG(mode))
+		sc->ip->i_d.di_flags2 &= ~XFS_DIFLAG2_REFLINK;
+
+	/* DAX only applies to files and dirs. */
+	if (!(S_ISREG(mode) || S_ISDIR(mode)))
+		sc->ip->i_d.di_flags2 &= ~XFS_DIFLAG2_DAX;
+
+	/* No reflink files on the realtime device. */
+	if (sc->ip->i_d.di_flags & XFS_DIFLAG_REALTIME)
+		sc->ip->i_d.di_flags2 &= ~XFS_DIFLAG2_REFLINK;
+
+	/* No mixing reflink and DAX yet. */
+	if (sc->ip->i_d.di_flags2 & XFS_DIFLAG2_REFLINK)
+		sc->ip->i_d.di_flags2 &= ~XFS_DIFLAG2_DAX;
+}
+
+/* Fix any irregularities in an inode that the verifiers don't catch. */
+STATIC int
+xfs_repair_inode_unchecked(
+	struct xfs_scrub_context	*sc)
+{
+	int				error;
+
+	error = xfs_repair_inode_unchecked_blockcounts(sc);
+	if (error)
+		return error;
+	xfs_repair_inode_unchecked_timestamps(sc->ip);
+	xfs_repair_inode_unchecked_flags(sc);
+	xfs_repair_inode_unchecked_ids(sc);
+	xfs_trans_log_inode(sc->tp, sc->ip, XFS_ILOG_CORE);
+	return xfs_trans_roll_inode(&sc->tp, sc->ip);
+}
+
+/* Repair an inode's fields. */
+int
+xfs_repair_inode(
+	struct xfs_scrub_context	*sc)
+{
+	int				error = 0;
+
+	/*
+	 * No inode?  That means we failed the _iget verifiers.  Repair all
+	 * the things that the inode verifiers care about, then retry _iget.
+	 */
+	if (!sc->ip) {
+		error = xfs_repair_inode_problems(sc);
+		if (error)
+			goto out;
+	}
+
+	/* By this point we had better have a working incore inode. */
+	ASSERT(sc->ip);
+	xfs_trans_ijoin(sc->tp, sc->ip, 0);
+
+	/* If we found corruption of any kind, try to fix it. */
+	if ((sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) ||
+	    (sc->sm->sm_flags & XFS_SCRUB_OFLAG_XCORRUPT)) {
+		error = xfs_repair_inode_unchecked(sc);
+		if (error)
+			goto out;
+	}
+
+	/* See if we can clear the reflink flag. */
+	if (xfs_is_reflink_inode(sc->ip))
+		return xfs_reflink_clear_inode_flag(sc->ip, &sc->tp);
+
+out:
+	return error;
+}
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index 23c160558892..e3a763540780 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -109,6 +109,7 @@ int xfs_repair_allocbt(struct xfs_scrub_context *sc);
 int xfs_repair_iallocbt(struct xfs_scrub_context *sc);
 int xfs_repair_rmapbt(struct xfs_scrub_context *sc);
 int xfs_repair_refcountbt(struct xfs_scrub_context *sc);
+int xfs_repair_inode(struct xfs_scrub_context *sc);
 
 #else
 
@@ -147,6 +148,7 @@ static inline int xfs_repair_rmapbt_setup(
 #define xfs_repair_iallocbt		xfs_repair_notsupported
 #define xfs_repair_rmapbt		xfs_repair_notsupported
 #define xfs_repair_refcountbt		xfs_repair_notsupported
+#define xfs_repair_inode		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index e4801fb6e632..77cbb955d8a8 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -293,7 +293,7 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.type	= ST_INODE,
 		.setup	= xfs_scrub_setup_inode,
 		.scrub	= xfs_scrub_inode,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_inode,
 	},
 	[XFS_SCRUB_TYPE_BMBTD] = {	/* inode data fork */
 		.type	= ST_INODE,



* [PATCH 14/21] xfs: zap broken inode forks
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (12 preceding siblings ...)
  2018-06-24 19:24 ` [PATCH 13/21] xfs: repair inode records Darrick J. Wong
@ 2018-06-24 19:24 ` Darrick J. Wong
  2018-07-04  2:07   ` Dave Chinner
  2018-06-24 19:25 ` [PATCH 15/21] xfs: repair inode block maps Darrick J. Wong
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Determine if inode fork damage is responsible for the inode being unable
to pass the ifork verifiers in xfs_iget and zap the fork contents if
this is true.  Once this is done the fork will be empty but we'll be
able to construct an in-core inode, and a subsequent call to the inode
fork repair ioctl will search the rmapbt to rebuild the records that
were in the fork.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_attr_leaf.c |   32 ++-
 fs/xfs/libxfs/xfs_attr_leaf.h |    2 
 fs/xfs/libxfs/xfs_bmap.c      |   21 ++
 fs/xfs/libxfs/xfs_bmap.h      |    2 
 fs/xfs/scrub/inode_repair.c   |  399 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 437 insertions(+), 19 deletions(-)


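The refactoring below lets xfs_attr_shortform_verify_struct() validate a raw buffer before an incore inode exists.  The verifier's walk pattern — check that each fixed-size entry header fits before trusting its length fields, then check that the full variable-length entry fits — is worth seeing in isolation.  Here is a userspace sketch with a deliberately simplified entry layout (this is not the real xfs_attr_sf_entry):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified short-form layout: a count byte, then packed entries. */
struct sf_hdr   { uint8_t count; };
struct sf_entry { uint8_t namelen; uint8_t valuelen; /* name+value follow */ };

/*
 * Return true iff every entry lies entirely within size bytes.  The
 * header of each entry is bounds-checked before its length fields are
 * read, so a corrupt length can never send the walk past the buffer.
 */
static bool
sf_verify(const uint8_t *buf, size_t size)
{
	if (size < sizeof(struct sf_hdr))
		return false;	/* too short for even the header */

	const struct sf_hdr *hdr = (const struct sf_hdr *)buf;
	const uint8_t *p = buf + sizeof(*hdr);
	const uint8_t *end = buf + size;

	for (unsigned int i = 0; i < hdr->count; i++) {
		/* Entry header must fit before we read its lengths. */
		if (p + sizeof(struct sf_entry) > end)
			return false;
		const struct sf_entry *e = (const struct sf_entry *)p;
		size_t entsz = sizeof(*e) + e->namelen + e->valuelen;
		if (entsz > (size_t)(end - p))
			return false;	/* payload runs off the fork */
		p += entsz;
	}
	return true;
}
```

When a buffer fails a check like this, the repair strategy in this patch is to zap the fork to an empty (but valid) state so iget succeeds, and let a later pass rebuild the contents from the rmapbt.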
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index b3c19339e1b5..f6c458104934 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -894,23 +894,16 @@ xfs_attr_shortform_allfit(
 	return xfs_attr_shortform_bytesfit(dp, bytes);
 }
 
-/* Verify the consistency of an inline attribute fork. */
+/* Verify the consistency of a raw inline attribute fork. */
 xfs_failaddr_t
-xfs_attr_shortform_verify(
-	struct xfs_inode		*ip)
+xfs_attr_shortform_verify_struct(
+	struct xfs_attr_shortform	*sfp,
+	size_t				size)
 {
-	struct xfs_attr_shortform	*sfp;
 	struct xfs_attr_sf_entry	*sfep;
 	struct xfs_attr_sf_entry	*next_sfep;
 	char				*endp;
-	struct xfs_ifork		*ifp;
 	int				i;
-	int				size;
-
-	ASSERT(ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL);
-	ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
-	sfp = (struct xfs_attr_shortform *)ifp->if_u1.if_data;
-	size = ifp->if_bytes;
 
 	/*
 	 * Give up if the attribute is way too short.
@@ -968,6 +961,23 @@ xfs_attr_shortform_verify(
 	return NULL;
 }
 
+/* Verify the consistency of an inline attribute fork. */
+xfs_failaddr_t
+xfs_attr_shortform_verify(
+	struct xfs_inode		*ip)
+{
+	struct xfs_attr_shortform	*sfp;
+	struct xfs_ifork		*ifp;
+	int				size;
+
+	ASSERT(ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL);
+	ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
+	sfp = (struct xfs_attr_shortform *)ifp->if_u1.if_data;
+	size = ifp->if_bytes;
+
+	return xfs_attr_shortform_verify_struct(sfp, size);
+}
+
 /*
  * Convert a leaf attribute list to shortform attribute list
  */
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
index 7b74e18becff..728af25a1738 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.h
+++ b/fs/xfs/libxfs/xfs_attr_leaf.h
@@ -41,6 +41,8 @@ int	xfs_attr_shortform_to_leaf(struct xfs_da_args *args,
 int	xfs_attr_shortform_remove(struct xfs_da_args *args);
 int	xfs_attr_shortform_allfit(struct xfs_buf *bp, struct xfs_inode *dp);
 int	xfs_attr_shortform_bytesfit(struct xfs_inode *dp, int bytes);
+xfs_failaddr_t xfs_attr_shortform_verify_struct(struct xfs_attr_shortform *sfp,
+		size_t size);
 xfs_failaddr_t xfs_attr_shortform_verify(struct xfs_inode *ip);
 void	xfs_attr_fork_remove(struct xfs_inode *ip, struct xfs_trans *tp);
 
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index b7f094e19bab..b1254e6c17b5 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -6223,18 +6223,16 @@ xfs_bmap_finish_one(
 	return error;
 }
 
-/* Check that an inode's extent does not have invalid flags or bad ranges. */
+/* Check that an extent does not have invalid flags or bad ranges. */
 xfs_failaddr_t
-xfs_bmap_validate_extent(
-	struct xfs_inode	*ip,
+xfs_bmbt_validate_extent(
+	struct xfs_mount	*mp,
+	bool			isrt,
 	int			whichfork,
 	struct xfs_bmbt_irec	*irec)
 {
-	struct xfs_mount	*mp = ip->i_mount;
 	xfs_fsblock_t		endfsb;
-	bool			isrt;
 
-	isrt = XFS_IS_REALTIME_INODE(ip);
 	endfsb = irec->br_startblock + irec->br_blockcount - 1;
 	if (isrt) {
 		if (!xfs_verify_rtbno(mp, irec->br_startblock))
@@ -6258,3 +6256,14 @@ xfs_bmap_validate_extent(
 	}
 	return NULL;
 }
+
+/* Check that an inode's extent does not have invalid flags or bad ranges. */
+xfs_failaddr_t
+xfs_bmap_validate_extent(
+	struct xfs_inode	*ip,
+	int			whichfork,
+	struct xfs_bmbt_irec	*irec)
+{
+	return xfs_bmbt_validate_extent(ip->i_mount, XFS_IS_REALTIME_INODE(ip),
+			whichfork, irec);
+}
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 9b49ddf99c41..7e3659604fa6 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -284,6 +284,8 @@ static inline int xfs_bmap_fork_to_state(int whichfork)
 	}
 }
 
+xfs_failaddr_t xfs_bmbt_validate_extent(struct xfs_mount *mp, bool isrt,
+		int whichfork, struct xfs_bmbt_irec *irec);
 xfs_failaddr_t xfs_bmap_validate_extent(struct xfs_inode *ip, int whichfork,
 		struct xfs_bmbt_irec *irec);
 
diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c
index 4ac43c1b1eb0..b941f21d7667 100644
--- a/fs/xfs/scrub/inode_repair.c
+++ b/fs/xfs/scrub/inode_repair.c
@@ -22,11 +22,15 @@
 #include "xfs_ialloc.h"
 #include "xfs_da_format.h"
 #include "xfs_reflink.h"
+#include "xfs_alloc.h"
 #include "xfs_rmap.h"
+#include "xfs_rmap_btree.h"
 #include "xfs_bmap.h"
+#include "xfs_bmap_btree.h"
 #include "xfs_bmap_util.h"
 #include "xfs_dir2.h"
 #include "xfs_quota_defs.h"
+#include "xfs_attr_leaf.h"
 #include "scrub/xfs_scrub.h"
 #include "scrub/scrub.h"
 #include "scrub/common.h"
@@ -113,7 +117,8 @@ xfs_repair_inode_mode(
 STATIC void
 xfs_repair_inode_flags(
 	struct xfs_scrub_context	*sc,
-	struct xfs_dinode		*dip)
+	struct xfs_dinode		*dip,
+	bool				is_rt_file)
 {
 	struct xfs_mount		*mp = sc->mp;
 	uint64_t			flags2;
@@ -132,6 +137,10 @@ xfs_repair_inode_flags(
 		flags2 &= ~XFS_DIFLAG2_REFLINK;
 	if (flags2 & XFS_DIFLAG2_REFLINK)
 		flags2 &= ~XFS_DIFLAG2_DAX;
+	if (is_rt_file)
+		flags |= XFS_DIFLAG_REALTIME;
+	else
+		flags &= ~XFS_DIFLAG_REALTIME;
 	dip->di_flags = cpu_to_be16(flags);
 	dip->di_flags2 = cpu_to_be64(flags2);
 }
@@ -210,17 +219,402 @@ xfs_repair_inode_extsize_hints(
 	}
 }
 
+struct xfs_repair_inode_fork_counters {
+	struct xfs_scrub_context	*sc;
+	xfs_rfsblock_t			data_blocks;
+	xfs_rfsblock_t			rt_blocks;
+	xfs_rfsblock_t			attr_blocks;
+	xfs_extnum_t			data_extents;
+	xfs_extnum_t			rt_extents;
+	xfs_aextnum_t			attr_extents;
+};
+
+/* Count extents and blocks for an inode given an rmap. */
+STATIC int
+xfs_repair_inode_count_rmap(
+	struct xfs_btree_cur		*cur,
+	struct xfs_rmap_irec		*rec,
+	void				*priv)
+{
+	struct xfs_repair_inode_fork_counters	*rifc = priv;
+
+	/* Skip rmap records that don't belong to this inode. */
+	if (rec->rm_owner != rifc->sc->sm->sm_ino)
+		return 0;
+	if (rec->rm_flags & XFS_RMAP_ATTR_FORK) {
+		rifc->attr_blocks += rec->rm_blockcount;
+		if (!(rec->rm_flags & XFS_RMAP_BMBT_BLOCK))
+			rifc->attr_extents++;
+	} else {
+		rifc->data_blocks += rec->rm_blockcount;
+		if (!(rec->rm_flags & XFS_RMAP_BMBT_BLOCK))
+			rifc->data_extents++;
+	}
+	return 0;
+}
+
+/* Count extents and blocks for an inode from all AG rmap data. */
+STATIC int
+xfs_repair_inode_count_ag_rmaps(
+	struct xfs_repair_inode_fork_counters	*rifc,
+	xfs_agnumber_t			agno)
+{
+	struct xfs_btree_cur		*cur;
+	struct xfs_buf			*agf;
+	int				error;
+
+	error = xfs_alloc_read_agf(rifc->sc->mp, rifc->sc->tp, agno, 0, &agf);
+	if (error)
+		return error;
+
+	cur = xfs_rmapbt_init_cursor(rifc->sc->mp, rifc->sc->tp, agf, agno);
+	if (!cur) {
+		error = -ENOMEM;
+		goto out_agf;
+	}
+
+	error = xfs_rmap_query_all(cur, xfs_repair_inode_count_rmap, rifc);
+	if (error == XFS_BTREE_QUERY_RANGE_ABORT)
+		error = 0;
+
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+out_agf:
+	xfs_trans_brelse(rifc->sc->tp, agf);
+	return error;
+}
+
+/* Count extents and blocks for a given inode from all rmap data. */
+STATIC int
+xfs_repair_inode_count_rmaps(
+	struct xfs_repair_inode_fork_counters	*rifc)
+{
+	xfs_agnumber_t			agno;
+	int				error;
+
+	if (!xfs_sb_version_hasrmapbt(&rifc->sc->mp->m_sb) ||
+	    xfs_sb_version_hasrealtime(&rifc->sc->mp->m_sb))
+		return -EOPNOTSUPP;
+
+	/* XXX: find rt blocks too */
+
+	for (agno = 0; agno < rifc->sc->mp->m_sb.sb_agcount; agno++) {
+		error = xfs_repair_inode_count_ag_rmaps(rifc, agno);
+		if (error)
+			return error;
+	}
+
+	/* Can't have extents on both the rt and the data device. */
+	if (rifc->data_extents && rifc->rt_extents)
+		return -EFSCORRUPTED;
+
+	return 0;
+}
+
+/* Figure out if we need to zap this extents format fork. */
+STATIC bool
+xfs_repair_inode_core_check_extents_fork(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip,
+	int				dfork_size,
+	int				whichfork)
+{
+	struct xfs_bmbt_irec		new;
+	struct xfs_bmbt_rec		*dp;
+	bool				isrt;
+	int				i;
+	int				nex;
+	int				fork_size;
+
+	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
+	fork_size = nex * sizeof(struct xfs_bmbt_rec);
+	if (fork_size < 0 || fork_size > dfork_size)
+		return true;
+	dp = (struct xfs_bmbt_rec *)XFS_DFORK_PTR(dip, whichfork);
+
+	isrt = dip->di_flags & cpu_to_be16(XFS_DIFLAG_REALTIME);
+	for (i = 0; i < nex; i++, dp++) {
+		xfs_failaddr_t	fa;
+
+		xfs_bmbt_disk_get_all(dp, &new);
+		fa = xfs_bmbt_validate_extent(sc->mp, isrt, whichfork, &new);
+		if (fa)
+			return true;
+	}
+
+	return false;
+}
+
+/* Figure out if we need to zap this btree format fork. */
+STATIC bool
+xfs_repair_inode_core_check_btree_fork(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip,
+	int				dfork_size,
+	int				whichfork)
+{
+	struct xfs_bmdr_block		*dfp;
+	int				nrecs;
+	int				level;
+
+	if (XFS_DFORK_NEXTENTS(dip, whichfork) <=
+			dfork_size / sizeof(struct xfs_bmbt_rec))
+		return true;
+
+	dfp = (struct xfs_bmdr_block *)XFS_DFORK_PTR(dip, whichfork);
+	nrecs = be16_to_cpu(dfp->bb_numrecs);
+	level = be16_to_cpu(dfp->bb_level);
+
+	if (nrecs == 0 || XFS_BMDR_SPACE_CALC(nrecs) > dfork_size)
+		return true;
+	if (level == 0 || level > XFS_BTREE_MAXLEVELS)
+		return true;
+	return false;
+}
+
+/*
+ * Check the data fork for things that will fail the ifork verifiers or the
+ * ifork formatters.
+ */
+STATIC bool
+xfs_repair_inode_core_check_data_fork(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip,
+	uint16_t			mode)
+{
+	uint64_t			size;
+	int				dfork_size;
+
+	size = be64_to_cpu(dip->di_size);
+	switch (mode & S_IFMT) {
+	case S_IFIFO:
+	case S_IFCHR:
+	case S_IFBLK:
+	case S_IFSOCK:
+		if (XFS_DFORK_FORMAT(dip, XFS_DATA_FORK) != XFS_DINODE_FMT_DEV)
+			return true;
+		break;
+	case S_IFREG:
+	case S_IFLNK:
+	case S_IFDIR:
+		switch (XFS_DFORK_FORMAT(dip, XFS_DATA_FORK)) {
+		case XFS_DINODE_FMT_LOCAL:
+		case XFS_DINODE_FMT_EXTENTS:
+		case XFS_DINODE_FMT_BTREE:
+			break;
+		default:
+			return true;
+		}
+		break;
+	default:
+		return true;
+	}
+	dfork_size = XFS_DFORK_SIZE(dip, sc->mp, XFS_DATA_FORK);
+	switch (XFS_DFORK_FORMAT(dip, XFS_DATA_FORK)) {
+	case XFS_DINODE_FMT_DEV:
+		break;
+	case XFS_DINODE_FMT_LOCAL:
+		if (size > dfork_size)
+			return true;
+		break;
+	case XFS_DINODE_FMT_EXTENTS:
+		if (xfs_repair_inode_core_check_extents_fork(sc, dip,
+				dfork_size, XFS_DATA_FORK))
+			return true;
+		break;
+	case XFS_DINODE_FMT_BTREE:
+		if (xfs_repair_inode_core_check_btree_fork(sc, dip,
+				dfork_size, XFS_DATA_FORK))
+			return true;
+		break;
+	default:
+		return true;
+	}
+
+	return false;
+}
+
+/* Reset the data fork to something sane. */
+STATIC void
+xfs_repair_inode_core_zap_data_fork(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip,
+	uint16_t			mode,
+	struct xfs_repair_inode_fork_counters	*rifc)
+{
+	char				*p;
+	const struct xfs_dir_ops	*ops;
+	struct xfs_dir2_sf_hdr		*sfp;
+	int				i8count;
+
+	/* Special files always get reset to DEV */
+	switch (mode & S_IFMT) {
+	case S_IFIFO:
+	case S_IFCHR:
+	case S_IFBLK:
+	case S_IFSOCK:
+		dip->di_format = XFS_DINODE_FMT_DEV;
+		dip->di_size = 0;
+		return;
+	}
+
+	/*
+	 * If we have data extents, reset to an empty map and hope the user
+	 * will run the bmapbtd checker next.
+	 */
+	if (rifc->data_extents || rifc->rt_extents || S_ISREG(mode)) {
+		dip->di_format = XFS_DINODE_FMT_EXTENTS;
+		dip->di_nextents = 0;
+		return;
+	}
+
+	/* Otherwise, reset the local format to the minimum. */
+	switch (mode & S_IFMT) {
+	case S_IFLNK:
+		/* Blow out symlink; now it points to root dir */
+		dip->di_format = XFS_DINODE_FMT_LOCAL;
+		dip->di_size = cpu_to_be64(1);
+		p = XFS_DFORK_PTR(dip, XFS_DATA_FORK);
+		*p = '/';
+		break;
+	case S_IFDIR:
+		/*
+		 * Blow out dir, make it point to the root.  In the
+		 * future the directory repair will reconstruct this
+		 * dir for us.
+		 */
+		dip->di_format = XFS_DINODE_FMT_LOCAL;
+		i8count = sc->mp->m_sb.sb_rootino > XFS_DIR2_MAX_SHORT_INUM;
+		ops = xfs_dir_get_ops(sc->mp, NULL);
+		sfp = (struct xfs_dir2_sf_hdr *)XFS_DFORK_PTR(dip,
+				XFS_DATA_FORK);
+		sfp->count = 0;
+		sfp->i8count = i8count;
+		ops->sf_put_parent_ino(sfp, sc->mp->m_sb.sb_rootino);
+		dip->di_size = cpu_to_be64(xfs_dir2_sf_hdr_size(i8count));
+		break;
+	}
+}
+
+/*
+ * Check the attr fork for things that will fail the ifork verifiers or the
+ * ifork formatters.
+ */
+STATIC bool
+xfs_repair_inode_core_check_attr_fork(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip)
+{
+	struct xfs_attr_shortform	*sfp;
+	int				size;
+
+	if (XFS_DFORK_BOFF(dip) == 0)
+		return dip->di_aformat != XFS_DINODE_FMT_EXTENTS ||
+		       dip->di_anextents != 0;
+
+	size = XFS_DFORK_SIZE(dip, sc->mp, XFS_ATTR_FORK);
+	switch (XFS_DFORK_FORMAT(dip, XFS_ATTR_FORK)) {
+	case XFS_DINODE_FMT_LOCAL:
+		sfp = (struct xfs_attr_shortform *)XFS_DFORK_PTR(dip,
+				XFS_ATTR_FORK);
+		return xfs_attr_shortform_verify_struct(sfp, size) != NULL;
+	case XFS_DINODE_FMT_EXTENTS:
+		if (xfs_repair_inode_core_check_extents_fork(sc, dip, size,
+				XFS_ATTR_FORK))
+			return true;
+		break;
+	case XFS_DINODE_FMT_BTREE:
+		if (xfs_repair_inode_core_check_btree_fork(sc, dip, size,
+				XFS_ATTR_FORK))
+			return true;
+		break;
+	default:
+		return true;
+	}
+
+	return false;
+}
+
+/* Reset the attr fork to something sane. */
+STATIC void
+xfs_repair_inode_core_zap_attr_fork(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip,
+	struct xfs_repair_inode_fork_counters	*rifc)
+{
+	dip->di_aformat = XFS_DINODE_FMT_EXTENTS;
+	dip->di_anextents = 0;
+	/*
+	 * We leave a nonzero forkoff so that the bmap scrub will look for
+	 * attr rmaps.
+	 */
+	dip->di_forkoff = rifc->attr_extents ? 1 : 0;
+}
+
+/*
+ * Zap the data/attr forks if we spot anything that isn't going to pass the
+ * ifork verifiers or the ifork formatters, because we need to get the inode
+ * into good enough shape that the higher level repair functions can run.
+ */
+STATIC void
+xfs_repair_inode_core_zap_forks(
+	struct xfs_scrub_context	*sc,
+	struct xfs_dinode		*dip,
+	struct xfs_repair_inode_fork_counters	*rifc)
+{
+	uint16_t			mode;
+	bool				zap_datafork = false;
+	bool				zap_attrfork = false;
+
+	mode = be16_to_cpu(dip->di_mode);
+
+	/* Inode counters don't make sense? */
+	if (be32_to_cpu(dip->di_nextents) > be64_to_cpu(dip->di_nblocks))
+		zap_datafork = true;
+	if (be16_to_cpu(dip->di_anextents) > be64_to_cpu(dip->di_nblocks))
+		zap_attrfork = true;
+	if (be32_to_cpu(dip->di_nextents) + be16_to_cpu(dip->di_anextents) >
+			be64_to_cpu(dip->di_nblocks))
+		zap_datafork = zap_attrfork = true;
+
+	if (!zap_datafork)
+		zap_datafork = xfs_repair_inode_core_check_data_fork(sc, dip,
+				mode);
+	if (!zap_attrfork)
+		zap_attrfork = xfs_repair_inode_core_check_attr_fork(sc, dip);
+
+	/* Zap whatever's bad. */
+	if (zap_attrfork)
+		xfs_repair_inode_core_zap_attr_fork(sc, dip, rifc);
+	if (zap_datafork)
+		xfs_repair_inode_core_zap_data_fork(sc, dip, mode, rifc);
+	dip->di_nblocks = 0;
+	if (!zap_attrfork)
+		be64_add_cpu(&dip->di_nblocks, rifc->attr_blocks);
+	if (!zap_datafork) {
+		be64_add_cpu(&dip->di_nblocks, rifc->data_blocks);
+		be64_add_cpu(&dip->di_nblocks, rifc->rt_blocks);
+	}
+}
+
 /* Inode didn't pass verifiers, so fix the raw buffer and retry iget. */
 STATIC int
 xfs_repair_inode_core(
 	struct xfs_scrub_context	*sc)
 {
+	struct xfs_repair_inode_fork_counters	rifc;
 	struct xfs_imap			imap;
 	struct xfs_buf			*bp;
 	struct xfs_dinode		*dip;
 	xfs_ino_t			ino;
 	int				error;
 
+	/* Figure out what this inode had mapped in both forks. */
+	memset(&rifc, 0, sizeof(rifc));
+	rifc.sc = sc;
+	error = xfs_repair_inode_count_rmaps(&rifc);
+	if (error)
+		return error;
+
 	/* Map & read inode. */
 	ino = sc->sm->sm_ino;
 	error = xfs_imap(sc->mp, sc->tp, ino, &imap, XFS_IGET_UNTRUSTED);
@@ -240,9 +634,10 @@ xfs_repair_inode_core(
 	dip = xfs_buf_offset(bp, imap.im_boffset);
 	xfs_repair_inode_header(sc, dip);
 	xfs_repair_inode_mode(dip);
-	xfs_repair_inode_flags(sc, dip);
+	xfs_repair_inode_flags(sc, dip, rifc.rt_extents > 0);
 	xfs_repair_inode_size(dip);
 	xfs_repair_inode_extsize_hints(sc, dip);
+	xfs_repair_inode_core_zap_forks(sc, dip, &rifc);
 
 	/* Write out the inode... */
 	xfs_dinode_calc_crc(sc->mp, dip);



* [PATCH 15/21] xfs: repair inode block maps
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (13 preceding siblings ...)
  2018-06-24 19:24 ` [PATCH 14/21] xfs: zap broken inode forks Darrick J. Wong
@ 2018-06-24 19:25 ` Darrick J. Wong
  2018-07-04  3:00   ` Dave Chinner
  2018-06-24 19:25 ` [PATCH 16/21] xfs: repair damaged symlinks Darrick J. Wong
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Use the reverse-mapping btree information to rebuild an inode fork.
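The rebuild collects one record per reverse mapping that belongs to this inode and fork, sorts the records by file offset, and then remaps each one into a freshly reset fork.  A rough userspace model of the sort step, with qsort standing in for the kernel's list_sort and a stripped-down stand-in for struct xfs_repair_bmap_extent (not the kernel API):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the collected rmap record. */
struct fake_rmap {
	uint64_t	rm_startblock;	/* physical start block */
	uint64_t	rm_offset;	/* logical file offset */
	uint32_t	rm_blockcount;	/* length in blocks */
};

/* Order records by logical file offset, like xfs_repair_bmap_extent_cmp. */
static int fake_rmap_cmp(const void *a, const void *b)
{
	const struct fake_rmap	*ap = a;
	const struct fake_rmap	*bp = b;

	if (ap->rm_offset > bp->rm_offset)
		return 1;
	if (ap->rm_offset < bp->rm_offset)
		return -1;
	return 0;
}

/* Sort the collected mappings before remapping them into the fork. */
static void sort_mappings(struct fake_rmap *recs, size_t nr)
{
	qsort(recs, nr, sizeof(*recs), fake_rmap_cmp);
}
```

Sorting by offset first means the remap loop inserts extents in ascending file order, which keeps the incore extent list appends cheap.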

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile            |    1 
 fs/xfs/scrub/bmap.c        |    8 +
 fs/xfs/scrub/bmap_repair.c |  488 ++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/repair.h      |    4 
 fs/xfs/scrub/scrub.c       |    4 
 5 files changed, 503 insertions(+), 2 deletions(-)
 create mode 100644 fs/xfs/scrub/bmap_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index f47f0fe0e70a..928c7dd0a28d 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -165,6 +165,7 @@ ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
 xfs-y				+= $(addprefix scrub/, \
 				   agheader_repair.o \
 				   alloc_repair.o \
+				   bmap_repair.o \
 				   ialloc_repair.o \
 				   inode_repair.o \
 				   refcount_repair.o \
diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
index 3d08589f5c60..cf40d65398e6 100644
--- a/fs/xfs/scrub/bmap.c
+++ b/fs/xfs/scrub/bmap.c
@@ -57,6 +57,14 @@ xfs_scrub_setup_inode_bmap(
 		error = filemap_write_and_wait(VFS_I(sc->ip)->i_mapping);
 		if (error)
 			goto out;
+
+		/* Drop the page cache if we're repairing block mappings. */
+		if (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) {
+			error = invalidate_inode_pages2(
+					VFS_I(sc->ip)->i_mapping);
+			if (error)
+				goto out;
+		}
 	}
 
 	/* Got the inode, lock it and we're ready to go. */
diff --git a/fs/xfs/scrub/bmap_repair.c b/fs/xfs/scrub/bmap_repair.c
new file mode 100644
index 000000000000..def391a897b6
--- /dev/null
+++ b/fs/xfs/scrub/bmap_repair.c
@@ -0,0 +1,488 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_inode.h"
+#include "xfs_inode_fork.h"
+#include "xfs_alloc.h"
+#include "xfs_rtalloc.h"
+#include "xfs_bmap.h"
+#include "xfs_bmap_util.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_rmap.h"
+#include "xfs_rmap_btree.h"
+#include "xfs_refcount.h"
+#include "xfs_quota.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/btree.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/* Inode fork block mapping (BMBT) repair. */
+
+struct xfs_repair_bmap_extent {
+	struct list_head		list;
+	struct xfs_rmap_irec		rmap;
+	xfs_agnumber_t			agno;
+};
+
+struct xfs_repair_bmap {
+	struct list_head		*extlist;
+	struct xfs_repair_extent_list	*btlist;
+	struct xfs_scrub_context	*sc;
+	xfs_ino_t			ino;
+	xfs_rfsblock_t			otherfork_blocks;
+	xfs_rfsblock_t			bmbt_blocks;
+	xfs_extnum_t			extents;
+	int				whichfork;
+};
+
+/* Record extents that belong to this inode's fork. */
+STATIC int
+xfs_repair_bmap_extent_fn(
+	struct xfs_btree_cur		*cur,
+	struct xfs_rmap_irec		*rec,
+	void				*priv)
+{
+	struct xfs_repair_bmap		*rb = priv;
+	struct xfs_repair_bmap_extent	*rbe;
+	struct xfs_mount		*mp = cur->bc_mp;
+	xfs_fsblock_t			fsbno;
+	int				error = 0;
+
+	if (xfs_scrub_should_terminate(rb->sc, &error))
+		return error;
+
+	/* Skip extents which are not owned by this inode and fork. */
+	if (rec->rm_owner != rb->ino) {
+		return 0;
+	} else if (rb->whichfork == XFS_DATA_FORK &&
+		 (rec->rm_flags & XFS_RMAP_ATTR_FORK)) {
+		rb->otherfork_blocks += rec->rm_blockcount;
+		return 0;
+	} else if (rb->whichfork == XFS_ATTR_FORK &&
+		 !(rec->rm_flags & XFS_RMAP_ATTR_FORK)) {
+		rb->otherfork_blocks += rec->rm_blockcount;
+		return 0;
+	}
+
+	rb->extents++;
+
+	/* Delete the old bmbt blocks later. */
+	if (rec->rm_flags & XFS_RMAP_BMBT_BLOCK) {
+		fsbno = XFS_AGB_TO_FSB(mp, cur->bc_private.a.agno,
+				rec->rm_startblock);
+		rb->bmbt_blocks += rec->rm_blockcount;
+		return xfs_repair_collect_btree_extent(rb->sc, rb->btlist,
+				fsbno, rec->rm_blockcount);
+	}
+
+	/* Remember this rmap. */
+	trace_xfs_repair_bmap_extent_fn(mp, cur->bc_private.a.agno,
+			rec->rm_startblock, rec->rm_blockcount, rec->rm_owner,
+			rec->rm_offset, rec->rm_flags);
+
+	rbe = kmem_alloc(sizeof(struct xfs_repair_bmap_extent), KM_MAYFAIL);
+	if (!rbe)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&rbe->list);
+	rbe->rmap = *rec;
+	rbe->agno = cur->bc_private.a.agno;
+	list_add_tail(&rbe->list, rb->extlist);
+
+	return 0;
+}
+
+/* Compare two bmap extents. */
+static int
+xfs_repair_bmap_extent_cmp(
+	void				*priv,
+	struct list_head		*a,
+	struct list_head		*b)
+{
+	struct xfs_repair_bmap_extent	*ap;
+	struct xfs_repair_bmap_extent	*bp;
+
+	ap = container_of(a, struct xfs_repair_bmap_extent, list);
+	bp = container_of(b, struct xfs_repair_bmap_extent, list);
+
+	if (ap->rmap.rm_offset > bp->rmap.rm_offset)
+		return 1;
+	else if (ap->rmap.rm_offset < bp->rmap.rm_offset)
+		return -1;
+	return 0;
+}
+
+/* Scan one AG for reverse mappings that we can turn into extent maps. */
+STATIC int
+xfs_repair_bmap_scan_ag(
+	struct xfs_repair_bmap		*rb,
+	xfs_agnumber_t			agno)
+{
+	struct xfs_scrub_context	*sc = rb->sc;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_buf			*agf_bp = NULL;
+	struct xfs_btree_cur		*cur;
+	int				error;
+
+	error = xfs_alloc_read_agf(mp, sc->tp, agno, 0, &agf_bp);
+	if (error)
+		return error;
+	if (!agf_bp)
+		return -ENOMEM;
+	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, agno);
+	error = xfs_rmap_query_all(cur, xfs_repair_bmap_extent_fn, rb);
+	if (error == XFS_BTREE_QUERY_RANGE_ABORT)
+		error = 0;
+	xfs_btree_del_cursor(cur, error ? XFS_BTREE_ERROR :
+			XFS_BTREE_NOERROR);
+	xfs_trans_brelse(sc->tp, agf_bp);
+	return error;
+}
+
+/* Insert bmap records into an inode fork, given an rmap. */
+STATIC int
+xfs_repair_bmap_insert_rec(
+	struct xfs_scrub_context	*sc,
+	struct xfs_repair_bmap_extent	*rbe,
+	int				baseflags)
+{
+	struct xfs_bmbt_irec		bmap;
+	struct xfs_defer_ops		dfops;
+	xfs_fsblock_t			firstfsb;
+	xfs_extlen_t			extlen;
+	int				flags;
+	int				error = 0;
+
+	/* Form the "new" mapping... */
+	bmap.br_startblock = XFS_AGB_TO_FSB(sc->mp, rbe->agno,
+			rbe->rmap.rm_startblock);
+	bmap.br_startoff = rbe->rmap.rm_offset;
+
+	flags = 0;
+	if (rbe->rmap.rm_flags & XFS_RMAP_UNWRITTEN)
+		flags = XFS_BMAPI_PREALLOC;
+	while (rbe->rmap.rm_blockcount > 0) {
+		xfs_defer_init(&dfops, &firstfsb);
+		extlen = min_t(xfs_extlen_t, rbe->rmap.rm_blockcount,
+				MAXEXTLEN);
+		bmap.br_blockcount = extlen;
+
+		/* Re-add the extent to the fork. */
+		error = xfs_bmapi_remap(sc->tp, sc->ip,
+				bmap.br_startoff, extlen,
+				bmap.br_startblock, &dfops,
+				baseflags | flags);
+		if (error)
+			goto out_cancel;
+
+		bmap.br_startblock += extlen;
+		bmap.br_startoff += extlen;
+		rbe->rmap.rm_blockcount -= extlen;
+		error = xfs_defer_ijoin(&dfops, sc->ip);
+		if (error)
+			goto out_cancel;
+		error = xfs_defer_finish(&sc->tp, &dfops);
+		if (error)
+			goto out;
+		/* Make sure we roll the transaction. */
+		error = xfs_trans_roll_inode(&sc->tp, sc->ip);
+		if (error)
+			goto out;
+	}
+
+	return 0;
+out_cancel:
+	xfs_defer_cancel(&dfops);
+out:
+	return error;
+}
+
+/* Check for garbage inputs. */
+STATIC int
+xfs_repair_bmap_check_inputs(
+	struct xfs_scrub_context	*sc,
+	int				whichfork)
+{
+	ASSERT(whichfork == XFS_DATA_FORK || whichfork == XFS_ATTR_FORK);
+
+	/* Don't know how to repair the other fork formats. */
+	if (XFS_IFORK_FORMAT(sc->ip, whichfork) != XFS_DINODE_FMT_EXTENTS &&
+	    XFS_IFORK_FORMAT(sc->ip, whichfork) != XFS_DINODE_FMT_BTREE)
+		return -EOPNOTSUPP;
+
+	/* Only files, symlinks, and directories get to have data forks. */
+	if (whichfork == XFS_DATA_FORK && !S_ISREG(VFS_I(sc->ip)->i_mode) &&
+	    !S_ISDIR(VFS_I(sc->ip)->i_mode) && !S_ISLNK(VFS_I(sc->ip)->i_mode))
+		return -EINVAL;
+
+	/* If we somehow have delalloc extents, forget it. */
+	if (whichfork == XFS_DATA_FORK && sc->ip->i_delayed_blks)
+		return -EBUSY;
+
+	/*
+	 * If there's no attr fork area in the inode, there's
+	 * no attr fork to rebuild.
+	 */
+	if (whichfork == XFS_ATTR_FORK && !XFS_IFORK_Q(sc->ip))
+		return -ENOENT;
+
+	/* We require the rmapbt to rebuild anything. */
+	if (!xfs_sb_version_hasrmapbt(&sc->mp->m_sb))
+		return -EOPNOTSUPP;
+
+	/* Don't know how to rebuild realtime data forks. */
+	if (XFS_IS_REALTIME_INODE(sc->ip) && whichfork == XFS_DATA_FORK)
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
+/*
+ * Collect block mappings for this fork of this inode and decide if we have
+ * enough space to rebuild.  Caller is responsible for cleaning up the list if
+ * anything goes wrong.
+ */
+STATIC int
+xfs_repair_bmap_find_mappings(
+	struct xfs_scrub_context	*sc,
+	int				whichfork,
+	struct list_head		*mapping_records,
+	struct xfs_repair_extent_list	*old_bmbt_blocks,
+	xfs_rfsblock_t			*old_bmbt_block_count,
+	xfs_rfsblock_t			*otherfork_blocks)
+{
+	struct xfs_repair_bmap		rb;
+	xfs_agnumber_t			agno;
+	unsigned int			resblks;
+	int				error;
+
+	memset(&rb, 0, sizeof(rb));
+	rb.extlist = mapping_records;
+	rb.btlist = old_bmbt_blocks;
+	rb.ino = sc->ip->i_ino;
+	rb.whichfork = whichfork;
+	rb.sc = sc;
+
+	/* Iterate the rmaps for extents. */
+	for (agno = 0; agno < sc->mp->m_sb.sb_agcount; agno++) {
+		error = xfs_repair_bmap_scan_ag(&rb, agno);
+		if (error)
+			return error;
+	}
+
+	/*
+	 * Guess how many blocks we're going to need to rebuild an entire bmap
+	 * from the number of extents we found, and pump up our transaction to
+	 * have sufficient block reservation.
+	 */
+	resblks = xfs_bmbt_calc_size(sc->mp, rb.extents);
+	error = xfs_trans_reserve_more(sc->tp, resblks, 0);
+	if (error)
+		return error;
+
+	*otherfork_blocks = rb.otherfork_blocks;
+	*old_bmbt_block_count = rb.bmbt_blocks;
+	return 0;
+}
+
+/* Update the inode counters. */
+STATIC int
+xfs_repair_bmap_reset_counters(
+	struct xfs_scrub_context	*sc,
+	xfs_rfsblock_t			old_bmbt_block_count,
+	xfs_rfsblock_t			otherfork_blocks,
+	int				*log_flags)
+{
+	int				error;
+
+	xfs_trans_ijoin(sc->tp, sc->ip, 0);
+
+	/*
+	 * Drop the block counts associated with this fork since we'll re-add
+	 * them with the bmap routines later.
+	 */
+	sc->ip->i_d.di_nblocks = otherfork_blocks;
+	*log_flags |= XFS_ILOG_CORE;
+
+	if (!old_bmbt_block_count)
+		return 0;
+
+	/* Release quota counts for the old bmbt blocks. */
+	error = xfs_repair_ino_dqattach(sc);
+	if (error)
+		return error;
+	xfs_trans_mod_dquot_byino(sc->tp, sc->ip, XFS_TRANS_DQ_BCOUNT,
+			-(int64_t)old_bmbt_block_count);
+	return 0;
+}
+
+/* Initialize a new fork and implant it in the inode. */
+STATIC void
+xfs_repair_bmap_reset_fork(
+	struct xfs_scrub_context	*sc,
+	int				whichfork,
+	bool				has_mappings,
+	int				*log_flags)
+{
+	/* Set us back to extents format with zero records. */
+	XFS_IFORK_FMT_SET(sc->ip, whichfork, XFS_DINODE_FMT_EXTENTS);
+	XFS_IFORK_NEXT_SET(sc->ip, whichfork, 0);
+
+	/* Reinitialize the on-disk fork. */
+	if (XFS_IFORK_PTR(sc->ip, whichfork) != NULL)
+		xfs_idestroy_fork(sc->ip, whichfork);
+	if (whichfork == XFS_DATA_FORK) {
+		memset(&sc->ip->i_df, 0, sizeof(struct xfs_ifork));
+		sc->ip->i_df.if_flags |= XFS_IFEXTENTS;
+	} else if (whichfork == XFS_ATTR_FORK) {
+		if (has_mappings) {
+			sc->ip->i_afp = NULL;
+		} else {
+			sc->ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone,
+					KM_SLEEP);
+			sc->ip->i_afp->if_flags |= XFS_IFEXTENTS;
+		}
+	}
+	*log_flags |= XFS_ILOG_CORE;
+}
+
+/* Build new fork mappings and dispose of the old bmbt blocks. */
+STATIC int
+xfs_repair_bmap_rebuild_tree(
+	struct xfs_scrub_context	*sc,
+	int				whichfork,
+	struct list_head		*mapping_records,
+	struct xfs_repair_extent_list	*old_bmbt_blocks)
+{
+	struct xfs_owner_info		oinfo;
+	struct xfs_repair_bmap_extent	*rbe;
+	struct xfs_repair_bmap_extent	*n;
+	int				baseflags;
+	int				error;
+
+	baseflags = XFS_BMAPI_NORMAP;
+	if (whichfork == XFS_ATTR_FORK)
+		baseflags |= XFS_BMAPI_ATTRFORK;
+
+	/* "Remap" the extents into the fork. */
+	list_sort(NULL, mapping_records, xfs_repair_bmap_extent_cmp);
+	list_for_each_entry_safe(rbe, n, mapping_records, list) {
+		error = xfs_repair_bmap_insert_rec(sc, rbe, baseflags);
+		if (error)
+			return error;
+		list_del(&rbe->list);
+		kmem_free(rbe);
+	}
+
+	/* Dispose of all the old bmbt blocks. */
+	xfs_rmap_ino_bmbt_owner(&oinfo, sc->ip->i_ino, whichfork);
+	return xfs_repair_reap_btree_extents(sc, old_bmbt_blocks, &oinfo,
+			XFS_AG_RESV_NONE);
+}
+
+/* Free every record in the mapping list. */
+STATIC void
+xfs_repair_bmap_cancel_bmbtrecs(
+	struct list_head		*recs)
+{
+	struct xfs_repair_bmap_extent	*rbe;
+	struct xfs_repair_bmap_extent	*n;
+
+	list_for_each_entry_safe(rbe, n, recs, list) {
+		list_del(&rbe->list);
+		kmem_free(rbe);
+	}
+}
+
+/* Repair an inode fork. */
+STATIC int
+xfs_repair_bmap(
+	struct xfs_scrub_context	*sc,
+	int				whichfork)
+{
+	struct list_head		mapping_records;
+	struct xfs_repair_extent_list	old_bmbt_blocks;
+	struct xfs_inode		*ip = sc->ip;
+	xfs_rfsblock_t			old_bmbt_block_count;
+	xfs_rfsblock_t			otherfork_blocks;
+	int				log_flags = 0;
+	int				error = 0;
+
+	error = xfs_repair_bmap_check_inputs(sc, whichfork);
+	if (error)
+		return error;
+
+	/*
+	 * If this is a file data fork, wait for all pending directio to
+	 * complete, then tear everything out of the page cache.
+	 */
+	if (S_ISREG(VFS_I(ip)->i_mode) && whichfork == XFS_DATA_FORK) {
+		inode_dio_wait(VFS_I(ip));
+		truncate_inode_pages(VFS_I(ip)->i_mapping, 0);
+	}
+
+	/* Collect all reverse mappings for this fork's extents. */
+	INIT_LIST_HEAD(&mapping_records);
+	xfs_repair_init_extent_list(&old_bmbt_blocks);
+	error = xfs_repair_bmap_find_mappings(sc, whichfork, &mapping_records,
+			&old_bmbt_blocks, &old_bmbt_block_count,
+			&otherfork_blocks);
+	if (error)
+		goto out;
+
+	/*
+	 * Blow out the in-core fork and zero the on-disk fork.  This is the
+	 * point at which we are no longer able to bail out gracefully.
+	 */
+	error = xfs_repair_bmap_reset_counters(sc, old_bmbt_block_count,
+			otherfork_blocks, &log_flags);
+	if (error)
+		goto out;
+	xfs_repair_bmap_reset_fork(sc, whichfork, list_empty(&mapping_records),
+			&log_flags);
+	xfs_trans_log_inode(sc->tp, sc->ip, log_flags);
+	error = xfs_trans_roll_inode(&sc->tp, sc->ip);
+	if (error)
+		goto out;
+
+	/* Now rebuild the fork extent map information. */
+	error = xfs_repair_bmap_rebuild_tree(sc, whichfork, &mapping_records,
+			&old_bmbt_blocks);
+out:
+	xfs_repair_cancel_btree_extents(sc, &old_bmbt_blocks);
+	xfs_repair_bmap_cancel_bmbtrecs(&mapping_records);
+	return error;
+}
+
+/* Repair an inode's data fork. */
+int
+xfs_repair_bmap_data(
+	struct xfs_scrub_context	*sc)
+{
+	return xfs_repair_bmap(sc, XFS_DATA_FORK);
+}
+
+/* Repair an inode's attr fork. */
+int
+xfs_repair_bmap_attr(
+	struct xfs_scrub_context	*sc)
+{
+	return xfs_repair_bmap(sc, XFS_ATTR_FORK);
+}
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index e3a763540780..a832ed485e4e 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -110,6 +110,8 @@ int xfs_repair_iallocbt(struct xfs_scrub_context *sc);
 int xfs_repair_rmapbt(struct xfs_scrub_context *sc);
 int xfs_repair_refcountbt(struct xfs_scrub_context *sc);
 int xfs_repair_inode(struct xfs_scrub_context *sc);
+int xfs_repair_bmap_data(struct xfs_scrub_context *sc);
+int xfs_repair_bmap_attr(struct xfs_scrub_context *sc);
 
 #else
 
@@ -149,6 +151,8 @@ static inline int xfs_repair_rmapbt_setup(
 #define xfs_repair_rmapbt		xfs_repair_notsupported
 #define xfs_repair_refcountbt		xfs_repair_notsupported
 #define xfs_repair_inode		xfs_repair_notsupported
+#define xfs_repair_bmap_data		xfs_repair_notsupported
+#define xfs_repair_bmap_attr		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 77cbb955d8a8..eecb96fe2feb 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -299,13 +299,13 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.type	= ST_INODE,
 		.setup	= xfs_scrub_setup_inode_bmap,
 		.scrub	= xfs_scrub_bmap_data,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_bmap_data,
 	},
 	[XFS_SCRUB_TYPE_BMBTA] = {	/* inode attr fork */
 		.type	= ST_INODE,
 		.setup	= xfs_scrub_setup_inode_bmap,
 		.scrub	= xfs_scrub_bmap_attr,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_bmap_attr,
 	},
 	[XFS_SCRUB_TYPE_BMBTC] = {	/* inode CoW fork */
 		.type	= ST_INODE,



* [PATCH 16/21] xfs: repair damaged symlinks
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (14 preceding siblings ...)
  2018-06-24 19:25 ` [PATCH 15/21] xfs: repair inode block maps Darrick J. Wong
@ 2018-06-24 19:25 ` Darrick J. Wong
  2018-07-04  5:45   ` Dave Chinner
  2018-06-24 19:25 ` [PATCH 17/21] xfs: repair extended attributes Darrick J. Wong
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Repair inconsistent symbolic link data.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile               |    1 
 fs/xfs/scrub/repair.h         |    2 
 fs/xfs/scrub/scrub.c          |    2 
 fs/xfs/scrub/symlink.c        |    2 
 fs/xfs/scrub/symlink_repair.c |  301 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 306 insertions(+), 2 deletions(-)
 create mode 100644 fs/xfs/scrub/symlink_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 928c7dd0a28d..36156166fef0 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -171,6 +171,7 @@ xfs-y				+= $(addprefix scrub/, \
 				   refcount_repair.o \
 				   repair.o \
 				   rmap_repair.o \
+				   symlink_repair.o \
 				   )
 endif
 endif
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index a832ed485e4e..14fa8cf89799 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -112,6 +112,7 @@ int xfs_repair_refcountbt(struct xfs_scrub_context *sc);
 int xfs_repair_inode(struct xfs_scrub_context *sc);
 int xfs_repair_bmap_data(struct xfs_scrub_context *sc);
 int xfs_repair_bmap_attr(struct xfs_scrub_context *sc);
+int xfs_repair_symlink(struct xfs_scrub_context *sc);
 
 #else
 
@@ -153,6 +154,7 @@ static inline int xfs_repair_rmapbt_setup(
 #define xfs_repair_inode		xfs_repair_notsupported
 #define xfs_repair_bmap_data		xfs_repair_notsupported
 #define xfs_repair_bmap_attr		xfs_repair_notsupported
+#define xfs_repair_symlink		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index eecb96fe2feb..6d7ae6e0e165 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -329,7 +329,7 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.type	= ST_INODE,
 		.setup	= xfs_scrub_setup_symlink,
 		.scrub	= xfs_scrub_symlink,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_symlink,
 	},
 	[XFS_SCRUB_TYPE_PARENT] = {	/* parent pointers */
 		.type	= ST_INODE,
diff --git a/fs/xfs/scrub/symlink.c b/fs/xfs/scrub/symlink.c
index 570a89812116..7f2ba4082705 100644
--- a/fs/xfs/scrub/symlink.c
+++ b/fs/xfs/scrub/symlink.c
@@ -34,7 +34,7 @@ xfs_scrub_setup_symlink(
 	if (!sc->buf)
 		return -ENOMEM;
 
-	return xfs_scrub_setup_inode_contents(sc, ip, 0);
+	return xfs_scrub_setup_inode_contents(sc, ip, XFS_SYMLINK_MAPS);
 }
 
 /* Symbolic links. */
diff --git a/fs/xfs/scrub/symlink_repair.c b/fs/xfs/scrub/symlink_repair.c
new file mode 100644
index 000000000000..acc51eb3a879
--- /dev/null
+++ b/fs/xfs/scrub/symlink_repair.c
@@ -0,0 +1,301 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_inode.h"
+#include "xfs_inode_fork.h"
+#include "xfs_symlink.h"
+#include "xfs_bmap.h"
+#include "xfs_quota.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/*
+ * Symbolic Link Repair
+ * ====================
+ *
+ * There's not much we can do to repair symbolic links -- we truncate them to
+ * the first NULL byte and fix up the remote target block headers if they're
+ * incorrect.  Zero-length symlinks are turned into links to /.
+ */
+
+/* Blow out the whole symlink; replace contents. */
+STATIC int
+xfs_repair_symlink_rewrite(
+	struct xfs_trans	**tpp,
+	struct xfs_inode	*ip,
+	const char		*target_path,
+	int			pathlen)
+{
+	struct xfs_defer_ops	dfops;
+	struct xfs_bmbt_irec	mval[XFS_SYMLINK_MAPS];
+	struct xfs_ifork	*ifp;
+	const char		*cur_chunk;
+	struct xfs_mount	*mp = (*tpp)->t_mountp;
+	struct xfs_buf		*bp;
+	xfs_fsblock_t		first_block;
+	xfs_fileoff_t		first_fsb;
+	xfs_filblks_t		fs_blocks;
+	xfs_daddr_t		d;
+	int			byte_cnt;
+	int			n;
+	int			nmaps;
+	int			offset;
+	int			error = 0;
+
+	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+
+	/* Truncate the whole data fork if it wasn't inline. */
+	if (!(ifp->if_flags & XFS_IFINLINE)) {
+		error = xfs_itruncate_extents(tpp, ip, XFS_DATA_FORK, 0);
+		if (error)
+			goto out;
+	}
+
+	/* Blow out the in-core fork and zero the on-disk fork. */
+	xfs_idestroy_fork(ip, XFS_DATA_FORK);
+	ip->i_d.di_format = XFS_DINODE_FMT_EXTENTS;
+	ip->i_d.di_nextents = 0;
+	memset(&ip->i_df, 0, sizeof(struct xfs_ifork));
+	ip->i_df.if_flags |= XFS_IFEXTENTS;
+
+	/* Rewrite an inline symlink. */
+	if (pathlen <= XFS_IFORK_DSIZE(ip)) {
+		xfs_init_local_fork(ip, XFS_DATA_FORK, target_path, pathlen);
+
+		i_size_write(VFS_I(ip), pathlen);
+		ip->i_d.di_size = pathlen;
+		ip->i_d.di_format = XFS_DINODE_FMT_LOCAL;
+		xfs_trans_log_inode(*tpp, ip, XFS_ILOG_DDATA | XFS_ILOG_CORE);
+		goto out;
+
+	}
+
+	/* Rewrite a remote symlink. */
+	fs_blocks = xfs_symlink_blocks(mp, pathlen);
+	first_fsb = 0;
+	nmaps = XFS_SYMLINK_MAPS;
+
+	/* Reserve quota for new blocks. */
+	error = xfs_trans_reserve_quota_nblks(*tpp, ip, fs_blocks, 0,
+			XFS_QMOPT_RES_REGBLKS);
+	if (error)
+		goto out;
+
+	/* Map blocks, write symlink target. */
+	xfs_defer_init(&dfops, &first_block);
+
+	error = xfs_bmapi_write(*tpp, ip, first_fsb, fs_blocks,
+			  XFS_BMAPI_METADATA, &first_block, fs_blocks,
+			  mval, &nmaps, &dfops);
+	if (error)
+		goto out_bmap_cancel;
+
+	ip->i_d.di_size = pathlen;
+	i_size_write(VFS_I(ip), pathlen);
+	xfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE);
+
+	cur_chunk = target_path;
+	offset = 0;
+	for (n = 0; n < nmaps; n++) {
+		char	*buf;
+
+		d = XFS_FSB_TO_DADDR(mp, mval[n].br_startblock);
+		byte_cnt = XFS_FSB_TO_B(mp, mval[n].br_blockcount);
+		bp = xfs_trans_get_buf(*tpp, mp->m_ddev_targp, d,
+				       BTOBB(byte_cnt), 0);
+		if (!bp) {
+			error = -ENOMEM;
+			goto out_bmap_cancel;
+		}
+		bp->b_ops = &xfs_symlink_buf_ops;
+
+		byte_cnt = XFS_SYMLINK_BUF_SPACE(mp, byte_cnt);
+		byte_cnt = min(byte_cnt, pathlen);
+
+		buf = bp->b_addr;
+		buf += xfs_symlink_hdr_set(mp, ip->i_ino, offset,
+					   byte_cnt, bp);
+
+		memcpy(buf, cur_chunk, byte_cnt);
+
+		cur_chunk += byte_cnt;
+		pathlen -= byte_cnt;
+		offset += byte_cnt;
+
+		xfs_trans_buf_set_type(*tpp, bp, XFS_BLFT_SYMLINK_BUF);
+		xfs_trans_log_buf(*tpp, bp, 0, (buf + byte_cnt - 1) -
+						(char *)bp->b_addr);
+	}
+	ASSERT(pathlen == 0);
+
+	error = xfs_defer_finish(tpp, &dfops);
+	if (error)
+		goto out_bmap_cancel;
+
+	return 0;
+
+out_bmap_cancel:
+	xfs_defer_cancel(&dfops);
+out:
+	return error;
+}
+
+/* Fix everything that fails the verifiers in the remote blocks. */
+STATIC int
+xfs_repair_symlink_fix_remotes(
+	struct xfs_scrub_context	*sc,
+	loff_t				len)
+{
+	struct xfs_bmbt_irec		mval[XFS_SYMLINK_MAPS];
+	struct xfs_buf			*bp;
+	xfs_filblks_t			fsblocks;
+	xfs_daddr_t			d;
+	loff_t				offset;
+	unsigned int			byte_cnt;
+	int				n;
+	int				nmaps = XFS_SYMLINK_MAPS;
+	int				nr;
+	int				error;
+
+	fsblocks = xfs_symlink_blocks(sc->mp, len);
+	error = xfs_bmapi_read(sc->ip, 0, fsblocks, mval, &nmaps, 0);
+	if (error)
+		return error;
+
+	offset = 0;
+	for (n = 0; n < nmaps; n++) {
+		d = XFS_FSB_TO_DADDR(sc->mp, mval[n].br_startblock);
+		byte_cnt = XFS_FSB_TO_B(sc->mp, mval[n].br_blockcount);
+
+		error = xfs_trans_read_buf(sc->mp, sc->tp, sc->mp->m_ddev_targp,
+				d, BTOBB(byte_cnt), 0, &bp, NULL);
+		if (error)
+			return error;
+		bp->b_ops = &xfs_symlink_buf_ops;
+
+		byte_cnt = XFS_SYMLINK_BUF_SPACE(sc->mp, byte_cnt);
+		if (len < byte_cnt)
+			byte_cnt = len;
+
+		nr = xfs_symlink_hdr_set(sc->mp, sc->ip->i_ino, offset,
+				byte_cnt, bp);
+
+		len -= byte_cnt;
+		offset += byte_cnt;
+
+		xfs_trans_buf_set_type(sc->tp, bp, XFS_BLFT_SYMLINK_BUF);
+		xfs_trans_log_buf(sc->tp, bp, 0, nr - 1);
+		xfs_trans_brelse(sc->tp, bp);
+	}
+	if (len != 0)
+		return -EFSCORRUPTED;
+
+	return 0;
+}
+
+/* Fix this inline symlink. */
+STATIC int
+xfs_repair_symlink_inline(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_inode		*ip = sc->ip;
+	struct xfs_ifork		*ifp;
+	loff_t				len;
+	size_t				newlen;
+
+	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+	len = i_size_read(VFS_I(ip));
+	xfs_trans_ijoin(sc->tp, ip, 0);
+
+	if (ifp->if_u1.if_data) {
+		newlen = strnlen(ifp->if_u1.if_data, XFS_IFORK_DSIZE(ip));
+	} else {
+		/* Zero length symlink becomes a root symlink. */
+		ifp->if_u1.if_data = kmem_alloc(4, KM_SLEEP);
+		snprintf(ifp->if_u1.if_data, 4, "/");
+		newlen = 1;
+	}
+
+	if (len > newlen) {
+		i_size_write(VFS_I(ip), newlen);
+		ip->i_d.di_size = newlen;
+		xfs_trans_log_inode(sc->tp, ip, XFS_ILOG_DDATA | XFS_ILOG_CORE);
+	}
+
+	return 0;
+}
+
+/* Repair a remote symlink. */
+STATIC int
+xfs_repair_symlink_remote(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_inode		*ip = sc->ip;
+	loff_t				len;
+	size_t				newlen;
+	int				error = 0;
+
+	len = i_size_read(VFS_I(ip));
+	xfs_trans_ijoin(sc->tp, ip, 0);
+
+	error = xfs_repair_symlink_fix_remotes(sc, len);
+	if (error)
+		return error;
+
+	/* Roll transaction, release buffers. */
+	error = xfs_trans_roll_inode(&sc->tp, ip);
+	if (error)
+		return error;
+
+	/* Size set correctly? */
+	len = i_size_read(VFS_I(ip));
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	error = xfs_readlink(ip, sc->buf);
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	if (error)
+		return error;
+
+	/*
+	 * Figure out the new target length.  We can't handle zero-length
+	 * symlinks, so make sure that we don't write that out.
+	 */
+	newlen = strnlen(sc->buf, XFS_SYMLINK_MAXLEN);
+	if (newlen == 0) {
+		*((char *)sc->buf) = '/';
+		newlen = 1;
+	}
+
+	if (len > newlen)
+		return xfs_repair_symlink_rewrite(&sc->tp, ip, sc->buf, newlen);
+	return 0;
+}
+
+/* Repair a symbolic link. */
+int
+xfs_repair_symlink(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_ifork		*ifp;
+
+	ifp = XFS_IFORK_PTR(sc->ip, XFS_DATA_FORK);
+	if (ifp->if_flags & XFS_IFINLINE)
+		return xfs_repair_symlink_inline(sc);
+	return xfs_repair_symlink_remote(sc);
+}



* [PATCH 17/21] xfs: repair extended attributes
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (15 preceding siblings ...)
  2018-06-24 19:25 ` [PATCH 16/21] xfs: repair damaged symlinks Darrick J. Wong
@ 2018-06-24 19:25 ` Darrick J. Wong
  2018-07-06  1:03   ` Dave Chinner
  2018-06-24 19:25 ` [PATCH 18/21] xfs: scrub should set preen if attr leaf has holes Darrick J. Wong
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

If the extended attributes look bad, try to sift through the rubble to
find whatever keys/values we can, zap the attr tree, and re-add the
values.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile            |    1 
 fs/xfs/scrub/attr.c        |    2 
 fs/xfs/scrub/attr_repair.c |  568 ++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/repair.h      |    2 
 fs/xfs/scrub/scrub.c       |    2 
 fs/xfs/scrub/scrub.h       |    3 
 6 files changed, 576 insertions(+), 2 deletions(-)
 create mode 100644 fs/xfs/scrub/attr_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 36156166fef0..24d8b19a837b 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -164,6 +164,7 @@ xfs-$(CONFIG_XFS_QUOTA)		+= scrub/quota.o
 ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
 xfs-y				+= $(addprefix scrub/, \
 				   agheader_repair.o \
+				   attr_repair.o \
 				   alloc_repair.o \
 				   bmap_repair.o \
 				   ialloc_repair.o \
diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
index de51cf8a8516..50715617eb1e 100644
--- a/fs/xfs/scrub/attr.c
+++ b/fs/xfs/scrub/attr.c
@@ -125,7 +125,7 @@ xfs_scrub_xattr_listent(
  * Within a char, the lowest bit of the char represents the byte with
  * the smallest address
  */
-STATIC bool
+bool
 xfs_scrub_xattr_set_map(
 	struct xfs_scrub_context	*sc,
 	unsigned long			*map,
diff --git a/fs/xfs/scrub/attr_repair.c b/fs/xfs/scrub/attr_repair.c
new file mode 100644
index 000000000000..5f2e4dad92b7
--- /dev/null
+++ b/fs/xfs/scrub/attr_repair.c
@@ -0,0 +1,568 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_inode.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
+#include "xfs_dir2.h"
+#include "xfs_attr.h"
+#include "xfs_attr_leaf.h"
+#include "xfs_attr_sf.h"
+#include "xfs_attr_remote.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/*
+ * Extended Attribute Repair
+ * =========================
+ *
+ * We repair extended attributes by reading the attribute fork blocks looking
+ * for keys and values, then truncate the entire attr fork and reinsert all
+ * the attributes.  Unfortunately, there's no secondary copy of most extended
+ * attribute data, which means that if we blow up midway through there's
+ * little we can do.
+ */
+
+struct xfs_attr_key {
+	struct list_head		list;
+	unsigned char			*value;
+	int				valuelen;
+	int				flags;
+	int				namelen;
+	unsigned char			name[0];
+};
+
+#define XFS_ATTR_KEY_LEN(namelen) (sizeof(struct xfs_attr_key) + (namelen) + 1)
+
+struct xfs_repair_xattr {
+	struct list_head		*attrlist;
+	struct xfs_scrub_context	*sc;
+};
+
+/* Iterate each block in an attr fork extent */
+#define for_each_xfs_attr_block(mp, irec, dabno) \
+	for ((dabno) = roundup((xfs_dablk_t)(irec)->br_startoff, \
+			(mp)->m_attr_geo->fsbcount); \
+	     (dabno) < (irec)->br_startoff + (irec)->br_blockcount; \
+	     (dabno) += (mp)->m_attr_geo->fsbcount)
+
+/*
+ * Record an extended attribute key & value for later reinsertion into the
+ * inode.  Use the helpers below, don't call this directly.
+ */
+STATIC int
+__xfs_repair_xattr_salvage_attr(
+	struct xfs_repair_xattr		*rx,
+	struct xfs_buf			*bp,
+	int				flags,
+	int				idx,
+	unsigned char			*name,
+	int				namelen,
+	unsigned char			*value,
+	int				valuelen)
+{
+	struct xfs_attr_key		*key;
+	struct xfs_da_args		args;
+	int				error = -ENOMEM;
+
+	/* Ignore incomplete or oversized attributes. */
+	if ((flags & XFS_ATTR_INCOMPLETE) ||
+	    namelen > XATTR_NAME_MAX || namelen < 0 ||
+	    valuelen > XATTR_SIZE_MAX || valuelen < 0)
+		return 0;
+
+	/* Store attr key. */
+	key = kmem_alloc(XFS_ATTR_KEY_LEN(namelen), KM_MAYFAIL);
+	if (!key)
+		goto err;
+	INIT_LIST_HEAD(&key->list);
+	key->value = kmem_zalloc_large(valuelen, KM_MAYFAIL);
+	if (!key->value)
+		goto err_key;
+	key->valuelen = valuelen;
+	key->flags = flags & (ATTR_ROOT | ATTR_SECURE);
+	key->namelen = namelen;
+	key->name[namelen] = 0;
+	memcpy(key->name, name, namelen);
+
+	/* Caller already had the value, so copy it and exit. */
+	if (value) {
+		memcpy(key->value, value, valuelen);
+		goto out_ok;
+	}
+
+	/* Otherwise look up the remote value directly. */
+	memset(&args, 0, sizeof(args));
+	args.geo = rx->sc->mp->m_attr_geo;
+	args.index = idx;
+	args.namelen = namelen;
+	args.name = key->name;
+	args.valuelen = valuelen;
+	args.value = key->value;
+	args.dp = rx->sc->ip;
+	args.trans = rx->sc->tp;
+	error = xfs_attr3_leaf_getvalue(bp, &args);
+	if (error || args.rmtblkno == 0)
+		goto err_value;
+
+	error = xfs_attr_rmtval_get(&args);
+	switch (error) {
+	case 0:
+		break;
+	case -EFSBADCRC:
+	case -EFSCORRUPTED:
+		error = 0;
+		/* fall through */
+	default:
+		goto err_value;
+	}
+
+out_ok:
+	list_add_tail(&key->list, rx->attrlist);
+	return 0;
+
+err_value:
+	kmem_free(key->value);
+err_key:
+	kmem_free(key);
+err:
+	return error;
+}
+
+/*
+ * Record a local format extended attribute key & value for later reinsertion
+ * into the inode.
+ */
+static inline int
+xfs_repair_xattr_salvage_local_attr(
+	struct xfs_repair_xattr		*rx,
+	int				flags,
+	unsigned char			*name,
+	int				namelen,
+	unsigned char			*value,
+	int				valuelen)
+{
+	return __xfs_repair_xattr_salvage_attr(rx, NULL, flags, 0, name,
+			namelen, value, valuelen);
+}
+
+/*
+ * Record a remote format extended attribute key & value for later reinsertion
+ * into the inode.
+ */
+static inline int
+xfs_repair_xattr_salvage_remote_attr(
+	struct xfs_repair_xattr		*rx,
+	int				flags,
+	unsigned char			*name,
+	int				namelen,
+	struct xfs_buf			*leaf_bp,
+	int				idx,
+	int				valuelen)
+{
+	return __xfs_repair_xattr_salvage_attr(rx, leaf_bp, flags, idx,
+			name, namelen, NULL, valuelen);
+}
+
+/* Extract every xattr key that we can from this attr fork block. */
+STATIC int
+xfs_repair_xattr_recover_leaf(
+	struct xfs_repair_xattr		*rx,
+	struct xfs_buf			*bp)
+{
+	struct xfs_attr3_icleaf_hdr	leafhdr;
+	struct xfs_scrub_context	*sc = rx->sc;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_attr_leafblock	*leaf;
+	unsigned long			*usedmap = sc->buf;
+	struct xfs_attr_leaf_name_local	*lentry;
+	struct xfs_attr_leaf_name_remote *rentry;
+	struct xfs_attr_leaf_entry	*ent;
+	struct xfs_attr_leaf_entry	*entries;
+	char				*buf_end;
+	char				*name;
+	char				*name_end;
+	char				*value;
+	size_t				off;
+	unsigned int			nameidx;
+	unsigned int			namesize;
+	unsigned int			hdrsize;
+	unsigned int			namelen;
+	unsigned int			valuelen;
+	int				i;
+	int				error;
+
+	bitmap_zero(usedmap, mp->m_attr_geo->blksize);
+
+	/* Check the leaf header */
+	leaf = bp->b_addr;
+	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
+	hdrsize = xfs_attr3_leaf_hdr_size(leaf);
+	xfs_scrub_xattr_set_map(sc, usedmap, 0, hdrsize);
+	entries = xfs_attr3_leaf_entryp(leaf);
+
+	buf_end = (char *)bp->b_addr + mp->m_attr_geo->blksize;
+	for (i = 0, ent = entries; i < leafhdr.count; ent++, i++) {
+		/* Skip key if it conflicts with something else? */
+		off = (char *)ent - (char *)leaf;
+		if (!xfs_scrub_xattr_set_map(sc, usedmap, off,
+				sizeof(xfs_attr_leaf_entry_t)))
+			continue;
+
+		/* Check the name information. */
+		nameidx = be16_to_cpu(ent->nameidx);
+		if (nameidx < leafhdr.firstused ||
+		    nameidx >= mp->m_attr_geo->blksize)
+			continue;
+
+		if (ent->flags & XFS_ATTR_LOCAL) {
+			lentry = xfs_attr3_leaf_name_local(leaf, i);
+			namesize = xfs_attr_leaf_entsize_local(lentry->namelen,
+					be16_to_cpu(lentry->valuelen));
+			name_end = (char *)lentry + namesize;
+			if (lentry->namelen == 0)
+				continue;
+			name = lentry->nameval;
+			namelen = lentry->namelen;
+			valuelen = be16_to_cpu(lentry->valuelen);
+			value = &name[namelen];
+		} else {
+			rentry = xfs_attr3_leaf_name_remote(leaf, i);
+			namesize = xfs_attr_leaf_entsize_remote(rentry->namelen);
+			name_end = (char *)rentry + namesize;
+			if (rentry->namelen == 0 || rentry->valueblk == 0)
+				continue;
+			name = rentry->name;
+			namelen = rentry->namelen;
+			valuelen = be32_to_cpu(rentry->valuelen);
+			value = NULL;
+		}
+		if (name_end > buf_end)
+			continue;
+		if (!xfs_scrub_xattr_set_map(sc, usedmap, nameidx, namesize))
+			continue;
+
+		/* Ok, let's save this key/value. */
+		if (ent->flags & XFS_ATTR_LOCAL)
+			error = xfs_repair_xattr_salvage_local_attr(rx,
+				ent->flags, name, namelen, value, valuelen);
+		else
+			error = xfs_repair_xattr_salvage_remote_attr(rx,
+				ent->flags, name, namelen, bp, i, valuelen);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
+/* Try to recover shortform attrs. */
+STATIC int
+xfs_repair_xattr_recover_sf(
+	struct xfs_repair_xattr		*rx)
+{
+	struct xfs_attr_shortform	*sf;
+	struct xfs_attr_sf_entry	*sfe;
+	struct xfs_attr_sf_entry	*next;
+	struct xfs_ifork		*ifp;
+	unsigned char			*end;
+	int				i;
+	int				error;
+
+	ifp = XFS_IFORK_PTR(rx->sc->ip, XFS_ATTR_FORK);
+	sf = (struct xfs_attr_shortform *)rx->sc->ip->i_afp->if_u1.if_data;
+	end = (unsigned char *)ifp->if_u1.if_data + ifp->if_bytes;
+
+	for (i = 0, sfe = &sf->list[0]; i < sf->hdr.count; i++) {
+		next = XFS_ATTR_SF_NEXTENTRY(sfe);
+		if ((unsigned char *)next > end)
+			break;
+
+		/* Ok, let's save this key/value. */
+		error = xfs_repair_xattr_salvage_local_attr(rx, sfe->flags,
+				sfe->nameval, sfe->namelen,
+				&sfe->nameval[sfe->namelen], sfe->valuelen);
+		if (error)
+			return error;
+
+		sfe = next;
+	}
+
+	return 0;
+}
+
+/* Extract as many attribute keys and values as we can. */
+STATIC int
+xfs_repair_xattr_recover(
+	struct xfs_repair_xattr		*rx)
+{
+	struct xfs_iext_cursor		icur;
+	struct xfs_bmbt_irec		got;
+	struct xfs_scrub_context	*sc = rx->sc;
+	struct xfs_ifork		*ifp;
+	struct xfs_da_blkinfo		*info;
+	struct xfs_buf			*bp;
+	xfs_dablk_t			dabno;
+	int				error = 0;
+
+	if (sc->ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
+		return xfs_repair_xattr_recover_sf(rx);
+
+	/* Iterate each attr block in the attr fork. */
+	ifp = XFS_IFORK_PTR(sc->ip, XFS_ATTR_FORK);
+	for_each_xfs_iext(ifp, &icur, &got) {
+		for_each_xfs_attr_block(sc->mp, &got, dabno) {
+			/*
+			 * Try to read buffer.  We invalidate them in the next
+			 * step so we don't bother to set a buffer type or
+			 * ops.
+			 */
+			error = xfs_da_read_buf(sc->tp, sc->ip, dabno, -1, &bp,
+					XFS_ATTR_FORK, NULL);
+			if (error || !bp)
+				continue;
+
+			/* Screen out non-leaves & other garbage. */
+			info = bp->b_addr;
+			if (info->magic != cpu_to_be16(XFS_ATTR3_LEAF_MAGIC) ||
+			    xfs_attr3_leaf_buf_ops.verify_struct(bp) != NULL)
+				continue;
+
+			error = xfs_repair_xattr_recover_leaf(rx, bp);
+			if (error)
+				return error;
+		}
+	}
+
+	return 0;
+}
+
+/* Free all the attribute fork blocks and delete the fork. */
+STATIC int
+xfs_repair_xattr_reset_btree(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_iext_cursor		icur;
+	struct xfs_bmbt_irec		got;
+	struct xfs_ifork		*ifp;
+	struct xfs_buf			*bp;
+	xfs_fileoff_t			lblk;
+	int				error;
+
+	xfs_trans_ijoin(sc->tp, sc->ip, 0);
+
+	if (sc->ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
+		goto out_fork_remove;
+
+	/* Invalidate each attr block in the attr fork. */
+	ifp = XFS_IFORK_PTR(sc->ip, XFS_ATTR_FORK);
+	for_each_xfs_iext(ifp, &icur, &got) {
+		for_each_xfs_attr_block(sc->mp, &got, lblk) {
+			error = xfs_da_get_buf(sc->tp, sc->ip, lblk, -1, &bp,
+					XFS_ATTR_FORK);
+			if (error || !bp)
+				continue;
+			xfs_trans_binval(sc->tp, bp);
+			error = xfs_trans_roll_inode(&sc->tp, sc->ip);
+			if (error)
+				return error;
+		}
+	}
+
+	error = xfs_itruncate_extents(&sc->tp, sc->ip, XFS_ATTR_FORK, 0);
+	if (error)
+		return error;
+
+out_fork_remove:
+	/* Reset the attribute fork - this also destroys the in-core fork */
+	xfs_attr_fork_remove(sc->ip, sc->tp);
+	return 0;
+}
+
+/*
+ * Compare two xattr keys.  ATTR_SECURE keys come before ATTR_ROOT and
+ * ATTR_ROOT keys come before user attrs.  Otherwise sort in hash order.
+ */
+static int
+xfs_repair_xattr_key_cmp(
+	void			*priv,
+	struct list_head	*a,
+	struct list_head	*b)
+{
+	struct xfs_attr_key	*ap;
+	struct xfs_attr_key	*bp;
+	uint			ahash, bhash;
+
+	ap = container_of(a, struct xfs_attr_key, list);
+	bp = container_of(b, struct xfs_attr_key, list);
+
+	if (ap->flags > bp->flags)
+		return 1;
+	else if (ap->flags < bp->flags)
+		return -1;
+
+	ahash = xfs_da_hashname(ap->name, ap->namelen);
+	bhash = xfs_da_hashname(bp->name, bp->namelen);
+	if (ahash > bhash)
+		return 1;
+	else if (ahash < bhash)
+		return -1;
+	return 0;
+}
+
+/*
+ * Find all the extended attributes for this inode by scraping them out of the
+ * attribute key blocks by hand.  The caller must clean up the lists if
+ * anything goes wrong.
+ */
+STATIC int
+xfs_repair_xattr_find_attributes(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*attributes)
+{
+	struct xfs_repair_xattr		rx;
+	struct xfs_ifork		*ifp;
+	int				error;
+
+	error = xfs_repair_ino_dqattach(sc);
+	if (error)
+		return error;
+
+	/* Extent map should be loaded. */
+	ifp = XFS_IFORK_PTR(sc->ip, XFS_ATTR_FORK);
+	if (XFS_IFORK_FORMAT(sc->ip, XFS_ATTR_FORK) != XFS_DINODE_FMT_LOCAL &&
+	    !(ifp->if_flags & XFS_IFEXTENTS)) {
+		error = xfs_iread_extents(sc->tp, sc->ip, XFS_ATTR_FORK);
+		if (error)
+			return error;
+	}
+
+	rx.attrlist = attributes;
+	rx.sc = sc;
+
+	/* Read every attr key and value and record them in memory. */
+	return xfs_repair_xattr_recover(&rx);
+}
+
+/* Free all the attributes. */
+STATIC void
+xfs_repair_xattr_cancel_attrs(
+	struct list_head	*attributes)
+{
+	struct xfs_attr_key	*key;
+	struct xfs_attr_key	*n;
+
+	list_for_each_entry_safe(key, n, attributes, list) {
+		list_del(&key->list);
+		kmem_free(key->value);
+		kmem_free(key);
+	}
+}
+
+/*
+ * Insert all the attributes that we collected.
+ *
+ * Commit the repair transaction and drop the ilock because the attribute
+ * setting code needs to be able to allocate special transactions and take the
+ * ilock on its own.  Some day we'll have deferred attribute setting, at which
+ * point we'll be able to use that to replace the attributes atomically and
+ * safely.
+ */
+STATIC int
+xfs_repair_xattr_rebuild_tree(
+	struct xfs_scrub_context	*sc,
+	struct list_head		*attributes)
+{
+	struct xfs_attr_key		*key;
+	struct xfs_attr_key		*n;
+	int				error;
+
+	error = xfs_trans_commit(sc->tp);
+	sc->tp = NULL;
+	if (error)
+		return error;
+
+	xfs_iunlock(sc->ip, XFS_ILOCK_EXCL);
+	sc->ilock_flags &= ~XFS_ILOCK_EXCL;
+
+	/* Re-add every attr to the file. */
+	list_sort(NULL, attributes, xfs_repair_xattr_key_cmp);
+	list_for_each_entry_safe(key, n, attributes, list) {
+		error = xfs_attr_set(sc->ip, key->name, key->value,
+				key->valuelen, key->flags);
+		if (error)
+			return error;
+
+		/*
+		 * If the attr value is larger than a single page, free the
+		 * key now so that we aren't hogging memory while doing a lot
+		 * of metadata updates.  Otherwise, we want to spend as little
+		 * time reconstructing the attrs as we possibly can.
+		 */
+		if (key->valuelen <= PAGE_SIZE)
+			continue;
+		list_del(&key->list);
+		kmem_free(key->value);
+		kmem_free(key);
+	}
+
+	xfs_repair_xattr_cancel_attrs(attributes);
+	return 0;
+}
+
+/*
+ * Repair the extended attribute metadata.
+ *
+ * XXX: Remote attribute value buffers encompass the entire (up to 64k) buffer
+ * and we can't handle those 100% until the buffer cache learns how to deal
+ * with that.
+ */
+int
+xfs_repair_xattr(
+	struct xfs_scrub_context	*sc)
+{
+	struct list_head		attributes;
+	int				error;
+
+	if (!xfs_inode_hasattr(sc->ip))
+		return -ENOENT;
+
+	/* Collect extended attributes by parsing raw blocks. */
+	INIT_LIST_HEAD(&attributes);
+	error = xfs_repair_xattr_find_attributes(sc, &attributes);
+	if (error)
+		goto out;
+
+	/*
+	 * Invalidate and truncate all attribute fork extents.  This is the
+	 * point at which we are no longer able to bail out gracefully.
+	 * We commit the transaction here because xfs_attr_set allocates its
+	 * own transactions.
+	 */
+	error = xfs_repair_xattr_reset_btree(sc);
+	if (error)
+		goto out;
+
+	/* Now rebuild the attribute information. */
+	error = xfs_repair_xattr_rebuild_tree(sc, &attributes);
+out:
+	xfs_repair_xattr_cancel_attrs(&attributes);
+	return error;
+}
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index 14fa8cf89799..05a63fcc2364 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -113,6 +113,7 @@ int xfs_repair_inode(struct xfs_scrub_context *sc);
 int xfs_repair_bmap_data(struct xfs_scrub_context *sc);
 int xfs_repair_bmap_attr(struct xfs_scrub_context *sc);
 int xfs_repair_symlink(struct xfs_scrub_context *sc);
+int xfs_repair_xattr(struct xfs_scrub_context *sc);
 
 #else
 
@@ -155,6 +156,7 @@ static inline int xfs_repair_rmapbt_setup(
 #define xfs_repair_bmap_data		xfs_repair_notsupported
 #define xfs_repair_bmap_attr		xfs_repair_notsupported
 #define xfs_repair_symlink		xfs_repair_notsupported
+#define xfs_repair_xattr		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 6d7ae6e0e165..857197c89729 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -323,7 +323,7 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.type	= ST_INODE,
 		.setup	= xfs_scrub_setup_xattr,
 		.scrub	= xfs_scrub_xattr,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_xattr,
 	},
 	[XFS_SCRUB_TYPE_SYMLINK] = {	/* symbolic link */
 		.type	= ST_INODE,
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index 93a4a0b22273..270098bb3225 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -155,4 +155,7 @@ void xfs_scrub_xref_is_used_rt_space(struct xfs_scrub_context *sc,
 # define xfs_scrub_xref_is_used_rt_space(sc, rtbno, len) do { } while (0)
 #endif
 
+bool xfs_scrub_xattr_set_map(struct xfs_scrub_context *sc, unsigned long *map,
+		unsigned int start, unsigned int len);
+
 #endif	/* __XFS_SCRUB_SCRUB_H__ */



* [PATCH 18/21] xfs: scrub should set preen if attr leaf has holes
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (16 preceding siblings ...)
  2018-06-24 19:25 ` [PATCH 17/21] xfs: repair extended attributes Darrick J. Wong
@ 2018-06-24 19:25 ` Darrick J. Wong
  2018-06-29  2:52   ` Dave Chinner
  2018-06-24 19:25 ` [PATCH 19/21] xfs: repair quotas Darrick J. Wong
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

If an attr block indicates that it could use compaction, set the preen
flag to have the attr fork rebuilt, since the attr fork rebuilder can
take care of that for us.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/attr.c    |    2 ++
 fs/xfs/scrub/dabtree.c |   15 +++++++++++++++
 fs/xfs/scrub/dabtree.h |    1 +
 fs/xfs/scrub/trace.h   |    1 +
 4 files changed, 19 insertions(+)


diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
index 50715617eb1e..56894045f147 100644
--- a/fs/xfs/scrub/attr.c
+++ b/fs/xfs/scrub/attr.c
@@ -293,6 +293,8 @@ xfs_scrub_xattr_block(
 		xfs_scrub_da_set_corrupt(ds, level);
 	if (!xfs_scrub_xattr_set_map(ds->sc, usedmap, 0, hdrsize))
 		xfs_scrub_da_set_corrupt(ds, level);
+	if (leafhdr.holes)
+		xfs_scrub_da_set_preen(ds, level);
 
 	if (ds->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
 		goto out;
diff --git a/fs/xfs/scrub/dabtree.c b/fs/xfs/scrub/dabtree.c
index d700c4d4d4ef..ccf2d92b2756 100644
--- a/fs/xfs/scrub/dabtree.c
+++ b/fs/xfs/scrub/dabtree.c
@@ -85,6 +85,21 @@ xfs_scrub_da_set_corrupt(
 			__return_address);
 }
 
+/* Flag a da btree node in need of optimization. */
+void
+xfs_scrub_da_set_preen(
+	struct xfs_scrub_da_btree	*ds,
+	int				level)
+{
+	struct xfs_scrub_context	*sc = ds->sc;
+
+	sc->sm->sm_flags |= XFS_SCRUB_OFLAG_PREEN;
+	trace_xfs_scrub_fblock_preen(sc, ds->dargs.whichfork,
+			xfs_dir2_da_to_db(ds->dargs.geo,
+				ds->state->path.blk[level].blkno),
+			__return_address);
+}
+
 /* Find an entry at a certain level in a da btree. */
 STATIC void *
 xfs_scrub_da_btree_entry(
diff --git a/fs/xfs/scrub/dabtree.h b/fs/xfs/scrub/dabtree.h
index 365f9f0019e6..92916e543549 100644
--- a/fs/xfs/scrub/dabtree.h
+++ b/fs/xfs/scrub/dabtree.h
@@ -36,6 +36,7 @@ bool xfs_scrub_da_process_error(struct xfs_scrub_da_btree *ds, int level, int *e
 
 /* Check for da btree corruption. */
 void xfs_scrub_da_set_corrupt(struct xfs_scrub_da_btree *ds, int level);
+void xfs_scrub_da_set_preen(struct xfs_scrub_da_btree *ds, int level);
 
 int xfs_scrub_da_btree_hash(struct xfs_scrub_da_btree *ds, int level,
 			    __be32 *hashp);
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 2a561689cecb..0212d273ca8b 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -230,6 +230,7 @@ DEFINE_EVENT(xfs_scrub_fblock_error_class, name, \
 
 DEFINE_SCRUB_FBLOCK_ERROR_EVENT(xfs_scrub_fblock_error);
 DEFINE_SCRUB_FBLOCK_ERROR_EVENT(xfs_scrub_fblock_warning);
+DEFINE_SCRUB_FBLOCK_ERROR_EVENT(xfs_scrub_fblock_preen);
 
 TRACE_EVENT(xfs_scrub_incomplete,
 	TP_PROTO(struct xfs_scrub_context *sc, void *ret_ip),


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH 19/21] xfs: repair quotas
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (17 preceding siblings ...)
  2018-06-24 19:25 ` [PATCH 18/21] xfs: scrub should set preen if attr leaf has holes Darrick J. Wong
@ 2018-06-24 19:25 ` Darrick J. Wong
  2018-07-06  1:50   ` Dave Chinner
  2018-06-24 19:25 ` [PATCH 20/21] xfs: implement live quotacheck as part of quota repair Darrick J. Wong
  2018-06-24 19:25 ` [PATCH 21/21] xfs: add online scrub/repair for superblock counters Darrick J. Wong
  20 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Fix anything that causes the quota verifiers to fail.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile             |    1 
 fs/xfs/scrub/attr_repair.c  |    2 
 fs/xfs/scrub/common.h       |    8 +
 fs/xfs/scrub/quota.c        |    2 
 fs/xfs/scrub/quota_repair.c |  365 +++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/repair.c       |   58 +++++++
 fs/xfs/scrub/repair.h       |    8 +
 fs/xfs/scrub/scrub.c        |   11 +
 fs/xfs/scrub/scrub.h        |    1 
 9 files changed, 448 insertions(+), 8 deletions(-)
 create mode 100644 fs/xfs/scrub/quota_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 24d8b19a837b..0392bca6f5fe 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -174,5 +174,6 @@ xfs-y				+= $(addprefix scrub/, \
 				   rmap_repair.o \
 				   symlink_repair.o \
 				   )
+xfs-$(CONFIG_XFS_QUOTA)		+= scrub/quota_repair.o
 endif
 endif
diff --git a/fs/xfs/scrub/attr_repair.c b/fs/xfs/scrub/attr_repair.c
index 5f2e4dad92b7..aba5ac671a28 100644
--- a/fs/xfs/scrub/attr_repair.c
+++ b/fs/xfs/scrub/attr_repair.c
@@ -355,7 +355,7 @@ xfs_repair_xattr_recover(
 }
 
 /* Free all the attribute fork blocks and delete the fork. */
-STATIC int
+int
 xfs_repair_xattr_reset_btree(
 	struct xfs_scrub_context	*sc)
 {
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index e8c4e41139ca..b0cca36de2de 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -138,6 +138,14 @@ static inline bool xfs_scrub_skip_xref(struct xfs_scrub_metadata *sm)
 			       XFS_SCRUB_OFLAG_XCORRUPT);
 }
 
+/* Do we need to invoke the repair tool? */
+static inline bool xfs_scrub_needs_repair(struct xfs_scrub_metadata *sm)
+{
+	return sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
+			       XFS_SCRUB_OFLAG_XCORRUPT |
+			       XFS_SCRUB_OFLAG_PREEN);
+}
+
 int xfs_scrub_metadata_inode_forks(struct xfs_scrub_context *sc);
 int xfs_scrub_ilock_inverted(struct xfs_inode *ip, uint lock_mode);
 void xfs_scrub_iput(struct xfs_scrub_context *sc, struct xfs_inode *ip);
diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c
index 6ff906aa0a3b..ab0f0f7fde2d 100644
--- a/fs/xfs/scrub/quota.c
+++ b/fs/xfs/scrub/quota.c
@@ -29,7 +29,7 @@
 #include "scrub/trace.h"
 
 /* Convert a scrub type code to a DQ flag, or return 0 if error. */
-static inline uint
+uint
 xfs_scrub_quota_to_dqtype(
 	struct xfs_scrub_context	*sc)
 {
diff --git a/fs/xfs/scrub/quota_repair.c b/fs/xfs/scrub/quota_repair.c
new file mode 100644
index 000000000000..4e7af44a2ba1
--- /dev/null
+++ b/fs/xfs/scrub/quota_repair.c
@@ -0,0 +1,365 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_inode.h"
+#include "xfs_inode_fork.h"
+#include "xfs_alloc.h"
+#include "xfs_bmap.h"
+#include "xfs_quota.h"
+#include "xfs_qm.h"
+#include "xfs_dquot.h"
+#include "xfs_dquot_item.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/*
+ * Quota Repair
+ * ============
+ *
+ * Quota repairs are fairly simplistic; we fix everything that the dquot
+ * verifiers complain about, cap any counters or limits that make no sense,
+ * and schedule a quotacheck if we had to fix anything.  We also repair any
+ * data fork extent records that don't apply to metadata files.
+ */
+
+struct xfs_repair_quota_info {
+	struct xfs_scrub_context	*sc;
+	bool				need_quotacheck;
+};
+
+/* Scrub the fields in an individual quota item. */
+STATIC int
+xfs_repair_quota_item(
+	struct xfs_dquot		*dq,
+	uint				dqtype,
+	void				*priv)
+{
+	struct xfs_repair_quota_info	*rqi = priv;
+	struct xfs_scrub_context	*sc = rqi->sc;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_disk_dquot		*d = &dq->q_core;
+	unsigned long long		bsoft;
+	unsigned long long		isoft;
+	unsigned long long		rsoft;
+	unsigned long long		bhard;
+	unsigned long long		ihard;
+	unsigned long long		rhard;
+	unsigned long long		bcount;
+	unsigned long long		icount;
+	unsigned long long		rcount;
+	xfs_ino_t			fs_icount;
+	bool				dirty = false;
+	int				error;
+
+	/* Did we get the dquot type we wanted? */
+	if (dqtype != (d->d_flags & XFS_DQ_ALLTYPES)) {
+		d->d_flags = dqtype;
+		dirty = true;
+	}
+
+	if (d->d_pad0 || d->d_pad) {
+		d->d_pad0 = 0;
+		d->d_pad = 0;
+		dirty = true;
+	}
+
+	/* Check the limits. */
+	bhard = be64_to_cpu(d->d_blk_hardlimit);
+	ihard = be64_to_cpu(d->d_ino_hardlimit);
+	rhard = be64_to_cpu(d->d_rtb_hardlimit);
+
+	bsoft = be64_to_cpu(d->d_blk_softlimit);
+	isoft = be64_to_cpu(d->d_ino_softlimit);
+	rsoft = be64_to_cpu(d->d_rtb_softlimit);
+
+	if (bsoft > bhard) {
+		d->d_blk_softlimit = d->d_blk_hardlimit;
+		dirty = true;
+	}
+
+	if (isoft > ihard) {
+		d->d_ino_softlimit = d->d_ino_hardlimit;
+		dirty = true;
+	}
+
+	if (rsoft > rhard) {
+		d->d_rtb_softlimit = d->d_rtb_hardlimit;
+		dirty = true;
+	}
+
+	/* Check the resource counts. */
+	bcount = be64_to_cpu(d->d_bcount);
+	icount = be64_to_cpu(d->d_icount);
+	rcount = be64_to_cpu(d->d_rtbcount);
+	fs_icount = percpu_counter_sum(&mp->m_icount);
+
+	/*
+	 * Check that usage doesn't exceed physical limits.  However, on
+	 * a reflink filesystem we're allowed to exceed physical space
+	 * if there are no quota limits.  We don't know what the real number
+	 * is, but we can make quotacheck find out for us.
+	 */
+	if (!xfs_sb_version_hasreflink(&mp->m_sb) &&
+	    mp->m_sb.sb_dblocks < bcount) {
+		dq->q_res_bcount -= be64_to_cpu(dq->q_core.d_bcount);
+		dq->q_res_bcount += mp->m_sb.sb_dblocks;
+		d->d_bcount = cpu_to_be64(mp->m_sb.sb_dblocks);
+		rqi->need_quotacheck = true;
+		dirty = true;
+	}
+	if (icount > fs_icount) {
+		dq->q_res_icount -= be64_to_cpu(dq->q_core.d_icount);
+		dq->q_res_icount += fs_icount;
+		d->d_icount = cpu_to_be64(fs_icount);
+		rqi->need_quotacheck = true;
+		dirty = true;
+	}
+	if (rcount > mp->m_sb.sb_rblocks) {
+		dq->q_res_rtbcount -= be64_to_cpu(dq->q_core.d_rtbcount);
+		dq->q_res_rtbcount += mp->m_sb.sb_rblocks;
+		d->d_rtbcount = cpu_to_be64(mp->m_sb.sb_rblocks);
+		rqi->need_quotacheck = true;
+		dirty = true;
+	}
+
+	if (!dirty)
+		return 0;
+
+	dq->dq_flags |= XFS_DQ_DIRTY;
+	xfs_trans_dqjoin(sc->tp, dq);
+	xfs_trans_log_dquot(sc->tp, dq);
+	error = xfs_trans_roll(&sc->tp);
+	xfs_dqlock(dq);
+	return error;
+}
+
+/* Fix a quota timer so that we can pass the verifier. */
+STATIC void
+xfs_repair_quota_fix_timer(
+	__be64			softlimit,
+	__be64			countnow,
+	__be32			*timer,
+	time_t			timelimit)
+{
+	uint64_t		soft = be64_to_cpu(softlimit);
+	uint64_t		count = be64_to_cpu(countnow);
+
+	if (soft && count > soft && *timer == 0)
+		*timer = cpu_to_be32(get_seconds() + timelimit);
+}
+
+/* Fix anything the verifiers complain about. */
+STATIC int
+xfs_repair_quota_block(
+	struct xfs_scrub_context	*sc,
+	struct xfs_buf			*bp,
+	uint				dqtype,
+	xfs_dqid_t			id)
+{
+	struct xfs_dqblk		*d = (struct xfs_dqblk *)bp->b_addr;
+	struct xfs_disk_dquot		*ddq;
+	struct xfs_quotainfo		*qi = sc->mp->m_quotainfo;
+	enum xfs_blft			buftype = 0;
+	int				i;
+
+	bp->b_ops = &xfs_dquot_buf_ops;
+	for (i = 0; i < qi->qi_dqperchunk; i++) {
+		ddq = &d[i].dd_diskdq;
+
+		ddq->d_magic = cpu_to_be16(XFS_DQUOT_MAGIC);
+		ddq->d_version = XFS_DQUOT_VERSION;
+		ddq->d_flags = dqtype;
+		ddq->d_id = cpu_to_be32(id + i);
+
+		xfs_repair_quota_fix_timer(ddq->d_blk_softlimit,
+				ddq->d_bcount, &ddq->d_btimer,
+				qi->qi_btimelimit);
+		xfs_repair_quota_fix_timer(ddq->d_ino_softlimit,
+				ddq->d_icount, &ddq->d_itimer,
+				qi->qi_itimelimit);
+		xfs_repair_quota_fix_timer(ddq->d_rtb_softlimit,
+				ddq->d_rtbcount, &ddq->d_rtbtimer,
+				qi->qi_rtbtimelimit);
+
+		if (xfs_sb_version_hascrc(&sc->mp->m_sb)) {
+			uuid_copy(&d->dd_uuid, &sc->mp->m_sb.sb_meta_uuid);
+			xfs_update_cksum((char *)d, sizeof(struct xfs_dqblk),
+					 XFS_DQUOT_CRC_OFF);
+		} else {
+			memset(&d->dd_uuid, 0, sizeof(d->dd_uuid));
+			d->dd_lsn = 0;
+			d->dd_crc = 0;
+		}
+	}
+	switch (dqtype) {
+	case XFS_DQ_USER:
+		buftype = XFS_BLFT_UDQUOT_BUF;
+		break;
+	case XFS_DQ_GROUP:
+		buftype = XFS_BLFT_GDQUOT_BUF;
+		break;
+	case XFS_DQ_PROJ:
+		buftype = XFS_BLFT_PDQUOT_BUF;
+		break;
+	}
+	xfs_trans_buf_set_type(sc->tp, bp, buftype);
+	xfs_trans_log_buf(sc->tp, bp, 0, BBTOB(bp->b_length) - 1);
+	return xfs_trans_roll(&sc->tp);
+}
+
+/* Repair quota's data fork. */
+STATIC int
+xfs_repair_quota_data_fork(
+	struct xfs_scrub_context	*sc,
+	uint				dqtype)
+{
+	struct xfs_bmbt_irec		irec = { 0 };
+	struct xfs_iext_cursor		icur;
+	struct xfs_quotainfo		*qi = sc->mp->m_quotainfo;
+	struct xfs_ifork		*ifp;
+	struct xfs_buf			*bp;
+	struct xfs_dqblk		*d;
+	xfs_dqid_t			id;
+	xfs_fileoff_t			max_dqid_off;
+	xfs_fileoff_t			off;
+	xfs_fsblock_t			fsbno;
+	bool				truncate = false;
+	int				error = 0;
+
+	error = xfs_repair_metadata_inode_forks(sc);
+	if (error)
+		goto out;
+
+	/* Check for data fork problems that apply only to quota files. */
+	max_dqid_off = ((xfs_dqid_t)-1) / qi->qi_dqperchunk;
+	ifp = XFS_IFORK_PTR(sc->ip, XFS_DATA_FORK);
+	for_each_xfs_iext(ifp, &icur, &irec) {
+		if (isnullstartblock(irec.br_startblock)) {
+			error = -EFSCORRUPTED;
+			goto out;
+		}
+
+		if (irec.br_startoff > max_dqid_off ||
+		    irec.br_startoff + irec.br_blockcount - 1 > max_dqid_off) {
+			truncate = true;
+			break;
+		}
+	}
+	if (truncate) {
+		error = xfs_itruncate_extents(&sc->tp, sc->ip, XFS_DATA_FORK,
+				max_dqid_off * sc->mp->m_sb.sb_blocksize);
+		if (error)
+			goto out;
+	}
+
+	/* Now go fix anything that fails the verifiers. */
+	for_each_xfs_iext(ifp, &icur, &irec) {
+		for (fsbno = irec.br_startblock, off = irec.br_startoff;
+		     fsbno < irec.br_startblock + irec.br_blockcount;
+		     fsbno += XFS_DQUOT_CLUSTER_SIZE_FSB,
+				off += XFS_DQUOT_CLUSTER_SIZE_FSB) {
+			id = off * qi->qi_dqperchunk;
+			error = xfs_trans_read_buf(sc->mp, sc->tp,
+					sc->mp->m_ddev_targp,
+					XFS_FSB_TO_DADDR(sc->mp, fsbno),
+					qi->qi_dqchunklen,
+					0, &bp, &xfs_dquot_buf_ops);
+			if (error == 0) {
+				d = (struct xfs_dqblk *)bp->b_addr;
+				if (id == be32_to_cpu(d->dd_diskdq.d_id))
+					continue;
+				error = -EFSCORRUPTED;
+			}
+			if (error != -EFSBADCRC && error != -EFSCORRUPTED)
+				goto out;
+
+			/* Failed verifier, try again. */
+			error = xfs_trans_read_buf(sc->mp, sc->tp,
+					sc->mp->m_ddev_targp,
+					XFS_FSB_TO_DADDR(sc->mp, fsbno),
+					qi->qi_dqchunklen,
+					0, &bp, NULL);
+			if (error)
+				goto out;
+			error = xfs_repair_quota_block(sc, bp, dqtype, id);
+		}
+	}
+
+	/*
+	 * Roll the transaction so that we unlock all of the buffers we
+	 * touched while doing raw dquot buffer checks.  Subsequent parts of
+	 * quota repair will use the regular dquot APIs, which re-read the
+	 * dquot buffers to construct in-core dquots without a transaction.
+	 * This causes deadlocks if we haven't released the dquot buffers.
+	 */
+	error = xfs_trans_roll(&sc->tp);
+out:
+	return error;
+}
+
+/*
+ * Go fix anything in the quota items that scrub might have complained about.  Now
+ * that we've checked the quota inode data fork we have to drop ILOCK_EXCL to
+ * use the regular dquot functions.
+ */
+STATIC int
+xfs_repair_quota_problems(
+	struct xfs_scrub_context	*sc,
+	uint				dqtype)
+{
+	struct xfs_repair_quota_info	rqi;
+	int				error;
+
+	rqi.sc = sc;
+	rqi.need_quotacheck = false;
+	error = xfs_qm_dqiterate(sc->mp, dqtype, xfs_repair_quota_item, &rqi);
+	if (error)
+		return error;
+
+	/* Make a quotacheck happen. */
+	if (rqi.need_quotacheck)
+		xfs_repair_force_quotacheck(sc, dqtype);
+	return 0;
+}
+
+/* Repair all of a quota type's items. */
+int
+xfs_repair_quota(
+	struct xfs_scrub_context	*sc)
+{
+	uint				dqtype;
+	int				error;
+
+	dqtype = xfs_scrub_quota_to_dqtype(sc);
+
+	/* Fix problematic data fork mappings. */
+	error = xfs_repair_quota_data_fork(sc, dqtype);
+	if (error)
+		goto out;
+
+	/* Unlock quota inode; we play only with dquots from now on. */
+	xfs_iunlock(sc->ip, sc->ilock_flags);
+	sc->ilock_flags = 0;
+
+	/* Fix anything the dquot verifiers complain about. */
+	error = xfs_repair_quota_problems(sc, dqtype);
+out:
+	return error;
+}
diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index 85ec872093e6..399a98e1d8f7 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -29,6 +29,8 @@
 #include "xfs_ag_resv.h"
 #include "xfs_trans_space.h"
 #include "xfs_quota.h"
+#include "xfs_attr.h"
+#include "xfs_reflink.h"
 #include "scrub/xfs_scrub.h"
 #include "scrub/scrub.h"
 #include "scrub/common.h"
@@ -1182,3 +1184,59 @@ xfs_repair_grab_all_ag_headers(
 
 	return error;
 }
+
+/*
+ * Repair the attr/data forks of a metadata inode.  The metadata inode must be
+ * pointed to by sc->ip and the ILOCK must be held.
+ */
+int
+xfs_repair_metadata_inode_forks(
+	struct xfs_scrub_context	*sc)
+{
+	__u32				smtype;
+	__u32				smflags;
+	int				error;
+
+	smtype = sc->sm->sm_type;
+	smflags = sc->sm->sm_flags;
+
+	/* Let's see if the forks need repair. */
+	sc->sm->sm_flags &= ~XFS_SCRUB_FLAGS_OUT;
+	error = xfs_scrub_metadata_inode_forks(sc);
+	if (error || !xfs_scrub_needs_repair(sc->sm))
+		goto out;
+
+	xfs_trans_ijoin(sc->tp, sc->ip, 0);
+
+	/* Clear the reflink flag & attr forks that we shouldn't have. */
+	if (xfs_is_reflink_inode(sc->ip)) {
+		error = xfs_reflink_clear_inode_flag(sc->ip, &sc->tp);
+		if (error)
+			goto out;
+	}
+
+	if (xfs_inode_hasattr(sc->ip)) {
+		error = xfs_repair_xattr_reset_btree(sc);
+		if (error)
+			goto out;
+	}
+
+	/* Repair the data fork. */
+	sc->sm->sm_type = XFS_SCRUB_TYPE_BMBTD;
+	error = xfs_repair_bmap_data(sc);
+	sc->sm->sm_type = smtype;
+	if (error)
+		goto out;
+
+	/* Bail out if we still need repairs. */
+	sc->sm->sm_flags &= ~XFS_SCRUB_FLAGS_OUT;
+	error = xfs_scrub_metadata_inode_forks(sc);
+	if (error)
+		goto out;
+	if (xfs_scrub_needs_repair(sc->sm))
+		error = -EFSCORRUPTED;
+out:
+	sc->sm->sm_type = smtype;
+	sc->sm->sm_flags = smflags;
+	return error;
+}
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index 05a63fcc2364..083ab63624eb 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -97,6 +97,8 @@ void xfs_repair_force_quotacheck(struct xfs_scrub_context *sc, uint dqtype);
 int xfs_repair_ino_dqattach(struct xfs_scrub_context *sc);
 int xfs_repair_grab_all_ag_headers(struct xfs_scrub_context *sc);
 int xfs_repair_rmapbt_setup(struct xfs_scrub_context *sc, struct xfs_inode *ip);
+int xfs_repair_xattr_reset_btree(struct xfs_scrub_context *sc);
+int xfs_repair_metadata_inode_forks(struct xfs_scrub_context *sc);
 
 /* Metadata repairers */
 
@@ -114,6 +116,11 @@ int xfs_repair_bmap_data(struct xfs_scrub_context *sc);
 int xfs_repair_bmap_attr(struct xfs_scrub_context *sc);
 int xfs_repair_symlink(struct xfs_scrub_context *sc);
 int xfs_repair_xattr(struct xfs_scrub_context *sc);
+#ifdef CONFIG_XFS_QUOTA
+int xfs_repair_quota(struct xfs_scrub_context *sc);
+#else
+# define xfs_repair_quota		xfs_repair_notsupported
+#endif /* CONFIG_XFS_QUOTA */
 
 #else
 
@@ -157,6 +164,7 @@ static inline int xfs_repair_rmapbt_setup(
 #define xfs_repair_bmap_attr		xfs_repair_notsupported
 #define xfs_repair_symlink		xfs_repair_notsupported
 #define xfs_repair_xattr		xfs_repair_notsupported
+#define xfs_repair_quota		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 857197c89729..f57ec412a617 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -355,19 +355,19 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.type	= ST_FS,
 		.setup	= xfs_scrub_setup_quota,
 		.scrub	= xfs_scrub_quota,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_quota,
 	},
 	[XFS_SCRUB_TYPE_GQUOTA] = {	/* group quota */
 		.type	= ST_FS,
 		.setup	= xfs_scrub_setup_quota,
 		.scrub	= xfs_scrub_quota,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_quota,
 	},
 	[XFS_SCRUB_TYPE_PQUOTA] = {	/* project quota */
 		.type	= ST_FS,
 		.setup	= xfs_scrub_setup_quota,
 		.scrub	= xfs_scrub_quota,
-		.repair	= xfs_repair_notsupported,
+		.repair	= xfs_repair_quota,
 	},
 };
 
@@ -562,9 +562,8 @@ xfs_scrub_metadata(
 		if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_FORCE_SCRUB_REPAIR))
 			sc.sm->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT;
 
-		needs_fix = (sc.sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
-						XFS_SCRUB_OFLAG_XCORRUPT |
-						XFS_SCRUB_OFLAG_PREEN));
+		needs_fix = xfs_scrub_needs_repair(sc.sm);
+
 		/*
 		 * If userspace asked for a repair but it wasn't necessary,
 		 * report that back to userspace.
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index 270098bb3225..43c4189ea549 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -157,5 +157,6 @@ void xfs_scrub_xref_is_used_rt_space(struct xfs_scrub_context *sc,
 
 bool xfs_scrub_xattr_set_map(struct xfs_scrub_context *sc, unsigned long *map,
 		unsigned int start, unsigned int len);
+uint xfs_scrub_quota_to_dqtype(struct xfs_scrub_context *sc);
 
 #endif	/* __XFS_SCRUB_SCRUB_H__ */



* [PATCH 20/21] xfs: implement live quotacheck as part of quota repair
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (18 preceding siblings ...)
  2018-06-24 19:25 ` [PATCH 19/21] xfs: repair quotas Darrick J. Wong
@ 2018-06-24 19:25 ` Darrick J. Wong
  2018-06-24 19:25 ` [PATCH 21/21] xfs: add online scrub/repair for superblock counters Darrick J. Wong
  20 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Reuse the fs freezing mechanism that we developed for the rmapbt repair
to freeze the fs, this time so that we can scan every inode for a live
quotacheck.  Add a new dqget variant that uses the existing scrub
transaction to allocate an on-disk dquot block if one is missing.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/quota.c        |   20 +++
 fs/xfs/scrub/quota_repair.c |  296 +++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_dquot.c          |   59 ++++++++-
 fs/xfs/xfs_dquot.h          |    3 
 4 files changed, 370 insertions(+), 8 deletions(-)


diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c
index ab0f0f7fde2d..34a6dfffbbc2 100644
--- a/fs/xfs/scrub/quota.c
+++ b/fs/xfs/scrub/quota.c
@@ -27,6 +27,7 @@
 #include "scrub/scrub.h"
 #include "scrub/common.h"
 #include "scrub/trace.h"
+#include "scrub/repair.h"
 
 /* Convert a scrub type code to a DQ flag, or return 0 if error. */
 uint
@@ -64,12 +65,29 @@ xfs_scrub_setup_quota(
 	mutex_lock(&sc->mp->m_quotainfo->qi_quotaofflock);
 	if (!xfs_this_quota_on(sc->mp, dqtype))
 		return -ENOENT;
+	/*
+	 * Freeze out anything that can alter an inode because we reconstruct
+	 * the quota counts by iterating all the inodes in the system.
+	 */
+	if ((sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) &&
+	    (sc->try_harder || XFS_QM_NEED_QUOTACHECK(sc->mp))) {
+		error = xfs_scrub_fs_freeze(sc);
+		if (error)
+			return error;
+	}
 	error = xfs_scrub_setup_fs(sc, ip);
 	if (error)
 		return error;
 	sc->ip = xfs_quota_inode(sc->mp, dqtype);
-	xfs_ilock(sc->ip, XFS_ILOCK_EXCL);
 	sc->ilock_flags = XFS_ILOCK_EXCL;
+	/*
+	 * Pretend to be an ILOCK parent to shut up lockdep if we're going to
+	 * do a full inode scan of the fs.  Quota inodes do not count towards
+	 * quota accounting, so we shouldn't deadlock on ourselves.
+	 */
+	if (sc->fs_frozen)
+		sc->ilock_flags |= XFS_ILOCK_PARENT;
+	xfs_ilock(sc->ip, sc->ilock_flags);
 	return 0;
 }
 
diff --git a/fs/xfs/scrub/quota_repair.c b/fs/xfs/scrub/quota_repair.c
index 4e7af44a2ba1..efdbe14a7ecb 100644
--- a/fs/xfs/scrub/quota_repair.c
+++ b/fs/xfs/scrub/quota_repair.c
@@ -16,13 +16,20 @@
 #include "xfs_trans.h"
 #include "xfs_sb.h"
 #include "xfs_inode.h"
+#include "xfs_icache.h"
 #include "xfs_inode_fork.h"
 #include "xfs_alloc.h"
 #include "xfs_bmap.h"
+#include "xfs_bmap_util.h"
+#include "xfs_ialloc.h"
+#include "xfs_ialloc_btree.h"
 #include "xfs_quota.h"
 #include "xfs_qm.h"
 #include "xfs_dquot.h"
 #include "xfs_dquot_item.h"
+#include "xfs_trans_space.h"
+#include "xfs_error.h"
+#include "xfs_errortag.h"
 #include "scrub/xfs_scrub.h"
 #include "scrub/scrub.h"
 #include "scrub/common.h"
@@ -37,6 +44,11 @@
  * verifiers complain about, cap any counters or limits that make no sense,
  * and schedule a quotacheck if we had to fix anything.  We also repair any
  * data fork extent records that don't apply to metadata files.
+ *
+ * Online quotacheck is fairly straightforward.  We engage a repair freeze,
+ * zero all the dquots, and scan every inode in the system to recalculate the
+ * appropriate quota charges.  Finally, we log all the dquots to disk and
+ * set the _CHKD flags.
  */
 
 struct xfs_repair_quota_info {
@@ -314,6 +326,272 @@ xfs_repair_quota_data_fork(
 	return error;
 }
 
+/* Online Quotacheck */
+
+/*
+ * Add this inode's resource usage to the dquot.  We adjust the in-core and
+ * the (cached) on-disk copies of the counters and leave the dquot dirty.  A
+ * subsequent pass through the dquots logs them all to disk.  Fortunately we
+ * froze the filesystem before starting so at least we don't have to deal
+ * with chown/chproj races.
+ */
+STATIC int
+xfs_repair_quotacheck_dqadjust(
+	struct xfs_scrub_context	*sc,
+	struct xfs_inode		*ip,
+	uint				type,
+	xfs_qcnt_t			nblks,
+	xfs_qcnt_t			rtblks)
+{
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_dquot		*dqp;
+	xfs_dqid_t			id;
+	int				error;
+
+	/* Try to read in the dquot. */
+	id = xfs_qm_id_for_quotatype(ip, type);
+	error = xfs_qm_dqget(mp, id, type, false, &dqp);
+	if (error == -ENOENT) {
+		/* Allocate a dquot using our special transaction. */
+		error = xfs_qm_dqget_alloc(&sc->tp, id, type, &dqp);
+		if (error)
+			return error;
+		error = xfs_trans_roll_inode(&sc->tp, sc->ip);
+	}
+	if (error) {
+		/*
+		 * Shouldn't be able to turn off quotas here.
+		 */
+		ASSERT(error != -ESRCH);
+		ASSERT(error != -ENOENT);
+		return error;
+	}
+
+	/*
+	 * Adjust the inode count and the block count to reflect this inode's
+	 * resource usage.
+	 */
+	be64_add_cpu(&dqp->q_core.d_icount, 1);
+	dqp->q_res_icount++;
+	if (nblks) {
+		be64_add_cpu(&dqp->q_core.d_bcount, nblks);
+		dqp->q_res_bcount += nblks;
+	}
+	if (rtblks) {
+		be64_add_cpu(&dqp->q_core.d_rtbcount, rtblks);
+		dqp->q_res_rtbcount += rtblks;
+	}
+
+	/*
+	 * Set default limits, adjust timers (since we changed usages)
+	 *
+	 * There are no timers for the default values set in the root dquot.
+	 */
+	if (dqp->q_core.d_id) {
+		xfs_qm_adjust_dqlimits(mp, dqp);
+		xfs_qm_adjust_dqtimers(mp, &dqp->q_core);
+	}
+
+	dqp->dq_flags |= XFS_DQ_DIRTY;
+	xfs_qm_dqput(dqp);
+	return 0;
+}
+
+/* Record this inode's quota use. */
+STATIC int
+xfs_repair_quotacheck_inode(
+	struct xfs_scrub_context	*sc,
+	uint				dqtype,
+	struct xfs_inode		*ip)
+{
+	struct xfs_ifork		*ifp;
+	xfs_filblks_t			rtblks = 0;	/* total rt blks */
+	xfs_qcnt_t			nblks;
+	int				error;
+
+	/* Count the realtime blocks. */
+	if (XFS_IS_REALTIME_INODE(ip)) {
+		ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+
+		if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+			error = xfs_iread_extents(sc->tp, ip, XFS_DATA_FORK);
+			if (error)
+				return error;
+		}
+
+		xfs_bmap_count_leaves(ifp, &rtblks);
+	}
+
+	nblks = (xfs_qcnt_t)ip->i_d.di_nblocks - rtblks;
+
+	/* Adjust the dquot. */
+	return xfs_repair_quotacheck_dqadjust(sc, ip, dqtype, nblks, rtblks);
+}
+
+struct xfs_repair_quotacheck {
+	struct xfs_scrub_context	*sc;
+	uint				dqtype;
+};
+
+/* Iterate all the inodes in an inode btree record. */
+STATIC int
+xfs_repair_quotacheck_inobt(
+	struct xfs_btree_cur		*cur,
+	union xfs_btree_rec		*rec,
+	void				*priv)
+{
+	struct xfs_inobt_rec_incore	irec;
+	struct xfs_mount		*mp = cur->bc_mp;
+	struct xfs_inode		*ip = NULL;
+	struct xfs_repair_quotacheck	*rq = priv;
+	xfs_ino_t			ino;
+	xfs_agino_t			agino;
+	int				chunkidx;
+	int				error = 0;
+
+	xfs_inobt_btrec_to_irec(mp, rec, &irec);
+
+	for (chunkidx = 0, agino = irec.ir_startino;
+	     chunkidx < XFS_INODES_PER_CHUNK;
+	     chunkidx++, agino++) {
+		bool	inuse;
+
+		/* Skip if this inode is free */
+		if (XFS_INOBT_MASK(chunkidx) & irec.ir_free)
+			continue;
+		ino = XFS_AGINO_TO_INO(mp, cur->bc_private.a.agno, agino);
+		if (xfs_is_quota_inode(&mp->m_sb, ino))
+			continue;
+
+		/* Back off and try again if an inode is being reclaimed */
+		error = xfs_icache_inode_is_allocated(mp, NULL, ino, &inuse);
+		if (error == -EAGAIN)
+			return -EDEADLOCK;
+
+		/*
+		 * Grab inode for scanning.  We cannot use DONTCACHE here
+		 * because we already have a transaction so the iput must not
+		 * trigger inode reclaim (which might allocate a transaction
+		 * to clean up posteof blocks).
+		 */
+		error = xfs_iget(mp, NULL, ino, 0, XFS_ILOCK_EXCL, &ip);
+		if (error)
+			return error;
+		trace_xfs_scrub_iget(ip, __this_address);
+
+		error = xfs_repair_quotacheck_inode(rq->sc, rq->dqtype, ip);
+		xfs_iunlock(ip, XFS_ILOCK_EXCL);
+		xfs_scrub_iput(rq->sc, ip);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
+/* Zero a dquot prior to regenerating the counts. */
+static int
+xfs_repair_quotacheck_zero_dquot(
+	struct xfs_dquot		*dq,
+	uint				dqtype,
+	void				*priv)
+{
+	dq->q_res_bcount -= be64_to_cpu(dq->q_core.d_bcount);
+	dq->q_core.d_bcount = 0;
+	dq->q_res_icount -= be64_to_cpu(dq->q_core.d_icount);
+	dq->q_core.d_icount = 0;
+	dq->q_res_rtbcount -= be64_to_cpu(dq->q_core.d_rtbcount);
+	dq->q_core.d_rtbcount = 0;
+	dq->dq_flags |= XFS_DQ_DIRTY;
+	return 0;
+}
+
+/* Log a dirty dquot after we regenerated the counters. */
+static int
+xfs_repair_quotacheck_log_dquot(
+	struct xfs_dquot		*dq,
+	uint				dqtype,
+	void				*priv)
+{
+	struct xfs_scrub_context	*sc = priv;
+	int				error;
+
+	xfs_trans_dqjoin(sc->tp, dq);
+	xfs_trans_log_dquot(sc->tp, dq);
+	error = xfs_trans_roll(&sc->tp);
+	xfs_dqlock(dq);
+	return error;
+}
+
+/* Execute an online quotacheck. */
+STATIC int
+xfs_repair_quotacheck(
+	struct xfs_scrub_context	*sc,
+	uint				dqtype)
+{
+	struct xfs_repair_quotacheck	rq;
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_buf			*bp;
+	struct xfs_btree_cur		*cur;
+	xfs_agnumber_t			ag;
+	uint				flag;
+	int				error;
+
+	/*
+	 * Commit the transaction so that we can allocate new quota ip
+	 * mappings if we have to.  If we crash after this point, the sb
+	 * still has the CHKD flags cleared, so mount quotacheck will fix
+	 * all of this up.
+	 */
+	error = xfs_trans_commit(sc->tp);
+	sc->tp = NULL;
+	if (error)
+		return error;
+
+	/* Zero all the quota items. */
+	error = xfs_qm_dqiterate(mp, dqtype, xfs_repair_quotacheck_zero_dquot,
+			sc);
+	if (error)
+		goto out;
+
+	rq.sc = sc;
+	rq.dqtype = dqtype;
+
+	/* Iterate all AGs for inodes. */
+	for (ag = 0; ag < mp->m_sb.sb_agcount; ag++) {
+		error = xfs_ialloc_read_agi(mp, NULL, ag, &bp);
+		if (error)
+			goto out;
+		cur = xfs_inobt_init_cursor(mp, NULL, bp, ag, XFS_BTNUM_INO);
+		error = xfs_btree_query_all(cur, xfs_repair_quotacheck_inobt,
+				&rq);
+		xfs_btree_del_cursor(cur, error ? XFS_BTREE_ERROR :
+						  XFS_BTREE_NOERROR);
+		xfs_buf_relse(bp);
+		if (error)
+			goto out;
+	}
+
+	/* Log dquots. */
+	error = xfs_scrub_trans_alloc(sc, 0);
+	if (error)
+		goto out;
+	error = xfs_qm_dqiterate(mp, dqtype, xfs_repair_quotacheck_log_dquot,
+			sc);
+	if (error)
+		goto out;
+
+	/* Set quotachecked flag. */
+	flag = xfs_quota_chkd_flag(dqtype);
+	sc->mp->m_qflags |= flag;
+	spin_lock(&sc->mp->m_sb_lock);
+	sc->mp->m_sb.sb_qflags |= flag;
+	spin_unlock(&sc->mp->m_sb_lock);
+	xfs_log_sb(sc->tp);
+out:
+	return error;
+}
+
 /*
  * Go fix anything in the quota items that scrub might have complained about.  Now
  * that we've checked the quota inode data fork we have to drop ILOCK_EXCL to
@@ -334,7 +612,8 @@ xfs_repair_quota_problems(
 		return error;
 
 	/* Make a quotacheck happen. */
-	if (rqi.need_quotacheck)
+	if (rqi.need_quotacheck ||
+	    XFS_TEST_ERROR(false, sc->mp, XFS_ERRTAG_FORCE_SCRUB_REPAIR))
 		xfs_repair_force_quotacheck(sc, dqtype);
 	return 0;
 }
@@ -345,6 +624,7 @@ xfs_repair_quota(
 	struct xfs_scrub_context	*sc)
 {
 	uint				dqtype;
+	uint				flag;
 	int				error;
 
 	dqtype = xfs_scrub_quota_to_dqtype(sc);
@@ -360,6 +640,20 @@ xfs_repair_quota(
 
 	/* Fix anything the dquot verifiers complain about. */
 	error = xfs_repair_quota_problems(sc, dqtype);
+	if (error)
+		goto out;
+
+	/* If the quotacheck flag is still clear, run a live quotacheck. */
+	flag = xfs_quota_chkd_flag(dqtype);
+	if (!(flag & sc->mp->m_qflags)) {
+		/* We need to freeze the fs before we can scan inodes. */
+		if (!sc->fs_frozen) {
+			error = -EDEADLOCK;
+			goto out;
+		}
+
+		error = xfs_repair_quotacheck(sc, dqtype);
+	}
 out:
 	return error;
 }
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 0973a0423bed..709f4c70916b 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -534,6 +534,7 @@ xfs_dquot_from_disk(
 static int
 xfs_qm_dqread_alloc(
 	struct xfs_mount	*mp,
+	struct xfs_trans	**tpp,
 	struct xfs_dquot	*dqp,
 	struct xfs_buf		**bpp)
 {
@@ -541,6 +542,18 @@ xfs_qm_dqread_alloc(
 	struct xfs_buf		*bp;
 	int			error;
 
+	/*
+	 * The caller passed in a transaction which we don't control, so
+	 * release the hold before passing back the buffer.
+	 */
+	if (tpp) {
+		error = xfs_dquot_disk_alloc(tpp, dqp, &bp);
+		if (error)
+			return error;
+		xfs_trans_bhold_release(*tpp, bp);
+		return 0;
+	}
+
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_qm_dqalloc,
 			XFS_QM_DQALLOC_SPACE_RES(mp), 0, 0, &tp);
 	if (error)
@@ -576,6 +589,7 @@ xfs_qm_dqread_alloc(
 static int
 xfs_qm_dqread(
 	struct xfs_mount	*mp,
+	struct xfs_trans	**tpp,
 	xfs_dqid_t		id,
 	uint			type,
 	bool			can_alloc,
@@ -591,7 +605,7 @@ xfs_qm_dqread(
 	/* Try to read the buffer, allocating if necessary. */
 	error = xfs_dquot_disk_read(mp, dqp, &bp);
 	if (error == -ENOENT && can_alloc)
-		error = xfs_qm_dqread_alloc(mp, dqp, &bp);
+		error = xfs_qm_dqread_alloc(mp, tpp, dqp, &bp);
 	if (error)
 		goto err;
 
@@ -775,9 +789,10 @@ xfs_qm_dqget_checks(
 * Given the file system, id, and type (UDQUOT/GDQUOT), return a locked
  * dquot, doing an allocation (if requested) as needed.
  */
-int
-xfs_qm_dqget(
+static int
+__xfs_qm_dqget(
 	struct xfs_mount	*mp,
+	struct xfs_trans	**tpp,
 	xfs_dqid_t		id,
 	uint			type,
 	bool			can_alloc,
@@ -799,7 +814,7 @@ xfs_qm_dqget(
 		return 0;
 	}
 
-	error = xfs_qm_dqread(mp, id, type, can_alloc, &dqp);
+	error = xfs_qm_dqread(mp, NULL, id, type, can_alloc, &dqp);
 	if (error)
 		return error;
 
@@ -838,7 +853,39 @@ xfs_qm_dqget_uncached(
 	if (error)
 		return error;
 
-	return xfs_qm_dqread(mp, id, type, 0, dqpp);
+	return xfs_qm_dqread(mp, NULL, id, type, 0, dqpp);
+}
+
+/*
+ * Given the file system, id, and type (UDQUOT/GDQUOT), return a locked
+ * dquot, doing an allocation (if requested) as needed.
+ */
+int
+xfs_qm_dqget(
+	struct xfs_mount	*mp,
+	xfs_dqid_t		id,
+	uint			type,
+	bool			can_alloc,
+	struct xfs_dquot	**O_dqpp)
+{
+	return __xfs_qm_dqget(mp, NULL, id, type, can_alloc, O_dqpp);
+}
+
+/*
+ * Given the file system, id, and type (UDQUOT/GDQUOT) and a hole in the quota
+ * data where the on-disk dquot is supposed to live, return a locked dquot,
+ * allocating any needed blocks with the caller's transaction.  This is a
+ * required by online repair, which already has a transaction and has to pass
+ * that into dquot_setup.
+ */
+int
+xfs_qm_dqget_alloc(
+	struct xfs_trans	**tpp,
+	xfs_dqid_t		id,
+	uint			type,
+	struct xfs_dquot	**dqpp)
+{
+	return __xfs_qm_dqget((*tpp)->t_mountp, tpp, id, type, true, dqpp);
 }
 
 /* Return the quota id for a given inode and type. */
@@ -902,7 +949,7 @@ xfs_qm_dqget_inode(
 	 * we re-acquire the lock.
 	 */
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
-	error = xfs_qm_dqread(mp, id, type, can_alloc, &dqp);
+	error = xfs_qm_dqread(mp, NULL, id, type, can_alloc, &dqp);
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	if (error)
 		return error;
diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h
index 64bd8640f6e8..a25d98d2d1c8 100644
--- a/fs/xfs/xfs_dquot.h
+++ b/fs/xfs/xfs_dquot.h
@@ -168,6 +168,9 @@ extern int		xfs_qm_dqget_next(struct xfs_mount *mp, xfs_dqid_t id,
 extern int		xfs_qm_dqget_uncached(struct xfs_mount *mp,
 					xfs_dqid_t id, uint type,
 					struct xfs_dquot **dqpp);
+extern int		xfs_qm_dqget_alloc(struct xfs_trans **tpp,
+					xfs_dqid_t id, uint type,
+					struct xfs_dquot **dqpp);
 extern void		xfs_qm_dqput(xfs_dquot_t *);
 
 extern void		xfs_dqlock2(struct xfs_dquot *, struct xfs_dquot *);


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH 21/21] xfs: add online scrub/repair for superblock counters
  2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
                   ` (19 preceding siblings ...)
  2018-06-24 19:25 ` [PATCH 20/21] xfs: implement live quotacheck as part of quota repair Darrick J. Wong
@ 2018-06-24 19:25 ` Darrick J. Wong
  20 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-24 19:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Teach online scrub and repair how to check and reset the superblock
inode and block counters.  The AG rebuilding functions will need these
to adjust the counts if they need to change as a part of recovering from
corruption.  We must use the repair freeze mechanism to prevent any
other changes while we do this.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile                  |    2 
 fs/xfs/libxfs/xfs_fs.h           |    3 
 fs/xfs/libxfs/xfs_types.c        |   34 +++++
 fs/xfs/libxfs/xfs_types.h        |    1 
 fs/xfs/scrub/common.c            |  146 ++++++++++++++++++++
 fs/xfs/scrub/common.h            |    4 +
 fs/xfs/scrub/fscounters.c        |  276 ++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/fscounters_repair.c |  100 ++++++++++++++
 fs/xfs/scrub/repair.h            |    2 
 fs/xfs/scrub/scrub.c             |    6 +
 fs/xfs/scrub/scrub.h             |    7 +
 fs/xfs/scrub/trace.h             |   63 ++++++++-
 12 files changed, 639 insertions(+), 5 deletions(-)
 create mode 100644 fs/xfs/scrub/fscounters.c
 create mode 100644 fs/xfs/scrub/fscounters_repair.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 0392bca6f5fe..50876f164b73 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -148,6 +148,7 @@ xfs-y				+= $(addprefix scrub/, \
 				   common.o \
 				   dabtree.o \
 				   dir.o \
+				   fscounters.o \
 				   ialloc.o \
 				   inode.o \
 				   parent.o \
@@ -167,6 +168,7 @@ xfs-y				+= $(addprefix scrub/, \
 				   attr_repair.o \
 				   alloc_repair.o \
 				   bmap_repair.o \
+				   fscounters_repair.o \
 				   ialloc_repair.o \
 				   inode_repair.o \
 				   refcount_repair.o \
diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index e93f9432d2a6..0f0e2948866c 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -502,9 +502,10 @@ struct xfs_scrub_metadata {
 #define XFS_SCRUB_TYPE_UQUOTA	21	/* user quotas */
 #define XFS_SCRUB_TYPE_GQUOTA	22	/* group quotas */
 #define XFS_SCRUB_TYPE_PQUOTA	23	/* project quotas */
+#define XFS_SCRUB_TYPE_FSCOUNTERS 24	/* fs summary counters */
 
 /* Number of scrub subcommands. */
-#define XFS_SCRUB_TYPE_NR	24
+#define XFS_SCRUB_TYPE_NR	25
 
 /* i: Repair this metadata. */
 #define XFS_SCRUB_IFLAG_REPAIR		(1 << 0)
diff --git a/fs/xfs/libxfs/xfs_types.c b/fs/xfs/libxfs/xfs_types.c
index 2e2a243cef2e..2e9c0c25ccb6 100644
--- a/fs/xfs/libxfs/xfs_types.c
+++ b/fs/xfs/libxfs/xfs_types.c
@@ -171,3 +171,37 @@ xfs_verify_rtbno(
 {
 	return rtbno < mp->m_sb.sb_rblocks;
 }
+
+/* Calculate the range of valid icount values. */
+static void
+xfs_icount_range(
+	struct xfs_mount	*mp,
+	unsigned long long	*min,
+	unsigned long long	*max)
+{
+	unsigned long long	nr_inos = 0;
+	xfs_agnumber_t		agno;
+
+	/* root, rtbitmap, rtsum all live in the first chunk */
+	*min = XFS_INODES_PER_CHUNK;
+
+	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
+		xfs_agino_t	first, last;
+
+		xfs_agino_range(mp, agno, &first, &last);
+		nr_inos += last - first + 1;
+	}
+	*max = nr_inos;
+}
+
+/* Sanity-checking of inode counts. */
+bool
+xfs_verify_icount(
+	struct xfs_mount	*mp,
+	unsigned long long	icount)
+{
+	unsigned long long	min, max;
+
+	xfs_icount_range(mp, &min, &max);
+	return icount >= min && icount <= max;
+}
diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
index 4055d62f690c..b9e6c89284c3 100644
--- a/fs/xfs/libxfs/xfs_types.h
+++ b/fs/xfs/libxfs/xfs_types.h
@@ -165,5 +165,6 @@ bool xfs_verify_ino(struct xfs_mount *mp, xfs_ino_t ino);
 bool xfs_internal_inum(struct xfs_mount *mp, xfs_ino_t ino);
 bool xfs_verify_dir_ino(struct xfs_mount *mp, xfs_ino_t ino);
 bool xfs_verify_rtbno(struct xfs_mount *mp, xfs_rtblock_t rtbno);
+bool xfs_verify_icount(struct xfs_mount *mp, unsigned long long icount);
 
 #endif	/* __XFS_TYPES_H__ */
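[Aside: the icount range check added by this patch can be sketched outside the kernel.  This is a toy model, not the XFS code: the `struct ag_range` type and the AG inode ranges below are invented for illustration, and a real filesystem derives the per-AG first/last inode numbers from superblock geometry.]

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64	/* inodes are allocated in chunks of 64 */

/* Hypothetical per-AG inode number range. */
struct ag_range {
	uint64_t first;
	uint64_t last;
};

/* Compute the [min, max] window of plausible global inode counts. */
static void icount_range(const struct ag_range *ags, int nags,
			 uint64_t *min, uint64_t *max)
{
	uint64_t nr_inos = 0;
	int i;

	/* root, rtbitmap, rtsum all live in the first chunk */
	*min = INODES_PER_CHUNK;
	for (i = 0; i < nags; i++)
		nr_inos += ags[i].last - ags[i].first + 1;
	*max = nr_inos;
}

/* Sanity-check a global inode count against that window. */
static int verify_icount(const struct ag_range *ags, int nags,
			 uint64_t icount)
{
	uint64_t min, max;

	icount_range(ags, nags, &min, &max);
	return icount >= min && icount <= max;
}
```

With two AGs each holding 1024 inode slots, any icount below 64 (the always-present static inodes) or above 2048 (every slot in use) is rejected.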
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 257cb13d36e3..4c4a6a2d5480 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -1029,3 +1029,149 @@ xfs_scrub_fs_thaw(
 	mutex_unlock(&sc->mp->m_scrub_freeze);
 	return error;
 }
+
+/* Decide if we're going to grab this inode for iteration. */
+STATIC int
+xfs_scrub_foreach_live_inode_ag_grab(
+	struct xfs_inode	*ip)
+{
+	struct inode		*inode = VFS_I(ip);
+
+	ASSERT(rcu_read_lock_held());
+
+	/*
+	 * check for stale RCU freed inode
+	 *
+	 * If the inode has been reallocated, it doesn't matter if it's not in
+	 * the AG we are walking - we are walking for writeback, so if it
+	 * passes all the "valid inode" checks and is dirty, then we'll write
+	 * it back anyway.  If it has been reallocated and still being
+	 * initialised, the XFS_INEW check below will catch it.
+	 */
+	spin_lock(&ip->i_flags_lock);
+	if (!ip->i_ino)
+		goto out_unlock_noent;
+
+	/* avoid new or reclaimable inodes. Leave for reclaim code to flush */
+	if (__xfs_iflags_test(ip, XFS_INEW | XFS_IRECLAIMABLE | XFS_IRECLAIM))
+		goto out_unlock_noent;
+	spin_unlock(&ip->i_flags_lock);
+
+	/* nothing to sync during shutdown */
+	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
+		return -EFSCORRUPTED;
+
+	/* If we can't grab the inode, it must be on its way to reclaim. */
+	if (!igrab(inode))
+		return -ENOENT;
+	trace_xfs_scrub_iget(ip, __this_address);
+
+	/* inode is valid */
+	return 0;
+
+out_unlock_noent:
+	spin_unlock(&ip->i_flags_lock);
+	return -ENOENT;
+}
+
+#define XFS_LOOKUP_BATCH 32
+/*
+ * Iterate all in-core inodes of an AG.  We will not wait for inodes that are
+ * new or reclaimable, and the filesystem should be frozen by the caller.
+ */
+STATIC int
+xfs_scrub_foreach_live_inode_ag(
+	struct xfs_scrub_context *sc,
+	struct xfs_perag	*pag,
+	int			(*execute)(struct xfs_inode *ip, void *priv),
+	void			*priv)
+{
+	struct xfs_mount	*mp = sc->mp;
+	uint32_t		first_index = 0;
+	int			done = 0;
+	int			nr_found = 0;
+	int			error = 0;
+
+	do {
+		struct xfs_inode *batch[XFS_LOOKUP_BATCH];
+		int		i;
+
+		rcu_read_lock();
+
+		nr_found = radix_tree_gang_lookup(&pag->pag_ici_root,
+				(void **)batch, first_index, XFS_LOOKUP_BATCH);
+		if (!nr_found) {
+			rcu_read_unlock();
+			break;
+		}
+
+		/*
+		 * Grab the inodes before we drop the lock.  If we found
+		 * nothing, nr_found == 0 and the loop will be skipped.
+		 */
+		for (i = 0; i < nr_found; i++) {
+			struct xfs_inode *ip = batch[i];
+
+			if (done || xfs_scrub_foreach_live_inode_ag_grab(ip))
+				batch[i] = NULL;
+
+			/*
+			 * Update the index for the next lookup. Catch
+			 * overflows into the next AG range which can occur if
+			 * we have inodes in the last block of the AG and we
+			 * are currently pointing to the last inode.
+			 *
+			 * Because we may see inodes that are from the wrong AG
+			 * due to RCU freeing and reallocation, only update the
+			 * index if it lies in this AG. It was a race that led
+			 * us to see this inode, so another lookup from the
+			 * same index will not find it again.
+			 */
+			if (XFS_INO_TO_AGNO(mp, ip->i_ino) != pag->pag_agno)
+				continue;
+			first_index = XFS_INO_TO_AGINO(mp, ip->i_ino + 1);
+			if (first_index < XFS_INO_TO_AGINO(mp, ip->i_ino))
+				done = 1;
+		}
+
+		/* unlock now that we've grabbed the inodes. */
+		rcu_read_unlock();
+
+		for (i = 0; i < nr_found; i++) {
+			if (!batch[i])
+				continue;
+			if (!error)
+				error = execute(batch[i], priv);
+			xfs_scrub_iput(sc, batch[i]);
+		}
+
+		if (error)
+			break;
+	} while (nr_found && !done);
+
+	return error;
+}
+
+/*
+ * Iterate all in-core inodes.  We will not wait for inodes that are
+ * new or reclaimable, and the filesystem should be frozen by the caller.
+ */
+int
+xfs_scrub_foreach_live_inode(
+	struct xfs_scrub_context *sc,
+	int			(*execute)(struct xfs_inode *ip, void *priv),
+	void			*priv)
+{
+	struct xfs_mount	*mp = sc->mp;
+	struct xfs_perag	*pag;
+	xfs_agnumber_t		agno;
+	int			error = 0;
+
+	for (agno = 0; agno < mp->m_sb.sb_agcount && !error; agno++) {
+		pag = xfs_perag_get(mp, agno);
+		error = xfs_scrub_foreach_live_inode_ag(sc, pag, execute, priv);
+		xfs_perag_put(pag);
+	}
+
+	return error;
+}
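[Aside: the batched AG walk above follows a common gang-lookup pattern: fetch up to a batch of entries starting at first_index, advance the index past the last entry seen, and repeat until nothing comes back.  Stripped of RCU, inode life-cycle checks, and AG-overflow handling, the shape is roughly this — the flat `tree` array stands in for the per-AG radix tree and is purely illustrative:]

```c
#include <assert.h>

#define BATCH 4
#define NSLOTS 64

/* Toy sparse "tree": nonzero slots are live entries. */
static int tree[NSLOTS];

/* Gather up to max live entries with index >= first; return count found. */
static int gang_lookup(unsigned first, unsigned *out, int max)
{
	unsigned i;
	int n = 0;

	for (i = first; i < NSLOTS && n < max; i++)
		if (tree[i])
			out[n++] = i;
	return n;
}

/* Visit every live entry in batches, like the scrub live-inode walk. */
static int foreach_entry(int (*execute)(unsigned idx, void *priv), void *priv)
{
	unsigned first_index = 0;
	int nr_found;
	int error = 0;

	do {
		unsigned batch[BATCH];
		int i;

		nr_found = gang_lookup(first_index, batch, BATCH);
		for (i = 0; i < nr_found; i++) {
			/* advance the index past the last entry we saw */
			first_index = batch[i] + 1;
			if (!error)
				error = execute(batch[i], priv);
		}
	} while (nr_found && !error);

	return error;
}

static int count_entry(unsigned idx, void *priv)
{
	(void)idx;
	(*(int *)priv)++;
	return 0;
}
```

Each pass resumes where the previous one left off, so entries inserted behind the cursor are skipped and the walk terminates even on a sparse index.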
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index b0cca36de2de..aed12c4fb2f5 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -105,6 +105,8 @@ xfs_scrub_setup_quota(struct xfs_scrub_context *sc, struct xfs_inode *ip)
 	return -ENOENT;
 }
 #endif
+int xfs_scrub_setup_fscounters(struct xfs_scrub_context *sc,
+			       struct xfs_inode *ip);
 
 void xfs_scrub_ag_free(struct xfs_scrub_context *sc, struct xfs_scrub_ag *sa);
 int xfs_scrub_ag_init(struct xfs_scrub_context *sc, xfs_agnumber_t agno,
@@ -151,5 +153,7 @@ int xfs_scrub_ilock_inverted(struct xfs_inode *ip, uint lock_mode);
 void xfs_scrub_iput(struct xfs_scrub_context *sc, struct xfs_inode *ip);
 int xfs_scrub_fs_freeze(struct xfs_scrub_context *sc);
 int xfs_scrub_fs_thaw(struct xfs_scrub_context *sc);
+int xfs_scrub_foreach_live_inode(struct xfs_scrub_context *sc,
+		int (*execute)(struct xfs_inode *ip, void *priv), void *priv);
 
 #endif	/* __XFS_SCRUB_COMMON_H__ */
diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c
new file mode 100644
index 000000000000..32661e1951ba
--- /dev/null
+++ b/fs/xfs/scrub/fscounters.c
@@ -0,0 +1,276 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_inode.h"
+#include "xfs_alloc.h"
+#include "xfs_ialloc.h"
+#include "xfs_rmap.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/*
+ * FS Summary Counters
+ * ===================
+ *
+ * Filesystem summary counters are a tricky beast to check.  We cannot have
+ * anyone changing the superblock fields, the percpu counters, or the AG
+ * headers while we do the global check.  This means that we must freeze the
+ * filesystem for the entire duration.   Once that's done, we compute what the
+ * incore counters /should/ be based on the counters in the AG headers
+ * (presumably we checked those in an earlier part of scrub) and the in-core
+ * free space reservations (both the user-changeable one and the per-AG ones).
+ *
+ * From there we compare the computed incore counts to the actual ones and
+ * complain if they're off.  For repair we compute the deltas needed to
+ * correct the counters and then update the incore and ondisk counters
+ * accordingly.
+ */
+
+/* Summary counter checks require a frozen fs. */
+int
+xfs_scrub_setup_fscounters(
+	struct xfs_scrub_context	*sc,
+	struct xfs_inode		*ip)
+{
+	int				error;
+
+	/* Save counters across runs. */
+	sc->buf = kmem_zalloc(sizeof(struct xfs_scrub_fscounters), KM_SLEEP);
+	if (!sc->buf)
+		return -ENOMEM;
+
+	/*
+	 * We need to prevent any other thread from changing the global fs
+	 * summary counters while we're scrubbing or repairing them.  This
+	 * requires the fs to be frozen.
+	 *
+	 * Scrub can do some basic sanity checks if userspace does not permit
+	 * us to freeze the filesystem.
+	 */
+	if ((sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) &&
+	    !(sc->sm->sm_flags & XFS_SCRUB_IFLAG_FREEZE_OK))
+		return -EUSERS;
+
+	if (sc->sm->sm_flags & XFS_SCRUB_IFLAG_FREEZE_OK) {
+		error = xfs_scrub_fs_freeze(sc);
+		if (error)
+			return error;
+	}
+
+	/* Set up the scrub context. */
+	return xfs_scrub_trans_alloc(sc, 0);
+}
+
+/*
+ * Record the number of blocks reserved for this inode for future writes but
+ * not yet allocated to real space.  In other words, we're looking for all
+ * subtractions from fdblocks that aren't backed by actual space allocations
+ * while we recalculate fdblocks.
+ */
+STATIC int
+xfs_scrub_fscounters_count_del(
+	struct xfs_inode	*ip,
+	void			*priv)
+{
+	struct xfs_iext_cursor	icur;
+	struct xfs_bmbt_irec	rec;
+	struct xfs_ifork	*ifp;
+	uint64_t		*d = priv;
+	int64_t			delblks = ip->i_delayed_blks;
+
+	if (delblks == 0)
+		return 0;
+
+	/* Add the indlen blocks for each data fork reservation. */
+	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+	for_each_xfs_iext(ifp, &icur, &rec) {
+		if (!isnullstartblock(rec.br_startblock))
+			continue;
+		delblks += startblockval(rec.br_startblock);
+	}
+
+	/*
+	 * Add the indlen blocks for each CoW fork reservation.  Remember
+	 * that we count real/unwritten extents in the CoW fork towards
+	 * i_delayed_blks, so we have to subtract those.
+	 */
+	ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
+	if (ifp) {
+		for_each_xfs_iext(ifp, &icur, &rec) {
+			if (!isnullstartblock(rec.br_startblock)) {
+				/* real/unwritten extent */
+				delblks -= rec.br_blockcount;
+				continue;
+			}
+			delblks += startblockval(rec.br_startblock);
+		}
+	}
+
+	/* No, we can't have negative reservations. */
+	if (delblks < 0)
+		return -EFSCORRUPTED;
+
+	*d += delblks;
+	return 0;
+}
+
+/*
+ * Calculate what the global in-core counters ought to be from the AG header
+ * contents.  Callers can compare this to the actual in-core counters to
+ * calculate by how much both in-core and on-disk counters need to be
+ * adjusted.
+ */
+STATIC int
+xfs_scrub_fscounters_calc(
+	struct xfs_scrub_context	*sc,
+	struct xfs_scrub_fscounters	*fsc)
+{
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_buf			*agi_bp;
+	struct xfs_buf			*agf_bp;
+	struct xfs_agi			*agi;
+	struct xfs_agf			*agf;
+	struct xfs_perag		*pag;
+	uint64_t			delayed = 0;
+	xfs_agnumber_t			agno;
+	int				error;
+
+	ASSERT(sc->fs_frozen);
+
+	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
+		/* Count all the inodes */
+		error = xfs_ialloc_read_agi(mp, sc->tp, agno, &agi_bp);
+		if (error)
+			return error;
+		agi = XFS_BUF_TO_AGI(agi_bp);
+		fsc->icount += be32_to_cpu(agi->agi_count);
+		fsc->ifree += be32_to_cpu(agi->agi_freecount);
+
+		/* Add up the free/freelist/bnobt/cntbt blocks */
+		error = xfs_alloc_read_agf(mp, sc->tp, agno, 0, &agf_bp);
+		if (error)
+			return error;
+		if (!agf_bp)
+			return -ENOMEM;
+		agf = XFS_BUF_TO_AGF(agf_bp);
+		fsc->fdblocks += be32_to_cpu(agf->agf_freeblks);
+		fsc->fdblocks += be32_to_cpu(agf->agf_flcount);
+		fsc->fdblocks += be32_to_cpu(agf->agf_btreeblks);
+
+		/*
+		 * Per-AG reservations are taken out of the incore counters,
+		 * so count them out.
+		 */
+		pag = xfs_perag_get(mp, agno);
+		fsc->fdblocks -= pag->pag_meta_resv.ar_reserved;
+		fsc->fdblocks -= pag->pag_rmapbt_resv.ar_orig_reserved;
+		xfs_perag_put(pag);
+	}
+
+	/*
+	 * The global space reservation is taken out of the incore counters,
+	 * so count that out too.
+	 */
+	fsc->fdblocks -= mp->m_resblks_avail;
+
+	/*
+	 * Delayed allocation reservations are taken out of the incore counters
+	 * but not recorded on disk, so count them out too.
+	 */
+	error = xfs_scrub_foreach_live_inode(sc, xfs_scrub_fscounters_count_del,
+			&delayed);
+	if (error)
+		return error;
+	fsc->fdblocks -= delayed;
+
+	trace_xfs_scrub_fscounters_calc(mp, fsc->icount, fsc->ifree,
+			fsc->fdblocks, delayed);
+
+	/* Bail out if the values we compute are totally nonsense. */
+	if (!xfs_verify_icount(mp, fsc->icount) ||
+	    fsc->fdblocks > mp->m_sb.sb_dblocks ||
+	    fsc->ifree > fsc->icount)
+		return -EFSCORRUPTED;
+
+	return 0;
+}
+
+/*
+ * Check the superblock counters.
+ *
+ * The filesystem must be frozen so that the counters do not change while
+ * we're computing the summary counters.
+ */
+int
+xfs_scrub_fscounters(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_scrub_fscounters	*fsc = sc->buf;
+	int				error;
+
+	/* See if icount is obviously wrong. */
+	if (!xfs_verify_icount(mp, mp->m_sb.sb_icount))
+		xfs_scrub_block_set_corrupt(sc, mp->m_sb_bp);
+
+	/* See if fdblocks / ifree are obviously wrong. */
+	if (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks)
+		xfs_scrub_block_set_corrupt(sc, mp->m_sb_bp);
+	if (mp->m_sb.sb_ifree > mp->m_sb.sb_icount)
+		xfs_scrub_block_set_corrupt(sc, mp->m_sb_bp);
+
+	/*
+	 * If we're only checking for corruption and we found it, exit now.
+	 *
+	 * Repair depends on the counter values we collect here, so if the
+	 * IFLAG_REPAIR flag is set we must continue to calculate the correct
+	 * counter values.
+	 */
+	if (!(sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) &&
+	    (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT))
+		return 0;
+
+	/* Bail out if we need to be frozen to do the hard checks. */
+	if (!sc->fs_frozen) {
+		xfs_scrub_set_incomplete(sc);
+		return -EUSERS;
+	}
+
+	/* Counters seem ok, but let's count them. */
+	error = xfs_scrub_fscounters_calc(sc, fsc);
+	if (!xfs_scrub_process_error(sc, 0, XFS_SB_BLOCK(sc->mp), &error))
+		return error;
+
+	/*
+	 * Compare the in-core counters.  In theory we sync'd the superblock
+	 * when we did the repair freeze, so they should be the same as the
+	 * percpu counters.
+	 */
+	spin_lock(&mp->m_sb_lock);
+	if (mp->m_sb.sb_icount != fsc->icount)
+		xfs_scrub_block_set_corrupt(sc, mp->m_sb_bp);
+	if (mp->m_sb.sb_ifree != fsc->ifree)
+		xfs_scrub_block_set_corrupt(sc, mp->m_sb_bp);
+	if (mp->m_sb.sb_fdblocks != fsc->fdblocks)
+		xfs_scrub_block_set_corrupt(sc, mp->m_sb_bp);
+	spin_unlock(&mp->m_sb_lock);
+
+	return 0;
+}
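[Aside: the reconciliation in xfs_scrub_fscounters_calc() boils down to summing the per-AG header counters and then subtracting everything that is reserved in core but never recorded on disk.  A toy model of that arithmetic follows; the `ag_hdr` field names are invented stand-ins for the AGI/AGF fields and mount state, not real structures:]

```c
#include <assert.h>
#include <stdint.h>

/* Invented stand-ins for the AGI/AGF header fields and incore state. */
struct ag_hdr {
	uint64_t icount, ifree;			/* from the AGI */
	uint64_t freeblks, flcount, btreeblks;	/* from the AGF */
	uint64_t perag_resv;			/* incore per-AG reservation */
};

struct counters {
	uint64_t icount, ifree, fdblocks;
};

/* Recompute what the incore summary counters ought to be. */
static struct counters calc(const struct ag_hdr *ags, int nags,
			    uint64_t global_resv, uint64_t delalloc)
{
	struct counters c = { 0, 0, 0 };
	int i;

	for (i = 0; i < nags; i++) {
		c.icount += ags[i].icount;
		c.ifree += ags[i].ifree;
		c.fdblocks += ags[i].freeblks + ags[i].flcount +
			      ags[i].btreeblks;
		/* per-AG reservations are taken out of the incore counter */
		c.fdblocks -= ags[i].perag_resv;
	}
	/* the global reservation and delalloc are incore-only subtractions */
	c.fdblocks -= global_resv;
	c.fdblocks -= delalloc;
	return c;
}
```

A caller would compare the result against the superblock/percpu counters and flag any mismatch as corruption, which is exactly what the frozen-fs scrub does.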
diff --git a/fs/xfs/scrub/fscounters_repair.c b/fs/xfs/scrub/fscounters_repair.c
new file mode 100644
index 000000000000..8893fb9d4813
--- /dev/null
+++ b/fs/xfs/scrub/fscounters_repair.c
@@ -0,0 +1,100 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_btree.h"
+#include "xfs_bit.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_sb.h"
+#include "xfs_inode.h"
+#include "xfs_alloc.h"
+#include "xfs_ialloc.h"
+#include "xfs_rmap.h"
+#include "scrub/xfs_scrub.h"
+#include "scrub/scrub.h"
+#include "scrub/common.h"
+#include "scrub/trace.h"
+#include "scrub/repair.h"
+
+/*
+ * FS Summary Counters
+ * ===================
+ *
+ * To repair the filesystem summary counters we compute the correct values,
+ * take the difference between those values and the ones in m_sb, and modify
+ * both the percpu and the m_sb counters by the corresponding amounts.  The
+ * filesystem must be frozen to do anything.
+ */
+
+/*
+ * Reset the superblock counters.
+ *
+ * The filesystem must be frozen so that the counters do not change while
+ * we're computing the summary counters.
+ */
+int
+xfs_repair_fscounters(
+	struct xfs_scrub_context	*sc)
+{
+	struct xfs_mount		*mp = sc->mp;
+	struct xfs_scrub_fscounters	*fsc = sc->buf;
+	int64_t				delta_icount;
+	int64_t				delta_ifree;
+	int64_t				delta_fdblocks;
+	int				error;
+
+	/*
+	 * Reinitialize the counters.  We know that the counters in mp->m_sb
+	 * are supposed to match the counters we calculated, so we therefore
+	 * need to calculate the deltas...
+	 */
+	spin_lock(&mp->m_sb_lock);
+	delta_icount = (int64_t)fsc->icount - mp->m_sb.sb_icount;
+	delta_ifree = (int64_t)fsc->ifree - mp->m_sb.sb_ifree;
+	delta_fdblocks = (int64_t)fsc->fdblocks - mp->m_sb.sb_fdblocks;
+	spin_unlock(&mp->m_sb_lock);
+
+	trace_xfs_repair_reset_counters(mp, delta_icount, delta_ifree,
+			delta_fdblocks);
+
+	/* ...and then update the per-cpu counters... */
+	if (delta_icount) {
+		error = xfs_mod_icount(mp, delta_icount);
+		if (error)
+			return error;
+	}
+	if (delta_ifree) {
+		error = xfs_mod_ifree(mp, delta_ifree);
+		if (error)
+			goto err_icount;
+	}
+	if (delta_fdblocks) {
+		error = xfs_mod_fdblocks(mp, delta_fdblocks, false);
+		if (error)
+			goto err_ifree;
+	}
+
+	/* ...and finally log the superblock changes. */
+	spin_lock(&mp->m_sb_lock);
+	mp->m_sb.sb_icount = fsc->icount;
+	mp->m_sb.sb_ifree = fsc->ifree;
+	mp->m_sb.sb_fdblocks = fsc->fdblocks;
+	spin_unlock(&mp->m_sb_lock);
+	xfs_log_sb(sc->tp);
+
+	return 0;
+err_ifree:
+	xfs_mod_ifree(mp, -delta_ifree);
+err_icount:
+	xfs_mod_icount(mp, -delta_icount);
+	return error;
+}
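[Aside: the repair function above applies signed deltas to three counters and must unwind in reverse order if a later update fails.  A minimal sketch of that pattern, with `mod()` standing in for the percpu counter updates (which can fail, e.g. when fdblocks would go negative); the globals and signatures here are illustrative only:]

```c
#include <assert.h>
#include <stdint.h>

static int64_t icount, ifree, fdblocks;

/* Apply a signed delta; refuse changes that would go negative. */
static int mod(int64_t *ctr, int64_t delta)
{
	if (*ctr + delta < 0)
		return -1;
	*ctr += delta;
	return 0;
}

/* Move the counters to the computed values, undoing on failure. */
static int reset_counters(int64_t calc_icount, int64_t calc_ifree,
			  int64_t calc_fdblocks)
{
	int64_t d_icount = calc_icount - icount;
	int64_t d_ifree = calc_ifree - ifree;
	int64_t d_fdblocks = calc_fdblocks - fdblocks;

	if (mod(&icount, d_icount))
		return -1;
	if (mod(&ifree, d_ifree))
		goto err_icount;
	if (mod(&fdblocks, d_fdblocks))
		goto err_ifree;
	return 0;

	/* Unwind in reverse order of application. */
err_ifree:
	mod(&ifree, -d_ifree);
err_icount:
	mod(&icount, -d_icount);
	return -1;
}
```

Note the label ordering: the later label falls through to the earlier one, so a failure at step N undoes exactly steps N-1 through 1 and nothing else.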
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index 083ab63624eb..7e3fee59b517 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -121,6 +121,7 @@ int xfs_repair_quota(struct xfs_scrub_context *sc);
 #else
 # define xfs_repair_quota		xfs_repair_notsupported
 #endif /* CONFIG_XFS_QUOTA */
+int xfs_repair_fscounters(struct xfs_scrub_context *sc);
 
 #else
 
@@ -165,6 +166,7 @@ static inline int xfs_repair_rmapbt_setup(
 #define xfs_repair_symlink		xfs_repair_notsupported
 #define xfs_repair_xattr		xfs_repair_notsupported
 #define xfs_repair_quota		xfs_repair_notsupported
+#define xfs_repair_fscounters		xfs_repair_notsupported
 
 #endif /* CONFIG_XFS_ONLINE_REPAIR */
 
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index f57ec412a617..42b23d831c9e 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -369,6 +369,12 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
 		.scrub	= xfs_scrub_quota,
 		.repair	= xfs_repair_quota,
 	},
+	[XFS_SCRUB_TYPE_FSCOUNTERS] = {	/* fs summary counters */
+		.type	= ST_FS,
+		.setup	= xfs_scrub_setup_fscounters,
+		.scrub	= xfs_scrub_fscounters,
+		.repair	= xfs_repair_fscounters,
+	},
 };
 
 /* This isn't a stable feature, warn once per day. */
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index 43c4189ea549..1a1b0cad64a8 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -128,6 +128,7 @@ xfs_scrub_quota(struct xfs_scrub_context *sc)
 	return -ENOENT;
 }
 #endif
+int xfs_scrub_fscounters(struct xfs_scrub_context *sc);
 
 /* cross-referencing helpers */
 void xfs_scrub_xref_is_used_space(struct xfs_scrub_context *sc,
@@ -159,4 +160,10 @@ bool xfs_scrub_xattr_set_map(struct xfs_scrub_context *sc, unsigned long *map,
 		unsigned int start, unsigned int len);
 uint xfs_scrub_quota_to_dqtype(struct xfs_scrub_context *sc);
 
+struct xfs_scrub_fscounters {
+	uint64_t		icount;
+	uint64_t		ifree;
+	uint64_t		fdblocks;
+};
+
 #endif	/* __XFS_SCRUB_SCRUB_H__ */
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 0212d273ca8b..a1608a27cb29 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -514,6 +514,50 @@ DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iget);
 DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iget_target);
 DEFINE_SCRUB_IREF_EVENT(xfs_scrub_iput_target);
 
+TRACE_EVENT(xfs_scrub_fscounters_calc,
+	TP_PROTO(struct xfs_mount *mp, uint64_t icount, uint64_t ifree,
+		 uint64_t fdblocks, uint64_t delalloc),
+	TP_ARGS(mp, icount, ifree, fdblocks, delalloc),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(int64_t, icount_sb)
+		__field(int64_t, icount_percpu)
+		__field(uint64_t, icount_calculated)
+		__field(int64_t, ifree_sb)
+		__field(int64_t, ifree_percpu)
+		__field(uint64_t, ifree_calculated)
+		__field(int64_t, fdblocks_sb)
+		__field(int64_t, fdblocks_percpu)
+		__field(uint64_t, fdblocks_calculated)
+		__field(uint64_t, delalloc)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->icount_sb = mp->m_sb.sb_icount;
+		__entry->icount_percpu = percpu_counter_sum(&mp->m_icount);
+		__entry->icount_calculated = icount;
+		__entry->ifree_sb = mp->m_sb.sb_ifree;
+		__entry->ifree_percpu = percpu_counter_sum(&mp->m_ifree);
+		__entry->ifree_calculated = ifree;
+		__entry->fdblocks_sb = mp->m_sb.sb_fdblocks;
+		__entry->fdblocks_percpu = percpu_counter_sum(&mp->m_fdblocks);
+		__entry->fdblocks_calculated = fdblocks;
+		__entry->delalloc = delalloc;
+	),
+	TP_printk("dev %d:%d icount %lld:%lld:%llu ifree %lld:%lld:%llu fdblocks %lld:%lld:%llu delalloc %llu",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->icount_sb,
+		  __entry->icount_percpu,
+		  __entry->icount_calculated,
+		  __entry->ifree_sb,
+		  __entry->ifree_percpu,
+		  __entry->ifree_calculated,
+		  __entry->fdblocks_sb,
+		  __entry->fdblocks_percpu,
+		  __entry->fdblocks_calculated,
+		  __entry->delalloc)
+)
+
 /* repair tracepoints */
 #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR)
 
@@ -722,17 +766,28 @@ TRACE_EVENT(xfs_repair_calc_ag_resblks_btsize,
 		  __entry->rmapbt_sz,
 		  __entry->refcbt_sz)
 )
+
 TRACE_EVENT(xfs_repair_reset_counters,
-	TP_PROTO(struct xfs_mount *mp),
-	TP_ARGS(mp),
+	TP_PROTO(struct xfs_mount *mp, int64_t icount_adj, int64_t ifree_adj,
+		 int64_t fdblocks_adj),
+	TP_ARGS(mp, icount_adj, ifree_adj, fdblocks_adj),
 	TP_STRUCT__entry(
 		__field(dev_t, dev)
+		__field(int64_t, icount_adj)
+		__field(int64_t, ifree_adj)
+		__field(int64_t, fdblocks_adj)
 	),
 	TP_fast_assign(
 		__entry->dev = mp->m_super->s_dev;
+		__entry->icount_adj = icount_adj;
+		__entry->ifree_adj = ifree_adj;
+		__entry->fdblocks_adj = fdblocks_adj;
 	),
-	TP_printk("dev %d:%d",
-		  MAJOR(__entry->dev), MINOR(__entry->dev))
+	TP_printk("dev %d:%d icount %lld ifree %lld fdblocks %lld",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->icount_adj,
+		  __entry->ifree_adj,
+		  __entry->fdblocks_adj)
 )
 
 TRACE_EVENT(xfs_repair_ialloc_insert,


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap
  2018-06-24 19:23 ` [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap Darrick J. Wong
@ 2018-06-27  0:54   ` Dave Chinner
  2018-06-28 21:11   ` Allison Henderson
  1 sibling, 0 replies; 77+ messages in thread
From: Dave Chinner @ 2018-06-27  0:54 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:23:36PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> The original rmap code assumed that there would always be at least one
> rmap in the rmapbt (the AG sb/agf/agi) and so errored out if it didn't
> find one.  This assumption isn't true for the rmapbt repair function
> (and it won't be true for realtime rmap either), so remove the check and
> just deal with the situation.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks reasonable.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 02/21] xfs: add helper to decide if an inode has allocated cow blocks
  2018-06-24 19:23 ` [PATCH 02/21] xfs: add helper to decide if an inode has allocated cow blocks Darrick J. Wong
@ 2018-06-27  1:02   ` Dave Chinner
  2018-06-28 21:12   ` Allison Henderson
  1 sibling, 0 replies; 77+ messages in thread
From: Dave Chinner @ 2018-06-27  1:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:23:42PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Add a helper to decide if an inode has real or unwritten extents in the
> CoW fork.  The upcoming repair freeze functionality will have to know if
> it's safe to iput an inode -- if the inode has incore any state that
> would require a transaction to unwind during iput, we'll have to defer
> the iput.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Seems ok, but I really can't review this in isolation because at
this point I've got no idea what it's calling context is.

i.e. I'm not going to say this is OK until I see how/why/where
delayed iput()s are used.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-24 19:23 ` [PATCH 04/21] xfs: repair the AGF and AGFL Darrick J. Wong
@ 2018-06-27  2:19   ` Dave Chinner
  2018-06-27 16:44     ` Allison Henderson
  2018-06-28 17:25     ` Allison Henderson
  2018-06-28 21:14   ` Allison Henderson
  1 sibling, 2 replies; 77+ messages in thread
From: Dave Chinner @ 2018-06-27  2:19 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:23:54PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Regenerate the AGF and AGFL from the rmap data.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

[...]

> +/* Information for finding AGF-rooted btrees */
> +enum {
> +	REPAIR_AGF_BNOBT = 0,
> +	REPAIR_AGF_CNTBT,
> +	REPAIR_AGF_RMAPBT,
> +	REPAIR_AGF_REFCOUNTBT,
> +	REPAIR_AGF_END,
> +	REPAIR_AGF_MAX
> +};

Why can't you just use XFS_BTNUM_* for these btree type descriptors?

> +
> +static const struct xfs_repair_find_ag_btree repair_agf[] = {
> +	[REPAIR_AGF_BNOBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_AG,
> +		.buf_ops = &xfs_allocbt_buf_ops,
> +		.magic = XFS_ABTB_CRC_MAGIC,
> +	},
> +	[REPAIR_AGF_CNTBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_AG,
> +		.buf_ops = &xfs_allocbt_buf_ops,
> +		.magic = XFS_ABTC_CRC_MAGIC,
> +	},

I had to stop and think about why this only supports the v5 types.
i.e. we're rebuilding from rmap info, so this will never run on v4
filesystems, hence we only care about v5 types (i.e. *CRC_MAGIC).
Perhaps a one-line comment to remind readers of this?

> +	[REPAIR_AGF_RMAPBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_AG,
> +		.buf_ops = &xfs_rmapbt_buf_ops,
> +		.magic = XFS_RMAP_CRC_MAGIC,
> +	},
> +	[REPAIR_AGF_REFCOUNTBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_REFC,
> +		.buf_ops = &xfs_refcountbt_buf_ops,
> +		.magic = XFS_REFC_CRC_MAGIC,
> +	},
> +	[REPAIR_AGF_END] = {
> +		.buf_ops = NULL,
> +	},
> +};
> +
> +/*
> + * Find the btree roots.  This is /also/ a chicken and egg problem because we
> + * have to use the rmapbt (rooted in the AGF) to find the btrees rooted in the
> + * AGF.  We also have no idea if the btrees make any sense.  If we hit obvious
> + * corruptions in those btrees we'll bail out.
> + */
> +STATIC int
> +xfs_repair_agf_find_btrees(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp,
> +	struct xfs_repair_find_ag_btree	*fab,
> +	struct xfs_buf			*agfl_bp)
> +{
> +	struct xfs_agf			*old_agf = XFS_BUF_TO_AGF(agf_bp);
> +	int				error;
> +
> +	/* Go find the root data. */
> +	memcpy(fab, repair_agf, sizeof(repair_agf));

Why are we initialising fab here, instead of in the caller where it
is declared and passed to various functions? Given there is only a
single declaration of this structure, why do we need a global static
const table initialiser just to copy it here - why isn't it
initialised at the declaration point?
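To illustrate the suggestion, a table initialised at its declaration point with designated initialisers needs no file-scope template and no memcpy(); unspecified members (such as the root and height fields filled in later) are guaranteed to be zero-initialised. The sketch below is a compilable userspace mock-up, not the real XFS code: the struct layout, owner values, and magic numbers are stand-ins.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct xfs_repair_find_ag_btree. */
struct xr_find_ag_btree {
	uint64_t	rmap_owner;
	uint32_t	magic;
	uint32_t	root;	/* filled in by the root finder later */
	uint32_t	height;
};

enum { XR_AGF_BNOBT = 0, XR_AGF_CNTBT, XR_AGF_END, XR_AGF_MAX };

/* Look up a table entry's magic; the table lives at its declaration. */
static uint32_t xr_agf_magic(int idx)
{
	/*
	 * Declaration-point initialisation: entries not named here, and
	 * members not named within an entry, are zeroed by the compiler,
	 * so no separate template-copy step is needed.
	 */
	struct xr_find_ag_btree fab[XR_AGF_MAX] = {
		[XR_AGF_BNOBT] = { .rmap_owner = 1, .magic = 0x41423342 },
		[XR_AGF_CNTBT] = { .rmap_owner = 1, .magic = 0x41423343 },
		[XR_AGF_END]   = { .magic = 0 },	/* sentinel */
	};

	if (idx < 0 || idx >= XR_AGF_MAX)
		return 0;
	return fab[idx].magic;
}
```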

> +	error = xfs_repair_find_ag_btree_roots(sc, agf_bp, fab, agfl_bp);
> +	if (error)
> +		return error;
> +
> +	/* We must find the bnobt, cntbt, and rmapbt roots. */
> +	if (fab[REPAIR_AGF_BNOBT].root == NULLAGBLOCK ||
> +	    fab[REPAIR_AGF_BNOBT].height > XFS_BTREE_MAXLEVELS ||
> +	    fab[REPAIR_AGF_CNTBT].root == NULLAGBLOCK ||
> +	    fab[REPAIR_AGF_CNTBT].height > XFS_BTREE_MAXLEVELS ||
> +	    fab[REPAIR_AGF_RMAPBT].root == NULLAGBLOCK ||
> +	    fab[REPAIR_AGF_RMAPBT].height > XFS_BTREE_MAXLEVELS)
> +		return -EFSCORRUPTED;
> +
> +	/*
> +	 * We relied on the rmapbt to reconstruct the AGF.  If we get a
> +	 * different root then something's seriously wrong.
> +	 */
> +	if (fab[REPAIR_AGF_RMAPBT].root !=
> +	    be32_to_cpu(old_agf->agf_roots[XFS_BTNUM_RMAPi]))
> +		return -EFSCORRUPTED;
> +
> +	/* We must find the refcountbt root if that feature is enabled. */
> +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb) &&
> +	    (fab[REPAIR_AGF_REFCOUNTBT].root == NULLAGBLOCK ||
> +	     fab[REPAIR_AGF_REFCOUNTBT].height > XFS_BTREE_MAXLEVELS))
> +		return -EFSCORRUPTED;
> +
> +	return 0;
> +}
> +
> +/* Set btree root information in an AGF. */
> +STATIC void
> +xfs_repair_agf_set_roots(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_agf			*agf,
> +	struct xfs_repair_find_ag_btree	*fab)
> +{
> +	agf->agf_roots[XFS_BTNUM_BNOi] =
> +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].root);
> +	agf->agf_levels[XFS_BTNUM_BNOi] =
> +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].height);
> +
> +	agf->agf_roots[XFS_BTNUM_CNTi] =
> +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].root);
> +	agf->agf_levels[XFS_BTNUM_CNTi] =
> +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].height);
> +
> +	agf->agf_roots[XFS_BTNUM_RMAPi] =
> +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].root);
> +	agf->agf_levels[XFS_BTNUM_RMAPi] =
> +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].height);
> +
> +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb)) {
> +		agf->agf_refcount_root =
> +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].root);
> +		agf->agf_refcount_level =
> +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].height);
> +	}
> +}
> +
> +/*
> + * Reinitialize the AGF header, making an in-core copy of the old contents so
> + * that we know which in-core state needs to be reinitialized.
> + */
> +STATIC void
> +xfs_repair_agf_init_header(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp,
> +	struct xfs_agf			*old_agf)
> +{
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
> +
> +	memcpy(old_agf, agf, sizeof(*old_agf));
> +	memset(agf, 0, BBTOB(agf_bp->b_length));
> +	agf->agf_magicnum = cpu_to_be32(XFS_AGF_MAGIC);
> +	agf->agf_versionnum = cpu_to_be32(XFS_AGF_VERSION);
> +	agf->agf_seqno = cpu_to_be32(sc->sa.agno);
> +	agf->agf_length = cpu_to_be32(xfs_ag_block_count(mp, sc->sa.agno));
> +	agf->agf_flfirst = old_agf->agf_flfirst;
> +	agf->agf_fllast = old_agf->agf_fllast;
> +	agf->agf_flcount = old_agf->agf_flcount;
> +	if (xfs_sb_version_hascrc(&mp->m_sb))
> +		uuid_copy(&agf->agf_uuid, &mp->m_sb.sb_meta_uuid);
> +}

Do we need to clear pag->pagf_init here so that it gets
re-initialised next time someone reads the AGF?

> +
> +/* Update the AGF btree counters by walking the btrees. */
> +STATIC int
> +xfs_repair_agf_update_btree_counters(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp)
> +{
> +	struct xfs_repair_agf_allocbt	raa = { .sc = sc };
> +	struct xfs_btree_cur		*cur = NULL;
> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
> +	struct xfs_mount		*mp = sc->mp;
> +	xfs_agblock_t			btreeblks;
> +	xfs_agblock_t			blocks;
> +	int				error;
> +
> +	/* Update the AGF counters from the bnobt. */
> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> +			XFS_BTNUM_BNO);
> +	error = xfs_alloc_query_all(cur, xfs_repair_agf_walk_allocbt, &raa);
> +	if (error)
> +		goto err;
> +	error = xfs_btree_count_blocks(cur, &blocks);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +	btreeblks = blocks - 1;
> +	agf->agf_freeblks = cpu_to_be32(raa.freeblks);
> +	agf->agf_longest = cpu_to_be32(raa.longest);

This function updates more than the AGF btree counters. :P

> +
> +	/* Update the AGF counters from the cntbt. */
> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> +			XFS_BTNUM_CNT);
> +	error = xfs_btree_count_blocks(cur, &blocks);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +	btreeblks += blocks - 1;
> +
> +	/* Update the AGF counters from the rmapbt. */
> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
> +	error = xfs_btree_count_blocks(cur, &blocks);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +	agf->agf_rmap_blocks = cpu_to_be32(blocks);
> +	btreeblks += blocks - 1;
> +
> +	agf->agf_btreeblks = cpu_to_be32(btreeblks);
> +
> +	/* Update the AGF counters from the refcountbt. */
> +	if (xfs_sb_version_hasreflink(&mp->m_sb)) {
> +		cur = xfs_refcountbt_init_cursor(mp, sc->tp, agf_bp,
> +				sc->sa.agno, NULL);
> +		error = xfs_btree_count_blocks(cur, &blocks);
> +		if (error)
> +			goto err;
> +		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +		agf->agf_refcount_blocks = cpu_to_be32(blocks);
> +	}
> +
> +	return 0;
> +err:
> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> +	return error;
> +}
> +
> +/* Trigger reinitialization of the in-core data. */
> +STATIC int
> +xfs_repair_agf_reinit_incore(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_agf			*agf,
> +	const struct xfs_agf		*old_agf)
> +{
> +	struct xfs_perag		*pag;
> +
> +	/* XXX: trigger fdblocks recalculation */
> +
> +	/* Now reinitialize the in-core counters if necessary. */
> +	pag = sc->sa.pag;
> +	if (!pag->pagf_init)
> +		return 0;
> +
> +	pag->pagf_btreeblks = be32_to_cpu(agf->agf_btreeblks);
> +	pag->pagf_freeblks = be32_to_cpu(agf->agf_freeblks);
> +	pag->pagf_longest = be32_to_cpu(agf->agf_longest);
> +	pag->pagf_levels[XFS_BTNUM_BNOi] =
> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNOi]);
> +	pag->pagf_levels[XFS_BTNUM_CNTi] =
> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNTi]);
> +	pag->pagf_levels[XFS_BTNUM_RMAPi] =
> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAPi]);
> +	pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);

Ok, so we reinit the pagf bits here, but....

> +
> +	return 0;
> +}
> +
> +/* Repair the AGF. */
> +int
> +xfs_repair_agf(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_repair_find_ag_btree	fab[REPAIR_AGF_MAX];
> +	struct xfs_agf			old_agf;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_buf			*agf_bp;
> +	struct xfs_buf			*agfl_bp;
> +	struct xfs_agf			*agf;
> +	int				error;
> +
> +	/* We require the rmapbt to rebuild anything. */
> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> +		return -EOPNOTSUPP;
> +
> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGF_DADDR(mp)),
> +			XFS_FSS_TO_BB(mp, 1), 0, &agf_bp, NULL);
> +	if (error)
> +		return error;
> +	agf_bp->b_ops = &xfs_agf_buf_ops;
> +	agf = XFS_BUF_TO_AGF(agf_bp);
> +
> +	/*
> +	 * Load the AGFL so that we can screen out OWN_AG blocks that are on
> +	 * the AGFL now; these blocks might have once been part of the
> +	 * bno/cnt/rmap btrees but are not now.  This is a chicken and egg
> +	 * problem: the AGF is corrupt, so we have to trust the AGFL contents
> +	 * because we can't do any serious cross-referencing with any of the
> +	 * btrees rooted in the AGF.  If the AGFL contents are obviously bad
> +	 * then we'll bail out.
> +	 */
> +	error = xfs_alloc_read_agfl(mp, sc->tp, sc->sa.agno, &agfl_bp);
> +	if (error)
> +		return error;
> +
> +	/*
> +	 * Spot-check the AGFL blocks; if they're obviously corrupt then
> +	 * there's nothing we can do but bail out.
> +	 */
> +	error = xfs_agfl_walk(sc->mp, XFS_BUF_TO_AGF(agf_bp), agfl_bp,
> +			xfs_repair_agf_check_agfl_block, sc);
> +	if (error)
> +		return error;
> +
> +	/*
> +	 * Find the AGF btree roots.  See the comment for this function for
> +	 * more information about the limitations of this repairer; this is
> +	 * also a chicken-and-egg situation.
> +	 */
> +	error = xfs_repair_agf_find_btrees(sc, agf_bp, fab, agfl_bp);
> +	if (error)
> +		return error;

Comment could be better written.

	/*
	 * Find the AGF btree roots. This is also a chicken-and-egg
	 * situation - see xfs_repair_agf_find_btrees() for details.
	 */

> +
> +	/* Start rewriting the header and implant the btrees we found. */
> +	xfs_repair_agf_init_header(sc, agf_bp, &old_agf);
> +	xfs_repair_agf_set_roots(sc, agf, fab);
> +	error = xfs_repair_agf_update_btree_counters(sc, agf_bp);
> +	if (error)
> +		goto out_revert;

If we fail here, the pagf information is invalid, hence I think we
really do need to clear pagf_init before we start rebuilding the new
AGF. Yes, I can see we revert the AGF info, but this seems like a
landmine waiting to be tripped over.

> +	/* Reinitialize in-core state. */
> +	error = xfs_repair_agf_reinit_incore(sc, agf, &old_agf);
> +	if (error)
> +		goto out_revert;
> +
> +	/* Write this to disk. */
> +	xfs_trans_buf_set_type(sc->tp, agf_bp, XFS_BLFT_AGF_BUF);
> +	xfs_trans_log_buf(sc->tp, agf_bp, 0, BBTOB(agf_bp->b_length) - 1);
> +	return 0;
> +
> +out_revert:
> +	memcpy(agf, &old_agf, sizeof(old_agf));
> +	return error;
> +}
> +
> +/* AGFL */
> +
> +struct xfs_repair_agfl {
> +	struct xfs_repair_extent_list	agmeta_list;
> +	struct xfs_repair_extent_list	*freesp_list;
> +	struct xfs_scrub_context	*sc;
> +};
> +
> +/* Record all freespace information. */
> +STATIC int
> +xfs_repair_agfl_rmap_fn(
> +	struct xfs_btree_cur		*cur,
> +	struct xfs_rmap_irec		*rec,
> +	void				*priv)
> +{
> +	struct xfs_repair_agfl		*ra = priv;
> +	xfs_fsblock_t			fsb;
> +	int				error = 0;
> +
> +	if (xfs_scrub_should_terminate(ra->sc, &error))
> +		return error;
> +
> +	/* Record all the OWN_AG blocks. */
> +	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
> +		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
> +				rec->rm_startblock);
> +		error = xfs_repair_collect_btree_extent(ra->sc,
> +				ra->freesp_list, fsb, rec->rm_blockcount);
> +		if (error)
> +			return error;
> +	}
> +
> +	return xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
> +			xfs_repair_collect_btree_cur_blocks_in_extent_list,

Urk. The function name lengths are getting out of hand. I'm very
tempted to suggest we should shorten the namespace of all this
like s/xfs_repair_/xr_/ and s/xfs_scrub_/xs_/, etc just to make them
shorter and easier to read.

Oh, wait, did I say that out loud? :P

Something to think about, anyway.

> +			&ra->agmeta_list);
> +}
> +
> +/* Add a btree block to the agmeta list. */
> +STATIC int
> +xfs_repair_agfl_visit_btblock(

I find the name a bit confusing - AGFLs don't have btree blocks.
Yes, I know that it's a xfs_btree_visit_blocks() callback but I
think s/visit/collect/ makes more sense. i.e. it tells us what we
are doing with the btree block, rather than making it sound like we
are walking AGFL btree blocks...

> +/*
> + * Map out all the non-AGFL OWN_AG space in this AG so that we can deduce
> + * which blocks belong to the AGFL.
> + */
> +STATIC int
> +xfs_repair_agfl_find_extents(

Same here - xr_agfl_collect_free_extents()?

> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp,
> +	struct xfs_repair_extent_list	*agfl_extents,
> +	xfs_agblock_t			*flcount)
> +{
> +	struct xfs_repair_agfl		ra;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_btree_cur		*cur;
> +	struct xfs_repair_extent	*rae;
> +	int				error;
> +
> +	ra.sc = sc;
> +	ra.freesp_list = agfl_extents;
> +	xfs_repair_init_extent_list(&ra.agmeta_list);
> +
> +	/* Find all space used by the free space btrees & rmapbt. */
> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
> +	error = xfs_rmap_query_all(cur, xfs_repair_agfl_rmap_fn, &ra);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +
> +	/* Find all space used by bnobt. */

Needs clarification.

	/* Find all the in use bnobt blocks */

> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> +			XFS_BTNUM_BNO);
> +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +
> +	/* Find all space used by cntbt. */

	/* Find all the in use cntbt blocks */

> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> +			XFS_BTNUM_CNT);
> +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
> +	if (error)
> +		goto err;
> +
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +
> +	/*
> +	 * Drop the freesp meta blocks that are in use by btrees.
> +	 * The remaining blocks /should/ be AGFL blocks.
> +	 */
> +	error = xfs_repair_subtract_extents(sc, agfl_extents, &ra.agmeta_list);
> +	xfs_repair_cancel_btree_extents(sc, &ra.agmeta_list);
> +	if (error)
> +		return error;
> +
> +	/* Calculate the new AGFL size. */
> +	*flcount = 0;
> +	for_each_xfs_repair_extent(rae, agfl_extents) {
> +		*flcount += rae->len;
> +		if (*flcount > xfs_agfl_size(mp))
> +			break;
> +	}
> +	if (*flcount > xfs_agfl_size(mp))
> +		*flcount = xfs_agfl_size(mp);

Ok, so flcount is clamped here. What happens to all the remaining
agfl_extents beyond flcount?

> +	return 0;
> +
> +err:

Ok, what cleans up all the extents we've recorded in ra on error?

> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> +	return error;
> +}
> +
> +/* Update the AGF and reset the in-core state. */
> +STATIC int
> +xfs_repair_agfl_update_agf(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp,
> +	xfs_agblock_t			flcount)
> +{
> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
> +
	ASSERT(flcount <= xfs_agfl_size(sc->mp));

> +	/* XXX: trigger fdblocks recalculation */
> +
> +	/* Update the AGF counters. */
> +	if (sc->sa.pag->pagf_init)
> +		sc->sa.pag->pagf_flcount = flcount;
> +	agf->agf_flfirst = cpu_to_be32(0);
> +	agf->agf_flcount = cpu_to_be32(flcount);
> +	agf->agf_fllast = cpu_to_be32(flcount - 1);
> +
> +	xfs_alloc_log_agf(sc->tp, agf_bp,
> +			XFS_AGF_FLFIRST | XFS_AGF_FLLAST | XFS_AGF_FLCOUNT);
> +	return 0;
> +}
> +
> +/* Write out a totally new AGFL. */
> +STATIC void
> +xfs_repair_agfl_init_header(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agfl_bp,
> +	struct xfs_repair_extent_list	*agfl_extents,
> +	xfs_agblock_t			flcount)
> +{
> +	struct xfs_mount		*mp = sc->mp;
> +	__be32				*agfl_bno;
> +	struct xfs_repair_extent	*rae;
> +	struct xfs_repair_extent	*n;
> +	struct xfs_agfl			*agfl;
> +	xfs_agblock_t			agbno;
> +	unsigned int			fl_off;
> +
	ASSERT(flcount <= xfs_agfl_size(mp));

> +	/* Start rewriting the header. */
> +	agfl = XFS_BUF_TO_AGFL(agfl_bp);
> +	memset(agfl, 0xFF, BBTOB(agfl_bp->b_length));
> +	agfl->agfl_magicnum = cpu_to_be32(XFS_AGFL_MAGIC);
> +	agfl->agfl_seqno = cpu_to_be32(sc->sa.agno);
> +	uuid_copy(&agfl->agfl_uuid, &mp->m_sb.sb_meta_uuid);
> +
> +	/* Fill the AGFL with the remaining blocks. */
> +	fl_off = 0;
> +	agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agfl_bp);
> +	for_each_xfs_repair_extent_safe(rae, n, agfl_extents) {
> +		agbno = XFS_FSB_TO_AGBNO(mp, rae->fsbno);
> +
> +		trace_xfs_repair_agfl_insert(mp, sc->sa.agno, agbno, rae->len);
> +
> +		while (rae->len > 0 && fl_off < flcount) {
> +			agfl_bno[fl_off] = cpu_to_be32(agbno);
> +			fl_off++;
> +			agbno++;
> +			rae->fsbno++;
> +			rae->len--;
> +		}

This only works correctly if flcount <= xfs_agfl_size, which is why
I'm suggesting some asserts.

> +
> +		if (rae->len)
> +			break;
> +		list_del(&rae->list);
> +		kmem_free(rae);
> +	}
> +
> +	/* Write AGF and AGFL to disk. */
> +	xfs_trans_buf_set_type(sc->tp, agfl_bp, XFS_BLFT_AGFL_BUF);
> +	xfs_trans_log_buf(sc->tp, agfl_bp, 0, BBTOB(agfl_bp->b_length) - 1);
> +}
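For readers following the fill loop in the quoted hunk above, the same slot-filling logic can be reduced to a small userspace sketch (arrays instead of the kernel extent list, hypothetical names). It also shows why the flcount bound matters: if flcount could exceed the slot array size, the inner loop would write out of bounds, which is what the suggested ASSERT guards against.

```c
#include <stdint.h>

/* Simplified stand-in for struct xfs_repair_extent. */
struct xr_extent {
	uint32_t	agbno;
	uint32_t	len;
};

/*
 * Consume extents into at most flcount AGFL slots, mirroring the fill
 * loop in xfs_repair_agfl_init_header().  Returns the number of slots
 * filled; any residual length left in the extents is the "overflow"
 * that must be freed back to the AG afterwards.
 */
static unsigned int xr_agfl_fill(uint32_t *slots, unsigned int flcount,
				 struct xr_extent *exts, unsigned int nexts)
{
	unsigned int fl_off = 0;

	for (unsigned int i = 0; i < nexts; i++) {
		while (exts[i].len > 0 && fl_off < flcount) {
			slots[fl_off++] = exts[i].agbno++;
			exts[i].len--;
		}
		if (exts[i].len)	/* out of slots: overflow remains */
			break;
	}
	return fl_off;
}
```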
> +
> +/* Repair the AGFL. */
> +int
> +xfs_repair_agfl(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_owner_info		oinfo;
> +	struct xfs_repair_extent_list	agfl_extents;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_buf			*agf_bp;
> +	struct xfs_buf			*agfl_bp;
> +	xfs_agblock_t			flcount;
> +	int				error;
> +
> +	/* We require the rmapbt to rebuild anything. */
> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> +		return -EOPNOTSUPP;
> +
> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> +	xfs_repair_init_extent_list(&agfl_extents);
> +
> +	/*
> +	 * Read the AGF so that we can query the rmapbt.  We hope that there's
> +	 * nothing wrong with the AGF, but all the AG header repair functions
> +	 * have this chicken-and-egg problem.
> +	 */
> +	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
> +	if (error)
> +		return error;
> +	if (!agf_bp)
> +		return -ENOMEM;
> +
> +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGFL_DADDR(mp)),
> +			XFS_FSS_TO_BB(mp, 1), 0, &agfl_bp, NULL);
> +	if (error)
> +		return error;
> +	agfl_bp->b_ops = &xfs_agfl_buf_ops;
> +
> +	/*
> +	 * Compute the set of old AGFL blocks by subtracting from the list of
> +	 * OWN_AG blocks the list of blocks owned by all other OWN_AG metadata
> +	 * (bnobt, cntbt, rmapbt).  These are the old AGFL blocks, so return
> +	 * that list and the number of blocks we're actually going to put back
> +	 * on the AGFL.
> +	 */

That comment belongs on the function, not here. All we need here is
something like:

	/* Gather all the extents we're going to put on the new AGFL. */

> +	error = xfs_repair_agfl_find_extents(sc, agf_bp, &agfl_extents,
> +			&flcount);
> +	if (error)
> +		goto err;
> +
> +	/*
> +	 * Update AGF and AGFL.  We reset the global free block counter when
> +	 * we adjust the AGF flcount (which can fail) so avoid updating any
> +	 * bufers until we know that part works.

buffers

> +	 */
> +	error = xfs_repair_agfl_update_agf(sc, agf_bp, flcount);
> +	if (error)
> +		goto err;
> +	xfs_repair_agfl_init_header(sc, agfl_bp, &agfl_extents, flcount);
> +
> +	/*
> +	 * Ok, the AGFL should be ready to go now.  Roll the transaction so
> +	 * that we can free any AGFL overflow.
> +	 */

Why does rolling the transaction allow us to free the overflow?
Shouldn't the comment say something like "Roll the transaction to
make the new AGFL permanent before we start using it when returning
the residual AGFL freespace overflow back to the AGF freespace
btrees."

> +	sc->sa.agf_bp = agf_bp;
> +	sc->sa.agfl_bp = agfl_bp;
> +	error = xfs_repair_roll_ag_trans(sc);
> +	if (error)
> +		goto err;
> +
> +	/* Dump any AGFL overflow. */
> +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
> +	return xfs_repair_reap_btree_extents(sc, &agfl_extents, &oinfo,
> +			XFS_AG_RESV_AGFL);
> +err:
> +	xfs_repair_cancel_btree_extents(sc, &agfl_extents);
> +	return error;
> +}
> diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
> index 326be4e8b71e..bcdaa8df18f6 100644
> --- a/fs/xfs/scrub/repair.c
> +++ b/fs/xfs/scrub/repair.c
> @@ -127,9 +127,12 @@ xfs_repair_roll_ag_trans(
>  	int				error;
>  
>  	/* Keep the AG header buffers locked so we can keep going. */
> -	xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
> -	xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
> -	xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
> +	if (sc->sa.agi_bp)
> +		xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
> +	if (sc->sa.agf_bp)
> +		xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
> +	if (sc->sa.agfl_bp)
> +		xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
>  
>  	/* Roll the transaction. */
>  	error = xfs_trans_roll(&sc->tp);
> @@ -137,9 +140,12 @@ xfs_repair_roll_ag_trans(
>  		goto out_release;
>  
>  	/* Join AG headers to the new transaction. */
> -	xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
> -	xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
> -	xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
> +	if (sc->sa.agi_bp)
> +		xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
> +	if (sc->sa.agf_bp)
> +		xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
> +	if (sc->sa.agfl_bp)
> +		xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
>  
>  	return 0;
>  
> @@ -149,9 +155,12 @@ xfs_repair_roll_ag_trans(
>  	 * buffers will be released during teardown on our way out
>  	 * of the kernel.
>  	 */
> -	xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
> -	xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
> -	xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
> +	if (sc->sa.agi_bp)
> +		xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
> +	if (sc->sa.agf_bp)
> +		xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
> +	if (sc->sa.agfl_bp)
> +		xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
>  
>  	return error;
>  }
> @@ -408,6 +417,85 @@ xfs_repair_collect_btree_extent(
>  	return 0;
>  }
>  
> +/*
> + * Help record all btree blocks seen while iterating all records of a btree.
> + *
> + * We know that the btree query_all function starts at the left edge and walks
> + * towards the right edge of the tree.  Therefore, we know that we can walk up
> + * the btree cursor towards the root; if the pointer for a given level points
> + * to the first record/key in that block, we haven't seen this block before;
> + * and therefore we need to remember that we saw this block in the btree.
> + *
> + * So if our btree is:
> + *
> + *    4
> + *  / | \
> + * 1  2  3
> + *
> + * Pretend for this example that each leaf block has 100 btree records.  For
> + * the first btree record, we'll observe that bc_ptrs[0] == 1, so we record
> + * that we saw block 1.  Then we observe that bc_ptrs[1] == 1, so we record
> + * block 4.  The list is [1, 4].
> + *
> + * For the second btree record, we see that bc_ptrs[0] == 2, so we exit the
> + * loop.  The list remains [1, 4].
> + *
> + * For the 101st btree record, we've moved onto leaf block 2.  Now
> + * bc_ptrs[0] == 1 again, so we record that we saw block 2.  We see that
> + * bc_ptrs[1] == 2, so we exit the loop.  The list is now [1, 4, 2].
> + *
> + * For the 102nd record, bc_ptrs[0] == 2, so we continue.
> + *
> + * For the 201st record, we've moved on to leaf block 3.  bc_ptrs[0] == 1, so
> + * we add 3 to the list.  Now it is [1, 4, 2, 3].
> + *
> + * For the 300th record we just exit, with the list being [1, 4, 2, 3].
> + *
> + * The *iter_fn can return XFS_BTREE_QUERY_RANGE_ABORT to stop, 0 to keep
> + * iterating, or the usual negative error code.
> + */
> +int
> +xfs_repair_collect_btree_cur_blocks(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_btree_cur		*cur,
> +	int				(*iter_fn)(struct xfs_scrub_context *sc,
> +						   xfs_fsblock_t fsbno,
> +						   xfs_fsblock_t len,
> +						   void *priv),
> +	void				*priv)
> +{
> +	struct xfs_buf			*bp;
> +	xfs_fsblock_t			fsb;
> +	int				i;
> +	int				error;
> +
> +	for (i = 0; i < cur->bc_nlevels && cur->bc_ptrs[i] == 1; i++) {
> +		xfs_btree_get_block(cur, i, &bp);
> +		if (!bp)
> +			continue;
> +		fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
> +		error = iter_fn(sc, fsb, 1, priv);
> +		if (error)
> +			return error;
> +	}
> +
> +	return 0;
> +}
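The bc_ptrs walk described in the long comment above boils down to a small predicate: a level's block is seen for the first time exactly when every pointer at and below that level is 1. The sketch below is a userspace mock-up of that test (not kernel code; names are hypothetical), exercised with the same example tree the comment uses.

```c
/*
 * Given the per-level record pointers of a btree cursor (ptrs[0] is
 * the leaf), return how many levels, counting up from the leaf, are
 * sitting on the first record of their block -- i.e. how many blocks
 * the walk is visiting for the first time.  This mirrors the loop in
 * xfs_repair_collect_btree_cur_blocks().
 */
static int first_visit_levels(const int *ptrs, int nlevels)
{
	int i;

	for (i = 0; i < nlevels && ptrs[i] == 1; i++)
		;	/* level i's block has not been seen before */
	return i;
}
```

Using the comment's example: the first record has bc_ptrs == {1, 1}, so both the leaf and the root are newly visited; the second record has {2, 1}, so nothing new is recorded; the 101st record has {1, 2}, so only the new leaf is recorded.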
> +
> +/*
> + * Simple adapter to connect xfs_repair_collect_btree_extent to
> + * xfs_repair_collect_btree_cur_blocks.
> + */
> +int
> +xfs_repair_collect_btree_cur_blocks_in_extent_list(
> +	struct xfs_scrub_context	*sc,
> +	xfs_fsblock_t			fsbno,
> +	xfs_fsblock_t			len,
> +	void				*priv)
> +{
> +	return xfs_repair_collect_btree_extent(sc, priv, fsbno, len);
> +}
> +
>  /*
>   * An error happened during the rebuild so the transaction will be cancelled.
>   * The fs will shut down, and the administrator has to unmount and run repair.
> diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
> index ef47826b6725..f2af5923aa75 100644
> --- a/fs/xfs/scrub/repair.h
> +++ b/fs/xfs/scrub/repair.h
> @@ -48,9 +48,20 @@ xfs_repair_init_extent_list(
>  
>  #define for_each_xfs_repair_extent_safe(rbe, n, exlist) \
>  	list_for_each_entry_safe((rbe), (n), &(exlist)->list, list)
> +#define for_each_xfs_repair_extent(rbe, exlist) \
> +	list_for_each_entry((rbe), &(exlist)->list, list)
>  int xfs_repair_collect_btree_extent(struct xfs_scrub_context *sc,
>  		struct xfs_repair_extent_list *btlist, xfs_fsblock_t fsbno,
>  		xfs_extlen_t len);
> +int xfs_repair_collect_btree_cur_blocks(struct xfs_scrub_context *sc,
> +		struct xfs_btree_cur *cur,
> +		int (*iter_fn)(struct xfs_scrub_context *sc,
> +			       xfs_fsblock_t fsbno, xfs_fsblock_t len,
> +			       void *priv),
> +		void *priv);
> +int xfs_repair_collect_btree_cur_blocks_in_extent_list(
> +		struct xfs_scrub_context *sc, xfs_fsblock_t fsbno,
> +		xfs_fsblock_t len, void *priv);
>  void xfs_repair_cancel_btree_extents(struct xfs_scrub_context *sc,
>  		struct xfs_repair_extent_list *btlist);
>  int xfs_repair_subtract_extents(struct xfs_scrub_context *sc,
> @@ -89,6 +100,8 @@ int xfs_repair_ino_dqattach(struct xfs_scrub_context *sc);
>  
>  int xfs_repair_probe(struct xfs_scrub_context *sc);
>  int xfs_repair_superblock(struct xfs_scrub_context *sc);
> +int xfs_repair_agf(struct xfs_scrub_context *sc);
> +int xfs_repair_agfl(struct xfs_scrub_context *sc);
>  
>  #else
>  
> @@ -112,6 +125,8 @@ xfs_repair_calc_ag_resblks(
>  
>  #define xfs_repair_probe		xfs_repair_notsupported
>  #define xfs_repair_superblock		xfs_repair_notsupported
> +#define xfs_repair_agf			xfs_repair_notsupported
> +#define xfs_repair_agfl			xfs_repair_notsupported
>  
>  #endif /* CONFIG_XFS_ONLINE_REPAIR */
>  
> diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> index 58ae76b3a421..8e11c3c699fb 100644
> --- a/fs/xfs/scrub/scrub.c
> +++ b/fs/xfs/scrub/scrub.c
> @@ -208,13 +208,13 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
>  		.type	= ST_PERAG,
>  		.setup	= xfs_scrub_setup_fs,
>  		.scrub	= xfs_scrub_agf,
> -		.repair	= xfs_repair_notsupported,
> +		.repair	= xfs_repair_agf,
>  	},
>  	[XFS_SCRUB_TYPE_AGFL]= {	/* agfl */
>  		.type	= ST_PERAG,
>  		.setup	= xfs_scrub_setup_fs,
>  		.scrub	= xfs_scrub_agfl,
> -		.repair	= xfs_repair_notsupported,
> +		.repair	= xfs_repair_agfl,
>  	},
>  	[XFS_SCRUB_TYPE_AGI] = {	/* agi */
>  		.type	= ST_PERAG,
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index 524f543c5b82..c08785cf83a9 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -126,6 +126,60 @@ xfs_trans_dup(
>  	return ntp;
>  }
>  
> +/*
> + * Try to reserve more blocks for a transaction.  The single use case we
> + * support is for online repair -- use a transaction to gather data without
> + * fear of btree cycle deadlocks; calculate how many blocks we really need
> + * from that data; and only then start modifying data.  This can fail due to
> + * ENOSPC, so we have to be able to cancel the transaction.
> + */
> +int
> +xfs_trans_reserve_more(
> +	struct xfs_trans	*tp,
> +	uint			blocks,
> +	uint			rtextents)

This isn't used in this patch - seems out of place here. Committed
to the wrong patch?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 05/21] xfs: repair the AGI
  2018-06-24 19:24 ` [PATCH 05/21] xfs: repair the AGI Darrick J. Wong
@ 2018-06-27  2:22   ` Dave Chinner
  2018-06-28 21:15   ` Allison Henderson
  1 sibling, 0 replies; 77+ messages in thread
From: Dave Chinner @ 2018-06-27  2:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:24:01PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Rebuild the AGI header items with some help from the rmapbt.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/scrub/agheader_repair.c |  211 ++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/scrub/repair.h          |    2 
>  fs/xfs/scrub/scrub.c           |    2 
>  3 files changed, 214 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c
> index 90e5e6cbc911..61e0134f6f9f 100644
> --- a/fs/xfs/scrub/agheader_repair.c
> +++ b/fs/xfs/scrub/agheader_repair.c
> @@ -698,3 +698,214 @@ xfs_repair_agfl(
>  	xfs_repair_cancel_btree_extents(sc, &agfl_extents);
>  	return error;
>  }
> +
> +/* AGI */
> +
> +enum {
> +	REPAIR_AGI_INOBT = 0,
> +	REPAIR_AGI_FINOBT,
> +	REPAIR_AGI_END,
> +	REPAIR_AGI_MAX
> +};

XFS_BTNUM_INOBT?

Basically same infrastructure questions as last patch.

.....
> +/*
> + * Reinitialize the AGI header, making an in-core copy of the old contents so
> + * that we know which in-core state needs to be reinitialized.
> + */
> +STATIC void
> +xfs_repair_agi_init_header(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agi_bp,
> +	struct xfs_agi			*old_agi)
> +{
> +	struct xfs_agi			*agi = XFS_BUF_TO_AGI(agi_bp);
> +	struct xfs_mount		*mp = sc->mp;
> +
> +	memcpy(old_agi, agi, sizeof(*old_agi));
> +	memset(agi, 0, BBTOB(agi_bp->b_length));
> +	agi->agi_magicnum = cpu_to_be32(XFS_AGI_MAGIC);
> +	agi->agi_versionnum = cpu_to_be32(XFS_AGI_VERSION);
> +	agi->agi_seqno = cpu_to_be32(sc->sa.agno);
> +	agi->agi_length = cpu_to_be32(xfs_ag_block_count(mp, sc->sa.agno));
> +	agi->agi_newino = cpu_to_be32(NULLAGINO);
> +	agi->agi_dirino = cpu_to_be32(NULLAGINO);
> +	if (xfs_sb_version_hascrc(&mp->m_sb))
> +		uuid_copy(&agi->agi_uuid, &mp->m_sb.sb_meta_uuid);
> +
> +	/* We don't know how to fix the unlinked list yet. */
> +	memcpy(&agi->agi_unlinked, &old_agi->agi_unlinked,
> +			sizeof(agi->agi_unlinked));
> +}

and clear pagi_init?

> +/* Update the AGI counters. */
> +STATIC int
> +xfs_repair_agi_update_btree_counters(

_update_counts

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 06/21] xfs: repair free space btrees
  2018-06-24 19:24 ` [PATCH 06/21] xfs: repair free space btrees Darrick J. Wong
@ 2018-06-27  3:21   ` Dave Chinner
  2018-07-04  2:15     ` Darrick J. Wong
  2018-06-30 17:36   ` Allison Henderson
  1 sibling, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-06-27  3:21 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:24:07PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Rebuild the free space btrees from the gaps in the rmap btree.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
......
> +
> +/* Collect an AGFL block for the not-to-release list. */
> +static int
> +xfs_repair_collect_agfl_block(
> +	struct xfs_mount		*mp,
> +	xfs_agblock_t			bno,
> +	void				*priv)

/me now gets confused by agfl code (xfs_repair_agfl_...) collecting btree
blocks, and now the btree code (xfs_repair_collect_agfl... )
collecting agfl blocks.

The naming/namespace collision is not that nice. I think this needs
to be xr_allocbt_collect_agfl_blocks().

/me idly wonders about consistently renaming everything abt, bnbt, cnbt,
fibt, ibt, rmbt and rcbt...

> +/*
> + * Iterate all reverse mappings to find (1) the free extents, (2) the OWN_AG
> + * extents, (3) the rmapbt blocks, and (4) the AGFL blocks.  The free space is
> + * (1) + (2) - (3) - (4).  Figure out if we have enough free space to
> + * reconstruct the free space btrees.  Caller must clean up the input lists
> + * if something goes wrong.
> + */
> +STATIC int
> +xfs_repair_allocbt_find_freespace(
> +	struct xfs_scrub_context	*sc,
> +	struct list_head		*free_extents,
> +	struct xfs_repair_extent_list	*old_allocbt_blocks)
> +{
> +	struct xfs_repair_alloc		ra;
> +	struct xfs_repair_alloc_extent	*rae;
> +	struct xfs_btree_cur		*cur;
> +	struct xfs_mount		*mp = sc->mp;
> +	xfs_agblock_t			agend;
> +	xfs_agblock_t			nr_blocks;
> +	int				error;
> +
> +	ra.extlist = free_extents;
> +	ra.btlist = old_allocbt_blocks;
> +	xfs_repair_init_extent_list(&ra.nobtlist);
> +	ra.next_bno = 0;
> +	ra.nr_records = 0;
> +	ra.nr_blocks = 0;
> +	ra.sc = sc;
> +
> +	/*
> +	 * Iterate all the reverse mappings to find gaps in the physical
> +	 * mappings, all the OWN_AG blocks, and all the rmapbt extents.
> +	 */
> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
> +	error = xfs_rmap_query_all(cur, xfs_repair_alloc_extent_fn, &ra);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +	cur = NULL;
> +
> +	/* Insert a record for space between the last rmap and EOAG. */
> +	agend = be32_to_cpu(XFS_BUF_TO_AGF(sc->sa.agf_bp)->agf_length);
> +	if (ra.next_bno < agend) {
> +		rae = kmem_alloc(sizeof(struct xfs_repair_alloc_extent),
> +				KM_MAYFAIL);
> +		if (!rae) {
> +			error = -ENOMEM;
> +			goto err;
> +		}
> +		INIT_LIST_HEAD(&rae->list);
> +		rae->bno = ra.next_bno;
> +		rae->len = agend - ra.next_bno;
> +		list_add_tail(&rae->list, free_extents);
> +		ra.nr_records++;
> +	}
> +
> +	/* Collect all the AGFL blocks. */
> +	error = xfs_agfl_walk(mp, XFS_BUF_TO_AGF(sc->sa.agf_bp),
> +			sc->sa.agfl_bp, xfs_repair_collect_agfl_block, &ra);
> +	if (error)
> +		goto err;
> +
> +	/* Do we actually have enough space to do this? */
> +	nr_blocks = 2 * xfs_allocbt_calc_size(mp, ra.nr_records);

	/* Do we have enough space to rebuild both freespace trees? */

(explains the multiplication by 2)

> +	if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE) ||
> +	    ra.nr_blocks < nr_blocks) {
> +		error = -ENOSPC;
> +		goto err;
> +	}
> +
> +	/* Compute the old bnobt/cntbt blocks. */
> +	error = xfs_repair_subtract_extents(sc, old_allocbt_blocks,
> +			&ra.nobtlist);
> +	if (error)
> +		goto err;
> +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> +	return 0;
> +
> +err:
> +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> +	if (cur)
> +		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> +	return error;

Error stacking here can be cleaned up - we don't need a separate exit
path for the success case, as the cursor is already NULL by the time
we're finished with it. Hence it could just be:

	/* Compute the old bnobt/cntbt blocks. */
	error = xfs_repair_subtract_extents(sc, old_allocbt_blocks,
			&ra.nobtlist);
err:
	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
	if (cur)
		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
	return error;
}


> +}
> +
> +/*
> + * Reset the global free block counter and the per-AG counters to make it look
> + * like this AG has no free space.
> + */
> +STATIC int
> +xfs_repair_allocbt_reset_counters(
> +	struct xfs_scrub_context	*sc,
> +	int				*log_flags)
> +{
> +	struct xfs_perag		*pag = sc->sa.pag;
> +	struct xfs_agf			*agf;
> +	xfs_extlen_t			oldf;
> +	xfs_agblock_t			rmap_blocks;
> +	int				error;
> +
> +	/*
> +	 * Since we're abandoning the old bnobt/cntbt, we have to
> +	 * decrease fdblocks by the # of blocks in those trees.
> +	 * btreeblks counts the non-root blocks of the free space
> +	 * and rmap btrees.  Do this before resetting the AGF counters.

Comment can use 80 columns.

> +	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
> +	rmap_blocks = be32_to_cpu(agf->agf_rmap_blocks) - 1;
> +	oldf = pag->pagf_btreeblks + 2;
> +	oldf -= rmap_blocks;

Convoluted. The comment really didn't help me understand what oldf
is accounting for.

Ah, rmap_blocks is actually the new btreeblks count. OK.

	/*
	 * Since we're abandoning the old bnobt/cntbt, we have to decrease
	 * fdblocks by the # of blocks in those trees.  btreeblks counts the
	 * non-root blocks of the free space and rmap btrees.  Do this before
	 * resetting the AGF counters.
	 */

	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);

	/* rmap_blocks accounts root block, btreeblks doesn't */
	new_btblks = be32_to_cpu(agf->agf_rmap_blocks) - 1;

	/* btreeblks doesn't account bno/cnt root blocks */
	to_free = pag->pagf_btreeblks + 2;

	/* and don't account for the blocks we aren't freeing */
	to_free -= new_btblks;


> +	error = xfs_mod_fdblocks(sc->mp, -(int64_t)oldf, false);
> +	if (error)
> +		return error;
> +
> +	/* Reset the per-AG info, both incore and ondisk. */
> +	pag->pagf_btreeblks = rmap_blocks;
> +	pag->pagf_freeblks = 0;
> +	pag->pagf_longest = 0;
> +
> +	agf->agf_btreeblks = cpu_to_be32(pag->pagf_btreeblks);

I'd prefer that you use new_btblks here, too. Easier to see at a
glance that the on-disk agf is being set to the new value....


> +	agf->agf_freeblks = 0;
> +	agf->agf_longest = 0;
> +	*log_flags |= XFS_AGF_BTREEBLKS | XFS_AGF_LONGEST | XFS_AGF_FREEBLKS;
> +
> +	return 0;
> +}
> +
> +/* Initialize new bnobt/cntbt roots and implant them into the AGF. */
> +STATIC int
> +xfs_repair_allocbt_reset_btrees(
> +	struct xfs_scrub_context	*sc,
> +	struct list_head		*free_extents,
> +	int				*log_flags)
> +{
> +	struct xfs_owner_info		oinfo;
> +	struct xfs_repair_alloc_extent	*cached = NULL;
> +	struct xfs_buf			*bp;
> +	struct xfs_perag		*pag = sc->sa.pag;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_agf			*agf;
> +	xfs_fsblock_t			bnofsb;
> +	xfs_fsblock_t			cntfsb;
> +	int				error;
> +
> +	/* Allocate new bnobt root. */
> +	bnofsb = xfs_repair_allocbt_alloc_block(sc, free_extents, &cached);
> +	if (bnofsb == NULLFSBLOCK)
> +		return -ENOSPC;

Does this happen after the free extent list has been sorted by bno
order? It really should, that way the new root is as close to the
AGF as possible, and the new btree blocks will also tend to
cluster towards the lower AG offsets.

> +	/* Allocate new cntbt root. */
> +	cntfsb = xfs_repair_allocbt_alloc_block(sc, free_extents, &cached);
> +	if (cntfsb == NULLFSBLOCK)
> +		return -ENOSPC;
> +
> +	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
> +	/* Initialize new bnobt root. */
> +	error = xfs_repair_init_btblock(sc, bnofsb, &bp, XFS_BTNUM_BNO,
> +			&xfs_allocbt_buf_ops);
> +	if (error)
> +		return error;
> +	agf->agf_roots[XFS_BTNUM_BNOi] =
> +			cpu_to_be32(XFS_FSB_TO_AGBNO(mp, bnofsb));
> +	agf->agf_levels[XFS_BTNUM_BNOi] = cpu_to_be32(1);
> +
> +	/* Initialize new cntbt root. */
> +	error = xfs_repair_init_btblock(sc, cntfsb, &bp, XFS_BTNUM_CNT,
> +			&xfs_allocbt_buf_ops);
> +	if (error)
> +		return error;
> +	agf->agf_roots[XFS_BTNUM_CNTi] =
> +			cpu_to_be32(XFS_FSB_TO_AGBNO(mp, cntfsb));
> +	agf->agf_levels[XFS_BTNUM_CNTi] = cpu_to_be32(1);
> +
> +	/* Add rmap records for the btree roots */
> +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
> +	error = xfs_rmap_alloc(sc->tp, sc->sa.agf_bp, sc->sa.agno,
> +			XFS_FSB_TO_AGBNO(mp, bnofsb), 1, &oinfo);
> +	if (error)
> +		return error;
> +	error = xfs_rmap_alloc(sc->tp, sc->sa.agf_bp, sc->sa.agno,
> +			XFS_FSB_TO_AGBNO(mp, cntfsb), 1, &oinfo);
> +	if (error)
> +		return error;
> +
> +	/* Reset the incore state. */
> +	pag->pagf_levels[XFS_BTNUM_BNOi] = 1;
> +	pag->pagf_levels[XFS_BTNUM_CNTi] = 1;
> +
> +	*log_flags |=  XFS_AGF_ROOTS | XFS_AGF_LEVELS;
> +	return 0;

Rather than duplicating all this init code twice, would factoring it
make sense? The only difference between the alloc/init of the two
btrees is the array index that info is stored in....

> +}
> +
> +/* Build new free space btrees and dispose of the old one. */
> +STATIC int
> +xfs_repair_allocbt_rebuild_trees(
> +	struct xfs_scrub_context	*sc,
> +	struct list_head		*free_extents,
> +	struct xfs_repair_extent_list	*old_allocbt_blocks)
> +{
> +	struct xfs_owner_info		oinfo;
> +	struct xfs_repair_alloc_extent	*rae;
> +	struct xfs_repair_alloc_extent	*n;
> +	struct xfs_repair_alloc_extent	*longest;
> +	int				error;
> +
> +	xfs_rmap_skip_owner_update(&oinfo);
> +
> +	/*
> +	 * Insert the longest free extent in case it's necessary to
> +	 * refresh the AGFL with multiple blocks.  If there is no longest
> +	 * extent, we had exactly the free space we needed; we're done.
> +	 */
> +	longest = xfs_repair_allocbt_get_longest(free_extents);
> +	if (!longest)
> +		goto done;
> +	error = xfs_repair_allocbt_free_extent(sc,
> +			XFS_AGB_TO_FSB(sc->mp, sc->sa.agno, longest->bno),
> +			longest->len, &oinfo);
> +	list_del(&longest->list);
> +	kmem_free(longest);
> +	if (error)
> +		return error;
> +
> +	/* Insert records into the new btrees. */
> +	list_sort(NULL, free_extents, xfs_repair_allocbt_extent_cmp);

Hmmm. I guess list sorting doesn't occur before allocating new root
blocks. Can this get moved?

....

> +bool
> +xfs_extent_busy_list_empty(
> +	struct xfs_perag	*pag);

One line form for header prototypes, please.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-27  2:19   ` Dave Chinner
@ 2018-06-27 16:44     ` Allison Henderson
  2018-06-27 23:37       ` Dave Chinner
  2018-06-28 17:25     ` Allison Henderson
  1 sibling, 1 reply; 77+ messages in thread
From: Allison Henderson @ 2018-06-27 16:44 UTC (permalink / raw)
  To: Dave Chinner, Darrick J. Wong; +Cc: linux-xfs



On 06/26/2018 07:19 PM, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:23:54PM -0700, Darrick J. Wong wrote:
>> From: Darrick J. Wong <darrick.wong@oracle.com>
>>
>> Regenerate the AGF and AGFL from the rmap data.
>>
>> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> [...]
> 
>> +/* Information for finding AGF-rooted btrees */
>> +enum {
>> +	REPAIR_AGF_BNOBT = 0,
>> +	REPAIR_AGF_CNTBT,
>> +	REPAIR_AGF_RMAPBT,
>> +	REPAIR_AGF_REFCOUNTBT,
>> +	REPAIR_AGF_END,
>> +	REPAIR_AGF_MAX
>> +};
> 
> Why can't you just use XFS_BTNUM_* for these btree type descriptors?
> 
>> +
>> +static const struct xfs_repair_find_ag_btree repair_agf[] = {
>> +	[REPAIR_AGF_BNOBT] = {
>> +		.rmap_owner = XFS_RMAP_OWN_AG,
>> +		.buf_ops = &xfs_allocbt_buf_ops,
>> +		.magic = XFS_ABTB_CRC_MAGIC,
>> +	},
>> +	[REPAIR_AGF_CNTBT] = {
>> +		.rmap_owner = XFS_RMAP_OWN_AG,
>> +		.buf_ops = &xfs_allocbt_buf_ops,
>> +		.magic = XFS_ABTC_CRC_MAGIC,
>> +	},
> 
> I had to stop and think about why this only supports the v5 types.
> i.e. we're rebuilding from rmap info, so this will never run on v4
> filesystems, hence we only care about v5 types (i.e. *CRC_MAGIC).
> Perhaps a one-line comment to remind readers of this?
> 
>> +	[REPAIR_AGF_RMAPBT] = {
>> +		.rmap_owner = XFS_RMAP_OWN_AG,
>> +		.buf_ops = &xfs_rmapbt_buf_ops,
>> +		.magic = XFS_RMAP_CRC_MAGIC,
>> +	},
>> +	[REPAIR_AGF_REFCOUNTBT] = {
>> +		.rmap_owner = XFS_RMAP_OWN_REFC,
>> +		.buf_ops = &xfs_refcountbt_buf_ops,
>> +		.magic = XFS_REFC_CRC_MAGIC,
>> +	},
>> +	[REPAIR_AGF_END] = {
>> +		.buf_ops = NULL,
>> +	},
>> +};
>> +
>> +/*
>> + * Find the btree roots.  This is /also/ a chicken and egg problem because we
>> + * have to use the rmapbt (rooted in the AGF) to find the btrees rooted in the
>> + * AGF.  We also have no idea if the btrees make any sense.  If we hit obvious
>> + * corruptions in those btrees we'll bail out.
>> + */
>> +STATIC int
>> +xfs_repair_agf_find_btrees(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp,
>> +	struct xfs_repair_find_ag_btree	*fab,
>> +	struct xfs_buf			*agfl_bp)
>> +{
>> +	struct xfs_agf			*old_agf = XFS_BUF_TO_AGF(agf_bp);
>> +	int				error;
>> +
>> +	/* Go find the root data. */
>> +	memcpy(fab, repair_agf, sizeof(repair_agf));
> 
> Why are we initialising fab here, instead of in the caller where it
> is declared and passed to various functions? Given there is only a
> single declaration of this structure, why do we need a global static
> const table initialiser just to copy it here - why isn't it
> initialised at the declaration point?
> 
>> +	error = xfs_repair_find_ag_btree_roots(sc, agf_bp, fab, agfl_bp);
>> +	if (error)
>> +		return error;
>> +
>> +	/* We must find the bnobt, cntbt, and rmapbt roots. */
>> +	if (fab[REPAIR_AGF_BNOBT].root == NULLAGBLOCK ||
>> +	    fab[REPAIR_AGF_BNOBT].height > XFS_BTREE_MAXLEVELS ||
>> +	    fab[REPAIR_AGF_CNTBT].root == NULLAGBLOCK ||
>> +	    fab[REPAIR_AGF_CNTBT].height > XFS_BTREE_MAXLEVELS ||
>> +	    fab[REPAIR_AGF_RMAPBT].root == NULLAGBLOCK ||
>> +	    fab[REPAIR_AGF_RMAPBT].height > XFS_BTREE_MAXLEVELS)
>> +		return -EFSCORRUPTED;
>> +
>> +	/*
>> +	 * We relied on the rmapbt to reconstruct the AGF.  If we get a
>> +	 * different root then something's seriously wrong.
>> +	 */
>> +	if (fab[REPAIR_AGF_RMAPBT].root !=
>> +	    be32_to_cpu(old_agf->agf_roots[XFS_BTNUM_RMAPi]))
>> +		return -EFSCORRUPTED;
>> +
>> +	/* We must find the refcountbt root if that feature is enabled. */
>> +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb) &&
>> +	    (fab[REPAIR_AGF_REFCOUNTBT].root == NULLAGBLOCK ||
>> +	     fab[REPAIR_AGF_REFCOUNTBT].height > XFS_BTREE_MAXLEVELS))
>> +		return -EFSCORRUPTED;
>> +
>> +	return 0;
>> +}
>> +
>> +/* Set btree root information in an AGF. */
>> +STATIC void
>> +xfs_repair_agf_set_roots(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_agf			*agf,
>> +	struct xfs_repair_find_ag_btree	*fab)
>> +{
>> +	agf->agf_roots[XFS_BTNUM_BNOi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].root);
>> +	agf->agf_levels[XFS_BTNUM_BNOi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].height);
>> +
>> +	agf->agf_roots[XFS_BTNUM_CNTi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].root);
>> +	agf->agf_levels[XFS_BTNUM_CNTi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].height);
>> +
>> +	agf->agf_roots[XFS_BTNUM_RMAPi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].root);
>> +	agf->agf_levels[XFS_BTNUM_RMAPi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].height);
>> +
>> +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb)) {
>> +		agf->agf_refcount_root =
>> +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].root);
>> +		agf->agf_refcount_level =
>> +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].height);
>> +	}
>> +}
>> +
>> +/*
>> + * Reinitialize the AGF header, making an in-core copy of the old contents so
>> + * that we know which in-core state needs to be reinitialized.
>> + */
>> +STATIC void
>> +xfs_repair_agf_init_header(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp,
>> +	struct xfs_agf			*old_agf)
>> +{
>> +	struct xfs_mount		*mp = sc->mp;
>> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
>> +
>> +	memcpy(old_agf, agf, sizeof(*old_agf));
>> +	memset(agf, 0, BBTOB(agf_bp->b_length));
>> +	agf->agf_magicnum = cpu_to_be32(XFS_AGF_MAGIC);
>> +	agf->agf_versionnum = cpu_to_be32(XFS_AGF_VERSION);
>> +	agf->agf_seqno = cpu_to_be32(sc->sa.agno);
>> +	agf->agf_length = cpu_to_be32(xfs_ag_block_count(mp, sc->sa.agno));
>> +	agf->agf_flfirst = old_agf->agf_flfirst;
>> +	agf->agf_fllast = old_agf->agf_fllast;
>> +	agf->agf_flcount = old_agf->agf_flcount;
>> +	if (xfs_sb_version_hascrc(&mp->m_sb))
>> +		uuid_copy(&agf->agf_uuid, &mp->m_sb.sb_meta_uuid);
>> +}
> 
> Do we need to clear pag->pagf_init here so that it gets
> re-initialised next time someone reads the AGF?
> 
>> +
>> +/* Update the AGF btree counters by walking the btrees. */
>> +STATIC int
>> +xfs_repair_agf_update_btree_counters(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp)
>> +{
>> +	struct xfs_repair_agf_allocbt	raa = { .sc = sc };
>> +	struct xfs_btree_cur		*cur = NULL;
>> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
>> +	struct xfs_mount		*mp = sc->mp;
>> +	xfs_agblock_t			btreeblks;
>> +	xfs_agblock_t			blocks;
>> +	int				error;
>> +
>> +	/* Update the AGF counters from the bnobt. */
>> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
>> +			XFS_BTNUM_BNO);
>> +	error = xfs_alloc_query_all(cur, xfs_repair_agf_walk_allocbt, &raa);
>> +	if (error)
>> +		goto err;
>> +	error = xfs_btree_count_blocks(cur, &blocks);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +	btreeblks = blocks - 1;
>> +	agf->agf_freeblks = cpu_to_be32(raa.freeblks);
>> +	agf->agf_longest = cpu_to_be32(raa.longest);
> 
> This function updates more than the AGF btree counters. :P
> 
>> +
>> +	/* Update the AGF counters from the cntbt. */
>> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
>> +			XFS_BTNUM_CNT);
>> +	error = xfs_btree_count_blocks(cur, &blocks);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +	btreeblks += blocks - 1;
>> +
>> +	/* Update the AGF counters from the rmapbt. */
>> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
>> +	error = xfs_btree_count_blocks(cur, &blocks);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +	agf->agf_rmap_blocks = cpu_to_be32(blocks);
>> +	btreeblks += blocks - 1;
>> +
>> +	agf->agf_btreeblks = cpu_to_be32(btreeblks);
>> +
>> +	/* Update the AGF counters from the refcountbt. */
>> +	if (xfs_sb_version_hasreflink(&mp->m_sb)) {
>> +		cur = xfs_refcountbt_init_cursor(mp, sc->tp, agf_bp,
>> +				sc->sa.agno, NULL);
>> +		error = xfs_btree_count_blocks(cur, &blocks);
>> +		if (error)
>> +			goto err;
>> +		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +		agf->agf_refcount_blocks = cpu_to_be32(blocks);
>> +	}
>> +
>> +	return 0;
>> +err:
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
>> +	return error;
>> +}
>> +
>> +/* Trigger reinitialization of the in-core data. */
>> +STATIC int
>> +xfs_repair_agf_reinit_incore(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_agf			*agf,
>> +	const struct xfs_agf		*old_agf)
>> +{
>> +	struct xfs_perag		*pag;
>> +
>> +	/* XXX: trigger fdblocks recalculation */
>> +
>> +	/* Now reinitialize the in-core counters if necessary. */
>> +	pag = sc->sa.pag;
>> +	if (!pag->pagf_init)
>> +		return 0;
>> +
>> +	pag->pagf_btreeblks = be32_to_cpu(agf->agf_btreeblks);
>> +	pag->pagf_freeblks = be32_to_cpu(agf->agf_freeblks);
>> +	pag->pagf_longest = be32_to_cpu(agf->agf_longest);
>> +	pag->pagf_levels[XFS_BTNUM_BNOi] =
>> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNOi]);
>> +	pag->pagf_levels[XFS_BTNUM_CNTi] =
>> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNTi]);
>> +	pag->pagf_levels[XFS_BTNUM_RMAPi] =
>> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAPi]);
>> +	pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
> 
> Ok, so we reinit the pagf bits here, but....
> 
>> +
>> +	return 0;
>> +}
>> +
>> +/* Repair the AGF. */
>> +int
>> +xfs_repair_agf(
>> +	struct xfs_scrub_context	*sc)
>> +{
>> +	struct xfs_repair_find_ag_btree	fab[REPAIR_AGF_MAX];
>> +	struct xfs_agf			old_agf;
>> +	struct xfs_mount		*mp = sc->mp;
>> +	struct xfs_buf			*agf_bp;
>> +	struct xfs_buf			*agfl_bp;
>> +	struct xfs_agf			*agf;
>> +	int				error;
>> +
>> +	/* We require the rmapbt to rebuild anything. */
>> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
>> +		return -EOPNOTSUPP;
>> +
>> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
>> +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
>> +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGF_DADDR(mp)),
>> +			XFS_FSS_TO_BB(mp, 1), 0, &agf_bp, NULL);
>> +	if (error)
>> +		return error;
>> +	agf_bp->b_ops = &xfs_agf_buf_ops;
>> +	agf = XFS_BUF_TO_AGF(agf_bp);
>> +
>> +	/*
>> +	 * Load the AGFL so that we can screen out OWN_AG blocks that are on
>> +	 * the AGFL now; these blocks might have once been part of the
>> +	 * bno/cnt/rmap btrees but are not now.  This is a chicken and egg
>> +	 * problem: the AGF is corrupt, so we have to trust the AGFL contents
>> +	 * because we can't do any serious cross-referencing with any of the
>> +	 * btrees rooted in the AGF.  If the AGFL contents are obviously bad
>> +	 * then we'll bail out.
>> +	 */
>> +	error = xfs_alloc_read_agfl(mp, sc->tp, sc->sa.agno, &agfl_bp);
>> +	if (error)
>> +		return error;
>> +
>> +	/*
>> +	 * Spot-check the AGFL blocks; if they're obviously corrupt then
>> +	 * there's nothing we can do but bail out.
>> +	 */
>> +	error = xfs_agfl_walk(sc->mp, XFS_BUF_TO_AGF(agf_bp), agfl_bp,
>> +			xfs_repair_agf_check_agfl_block, sc);
>> +	if (error)
>> +		return error;
>> +
>> +	/*
>> +	 * Find the AGF btree roots.  See the comment for this function for
>> +	 * more information about the limitations of this repairer; this is
>> +	 * also a chicken-and-egg situation.
>> +	 */
>> +	error = xfs_repair_agf_find_btrees(sc, agf_bp, fab, agfl_bp);
>> +	if (error)
>> +		return error;
> 
> Comment could be better written.
> 
> 	/*
> 	 * Find the AGF btree roots. This is also a chicken-and-egg
> 	 * situation - see xfs_repair_agf_find_btrees() for details.
> 	 */
> 
>> +
>> +	/* Start rewriting the header and implant the btrees we found. */
>> +	xfs_repair_agf_init_header(sc, agf_bp, &old_agf);
>> +	xfs_repair_agf_set_roots(sc, agf, fab);
>> +	error = xfs_repair_agf_update_btree_counters(sc, agf_bp);
>> +	if (error)
>> +		goto out_revert;
> 
> If we fail here, the pagf information is invalid, hence I think we
> really do need to clear pagf_init before we start rebuilding the new
> AGF. Yes, I can see we revert the AGF info, but this seems like a
> landmine waiting to be tripped over.
> 
>> +	/* Reinitialize in-core state. */
>> +	error = xfs_repair_agf_reinit_incore(sc, agf, &old_agf);
>> +	if (error)
>> +		goto out_revert;
>> +
>> +	/* Write this to disk. */
>> +	xfs_trans_buf_set_type(sc->tp, agf_bp, XFS_BLFT_AGF_BUF);
>> +	xfs_trans_log_buf(sc->tp, agf_bp, 0, BBTOB(agf_bp->b_length) - 1);
>> +	return 0;
>> +
>> +out_revert:
>> +	memcpy(agf, &old_agf, sizeof(old_agf));
>> +	return error;
>> +}
>> +
>> +/* AGFL */
>> +
>> +struct xfs_repair_agfl {
>> +	struct xfs_repair_extent_list	agmeta_list;
>> +	struct xfs_repair_extent_list	*freesp_list;
>> +	struct xfs_scrub_context	*sc;
>> +};
>> +
>> +/* Record all freespace information. */
>> +STATIC int
>> +xfs_repair_agfl_rmap_fn(
>> +	struct xfs_btree_cur		*cur,
>> +	struct xfs_rmap_irec		*rec,
>> +	void				*priv)
>> +{
>> +	struct xfs_repair_agfl		*ra = priv;
>> +	xfs_fsblock_t			fsb;
>> +	int				error = 0;
>> +
>> +	if (xfs_scrub_should_terminate(ra->sc, &error))
>> +		return error;
>> +
>> +	/* Record all the OWN_AG blocks. */
>> +	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
>> +		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
>> +				rec->rm_startblock);
>> +		error = xfs_repair_collect_btree_extent(ra->sc,
>> +				ra->freesp_list, fsb, rec->rm_blockcount);
>> +		if (error)
>> +			return error;
>> +	}
>> +
>> +	return xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
>> +			xfs_repair_collect_btree_cur_blocks_in_extent_list,
> 
> Urk. The function name lengths are getting out of hand. I'm very
> tempted to suggest we should shorten the namespace of all this
> like s/xfs_repair_/xr_/ and s/xfs_scrub_/xs_/, etc just to make them
> shorter and easier to read.
> 
> Oh, wait, did I say that out loud? :P
> 
> Something to think about, anyway.
> 
Well they are sort of long, but TBH I think I still kind of appreciate 
the extra verbiage.  I have seen other projects do things like adopt a 
sort of 3 or 4 letter abbreviation (like maybe xfs_scrb or xfs_repr), 
which helps to cut down on the verbosity while still not losing too 
much of what it is supposed to mean.  Just another idea to consider. :-)

>> +			&ra->agmeta_list);
>> +}
>> +
>> +/* Add a btree block to the agmeta list. */
>> +STATIC int
>> +xfs_repair_agfl_visit_btblock(
> 
> I find the name a bit confusing - AGFLs don't have btree blocks.
> Yes, I know that it's a xfs_btree_visit_blocks() callback but I
> think s/visit/collect/ makes more sense. i.e. it tells us what we
> are doing with the btree block, rather than making it sound like we
> are walking AGFL btree blocks...
> 
>> +/*
>> + * Map out all the non-AGFL OWN_AG space in this AG so that we can deduce
>> + * which blocks belong to the AGFL.
>> + */
>> +STATIC int
>> +xfs_repair_agfl_find_extents(
> 
> Same here - xr_agfl_collect_free_extents()?
> 
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp,
>> +	struct xfs_repair_extent_list	*agfl_extents,
>> +	xfs_agblock_t			*flcount)
>> +{
>> +	struct xfs_repair_agfl		ra;
>> +	struct xfs_mount		*mp = sc->mp;
>> +	struct xfs_btree_cur		*cur;
>> +	struct xfs_repair_extent	*rae;
>> +	int				error;
>> +
>> +	ra.sc = sc;
>> +	ra.freesp_list = agfl_extents;
>> +	xfs_repair_init_extent_list(&ra.agmeta_list);
>> +
>> +	/* Find all space used by the free space btrees & rmapbt. */
>> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
>> +	error = xfs_rmap_query_all(cur, xfs_repair_agfl_rmap_fn, &ra);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +
>> +	/* Find all space used by bnobt. */
> 
> Needs clarification.
> 
> 	/* Find all the in use bnobt blocks */
> 
>> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
>> +			XFS_BTNUM_BNO);
>> +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +
>> +	/* Find all space used by cntbt. */
> 
> 	/* Find all the in use cntbt blocks */
> 
>> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
>> +			XFS_BTNUM_CNT);
>> +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
>> +	if (error)
>> +		goto err;
>> +
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +
>> +	/*
>> +	 * Drop the freesp meta blocks that are in use by btrees.
>> +	 * The remaining blocks /should/ be AGFL blocks.
>> +	 */
>> +	error = xfs_repair_subtract_extents(sc, agfl_extents, &ra.agmeta_list);
>> +	xfs_repair_cancel_btree_extents(sc, &ra.agmeta_list);
>> +	if (error)
>> +		return error;
>> +
>> +	/* Calculate the new AGFL size. */
>> +	*flcount = 0;
>> +	for_each_xfs_repair_extent(rae, agfl_extents) {
>> +		*flcount += rae->len;
>> +		if (*flcount > xfs_agfl_size(mp))
>> +			break;
>> +	}
>> +	if (*flcount > xfs_agfl_size(mp))
>> +		*flcount = xfs_agfl_size(mp);
> 
> Ok, so flcount is clamped here. What happens to all the remaining
> agfl_extents beyond flcount?
> 
>> +	return 0;
>> +
>> +err:
> 
> Ok, what cleans up all the extents we've recorded in ra on error?
> 
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
>> +	return error;
>> +}
>> +
>> +/* Update the AGF and reset the in-core state. */
>> +STATIC int
>> +xfs_repair_agfl_update_agf(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp,
>> +	xfs_agblock_t			flcount)
>> +{
>> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
>> +
> 	ASSERT(flcount <= xfs_agfl_size(mp));
> 
>> +	/* XXX: trigger fdblocks recalculation */
>> +
>> +	/* Update the AGF counters. */
>> +	if (sc->sa.pag->pagf_init)
>> +		sc->sa.pag->pagf_flcount = flcount;
>> +	agf->agf_flfirst = cpu_to_be32(0);
>> +	agf->agf_flcount = cpu_to_be32(flcount);
>> +	agf->agf_fllast = cpu_to_be32(flcount - 1);
>> +
>> +	xfs_alloc_log_agf(sc->tp, agf_bp,
>> +			XFS_AGF_FLFIRST | XFS_AGF_FLLAST | XFS_AGF_FLCOUNT);
>> +	return 0;
>> +}
>> +
>> +/* Write out a totally new AGFL. */
>> +STATIC void
>> +xfs_repair_agfl_init_header(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agfl_bp,
>> +	struct xfs_repair_extent_list	*agfl_extents,
>> +	xfs_agblock_t			flcount)
>> +{
>> +	struct xfs_mount		*mp = sc->mp;
>> +	__be32				*agfl_bno;
>> +	struct xfs_repair_extent	*rae;
>> +	struct xfs_repair_extent	*n;
>> +	struct xfs_agfl			*agfl;
>> +	xfs_agblock_t			agbno;
>> +	unsigned int			fl_off;
>> +
> 	ASSERT(flcount <= xfs_agfl_size(mp));
> 
>> +	/* Start rewriting the header. */
>> +	agfl = XFS_BUF_TO_AGFL(agfl_bp);
>> +	memset(agfl, 0xFF, BBTOB(agfl_bp->b_length));
>> +	agfl->agfl_magicnum = cpu_to_be32(XFS_AGFL_MAGIC);
>> +	agfl->agfl_seqno = cpu_to_be32(sc->sa.agno);
>> +	uuid_copy(&agfl->agfl_uuid, &mp->m_sb.sb_meta_uuid);
>> +
>> +	/* Fill the AGFL with the remaining blocks. */
>> +	fl_off = 0;
>> +	agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agfl_bp);
>> +	for_each_xfs_repair_extent_safe(rae, n, agfl_extents) {
>> +		agbno = XFS_FSB_TO_AGBNO(mp, rae->fsbno);
>> +
>> +		trace_xfs_repair_agfl_insert(mp, sc->sa.agno, agbno, rae->len);
>> +
>> +		while (rae->len > 0 && fl_off < flcount) {
>> +			agfl_bno[fl_off] = cpu_to_be32(agbno);
>> +			fl_off++;
>> +			agbno++;
>> +			rae->fsbno++;
>> +			rae->len--;
>> +		}
> 
> This only works correctly if flcount <= xfs_agfl_size, which is why
> I'm suggesting some asserts.
> 
>> +
>> +		if (rae->len)
>> +			break;
>> +		list_del(&rae->list);
>> +		kmem_free(rae);
>> +	}
>> +
>> +	/* Write AGF and AGFL to disk. */
>> +	xfs_trans_buf_set_type(sc->tp, agfl_bp, XFS_BLFT_AGFL_BUF);
>> +	xfs_trans_log_buf(sc->tp, agfl_bp, 0, BBTOB(agfl_bp->b_length) - 1);
>> +}
>> +
>> +/* Repair the AGFL. */
>> +int
>> +xfs_repair_agfl(
>> +	struct xfs_scrub_context	*sc)
>> +{
>> +	struct xfs_owner_info		oinfo;
>> +	struct xfs_repair_extent_list	agfl_extents;
>> +	struct xfs_mount		*mp = sc->mp;
>> +	struct xfs_buf			*agf_bp;
>> +	struct xfs_buf			*agfl_bp;
>> +	xfs_agblock_t			flcount;
>> +	int				error;
>> +
>> +	/* We require the rmapbt to rebuild anything. */
>> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
>> +		return -EOPNOTSUPP;
>> +
>> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
>> +	xfs_repair_init_extent_list(&agfl_extents);
>> +
>> +	/*
>> +	 * Read the AGF so that we can query the rmapbt.  We hope that there's
>> +	 * nothing wrong with the AGF, but all the AG header repair functions
>> +	 * have this chicken-and-egg problem.
>> +	 */
>> +	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
>> +	if (error)
>> +		return error;
>> +	if (!agf_bp)
>> +		return -ENOMEM;
>> +
>> +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
>> +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGFL_DADDR(mp)),
>> +			XFS_FSS_TO_BB(mp, 1), 0, &agfl_bp, NULL);
>> +	if (error)
>> +		return error;
>> +	agfl_bp->b_ops = &xfs_agfl_buf_ops;
>> +
>> +	/*
>> +	 * Compute the set of old AGFL blocks by subtracting from the list of
>> +	 * OWN_AG blocks the list of blocks owned by all other OWN_AG metadata
>> +	 * (bnobt, cntbt, rmapbt).  These are the old AGFL blocks, so return
>> +	 * that list and the number of blocks we're actually going to put back
>> +	 * on the AGFL.
>> +	 */
> 
> That comment belongs on the function, not here. All we need here is
> something like:
> 
> 	/* Gather all the extents we're going to put on the new AGFL. */
> 
>> +	error = xfs_repair_agfl_find_extents(sc, agf_bp, &agfl_extents,
>> +			&flcount);
>> +	if (error)
>> +		goto err;
>> +
>> +	/*
>> +	 * Update AGF and AGFL.  We reset the global free block counter when
>> +	 * we adjust the AGF flcount (which can fail) so avoid updating any
>> +	 * bufers until we know that part works.
> 
> buffers
> 
>> +	 */
>> +	error = xfs_repair_agfl_update_agf(sc, agf_bp, flcount);
>> +	if (error)
>> +		goto err;
>> +	xfs_repair_agfl_init_header(sc, agfl_bp, &agfl_extents, flcount);
>> +
>> +	/*
>> +	 * Ok, the AGFL should be ready to go now.  Roll the transaction so
>> +	 * that we can free any AGFL overflow.
>> +	 */
> 
> Why does rolling the transaction allow us to free the overflow?
> Shouldn't the comment say something like "Roll the transaction to
> make the new AGFL permanent before we start using it when returning
> the residual AGFL freespace overflow back to the AGF freespace
> btrees."
> 
>> +	sc->sa.agf_bp = agf_bp;
>> +	sc->sa.agfl_bp = agfl_bp;
>> +	error = xfs_repair_roll_ag_trans(sc);
>> +	if (error)
>> +		goto err;
>> +
>> +	/* Dump any AGFL overflow. */
>> +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
>> +	return xfs_repair_reap_btree_extents(sc, &agfl_extents, &oinfo,
>> +			XFS_AG_RESV_AGFL);
>> +err:
>> +	xfs_repair_cancel_btree_extents(sc, &agfl_extents);
>> +	return error;
>> +}
>> diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
>> index 326be4e8b71e..bcdaa8df18f6 100644
>> --- a/fs/xfs/scrub/repair.c
>> +++ b/fs/xfs/scrub/repair.c
>> @@ -127,9 +127,12 @@ xfs_repair_roll_ag_trans(
>>   	int				error;
>>   
>>   	/* Keep the AG header buffers locked so we can keep going. */
>> -	xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
>> -	xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
>> -	xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
>> +	if (sc->sa.agi_bp)
>> +		xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
>> +	if (sc->sa.agf_bp)
>> +		xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
>> +	if (sc->sa.agfl_bp)
>> +		xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
>>   
>>   	/* Roll the transaction. */
>>   	error = xfs_trans_roll(&sc->tp);
>> @@ -137,9 +140,12 @@ xfs_repair_roll_ag_trans(
>>   		goto out_release;
>>   
>>   	/* Join AG headers to the new transaction. */
>> -	xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
>> -	xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
>> -	xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
>> +	if (sc->sa.agi_bp)
>> +		xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
>> +	if (sc->sa.agf_bp)
>> +		xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
>> +	if (sc->sa.agfl_bp)
>> +		xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
>>   
>>   	return 0;
>>   
>> @@ -149,9 +155,12 @@ xfs_repair_roll_ag_trans(
>>   	 * buffers will be released during teardown on our way out
>>   	 * of the kernel.
>>   	 */
>> -	xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
>> -	xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
>> -	xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
>> +	if (sc->sa.agi_bp)
>> +		xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
>> +	if (sc->sa.agf_bp)
>> +		xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
>> +	if (sc->sa.agfl_bp)
>> +		xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
>>   
>>   	return error;
>>   }
>> @@ -408,6 +417,85 @@ xfs_repair_collect_btree_extent(
>>   	return 0;
>>   }
>>   
>> +/*
>> + * Help record all btree blocks seen while iterating all records of a btree.
>> + *
>> + * We know that the btree query_all function starts at the left edge and walks
>> + * towards the right edge of the tree.  Therefore, we know that we can walk up
>> + * the btree cursor towards the root; if the pointer for a given level points
>> + * to the first record/key in that block, we haven't seen this block before;
>> + * and therefore we need to remember that we saw this block in the btree.
>> + *
>> + * So if our btree is:
>> + *
>> + *    4
>> + *  / | \
>> + * 1  2  3
>> + *
>> + * Pretend for this example that each leaf block has 100 btree records.  For
>> + * the first btree record, we'll observe that bc_ptrs[0] == 1, so we record
>> + * that we saw block 1.  Then we observe that bc_ptrs[1] == 1, so we record
>> + * block 4.  The list is [1, 4].
>> + *
>> + * For the second btree record, we see that bc_ptrs[0] == 2, so we exit the
>> + * loop.  The list remains [1, 4].
>> + *
>> + * For the 101st btree record, we've moved onto leaf block 2.  Now
>> + * bc_ptrs[0] == 1 again, so we record that we saw block 2.  We see that
>> + * bc_ptrs[1] == 2, so we exit the loop.  The list is now [1, 4, 2].
>> + *
>> + * For the 102nd record, bc_ptrs[0] == 2, so we continue.
>> + *
>> + * For the 201st record, we've moved on to leaf block 3.  bc_ptrs[0] == 1, so
>> + * we add 3 to the list.  Now it is [1, 4, 2, 3].
>> + *
>> + * For the 300th record we just exit, with the list being [1, 4, 2, 3].
>> + *
>> + * The *iter_fn can return XFS_BTREE_QUERY_RANGE_ABORT to stop, 0 to keep
>> + * iterating, or the usual negative error code.
>> + */
>> +int
>> +xfs_repair_collect_btree_cur_blocks(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_btree_cur		*cur,
>> +	int				(*iter_fn)(struct xfs_scrub_context *sc,
>> +						   xfs_fsblock_t fsbno,
>> +						   xfs_fsblock_t len,
>> +						   void *priv),
>> +	void				*priv)
>> +{
>> +	struct xfs_buf			*bp;
>> +	xfs_fsblock_t			fsb;
>> +	int				i;
>> +	int				error;
>> +
>> +	for (i = 0; i < cur->bc_nlevels && cur->bc_ptrs[i] == 1; i++) {
>> +		xfs_btree_get_block(cur, i, &bp);
>> +		if (!bp)
>> +			continue;
>> +		fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
>> +		error = iter_fn(sc, fsb, 1, priv);
>> +		if (error)
>> +			return error;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + * Simple adapter to connect xfs_repair_collect_btree_extent to
>> + * xfs_repair_collect_btree_cur_blocks.
>> + */
>> +int
>> +xfs_repair_collect_btree_cur_blocks_in_extent_list(
>> +	struct xfs_scrub_context	*sc,
>> +	xfs_fsblock_t			fsbno,
>> +	xfs_fsblock_t			len,
>> +	void				*priv)
>> +{
>> +	return xfs_repair_collect_btree_extent(sc, priv, fsbno, len);
>> +}
>> +
>>   /*
>>    * An error happened during the rebuild so the transaction will be cancelled.
>>    * The fs will shut down, and the administrator has to unmount and run repair.
>> diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
>> index ef47826b6725..f2af5923aa75 100644
>> --- a/fs/xfs/scrub/repair.h
>> +++ b/fs/xfs/scrub/repair.h
>> @@ -48,9 +48,20 @@ xfs_repair_init_extent_list(
>>   
>>   #define for_each_xfs_repair_extent_safe(rbe, n, exlist) \
>>   	list_for_each_entry_safe((rbe), (n), &(exlist)->list, list)
>> +#define for_each_xfs_repair_extent(rbe, exlist) \
>> +	list_for_each_entry((rbe), &(exlist)->list, list)
>>   int xfs_repair_collect_btree_extent(struct xfs_scrub_context *sc,
>>   		struct xfs_repair_extent_list *btlist, xfs_fsblock_t fsbno,
>>   		xfs_extlen_t len);
>> +int xfs_repair_collect_btree_cur_blocks(struct xfs_scrub_context *sc,
>> +		struct xfs_btree_cur *cur,
>> +		int (*iter_fn)(struct xfs_scrub_context *sc,
>> +			       xfs_fsblock_t fsbno, xfs_fsblock_t len,
>> +			       void *priv),
>> +		void *priv);
>> +int xfs_repair_collect_btree_cur_blocks_in_extent_list(
>> +		struct xfs_scrub_context *sc, xfs_fsblock_t fsbno,
>> +		xfs_fsblock_t len, void *priv);
>>   void xfs_repair_cancel_btree_extents(struct xfs_scrub_context *sc,
>>   		struct xfs_repair_extent_list *btlist);
>>   int xfs_repair_subtract_extents(struct xfs_scrub_context *sc,
>> @@ -89,6 +100,8 @@ int xfs_repair_ino_dqattach(struct xfs_scrub_context *sc);
>>   
>>   int xfs_repair_probe(struct xfs_scrub_context *sc);
>>   int xfs_repair_superblock(struct xfs_scrub_context *sc);
>> +int xfs_repair_agf(struct xfs_scrub_context *sc);
>> +int xfs_repair_agfl(struct xfs_scrub_context *sc);
>>   
>>   #else
>>   
>> @@ -112,6 +125,8 @@ xfs_repair_calc_ag_resblks(
>>   
>>   #define xfs_repair_probe		xfs_repair_notsupported
>>   #define xfs_repair_superblock		xfs_repair_notsupported
>> +#define xfs_repair_agf			xfs_repair_notsupported
>> +#define xfs_repair_agfl			xfs_repair_notsupported
>>   
>>   #endif /* CONFIG_XFS_ONLINE_REPAIR */
>>   
>> diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
>> index 58ae76b3a421..8e11c3c699fb 100644
>> --- a/fs/xfs/scrub/scrub.c
>> +++ b/fs/xfs/scrub/scrub.c
>> @@ -208,13 +208,13 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
>>   		.type	= ST_PERAG,
>>   		.setup	= xfs_scrub_setup_fs,
>>   		.scrub	= xfs_scrub_agf,
>> -		.repair	= xfs_repair_notsupported,
>> +		.repair	= xfs_repair_agf,
>>   	},
>>   	[XFS_SCRUB_TYPE_AGFL]= {	/* agfl */
>>   		.type	= ST_PERAG,
>>   		.setup	= xfs_scrub_setup_fs,
>>   		.scrub	= xfs_scrub_agfl,
>> -		.repair	= xfs_repair_notsupported,
>> +		.repair	= xfs_repair_agfl,
>>   	},
>>   	[XFS_SCRUB_TYPE_AGI] = {	/* agi */
>>   		.type	= ST_PERAG,
>> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
>> index 524f543c5b82..c08785cf83a9 100644
>> --- a/fs/xfs/xfs_trans.c
>> +++ b/fs/xfs/xfs_trans.c
>> @@ -126,6 +126,60 @@ xfs_trans_dup(
>>   	return ntp;
>>   }
>>   
>> +/*
>> + * Try to reserve more blocks for a transaction.  The single use case we
>> + * support is for online repair -- use a transaction to gather data without
>> + * fear of btree cycle deadlocks; calculate how many blocks we really need
>> + * from that data; and only then start modifying data.  This can fail due to
>> + * ENOSPC, so we have to be able to cancel the transaction.
>> + */
>> +int
>> +xfs_trans_reserve_more(
>> +	struct xfs_trans	*tp,
>> +	uint			blocks,
>> +	uint			rtextents)
> 
> This isn't used in this patch - seems out of place here. Committed
> to the wrong patch?
> 
> Cheers,
> 
> Dave.
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-27 16:44     ` Allison Henderson
@ 2018-06-27 23:37       ` Dave Chinner
  2018-06-29 15:14         ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-06-27 23:37 UTC (permalink / raw)
  To: Allison Henderson; +Cc: Darrick J. Wong, linux-xfs

On Wed, Jun 27, 2018 at 09:44:53AM -0700, Allison Henderson wrote:
> On 06/26/2018 07:19 PM, Dave Chinner wrote:
> >On Sun, Jun 24, 2018 at 12:23:54PM -0700, Darrick J. Wong wrote:
> >>From: Darrick J. Wong <darrick.wong@oracle.com>
> >>
> >>Regenerate the AGF and AGFL from the rmap data.
> >>
> >>Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> >
> >[...]

> >>+	/* Record all the OWN_AG blocks. */
> >>+	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
> >>+		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
> >>+				rec->rm_startblock);
> >>+		error = xfs_repair_collect_btree_extent(ra->sc,
> >>+				ra->freesp_list, fsb, rec->rm_blockcount);
> >>+		if (error)
> >>+			return error;
> >>+	}
> >>+
> >>+	return xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
> >>+			xfs_repair_collect_btree_cur_blocks_in_extent_list,
> >
> >Urk. The function name lengths are getting out of hand. I'm very
> >tempted to suggest we should shorten the namespace of all this
> >like s/xfs_repair_/xr_/ and s/xfs_scrub_/xs_/, etc just to make them
> >shorter and easier to read.
> >
> >Oh, wait, did I say that out loud? :P
> >
> >Something to think about, anyway.
> >
> Well they are sort of long, but TBH I think I still kind of
> appreciate the extra verbiage.  I have seen other projects do things
> like adopt a sort of 3 or 4 letter abbreviation (like maybe xfs_scrb
> or xfs_repr).  That helps to cut down on the verbosity while still not
> losing too much of what it is supposed to mean.  Just another idea
> to consider. :-)

We've got that in places, too, like "xlog_" prefixes for all the log
code, so that's not an unreasonable thing to suggest. After all, in
many cases we're talking about a tradeoff between readability and the
amount of typing necessary.

However, IMO, function names so long they need a line of their own
indicate we have a structural problem in our code, not a
readability problem. We should not need names that long to document
what the function does - it should be obvious from the context, the
abstraction that is being used and a short name....

e.g. how many of these different "collect extent" operations could
be abstracted into a common extent list structure and generic
callbacks? It seems there's a lot of similarity in them, and we're
really only differentiating them by adding more namespace and
context specific information into the structure and function names.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 07/21] xfs: repair inode btrees
  2018-06-24 19:24 ` [PATCH 07/21] xfs: repair inode btrees Darrick J. Wong
@ 2018-06-28  0:55   ` Dave Chinner
  2018-07-04  2:22     ` Darrick J. Wong
  2018-06-30 17:36   ` Allison Henderson
  1 sibling, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-06-28  0:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:24:13PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Use the rmapbt to find inode chunks, query the chunks to compute
> hole and free masks, and with that information rebuild the inobt
> and finobt.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

[....]

> +/*
> + * For each cluster in this blob of inode, we must calculate the
> + * properly aligned startino of that cluster, then iterate each
> + * cluster to fill in used and filled masks appropriately.  We
> + * then use the (startino, used, filled) information to construct
> + * the appropriate inode records.
> + */
> +STATIC int
> +xfs_repair_ialloc_process_cluster(
> +	struct xfs_repair_ialloc	*ri,
> +	xfs_agblock_t			agbno,
> +	int				blks_per_cluster,
> +	xfs_agino_t			rec_agino)
> +{
> +	struct xfs_imap			imap;
> +	struct xfs_repair_ialloc_extent	*rie;
> +	struct xfs_dinode		*dip;
> +	struct xfs_buf			*bp;
> +	struct xfs_scrub_context	*sc = ri->sc;
> +	struct xfs_mount		*mp = sc->mp;
> +	xfs_ino_t			fsino;
> +	xfs_inofree_t			usedmask;
> +	xfs_agino_t			nr_inodes;
> +	xfs_agino_t			startino;
> +	xfs_agino_t			clusterino;
> +	xfs_agino_t			clusteroff;
> +	xfs_agino_t			agino;
> +	uint16_t			fillmask;
> +	bool				inuse;
> +	int				usedcount;
> +	int				error;
> +
> +	/* The per-AG inum of this inode cluster. */
> +	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
> +
> +	/* The per-AG inum of the inobt record. */
> +	startino = rec_agino + rounddown(agino - rec_agino,
> +			XFS_INODES_PER_CHUNK);
> +
> +	/* The per-AG inum of the cluster within the inobt record. */
> +	clusteroff = agino - startino;
> +
> +	/* Every inode in this holemask slot is filled. */
> +	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
> +	fillmask = xfs_inobt_maskn(clusteroff / XFS_INODES_PER_HOLEMASK_BIT,
> +			nr_inodes / XFS_INODES_PER_HOLEMASK_BIT);
> +
> +	/* Grab the inode cluster buffer. */
> +	imap.im_blkno = XFS_AGB_TO_DADDR(mp, sc->sa.agno, agbno);
> +	imap.im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
> +	imap.im_boffset = 0;
> +
> +	error = xfs_imap_to_bp(mp, sc->tp, &imap, &dip, &bp, 0,
> +			XFS_IGET_UNTRUSTED);

This is going to error out if the cluster we are asking to be mapped
has no record in the inobt. Aren't we trying to rebuild the inobt
here from the rmap's idea of on-disk clusters? So how do we rebuild
the inobt record if we can't already find the chunk record in the
inobt?

At minimum, this needs a comment explaining why it works.

> +/* Initialize new inobt/finobt roots and implant them into the AGI. */
> +STATIC int
> +xfs_repair_iallocbt_reset_btrees(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_owner_info		*oinfo,
> +	int				*log_flags)
> +{
> +	struct xfs_agi			*agi;
> +	struct xfs_buf			*bp;
> +	struct xfs_mount		*mp = sc->mp;
> +	xfs_fsblock_t			inofsb;
> +	xfs_fsblock_t			finofsb;
> +	enum xfs_ag_resv_type		resv;
> +	int				error;
> +
> +	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
> +
> +	/* Initialize new inobt root. */
> +	resv = XFS_AG_RESV_NONE;
> +	error = xfs_repair_alloc_ag_block(sc, oinfo, &inofsb, resv);
> +	if (error)
> +		return error;
> +	error = xfs_repair_init_btblock(sc, inofsb, &bp, XFS_BTNUM_INO,
> +			&xfs_inobt_buf_ops);
> +	if (error)
> +		return error;
> +	agi->agi_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, inofsb));
> +	agi->agi_level = cpu_to_be32(1);
> +	*log_flags |= XFS_AGI_ROOT | XFS_AGI_LEVEL;
> +
> +	/* Initialize new finobt root. */
> +	if (!xfs_sb_version_hasfinobt(&mp->m_sb))
> +		return 0;
> +
> +	resv = mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA;

Comment explaining this?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-27  2:19   ` Dave Chinner
  2018-06-27 16:44     ` Allison Henderson
@ 2018-06-28 17:25     ` Allison Henderson
  2018-06-29 15:08       ` Darrick J. Wong
  1 sibling, 1 reply; 77+ messages in thread
From: Allison Henderson @ 2018-06-28 17:25 UTC (permalink / raw)
  To: Dave Chinner, Darrick J. Wong; +Cc: linux-xfs


On 06/26/2018 07:19 PM, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:23:54PM -0700, Darrick J. Wong wrote:
>> From: Darrick J. Wong <darrick.wong@oracle.com>
>>
>> Regenerate the AGF and AGFL from the rmap data.
>>
>> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> [...]
> 
>> +/* Information for finding AGF-rooted btrees */
>> +enum {
>> +	REPAIR_AGF_BNOBT = 0,
>> +	REPAIR_AGF_CNTBT,
>> +	REPAIR_AGF_RMAPBT,
>> +	REPAIR_AGF_REFCOUNTBT,
>> +	REPAIR_AGF_END,
>> +	REPAIR_AGF_MAX
>> +};
> 
> Why can't you just use XFS_BTNUM_* for these btree type descriptors?

Well, I know Darrick hasn't responded yet, but I actually have seen
other projects intentionally redefine scopes like this (even if it's
repetitive).  The reason being, for example, to help prevent people from
mistakenly indexing an element of the below array that may not be
defined, since XFS_BTNUM_* defines more types than are being used here.
(And it's easy to overlook because, belonging to the same namespace, it
doesn't look out of place.)  So basically, by redefining only the types
meant to be used, we may help people avoid mistakenly mishandling it.

I've also seen such practices generate a lot of extra code too.  Both
solutions will work.  But in response to your comment: it looks to me
like a question of cutting down code vs. using a more defensive coding
style.


> 
>> +
>> +static const struct xfs_repair_find_ag_btree repair_agf[] = {
>> +	[REPAIR_AGF_BNOBT] = {
>> +		.rmap_owner = XFS_RMAP_OWN_AG,
>> +		.buf_ops = &xfs_allocbt_buf_ops,
>> +		.magic = XFS_ABTB_CRC_MAGIC,
>> +	},
>> +	[REPAIR_AGF_CNTBT] = {
>> +		.rmap_owner = XFS_RMAP_OWN_AG,
>> +		.buf_ops = &xfs_allocbt_buf_ops,
>> +		.magic = XFS_ABTC_CRC_MAGIC,
>> +	},
> 
> I had to stop and think about why this only supports the v5 types.
> i.e. we're rebuilding from rmap info, so this will never run on v4
> filesystems, hence we only care about v5 types (i.e. *CRC_MAGIC).
> Perhaps a one-line comment to remind readers of this?
> 
>> +	[REPAIR_AGF_RMAPBT] = {
>> +		.rmap_owner = XFS_RMAP_OWN_AG,
>> +		.buf_ops = &xfs_rmapbt_buf_ops,
>> +		.magic = XFS_RMAP_CRC_MAGIC,
>> +	},
>> +	[REPAIR_AGF_REFCOUNTBT] = {
>> +		.rmap_owner = XFS_RMAP_OWN_REFC,
>> +		.buf_ops = &xfs_refcountbt_buf_ops,
>> +		.magic = XFS_REFC_CRC_MAGIC,
>> +	},
>> +	[REPAIR_AGF_END] = {
>> +		.buf_ops = NULL,
>> +	},
>> +};
>> +
>> +/*
>> + * Find the btree roots.  This is /also/ a chicken and egg problem because we
>> + * have to use the rmapbt (rooted in the AGF) to find the btrees rooted in the
>> + * AGF.  We also have no idea if the btrees make any sense.  If we hit obvious
>> + * corruptions in those btrees we'll bail out.
>> + */
>> +STATIC int
>> +xfs_repair_agf_find_btrees(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp,
>> +	struct xfs_repair_find_ag_btree	*fab,
>> +	struct xfs_buf			*agfl_bp)
>> +{
>> +	struct xfs_agf			*old_agf = XFS_BUF_TO_AGF(agf_bp);
>> +	int				error;
>> +
>> +	/* Go find the root data. */
>> +	memcpy(fab, repair_agf, sizeof(repair_agf));
> 
> Why are we initialising fab here, instead of in the caller where it
> is declared and passed to various functions? Given there is only a
> single declaration of this structure, why do we need a global static
> const table initialiser just to copy it here - why isn't it
> initialised at the declaration point?
> 
>> +	error = xfs_repair_find_ag_btree_roots(sc, agf_bp, fab, agfl_bp);
>> +	if (error)
>> +		return error;
>> +
>> +	/* We must find the bnobt, cntbt, and rmapbt roots. */
>> +	if (fab[REPAIR_AGF_BNOBT].root == NULLAGBLOCK ||
>> +	    fab[REPAIR_AGF_BNOBT].height > XFS_BTREE_MAXLEVELS ||
>> +	    fab[REPAIR_AGF_CNTBT].root == NULLAGBLOCK ||
>> +	    fab[REPAIR_AGF_CNTBT].height > XFS_BTREE_MAXLEVELS ||
>> +	    fab[REPAIR_AGF_RMAPBT].root == NULLAGBLOCK ||
>> +	    fab[REPAIR_AGF_RMAPBT].height > XFS_BTREE_MAXLEVELS)
>> +		return -EFSCORRUPTED;
>> +
>> +	/*
>> +	 * We relied on the rmapbt to reconstruct the AGF.  If we get a
>> +	 * different root then something's seriously wrong.
>> +	 */
>> +	if (fab[REPAIR_AGF_RMAPBT].root !=
>> +	    be32_to_cpu(old_agf->agf_roots[XFS_BTNUM_RMAPi]))
>> +		return -EFSCORRUPTED;
>> +
>> +	/* We must find the refcountbt root if that feature is enabled. */
>> +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb) &&
>> +	    (fab[REPAIR_AGF_REFCOUNTBT].root == NULLAGBLOCK ||
>> +	     fab[REPAIR_AGF_REFCOUNTBT].height > XFS_BTREE_MAXLEVELS))
>> +		return -EFSCORRUPTED;
>> +
>> +	return 0;
>> +}
>> +
>> +/* Set btree root information in an AGF. */
>> +STATIC void
>> +xfs_repair_agf_set_roots(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_agf			*agf,
>> +	struct xfs_repair_find_ag_btree	*fab)
>> +{
>> +	agf->agf_roots[XFS_BTNUM_BNOi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].root);
>> +	agf->agf_levels[XFS_BTNUM_BNOi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].height);
>> +
>> +	agf->agf_roots[XFS_BTNUM_CNTi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].root);
>> +	agf->agf_levels[XFS_BTNUM_CNTi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].height);
>> +
>> +	agf->agf_roots[XFS_BTNUM_RMAPi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].root);
>> +	agf->agf_levels[XFS_BTNUM_RMAPi] =
>> +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].height);
>> +
>> +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb)) {
>> +		agf->agf_refcount_root =
>> +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].root);
>> +		agf->agf_refcount_level =
>> +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].height);
>> +	}
>> +}
>> +
>> +/*
>> + * Reinitialize the AGF header, making an in-core copy of the old contents so
>> + * that we know which in-core state needs to be reinitialized.
>> + */
>> +STATIC void
>> +xfs_repair_agf_init_header(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp,
>> +	struct xfs_agf			*old_agf)
>> +{
>> +	struct xfs_mount		*mp = sc->mp;
>> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
>> +
>> +	memcpy(old_agf, agf, sizeof(*old_agf));
>> +	memset(agf, 0, BBTOB(agf_bp->b_length));
>> +	agf->agf_magicnum = cpu_to_be32(XFS_AGF_MAGIC);
>> +	agf->agf_versionnum = cpu_to_be32(XFS_AGF_VERSION);
>> +	agf->agf_seqno = cpu_to_be32(sc->sa.agno);
>> +	agf->agf_length = cpu_to_be32(xfs_ag_block_count(mp, sc->sa.agno));
>> +	agf->agf_flfirst = old_agf->agf_flfirst;
>> +	agf->agf_fllast = old_agf->agf_fllast;
>> +	agf->agf_flcount = old_agf->agf_flcount;
>> +	if (xfs_sb_version_hascrc(&mp->m_sb))
>> +		uuid_copy(&agf->agf_uuid, &mp->m_sb.sb_meta_uuid);
>> +}
> 
> Do we need to clear pag->pagf_init here so that it gets
> re-initialised next time someone reads the AGF?
> 
>> +
>> +/* Update the AGF btree counters by walking the btrees. */
>> +STATIC int
>> +xfs_repair_agf_update_btree_counters(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp)
>> +{
>> +	struct xfs_repair_agf_allocbt	raa = { .sc = sc };
>> +	struct xfs_btree_cur		*cur = NULL;
>> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
>> +	struct xfs_mount		*mp = sc->mp;
>> +	xfs_agblock_t			btreeblks;
>> +	xfs_agblock_t			blocks;
>> +	int				error;
>> +
>> +	/* Update the AGF counters from the bnobt. */
>> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
>> +			XFS_BTNUM_BNO);
>> +	error = xfs_alloc_query_all(cur, xfs_repair_agf_walk_allocbt, &raa);
>> +	if (error)
>> +		goto err;
>> +	error = xfs_btree_count_blocks(cur, &blocks);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +	btreeblks = blocks - 1;
>> +	agf->agf_freeblks = cpu_to_be32(raa.freeblks);
>> +	agf->agf_longest = cpu_to_be32(raa.longest);
> 
> This function updates more than the AGF btree counters. :P
> 
>> +
>> +	/* Update the AGF counters from the cntbt. */
>> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
>> +			XFS_BTNUM_CNT);
>> +	error = xfs_btree_count_blocks(cur, &blocks);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +	btreeblks += blocks - 1;
>> +
>> +	/* Update the AGF counters from the rmapbt. */
>> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
>> +	error = xfs_btree_count_blocks(cur, &blocks);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +	agf->agf_rmap_blocks = cpu_to_be32(blocks);
>> +	btreeblks += blocks - 1;
>> +
>> +	agf->agf_btreeblks = cpu_to_be32(btreeblks);
>> +
>> +	/* Update the AGF counters from the refcountbt. */
>> +	if (xfs_sb_version_hasreflink(&mp->m_sb)) {
>> +		cur = xfs_refcountbt_init_cursor(mp, sc->tp, agf_bp,
>> +				sc->sa.agno, NULL);
>> +		error = xfs_btree_count_blocks(cur, &blocks);
>> +		if (error)
>> +			goto err;
>> +		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +		agf->agf_refcount_blocks = cpu_to_be32(blocks);
>> +	}
>> +
>> +	return 0;
>> +err:
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
>> +	return error;
>> +}
>> +
>> +/* Trigger reinitialization of the in-core data. */
>> +STATIC int
>> +xfs_repair_agf_reinit_incore(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_agf			*agf,
>> +	const struct xfs_agf		*old_agf)
>> +{
>> +	struct xfs_perag		*pag;
>> +
>> +	/* XXX: trigger fdblocks recalculation */
>> +
>> +	/* Now reinitialize the in-core counters if necessary. */
>> +	pag = sc->sa.pag;
>> +	if (!pag->pagf_init)
>> +		return 0;
>> +
>> +	pag->pagf_btreeblks = be32_to_cpu(agf->agf_btreeblks);
>> +	pag->pagf_freeblks = be32_to_cpu(agf->agf_freeblks);
>> +	pag->pagf_longest = be32_to_cpu(agf->agf_longest);
>> +	pag->pagf_levels[XFS_BTNUM_BNOi] =
>> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNOi]);
>> +	pag->pagf_levels[XFS_BTNUM_CNTi] =
>> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNTi]);
>> +	pag->pagf_levels[XFS_BTNUM_RMAPi] =
>> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAPi]);
>> +	pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
> 
> Ok, so we reinit the pagf bits here, but....
> 
>> +
>> +	return 0;
>> +}
>> +
>> +/* Repair the AGF. */
>> +int
>> +xfs_repair_agf(
>> +	struct xfs_scrub_context	*sc)
>> +{
>> +	struct xfs_repair_find_ag_btree	fab[REPAIR_AGF_MAX];
>> +	struct xfs_agf			old_agf;
>> +	struct xfs_mount		*mp = sc->mp;
>> +	struct xfs_buf			*agf_bp;
>> +	struct xfs_buf			*agfl_bp;
>> +	struct xfs_agf			*agf;
>> +	int				error;
>> +
>> +	/* We require the rmapbt to rebuild anything. */
>> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
>> +		return -EOPNOTSUPP;
>> +
>> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
>> +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
>> +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGF_DADDR(mp)),
>> +			XFS_FSS_TO_BB(mp, 1), 0, &agf_bp, NULL);
>> +	if (error)
>> +		return error;
>> +	agf_bp->b_ops = &xfs_agf_buf_ops;
>> +	agf = XFS_BUF_TO_AGF(agf_bp);
>> +
>> +	/*
>> +	 * Load the AGFL so that we can screen out OWN_AG blocks that are on
>> +	 * the AGFL now; these blocks might have once been part of the
>> +	 * bno/cnt/rmap btrees but are not now.  This is a chicken and egg
>> +	 * problem: the AGF is corrupt, so we have to trust the AGFL contents
>> +	 * because we can't do any serious cross-referencing with any of the
>> +	 * btrees rooted in the AGF.  If the AGFL contents are obviously bad
>> +	 * then we'll bail out.
>> +	 */
>> +	error = xfs_alloc_read_agfl(mp, sc->tp, sc->sa.agno, &agfl_bp);
>> +	if (error)
>> +		return error;
>> +
>> +	/*
>> +	 * Spot-check the AGFL blocks; if they're obviously corrupt then
>> +	 * there's nothing we can do but bail out.
>> +	 */
>> +	error = xfs_agfl_walk(sc->mp, XFS_BUF_TO_AGF(agf_bp), agfl_bp,
>> +			xfs_repair_agf_check_agfl_block, sc);
>> +	if (error)
>> +		return error;
>> +
>> +	/*
>> +	 * Find the AGF btree roots.  See the comment for this function for
>> +	 * more information about the limitations of this repairer; this is
>> +	 * also a chicken-and-egg situation.
>> +	 */
>> +	error = xfs_repair_agf_find_btrees(sc, agf_bp, fab, agfl_bp);
>> +	if (error)
>> +		return error;
> 
> Comment could be better written.
> 
> 	/*
> 	 * Find the AGF btree roots. This is also a chicken-and-egg
> 	 * situation - see xfs_repair_agf_find_btrees() for details.
> 	 */
> 
>> +
>> +	/* Start rewriting the header and implant the btrees we found. */
>> +	xfs_repair_agf_init_header(sc, agf_bp, &old_agf);
>> +	xfs_repair_agf_set_roots(sc, agf, fab);
>> +	error = xfs_repair_agf_update_btree_counters(sc, agf_bp);
>> +	if (error)
>> +		goto out_revert;
> 
> If we fail here, the pagf information is invalid, hence I think we
> really do need to clear pagf_init before we start rebuilding the new
> AGF. Yes, I can see we revert the AGF info, but this seems like a
> landmine waiting to be tripped over.
> 
>> +	/* Reinitialize in-core state. */
>> +	error = xfs_repair_agf_reinit_incore(sc, agf, &old_agf);
>> +	if (error)
>> +		goto out_revert;
>> +
>> +	/* Write this to disk. */
>> +	xfs_trans_buf_set_type(sc->tp, agf_bp, XFS_BLFT_AGF_BUF);
>> +	xfs_trans_log_buf(sc->tp, agf_bp, 0, BBTOB(agf_bp->b_length) - 1);
>> +	return 0;
>> +
>> +out_revert:
>> +	memcpy(agf, &old_agf, sizeof(old_agf));
>> +	return error;
>> +}
>> +
>> +/* AGFL */
>> +
>> +struct xfs_repair_agfl {
>> +	struct xfs_repair_extent_list	agmeta_list;
>> +	struct xfs_repair_extent_list	*freesp_list;
>> +	struct xfs_scrub_context	*sc;
>> +};
>> +
>> +/* Record all freespace information. */
>> +STATIC int
>> +xfs_repair_agfl_rmap_fn(
>> +	struct xfs_btree_cur		*cur,
>> +	struct xfs_rmap_irec		*rec,
>> +	void				*priv)
>> +{
>> +	struct xfs_repair_agfl		*ra = priv;
>> +	xfs_fsblock_t			fsb;
>> +	int				error = 0;
>> +
>> +	if (xfs_scrub_should_terminate(ra->sc, &error))
>> +		return error;
>> +
>> +	/* Record all the OWN_AG blocks. */
>> +	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
>> +		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
>> +				rec->rm_startblock);
>> +		error = xfs_repair_collect_btree_extent(ra->sc,
>> +				ra->freesp_list, fsb, rec->rm_blockcount);
>> +		if (error)
>> +			return error;
>> +	}
>> +
>> +	return xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
>> +			xfs_repair_collect_btree_cur_blocks_in_extent_list,
> 
> Urk. The function name lengths are getting out of hand. I'm very
> tempted to suggest we should shorten the namespace of all this
> like s/xfs_repair_/xr_/ and s/xfs_scrub_/xs_/, etc just to make them
> shorter and easier to read.
> 
> Oh, wait, did I say that out loud? :P
> 
> Something to think about, anyway.
> 
>> +			&ra->agmeta_list);
>> +}
>> +
>> +/* Add a btree block to the agmeta list. */
>> +STATIC int
>> +xfs_repair_agfl_visit_btblock(
> 
> I find the name a bit confusing - AGFLs don't have btree blocks.
> Yes, I know that it's a xfs_btree_visit_blocks() callback but I
> think s/visit/collect/ makes more sense. i.e. it tells us what we
> are doing with the btree block, rather than making it sound like we
> are walking AGFL btree blocks...
> 
>> +/*
>> + * Map out all the non-AGFL OWN_AG space in this AG so that we can deduce
>> + * which blocks belong to the AGFL.
>> + */
>> +STATIC int
>> +xfs_repair_agfl_find_extents(
> 
> Same here - xr_agfl_collect_free_extents()?
> 
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp,
>> +	struct xfs_repair_extent_list	*agfl_extents,
>> +	xfs_agblock_t			*flcount)
>> +{
>> +	struct xfs_repair_agfl		ra;
>> +	struct xfs_mount		*mp = sc->mp;
>> +	struct xfs_btree_cur		*cur;
>> +	struct xfs_repair_extent	*rae;
>> +	int				error;
>> +
>> +	ra.sc = sc;
>> +	ra.freesp_list = agfl_extents;
>> +	xfs_repair_init_extent_list(&ra.agmeta_list);
>> +
>> +	/* Find all space used by the free space btrees & rmapbt. */
>> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
>> +	error = xfs_rmap_query_all(cur, xfs_repair_agfl_rmap_fn, &ra);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +
>> +	/* Find all space used by bnobt. */
> 
> Needs clarification.
> 
> 	/* Find all the in use bnobt blocks */
> 
>> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
>> +			XFS_BTNUM_BNO);
>> +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
>> +	if (error)
>> +		goto err;
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +
>> +	/* Find all space used by cntbt. */
> 
> 	/* Find all the in use cntbt blocks */
> 
>> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
>> +			XFS_BTNUM_CNT);
>> +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
>> +	if (error)
>> +		goto err;
>> +
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>> +
>> +	/*
>> +	 * Drop the freesp meta blocks that are in use by btrees.
>> +	 * The remaining blocks /should/ be AGFL blocks.
>> +	 */
>> +	error = xfs_repair_subtract_extents(sc, agfl_extents, &ra.agmeta_list);
>> +	xfs_repair_cancel_btree_extents(sc, &ra.agmeta_list);
>> +	if (error)
>> +		return error;
>> +
>> +	/* Calculate the new AGFL size. */
>> +	*flcount = 0;
>> +	for_each_xfs_repair_extent(rae, agfl_extents) {
>> +		*flcount += rae->len;
>> +		if (*flcount > xfs_agfl_size(mp))
>> +			break;
>> +	}
>> +	if (*flcount > xfs_agfl_size(mp))
>> +		*flcount = xfs_agfl_size(mp);
> 
> Ok, so flcount is clamped here. What happens to all the remaining
> agfl_extents beyond flcount?
> 
>> +	return 0;
>> +
>> +err:
> 
> Ok, what cleans up all the extents we've recorded in ra on error?
> 
>> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
>> +	return error;
>> +}
>> +
>> +/* Update the AGF and reset the in-core state. */
>> +STATIC int
>> +xfs_repair_agfl_update_agf(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agf_bp,
>> +	xfs_agblock_t			flcount)
>> +{
>> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
>> +
> 	ASSERT(*flcount <= xfs_agfl_size(mp));
> 
>> +	/* XXX: trigger fdblocks recalculation */
>> +
>> +	/* Update the AGF counters. */
>> +	if (sc->sa.pag->pagf_init)
>> +		sc->sa.pag->pagf_flcount = flcount;
>> +	agf->agf_flfirst = cpu_to_be32(0);
>> +	agf->agf_flcount = cpu_to_be32(flcount);
>> +	agf->agf_fllast = cpu_to_be32(flcount - 1);
>> +
>> +	xfs_alloc_log_agf(sc->tp, agf_bp,
>> +			XFS_AGF_FLFIRST | XFS_AGF_FLLAST | XFS_AGF_FLCOUNT);
>> +	return 0;
>> +}
>> +
>> +/* Write out a totally new AGFL. */
>> +STATIC void
>> +xfs_repair_agfl_init_header(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_buf			*agfl_bp,
>> +	struct xfs_repair_extent_list	*agfl_extents,
>> +	xfs_agblock_t			flcount)
>> +{
>> +	struct xfs_mount		*mp = sc->mp;
>> +	__be32				*agfl_bno;
>> +	struct xfs_repair_extent	*rae;
>> +	struct xfs_repair_extent	*n;
>> +	struct xfs_agfl			*agfl;
>> +	xfs_agblock_t			agbno;
>> +	unsigned int			fl_off;
>> +
> 	ASSERT(*flcount <= xfs_agfl_size(mp));
> 
>> +	/* Start rewriting the header. */
>> +	agfl = XFS_BUF_TO_AGFL(agfl_bp);
>> +	memset(agfl, 0xFF, BBTOB(agfl_bp->b_length));
>> +	agfl->agfl_magicnum = cpu_to_be32(XFS_AGFL_MAGIC);
>> +	agfl->agfl_seqno = cpu_to_be32(sc->sa.agno);
>> +	uuid_copy(&agfl->agfl_uuid, &mp->m_sb.sb_meta_uuid);
>> +
>> +	/* Fill the AGFL with the remaining blocks. */
>> +	fl_off = 0;
>> +	agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agfl_bp);
>> +	for_each_xfs_repair_extent_safe(rae, n, agfl_extents) {
>> +		agbno = XFS_FSB_TO_AGBNO(mp, rae->fsbno);
>> +
>> +		trace_xfs_repair_agfl_insert(mp, sc->sa.agno, agbno, rae->len);
>> +
>> +		while (rae->len > 0 && fl_off < flcount) {
>> +			agfl_bno[fl_off] = cpu_to_be32(agbno);
>> +			fl_off++;
>> +			agbno++;
>> +			rae->fsbno++;
>> +			rae->len--;
>> +		}
> 
> This only works correctly if flcount <= xfs_agfl_size, which is why
> I'm suggesting some asserts.
> 
>> +
>> +		if (rae->len)
>> +			break;
>> +		list_del(&rae->list);
>> +		kmem_free(rae);
>> +	}
>> +
>> +	/* Write AGF and AGFL to disk. */
>> +	xfs_trans_buf_set_type(sc->tp, agfl_bp, XFS_BLFT_AGFL_BUF);
>> +	xfs_trans_log_buf(sc->tp, agfl_bp, 0, BBTOB(agfl_bp->b_length) - 1);
>> +}
>> +
>> +/* Repair the AGFL. */
>> +int
>> +xfs_repair_agfl(
>> +	struct xfs_scrub_context	*sc)
>> +{
>> +	struct xfs_owner_info		oinfo;
>> +	struct xfs_repair_extent_list	agfl_extents;
>> +	struct xfs_mount		*mp = sc->mp;
>> +	struct xfs_buf			*agf_bp;
>> +	struct xfs_buf			*agfl_bp;
>> +	xfs_agblock_t			flcount;
>> +	int				error;
>> +
>> +	/* We require the rmapbt to rebuild anything. */
>> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
>> +		return -EOPNOTSUPP;
>> +
>> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
>> +	xfs_repair_init_extent_list(&agfl_extents);
>> +
>> +	/*
>> +	 * Read the AGF so that we can query the rmapbt.  We hope that there's
>> +	 * nothing wrong with the AGF, but all the AG header repair functions
>> +	 * have this chicken-and-egg problem.
>> +	 */
>> +	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
>> +	if (error)
>> +		return error;
>> +	if (!agf_bp)
>> +		return -ENOMEM;
>> +
>> +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
>> +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGFL_DADDR(mp)),
>> +			XFS_FSS_TO_BB(mp, 1), 0, &agfl_bp, NULL);
>> +	if (error)
>> +		return error;
>> +	agfl_bp->b_ops = &xfs_agfl_buf_ops;
>> +
>> +	/*
>> +	 * Compute the set of old AGFL blocks by subtracting from the list of
>> +	 * OWN_AG blocks the list of blocks owned by all other OWN_AG metadata
>> +	 * (bnobt, cntbt, rmapbt).  These are the old AGFL blocks, so return
>> +	 * that list and the number of blocks we're actually going to put back
>> +	 * on the AGFL.
>> +	 */
> 
> That comment belongs on the function, not here. All we need here is
> something like:
> 
> 	/* Gather all the extents we're going to put on the new AGFL. */
> 
>> +	error = xfs_repair_agfl_find_extents(sc, agf_bp, &agfl_extents,
>> +			&flcount);
>> +	if (error)
>> +		goto err;
>> +
>> +	/*
>> +	 * Update AGF and AGFL.  We reset the global free block counter when
>> +	 * we adjust the AGF flcount (which can fail) so avoid updating any
>> +	 * bufers until we know that part works.
> 
> buffers
> 
>> +	 */
>> +	error = xfs_repair_agfl_update_agf(sc, agf_bp, flcount);
>> +	if (error)
>> +		goto err;
>> +	xfs_repair_agfl_init_header(sc, agfl_bp, &agfl_extents, flcount);
>> +
>> +	/*
>> +	 * Ok, the AGFL should be ready to go now.  Roll the transaction so
>> +	 * that we can free any AGFL overflow.
>> +	 */
> 
> Why does rolling the transaction allow us to free the overflow?
> Shouldn't the comment say something like "Roll the transaction to
> make the new AGFL permanent before we start using it when returning
> the residual AGFL freespace overflow back to the AGF freespace
> btrees."
> 
>> +	sc->sa.agf_bp = agf_bp;
>> +	sc->sa.agfl_bp = agfl_bp;
>> +	error = xfs_repair_roll_ag_trans(sc);
>> +	if (error)
>> +		goto err;
>> +
>> +	/* Dump any AGFL overflow. */
>> +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
>> +	return xfs_repair_reap_btree_extents(sc, &agfl_extents, &oinfo,
>> +			XFS_AG_RESV_AGFL);
>> +err:
>> +	xfs_repair_cancel_btree_extents(sc, &agfl_extents);
>> +	return error;
>> +}
>> diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
>> index 326be4e8b71e..bcdaa8df18f6 100644
>> --- a/fs/xfs/scrub/repair.c
>> +++ b/fs/xfs/scrub/repair.c
>> @@ -127,9 +127,12 @@ xfs_repair_roll_ag_trans(
>>   	int				error;
>>   
>>   	/* Keep the AG header buffers locked so we can keep going. */
>> -	xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
>> -	xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
>> -	xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
>> +	if (sc->sa.agi_bp)
>> +		xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
>> +	if (sc->sa.agf_bp)
>> +		xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
>> +	if (sc->sa.agfl_bp)
>> +		xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
>>   
>>   	/* Roll the transaction. */
>>   	error = xfs_trans_roll(&sc->tp);
>> @@ -137,9 +140,12 @@ xfs_repair_roll_ag_trans(
>>   		goto out_release;
>>   
>>   	/* Join AG headers to the new transaction. */
>> -	xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
>> -	xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
>> -	xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
>> +	if (sc->sa.agi_bp)
>> +		xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
>> +	if (sc->sa.agf_bp)
>> +		xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
>> +	if (sc->sa.agfl_bp)
>> +		xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
>>   
>>   	return 0;
>>   
>> @@ -149,9 +155,12 @@ xfs_repair_roll_ag_trans(
>>   	 * buffers will be released during teardown on our way out
>>   	 * of the kernel.
>>   	 */
>> -	xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
>> -	xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
>> -	xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
>> +	if (sc->sa.agi_bp)
>> +		xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
>> +	if (sc->sa.agf_bp)
>> +		xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
>> +	if (sc->sa.agfl_bp)
>> +		xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
>>   
>>   	return error;
>>   }
>> @@ -408,6 +417,85 @@ xfs_repair_collect_btree_extent(
>>   	return 0;
>>   }
>>   
>> +/*
>> + * Help record all btree blocks seen while iterating all records of a btree.
>> + *
>> + * We know that the btree query_all function starts at the left edge and walks
>> + * towards the right edge of the tree.  Therefore, we know that we can walk up
>> + * the btree cursor towards the root; if the pointer for a given level points
>> + * to the first record/key in that block, we haven't seen this block before;
>> + * and therefore we need to remember that we saw this block in the btree.
>> + *
>> + * So if our btree is:
>> + *
>> + *    4
>> + *  / | \
>> + * 1  2  3
>> + *
>> + * Pretend for this example that each leaf block has 100 btree records.  For
>> + * the first btree record, we'll observe that bc_ptrs[0] == 1, so we record
>> + * that we saw block 1.  Then we observe that bc_ptrs[1] == 1, so we record
>> + * block 4.  The list is [1, 4].
>> + *
>> + * For the second btree record, we see that bc_ptrs[0] == 2, so we exit the
>> + * loop.  The list remains [1, 4].
>> + *
>> + * For the 101st btree record, we've moved onto leaf block 2.  Now
>> + * bc_ptrs[0] == 1 again, so we record that we saw block 2.  We see that
>> + * bc_ptrs[1] == 2, so we exit the loop.  The list is now [1, 4, 2].
>> + *
>> + * For the 102nd record, bc_ptrs[0] == 2, so we continue.
>> + *
>> + * For the 201st record, we've moved on to leaf block 3.  bc_ptrs[0] == 1, so
>> + * we add 3 to the list.  Now it is [1, 4, 2, 3].
>> + *
>> + * For the 300th record we just exit, with the list being [1, 4, 2, 3].
>> + *
>> + * The *iter_fn can return XFS_BTREE_QUERY_RANGE_ABORT to stop, 0 to keep
>> + * iterating, or the usual negative error code.
>> + */
>> +int
>> +xfs_repair_collect_btree_cur_blocks(
>> +	struct xfs_scrub_context	*sc,
>> +	struct xfs_btree_cur		*cur,
>> +	int				(*iter_fn)(struct xfs_scrub_context *sc,
>> +						   xfs_fsblock_t fsbno,
>> +						   xfs_fsblock_t len,
>> +						   void *priv),
>> +	void				*priv)
>> +{
>> +	struct xfs_buf			*bp;
>> +	xfs_fsblock_t			fsb;
>> +	int				i;
>> +	int				error;
>> +
>> +	for (i = 0; i < cur->bc_nlevels && cur->bc_ptrs[i] == 1; i++) {
>> +		xfs_btree_get_block(cur, i, &bp);
>> +		if (!bp)
>> +			continue;
>> +		fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
>> +		error = iter_fn(sc, fsb, 1, priv);
>> +		if (error)
>> +			return error;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + * Simple adapter to connect xfs_repair_collect_btree_extent to
>> + * xfs_repair_collect_btree_cur_blocks.
>> + */
>> +int
>> +xfs_repair_collect_btree_cur_blocks_in_extent_list(
>> +	struct xfs_scrub_context	*sc,
>> +	xfs_fsblock_t			fsbno,
>> +	xfs_fsblock_t			len,
>> +	void				*priv)
>> +{
>> +	return xfs_repair_collect_btree_extent(sc, priv, fsbno, len);
>> +}
>> +
>>   /*
>>    * An error happened during the rebuild so the transaction will be cancelled.
>>    * The fs will shut down, and the administrator has to unmount and run repair.
>> diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
>> index ef47826b6725..f2af5923aa75 100644
>> --- a/fs/xfs/scrub/repair.h
>> +++ b/fs/xfs/scrub/repair.h
>> @@ -48,9 +48,20 @@ xfs_repair_init_extent_list(
>>   
>>   #define for_each_xfs_repair_extent_safe(rbe, n, exlist) \
>>   	list_for_each_entry_safe((rbe), (n), &(exlist)->list, list)
>> +#define for_each_xfs_repair_extent(rbe, exlist) \
>> +	list_for_each_entry((rbe), &(exlist)->list, list)
>>   int xfs_repair_collect_btree_extent(struct xfs_scrub_context *sc,
>>   		struct xfs_repair_extent_list *btlist, xfs_fsblock_t fsbno,
>>   		xfs_extlen_t len);
>> +int xfs_repair_collect_btree_cur_blocks(struct xfs_scrub_context *sc,
>> +		struct xfs_btree_cur *cur,
>> +		int (*iter_fn)(struct xfs_scrub_context *sc,
>> +			       xfs_fsblock_t fsbno, xfs_fsblock_t len,
>> +			       void *priv),
>> +		void *priv);
>> +int xfs_repair_collect_btree_cur_blocks_in_extent_list(
>> +		struct xfs_scrub_context *sc, xfs_fsblock_t fsbno,
>> +		xfs_fsblock_t len, void *priv);
>>   void xfs_repair_cancel_btree_extents(struct xfs_scrub_context *sc,
>>   		struct xfs_repair_extent_list *btlist);
>>   int xfs_repair_subtract_extents(struct xfs_scrub_context *sc,
>> @@ -89,6 +100,8 @@ int xfs_repair_ino_dqattach(struct xfs_scrub_context *sc);
>>   
>>   int xfs_repair_probe(struct xfs_scrub_context *sc);
>>   int xfs_repair_superblock(struct xfs_scrub_context *sc);
>> +int xfs_repair_agf(struct xfs_scrub_context *sc);
>> +int xfs_repair_agfl(struct xfs_scrub_context *sc);
>>   
>>   #else
>>   
>> @@ -112,6 +125,8 @@ xfs_repair_calc_ag_resblks(
>>   
>>   #define xfs_repair_probe		xfs_repair_notsupported
>>   #define xfs_repair_superblock		xfs_repair_notsupported
>> +#define xfs_repair_agf			xfs_repair_notsupported
>> +#define xfs_repair_agfl			xfs_repair_notsupported
>>   
>>   #endif /* CONFIG_XFS_ONLINE_REPAIR */
>>   
>> diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
>> index 58ae76b3a421..8e11c3c699fb 100644
>> --- a/fs/xfs/scrub/scrub.c
>> +++ b/fs/xfs/scrub/scrub.c
>> @@ -208,13 +208,13 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
>>   		.type	= ST_PERAG,
>>   		.setup	= xfs_scrub_setup_fs,
>>   		.scrub	= xfs_scrub_agf,
>> -		.repair	= xfs_repair_notsupported,
>> +		.repair	= xfs_repair_agf,
>>   	},
>>   	[XFS_SCRUB_TYPE_AGFL]= {	/* agfl */
>>   		.type	= ST_PERAG,
>>   		.setup	= xfs_scrub_setup_fs,
>>   		.scrub	= xfs_scrub_agfl,
>> -		.repair	= xfs_repair_notsupported,
>> +		.repair	= xfs_repair_agfl,
>>   	},
>>   	[XFS_SCRUB_TYPE_AGI] = {	/* agi */
>>   		.type	= ST_PERAG,
>> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
>> index 524f543c5b82..c08785cf83a9 100644
>> --- a/fs/xfs/xfs_trans.c
>> +++ b/fs/xfs/xfs_trans.c
>> @@ -126,6 +126,60 @@ xfs_trans_dup(
>>   	return ntp;
>>   }
>>   
>> +/*
>> + * Try to reserve more blocks for a transaction.  The single use case we
>> + * support is for online repair -- use a transaction to gather data without
>> + * fear of btree cycle deadlocks; calculate how many blocks we really need
>> + * from that data; and only then start modifying data.  This can fail due to
>> + * ENOSPC, so we have to be able to cancel the transaction.
>> + */
>> +int
>> +xfs_trans_reserve_more(
>> +	struct xfs_trans	*tp,
>> +	uint			blocks,
>> +	uint			rtextents)
> 
> This isn't used in this patch - seems out of place here. Committed
> to the wrong patch?
> 
> Cheers,
> 
> Dave.
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap
  2018-06-24 19:23 ` [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap Darrick J. Wong
  2018-06-27  0:54   ` Dave Chinner
@ 2018-06-28 21:11   ` Allison Henderson
  2018-06-29 14:39     ` Darrick J. Wong
  1 sibling, 1 reply; 77+ messages in thread
From: Allison Henderson @ 2018-06-28 21:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 06/24/2018 12:23 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> The original rmap code assumed that there would always be at least one
> rmap in the rmapbt (the AG sb/agf/agi) and so errored out if it didn't
> find one.  This assumption isn't true for the rmapbt repair function
> (and it won't be true for realtime rmap either), so remove the check and
> just deal with the situation.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>   fs/xfs/libxfs/xfs_rmap.c |   24 ++++++++++++------------
>   1 file changed, 12 insertions(+), 12 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> index d4460b0d2d81..8b2a2f81d110 100644
> --- a/fs/xfs/libxfs/xfs_rmap.c
> +++ b/fs/xfs/libxfs/xfs_rmap.c
> @@ -753,19 +753,19 @@ xfs_rmap_map(
>   			&have_lt);
>   	if (error)
>   		goto out_error;
> -	XFS_WANT_CORRUPTED_GOTO(mp, have_lt == 1, out_error);
> -
> -	error = xfs_rmap_get_rec(cur, &ltrec, &have_lt);
> -	if (error)
> -		goto out_error;
> -	XFS_WANT_CORRUPTED_GOTO(mp, have_lt == 1, out_error);
> -	trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
> -			cur->bc_private.a.agno, ltrec.rm_startblock,
> -			ltrec.rm_blockcount, ltrec.rm_owner,
> -			ltrec.rm_offset, ltrec.rm_flags);
> +	if (have_lt) {
> +		error = xfs_rmap_get_rec(cur, &ltrec, &have_lt);
> +		if (error)
> +			goto out_error;
> +		XFS_WANT_CORRUPTED_GOTO(mp, have_lt == 1, out_error);
> +		trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
> +				cur->bc_private.a.agno, ltrec.rm_startblock,
> +				ltrec.rm_blockcount, ltrec.rm_owner,
> +				ltrec.rm_offset, ltrec.rm_flags);
>   
> -	if (!xfs_rmap_is_mergeable(&ltrec, owner, flags))
> -		have_lt = 0;
> +		if (!xfs_rmap_is_mergeable(&ltrec, owner, flags))
> +			have_lt = 0;
> +	}
>   
>   	XFS_WANT_CORRUPTED_GOTO(mp,
>   		have_lt == 0 ||
> 

Alrighty, looks ok after some digging around.  I'm still a little
puzzled as to why the original code raised the assert without checking
what's on the other side of the cursor, assuming the error condition was
supposed to cover the case when the tree was empty.  In any case, it
looks correct now.

Reviewed-by: Allison Henderson <allison.henderson@oracle.com>

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 02/21] xfs: add helper to decide if an inode has allocated cow blocks
  2018-06-24 19:23 ` [PATCH 02/21] xfs: add helper to decide if an inode has allocated cow blocks Darrick J. Wong
  2018-06-27  1:02   ` Dave Chinner
@ 2018-06-28 21:12   ` Allison Henderson
  1 sibling, 0 replies; 77+ messages in thread
From: Allison Henderson @ 2018-06-28 21:12 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 06/24/2018 12:23 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Add a helper to decide if an inode has real or unwritten extents in the
> CoW fork.  The upcoming repair freeze functionality will have to know if
> it's safe to iput an inode -- if the inode has incore any state that
> would require a transaction to unwind during iput, we'll have to defer
> the iput.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>   fs/xfs/xfs_inode.c |   19 +++++++++++++++++++
>   fs/xfs/xfs_inode.h |    1 +
>   2 files changed, 20 insertions(+)
> 
> 
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 7a96c4e0ab5c..e6859dfc29af 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -3689,3 +3689,22 @@ xfs_iflush_int(
>   corrupt_out:
>   	return -EFSCORRUPTED;
>   }
> +
> +/* Decide if there are real or unwritten extents in the CoW fork. */
> +bool
> +xfs_inode_has_cow_blocks(
> +	struct xfs_inode		*ip)
> +{
> +	struct xfs_iext_cursor		icur;
> +	struct xfs_bmbt_irec		irec;
> +	struct xfs_ifork		*ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
> +
> +	if (!ifp)
> +		return false;
> +
> +	for_each_xfs_iext(ifp, &icur, &irec) {
> +		if (!isnullstartblock(irec.br_startblock))
> +			return true;
> +	}
> +	return false;
> +}
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 2ed63a49e890..735d0788bfdb 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -503,5 +503,6 @@ extern struct kmem_zone	*xfs_inode_zone;
>   #define XFS_DEFAULT_COWEXTSZ_HINT 32
>   
>   bool xfs_inode_verify_forks(struct xfs_inode *ip);
> +bool xfs_inode_has_cow_blocks(struct xfs_inode *ip);
>   
>   #endif	/* __XFS_INODE_H__ */
> 

Ok, this one looks pretty straightforward.

Reviewed-by: Allison Henderson <allison.henderson@oracle.com>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 03/21] xfs: refactor part of xfs_free_eofblocks
  2018-06-24 19:23 ` [PATCH 03/21] xfs: refactor part of xfs_free_eofblocks Darrick J. Wong
@ 2018-06-28 21:13   ` Allison Henderson
  0 siblings, 0 replies; 77+ messages in thread
From: Allison Henderson @ 2018-06-28 21:13 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 06/24/2018 12:23 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Refactor the part of _free_eofblocks that decides if it's really going
> to truncate post-EOF blocks into a separate helper function.  The
> upcoming repair freeze patch requires us to defer iput of an inode if
> disposing of that inode would have to start another transaction to
> unwind incore state.  No functionality changes.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>   fs/xfs/xfs_bmap_util.c |  101 ++++++++++++++++++++----------------------------
>   fs/xfs/xfs_inode.c     |   32 +++++++++++++++
>   fs/xfs/xfs_inode.h     |    1
>   3 files changed, 75 insertions(+), 59 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index c94d376e4152..0f38acbb200f 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -805,78 +805,61 @@ xfs_free_eofblocks(
>   	struct xfs_inode	*ip)
>   {
>   	struct xfs_trans	*tp;
> -	int			error;
> -	xfs_fileoff_t		end_fsb;
> -	xfs_fileoff_t		last_fsb;
> -	xfs_filblks_t		map_len;
> -	int			nimaps;
> -	struct xfs_bmbt_irec	imap;
>   	struct xfs_mount	*mp = ip->i_mount;
> +	int			error;
>   
>   	/*
> -	 * Figure out if there are any blocks beyond the end
> -	 * of the file.  If not, then there is nothing to do.
> +	 * If there are blocks after the end of file, truncate the file to its
> +	 * current size to free them up.
>   	 */
> -	end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)XFS_ISIZE(ip));
> -	last_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
> -	if (last_fsb <= end_fsb)
> +	if (!xfs_inode_has_posteof_blocks(ip))
>   		return 0;
> -	map_len = last_fsb - end_fsb;
> -
> -	nimaps = 1;
> -	xfs_ilock(ip, XFS_ILOCK_SHARED);
> -	error = xfs_bmapi_read(ip, end_fsb, map_len, &imap, &nimaps, 0);
> -	xfs_iunlock(ip, XFS_ILOCK_SHARED);
>   
>   	/*
> -	 * If there are blocks after the end of file, truncate the file to its
> -	 * current size to free them up.
> +	 * Attach the dquots to the inode up front.
>   	 */
> -	if (!error && (nimaps != 0) &&
> -	    (imap.br_startblock != HOLESTARTBLOCK ||
> -	     ip->i_delayed_blks)) {
> -		/*
> -		 * Attach the dquots to the inode up front.
> -		 */
> -		error = xfs_qm_dqattach(ip);
> -		if (error)
> -			return error;
> +	error = xfs_qm_dqattach(ip);
> +	if (error)
> +		return error;
>   
> -		/* wait on dio to ensure i_size has settled */
> -		inode_dio_wait(VFS_I(ip));
> +	/* wait on dio to ensure i_size has settled */
> +	inode_dio_wait(VFS_I(ip));
>   
> -		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0,
> -				&tp);
> -		if (error) {
> -			ASSERT(XFS_FORCED_SHUTDOWN(mp));
> -			return error;
> -		}
> +	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
> +	if (error) {
> +		ASSERT(XFS_FORCED_SHUTDOWN(mp));
> +		return error;
> +	}
>   
> -		xfs_ilock(ip, XFS_ILOCK_EXCL);
> -		xfs_trans_ijoin(tp, ip, 0);
> +	xfs_ilock(ip, XFS_ILOCK_EXCL);
> +	xfs_trans_ijoin(tp, ip, 0);
>   
> -		/*
> -		 * Do not update the on-disk file size.  If we update the
> -		 * on-disk file size and then the system crashes before the
> -		 * contents of the file are flushed to disk then the files
> -		 * may be full of holes (ie NULL files bug).
> -		 */
> -		error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
> -					XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
> -		if (error) {
> -			/*
> -			 * If we get an error at this point we simply don't
> -			 * bother truncating the file.
> -			 */
> -			xfs_trans_cancel(tp);
> -		} else {
> -			error = xfs_trans_commit(tp);
> -			if (!error)
> -				xfs_inode_clear_eofblocks_tag(ip);
> -		}
> +	/*
> +	 * Do not update the on-disk file size.  If we update the
> +	 * on-disk file size and then the system crashes before the
> +	 * contents of the file are flushed to disk then the files
> +	 * may be full of holes (ie NULL files bug).
> +	 */
> +	error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
> +				XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
> +	if (error)
> +		goto err_cancel;
>   
> -		xfs_iunlock(ip, XFS_ILOCK_EXCL);
> -	}
> +	error = xfs_trans_commit(tp);
> +	if (error)
> +		goto out_unlock;
> +
> +	xfs_inode_clear_eofblocks_tag(ip);
> +	goto out_unlock;
> +
> +err_cancel:
> +	/*
> +	 * If we get an error at this point we simply don't
> +	 * bother truncating the file.
> +	 */
> +	xfs_trans_cancel(tp);
> +out_unlock:
> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>   	return error;
>   }
>   
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index e6859dfc29af..368ac0528727 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -3708,3 +3708,35 @@ xfs_inode_has_cow_blocks(
>   	}
>   	return false;
>   }
> +
> +/*
> + * Decide if this inode have post-EOF blocks.  The caller is responsible
typo: have->has

> + * for knowing / caring about the PREALLOC/APPEND flags.
Why do the flags affect whether or not we have post-EOF blocks?

Other than minor nitpicks, it looks like mostly the same functionality.

Reviewed-by: Allison Henderson <allison.henderson@oracle.com>

> + */
> +bool
> +xfs_inode_has_posteof_blocks(
> +	struct xfs_inode	*ip)
> +{
> +	struct xfs_bmbt_irec	imap;
> +	struct xfs_mount	*mp = ip->i_mount;
> +	xfs_fileoff_t		end_fsb;
> +	xfs_fileoff_t		last_fsb;
> +	xfs_filblks_t		map_len;
> +	int			nimaps;
> +	int			error;
> +
> +	end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)XFS_ISIZE(ip));
> +	last_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
> +	if (last_fsb <= end_fsb)
> +		return false;
> +	map_len = last_fsb - end_fsb;
> +
> +	nimaps = 1;
> +	xfs_ilock(ip, XFS_ILOCK_SHARED);
> +	error = xfs_bmapi_read(ip, end_fsb, map_len, &imap, &nimaps, 0);
> +	xfs_iunlock(ip, XFS_ILOCK_SHARED);
> +
> +	return !error && (nimaps != 0) &&
> +	       (imap.br_startblock != HOLESTARTBLOCK ||
> +	        ip->i_delayed_blks);
> +}
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 735d0788bfdb..a041fffa1b33 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -504,5 +504,6 @@ extern struct kmem_zone	*xfs_inode_zone;
>   
>   bool xfs_inode_verify_forks(struct xfs_inode *ip);
>   bool xfs_inode_has_cow_blocks(struct xfs_inode *ip);
> +bool xfs_inode_has_posteof_blocks(struct xfs_inode *ip);
>   
>   #endif	/* __XFS_INODE_H__ */
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-24 19:23 ` [PATCH 04/21] xfs: repair the AGF and AGFL Darrick J. Wong
  2018-06-27  2:19   ` Dave Chinner
@ 2018-06-28 21:14   ` Allison Henderson
  2018-06-28 23:21     ` Dave Chinner
  1 sibling, 1 reply; 77+ messages in thread
From: Allison Henderson @ 2018-06-28 21:14 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 06/24/2018 12:23 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Regenerate the AGF and AGFL from the rmap data.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>   fs/xfs/scrub/agheader_repair.c |  644 ++++++++++++++++++++++++++++++++++++++++
>   fs/xfs/scrub/repair.c          |  106 ++++++-
>   fs/xfs/scrub/repair.h          |   15 +
>   fs/xfs/scrub/scrub.c           |    4
>   fs/xfs/xfs_trans.c             |   54 +++
>   fs/xfs/xfs_trans.h             |    2
>   6 files changed, 814 insertions(+), 11 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c
> index 117eedac53df..90e5e6cbc911 100644
> --- a/fs/xfs/scrub/agheader_repair.c
> +++ b/fs/xfs/scrub/agheader_repair.c
> @@ -17,12 +17,18 @@
>   #include "xfs_sb.h"
>   #include "xfs_inode.h"
>   #include "xfs_alloc.h"
> +#include "xfs_alloc_btree.h"
>   #include "xfs_ialloc.h"
> +#include "xfs_ialloc_btree.h"
>   #include "xfs_rmap.h"
> +#include "xfs_rmap_btree.h"
> +#include "xfs_refcount.h"
> +#include "xfs_refcount_btree.h"
>   #include "scrub/xfs_scrub.h"
>   #include "scrub/scrub.h"
>   #include "scrub/common.h"
>   #include "scrub/trace.h"
> +#include "scrub/repair.h"
>   
>   /* Superblock */
>   
> @@ -54,3 +60,641 @@ xfs_repair_superblock(
>   	xfs_trans_log_buf(sc->tp, bp, 0, BBTOB(bp->b_length) - 1);
>   	return error;
>   }
> +
> +/* AGF */
> +
> +struct xfs_repair_agf_allocbt {
> +	struct xfs_scrub_context	*sc;
> +	xfs_agblock_t			freeblks;
> +	xfs_agblock_t			longest;
> +};
> +
> +/* Record free space shape information. */
> +STATIC int
> +xfs_repair_agf_walk_allocbt(
> +	struct xfs_btree_cur		*cur,
> +	struct xfs_alloc_rec_incore	*rec,
> +	void				*priv)
> +{
> +	struct xfs_repair_agf_allocbt	*raa = priv;
> +	int				error = 0;
> +
> +	if (xfs_scrub_should_terminate(raa->sc, &error))
> +		return error;
> +
> +	raa->freeblks += rec->ar_blockcount;
> +	if (rec->ar_blockcount > raa->longest)
> +		raa->longest = rec->ar_blockcount;
> +	return error;
> +}
> +
> +/* Does this AGFL block look sane? */
> +STATIC int
> +xfs_repair_agf_check_agfl_block(
> +	struct xfs_mount		*mp,
> +	xfs_agblock_t			agbno,
> +	void				*priv)
> +{
> +	struct xfs_scrub_context	*sc = priv;
> +
> +	if (!xfs_verify_agbno(mp, sc->sa.agno, agbno))
> +		return -EFSCORRUPTED;
> +	return 0;
> +}
> +
> +/* Information for finding AGF-rooted btrees */
> +enum {
> +	REPAIR_AGF_BNOBT = 0,
> +	REPAIR_AGF_CNTBT,
> +	REPAIR_AGF_RMAPBT,
> +	REPAIR_AGF_REFCOUNTBT,
> +	REPAIR_AGF_END,
> +	REPAIR_AGF_MAX
> +};
> +
> +static const struct xfs_repair_find_ag_btree repair_agf[] = {
> +	[REPAIR_AGF_BNOBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_AG,
> +		.buf_ops = &xfs_allocbt_buf_ops,
> +		.magic = XFS_ABTB_CRC_MAGIC,
> +	},
> +	[REPAIR_AGF_CNTBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_AG,
> +		.buf_ops = &xfs_allocbt_buf_ops,
> +		.magic = XFS_ABTC_CRC_MAGIC,
> +	},
> +	[REPAIR_AGF_RMAPBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_AG,
> +		.buf_ops = &xfs_rmapbt_buf_ops,
> +		.magic = XFS_RMAP_CRC_MAGIC,
> +	},
> +	[REPAIR_AGF_REFCOUNTBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_REFC,
> +		.buf_ops = &xfs_refcountbt_buf_ops,
> +		.magic = XFS_REFC_CRC_MAGIC,
> +	},
> +	[REPAIR_AGF_END] = {
> +		.buf_ops = NULL,
> +	},
> +};
> +
> +/*
> + * Find the btree roots.  This is /also/ a chicken and egg problem because we
> + * have to use the rmapbt (rooted in the AGF) to find the btrees rooted in the
> + * AGF.  We also have no idea if the btrees make any sense.  If we hit obvious
> + * corruptions in those btrees we'll bail out.
> + */
It would help if maybe we could put /*IN*/ or /*OUT*/ on the 
parameters here?  And maybe a blurb about their usage.  From looking at 
how they're used in the memcpy, I'm guessing that agf_bp is IN and fab is 
OUT.  But otherwise it's not really clear how they're meant to be
used without going into the function to see how it handles them.

> +STATIC int
> +xfs_repair_agf_find_btrees(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp,
> +	struct xfs_repair_find_ag_btree	*fab,
> +	struct xfs_buf			*agfl_bp)
> +{
> +	struct xfs_agf			*old_agf = XFS_BUF_TO_AGF(agf_bp);
> +	int				error;
> +
> +	/* Go find the root data. */
> +	memcpy(fab, repair_agf, sizeof(repair_agf));
> +	error = xfs_repair_find_ag_btree_roots(sc, agf_bp, fab, agfl_bp);
> +	if (error)
> +		return error;
> +
> +	/* We must find the bnobt, cntbt, and rmapbt roots. */
> +	if (fab[REPAIR_AGF_BNOBT].root == NULLAGBLOCK ||
> +	    fab[REPAIR_AGF_BNOBT].height > XFS_BTREE_MAXLEVELS ||
> +	    fab[REPAIR_AGF_CNTBT].root == NULLAGBLOCK ||
> +	    fab[REPAIR_AGF_CNTBT].height > XFS_BTREE_MAXLEVELS ||
> +	    fab[REPAIR_AGF_RMAPBT].root == NULLAGBLOCK ||
> +	    fab[REPAIR_AGF_RMAPBT].height > XFS_BTREE_MAXLEVELS)
> +		return -EFSCORRUPTED;
> +
> +	/*
> +	 * We relied on the rmapbt to reconstruct the AGF.  If we get a
> +	 * different root then something's seriously wrong.
> +	 */
> +	if (fab[REPAIR_AGF_RMAPBT].root !=
> +	    be32_to_cpu(old_agf->agf_roots[XFS_BTNUM_RMAPi]))
> +		return -EFSCORRUPTED;
> +
> +	/* We must find the refcountbt root if that feature is enabled. */
> +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb) &&
> +	    (fab[REPAIR_AGF_REFCOUNTBT].root == NULLAGBLOCK ||
> +	     fab[REPAIR_AGF_REFCOUNTBT].height > XFS_BTREE_MAXLEVELS))
> +		return -EFSCORRUPTED;
> +
> +	return 0;
> +}
> +
> +/* Set btree root information in an AGF. */
> +STATIC void
> +xfs_repair_agf_set_roots(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_agf			*agf,
> +	struct xfs_repair_find_ag_btree	*fab)
> +{
> +	agf->agf_roots[XFS_BTNUM_BNOi] =
> +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].root);
> +	agf->agf_levels[XFS_BTNUM_BNOi] =
> +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].height);
> +
> +	agf->agf_roots[XFS_BTNUM_CNTi] =
> +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].root);
> +	agf->agf_levels[XFS_BTNUM_CNTi] =
> +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].height);
> +
> +	agf->agf_roots[XFS_BTNUM_RMAPi] =
> +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].root);
> +	agf->agf_levels[XFS_BTNUM_RMAPi] =
> +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].height);
> +
> +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb)) {
> +		agf->agf_refcount_root =
> +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].root);
> +		agf->agf_refcount_level =
> +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].height);
> +	}
> +}
> +
> +/*
> + * Reinitialize the AGF header, making an in-core copy of the old contents so
> + * that we know which in-core state needs to be reinitialized.
> + */
> +STATIC void
> +xfs_repair_agf_init_header(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp,
> +	struct xfs_agf			*old_agf)
> +{
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
> +
> +	memcpy(old_agf, agf, sizeof(*old_agf));
> +	memset(agf, 0, BBTOB(agf_bp->b_length));
> +	agf->agf_magicnum = cpu_to_be32(XFS_AGF_MAGIC);
> +	agf->agf_versionnum = cpu_to_be32(XFS_AGF_VERSION);
> +	agf->agf_seqno = cpu_to_be32(sc->sa.agno);
> +	agf->agf_length = cpu_to_be32(xfs_ag_block_count(mp, sc->sa.agno));
> +	agf->agf_flfirst = old_agf->agf_flfirst;
> +	agf->agf_fllast = old_agf->agf_fllast;
> +	agf->agf_flcount = old_agf->agf_flcount;
> +	if (xfs_sb_version_hascrc(&mp->m_sb))
> +		uuid_copy(&agf->agf_uuid, &mp->m_sb.sb_meta_uuid);
> +}
> +
> +/* Update the AGF btree counters by walking the btrees. */
> +STATIC int
> +xfs_repair_agf_update_btree_counters(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp)
> +{
> +	struct xfs_repair_agf_allocbt	raa = { .sc = sc };
> +	struct xfs_btree_cur		*cur = NULL;
> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
> +	struct xfs_mount		*mp = sc->mp;
> +	xfs_agblock_t			btreeblks;
> +	xfs_agblock_t			blocks;
> +	int				error;
> +
> +	/* Update the AGF counters from the bnobt. */
> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> +			XFS_BTNUM_BNO);
> +	error = xfs_alloc_query_all(cur, xfs_repair_agf_walk_allocbt, &raa);
> +	if (error)
> +		goto err;
> +	error = xfs_btree_count_blocks(cur, &blocks);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +	btreeblks = blocks - 1;
> +	agf->agf_freeblks = cpu_to_be32(raa.freeblks);
> +	agf->agf_longest = cpu_to_be32(raa.longest);
> +
> +	/* Update the AGF counters from the cntbt. */
> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> +			XFS_BTNUM_CNT);
> +	error = xfs_btree_count_blocks(cur, &blocks);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +	btreeblks += blocks - 1;
> +
> +	/* Update the AGF counters from the rmapbt. */
> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
> +	error = xfs_btree_count_blocks(cur, &blocks);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +	agf->agf_rmap_blocks = cpu_to_be32(blocks);
> +	btreeblks += blocks - 1;
> +
> +	agf->agf_btreeblks = cpu_to_be32(btreeblks);
> +
> +	/* Update the AGF counters from the refcountbt. */
> +	if (xfs_sb_version_hasreflink(&mp->m_sb)) {
> +		cur = xfs_refcountbt_init_cursor(mp, sc->tp, agf_bp,
> +				sc->sa.agno, NULL);
> +		error = xfs_btree_count_blocks(cur, &blocks);
> +		if (error)
> +			goto err;
> +		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +		agf->agf_refcount_blocks = cpu_to_be32(blocks);
> +	}
> +
> +	return 0;
> +err:
> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> +	return error;
> +}
> +
> +/* Trigger reinitialization of the in-core data. */
> +STATIC int
> +xfs_repair_agf_reinit_incore(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_agf			*agf,
> +	const struct xfs_agf		*old_agf)
> +{
> +	struct xfs_perag		*pag;
> +
> +	/* XXX: trigger fdblocks recalculation */
> +
> +	/* Now reinitialize the in-core counters if necessary. */
> +	pag = sc->sa.pag;
> +	if (!pag->pagf_init)
> +		return 0;
> +
> +	pag->pagf_btreeblks = be32_to_cpu(agf->agf_btreeblks);
> +	pag->pagf_freeblks = be32_to_cpu(agf->agf_freeblks);
> +	pag->pagf_longest = be32_to_cpu(agf->agf_longest);
> +	pag->pagf_levels[XFS_BTNUM_BNOi] =
> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNOi]);
> +	pag->pagf_levels[XFS_BTNUM_CNTi] =
> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNTi]);
> +	pag->pagf_levels[XFS_BTNUM_RMAPi] =
> +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAPi]);
> +	pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
> +
> +	return 0;
> +}
> +
> +/* Repair the AGF. */
> +int
> +xfs_repair_agf(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_repair_find_ag_btree	fab[REPAIR_AGF_MAX];
> +	struct xfs_agf			old_agf;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_buf			*agf_bp;
> +	struct xfs_buf			*agfl_bp;
> +	struct xfs_agf			*agf;
> +	int				error;
> +
> +	/* We require the rmapbt to rebuild anything. */
> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> +		return -EOPNOTSUPP;
> +
> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGF_DADDR(mp)),
> +			XFS_FSS_TO_BB(mp, 1), 0, &agf_bp, NULL);
> +	if (error)
> +		return error;
> +	agf_bp->b_ops = &xfs_agf_buf_ops;
> +	agf = XFS_BUF_TO_AGF(agf_bp);
> +
> +	/*
> +	 * Load the AGFL so that we can screen out OWN_AG blocks that are on
> +	 * the AGFL now; these blocks might have once been part of the
> +	 * bno/cnt/rmap btrees but are not now.  This is a chicken and egg
> +	 * problem: the AGF is corrupt, so we have to trust the AGFL contents
> +	 * because we can't do any serious cross-referencing with any of the
> +	 * btrees rooted in the AGF.  If the AGFL contents are obviously bad
> +	 * then we'll bail out.
> +	 */
> +	error = xfs_alloc_read_agfl(mp, sc->tp, sc->sa.agno, &agfl_bp);
> +	if (error)
> +		return error;
> +
> +	/*
> +	 * Spot-check the AGFL blocks; if they're obviously corrupt then
> +	 * there's nothing we can do but bail out.
> +	 */
> +	error = xfs_agfl_walk(sc->mp, XFS_BUF_TO_AGF(agf_bp), agfl_bp,
> +			xfs_repair_agf_check_agfl_block, sc);
> +	if (error)
> +		return error;
> +
> +	/*
> +	 * Find the AGF btree roots.  See the comment for this function for
> +	 * more information about the limitations of this repairer; this is
> +	 * also a chicken-and-egg situation.
> +	 */
> +	error = xfs_repair_agf_find_btrees(sc, agf_bp, fab, agfl_bp);
> +	if (error)
> +		return error;
> +
> +	/* Start rewriting the header and implant the btrees we found. */
> +	xfs_repair_agf_init_header(sc, agf_bp, &old_agf);
> +	xfs_repair_agf_set_roots(sc, agf, fab);
> +	error = xfs_repair_agf_update_btree_counters(sc, agf_bp);
> +	if (error)
> +		goto out_revert;
> +
> +	/* Reinitialize in-core state. */
> +	error = xfs_repair_agf_reinit_incore(sc, agf, &old_agf);
> +	if (error)
> +		goto out_revert;
> +
> +	/* Write this to disk. */
> +	xfs_trans_buf_set_type(sc->tp, agf_bp, XFS_BLFT_AGF_BUF);
> +	xfs_trans_log_buf(sc->tp, agf_bp, 0, BBTOB(agf_bp->b_length) - 1);
> +	return 0;
> +
> +out_revert:
> +	memcpy(agf, &old_agf, sizeof(old_agf));
> +	return error;
> +}
> +
> +/* AGFL */
> +
> +struct xfs_repair_agfl {
> +	struct xfs_repair_extent_list	agmeta_list;
> +	struct xfs_repair_extent_list	*freesp_list;
> +	struct xfs_scrub_context	*sc;
> +};
> +
> +/* Record all freespace information. */
So I've gathered that *_fn routines are callback functions, but it
would be nice to have a blurb or something here that describes what it's
a callback for, just to help make clear what it's doing in the greater
scheme of things.  Some of these will make more sense when we see where 
and how they're used.

> +STATIC int
> +xfs_repair_agfl_rmap_fn(
> +	struct xfs_btree_cur		*cur,
> +	struct xfs_rmap_irec		*rec,
> +	void				*priv)
> +{
> +	struct xfs_repair_agfl		*ra = priv;
> +	xfs_fsblock_t			fsb;
> +	int				error = 0;
> +
> +	if (xfs_scrub_should_terminate(ra->sc, &error))
> +		return error;
> +
> +	/* Record all the OWN_AG blocks. */
> +	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
> +		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
> +				rec->rm_startblock);
> +		error = xfs_repair_collect_btree_extent(ra->sc,
> +				ra->freesp_list, fsb, rec->rm_blockcount);
> +		if (error)
> +			return error;
> +	}
> +
> +	return xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
> +			xfs_repair_collect_btree_cur_blocks_in_extent_list,
> +			&ra->agmeta_list);
> +}
> +
> +/* Add a btree block to the agmeta list. */
> +STATIC int
> +xfs_repair_agfl_visit_btblock(
> +	struct xfs_btree_cur		*cur,
> +	int				level,
> +	void				*priv)
> +{
> +	struct xfs_repair_agfl		*ra = priv;
> +	struct xfs_buf			*bp;
> +	xfs_fsblock_t			fsb;
> +	int				error = 0;
> +
> +	if (xfs_scrub_should_terminate(ra->sc, &error))
> +		return error;
> +
> +	xfs_btree_get_block(cur, level, &bp);
> +	if (!bp)
> +		return 0;
> +
> +	fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
> +	return xfs_repair_collect_btree_extent(ra->sc, &ra->agmeta_list,
> +			fsb, 1);
> +}
> +
> +/*
> + * Map out all the non-AGFL OWN_AG space in this AG so that we can deduce
> + * which blocks belong to the AGFL.
> + */
> +STATIC int
> +xfs_repair_agfl_find_extents(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp,
> +	struct xfs_repair_extent_list	*agfl_extents,
> +	xfs_agblock_t			*flcount)
> +{
> +	struct xfs_repair_agfl		ra;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_btree_cur		*cur;
> +	struct xfs_repair_extent	*rae;
> +	int				error;
> +
> +	ra.sc = sc;
> +	ra.freesp_list = agfl_extents;
> +	xfs_repair_init_extent_list(&ra.agmeta_list);
> +
> +	/* Find all space used by the free space btrees & rmapbt. */
> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
> +	error = xfs_rmap_query_all(cur, xfs_repair_agfl_rmap_fn, &ra);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +
> +	/* Find all space used by bnobt. */
> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> +			XFS_BTNUM_BNO);
> +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +
> +	/* Find all space used by cntbt. */
> +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> +			XFS_BTNUM_CNT);
> +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
> +	if (error)
> +		goto err;
> +
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +
> +	/*
> +	 * Drop the freesp meta blocks that are in use by btrees.
> +	 * The remaining blocks /should/ be AGFL blocks.
> +	 */
> +	error = xfs_repair_subtract_extents(sc, agfl_extents, &ra.agmeta_list);
> +	xfs_repair_cancel_btree_extents(sc, &ra.agmeta_list);
> +	if (error)
> +		return error;
> +
> +	/* Calculate the new AGFL size. */
> +	*flcount = 0;
> +	for_each_xfs_repair_extent(rae, agfl_extents) {
> +		*flcount += rae->len;
> +		if (*flcount > xfs_agfl_size(mp))
> +			break;
> +	}
> +	if (*flcount > xfs_agfl_size(mp))
> +		*flcount = xfs_agfl_size(mp);
> +	return 0;
> +
> +err:
> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> +	return error;
> +}
Thanks for the comments, they help!

> +
> +/* Update the AGF and reset the in-core state. */
> +STATIC int
> +xfs_repair_agfl_update_agf(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agf_bp,
> +	xfs_agblock_t			flcount)
> +{
> +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
> +
> +	/* XXX: trigger fdblocks recalculation */
> +
> +	/* Update the AGF counters. */
> +	if (sc->sa.pag->pagf_init)
> +		sc->sa.pag->pagf_flcount = flcount;
> +	agf->agf_flfirst = cpu_to_be32(0);
> +	agf->agf_flcount = cpu_to_be32(flcount);
> +	agf->agf_fllast = cpu_to_be32(flcount - 1);
> +
> +	xfs_alloc_log_agf(sc->tp, agf_bp,
> +			XFS_AGF_FLFIRST | XFS_AGF_FLLAST | XFS_AGF_FLCOUNT);
> +	return 0;
> +}
> +
> +/* Write out a totally new AGFL. */
> +STATIC void
> +xfs_repair_agfl_init_header(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agfl_bp,
> +	struct xfs_repair_extent_list	*agfl_extents,
> +	xfs_agblock_t			flcount)
> +{
> +	struct xfs_mount		*mp = sc->mp;
> +	__be32				*agfl_bno;
> +	struct xfs_repair_extent	*rae;
> +	struct xfs_repair_extent	*n;
> +	struct xfs_agfl			*agfl;
> +	xfs_agblock_t			agbno;
> +	unsigned int			fl_off;
> +
> +	/* Start rewriting the header. */
> +	agfl = XFS_BUF_TO_AGFL(agfl_bp);
> +	memset(agfl, 0xFF, BBTOB(agfl_bp->b_length));
> +	agfl->agfl_magicnum = cpu_to_be32(XFS_AGFL_MAGIC);
> +	agfl->agfl_seqno = cpu_to_be32(sc->sa.agno);
> +	uuid_copy(&agfl->agfl_uuid, &mp->m_sb.sb_meta_uuid);
> +
> +	/* Fill the AGFL with the remaining blocks. */
> +	fl_off = 0;
> +	agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agfl_bp);
> +	for_each_xfs_repair_extent_safe(rae, n, agfl_extents) {
> +		agbno = XFS_FSB_TO_AGBNO(mp, rae->fsbno);
> +
> +		trace_xfs_repair_agfl_insert(mp, sc->sa.agno, agbno, rae->len);
> +
> +		while (rae->len > 0 && fl_off < flcount) {
> +			agfl_bno[fl_off] = cpu_to_be32(agbno);
> +			fl_off++;
> +			agbno++;
> +			rae->fsbno++;
> +			rae->len--;
> +		}
> +
> +		if (rae->len)
> +			break;
> +		list_del(&rae->list);
> +		kmem_free(rae);
> +	}
> +
> +	/* Write AGF and AGFL to disk. */
> +	xfs_trans_buf_set_type(sc->tp, agfl_bp, XFS_BLFT_AGFL_BUF);
> +	xfs_trans_log_buf(sc->tp, agfl_bp, 0, BBTOB(agfl_bp->b_length) - 1);
> +}
> +
> +/* Repair the AGFL. */
> +int
> +xfs_repair_agfl(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_owner_info		oinfo;
> +	struct xfs_repair_extent_list	agfl_extents;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_buf			*agf_bp;
> +	struct xfs_buf			*agfl_bp;
> +	xfs_agblock_t			flcount;
> +	int				error;
> +
> +	/* We require the rmapbt to rebuild anything. */
> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> +		return -EOPNOTSUPP;
> +
> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> +	xfs_repair_init_extent_list(&agfl_extents);
> +
> +	/*
> +	 * Read the AGF so that we can query the rmapbt.  We hope that there's
> +	 * nothing wrong with the AGF, but all the AG header repair functions
> +	 * have this chicken-and-egg problem.
> +	 */
> +	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
> +	if (error)
> +		return error;
> +	if (!agf_bp)
> +		return -ENOMEM;
> +
> +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGFL_DADDR(mp)),
> +			XFS_FSS_TO_BB(mp, 1), 0, &agfl_bp, NULL);
> +	if (error)
> +		return error;
> +	agfl_bp->b_ops = &xfs_agfl_buf_ops;
> +
> +	/*
> +	 * Compute the set of old AGFL blocks by subtracting from the list of
> +	 * OWN_AG blocks the list of blocks owned by all other OWN_AG metadata
> +	 * (bnobt, cntbt, rmapbt).  These are the old AGFL blocks, so return
> +	 * that list and the number of blocks we're actually going to put back
> +	 * on the AGFL.
> +	 */
> +	error = xfs_repair_agfl_find_extents(sc, agf_bp, &agfl_extents,
> +			&flcount);
> +	if (error)
> +		goto err;
> +
> +	/*
> +	 * Update AGF and AGFL.  We reset the global free block counter when
> +	 * we adjust the AGF flcount (which can fail) so avoid updating any
> +	 * buffers until we know that part works.
> +	 */
> +	error = xfs_repair_agfl_update_agf(sc, agf_bp, flcount);
> +	if (error)
> +		goto err;
> +	xfs_repair_agfl_init_header(sc, agfl_bp, &agfl_extents, flcount);
> +
> +	/*
> +	 * Ok, the AGFL should be ready to go now.  Roll the transaction so
> +	 * that we can free any AGFL overflow.
> +	 */
> +	sc->sa.agf_bp = agf_bp;
> +	sc->sa.agfl_bp = agfl_bp;
> +	error = xfs_repair_roll_ag_trans(sc);
> +	if (error)
> +		goto err;
> +
> +	/* Dump any AGFL overflow. */
> +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
> +	return xfs_repair_reap_btree_extents(sc, &agfl_extents, &oinfo,
> +			XFS_AG_RESV_AGFL);
> +err:
> +	xfs_repair_cancel_btree_extents(sc, &agfl_extents);
> +	return error;
> +}
> diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
> index 326be4e8b71e..bcdaa8df18f6 100644
> --- a/fs/xfs/scrub/repair.c
> +++ b/fs/xfs/scrub/repair.c
> @@ -127,9 +127,12 @@ xfs_repair_roll_ag_trans(
>   	int				error;
>   
>   	/* Keep the AG header buffers locked so we can keep going. */
> -	xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
> -	xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
> -	xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
> +	if (sc->sa.agi_bp)
> +		xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
> +	if (sc->sa.agf_bp)
> +		xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
> +	if (sc->sa.agfl_bp)
> +		xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
>   
>   	/* Roll the transaction. */
>   	error = xfs_trans_roll(&sc->tp);
> @@ -137,9 +140,12 @@ xfs_repair_roll_ag_trans(
>   		goto out_release;
>   
>   	/* Join AG headers to the new transaction. */
> -	xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
> -	xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
> -	xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
> +	if (sc->sa.agi_bp)
> +		xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
> +	if (sc->sa.agf_bp)
> +		xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
> +	if (sc->sa.agfl_bp)
> +		xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
>   
>   	return 0;
>   
> @@ -149,9 +155,12 @@ xfs_repair_roll_ag_trans(
>   	 * buffers will be released during teardown on our way out
>   	 * of the kernel.
>   	 */
> -	xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
> -	xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
> -	xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
> +	if (sc->sa.agi_bp)
> +		xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
> +	if (sc->sa.agf_bp)
> +		xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
> +	if (sc->sa.agfl_bp)
> +		xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
>   
>   	return error;
>   }
> @@ -408,6 +417,85 @@ xfs_repair_collect_btree_extent(
>   	return 0;
>   }
>   
> +/*
> + * Help record all btree blocks seen while iterating all records of a btree.
> + *
> + * We know that the btree query_all function starts at the left edge and walks
> + * towards the right edge of the tree.  Therefore, we know that we can walk up
> + * the btree cursor towards the root; if the pointer for a given level points
> + * to the first record/key in that block, we haven't seen this block before;
> + * and therefore we need to remember that we saw this block in the btree.
> + *
> + * So if our btree is:
> + *
> + *    4
> + *  / | \
> + * 1  2  3
> + *
> + * Pretend for this example that each leaf block has 100 btree records.  For
> + * the first btree record, we'll observe that bc_ptrs[0] == 1, so we record
> + * that we saw block 1.  Then we observe that bc_ptrs[1] == 1, so we record
> + * block 4.  The list is [1, 4].
> + *
> + * For the second btree record, we see that bc_ptrs[0] == 2, so we exit the
> + * loop.  The list remains [1, 4].
> + *
> + * For the 101st btree record, we've moved onto leaf block 2.  Now
> + * bc_ptrs[0] == 1 again, so we record that we saw block 2.  We see that
> + * bc_ptrs[1] == 2, so we exit the loop.  The list is now [1, 4, 2].
> + *
> + * For the 102nd record, bc_ptrs[0] == 2, so we continue.
> + *
> + * For the 201st record, we've moved on to leaf block 3.  bc_ptrs[0] == 1, so
> + * we add 3 to the list.  Now it is [1, 4, 2, 3].
> + *
> + * For the 300th record we just exit, with the list being [1, 4, 2, 3].
> + *
> + * The *iter_fn can return XFS_BTREE_QUERY_RANGE_ABORT to stop, 0 to keep
> + * iterating, or the usual negative error code.
> + */
Thank you, the comment block helps a lot!

> +int
> +xfs_repair_collect_btree_cur_blocks(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_btree_cur		*cur,
> +	int				(*iter_fn)(struct xfs_scrub_context *sc,
> +						   xfs_fsblock_t fsbno,
> +						   xfs_fsblock_t len,
> +						   void *priv),
> +	void				*priv)
> +{
> +	struct xfs_buf			*bp;
> +	xfs_fsblock_t			fsb;
> +	int				i;
> +	int				error;
> +
> +	for (i = 0; i < cur->bc_nlevels && cur->bc_ptrs[i] == 1; i++) {
> +		xfs_btree_get_block(cur, i, &bp);
> +		if (!bp)
> +			continue;
> +		fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
> +		error = iter_fn(sc, fsb, 1, priv);
> +		if (error)
> +			return error;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Simple adapter to connect xfs_repair_collect_btree_extent to
> + * xfs_repair_collect_btree_cur_blocks.
> + */
> +int
> +xfs_repair_collect_btree_cur_blocks_in_extent_list(
> +	struct xfs_scrub_context	*sc,
> +	xfs_fsblock_t			fsbno,
> +	xfs_fsblock_t			len,
> +	void				*priv)
> +{
> +	return xfs_repair_collect_btree_extent(sc, priv, fsbno, len);
> +}
> +
>   /*
>    * An error happened during the rebuild so the transaction will be cancelled.
>    * The fs will shut down, and the administrator has to unmount and run repair.
> diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
> index ef47826b6725..f2af5923aa75 100644
> --- a/fs/xfs/scrub/repair.h
> +++ b/fs/xfs/scrub/repair.h
> @@ -48,9 +48,20 @@ xfs_repair_init_extent_list(
>   
>   #define for_each_xfs_repair_extent_safe(rbe, n, exlist) \
>   	list_for_each_entry_safe((rbe), (n), &(exlist)->list, list)
> +#define for_each_xfs_repair_extent(rbe, exlist) \
> +	list_for_each_entry((rbe), &(exlist)->list, list)
>   int xfs_repair_collect_btree_extent(struct xfs_scrub_context *sc,
>   		struct xfs_repair_extent_list *btlist, xfs_fsblock_t fsbno,
>   		xfs_extlen_t len);
> +int xfs_repair_collect_btree_cur_blocks(struct xfs_scrub_context *sc,
> +		struct xfs_btree_cur *cur,
> +		int (*iter_fn)(struct xfs_scrub_context *sc,
> +			       xfs_fsblock_t fsbno, xfs_fsblock_t len,
> +			       void *priv),
> +		void *priv);
> +int xfs_repair_collect_btree_cur_blocks_in_extent_list(
> +		struct xfs_scrub_context *sc, xfs_fsblock_t fsbno,
> +		xfs_fsblock_t len, void *priv);
>   void xfs_repair_cancel_btree_extents(struct xfs_scrub_context *sc,
>   		struct xfs_repair_extent_list *btlist);
>   int xfs_repair_subtract_extents(struct xfs_scrub_context *sc,
> @@ -89,6 +100,8 @@ int xfs_repair_ino_dqattach(struct xfs_scrub_context *sc);
>   
>   int xfs_repair_probe(struct xfs_scrub_context *sc);
>   int xfs_repair_superblock(struct xfs_scrub_context *sc);
> +int xfs_repair_agf(struct xfs_scrub_context *sc);
> +int xfs_repair_agfl(struct xfs_scrub_context *sc);
>   
>   #else
>   
> @@ -112,6 +125,8 @@ xfs_repair_calc_ag_resblks(
>   
>   #define xfs_repair_probe		xfs_repair_notsupported
>   #define xfs_repair_superblock		xfs_repair_notsupported
> +#define xfs_repair_agf			xfs_repair_notsupported
> +#define xfs_repair_agfl			xfs_repair_notsupported
>   
>   #endif /* CONFIG_XFS_ONLINE_REPAIR */
>   
> diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> index 58ae76b3a421..8e11c3c699fb 100644
> --- a/fs/xfs/scrub/scrub.c
> +++ b/fs/xfs/scrub/scrub.c
> @@ -208,13 +208,13 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
>   		.type	= ST_PERAG,
>   		.setup	= xfs_scrub_setup_fs,
>   		.scrub	= xfs_scrub_agf,
> -		.repair	= xfs_repair_notsupported,
> +		.repair	= xfs_repair_agf,
>   	},
>   	[XFS_SCRUB_TYPE_AGFL]= {	/* agfl */
>   		.type	= ST_PERAG,
>   		.setup	= xfs_scrub_setup_fs,
>   		.scrub	= xfs_scrub_agfl,
> -		.repair	= xfs_repair_notsupported,
> +		.repair	= xfs_repair_agfl,
>   	},
>   	[XFS_SCRUB_TYPE_AGI] = {	/* agi */
>   		.type	= ST_PERAG,
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index 524f543c5b82..c08785cf83a9 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -126,6 +126,60 @@ xfs_trans_dup(
>   	return ntp;
>   }
>   
> +/*
> + * Try to reserve more blocks for a transaction.  The single use case we
> + * support is for online repair -- use a transaction to gather data without
> + * fear of btree cycle deadlocks; calculate how many blocks we really need
> + * from that data; and only then start modifying data.  This can fail due to
> + * ENOSPC, so we have to be able to cancel the transaction.
> + */
> +int
> +xfs_trans_reserve_more(
> +	struct xfs_trans	*tp,
> +	uint			blocks,
> +	uint			rtextents)
> +{
> +	struct xfs_mount	*mp = tp->t_mountp;
> +	bool			rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
> +	int			error = 0;
> +
> +	ASSERT(!(tp->t_flags & XFS_TRANS_DIRTY));
> +
> +	/*
> +	 * Attempt to reserve the needed disk blocks by decrementing
> +	 * the number needed from the number available.  This will
> +	 * fail if the count would go below zero.
> +	 */
> +	if (blocks > 0) {
> +		error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
> +		if (error)
> +			return -ENOSPC;
> +		tp->t_blk_res += blocks;
> +	}
> +
> +	/*
> +	 * Attempt to reserve the needed realtime extents by decrementing
> +	 * the number needed from the number available.  This will
> +	 * fail if the count would go below zero.
> +	 */
> +	if (rtextents > 0) {
> +		error = xfs_mod_frextents(mp, -((int64_t)rtextents));
> +		if (error) {
> +			error = -ENOSPC;
> +			goto out_blocks;
> +		}
> +		tp->t_rtx_res += rtextents;
> +	}
> +
> +	return 0;
> +out_blocks:
> +	if (blocks > 0) {
> +		xfs_mod_fdblocks(mp, (int64_t)blocks, rsvd);
> +		tp->t_blk_res -= blocks;
> +	}
> +	return error;
> +}
> +
>   /*
>    * This is called to reserve free disk blocks and log space for the
>    * given transaction.  This must be done before allocating any resources
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index 6526314f0b8f..bdbd3d5fd7b0 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -153,6 +153,8 @@ typedef struct xfs_trans {
>   int		xfs_trans_alloc(struct xfs_mount *mp, struct xfs_trans_res *resp,
>   			uint blocks, uint rtextents, uint flags,
>   			struct xfs_trans **tpp);
> +int		xfs_trans_reserve_more(struct xfs_trans *tp, uint blocks,
> +			uint rtextents);
>   int		xfs_trans_alloc_empty(struct xfs_mount *mp,
>   			struct xfs_trans **tpp);
>   void		xfs_trans_mod_sb(xfs_trans_t *, uint, int64_t);
> 

Ok, so it definitely took some digging to understand how it all is meant 
to come together, but I think I understand the overall idea.  It is a 
pretty big patch; if it's possible to divide the AGF and AGFL routines 
into separate patches, that may be helpful too.  It sounds like from the 
other commentary there may still be yet another revision coming, so I'll 
look for it.  Thanks!

Allison

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


* Re: [PATCH 05/21] xfs: repair the AGI
  2018-06-24 19:24 ` [PATCH 05/21] xfs: repair the AGI Darrick J. Wong
  2018-06-27  2:22   ` Dave Chinner
@ 2018-06-28 21:15   ` Allison Henderson
  1 sibling, 0 replies; 77+ messages in thread
From: Allison Henderson @ 2018-06-28 21:15 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 06/24/2018 12:24 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Rebuild the AGI header items with some help from the rmapbt.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>   fs/xfs/scrub/agheader_repair.c |  211 ++++++++++++++++++++++++++++++++++++++++
>   fs/xfs/scrub/repair.h          |    2
>   fs/xfs/scrub/scrub.c           |    2
>   3 files changed, 214 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c
> index 90e5e6cbc911..61e0134f6f9f 100644
> --- a/fs/xfs/scrub/agheader_repair.c
> +++ b/fs/xfs/scrub/agheader_repair.c
> @@ -698,3 +698,214 @@ xfs_repair_agfl(
>   	xfs_repair_cancel_btree_extents(sc, &agfl_extents);
>   	return error;
>   }
> +
> +/* AGI */
> +
> +enum {
> +	REPAIR_AGI_INOBT = 0,
> +	REPAIR_AGI_FINOBT,
> +	REPAIR_AGI_END,
> +	REPAIR_AGI_MAX
> +};
> +
> +static const struct xfs_repair_find_ag_btree repair_agi[] = {
> +	[REPAIR_AGI_INOBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_INOBT,
> +		.buf_ops = &xfs_inobt_buf_ops,
> +		.magic = XFS_IBT_CRC_MAGIC,
> +	},
> +	[REPAIR_AGI_FINOBT] = {
> +		.rmap_owner = XFS_RMAP_OWN_INOBT,
> +		.buf_ops = &xfs_inobt_buf_ops,
> +		.magic = XFS_FIBT_CRC_MAGIC,
> +	},
> +	[REPAIR_AGI_END] = {
> +		.buf_ops = NULL
> +	},
> +};
> +
> +/* Find the inode btree roots from the rmap data. */
> +STATIC int
> +xfs_repair_agi_find_btrees(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_repair_find_ag_btree	*fab)
> +{
> +	struct xfs_buf			*agf_bp;
> +	struct xfs_mount		*mp = sc->mp;
> +	int				error;
> +
> +	memcpy(fab, repair_agi, sizeof(repair_agi));
> +
> +	/* Read the AGF. */
> +	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
> +	if (error)
> +		return error;
> +	if (!agf_bp)
> +		return -ENOMEM;
> +
> +	/* Find the btree roots. */
> +	error = xfs_repair_find_ag_btree_roots(sc, agf_bp, fab, NULL);
> +	if (error)
> +		return error;
> +
> +	/* We must find the inobt root. */
> +	if (fab[REPAIR_AGI_INOBT].root == NULLAGBLOCK ||
> +	    fab[REPAIR_AGI_INOBT].height > XFS_BTREE_MAXLEVELS)
> +		return -EFSCORRUPTED;
> +
> +	/* We must find the finobt root if that feature is enabled. */
> +	if (xfs_sb_version_hasfinobt(&mp->m_sb) &&
> +	    (fab[REPAIR_AGI_FINOBT].root == NULLAGBLOCK ||
> +	     fab[REPAIR_AGI_FINOBT].height > XFS_BTREE_MAXLEVELS))
> +		return -EFSCORRUPTED;
> +
> +	return 0;
> +}
> +
> +/*
> + * Reinitialize the AGI header, making an in-core copy of the old contents so
> + * that we know which in-core state needs to be reinitialized.
> + */
> +STATIC void
> +xfs_repair_agi_init_header(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agi_bp,
> +	struct xfs_agi			*old_agi)
> +{
> +	struct xfs_agi			*agi = XFS_BUF_TO_AGI(agi_bp);
> +	struct xfs_mount		*mp = sc->mp;
> +
> +	memcpy(old_agi, agi, sizeof(*old_agi));
> +	memset(agi, 0, BBTOB(agi_bp->b_length));
> +	agi->agi_magicnum = cpu_to_be32(XFS_AGI_MAGIC);
> +	agi->agi_versionnum = cpu_to_be32(XFS_AGI_VERSION);
> +	agi->agi_seqno = cpu_to_be32(sc->sa.agno);
> +	agi->agi_length = cpu_to_be32(xfs_ag_block_count(mp, sc->sa.agno));
> +	agi->agi_newino = cpu_to_be32(NULLAGINO);
> +	agi->agi_dirino = cpu_to_be32(NULLAGINO);
> +	if (xfs_sb_version_hascrc(&mp->m_sb))
> +		uuid_copy(&agi->agi_uuid, &mp->m_sb.sb_meta_uuid);
> +
> +	/* We don't know how to fix the unlinked list yet. */
> +	memcpy(&agi->agi_unlinked, &old_agi->agi_unlinked,
> +			sizeof(agi->agi_unlinked));
> +}
> +
> +/* Set btree root information in an AGI. */
> +STATIC void
> +xfs_repair_agi_set_roots(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_agi			*agi,
> +	struct xfs_repair_find_ag_btree	*fab)
> +{
> +	agi->agi_root = cpu_to_be32(fab[REPAIR_AGI_INOBT].root);
> +	agi->agi_level = cpu_to_be32(fab[REPAIR_AGI_INOBT].height);
> +
> +	if (xfs_sb_version_hasfinobt(&sc->mp->m_sb)) {
> +		agi->agi_free_root = cpu_to_be32(fab[REPAIR_AGI_FINOBT].root);
> +		agi->agi_free_level =
> +				cpu_to_be32(fab[REPAIR_AGI_FINOBT].height);
> +	}
> +}
> +
> +/* Update the AGI counters. */
> +STATIC int
> +xfs_repair_agi_update_btree_counters(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*agi_bp)
> +{
> +	struct xfs_btree_cur		*cur;
> +	struct xfs_agi			*agi = XFS_BUF_TO_AGI(agi_bp);
> +	struct xfs_mount		*mp = sc->mp;
> +	xfs_agino_t			count;
> +	xfs_agino_t			freecount;
> +	int				error;
> +
> +	cur = xfs_inobt_init_cursor(mp, sc->tp, agi_bp, sc->sa.agno,
> +			XFS_BTNUM_INO);
> +	error = xfs_ialloc_count_inodes(cur, &count, &freecount);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +
> +	agi->agi_count = cpu_to_be32(count);
> +	agi->agi_freecount = cpu_to_be32(freecount);
> +	return 0;
> +err:
> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> +	return error;
> +}
> +
> +/* Trigger reinitialization of the in-core data. */
> +STATIC int
> +xfs_repair_agi_reinit_incore(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_agi			*agi,
> +	const struct xfs_agi		*old_agi)
> +{
> +	struct xfs_perag		*pag;
> +
> +	/* XXX: trigger inode count recalculation */
> +
> +	/* Now reinitialize the in-core counters if necessary. */
> +	pag = sc->sa.pag;
> +	if (!pag->pagi_init)
> +		return 0;
> +
> +	pag->pagi_count = be32_to_cpu(agi->agi_count);
> +	pag->pagi_freecount = be32_to_cpu(agi->agi_freecount);
> +
> +	return 0;
> +}
> +
> +/* Repair the AGI. */
> +int
> +xfs_repair_agi(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_repair_find_ag_btree	fab[REPAIR_AGI_MAX];
> +	struct xfs_agi			old_agi;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_buf			*agi_bp;
> +	struct xfs_agi			*agi;
> +	int				error;
> +
> +	/* We require the rmapbt to rebuild anything. */
> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> +		return -EOPNOTSUPP;
> +
> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGI_DADDR(mp)),
> +			XFS_FSS_TO_BB(mp, 1), 0, &agi_bp, NULL);
> +	if (error)
> +		return error;
> +	agi_bp->b_ops = &xfs_agi_buf_ops;
> +	agi = XFS_BUF_TO_AGI(agi_bp);
> +
> +	/* Find the AGI btree roots. */
> +	error = xfs_repair_agi_find_btrees(sc, fab);
> +	if (error)
> +		return error;
> +
> +	/* Start rewriting the header and implant the btrees we found. */
> +	xfs_repair_agi_init_header(sc, agi_bp, &old_agi);
> +	xfs_repair_agi_set_roots(sc, agi, fab);
> +	error = xfs_repair_agi_update_btree_counters(sc, agi_bp);
> +	if (error)
> +		goto out_revert;
> +
> +	/* Reinitialize in-core state. */
> +	error = xfs_repair_agi_reinit_incore(sc, agi, &old_agi);
> +	if (error)
> +		goto out_revert;
> +
> +	/* Write this to disk. */
> +	xfs_trans_buf_set_type(sc->tp, agi_bp, XFS_BLFT_AGI_BUF);
> +	xfs_trans_log_buf(sc->tp, agi_bp, 0, BBTOB(agi_bp->b_length) - 1);
> +	return error;
> +
> +out_revert:
> +	memcpy(agi, &old_agi, sizeof(old_agi));
> +	return error;
> +}
> diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
> index f2af5923aa75..d541c1586d0a 100644
> --- a/fs/xfs/scrub/repair.h
> +++ b/fs/xfs/scrub/repair.h
> @@ -102,6 +102,7 @@ int xfs_repair_probe(struct xfs_scrub_context *sc);
>   int xfs_repair_superblock(struct xfs_scrub_context *sc);
>   int xfs_repair_agf(struct xfs_scrub_context *sc);
>   int xfs_repair_agfl(struct xfs_scrub_context *sc);
> +int xfs_repair_agi(struct xfs_scrub_context *sc);
>   
>   #else
>   
> @@ -127,6 +128,7 @@ xfs_repair_calc_ag_resblks(
>   #define xfs_repair_superblock		xfs_repair_notsupported
>   #define xfs_repair_agf			xfs_repair_notsupported
>   #define xfs_repair_agfl			xfs_repair_notsupported
> +#define xfs_repair_agi			xfs_repair_notsupported
>   
>   #endif /* CONFIG_XFS_ONLINE_REPAIR */
>   
> diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> index 8e11c3c699fb..0f036aab2551 100644
> --- a/fs/xfs/scrub/scrub.c
> +++ b/fs/xfs/scrub/scrub.c
> @@ -220,7 +220,7 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
>   		.type	= ST_PERAG,
>   		.setup	= xfs_scrub_setup_fs,
>   		.scrub	= xfs_scrub_agi,
> -		.repair	= xfs_repair_notsupported,
> +		.repair	= xfs_repair_agi,
>   	},
>   	[XFS_SCRUB_TYPE_BNOBT] = {	/* bnobt */
>   		.type	= ST_PERAG,
> 

It looks OK in terms of aligning with the infrastructure in the last 
patch. It does look pretty similar to the xfs_repair_ag* routines, so if 
there's common code that can be factored out, that would be great. 
Realistically, though, in studying what that solution might look like, I 
think the AGF/AGFL/AGI code paths have just enough differences that 
trying to push them all through a common code path may end up generating 
a lot of switch-like statements, so it may not be worth it.  Just a 
suggestion to consider.

Allison

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-28 21:14   ` Allison Henderson
@ 2018-06-28 23:21     ` Dave Chinner
  2018-06-29  1:35       ` Allison Henderson
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-06-28 23:21 UTC (permalink / raw)
  To: Allison Henderson; +Cc: Darrick J. Wong, linux-xfs

On Thu, Jun 28, 2018 at 02:14:51PM -0700, Allison Henderson wrote:
> On 06/24/2018 12:23 PM, Darrick J. Wong wrote:
> >+static const struct xfs_repair_find_ag_btree repair_agf[] = {
> >+	[REPAIR_AGF_BNOBT] = {
> >+		.rmap_owner = XFS_RMAP_OWN_AG,
> >+		.buf_ops = &xfs_allocbt_buf_ops,
> >+		.magic = XFS_ABTB_CRC_MAGIC,
> >+	},
> >+	[REPAIR_AGF_CNTBT] = {
> >+		.rmap_owner = XFS_RMAP_OWN_AG,
> >+		.buf_ops = &xfs_allocbt_buf_ops,
> >+		.magic = XFS_ABTC_CRC_MAGIC,
> >+	},
> >+	[REPAIR_AGF_RMAPBT] = {
> >+		.rmap_owner = XFS_RMAP_OWN_AG,
> >+		.buf_ops = &xfs_rmapbt_buf_ops,
> >+		.magic = XFS_RMAP_CRC_MAGIC,
> >+	},
> >+	[REPAIR_AGF_REFCOUNTBT] = {
> >+		.rmap_owner = XFS_RMAP_OWN_REFC,
> >+		.buf_ops = &xfs_refcountbt_buf_ops,
> >+		.magic = XFS_REFC_CRC_MAGIC,
> >+	},
> >+	[REPAIR_AGF_END] = {
> >+		.buf_ops = NULL,
> >+	},
> >+};
> >+
> >+/*
> >+ * Find the btree roots.  This is /also/ a chicken and egg problem because we
> >+ * have to use the rmapbt (rooted in the AGF) to find the btrees rooted in the
> >+ * AGF.  We also have no idea if the btrees make any sense.  If we hit obvious
> >+ * corruptions in those btrees we'll bail out.
> >+ */
> It would help if maybe we could put the /*IN*/ or /*OUT*/ on the
> parameters here?  And maybe a blurb about their usage.  From looking
> at how they're used in the memcpy, I'm guessing that agf_bp is IN and
> fab is OUT.  But otherwise it's not really clear how they're meant to
> be used without going into the function to see how it handles them.

IMO, that's what kerneldoc format comments are for. I'd much prefer
we use kerneldoc format than go back to the bad old terse 3-4 word
post-variable declaration comments that we used to have.
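
To make that concrete, here is an illustrative sketch (simplified,
hypothetical names; not the actual xfs_repair_find_ag_btree_roots
signature) of a kerneldoc comment that documents in/out parameter
usage:

```c
#include <stddef.h>

struct search {
	int	magic;	/* what to look for */
	int	root;	/* where it was found */
};

/**
 * find_root - locate the first block whose magic matches the search key
 * @blocks: (in) array of candidate block magics to scan
 * @n:      (in) number of entries in @blocks
 * @s:      (in/out) search descriptor; the caller fills in @s->magic,
 *          and the function fills in @s->root with the index of the
 *          first match, or -1 if nothing matched
 *
 * Return: 0 if a root was found, -1 otherwise.
 */
static int find_root(const int *blocks, size_t n, struct search *s)
{
	s->root = -1;
	for (size_t i = 0; i < n; i++) {
		if (blocks[i] == s->magic) {
			s->root = (int)i;
			return 0;
		}
	}
	return -1;
}
```

The point is that the @param lines carry the in/out annotations
Allison asked for, in a format tooling can also extract.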

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 08/21] xfs: defer iput on certain inodes while scrub / repair are running
  2018-06-24 19:24 ` [PATCH 08/21] xfs: defer iput on certain inodes while scrub / repair are running Darrick J. Wong
@ 2018-06-28 23:37   ` Dave Chinner
  2018-06-29 14:49     ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-06-28 23:37 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:24:20PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Destroying an incore inode sometimes requires some work to be done on
> the inode.  For example, post-EOF blocks on a non-PREALLOC inode are
> trimmed, and copy-on-write staging extents are freed.  This work is done
> in separate transactions, which is bad for scrub and repair because (a)
> we already have a transaction and can't nest them, and (b) if we've
> frozen the filesystem for scrub/repair work, that (regular) transaction
> allocation will block on the freeze.
> 
> Therefore, if we detect that work has to be done to destroy the incore
> inode, we'll just hang on to the reference until after the scrub is
> finished.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Darrick, I'll just repeat what we discussed on #xfs here so we have
it in the archive and everyone else knows why this is probably going
to be done differently.

I think we should move deferred inode inactivation processing into
the background reclaim radix tree walker rather than introduce a
special new "don't iput this inode yet" state. We're really only
trying to prevent the transactions that xfs_inactive() may run
through iput() when the filesystem is frozen, and we already stop
background reclaim processing when the fs is frozen.

I've always intended that xfs_fs_destroy_inode() basically becomes a
no-op that just queues the inode for final inactivation, freeing and
reclaim - right now it only does the reclaim work in the background.
I first proposed this back in ~2008 here:

http://xfs.org/index.php/Improving_inode_Caching#Inode_Unlink

At this point, it really only requires a new inode flag to indicate
that it has an inactivation pending - we set that if xfs_inactive
needs to do work before the inode can be reclaimed, and have a
separate per-ag work queue that walks the inode radix tree finding
reclaimable inodes that have the NEED_INACTIVATION inode flag set.
This way background reclaim doesn't get stuck on them.

This has benefits for many operations e.g. bulk processing of
inode inactivation and freeing either concurrently or after rm -rf
rather than at unlink syscall exit, VFS inode cache shrinker never
blocks on inactivation needing to run transactions, etc.

It also allows us to turn off inactivation on a per-AG basis,
meaning that when we are rebuilding an AG structure in repair (e.g.
the rmap btree) we can turn off inode inactivation and reclaim for
that AG rather than needing to freeze the entire filesystem....
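
As a rough illustration of the scheme Dave describes, here is a
standalone userspace sketch (hypothetical flag names such as
NEED_INACTIVATION; the real implementation would use per-AG radix
tree tags and workqueues, not a flat array):

```c
#include <stdbool.h>

#define NEED_INACTIVATION	0x1	/* assumed flag name */
#define RECLAIMABLE		0x2

struct fake_inode {
	unsigned int	flags;
	bool		has_posteof_blocks;	/* work xfs_inactive() would do */
};

/* iput()-time hook: no transactions here, just tag the inode. */
static void destroy_inode(struct fake_inode *ip)
{
	if (ip->has_posteof_blocks)
		ip->flags |= NEED_INACTIVATION;
	else
		ip->flags |= RECLAIMABLE;
}

/*
 * Background walker: runs inactivation transactions only when the fs
 * is not frozen, processing all tagged inodes in one bulk pass and
 * handing them on to reclaim.  Returns the number processed.
 */
static int background_inactivate(struct fake_inode *inodes, int n,
				 bool frozen)
{
	int done = 0;

	if (frozen)
		return 0;	/* background work already stops on freeze */
	for (int i = 0; i < n; i++) {
		if (inodes[i].flags & NEED_INACTIVATION) {
			inodes[i].has_posteof_blocks = false;	/* "trim" */
			inodes[i].flags &= ~NEED_INACTIVATION;
			inodes[i].flags |= RECLAIMABLE;
			done++;
		}
	}
	return done;
}
```

The freeze check lives in one place (the walker), so iput() itself
never needs a transaction while frozen.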

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-28 23:21     ` Dave Chinner
@ 2018-06-29  1:35       ` Allison Henderson
  2018-06-29 14:55         ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Allison Henderson @ 2018-06-29  1:35 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Darrick J. Wong, linux-xfs

On 06/28/2018 04:21 PM, Dave Chinner wrote:
> On Thu, Jun 28, 2018 at 02:14:51PM -0700, Allison Henderson wrote:
>> On 06/24/2018 12:23 PM, Darrick J. Wong wrote:
>>> +static const struct xfs_repair_find_ag_btree repair_agf[] = {
>>> +	[REPAIR_AGF_BNOBT] = {
>>> +		.rmap_owner = XFS_RMAP_OWN_AG,
>>> +		.buf_ops = &xfs_allocbt_buf_ops,
>>> +		.magic = XFS_ABTB_CRC_MAGIC,
>>> +	},
>>> +	[REPAIR_AGF_CNTBT] = {
>>> +		.rmap_owner = XFS_RMAP_OWN_AG,
>>> +		.buf_ops = &xfs_allocbt_buf_ops,
>>> +		.magic = XFS_ABTC_CRC_MAGIC,
>>> +	},
>>> +	[REPAIR_AGF_RMAPBT] = {
>>> +		.rmap_owner = XFS_RMAP_OWN_AG,
>>> +		.buf_ops = &xfs_rmapbt_buf_ops,
>>> +		.magic = XFS_RMAP_CRC_MAGIC,
>>> +	},
>>> +	[REPAIR_AGF_REFCOUNTBT] = {
>>> +		.rmap_owner = XFS_RMAP_OWN_REFC,
>>> +		.buf_ops = &xfs_refcountbt_buf_ops,
>>> +		.magic = XFS_REFC_CRC_MAGIC,
>>> +	},
>>> +	[REPAIR_AGF_END] = {
>>> +		.buf_ops = NULL,
>>> +	},
>>> +};
>>> +
>>> +/*
>>> + * Find the btree roots.  This is /also/ a chicken and egg problem because we
>>> + * have to use the rmapbt (rooted in the AGF) to find the btrees rooted in the
>>> + * AGF.  We also have no idea if the btrees make any sense.  If we hit obvious
>>> + * corruptions in those btrees we'll bail out.
>>> + */
>> It would help if maybe we could put the /*IN*/ or /*OUT*/ on the
>> parameters here?  And maybe a blurb about their usage.  From looking
>> at how they're used in the memcpy, I'm guessing that agf_bp is IN and
>> fab is OUT.  But otherwise it's not really clear how they're meant to
>> be used without going into the function to see how it handles them.
> 
> IMO, that's what kerneldoc format comments are for. I'd much prefer
> we use kerneldoc format than go back to the bad old terse 3-4 word
> post-variable declaration comments that we used to have.
> 
> Cheers,
> 
> Dave.
> 

Sure, that sounds like it would be fine too :-)

Allison

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 18/21] xfs: scrub should set preen if attr leaf has holes
  2018-06-24 19:25 ` [PATCH 18/21] xfs: scrub should set preen if attr leaf has holes Darrick J. Wong
@ 2018-06-29  2:52   ` Dave Chinner
  0 siblings, 0 replies; 77+ messages in thread
From: Dave Chinner @ 2018-06-29  2:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:25:29PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If an attr block indicates that it could use compaction, set the preen
> flag to have the attr fork rebuilt, since the attr fork rebuilder can
> take care of that for us.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Oh, that's an easy one :P

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap
  2018-06-28 21:11   ` Allison Henderson
@ 2018-06-29 14:39     ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-29 14:39 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Thu, Jun 28, 2018 at 02:11:38PM -0700, Allison Henderson wrote:
> On 06/24/2018 12:23 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > The original rmap code assumed that there would always be at least one
> > rmap in the rmapbt (the AG sb/agf/agi) and so errored out if it didn't
> > find one.  This assumption isn't true for the rmapbt repair function
> > (and it won't be true for realtime rmap either), so remove the check and
> > just deal with the situation.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >   fs/xfs/libxfs/xfs_rmap.c |   24 ++++++++++++------------
> >   1 file changed, 12 insertions(+), 12 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> > index d4460b0d2d81..8b2a2f81d110 100644
> > --- a/fs/xfs/libxfs/xfs_rmap.c
> > +++ b/fs/xfs/libxfs/xfs_rmap.c
> > @@ -753,19 +753,19 @@ xfs_rmap_map(
> >   			&have_lt);
> >   	if (error)
> >   		goto out_error;
> > -	XFS_WANT_CORRUPTED_GOTO(mp, have_lt == 1, out_error);
> > -
> > -	error = xfs_rmap_get_rec(cur, &ltrec, &have_lt);
> > -	if (error)
> > -		goto out_error;
> > -	XFS_WANT_CORRUPTED_GOTO(mp, have_lt == 1, out_error);
> > -	trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
> > -			cur->bc_private.a.agno, ltrec.rm_startblock,
> > -			ltrec.rm_blockcount, ltrec.rm_owner,
> > -			ltrec.rm_offset, ltrec.rm_flags);
> > +	if (have_lt) {
> > +		error = xfs_rmap_get_rec(cur, &ltrec, &have_lt);
> > +		if (error)
> > +			goto out_error;
> > +		XFS_WANT_CORRUPTED_GOTO(mp, have_lt == 1, out_error);
> > +		trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
> > +				cur->bc_private.a.agno, ltrec.rm_startblock,
> > +				ltrec.rm_blockcount, ltrec.rm_owner,
> > +				ltrec.rm_offset, ltrec.rm_flags);
> > -	if (!xfs_rmap_is_mergeable(&ltrec, owner, flags))
> > -		have_lt = 0;
> > +		if (!xfs_rmap_is_mergeable(&ltrec, owner, flags))
> > +			have_lt = 0;
> > +	}
> >   	XFS_WANT_CORRUPTED_GOTO(mp,
> >   		have_lt == 0 ||
> > 
> 
> Alrighty, looks OK after some digging around.  I'm still a little
> puzzled as to why the original code raised the assert without checking
> to see what's on the other side of the cursor; I assume the error
> condition was supposed to cover the case when the tree was empty.  In
> any case, it looks correct now.

At the time (~2014?) I don't think either Dave or I were thinking about
rmapbt being extended into the realtime device, so we thought that
assumption was a reasonable one to make.  That was, of course, long
before I got far enough along in designing online check to realize that
"hey, maybe we should be able to rebuild things from scratch too"... :)

Anyway, thank you both for the review.

--D

> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 08/21] xfs: defer iput on certain inodes while scrub / repair are running
  2018-06-28 23:37   ` Dave Chinner
@ 2018-06-29 14:49     ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-29 14:49 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Fri, Jun 29, 2018 at 09:37:21AM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:24:20PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Destroying an incore inode sometimes requires some work to be done on
> > the inode.  For example, post-EOF blocks on a non-PREALLOC inode are
> > trimmed, and copy-on-write staging extents are freed.  This work is done
> > in separate transactions, which is bad for scrub and repair because (a)
> > we already have a transaction and can't nest them, and (b) if we've
> > frozen the filesystem for scrub/repair work, that (regular) transaction
> > allocation will block on the freeze.
> > 
> > Therefore, if we detect that work has to be done to destroy the incore
> > inode, we'll just hang on to the reference until after the scrub is
> > finished.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Darrick, I'll just repeat what we discussed on #xfs here so we have
> it in the archive and everyone else knows why this is probably going
> to be done differently.
> 
> I think we should move deferred inode inactivation processing into
> the background reclaim radix tree walker rather than introduce a
> special new "don't iput this inode yet" state. We're really only
> trying to prevent the transactions that xfs_inactive() may run
> through iput() when the filesystem is frozen, and we already stop
> background reclaim processing when the fs is frozen.
> 
> I've always intended that xfs_fs_destroy_inode() basically becomes a
> no-op that just queues the inode for final inactivation, freeing and
> reclaim - right now it only does the reclaim work in the background.
> I first proposed this back in ~2008 here:
> 
> http://xfs.org/index.php/Improving_inode_Caching#Inode_Unlink
> 
> At this point, it really only requires a new inode flag to indicate
> that it has an inactivation pending - we set that if xfs_inactive
> needs to do work before the inode can be reclaimed, and have a
> separate per-ag work queue that walks the inode radix tree finding
> reclaimable inodes that have the NEED_INACTIVATION inode flag set.
> This way background reclaim doesn't get stuck on them.
> 
> This has benefits for many operations e.g. bulk processing of
> inode inactivation and freeing either concurrently or after rm -rf
> rather than at unlink syscall exit, VFS inode cache shrinker never
> blocks on inactivation needing to run transactions, etc.
> 
> It also allows us to turn off inactivation on a per-AG basis,
> meaning that when we are rebuilding an AG structure in repair (e.g.
> the rmap btree) we can turn off inode inactivation and reclaim for
> that AG rather than needing to freeze the entire filesystem....

So although I've been off playing a JavaScript monkey this week, I should
note that the past few months I've also been slowly combing through all
the past online repair fuzz test output to see what's still majorly
broken.  I've noticed that the bmbt fuzzers have a particular failure
pattern that leads to shutdown, which is:

1) Fuzz a bmbt.br_blockcount value to a large enough value that we now
have a giant post-eof extent.

2) Mount filesystem.

3) Run xfs_scrub, which loads said inode, checks the bad bmbt, and tells
userspace it's broken...

4) ...and releases the inode.

5) Memory reclaim or someone comes along and calls xfs_inactive, which
says "Hey, nice post-EOF extent, let's trim that off!"  The extent free
code then freaks out "ZOMG, that extent is already free!"

6) Bam, filesystem shuts down.

7) xfs_scrub retries the bmbt scrub, but this time with IFLAG_REPAIR
set, but by now the fs has already gone down, and sadness.

I've had a thought lurking around in my head for a while that perhaps we
should have a second SKIP_INACTIVATION iflag that indicates that the
inode is corrupt and we should skip post-eof inactivation to avoid fs
shutdowns.  We'd still have to take the risk of cleaning out the cow
fork (because that metadata are never persisted) but we could at least
avoid a shutdown.
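
A minimal sketch of that proposed guard, assuming a hypothetical
SKIP_INACTIVATION flag (simplified userspace model, not the actual
xfs_inactive code):

```c
#include <stdbool.h>

#define SKIP_INACTIVATION	0x1	/* hypothetical "inode is corrupt" flag */

struct fake_inode {
	unsigned int	flags;
	bool		has_posteof_blocks;
	bool		has_cow_blocks;
};

static void inactivate(struct fake_inode *ip)
{
	/* CoW staging extents are never persisted; always safe to drop. */
	ip->has_cow_blocks = false;

	/*
	 * Don't trust a corrupt bmbt enough to free "post-EOF" blocks;
	 * a fuzzed blockcount could point at space that is already free
	 * and shut the fs down.
	 */
	if (ip->flags & SKIP_INACTIVATION)
		return;
	ip->has_posteof_blocks = false;
}
```

With this shape, step (5) of the failure sequence above becomes a
no-op for the flagged inode instead of a shutdown.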

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-29  1:35       ` Allison Henderson
@ 2018-06-29 14:55         ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-29 14:55 UTC (permalink / raw)
  To: Allison Henderson; +Cc: Dave Chinner, linux-xfs

On Thu, Jun 28, 2018 at 06:35:06PM -0700, Allison Henderson wrote:
> On 06/28/2018 04:21 PM, Dave Chinner wrote:
> > On Thu, Jun 28, 2018 at 02:14:51PM -0700, Allison Henderson wrote:
> > > On 06/24/2018 12:23 PM, Darrick J. Wong wrote:
> > > > +static const struct xfs_repair_find_ag_btree repair_agf[] = {
> > > > +	[REPAIR_AGF_BNOBT] = {
> > > > +		.rmap_owner = XFS_RMAP_OWN_AG,
> > > > +		.buf_ops = &xfs_allocbt_buf_ops,
> > > > +		.magic = XFS_ABTB_CRC_MAGIC,
> > > > +	},
> > > > +	[REPAIR_AGF_CNTBT] = {
> > > > +		.rmap_owner = XFS_RMAP_OWN_AG,
> > > > +		.buf_ops = &xfs_allocbt_buf_ops,
> > > > +		.magic = XFS_ABTC_CRC_MAGIC,
> > > > +	},
> > > > +	[REPAIR_AGF_RMAPBT] = {
> > > > +		.rmap_owner = XFS_RMAP_OWN_AG,
> > > > +		.buf_ops = &xfs_rmapbt_buf_ops,
> > > > +		.magic = XFS_RMAP_CRC_MAGIC,
> > > > +	},
> > > > +	[REPAIR_AGF_REFCOUNTBT] = {
> > > > +		.rmap_owner = XFS_RMAP_OWN_REFC,
> > > > +		.buf_ops = &xfs_refcountbt_buf_ops,
> > > > +		.magic = XFS_REFC_CRC_MAGIC,
> > > > +	},
> > > > +	[REPAIR_AGF_END] = {
> > > > +		.buf_ops = NULL,
> > > > +	},
> > > > +};
> > > > +
> > > > +/*
> > > > + * Find the btree roots.  This is /also/ a chicken and egg problem because we
> > > > + * have to use the rmapbt (rooted in the AGF) to find the btrees rooted in the
> > > > + * AGF.  We also have no idea if the btrees make any sense.  If we hit obvious
> > > > + * corruptions in those btrees we'll bail out.
> > > > + */
> > > It would help if maybe we could put the /*IN*/ or /*OUT*/ on the
> > > parameters here?  And maybe a blurb about their usage.  From looking
> > > at how they're used in the memcpy, I'm guessing that agf_bp is IN and
> > > fab is OUT.  But otherwise it's not really clear how they're meant to
> > > be used without going into the function to see how it handles them.
> > 
> > IMO, that's what kerneldoc format comments are for. I'd much prefer
> > we use kerneldoc format than go back to the bad old terse 3-4 word
> > post-variable declaration comments that we used to have.
> > 

<nod> I'll straighten this mess out.  The pattern underneath this is
that we partially fill out the structure with the characteristics of the
block we want the code to help us look for, and then the code fills in
the rest (the block number and btree height) of the structure as it
rummages around the AG.

I thought that having a static template that could be memcpy'd into the
stack variable was a useful thing, but seeing as it just complicates
matters I'll just move this template to the stack variable's definition.
At the time I was planning for more than one user of each template, but
that never came to fruition so it's better to simplify the data
structure lifetime and initialization semantics.
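
The partially-filled descriptor pattern described above can be
sketched in isolation like this (simplified, hypothetical types; the
real struct xfs_repair_find_ag_btree also carries buf_ops and walks
actual rmap records rather than an array):

```c
#include <stdint.h>

#define NULLBLOCK	((uint32_t)-1)

struct find_btree {
	uint32_t	magic;	/* in: identifies the btree block type */
	uint32_t	owner;	/* in: rmap owner to match */
	uint32_t	root;	/* out: root block found, or NULLBLOCK */
	uint32_t	height;	/* out: tree height found */
};

struct fake_block {
	uint32_t	magic;
	uint32_t	owner;
	uint32_t	level;
};

/*
 * Walk the "rmap records" (here just an array of blocks) and record
 * the highest-level matching block seen so far as the candidate root.
 */
static void find_roots(struct find_btree *fab, int nfab,
		       const struct fake_block *blks, uint32_t blkno_base,
		       int nblks)
{
	for (int i = 0; i < nfab; i++) {
		fab[i].root = NULLBLOCK;
		fab[i].height = 0;
	}
	for (int b = 0; b < nblks; b++) {
		for (int i = 0; i < nfab; i++) {
			if (blks[b].magic != fab[i].magic ||
			    blks[b].owner != fab[i].owner)
				continue;
			if (blks[b].level + 1 > fab[i].height) {
				fab[i].height = blks[b].level + 1;
				fab[i].root = blkno_base + b;
			}
		}
	}
}
```

The caller only fills in the "what to look for" fields; the search
fills in the "what was found" fields, which is the asymmetry that the
IN/OUT question above was getting at.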

--D

> > Cheers,
> > 
> > Dave.
> > 
> 
> Sure, that sounds like it would be fine too :-)
> 
> Allison

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-28 17:25     ` Allison Henderson
@ 2018-06-29 15:08       ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-29 15:08 UTC (permalink / raw)
  To: Allison Henderson; +Cc: Dave Chinner, linux-xfs

On Thu, Jun 28, 2018 at 10:25:22AM -0700, Allison Henderson wrote:
> 
> On 06/26/2018 07:19 PM, Dave Chinner wrote:
> > On Sun, Jun 24, 2018 at 12:23:54PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Regenerate the AGF and AGFL from the rmap data.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > [...]
> > 
> > > +/* Information for finding AGF-rooted btrees */
> > > +enum {
> > > +	REPAIR_AGF_BNOBT = 0,
> > > +	REPAIR_AGF_CNTBT,
> > > +	REPAIR_AGF_RMAPBT,
> > > +	REPAIR_AGF_REFCOUNTBT,
> > > +	REPAIR_AGF_END,
> > > +	REPAIR_AGF_MAX
> > > +};
> > 
> > Why can't you just use XFS_BTNUM_* for these btree type descriptors?
> Well, I know Darrick hasn't responded yet, but I actually have seen other
> projects intentionally redefine scopes like this (even if it's repetitive).
> The reason is, for example, to help prevent people from mistakenly
> indexing an element of the below array that may not be defined, since
> XFS_BTNUM_* defines more types than are being used here (and it's easy to
> overlook because, belonging to the same namespace, it doesn't look out of
> place).  So basically, by redefining only the types meant to be used, we
> may help people avoid mistakenly mishandling it.
> 
> I've also seen such practices generate a lot of extra code.  Both
> solutions will work.  But in response to your comment: it looks to me
> like a question of cutting down code vs. using a more defensive coding
> style.

I was trying to avoid having repair_agf[] be a sparse array (which it
would be if I used BTNUM) and avoid adding a btnum number to struct
xfs_r_f_a_b which would require me to write a lookup function....
so, yes. :)
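
A standalone sketch of that trade-off (placeholder values, not the
real XFS constants): a private dense enum keeps the descriptor table
fully populated, while indexing by the wider global enum would leave
holes a caller could accidentally dereference:

```c
/* Private dense enum: exactly one slot per btree the AGF repair
 * cares about, so every index into the table is valid. */
enum { AGF_BNOBT = 0, AGF_CNTBT, AGF_RMAPBT, AGF_REFCBT, AGF_MAX };

/* Stand-in for the wider global enum: XFS_BTNUM_* covers more trees
 * than AGF repair uses (e.g. the inode btrees). */
enum { BT_BNO = 0, BT_CNT, BT_RMAP, BT_BMAP, BT_INO, BT_FINO, BT_REFC,
       BT_MAX };

/* Placeholder "magic" values; the real table stores ops/owner/magic. */
static const int dense_table[AGF_MAX] = {
	[AGF_BNOBT]  = 1,
	[AGF_CNTBT]  = 2,
	[AGF_RMAPBT] = 3,
	[AGF_REFCBT] = 4,
};

static const int sparse_table[BT_MAX] = {
	[BT_BNO]  = 1,
	[BT_CNT]  = 2,
	[BT_RMAP] = 3,
	[BT_REFC] = 4,
	/* BT_BMAP, BT_INO, BT_FINO are zero-filled holes. */
};

/* Count uninitialized slots a caller could mistakenly index. */
static int count_holes(const int *t, int n)
{
	int holes = 0;

	for (int i = 0; i < n; i++)
		if (t[i] == 0)
			holes++;
	return holes;
}
```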

--D

> 
> 
> > 
> > > +
> > > +static const struct xfs_repair_find_ag_btree repair_agf[] = {
> > > +	[REPAIR_AGF_BNOBT] = {
> > > +		.rmap_owner = XFS_RMAP_OWN_AG,
> > > +		.buf_ops = &xfs_allocbt_buf_ops,
> > > +		.magic = XFS_ABTB_CRC_MAGIC,
> > > +	},
> > > +	[REPAIR_AGF_CNTBT] = {
> > > +		.rmap_owner = XFS_RMAP_OWN_AG,
> > > +		.buf_ops = &xfs_allocbt_buf_ops,
> > > +		.magic = XFS_ABTC_CRC_MAGIC,
> > > +	},
> > 
> > I had to stop and think about why this only supports the v5 types.
> > i.e. we're rebuilding from rmap info, so this will never run on v4
> > filesystems, hence we only care about v5 types (i.e. *CRC_MAGIC).
> > Perhaps a one-line comment to remind readers of this?
> > 
> > > +	[REPAIR_AGF_RMAPBT] = {
> > > +		.rmap_owner = XFS_RMAP_OWN_AG,
> > > +		.buf_ops = &xfs_rmapbt_buf_ops,
> > > +		.magic = XFS_RMAP_CRC_MAGIC,
> > > +	},
> > > +	[REPAIR_AGF_REFCOUNTBT] = {
> > > +		.rmap_owner = XFS_RMAP_OWN_REFC,
> > > +		.buf_ops = &xfs_refcountbt_buf_ops,
> > > +		.magic = XFS_REFC_CRC_MAGIC,
> > > +	},
> > > +	[REPAIR_AGF_END] = {
> > > +		.buf_ops = NULL,
> > > +	},
> > > +};
> > > +
> > > +/*
> > > + * Find the btree roots.  This is /also/ a chicken and egg problem because we
> > > + * have to use the rmapbt (rooted in the AGF) to find the btrees rooted in the
> > > + * AGF.  We also have no idea if the btrees make any sense.  If we hit obvious
> > > + * corruptions in those btrees we'll bail out.
> > > + */
> > > +STATIC int
> > > +xfs_repair_agf_find_btrees(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_buf			*agf_bp,
> > > +	struct xfs_repair_find_ag_btree	*fab,
> > > +	struct xfs_buf			*agfl_bp)
> > > +{
> > > +	struct xfs_agf			*old_agf = XFS_BUF_TO_AGF(agf_bp);
> > > +	int				error;
> > > +
> > > +	/* Go find the root data. */
> > > +	memcpy(fab, repair_agf, sizeof(repair_agf));
> > 
> > Why are we initialising fab here, instead of in the caller where it
> > is declared and passed to various functions? Given there is only a
> > single declaration of this structure, why do we need a global static
> > const table initialiser just to copy it here - why isn't it
> > initialised at the declaration point?
> > 
> > > +	error = xfs_repair_find_ag_btree_roots(sc, agf_bp, fab, agfl_bp);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	/* We must find the bnobt, cntbt, and rmapbt roots. */
> > > +	if (fab[REPAIR_AGF_BNOBT].root == NULLAGBLOCK ||
> > > +	    fab[REPAIR_AGF_BNOBT].height > XFS_BTREE_MAXLEVELS ||
> > > +	    fab[REPAIR_AGF_CNTBT].root == NULLAGBLOCK ||
> > > +	    fab[REPAIR_AGF_CNTBT].height > XFS_BTREE_MAXLEVELS ||
> > > +	    fab[REPAIR_AGF_RMAPBT].root == NULLAGBLOCK ||
> > > +	    fab[REPAIR_AGF_RMAPBT].height > XFS_BTREE_MAXLEVELS)
> > > +		return -EFSCORRUPTED;
> > > +
> > > +	/*
> > > +	 * We relied on the rmapbt to reconstruct the AGF.  If we get a
> > > +	 * different root then something's seriously wrong.
> > > +	 */
> > > +	if (fab[REPAIR_AGF_RMAPBT].root !=
> > > +	    be32_to_cpu(old_agf->agf_roots[XFS_BTNUM_RMAPi]))
> > > +		return -EFSCORRUPTED;
> > > +
> > > +	/* We must find the refcountbt root if that feature is enabled. */
> > > +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb) &&
> > > +	    (fab[REPAIR_AGF_REFCOUNTBT].root == NULLAGBLOCK ||
> > > +	     fab[REPAIR_AGF_REFCOUNTBT].height > XFS_BTREE_MAXLEVELS))
> > > +		return -EFSCORRUPTED;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +/* Set btree root information in an AGF. */
> > > +STATIC void
> > > +xfs_repair_agf_set_roots(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_agf			*agf,
> > > +	struct xfs_repair_find_ag_btree	*fab)
> > > +{
> > > +	agf->agf_roots[XFS_BTNUM_BNOi] =
> > > +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].root);
> > > +	agf->agf_levels[XFS_BTNUM_BNOi] =
> > > +			cpu_to_be32(fab[REPAIR_AGF_BNOBT].height);
> > > +
> > > +	agf->agf_roots[XFS_BTNUM_CNTi] =
> > > +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].root);
> > > +	agf->agf_levels[XFS_BTNUM_CNTi] =
> > > +			cpu_to_be32(fab[REPAIR_AGF_CNTBT].height);
> > > +
> > > +	agf->agf_roots[XFS_BTNUM_RMAPi] =
> > > +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].root);
> > > +	agf->agf_levels[XFS_BTNUM_RMAPi] =
> > > +			cpu_to_be32(fab[REPAIR_AGF_RMAPBT].height);
> > > +
> > > +	if (xfs_sb_version_hasreflink(&sc->mp->m_sb)) {
> > > +		agf->agf_refcount_root =
> > > +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].root);
> > > +		agf->agf_refcount_level =
> > > +				cpu_to_be32(fab[REPAIR_AGF_REFCOUNTBT].height);
> > > +	}
> > > +}
> > > +
> > > +/*
> > > + * Reinitialize the AGF header, making an in-core copy of the old contents so
> > > + * that we know which in-core state needs to be reinitialized.
> > > + */
> > > +STATIC void
> > > +xfs_repair_agf_init_header(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_buf			*agf_bp,
> > > +	struct xfs_agf			*old_agf)
> > > +{
> > > +	struct xfs_mount		*mp = sc->mp;
> > > +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
> > > +
> > > +	memcpy(old_agf, agf, sizeof(*old_agf));
> > > +	memset(agf, 0, BBTOB(agf_bp->b_length));
> > > +	agf->agf_magicnum = cpu_to_be32(XFS_AGF_MAGIC);
> > > +	agf->agf_versionnum = cpu_to_be32(XFS_AGF_VERSION);
> > > +	agf->agf_seqno = cpu_to_be32(sc->sa.agno);
> > > +	agf->agf_length = cpu_to_be32(xfs_ag_block_count(mp, sc->sa.agno));
> > > +	agf->agf_flfirst = old_agf->agf_flfirst;
> > > +	agf->agf_fllast = old_agf->agf_fllast;
> > > +	agf->agf_flcount = old_agf->agf_flcount;
> > > +	if (xfs_sb_version_hascrc(&mp->m_sb))
> > > +		uuid_copy(&agf->agf_uuid, &mp->m_sb.sb_meta_uuid);
> > > +}
> > 
> > Do we need to clear pag->pagf_init here so that it gets
> > re-initialised next time someone reads the AGF?
> > 
> > > +
> > > +/* Update the AGF btree counters by walking the btrees. */
> > > +STATIC int
> > > +xfs_repair_agf_update_btree_counters(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_buf			*agf_bp)
> > > +{
> > > +	struct xfs_repair_agf_allocbt	raa = { .sc = sc };
> > > +	struct xfs_btree_cur		*cur = NULL;
> > > +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
> > > +	struct xfs_mount		*mp = sc->mp;
> > > +	xfs_agblock_t			btreeblks;
> > > +	xfs_agblock_t			blocks;
> > > +	int				error;
> > > +
> > > +	/* Update the AGF counters from the bnobt. */
> > > +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> > > +			XFS_BTNUM_BNO);
> > > +	error = xfs_alloc_query_all(cur, xfs_repair_agf_walk_allocbt, &raa);
> > > +	if (error)
> > > +		goto err;
> > > +	error = xfs_btree_count_blocks(cur, &blocks);
> > > +	if (error)
> > > +		goto err;
> > > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > > +	btreeblks = blocks - 1;
> > > +	agf->agf_freeblks = cpu_to_be32(raa.freeblks);
> > > +	agf->agf_longest = cpu_to_be32(raa.longest);
> > 
> > This function updates more than the AGF btree counters. :P
> > 
> > > +
> > > +	/* Update the AGF counters from the cntbt. */
> > > +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> > > +			XFS_BTNUM_CNT);
> > > +	error = xfs_btree_count_blocks(cur, &blocks);
> > > +	if (error)
> > > +		goto err;
> > > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > > +	btreeblks += blocks - 1;
> > > +
> > > +	/* Update the AGF counters from the rmapbt. */
> > > +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
> > > +	error = xfs_btree_count_blocks(cur, &blocks);
> > > +	if (error)
> > > +		goto err;
> > > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > > +	agf->agf_rmap_blocks = cpu_to_be32(blocks);
> > > +	btreeblks += blocks - 1;
> > > +
> > > +	agf->agf_btreeblks = cpu_to_be32(btreeblks);
> > > +
> > > +	/* Update the AGF counters from the refcountbt. */
> > > +	if (xfs_sb_version_hasreflink(&mp->m_sb)) {
> > > +		cur = xfs_refcountbt_init_cursor(mp, sc->tp, agf_bp,
> > > +				sc->sa.agno, NULL);
> > > +		error = xfs_btree_count_blocks(cur, &blocks);
> > > +		if (error)
> > > +			goto err;
> > > +		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > > +		agf->agf_refcount_blocks = cpu_to_be32(blocks);
> > > +	}
> > > +
> > > +	return 0;
> > > +err:
> > > +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> > > +	return error;
> > > +}
> > > +
> > > +/* Trigger reinitialization of the in-core data. */
> > > +STATIC int
> > > +xfs_repair_agf_reinit_incore(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_agf			*agf,
> > > +	const struct xfs_agf		*old_agf)
> > > +{
> > > +	struct xfs_perag		*pag;
> > > +
> > > +	/* XXX: trigger fdblocks recalculation */
> > > +
> > > +	/* Now reinitialize the in-core counters if necessary. */
> > > +	pag = sc->sa.pag;
> > > +	if (!pag->pagf_init)
> > > +		return 0;
> > > +
> > > +	pag->pagf_btreeblks = be32_to_cpu(agf->agf_btreeblks);
> > > +	pag->pagf_freeblks = be32_to_cpu(agf->agf_freeblks);
> > > +	pag->pagf_longest = be32_to_cpu(agf->agf_longest);
> > > +	pag->pagf_levels[XFS_BTNUM_BNOi] =
> > > +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNOi]);
> > > +	pag->pagf_levels[XFS_BTNUM_CNTi] =
> > > +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNTi]);
> > > +	pag->pagf_levels[XFS_BTNUM_RMAPi] =
> > > +			be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAPi]);
> > > +	pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
> > 
> > Ok, so we reinit the pagf bits here, but....
> > 
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +/* Repair the AGF. */
> > > +int
> > > +xfs_repair_agf(
> > > +	struct xfs_scrub_context	*sc)
> > > +{
> > > +	struct xfs_repair_find_ag_btree	fab[REPAIR_AGF_MAX];
> > > +	struct xfs_agf			old_agf;
> > > +	struct xfs_mount		*mp = sc->mp;
> > > +	struct xfs_buf			*agf_bp;
> > > +	struct xfs_buf			*agfl_bp;
> > > +	struct xfs_agf			*agf;
> > > +	int				error;
> > > +
> > > +	/* We require the rmapbt to rebuild anything. */
> > > +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> > > +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> > > +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGF_DADDR(mp)),
> > > +			XFS_FSS_TO_BB(mp, 1), 0, &agf_bp, NULL);
> > > +	if (error)
> > > +		return error;
> > > +	agf_bp->b_ops = &xfs_agf_buf_ops;
> > > +	agf = XFS_BUF_TO_AGF(agf_bp);
> > > +
> > > +	/*
> > > +	 * Load the AGFL so that we can screen out OWN_AG blocks that are on
> > > +	 * the AGFL now; these blocks might have once been part of the
> > > +	 * bno/cnt/rmap btrees but are not now.  This is a chicken and egg
> > > +	 * problem: the AGF is corrupt, so we have to trust the AGFL contents
> > > +	 * because we can't do any serious cross-referencing with any of the
> > > +	 * btrees rooted in the AGF.  If the AGFL contents are obviously bad
> > > +	 * then we'll bail out.
> > > +	 */
> > > +	error = xfs_alloc_read_agfl(mp, sc->tp, sc->sa.agno, &agfl_bp);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	/*
> > > +	 * Spot-check the AGFL blocks; if they're obviously corrupt then
> > > +	 * there's nothing we can do but bail out.
> > > +	 */
> > > +	error = xfs_agfl_walk(sc->mp, XFS_BUF_TO_AGF(agf_bp), agfl_bp,
> > > +			xfs_repair_agf_check_agfl_block, sc);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	/*
> > > +	 * Find the AGF btree roots.  See the comment for this function for
> > > +	 * more information about the limitations of this repairer; this is
> > > +	 * also a chicken-and-egg situation.
> > > +	 */
> > > +	error = xfs_repair_agf_find_btrees(sc, agf_bp, fab, agfl_bp);
> > > +	if (error)
> > > +		return error;
> > 
> > Comment could be better written.
> > 
> > 	/*
> > 	 * Find the AGF btree roots. This is also a chicken-and-egg
> > 	 * situation - see xfs_repair_agf_find_btrees() for details.
> > 	 */
> > 
> > > +
> > > +	/* Start rewriting the header and implant the btrees we found. */
> > > +	xfs_repair_agf_init_header(sc, agf_bp, &old_agf);
> > > +	xfs_repair_agf_set_roots(sc, agf, fab);
> > > +	error = xfs_repair_agf_update_btree_counters(sc, agf_bp);
> > > +	if (error)
> > > +		goto out_revert;
> > 
> > If we fail here, the pagf information is invalid, hence I think we
> > really do need to clear pagf_init before we start rebuilding the new
> > AGF. Yes, I can see we revert the AGF info, but this seems like a
> > landmine waiting to be tripped over.
> > 
> > > +	/* Reinitialize in-core state. */
> > > +	error = xfs_repair_agf_reinit_incore(sc, agf, &old_agf);
> > > +	if (error)
> > > +		goto out_revert;
> > > +
> > > +	/* Write this to disk. */
> > > +	xfs_trans_buf_set_type(sc->tp, agf_bp, XFS_BLFT_AGF_BUF);
> > > +	xfs_trans_log_buf(sc->tp, agf_bp, 0, BBTOB(agf_bp->b_length) - 1);
> > > +	return 0;
> > > +
> > > +out_revert:
> > > +	memcpy(agf, &old_agf, sizeof(old_agf));
> > > +	return error;
> > > +}
> > > +
> > > +/* AGFL */
> > > +
> > > +struct xfs_repair_agfl {
> > > +	struct xfs_repair_extent_list	agmeta_list;
> > > +	struct xfs_repair_extent_list	*freesp_list;
> > > +	struct xfs_scrub_context	*sc;
> > > +};
> > > +
> > > +/* Record all freespace information. */
> > > +STATIC int
> > > +xfs_repair_agfl_rmap_fn(
> > > +	struct xfs_btree_cur		*cur,
> > > +	struct xfs_rmap_irec		*rec,
> > > +	void				*priv)
> > > +{
> > > +	struct xfs_repair_agfl		*ra = priv;
> > > +	xfs_fsblock_t			fsb;
> > > +	int				error = 0;
> > > +
> > > +	if (xfs_scrub_should_terminate(ra->sc, &error))
> > > +		return error;
> > > +
> > > +	/* Record all the OWN_AG blocks. */
> > > +	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
> > > +		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
> > > +				rec->rm_startblock);
> > > +		error = xfs_repair_collect_btree_extent(ra->sc,
> > > +				ra->freesp_list, fsb, rec->rm_blockcount);
> > > +		if (error)
> > > +			return error;
> > > +	}
> > > +
> > > +	return xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
> > > +			xfs_repair_collect_btree_cur_blocks_in_extent_list,
> > 
> > Urk. The function name lengths are getting out of hand. I'm very
> > tempted to suggest we should shorten the namespace of all this
> > like s/xfs_repair_/xr_/ and s/xfs_scrub_/xs_/, etc just to make them
> > shorter and easier to read.
> > 
> > Oh, wait, did I say that out loud? :P
> > 
> > Something to think about, anyway.
> > 
> > > +			&ra->agmeta_list);
> > > +}
> > > +
> > > +/* Add a btree block to the agmeta list. */
> > > +STATIC int
> > > +xfs_repair_agfl_visit_btblock(
> > 
> > I find the name a bit confusing - AGFLs don't have btree blocks.
> > Yes, I know that it's a xfs_btree_visit_blocks() callback but I
> > think s/visit/collect/ makes more sense. i.e. it tells us what we
> > are doing with the btree block, rather than making it sound like we
> > are walking AGFL btree blocks...
> > 
> > > +/*
> > > + * Map out all the non-AGFL OWN_AG space in this AG so that we can deduce
> > > + * which blocks belong to the AGFL.
> > > + */
> > > +STATIC int
> > > +xfs_repair_agfl_find_extents(
> > 
> > Same here - xr_agfl_collect_free_extents()?
> > 
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_buf			*agf_bp,
> > > +	struct xfs_repair_extent_list	*agfl_extents,
> > > +	xfs_agblock_t			*flcount)
> > > +{
> > > +	struct xfs_repair_agfl		ra;
> > > +	struct xfs_mount		*mp = sc->mp;
> > > +	struct xfs_btree_cur		*cur;
> > > +	struct xfs_repair_extent	*rae;
> > > +	int				error;
> > > +
> > > +	ra.sc = sc;
> > > +	ra.freesp_list = agfl_extents;
> > > +	xfs_repair_init_extent_list(&ra.agmeta_list);
> > > +
> > > +	/* Find all space used by the free space btrees & rmapbt. */
> > > +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno);
> > > +	error = xfs_rmap_query_all(cur, xfs_repair_agfl_rmap_fn, &ra);
> > > +	if (error)
> > > +		goto err;
> > > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > > +
> > > +	/* Find all space used by bnobt. */
> > 
> > Needs clarification.
> > 
> > 	/* Find all the in use bnobt blocks */
> > 
> > > +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> > > +			XFS_BTNUM_BNO);
> > > +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
> > > +	if (error)
> > > +		goto err;
> > > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > > +
> > > +	/* Find all space used by cntbt. */
> > 
> > 	/* Find all the in use cntbt blocks */
> > 
> > > +	cur = xfs_allocbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.agno,
> > > +			XFS_BTNUM_CNT);
> > > +	error = xfs_btree_visit_blocks(cur, xfs_repair_agfl_visit_btblock, &ra);
> > > +	if (error)
> > > +		goto err;
> > > +
> > > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > > +
> > > +	/*
> > > +	 * Drop the freesp meta blocks that are in use by btrees.
> > > +	 * The remaining blocks /should/ be AGFL blocks.
> > > +	 */
> > > +	error = xfs_repair_subtract_extents(sc, agfl_extents, &ra.agmeta_list);
> > > +	xfs_repair_cancel_btree_extents(sc, &ra.agmeta_list);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	/* Calculate the new AGFL size. */
> > > +	*flcount = 0;
> > > +	for_each_xfs_repair_extent(rae, agfl_extents) {
> > > +		*flcount += rae->len;
> > > +		if (*flcount > xfs_agfl_size(mp))
> > > +			break;
> > > +	}
> > > +	if (*flcount > xfs_agfl_size(mp))
> > > +		*flcount = xfs_agfl_size(mp);
> > 
> > Ok, so flcount is clamped here. What happens to all the remaining
> > agfl_extents beyond flcount?
> > 
> > > +	return 0;
> > > +
> > > +err:
> > 
> > Ok, what cleans up all the extents we've recorded in ra on error?
> > 
> > > +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> > > +	return error;
> > > +}
> > > +
> > > +/* Update the AGF and reset the in-core state. */
> > > +STATIC int
> > > +xfs_repair_agfl_update_agf(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_buf			*agf_bp,
> > > +	xfs_agblock_t			flcount)
> > > +{
> > > +	struct xfs_agf			*agf = XFS_BUF_TO_AGF(agf_bp);
> > > +
> > 	ASSERT(flcount <= xfs_agfl_size(sc->mp));
> > 
> > > +	/* XXX: trigger fdblocks recalculation */
> > > +
> > > +	/* Update the AGF counters. */
> > > +	if (sc->sa.pag->pagf_init)
> > > +		sc->sa.pag->pagf_flcount = flcount;
> > > +	agf->agf_flfirst = cpu_to_be32(0);
> > > +	agf->agf_flcount = cpu_to_be32(flcount);
> > > +	agf->agf_fllast = cpu_to_be32(flcount - 1);
> > > +
> > > +	xfs_alloc_log_agf(sc->tp, agf_bp,
> > > +			XFS_AGF_FLFIRST | XFS_AGF_FLLAST | XFS_AGF_FLCOUNT);
> > > +	return 0;
> > > +}
> > > +
> > > +/* Write out a totally new AGFL. */
> > > +STATIC void
> > > +xfs_repair_agfl_init_header(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_buf			*agfl_bp,
> > > +	struct xfs_repair_extent_list	*agfl_extents,
> > > +	xfs_agblock_t			flcount)
> > > +{
> > > +	struct xfs_mount		*mp = sc->mp;
> > > +	__be32				*agfl_bno;
> > > +	struct xfs_repair_extent	*rae;
> > > +	struct xfs_repair_extent	*n;
> > > +	struct xfs_agfl			*agfl;
> > > +	xfs_agblock_t			agbno;
> > > +	unsigned int			fl_off;
> > > +
> > 	ASSERT(flcount <= xfs_agfl_size(mp));
> > 
> > > +	/* Start rewriting the header. */
> > > +	agfl = XFS_BUF_TO_AGFL(agfl_bp);
> > > +	memset(agfl, 0xFF, BBTOB(agfl_bp->b_length));
> > > +	agfl->agfl_magicnum = cpu_to_be32(XFS_AGFL_MAGIC);
> > > +	agfl->agfl_seqno = cpu_to_be32(sc->sa.agno);
> > > +	uuid_copy(&agfl->agfl_uuid, &mp->m_sb.sb_meta_uuid);
> > > +
> > > +	/* Fill the AGFL with the remaining blocks. */
> > > +	fl_off = 0;
> > > +	agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agfl_bp);
> > > +	for_each_xfs_repair_extent_safe(rae, n, agfl_extents) {
> > > +		agbno = XFS_FSB_TO_AGBNO(mp, rae->fsbno);
> > > +
> > > +		trace_xfs_repair_agfl_insert(mp, sc->sa.agno, agbno, rae->len);
> > > +
> > > +		while (rae->len > 0 && fl_off < flcount) {
> > > +			agfl_bno[fl_off] = cpu_to_be32(agbno);
> > > +			fl_off++;
> > > +			agbno++;
> > > +			rae->fsbno++;
> > > +			rae->len--;
> > > +		}
> > 
> > This only works correctly if flcount <= xfs_agfl_size, which is why
> > I'm suggesting some asserts.
> > 
> > > +
> > > +		if (rae->len)
> > > +			break;
> > > +		list_del(&rae->list);
> > > +		kmem_free(rae);
> > > +	}
> > > +
> > > +	/* Write AGF and AGFL to disk. */
> > > +	xfs_trans_buf_set_type(sc->tp, agfl_bp, XFS_BLFT_AGFL_BUF);
> > > +	xfs_trans_log_buf(sc->tp, agfl_bp, 0, BBTOB(agfl_bp->b_length) - 1);
> > > +}
> > > +
> > > +/* Repair the AGFL. */
> > > +int
> > > +xfs_repair_agfl(
> > > +	struct xfs_scrub_context	*sc)
> > > +{
> > > +	struct xfs_owner_info		oinfo;
> > > +	struct xfs_repair_extent_list	agfl_extents;
> > > +	struct xfs_mount		*mp = sc->mp;
> > > +	struct xfs_buf			*agf_bp;
> > > +	struct xfs_buf			*agfl_bp;
> > > +	xfs_agblock_t			flcount;
> > > +	int				error;
> > > +
> > > +	/* We require the rmapbt to rebuild anything. */
> > > +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> > > +	xfs_repair_init_extent_list(&agfl_extents);
> > > +
> > > +	/*
> > > +	 * Read the AGF so that we can query the rmapbt.  We hope that there's
> > > +	 * nothing wrong with the AGF, but all the AG header repair functions
> > > +	 * have this chicken-and-egg problem.
> > > +	 */
> > > +	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
> > > +	if (error)
> > > +		return error;
> > > +	if (!agf_bp)
> > > +		return -ENOMEM;
> > > +
> > > +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> > > +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGFL_DADDR(mp)),
> > > +			XFS_FSS_TO_BB(mp, 1), 0, &agfl_bp, NULL);
> > > +	if (error)
> > > +		return error;
> > > +	agfl_bp->b_ops = &xfs_agfl_buf_ops;
> > > +
> > > +	/*
> > > +	 * Compute the set of old AGFL blocks by subtracting from the list of
> > > +	 * OWN_AG blocks the list of blocks owned by all other OWN_AG metadata
> > > +	 * (bnobt, cntbt, rmapbt).  These are the old AGFL blocks, so return
> > > +	 * that list and the number of blocks we're actually going to put back
> > > +	 * on the AGFL.
> > > +	 */
> > 
> > That comment belongs on the function, not here. All we need here is
> > something like:
> > 
> > 	/* Gather all the extents we're going to put on the new AGFL. */
> > 
> > > +	error = xfs_repair_agfl_find_extents(sc, agf_bp, &agfl_extents,
> > > +			&flcount);
> > > +	if (error)
> > > +		goto err;
> > > +
> > > +	/*
> > > +	 * Update AGF and AGFL.  We reset the global free block counter when
> > > +	 * we adjust the AGF flcount (which can fail) so avoid updating any
> > > +	 * bufers until we know that part works.
> > 
> > buffers
> > 
> > > +	 */
> > > +	error = xfs_repair_agfl_update_agf(sc, agf_bp, flcount);
> > > +	if (error)
> > > +		goto err;
> > > +	xfs_repair_agfl_init_header(sc, agfl_bp, &agfl_extents, flcount);
> > > +
> > > +	/*
> > > +	 * Ok, the AGFL should be ready to go now.  Roll the transaction so
> > > +	 * that we can free any AGFL overflow.
> > > +	 */
> > 
> > Why does rolling the transaction allow us to free the overflow?
> > Shouldn't the comment say something like "Roll the transaction to
> > make the new AGFL permanent before we start using it when returning
> > the residual AGFL freespace overflow back to the AGF freespace
> > btrees."
> > 
> > > +	sc->sa.agf_bp = agf_bp;
> > > +	sc->sa.agfl_bp = agfl_bp;
> > > +	error = xfs_repair_roll_ag_trans(sc);
> > > +	if (error)
> > > +		goto err;
> > > +
> > > +	/* Dump any AGFL overflow. */
> > > +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
> > > +	return xfs_repair_reap_btree_extents(sc, &agfl_extents, &oinfo,
> > > +			XFS_AG_RESV_AGFL);
> > > +err:
> > > +	xfs_repair_cancel_btree_extents(sc, &agfl_extents);
> > > +	return error;
> > > +}
> > > diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
> > > index 326be4e8b71e..bcdaa8df18f6 100644
> > > --- a/fs/xfs/scrub/repair.c
> > > +++ b/fs/xfs/scrub/repair.c
> > > @@ -127,9 +127,12 @@ xfs_repair_roll_ag_trans(
> > >   	int				error;
> > >   	/* Keep the AG header buffers locked so we can keep going. */
> > > -	xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
> > > -	xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
> > > -	xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
> > > +	if (sc->sa.agi_bp)
> > > +		xfs_trans_bhold(sc->tp, sc->sa.agi_bp);
> > > +	if (sc->sa.agf_bp)
> > > +		xfs_trans_bhold(sc->tp, sc->sa.agf_bp);
> > > +	if (sc->sa.agfl_bp)
> > > +		xfs_trans_bhold(sc->tp, sc->sa.agfl_bp);
> > >   	/* Roll the transaction. */
> > >   	error = xfs_trans_roll(&sc->tp);
> > > @@ -137,9 +140,12 @@ xfs_repair_roll_ag_trans(
> > >   		goto out_release;
> > >   	/* Join AG headers to the new transaction. */
> > > -	xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
> > > -	xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
> > > -	xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
> > > +	if (sc->sa.agi_bp)
> > > +		xfs_trans_bjoin(sc->tp, sc->sa.agi_bp);
> > > +	if (sc->sa.agf_bp)
> > > +		xfs_trans_bjoin(sc->tp, sc->sa.agf_bp);
> > > +	if (sc->sa.agfl_bp)
> > > +		xfs_trans_bjoin(sc->tp, sc->sa.agfl_bp);
> > >   	return 0;
> > > @@ -149,9 +155,12 @@ xfs_repair_roll_ag_trans(
> > >   	 * buffers will be released during teardown on our way out
> > >   	 * of the kernel.
> > >   	 */
> > > -	xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
> > > -	xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
> > > -	xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
> > > +	if (sc->sa.agi_bp)
> > > +		xfs_trans_bhold_release(sc->tp, sc->sa.agi_bp);
> > > +	if (sc->sa.agf_bp)
> > > +		xfs_trans_bhold_release(sc->tp, sc->sa.agf_bp);
> > > +	if (sc->sa.agfl_bp)
> > > +		xfs_trans_bhold_release(sc->tp, sc->sa.agfl_bp);
> > >   	return error;
> > >   }
> > > @@ -408,6 +417,85 @@ xfs_repair_collect_btree_extent(
> > >   	return 0;
> > >   }
> > > +/*
> > > + * Help record all btree blocks seen while iterating all records of a btree.
> > > + *
> > > + * We know that the btree query_all function starts at the left edge and walks
> > > + * towards the right edge of the tree.  Therefore, we know that we can walk up
> > > + * the btree cursor towards the root; if the pointer for a given level points
> > > + * to the first record/key in that block, we haven't seen this block before;
> > > + * and therefore we need to remember that we saw this block in the btree.
> > > + *
> > > + * So if our btree is:
> > > + *
> > > + *    4
> > > + *  / | \
> > > + * 1  2  3
> > > + *
> > > + * Pretend for this example that each leaf block has 100 btree records.  For
> > > + * the first btree record, we'll observe that bc_ptrs[0] == 1, so we record
> > > + * that we saw block 1.  Then we observe that bc_ptrs[1] == 1, so we record
> > > + * block 4.  The list is [1, 4].
> > > + *
> > > + * For the second btree record, we see that bc_ptrs[0] == 2, so we exit the
> > > + * loop.  The list remains [1, 4].
> > > + *
> > > + * For the 101st btree record, we've moved onto leaf block 2.  Now
> > > + * bc_ptrs[0] == 1 again, so we record that we saw block 2.  We see that
> > > + * bc_ptrs[1] == 2, so we exit the loop.  The list is now [1, 4, 2].
> > > + *
> > > + * For the 102nd record, bc_ptrs[0] == 2, so we continue.
> > > + *
> > > + * For the 201st record, we've moved on to leaf block 3.  bc_ptrs[0] == 1, so
> > > + * we add 3 to the list.  Now it is [1, 4, 2, 3].
> > > + *
> > > + * For the 300th record we just exit, with the list being [1, 4, 2, 3].
> > > + *
> > > + * The *iter_fn can return XFS_BTREE_QUERY_RANGE_ABORT to stop, 0 to keep
> > > + * iterating, or the usual negative error code.
> > > + */
> > > +int
> > > +xfs_repair_collect_btree_cur_blocks(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_btree_cur		*cur,
> > > +	int				(*iter_fn)(struct xfs_scrub_context *sc,
> > > +						   xfs_fsblock_t fsbno,
> > > +						   xfs_fsblock_t len,
> > > +						   void *priv),
> > > +	void				*priv)
> > > +{
> > > +	struct xfs_buf			*bp;
> > > +	xfs_fsblock_t			fsb;
> > > +	int				i;
> > > +	int				error;
> > > +
> > > +	for (i = 0; i < cur->bc_nlevels && cur->bc_ptrs[i] == 1; i++) {
> > > +		xfs_btree_get_block(cur, i, &bp);
> > > +		if (!bp)
> > > +			continue;
> > > +		fsb = XFS_DADDR_TO_FSB(cur->bc_mp, bp->b_bn);
> > > +		error = iter_fn(sc, fsb, 1, priv);
> > > +		if (error)
> > > +			return error;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +/*
> > > + * Simple adapter to connect xfs_repair_collect_btree_extent to
> > > + * xfs_repair_collect_btree_cur_blocks.
> > > + */
> > > +int
> > > +xfs_repair_collect_btree_cur_blocks_in_extent_list(
> > > +	struct xfs_scrub_context	*sc,
> > > +	xfs_fsblock_t			fsbno,
> > > +	xfs_fsblock_t			len,
> > > +	void				*priv)
> > > +{
> > > +	return xfs_repair_collect_btree_extent(sc, priv, fsbno, len);
> > > +}
> > > +
> > >   /*
> > >    * An error happened during the rebuild so the transaction will be cancelled.
> > >    * The fs will shut down, and the administrator has to unmount and run repair.
> > > diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
> > > index ef47826b6725..f2af5923aa75 100644
> > > --- a/fs/xfs/scrub/repair.h
> > > +++ b/fs/xfs/scrub/repair.h
> > > @@ -48,9 +48,20 @@ xfs_repair_init_extent_list(
> > >   #define for_each_xfs_repair_extent_safe(rbe, n, exlist) \
> > >   	list_for_each_entry_safe((rbe), (n), &(exlist)->list, list)
> > > +#define for_each_xfs_repair_extent(rbe, exlist) \
> > > +	list_for_each_entry((rbe), &(exlist)->list, list)
> > >   int xfs_repair_collect_btree_extent(struct xfs_scrub_context *sc,
> > >   		struct xfs_repair_extent_list *btlist, xfs_fsblock_t fsbno,
> > >   		xfs_extlen_t len);
> > > +int xfs_repair_collect_btree_cur_blocks(struct xfs_scrub_context *sc,
> > > +		struct xfs_btree_cur *cur,
> > > +		int (*iter_fn)(struct xfs_scrub_context *sc,
> > > +			       xfs_fsblock_t fsbno, xfs_fsblock_t len,
> > > +			       void *priv),
> > > +		void *priv);
> > > +int xfs_repair_collect_btree_cur_blocks_in_extent_list(
> > > +		struct xfs_scrub_context *sc, xfs_fsblock_t fsbno,
> > > +		xfs_fsblock_t len, void *priv);
> > >   void xfs_repair_cancel_btree_extents(struct xfs_scrub_context *sc,
> > >   		struct xfs_repair_extent_list *btlist);
> > >   int xfs_repair_subtract_extents(struct xfs_scrub_context *sc,
> > > @@ -89,6 +100,8 @@ int xfs_repair_ino_dqattach(struct xfs_scrub_context *sc);
> > >   int xfs_repair_probe(struct xfs_scrub_context *sc);
> > >   int xfs_repair_superblock(struct xfs_scrub_context *sc);
> > > +int xfs_repair_agf(struct xfs_scrub_context *sc);
> > > +int xfs_repair_agfl(struct xfs_scrub_context *sc);
> > >   #else
> > > @@ -112,6 +125,8 @@ xfs_repair_calc_ag_resblks(
> > >   #define xfs_repair_probe		xfs_repair_notsupported
> > >   #define xfs_repair_superblock		xfs_repair_notsupported
> > > +#define xfs_repair_agf			xfs_repair_notsupported
> > > +#define xfs_repair_agfl			xfs_repair_notsupported
> > >   #endif /* CONFIG_XFS_ONLINE_REPAIR */
> > > diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> > > index 58ae76b3a421..8e11c3c699fb 100644
> > > --- a/fs/xfs/scrub/scrub.c
> > > +++ b/fs/xfs/scrub/scrub.c
> > > @@ -208,13 +208,13 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
> > >   		.type	= ST_PERAG,
> > >   		.setup	= xfs_scrub_setup_fs,
> > >   		.scrub	= xfs_scrub_agf,
> > > -		.repair	= xfs_repair_notsupported,
> > > +		.repair	= xfs_repair_agf,
> > >   	},
> > >   	[XFS_SCRUB_TYPE_AGFL]= {	/* agfl */
> > >   		.type	= ST_PERAG,
> > >   		.setup	= xfs_scrub_setup_fs,
> > >   		.scrub	= xfs_scrub_agfl,
> > > -		.repair	= xfs_repair_notsupported,
> > > +		.repair	= xfs_repair_agfl,
> > >   	},
> > >   	[XFS_SCRUB_TYPE_AGI] = {	/* agi */
> > >   		.type	= ST_PERAG,
> > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > > index 524f543c5b82..c08785cf83a9 100644
> > > --- a/fs/xfs/xfs_trans.c
> > > +++ b/fs/xfs/xfs_trans.c
> > > @@ -126,6 +126,60 @@ xfs_trans_dup(
> > >   	return ntp;
> > >   }
> > > +/*
> > > + * Try to reserve more blocks for a transaction.  The single use case we
> > > + * support is for online repair -- use a transaction to gather data without
> > > + * fear of btree cycle deadlocks; calculate how many blocks we really need
> > > + * from that data; and only then start modifying data.  This can fail due to
> > > + * ENOSPC, so we have to be able to cancel the transaction.
> > > + */
> > > +int
> > > +xfs_trans_reserve_more(
> > > +	struct xfs_trans	*tp,
> > > +	uint			blocks,
> > > +	uint			rtextents)
> > 
> > This isn't used in this patch - seems out of place here. Committed
> > to the wrong patch?
> > 
> > Cheers,
> > 
> > Dave.
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 04/21] xfs: repair the AGF and AGFL
  2018-06-27 23:37       ` Dave Chinner
@ 2018-06-29 15:14         ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-29 15:14 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Allison Henderson, linux-xfs

On Thu, Jun 28, 2018 at 09:37:20AM +1000, Dave Chinner wrote:
> On Wed, Jun 27, 2018 at 09:44:53AM -0700, Allison Henderson wrote:
> > On 06/26/2018 07:19 PM, Dave Chinner wrote:
> > >On Sun, Jun 24, 2018 at 12:23:54PM -0700, Darrick J. Wong wrote:
> > >>From: Darrick J. Wong <darrick.wong@oracle.com>
> > >>
> > >>Regenerate the AGF and AGFL from the rmap data.
> > >>
> > >>Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > >
> > >[...]
> 
> > >>+	/* Record all the OWN_AG blocks. */
> > >>+	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
> > >>+		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
> > >>+				rec->rm_startblock);
> > >>+		error = xfs_repair_collect_btree_extent(ra->sc,
> > >>+				ra->freesp_list, fsb, rec->rm_blockcount);
> > >>+		if (error)
> > >>+			return error;
> > >>+	}
> > >>+
> > >>+	return xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
> > >>+			xfs_repair_collect_btree_cur_blocks_in_extent_list,
> > >
> > >Urk. The function name lengths are getting out of hand. I'm very
> > >tempted to suggest we should shorten the namespace of all this
> > >like s/xfs_repair_/xr_/ and s/xfs_scrub_/xs_/, etc just to make them
> > >shorter and easier to read.
> > >
> > >Oh, wait, did I say that out loud? :P
> > >
> > >Something to think about, anyway.
> > >
> > Well they are sort of long, but TBH I think I still kind of
> > appreciate the extra verbiage.  I have seen other projects do things
> > like adopt a sort of 3 or 4 letter abbreviation (like maybe xfs_scrb
> > or xfs_repr). Helps to cut down on the verbosity while still not
> > losing too much of what it is supposed to mean.  Just another idea
> > to consider. :-)
> 
> We've got that in places, too, like "xlog_" prefixes for all the log
> code, so that's not an unreasonable thing to suggest. After all, in
> many cases we're talking about a tradeoff between readabilty and the
> amount of typing necessary.

I propose(d on IRC) to shorten the prefixes to xrep_ and xchk_.

I'll also take a look at condensing the non-prefix parts of the names.
Agreed that they're too long now.

> However, IMO, function names so long they need a line of their own
> indicate we have a structural problem in our code, not a
> readability problem. We should not need names that long to document
> what the function does - it should be obvious from the context, the
> abstraction that is being used and a short name....
> 
> e.g. how many of these different "collect extent" operations could
> be abstracted into a common extent list structure and generic
> callbacks? It seems there's a lot of similarity in them, and we're
> really only differentiating them by adding more namespace and
> context specific information into the structure and function names.

I'd /really/ like to convert this into a proper incore bitmap and then
turn these operations into proper bitmask operations, since
collect_btree_extents merely sets ranges of bits in two bitmaps and
subtract_extents computes the difference between them.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 06/21] xfs: repair free space btrees
  2018-06-24 19:24 ` [PATCH 06/21] xfs: repair free space btrees Darrick J. Wong
  2018-06-27  3:21   ` Dave Chinner
@ 2018-06-30 17:36   ` Allison Henderson
  1 sibling, 0 replies; 77+ messages in thread
From: Allison Henderson @ 2018-06-30 17:36 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 06/24/2018 12:24 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Rebuild the free space btrees from the gaps in the rmap btree.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>   fs/xfs/Makefile             |    1
>   fs/xfs/scrub/alloc.c        |    1
>   fs/xfs/scrub/alloc_repair.c |  561 +++++++++++++++++++++++++++++++++++++++++++
>   fs/xfs/scrub/common.c       |    8 +
>   fs/xfs/scrub/repair.h       |    2
>   fs/xfs/scrub/scrub.c        |    4
>   fs/xfs/xfs_extent_busy.c    |   14 +
>   fs/xfs/xfs_extent_busy.h    |    4
>   8 files changed, 591 insertions(+), 4 deletions(-)
>   create mode 100644 fs/xfs/scrub/alloc_repair.c
> 
> 
> diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
> index a36cccbec169..841e0824eeb6 100644
> --- a/fs/xfs/Makefile
> +++ b/fs/xfs/Makefile
> @@ -164,6 +164,7 @@ xfs-$(CONFIG_XFS_QUOTA)		+= scrub/quota.o
>   ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
>   xfs-y				+= $(addprefix scrub/, \
>   				   agheader_repair.o \
> +				   alloc_repair.o \
>   				   repair.o \
>   				   )
>   endif
> diff --git a/fs/xfs/scrub/alloc.c b/fs/xfs/scrub/alloc.c
> index 50e4f7fa06f0..e2514c84cb7a 100644
> --- a/fs/xfs/scrub/alloc.c
> +++ b/fs/xfs/scrub/alloc.c
> @@ -15,7 +15,6 @@
>   #include "xfs_log_format.h"
>   #include "xfs_trans.h"
>   #include "xfs_sb.h"
> -#include "xfs_alloc.h"
>   #include "xfs_rmap.h"
>   #include "xfs_alloc.h"
>   #include "scrub/xfs_scrub.h"
> diff --git a/fs/xfs/scrub/alloc_repair.c b/fs/xfs/scrub/alloc_repair.c
> new file mode 100644
> index 000000000000..c25a2b0d71f1
> --- /dev/null
> +++ b/fs/xfs/scrub/alloc_repair.c
> @@ -0,0 +1,561 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright (C) 2018 Oracle.  All Rights Reserved.
> + * Author: Darrick J. Wong <darrick.wong@oracle.com>
> + */
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "xfs_shared.h"
> +#include "xfs_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_mount.h"
> +#include "xfs_defer.h"
> +#include "xfs_btree.h"
> +#include "xfs_bit.h"
> +#include "xfs_log_format.h"
> +#include "xfs_trans.h"
> +#include "xfs_sb.h"
> +#include "xfs_alloc.h"
> +#include "xfs_alloc_btree.h"
> +#include "xfs_rmap.h"
> +#include "xfs_rmap_btree.h"
> +#include "xfs_inode.h"
> +#include "xfs_refcount.h"
> +#include "xfs_extent_busy.h"
> +#include "scrub/xfs_scrub.h"
> +#include "scrub/scrub.h"
> +#include "scrub/common.h"
> +#include "scrub/btree.h"
> +#include "scrub/trace.h"
> +#include "scrub/repair.h"
> +
> +/*
> + * Free Space Btree Repair
> + * =======================
> + *
> + * The reverse mappings are supposed to record all space usage for the entire
> + * AG.  Therefore, we can recalculate the free extents in an AG by looking for
> + * gaps in the physical extents recorded in the rmapbt.  On a reflink
> + * filesystem this is a little more tricky in that we have to be aware that
> + * the rmap records are allowed to overlap.
> + *
> + * We derive which blocks belonged to the old bnobt/cntbt by recording all the
> + * OWN_AG extents and subtracting out the blocks owned by all other OWN_AG
> + * metadata: the rmapbt blocks visited while iterating the reverse mappings
> + * and the AGFL blocks.
> + *
> + * Once we have both of those pieces, we can reconstruct the bnobt and cntbt
> + * by blowing out the free block state and freeing all the extents that we
> + * found.  This adds the requirement that we can't have any busy extents in
> + * the AG because the busy code cannot handle duplicate records.
> + *
> + * Note that we can only rebuild both free space btrees at the same time
> + * because the regular extent freeing infrastructure loads both btrees at the
> + * same time.
> + */
> +
> +struct xfs_repair_alloc_extent {
> +	struct list_head		list;
> +	xfs_agblock_t			bno;
> +	xfs_extlen_t			len;
> +};
> +
> +struct xfs_repair_alloc {
> +	struct xfs_repair_extent_list	nobtlist; /* rmapbt/agfl blocks */
> +	struct xfs_repair_extent_list	*btlist;  /* OWN_AG blocks */
> +	struct list_head		*extlist; /* free extents */
> +	struct xfs_scrub_context	*sc;
> +	uint64_t			nr_records; /* length of extlist */
> +	xfs_agblock_t			next_bno; /* next bno we want to see */
> +	xfs_agblock_t			nr_blocks; /* free blocks in extlist */
Align the comments on the right to a common column?

> +};
> +
> +/* Record extents that aren't in use from gaps in the rmap records. */
> +STATIC int
> +xfs_repair_alloc_extent_fn(
> +	struct xfs_btree_cur		*cur,
> +	struct xfs_rmap_irec		*rec,
> +	void				*priv)
> +{
> +	struct xfs_repair_alloc		*ra = priv;
> +	struct xfs_repair_alloc_extent	*rae;
> +	xfs_fsblock_t			fsb;
> +	int				error;
> +
> +	/* Record all the OWN_AG blocks... */
> +	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
> +		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
> +				rec->rm_startblock);
> +		error = xfs_repair_collect_btree_extent(ra->sc,
> +				ra->btlist, fsb, rec->rm_blockcount);
> +		if (error)
> +			return error;
> +	}
> +
> +	/* ...and all the rmapbt blocks... */
> +	error = xfs_repair_collect_btree_cur_blocks(ra->sc, cur,
> +			xfs_repair_collect_btree_cur_blocks_in_extent_list,
> +			&ra->nobtlist);
> +	if (error)
> +		return error;
> +
> +	/* ...and all the free space. */
> +	if (rec->rm_startblock > ra->next_bno) {
> +		trace_xfs_repair_alloc_extent_fn(cur->bc_mp,
> +				cur->bc_private.a.agno,
> +				ra->next_bno, rec->rm_startblock - ra->next_bno,
> +				XFS_RMAP_OWN_NULL, 0, 0);
> +
> +		rae = kmem_alloc(sizeof(struct xfs_repair_alloc_extent),
> +				KM_MAYFAIL);
> +		if (!rae)
> +			return -ENOMEM;
> +		INIT_LIST_HEAD(&rae->list);
> +		rae->bno = ra->next_bno;
> +		rae->len = rec->rm_startblock - ra->next_bno;
> +		list_add_tail(&rae->list, ra->extlist);
> +		ra->nr_records++;
> +		ra->nr_blocks += rae->len;
> +	}
> +	ra->next_bno = max_t(xfs_agblock_t, ra->next_bno,
> +			rec->rm_startblock + rec->rm_blockcount);
> +	return 0;
> +}
Alrighty, seems to follow the commentary.  Thx!

> +
> +/* Collect an AGFL block for the not-to-release list. */
> +static int
> +xfs_repair_collect_agfl_block(
> +	struct xfs_mount		*mp,
> +	xfs_agblock_t			bno,
> +	void				*priv)
> +{
> +	struct xfs_repair_alloc		*ra = priv;
> +	xfs_fsblock_t			fsb;
> +
> +	fsb = XFS_AGB_TO_FSB(mp, ra->sc->sa.agno, bno);
> +	return xfs_repair_collect_btree_extent(ra->sc, &ra->nobtlist, fsb, 1);
> +}
> +
> +/* Compare two btree extents. */
> +static int
> +xfs_repair_allocbt_extent_cmp(
> +	void				*priv,
> +	struct list_head		*a,
> +	struct list_head		*b)
> +{
> +	struct xfs_repair_alloc_extent	*ap;
> +	struct xfs_repair_alloc_extent	*bp;
> +
> +	ap = container_of(a, struct xfs_repair_alloc_extent, list);
> +	bp = container_of(b, struct xfs_repair_alloc_extent, list);
> +
> +	if (ap->bno > bp->bno)
> +		return 1;
> +	else if (ap->bno < bp->bno)
> +		return -1;
> +	return 0;
> +}
> +
> +/* Put an extent onto the free list. */
> +STATIC int
> +xfs_repair_allocbt_free_extent(
While on the topic of name shortening, I've noticed other places
in the code shorten "extent" to "ext", and it seems pretty readable.
Just a suggestion if it helps :-)


> +	struct xfs_scrub_context	*sc,
> +	xfs_fsblock_t			fsbno,
> +	xfs_extlen_t			len,
> +	struct xfs_owner_info		*oinfo)
> +{
> +	int				error;
> +
> +	error = xfs_free_extent(sc->tp, fsbno, len, oinfo, 0);
> +	if (error)
> +		return error;
> +	error = xfs_repair_roll_ag_trans(sc);
> +	if (error)
> +		return error;
> +	return xfs_mod_fdblocks(sc->mp, -(int64_t)len, false);
> +}
> +
> +/* Find the longest free extent in the list. */
> +static struct xfs_repair_alloc_extent *
> +xfs_repair_allocbt_get_longest(
> +	struct list_head		*free_extents)
> +{
> +	struct xfs_repair_alloc_extent	*rae;
> +	struct xfs_repair_alloc_extent	*res = NULL;
> +
> +	list_for_each_entry(rae, free_extents, list) {
> +		if (!res || rae->len > res->len)
> +			res = rae;
> +	}
> +	return res;
> +}
> +
> +/* Find the shortest free extent in the list. */
> +static struct xfs_repair_alloc_extent *
> +xfs_repair_allocbt_get_shortest(
> +	struct list_head		*free_extents)
> +{
> +	struct xfs_repair_alloc_extent	*rae;
> +	struct xfs_repair_alloc_extent	*res = NULL;
> +
> +	list_for_each_entry(rae, free_extents, list) {
> +		if (!res || rae->len < res->len)
> +			res = rae;
> +		if (res->len == 1)
> +			break;
> +	}
> +	return res;
> +}
> +
> +/*
> + * Allocate a block from the (cached) shortest extent in the AG.  In theory
> + * this should never fail, since we already checked that there was enough
> + * space to handle the new btrees.
> + */
> +STATIC xfs_fsblock_t
> +xfs_repair_allocbt_alloc_block(
> +	struct xfs_scrub_context	*sc,
> +	struct list_head		*free_extents,
> +	struct xfs_repair_alloc_extent	**cached_result)
> +{
> +	struct xfs_repair_alloc_extent	*ext = *cached_result;
> +	xfs_fsblock_t			fsb;
> +
> +	/* No cached result, see if we can find another. */
> +	if (!ext) {
> +		ext = xfs_repair_allocbt_get_shortest(free_extents);
> +		ASSERT(ext);
> +		if (!ext)
> +			return NULLFSBLOCK;
> +	}
> +
> +	/* Subtract one block. */
> +	fsb = XFS_AGB_TO_FSB(sc->mp, sc->sa.agno, ext->bno);
> +	ext->bno++;
> +	ext->len--;
> +	if (ext->len == 0) {
> +		list_del(&ext->list);
> +		kmem_free(ext);
> +		ext = NULL;
> +	}
> +
> +	*cached_result = ext;
> +	return fsb;
> +}
> +
> +/* Free every record in the extent list. */
> +STATIC void
> +xfs_repair_allocbt_cancel_freelist(
> +	struct list_head		*extlist)
> +{
> +	struct xfs_repair_alloc_extent	*rae;
> +	struct xfs_repair_alloc_extent	*n;
> +
> +	list_for_each_entry_safe(rae, n, extlist, list) {
> +		list_del(&rae->list);
> +		kmem_free(rae);
> +	}
> +}
> +
> +/*
> + * Iterate all reverse mappings to find (1) the free extents, (2) the OWN_AG
> + * extents, (3) the rmapbt blocks, and (4) the AGFL blocks.  The free space is
> + * (1) + (2) - (3) - (4).  Figure out if we have enough free space to
> + * reconstruct the free space btrees.  Caller must clean up the input lists
> + * if something goes wrong.
> + */
> +STATIC int
> +xfs_repair_allocbt_find_freespace(
> +	struct xfs_scrub_context	*sc,
> +	struct list_head		*free_extents,
> +	struct xfs_repair_extent_list	*old_allocbt_blocks)
> +{
> +	struct xfs_repair_alloc		ra;
> +	struct xfs_repair_alloc_extent	*rae;
> +	struct xfs_btree_cur		*cur;
> +	struct xfs_mount		*mp = sc->mp;
> +	xfs_agblock_t			agend;
> +	xfs_agblock_t			nr_blocks;
> +	int				error;
> +
> +	ra.extlist = free_extents;
> +	ra.btlist = old_allocbt_blocks;
> +	xfs_repair_init_extent_list(&ra.nobtlist);
> +	ra.next_bno = 0;
> +	ra.nr_records = 0;
> +	ra.nr_blocks = 0;
> +	ra.sc = sc;
> +
> +	/*
> +	 * Iterate all the reverse mappings to find gaps in the physical
> +	 * mappings, all the OWN_AG blocks, and all the rmapbt extents.
> +	 */
> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
> +	error = xfs_rmap_query_all(cur, xfs_repair_alloc_extent_fn, &ra);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +	cur = NULL;
> +
> +	/* Insert a record for space between the last rmap and EOAG. */
> +	agend = be32_to_cpu(XFS_BUF_TO_AGF(sc->sa.agf_bp)->agf_length);
> +	if (ra.next_bno < agend) {
> +		rae = kmem_alloc(sizeof(struct xfs_repair_alloc_extent),
> +				KM_MAYFAIL);
> +		if (!rae) {
> +			error = -ENOMEM;
> +			goto err;
> +		}
> +		INIT_LIST_HEAD(&rae->list);
> +		rae->bno = ra.next_bno;
> +		rae->len = agend - ra.next_bno;
> +		list_add_tail(&rae->list, free_extents);
> +		ra.nr_records++;
> +	}
> +
> +	/* Collect all the AGFL blocks. */
> +	error = xfs_agfl_walk(mp, XFS_BUF_TO_AGF(sc->sa.agf_bp),
> +			sc->sa.agfl_bp, xfs_repair_collect_agfl_block, &ra);
> +	if (error)
> +		goto err;
> +
> +	/* Do we actually have enough space to do this? */
> +	nr_blocks = 2 * xfs_allocbt_calc_size(mp, ra.nr_records);
> +	if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE) ||
> +	    ra.nr_blocks < nr_blocks) {
> +		error = -ENOSPC;
> +		goto err;
> +	}
> +
> +	/* Compute the old bnobt/cntbt blocks. */
> +	error = xfs_repair_subtract_extents(sc, old_allocbt_blocks,
> +			&ra.nobtlist);
> +	if (error)
> +		goto err;
> +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> +	return 0;
> +
> +err:
> +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> +	if (cur)
> +		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> +	return error;
> +}
Ok, makes sense after some digging.  I might not have figured out the
factor of 2 (one set of blocks for each of the two btrees) had Dave
not pointed that out, though.  But for the most part the in-body
comments help a lot.  Thx!

> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 07/21] xfs: repair inode btrees
  2018-06-24 19:24 ` [PATCH 07/21] xfs: repair inode btrees Darrick J. Wong
  2018-06-28  0:55   ` Dave Chinner
@ 2018-06-30 17:36   ` Allison Henderson
  2018-06-30 18:30     ` Darrick J. Wong
  1 sibling, 1 reply; 77+ messages in thread
From: Allison Henderson @ 2018-06-30 17:36 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 06/24/2018 12:24 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Use the rmapbt to find inode chunks, query the chunks to compute
> hole and free masks, and with that information rebuild the inobt
> and finobt.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>   fs/xfs/Makefile              |    1
>   fs/xfs/scrub/ialloc_repair.c |  585 ++++++++++++++++++++++++++++++++++++++++++
>   fs/xfs/scrub/repair.h        |    2
>   fs/xfs/scrub/scrub.c         |    4
>   4 files changed, 590 insertions(+), 2 deletions(-)
>   create mode 100644 fs/xfs/scrub/ialloc_repair.c
> 
> 
> diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
> index 841e0824eeb6..837fd4a95f6f 100644
> --- a/fs/xfs/Makefile
> +++ b/fs/xfs/Makefile
> @@ -165,6 +165,7 @@ ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
>   xfs-y				+= $(addprefix scrub/, \
>   				   agheader_repair.o \
>   				   alloc_repair.o \
> +				   ialloc_repair.o \
>   				   repair.o \
>   				   )
>   endif
> diff --git a/fs/xfs/scrub/ialloc_repair.c b/fs/xfs/scrub/ialloc_repair.c
> new file mode 100644
> index 000000000000..29c736466bba
> --- /dev/null
> +++ b/fs/xfs/scrub/ialloc_repair.c
> @@ -0,0 +1,585 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright (C) 2018 Oracle.  All Rights Reserved.
> + * Author: Darrick J. Wong <darrick.wong@oracle.com>
> + */
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "xfs_shared.h"
> +#include "xfs_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_mount.h"
> +#include "xfs_defer.h"
> +#include "xfs_btree.h"
> +#include "xfs_bit.h"
> +#include "xfs_log_format.h"
> +#include "xfs_trans.h"
> +#include "xfs_sb.h"
> +#include "xfs_inode.h"
> +#include "xfs_alloc.h"
> +#include "xfs_ialloc.h"
> +#include "xfs_ialloc_btree.h"
> +#include "xfs_icache.h"
> +#include "xfs_rmap.h"
> +#include "xfs_rmap_btree.h"
> +#include "xfs_log.h"
> +#include "xfs_trans_priv.h"
> +#include "xfs_error.h"
> +#include "scrub/xfs_scrub.h"
> +#include "scrub/scrub.h"
> +#include "scrub/common.h"
> +#include "scrub/btree.h"
> +#include "scrub/trace.h"
> +#include "scrub/repair.h"
> +
> +/*
> + * Inode Btree Repair
> + * ==================
> + *
> + * Iterate the reverse mapping records looking for OWN_INODES and OWN_INOBT
> + * records.  The OWN_INOBT records are the old inode btree blocks and will be
> + * cleared out after we've rebuilt the tree.  Each possible inode chunk within
> + * an OWN_INODES record will be read in and the freemask calculated from the
> + * i_mode data in the inode chunk.  For sparse inodes the holemask will be
> + * calculated by creating the properly aligned inobt record and punching out
> + * any chunk that's missing.  Inode allocations and frees grab the AGI first,
> + * so repair protects itself from concurrent access by locking the AGI.
> + *
> + * Once we've reconstructed all the inode records, we can create new inode
> + * btree roots and reload the btrees.  We rebuild both inode trees at the same
> + * time because they have the same rmap owner and it would be more complex to
> + * figure out if the other tree isn't in need of a rebuild and which OWN_INOBT
> + * blocks it owns.  We have all the data we need to build both, so dump
> + * everything and start over.
> + */
> +
> +struct xfs_repair_ialloc_extent {
> +	struct list_head		list;
> +	xfs_inofree_t			freemask;
> +	xfs_agino_t			startino;
> +	unsigned int			count;
> +	unsigned int			usedcount;
> +	uint16_t			holemask;
> +};
> +
> +struct xfs_repair_ialloc {
> +	struct list_head		*extlist;
> +	struct xfs_repair_extent_list	*btlist;
> +	struct xfs_scrub_context	*sc;
> +	uint64_t			nr_records;
> +};
> +
> +/*
> + * Is this inode in use?  If the inode is in memory we can tell from i_mode,
> + * otherwise we have to check di_mode in the on-disk buffer.  We only care
> + * that the high (i.e. non-permission) bits of _mode are zero.  This should be
> + * safe because repair keeps all AG headers locked until the end, and any
> + * process trying to perform an inode allocation/free must lock the AGI.
> + */
> +STATIC int
> +xfs_repair_ialloc_check_free(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*bp,
> +	xfs_ino_t			fsino,
> +	xfs_agino_t			bpino,
> +	bool				*inuse)
> +{
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_dinode		*dip;
> +	int				error;
> +
> +	/* Will the in-core inode tell us if it's in use? */
> +	error = xfs_icache_inode_is_allocated(mp, sc->tp, fsino, inuse);
> +	if (!error)
> +		return 0;
> +
> +	/* Inode uncached or half assembled, read disk buffer */
> +	dip = xfs_buf_offset(bp, bpino * mp->m_sb.sb_inodesize);
> +	if (be16_to_cpu(dip->di_magic) != XFS_DINODE_MAGIC)
> +		return -EFSCORRUPTED;
> +
> +	if (dip->di_version >= 3 && be64_to_cpu(dip->di_ino) != fsino)
> +		return -EFSCORRUPTED;
> +
> +	*inuse = dip->di_mode != 0;
> +	return 0;
> +}
> +
> +/*
> + * For each cluster in this blob of inode, we must calculate the
Ok, so I've been over this one a few times, and I still don't feel
like I've figured out what a "blob of inode" is, so I'm going to have
to break and ask for clarification on that one.  Thx! :-)

> + * properly aligned startino of that cluster, then iterate each
> + * cluster to fill in used and filled masks appropriately.  We
> + * then use the (startino, used, filled) information to construct
> + * the appropriate inode records.
> + */
> +STATIC int
> +xfs_repair_ialloc_process_cluster(
> +	struct xfs_repair_ialloc	*ri,
> +	xfs_agblock_t			agbno,
> +	int				blks_per_cluster,
> +	xfs_agino_t			rec_agino)
> +{
> +	struct xfs_imap			imap;
> +	struct xfs_repair_ialloc_extent	*rie;
> +	struct xfs_dinode		*dip;
> +	struct xfs_buf			*bp;
> +	struct xfs_scrub_context	*sc = ri->sc;
> +	struct xfs_mount		*mp = sc->mp;
> +	xfs_ino_t			fsino;
> +	xfs_inofree_t			usedmask;
> +	xfs_agino_t			nr_inodes;
> +	xfs_agino_t			startino;
> +	xfs_agino_t			clusterino;
> +	xfs_agino_t			clusteroff;
> +	xfs_agino_t			agino;
> +	uint16_t			fillmask;
> +	bool				inuse;
> +	int				usedcount;
> +	int				error;
> +
> +	/* The per-AG inum of this inode cluster. */
> +	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
> +
> +	/* The per-AG inum of the inobt record. */
> +	startino = rec_agino + rounddown(agino - rec_agino,
> +			XFS_INODES_PER_CHUNK);
> +
> +	/* The per-AG inum of the cluster within the inobt record. */
> +	clusteroff = agino - startino;
> +
> +	/* Every inode in this holemask slot is filled. */
> +	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
> +	fillmask = xfs_inobt_maskn(clusteroff / XFS_INODES_PER_HOLEMASK_BIT,
> +			nr_inodes / XFS_INODES_PER_HOLEMASK_BIT);
> +
> +	/* Grab the inode cluster buffer. */
> +	imap.im_blkno = XFS_AGB_TO_DADDR(mp, sc->sa.agno, agbno);
> +	imap.im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
> +	imap.im_boffset = 0;
> +
> +	error = xfs_imap_to_bp(mp, sc->tp, &imap, &dip, &bp, 0,
> +			XFS_IGET_UNTRUSTED);
> +	if (error)
> +		return error;
> +
> +	usedmask = 0;
> +	usedcount = 0;
> +	/* Which inodes within this cluster are free? */
> +	for (clusterino = 0; clusterino < nr_inodes; clusterino++) {
> +		fsino = XFS_AGINO_TO_INO(mp, sc->sa.agno, agino + clusterino);
> +		error = xfs_repair_ialloc_check_free(sc, bp, fsino,
> +				clusterino, &inuse);
> +		if (error) {
> +			xfs_trans_brelse(sc->tp, bp);
> +			return error;
> +		}
> +		if (inuse) {
> +			usedcount++;
> +			usedmask |= XFS_INOBT_MASK(clusteroff + clusterino);
> +		}
> +	}
> +	xfs_trans_brelse(sc->tp, bp);
> +
> +	/*
> +	 * If the last item in the list is our chunk record,
> +	 * update that.
> +	 */
> +	if (!list_empty(ri->extlist)) {
> +		rie = list_last_entry(ri->extlist,
> +				struct xfs_repair_ialloc_extent, list);
> +		if (rie->startino + XFS_INODES_PER_CHUNK > startino) {
> +			rie->freemask &= ~usedmask;
> +			rie->holemask &= ~fillmask;
> +			rie->count += nr_inodes;
> +			rie->usedcount += usedcount;
> +			return 0;
> +		}
> +	}
> +
> +	/* New inode chunk; add to the list. */
> +	rie = kmem_alloc(sizeof(struct xfs_repair_ialloc_extent), KM_MAYFAIL);
> +	if (!rie)
> +		return -ENOMEM;
> +
> +	INIT_LIST_HEAD(&rie->list);
> +	rie->startino = startino;
> +	rie->freemask = XFS_INOBT_ALL_FREE & ~usedmask;
> +	rie->holemask = XFS_INOBT_ALL_FREE & ~fillmask;
> +	rie->count = nr_inodes;
> +	rie->usedcount = usedcount;
> +	list_add_tail(&rie->list, ri->extlist);
> +	ri->nr_records++;
> +
> +	return 0;
> +}
> +
> +/* Record extents that belong to inode btrees. */
> +STATIC int
> +xfs_repair_ialloc_extent_fn(
> +	struct xfs_btree_cur		*cur,
> +	struct xfs_rmap_irec		*rec,
> +	void				*priv)
> +{
> +	struct xfs_repair_ialloc	*ri = priv;
> +	struct xfs_mount		*mp = cur->bc_mp;
> +	xfs_fsblock_t			fsbno;
> +	xfs_agblock_t			agbno = rec->rm_startblock;
> +	xfs_agino_t			inoalign;
> +	xfs_agino_t			agino;
> +	xfs_agino_t			rec_agino;
> +	int				blks_per_cluster;
> +	int				error = 0;
> +
> +	if (xfs_scrub_should_terminate(ri->sc, &error))
> +		return error;
> +
> +	/* Fragment of the old btrees; dispose of them later. */
> +	if (rec->rm_owner == XFS_RMAP_OWN_INOBT) {
> +		fsbno = XFS_AGB_TO_FSB(mp, ri->sc->sa.agno, agbno);
> +		return xfs_repair_collect_btree_extent(ri->sc, ri->btlist,
> +				fsbno, rec->rm_blockcount);
> +	}
> +
> +	/* Skip extents which are not owned by this inode and fork. */
> +	if (rec->rm_owner != XFS_RMAP_OWN_INODES)
> +		return 0;
> +
> +	blks_per_cluster = xfs_icluster_size_fsb(mp);
> +
> +	if (agbno % blks_per_cluster != 0)
> +		return -EFSCORRUPTED;
> +
> +	trace_xfs_repair_ialloc_extent_fn(mp, ri->sc->sa.agno,
> +			rec->rm_startblock, rec->rm_blockcount, rec->rm_owner,
> +			rec->rm_offset, rec->rm_flags);
> +
> +	/*
> +	 * Determine the inode block alignment, and where the block
> +	 * ought to start if it's aligned properly.  On a sparse inode
> +	 * system the rmap doesn't have to start on an alignment boundary,
> +	 * but the record does.  On pre-sparse filesystems, we /must/
> +	 * start both rmap and inobt on an alignment boundary.
> +	 */
> +	inoalign = xfs_ialloc_cluster_alignment(mp);
> +	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
> +	rec_agino = XFS_OFFBNO_TO_AGINO(mp, rounddown(agbno, inoalign), 0);
> +	if (!xfs_sb_version_hassparseinodes(&mp->m_sb) && agino != rec_agino)
> +		return -EFSCORRUPTED;
> +
> +	/* Set up the free/hole masks for each cluster in this inode chunk. */
By chunk, did you mean record?  Please try to keep the terminology
consistent as best you can.  Thx! :-)

> +	for (;
> +	     agbno < rec->rm_startblock + rec->rm_blockcount;
> +	     agbno += blks_per_cluster) {
> +		error = xfs_repair_ialloc_process_cluster(ri, agbno,
> +				blks_per_cluster, rec_agino);
> +		if (error)
> +			return error;
> +	}
> +
> +	return 0;
> +}
> +
> +/* Compare two ialloc extents. */
> +static int
> +xfs_repair_ialloc_extent_cmp(
> +	void				*priv,
> +	struct list_head		*a,
> +	struct list_head		*b)
> +{
> +	struct xfs_repair_ialloc_extent	*ap;
> +	struct xfs_repair_ialloc_extent	*bp;
> +
> +	ap = container_of(a, struct xfs_repair_ialloc_extent, list);
> +	bp = container_of(b, struct xfs_repair_ialloc_extent, list);
> +
> +	if (ap->startino > bp->startino)
> +		return 1;
> +	else if (ap->startino < bp->startino)
> +		return -1;
> +	return 0;
> +}
> +
> +/* Insert an inode chunk record into a given btree. */
> +static int
> +xfs_repair_iallocbt_insert_btrec(
> +	struct xfs_btree_cur		*cur,
> +	struct xfs_repair_ialloc_extent	*rie)
> +{
> +	int				stat;
> +	int				error;
> +
> +	error = xfs_inobt_lookup(cur, rie->startino, XFS_LOOKUP_EQ, &stat);
> +	if (error)
> +		return error;
> +	XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 0);
> +	error = xfs_inobt_insert_rec(cur, rie->holemask, rie->count,
> +			rie->count - rie->usedcount, rie->freemask, &stat);
> +	if (error)
> +		return error;
> +	XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 1);
> +	return error;
> +}
> +
> +/* Insert an inode chunk record into both inode btrees. */
> +static int
> +xfs_repair_iallocbt_insert_rec(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_repair_ialloc_extent	*rie)
> +{
> +	struct xfs_btree_cur		*cur;
> +	int				error;
> +
> +	trace_xfs_repair_ialloc_insert(sc->mp, sc->sa.agno, rie->startino,
> +			rie->holemask, rie->count, rie->count - rie->usedcount,
> +			rie->freemask);
> +
> +	/* Insert into the inobt. */
> +	cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp, sc->sa.agno,
> +			XFS_BTNUM_INO);
> +	error = xfs_repair_iallocbt_insert_btrec(cur, rie);
> +	if (error)
> +		goto out_cur;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +
> +	/* Insert into the finobt if chunk has free inodes. */
> +	if (xfs_sb_version_hasfinobt(&sc->mp->m_sb) &&
> +	    rie->count != rie->usedcount) {
> +		cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp,
> +				sc->sa.agno, XFS_BTNUM_FINO);
> +		error = xfs_repair_iallocbt_insert_btrec(cur, rie);
> +		if (error)
> +			goto out_cur;
> +		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +	}
> +
> +	return xfs_repair_roll_ag_trans(sc);
> +out_cur:
> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> +	return error;
> +}
> +
> +/* Free every record in the inode list. */
> +STATIC void
> +xfs_repair_iallocbt_cancel_inorecs(
> +	struct list_head		*reclist)
> +{
> +	struct xfs_repair_ialloc_extent	*rie;
> +	struct xfs_repair_ialloc_extent	*n;
> +
> +	list_for_each_entry_safe(rie, n, reclist, list) {
> +		list_del(&rie->list);
> +		kmem_free(rie);
> +	}
> +}
> +
> +/*
> + * Iterate all reverse mappings to find the inodes (OWN_INODES) and the inode
> + * btrees (OWN_INOBT).  Figure out if we have enough free space to reconstruct
> + * the inode btrees.  The caller must clean up the lists if anything goes
> + * wrong.
> + */
> +STATIC int
> +xfs_repair_iallocbt_find_inodes(
> +	struct xfs_scrub_context	*sc,
> +	struct list_head		*inode_records,
> +	struct xfs_repair_extent_list	*old_iallocbt_blocks)
> +{
> +	struct xfs_repair_ialloc	ri;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_btree_cur		*cur;
> +	xfs_agblock_t			nr_blocks;
> +	int				error;
> +
> +	/* Collect all reverse mappings for inode blocks. */
> +	ri.extlist = inode_records;
> +	ri.btlist = old_iallocbt_blocks;
> +	ri.nr_records = 0;
> +	ri.sc = sc;
> +
> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
> +	error = xfs_rmap_query_all(cur, xfs_repair_ialloc_extent_fn, &ri);
> +	if (error)
> +		goto err;
> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> +
> +	/* Do we actually have enough space to do this? */
> +	nr_blocks = xfs_iallocbt_calc_size(mp, ri.nr_records);
> +	if (xfs_sb_version_hasfinobt(&mp->m_sb))
> +		nr_blocks *= 2;
> +	if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE))
> +		return -ENOSPC;
> +
> +	return 0;
> +
> +err:
> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> +	return error;
> +}
> +
> +/* Update the AGI counters. */
> +STATIC int
> +xfs_repair_iallocbt_reset_counters(
> +	struct xfs_scrub_context	*sc,
> +	struct list_head		*inode_records,
> +	int				*log_flags)
> +{
> +	struct xfs_agi			*agi;
> +	struct xfs_repair_ialloc_extent	*rie;
> +	unsigned int			count = 0;
> +	unsigned int			usedcount = 0;
> +	unsigned int			freecount;
> +
> +	/* Figure out the new counters. */
> +	list_for_each_entry(rie, inode_records, list) {
> +		count += rie->count;
> +		usedcount += rie->usedcount;
> +	}
> +
> +	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
> +	freecount = count - usedcount;
> +
> +	/* XXX: trigger inode count recalculation */
> +
> +	/* Reset the per-AG info, both incore and ondisk. */
> +	sc->sa.pag->pagi_count = count;
> +	sc->sa.pag->pagi_freecount = freecount;
> +	agi->agi_count = cpu_to_be32(count);
> +	agi->agi_freecount = cpu_to_be32(freecount);
> +	*log_flags |= XFS_AGI_COUNT | XFS_AGI_FREECOUNT;
> +
> +	return 0;
> +}
> +
> +/* Initialize new inobt/finobt roots and implant them into the AGI. */
> +STATIC int
> +xfs_repair_iallocbt_reset_btrees(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_owner_info		*oinfo,
> +	int				*log_flags)
> +{
> +	struct xfs_agi			*agi;
> +	struct xfs_buf			*bp;
> +	struct xfs_mount		*mp = sc->mp;
> +	xfs_fsblock_t			inofsb;
> +	xfs_fsblock_t			finofsb;
> +	enum xfs_ag_resv_type		resv;
> +	int				error;
> +
> +	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
> +
> +	/* Initialize new inobt root. */
> +	resv = XFS_AG_RESV_NONE;
> +	error = xfs_repair_alloc_ag_block(sc, oinfo, &inofsb, resv);
> +	if (error)
> +		return error;
> +	error = xfs_repair_init_btblock(sc, inofsb, &bp, XFS_BTNUM_INO,
> +			&xfs_inobt_buf_ops);
> +	if (error)
> +		return error;
> +	agi->agi_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, inofsb));
> +	agi->agi_level = cpu_to_be32(1);
> +	*log_flags |= XFS_AGI_ROOT | XFS_AGI_LEVEL;
> +
> +	/* Initialize new finobt root. */
> +	if (!xfs_sb_version_hasfinobt(&mp->m_sb))
> +		return 0;
> +
> +	resv = mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA;
> +	error = xfs_repair_alloc_ag_block(sc, oinfo, &finofsb, resv);
> +	if (error)
> +		return error;
> +	error = xfs_repair_init_btblock(sc, finofsb, &bp, XFS_BTNUM_FINO,
> +			&xfs_inobt_buf_ops);
> +	if (error)
> +		return error;
> +	agi->agi_free_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, finofsb));
> +	agi->agi_free_level = cpu_to_be32(1);
> +	*log_flags |= XFS_AGI_FREE_ROOT | XFS_AGI_FREE_LEVEL;
> +
> +	return 0;
> +}
> +
> +/* Build new inode btrees and dispose of the old one. */
> +STATIC int
> +xfs_repair_iallocbt_rebuild_trees(
> +	struct xfs_scrub_context	*sc,
> +	struct list_head		*inode_records,
> +	struct xfs_owner_info		*oinfo,
> +	struct xfs_repair_extent_list	*old_iallocbt_blocks)
> +{
> +	struct xfs_repair_ialloc_extent	*rie;
> +	struct xfs_repair_ialloc_extent	*n;
> +	int				error;
> +
> +	/* Add all records. */
> +	list_sort(NULL, inode_records, xfs_repair_ialloc_extent_cmp);
> +	list_for_each_entry_safe(rie, n, inode_records, list) {
> +		error = xfs_repair_iallocbt_insert_rec(sc, rie);
> +		if (error)
> +			return error;
> +
> +		list_del(&rie->list);
> +		kmem_free(rie);
> +	}
> +
> +	/* Free the old inode btree blocks if they're not in use. */
> +	return xfs_repair_reap_btree_extents(sc, old_iallocbt_blocks, oinfo,
> +			XFS_AG_RESV_NONE);
> +}
> +
> +/* Repair both inode btrees. */
> +int
> +xfs_repair_iallocbt(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_owner_info		oinfo;
> +	struct list_head		inode_records;
> +	struct xfs_repair_extent_list	old_iallocbt_blocks;
> +	struct xfs_mount		*mp = sc->mp;
> +	int				log_flags = 0;
> +	int				error = 0;
> +
> +	/* We require the rmapbt to rebuild anything. */
> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> +		return -EOPNOTSUPP;
> +
> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> +
> +	/* Collect the free space data and find the old btree blocks. */
> +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
> +	INIT_LIST_HEAD(&inode_records);
> +	xfs_repair_init_extent_list(&old_iallocbt_blocks);
> +	error = xfs_repair_iallocbt_find_inodes(sc, &inode_records,
> +			&old_iallocbt_blocks);
> +	if (error)
> +		goto out;
> +
> +	/*
> +	 * Blow out the old inode btrees.  This is the point at which
> +	 * we are no longer able to bail out gracefully.
> +	 */
> +	error = xfs_repair_iallocbt_reset_counters(sc, &inode_records,
> +			&log_flags);
> +	if (error)
> +		goto out;
> +	error = xfs_repair_iallocbt_reset_btrees(sc, &oinfo, &log_flags);
> +	if (error)
> +		goto out;
> +	xfs_ialloc_log_agi(sc->tp, sc->sa.agi_bp, log_flags);
> +
> +	/* Invalidate all the inobt/finobt blocks in btlist. */
> +	error = xfs_repair_invalidate_blocks(sc, &old_iallocbt_blocks);
> +	if (error)
> +		goto out;
> +	error = xfs_repair_roll_ag_trans(sc);
> +	if (error)
> +		goto out;
> +
> +	/* Now rebuild the inode information. */
> +	error = xfs_repair_iallocbt_rebuild_trees(sc, &inode_records, &oinfo,
> +			&old_iallocbt_blocks);
> +out:
> +	xfs_repair_cancel_btree_extents(sc, &old_iallocbt_blocks);
> +	xfs_repair_iallocbt_cancel_inorecs(&inode_records);
> +	return error;
> +}
> diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
> index e5f67fc68e9a..dcfa5eb18940 100644
> --- a/fs/xfs/scrub/repair.h
> +++ b/fs/xfs/scrub/repair.h
> @@ -104,6 +104,7 @@ int xfs_repair_agf(struct xfs_scrub_context *sc);
>   int xfs_repair_agfl(struct xfs_scrub_context *sc);
>   int xfs_repair_agi(struct xfs_scrub_context *sc);
>   int xfs_repair_allocbt(struct xfs_scrub_context *sc);
> +int xfs_repair_iallocbt(struct xfs_scrub_context *sc);
>   
>   #else
>   
> @@ -131,6 +132,7 @@ xfs_repair_calc_ag_resblks(
>   #define xfs_repair_agfl			xfs_repair_notsupported
>   #define xfs_repair_agi			xfs_repair_notsupported
>   #define xfs_repair_allocbt		xfs_repair_notsupported
> +#define xfs_repair_iallocbt		xfs_repair_notsupported
>   
>   #endif /* CONFIG_XFS_ONLINE_REPAIR */
>   
> diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> index 7a55b20b7e4e..fec0e130f19e 100644
> --- a/fs/xfs/scrub/scrub.c
> +++ b/fs/xfs/scrub/scrub.c
> @@ -238,14 +238,14 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
>   		.type	= ST_PERAG,
>   		.setup	= xfs_scrub_setup_ag_iallocbt,
>   		.scrub	= xfs_scrub_inobt,
> -		.repair	= xfs_repair_notsupported,
> +		.repair	= xfs_repair_iallocbt,
>   	},
>   	[XFS_SCRUB_TYPE_FINOBT] = {	/* finobt */
>   		.type	= ST_PERAG,
>   		.setup	= xfs_scrub_setup_ag_iallocbt,
>   		.scrub	= xfs_scrub_finobt,
>   		.has	= xfs_sb_version_hasfinobt,
> -		.repair	= xfs_repair_notsupported,
> +		.repair	= xfs_repair_iallocbt,
>   	},
>   	[XFS_SCRUB_TYPE_RMAPBT] = {	/* rmapbt */
>   		.type	= ST_PERAG,
> 

Ok, some parts took some time to figure out, but I think I understand
the overall idea.  The comments help, and if you could add in a little
extra detail describing the function parameters, I think it would help
to add more supporting context to your comments.  Thx!

Allison

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


* Re: [PATCH 07/21] xfs: repair inode btrees
  2018-06-30 17:36   ` Allison Henderson
@ 2018-06-30 18:30     ` Darrick J. Wong
  2018-07-01  0:45       ` Allison Henderson
  0 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-06-30 18:30 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Sat, Jun 30, 2018 at 10:36:23AM -0700, Allison Henderson wrote:
> On 06/24/2018 12:24 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Use the rmapbt to find inode chunks, query the chunks to compute
> > hole and free masks, and with that information rebuild the inobt
> > and finobt.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >   fs/xfs/Makefile              |    1
> >   fs/xfs/scrub/ialloc_repair.c |  585 ++++++++++++++++++++++++++++++++++++++++++
> >   fs/xfs/scrub/repair.h        |    2
> >   fs/xfs/scrub/scrub.c         |    4
> >   4 files changed, 590 insertions(+), 2 deletions(-)
> >   create mode 100644 fs/xfs/scrub/ialloc_repair.c
> > 
> > 
> > diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
> > index 841e0824eeb6..837fd4a95f6f 100644
> > --- a/fs/xfs/Makefile
> > +++ b/fs/xfs/Makefile
> > @@ -165,6 +165,7 @@ ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
> >   xfs-y				+= $(addprefix scrub/, \
> >   				   agheader_repair.o \
> >   				   alloc_repair.o \
> > +				   ialloc_repair.o \
> >   				   repair.o \
> >   				   )
> >   endif
> > diff --git a/fs/xfs/scrub/ialloc_repair.c b/fs/xfs/scrub/ialloc_repair.c
> > new file mode 100644
> > index 000000000000..29c736466bba
> > --- /dev/null
> > +++ b/fs/xfs/scrub/ialloc_repair.c
> > @@ -0,0 +1,585 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +/*
> > + * Copyright (C) 2018 Oracle.  All Rights Reserved.
> > + * Author: Darrick J. Wong <darrick.wong@oracle.com>
> > + */
> > +#include "xfs.h"
> > +#include "xfs_fs.h"
> > +#include "xfs_shared.h"
> > +#include "xfs_format.h"
> > +#include "xfs_trans_resv.h"
> > +#include "xfs_mount.h"
> > +#include "xfs_defer.h"
> > +#include "xfs_btree.h"
> > +#include "xfs_bit.h"
> > +#include "xfs_log_format.h"
> > +#include "xfs_trans.h"
> > +#include "xfs_sb.h"
> > +#include "xfs_inode.h"
> > +#include "xfs_alloc.h"
> > +#include "xfs_ialloc.h"
> > +#include "xfs_ialloc_btree.h"
> > +#include "xfs_icache.h"
> > +#include "xfs_rmap.h"
> > +#include "xfs_rmap_btree.h"
> > +#include "xfs_log.h"
> > +#include "xfs_trans_priv.h"
> > +#include "xfs_error.h"
> > +#include "scrub/xfs_scrub.h"
> > +#include "scrub/scrub.h"
> > +#include "scrub/common.h"
> > +#include "scrub/btree.h"
> > +#include "scrub/trace.h"
> > +#include "scrub/repair.h"
> > +
> > +/*
> > + * Inode Btree Repair
> > + * ==================
> > + *
> > + * Iterate the reverse mapping records looking for OWN_INODES and OWN_INOBT
> > + * records.  The OWN_INOBT records are the old inode btree blocks and will be
> > + * cleared out after we've rebuilt the tree.  Each possible inode chunk within
> > + * an OWN_INODES record will be read in and the freemask calculated from the
> > + * i_mode data in the inode chunk.  For sparse inodes the holemask will be
> > + * calculated by creating the properly aligned inobt record and punching out
> > + * any chunk that's missing.  Inode allocations and frees grab the AGI first,
> > + * so repair protects itself from concurrent access by locking the AGI.
> > + *
> > + * Once we've reconstructed all the inode records, we can create new inode
> > + * btree roots and reload the btrees.  We rebuild both inode trees at the same
> > + * time because they have the same rmap owner and it would be more complex to
> > + * figure out if the other tree isn't in need of a rebuild and which OWN_INOBT
> > + * blocks it owns.  We have all the data we need to build both, so dump
> > + * everything and start over.
> > + */
> > +
> > +struct xfs_repair_ialloc_extent {
> > +	struct list_head		list;
> > +	xfs_inofree_t			freemask;
> > +	xfs_agino_t			startino;
> > +	unsigned int			count;
> > +	unsigned int			usedcount;
> > +	uint16_t			holemask;
> > +};
> > +
> > +struct xfs_repair_ialloc {
> > +	struct list_head		*extlist;
> > +	struct xfs_repair_extent_list	*btlist;
> > +	struct xfs_scrub_context	*sc;
> > +	uint64_t			nr_records;
> > +};
> > +
> > +/*
> > + * Is this inode in use?  If the inode is in memory we can tell from i_mode,
> > + * otherwise we have to check di_mode in the on-disk buffer.  We only care
> > + * that the high (i.e. non-permission) bits of _mode are zero.  This should be
> > + * safe because repair keeps all AG headers locked until the end, and any
> > + * process trying to perform an inode allocation/free must lock the AGI.
> > + */
> > +STATIC int
> > +xfs_repair_ialloc_check_free(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_buf			*bp,
> > +	xfs_ino_t			fsino,
> > +	xfs_agino_t			bpino,
> > +	bool				*inuse)
> > +{
> > +	struct xfs_mount		*mp = sc->mp;
> > +	struct xfs_dinode		*dip;
> > +	int				error;
> > +
> > +	/* Will the in-core inode tell us if it's in use? */
> > +	error = xfs_icache_inode_is_allocated(mp, sc->tp, fsino, inuse);
> > +	if (!error)
> > +		return 0;
> > +
> > +	/* Inode uncached or half assembled, read disk buffer */
> > +	dip = xfs_buf_offset(bp, bpino * mp->m_sb.sb_inodesize);
> > +	if (be16_to_cpu(dip->di_magic) != XFS_DINODE_MAGIC)
> > +		return -EFSCORRUPTED;
> > +
> > +	if (dip->di_version >= 3 && be64_to_cpu(dip->di_ino) != fsino)
> > +		return -EFSCORRUPTED;
> > +
> > +	*inuse = dip->di_mode != 0;
> > +	return 0;
> > +}
> > +
> > +/*
> > + * For each cluster in this blob of inode, we must calculate the
> Ok, so I've been over this one a few times, and I still don't feel
> like I've figured out what a blob of an inode is. So I'm gonna have
> to break and ask for clarification on that one?  Thx! :-)

Heh, sorry.

"For each inode cluster covering the physical extent recorded by the
rmapbt, we must calculate..."

> > + * properly aligned startino of that cluster, then iterate each
> > + * cluster to fill in used and filled masks appropriately.  We
> > + * then use the (startino, used, filled) information to construct
> > + * the appropriate inode records.
> > + */
> > +STATIC int
> > +xfs_repair_ialloc_process_cluster(
> > +	struct xfs_repair_ialloc	*ri,
> > +	xfs_agblock_t			agbno,
> > +	int				blks_per_cluster,
> > +	xfs_agino_t			rec_agino)
> > +{
> > +	struct xfs_imap			imap;
> > +	struct xfs_repair_ialloc_extent	*rie;
> > +	struct xfs_dinode		*dip;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_scrub_context	*sc = ri->sc;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	xfs_ino_t			fsino;
> > +	xfs_inofree_t			usedmask;
> > +	xfs_agino_t			nr_inodes;
> > +	xfs_agino_t			startino;
> > +	xfs_agino_t			clusterino;
> > +	xfs_agino_t			clusteroff;
> > +	xfs_agino_t			agino;
> > +	uint16_t			fillmask;
> > +	bool				inuse;
> > +	int				usedcount;
> > +	int				error;
> > +
> > +	/* The per-AG inum of this inode cluster. */
> > +	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
> > +
> > +	/* The per-AG inum of the inobt record. */
> > +	startino = rec_agino + rounddown(agino - rec_agino,
> > +			XFS_INODES_PER_CHUNK);
> > +
> > +	/* The per-AG inum of the cluster within the inobt record. */
> > +	clusteroff = agino - startino;
> > +
> > +	/* Every inode in this holemask slot is filled. */
> > +	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
> > +	fillmask = xfs_inobt_maskn(clusteroff / XFS_INODES_PER_HOLEMASK_BIT,
> > +			nr_inodes / XFS_INODES_PER_HOLEMASK_BIT);
> > +
> > +	/* Grab the inode cluster buffer. */
> > +	imap.im_blkno = XFS_AGB_TO_DADDR(mp, sc->sa.agno, agbno);
> > +	imap.im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
> > +	imap.im_boffset = 0;
> > +
> > +	error = xfs_imap_to_bp(mp, sc->tp, &imap, &dip, &bp, 0,
> > +			XFS_IGET_UNTRUSTED);
> > +	if (error)
> > +		return error;
> > +
> > +	usedmask = 0;
> > +	usedcount = 0;
> > +	/* Which inodes within this cluster are free? */
> > +	for (clusterino = 0; clusterino < nr_inodes; clusterino++) {
> > +		fsino = XFS_AGINO_TO_INO(mp, sc->sa.agno, agino + clusterino);
> > +		error = xfs_repair_ialloc_check_free(sc, bp, fsino,
> > +				clusterino, &inuse);
> > +		if (error) {
> > +			xfs_trans_brelse(sc->tp, bp);
> > +			return error;
> > +		}
> > +		if (inuse) {
> > +			usedcount++;
> > +			usedmask |= XFS_INOBT_MASK(clusteroff + clusterino);
> > +		}
> > +	}
> > +	xfs_trans_brelse(sc->tp, bp);
> > +
> > +	/*
> > +	 * If the last item in the list is our chunk record,
> > +	 * update that.
> > +	 */
> > +	if (!list_empty(ri->extlist)) {
> > +		rie = list_last_entry(ri->extlist,
> > +				struct xfs_repair_ialloc_extent, list);
> > +		if (rie->startino + XFS_INODES_PER_CHUNK > startino) {
> > +			rie->freemask &= ~usedmask;
> > +			rie->holemask &= ~fillmask;
> > +			rie->count += nr_inodes;
> > +			rie->usedcount += usedcount;
> > +			return 0;
> > +		}
> > +	}
> > +
> > +	/* New inode chunk; add to the list. */
> > +	rie = kmem_alloc(sizeof(struct xfs_repair_ialloc_extent), KM_MAYFAIL);
> > +	if (!rie)
> > +		return -ENOMEM;
> > +
> > +	INIT_LIST_HEAD(&rie->list);
> > +	rie->startino = startino;
> > +	rie->freemask = XFS_INOBT_ALL_FREE & ~usedmask;
> > +	rie->holemask = XFS_INOBT_ALL_FREE & ~fillmask;
> > +	rie->count = nr_inodes;
> > +	rie->usedcount = usedcount;
> > +	list_add_tail(&rie->list, ri->extlist);
> > +	ri->nr_records++;
> > +
> > +	return 0;
> > +}
> > +
> > +/* Record extents that belong to inode btrees. */
> > +STATIC int
> > +xfs_repair_ialloc_extent_fn(
> > +	struct xfs_btree_cur		*cur,
> > +	struct xfs_rmap_irec		*rec,
> > +	void				*priv)
> > +{
> > +	struct xfs_repair_ialloc	*ri = priv;
> > +	struct xfs_mount		*mp = cur->bc_mp;
> > +	xfs_fsblock_t			fsbno;
> > +	xfs_agblock_t			agbno = rec->rm_startblock;
> > +	xfs_agino_t			inoalign;
> > +	xfs_agino_t			agino;
> > +	xfs_agino_t			rec_agino;
> > +	int				blks_per_cluster;
> > +	int				error = 0;
> > +
> > +	if (xfs_scrub_should_terminate(ri->sc, &error))
> > +		return error;
> > +
> > +	/* Fragment of the old btrees; dispose of them later. */
> > +	if (rec->rm_owner == XFS_RMAP_OWN_INOBT) {
> > +		fsbno = XFS_AGB_TO_FSB(mp, ri->sc->sa.agno, agbno);
> > +		return xfs_repair_collect_btree_extent(ri->sc, ri->btlist,
> > +				fsbno, rec->rm_blockcount);
> > +	}
> > +
> > +	/* Skip extents which are not owned by this inode and fork. */
> > +	if (rec->rm_owner != XFS_RMAP_OWN_INODES)
> > +		return 0;
> > +
> > +	blks_per_cluster = xfs_icluster_size_fsb(mp);
> > +
> > +	if (agbno % blks_per_cluster != 0)
> > +		return -EFSCORRUPTED;
> > +
> > +	trace_xfs_repair_ialloc_extent_fn(mp, ri->sc->sa.agno,
> > +			rec->rm_startblock, rec->rm_blockcount, rec->rm_owner,
> > +			rec->rm_offset, rec->rm_flags);
> > +
> > +	/*
> > +	 * Determine the inode block alignment, and where the block
> > +	 * ought to start if it's aligned properly.  On a sparse inode
> > +	 * system the rmap doesn't have to start on an alignment boundary,
> > +	 * but the record does.  On pre-sparse filesystems, we /must/
> > +	 * start both rmap and inobt on an alignment boundary.
> > +	 */
> > +	inoalign = xfs_ialloc_cluster_alignment(mp);
> > +	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
> > +	rec_agino = XFS_OFFBNO_TO_AGINO(mp, rounddown(agbno, inoalign), 0);
> > +	if (!xfs_sb_version_hassparseinodes(&mp->m_sb) && agino != rec_agino)
> > +		return -EFSCORRUPTED;
> > +
> > +	/* Set up the free/hole masks for each cluster in this inode chunk. */
> By chunk did you mean record?  Please try to keep terminology
> consistent as best you can.  Thx! :-)

Yikes, that /is/ a misleading comment.

"Set up the free/hole masks for each inode cluster that could be mapped
by this rmap record."

> > +	for (;
> > +	     agbno < rec->rm_startblock + rec->rm_blockcount;
> > +	     agbno += blks_per_cluster) {
> > +		error = xfs_repair_ialloc_process_cluster(ri, agbno,
> > +				blks_per_cluster, rec_agino);
> > +		if (error)
> > +			return error;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +/* Compare two ialloc extents. */
> > +static int
> > +xfs_repair_ialloc_extent_cmp(
> > +	void				*priv,
> > +	struct list_head		*a,
> > +	struct list_head		*b)
> > +{
> > +	struct xfs_repair_ialloc_extent	*ap;
> > +	struct xfs_repair_ialloc_extent	*bp;
> > +
> > +	ap = container_of(a, struct xfs_repair_ialloc_extent, list);
> > +	bp = container_of(b, struct xfs_repair_ialloc_extent, list);
> > +
> > +	if (ap->startino > bp->startino)
> > +		return 1;
> > +	else if (ap->startino < bp->startino)
> > +		return -1;
> > +	return 0;
> > +}
> > +
> > +/* Insert an inode chunk record into a given btree. */
> > +static int
> > +xfs_repair_iallocbt_insert_btrec(
> > +	struct xfs_btree_cur		*cur,
> > +	struct xfs_repair_ialloc_extent	*rie)
> > +{
> > +	int				stat;
> > +	int				error;
> > +
> > +	error = xfs_inobt_lookup(cur, rie->startino, XFS_LOOKUP_EQ, &stat);
> > +	if (error)
> > +		return error;
> > +	XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 0);
> > +	error = xfs_inobt_insert_rec(cur, rie->holemask, rie->count,
> > +			rie->count - rie->usedcount, rie->freemask, &stat);
> > +	if (error)
> > +		return error;
> > +	XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 1);
> > +	return error;
> > +}
> > +
> > +/* Insert an inode chunk record into both inode btrees. */
> > +static int
> > +xfs_repair_iallocbt_insert_rec(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_repair_ialloc_extent	*rie)
> > +{
> > +	struct xfs_btree_cur		*cur;
> > +	int				error;
> > +
> > +	trace_xfs_repair_ialloc_insert(sc->mp, sc->sa.agno, rie->startino,
> > +			rie->holemask, rie->count, rie->count - rie->usedcount,
> > +			rie->freemask);
> > +
> > +	/* Insert into the inobt. */
> > +	cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp, sc->sa.agno,
> > +			XFS_BTNUM_INO);
> > +	error = xfs_repair_iallocbt_insert_btrec(cur, rie);
> > +	if (error)
> > +		goto out_cur;
> > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > +
> > +	/* Insert into the finobt if chunk has free inodes. */
> > +	if (xfs_sb_version_hasfinobt(&sc->mp->m_sb) &&
> > +	    rie->count != rie->usedcount) {
> > +		cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp,
> > +				sc->sa.agno, XFS_BTNUM_FINO);
> > +		error = xfs_repair_iallocbt_insert_btrec(cur, rie);
> > +		if (error)
> > +			goto out_cur;
> > +		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > +	}
> > +
> > +	return xfs_repair_roll_ag_trans(sc);
> > +out_cur:
> > +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> > +	return error;
> > +}
> > +
> > +/* Free every record in the inode list. */
> > +STATIC void
> > +xfs_repair_iallocbt_cancel_inorecs(
> > +	struct list_head		*reclist)
> > +{
> > +	struct xfs_repair_ialloc_extent	*rie;
> > +	struct xfs_repair_ialloc_extent	*n;
> > +
> > +	list_for_each_entry_safe(rie, n, reclist, list) {
> > +		list_del(&rie->list);
> > +		kmem_free(rie);
> > +	}
> > +}
> > +
> > +/*
> > + * Iterate all reverse mappings to find the inodes (OWN_INODES) and the inode
> > + * btrees (OWN_INOBT).  Figure out if we have enough free space to reconstruct
> > + * the inode btrees.  The caller must clean up the lists if anything goes
> > + * wrong.
> > + */
> > +STATIC int
> > +xfs_repair_iallocbt_find_inodes(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*inode_records,
> > +	struct xfs_repair_extent_list	*old_iallocbt_blocks)
> > +{
> > +	struct xfs_repair_ialloc	ri;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	struct xfs_btree_cur		*cur;
> > +	xfs_agblock_t			nr_blocks;
> > +	int				error;
> > +
> > +	/* Collect all reverse mappings for inode blocks. */
> > +	ri.extlist = inode_records;
> > +	ri.btlist = old_iallocbt_blocks;
> > +	ri.nr_records = 0;
> > +	ri.sc = sc;
> > +
> > +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
> > +	error = xfs_rmap_query_all(cur, xfs_repair_ialloc_extent_fn, &ri);
> > +	if (error)
> > +		goto err;
> > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > +
> > +	/* Do we actually have enough space to do this? */
> > +	nr_blocks = xfs_iallocbt_calc_size(mp, ri.nr_records);
> > +	if (xfs_sb_version_hasfinobt(&mp->m_sb))
> > +		nr_blocks *= 2;
> > +	if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE))
> > +		return -ENOSPC;
> > +
> > +	return 0;
> > +
> > +err:
> > +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> > +	return error;
> > +}
> > +
> > +/* Update the AGI counters. */
> > +STATIC int
> > +xfs_repair_iallocbt_reset_counters(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*inode_records,
> > +	int				*log_flags)
> > +{
> > +	struct xfs_agi			*agi;
> > +	struct xfs_repair_ialloc_extent	*rie;
> > +	unsigned int			count = 0;
> > +	unsigned int			usedcount = 0;
> > +	unsigned int			freecount;
> > +
> > +	/* Figure out the new counters. */
> > +	list_for_each_entry(rie, inode_records, list) {
> > +		count += rie->count;
> > +		usedcount += rie->usedcount;
> > +	}
> > +
> > +	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
> > +	freecount = count - usedcount;
> > +
> > +	/* XXX: trigger inode count recalculation */
> > +
> > +	/* Reset the per-AG info, both incore and ondisk. */
> > +	sc->sa.pag->pagi_count = count;
> > +	sc->sa.pag->pagi_freecount = freecount;
> > +	agi->agi_count = cpu_to_be32(count);
> > +	agi->agi_freecount = cpu_to_be32(freecount);
> > +	*log_flags |= XFS_AGI_COUNT | XFS_AGI_FREECOUNT;
> > +
> > +	return 0;
> > +}
> > +
> > +/* Initialize new inobt/finobt roots and implant them into the AGI. */
> > +STATIC int
> > +xfs_repair_iallocbt_reset_btrees(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_owner_info		*oinfo,
> > +	int				*log_flags)
> > +{
> > +	struct xfs_agi			*agi;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	xfs_fsblock_t			inofsb;
> > +	xfs_fsblock_t			finofsb;
> > +	enum xfs_ag_resv_type		resv;
> > +	int				error;
> > +
> > +	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
> > +
> > +	/* Initialize new inobt root. */
> > +	resv = XFS_AG_RESV_NONE;
> > +	error = xfs_repair_alloc_ag_block(sc, oinfo, &inofsb, resv);
> > +	if (error)
> > +		return error;
> > +	error = xfs_repair_init_btblock(sc, inofsb, &bp, XFS_BTNUM_INO,
> > +			&xfs_inobt_buf_ops);
> > +	if (error)
> > +		return error;
> > +	agi->agi_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, inofsb));
> > +	agi->agi_level = cpu_to_be32(1);
> > +	*log_flags |= XFS_AGI_ROOT | XFS_AGI_LEVEL;
> > +
> > +	/* Initialize new finobt root. */
> > +	if (!xfs_sb_version_hasfinobt(&mp->m_sb))
> > +		return 0;
> > +
> > +	resv = mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA;
> > +	error = xfs_repair_alloc_ag_block(sc, oinfo, &finofsb, resv);
> > +	if (error)
> > +		return error;
> > +	error = xfs_repair_init_btblock(sc, finofsb, &bp, XFS_BTNUM_FINO,
> > +			&xfs_inobt_buf_ops);
> > +	if (error)
> > +		return error;
> > +	agi->agi_free_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, finofsb));
> > +	agi->agi_free_level = cpu_to_be32(1);
> > +	*log_flags |= XFS_AGI_FREE_ROOT | XFS_AGI_FREE_LEVEL;
> > +
> > +	return 0;
> > +}
> > +
> > +/* Build new inode btrees and dispose of the old one. */
> > +STATIC int
> > +xfs_repair_iallocbt_rebuild_trees(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*inode_records,
> > +	struct xfs_owner_info		*oinfo,
> > +	struct xfs_repair_extent_list	*old_iallocbt_blocks)
> > +{
> > +	struct xfs_repair_ialloc_extent	*rie;
> > +	struct xfs_repair_ialloc_extent	*n;
> > +	int				error;
> > +
> > +	/* Add all records. */
> > +	list_sort(NULL, inode_records, xfs_repair_ialloc_extent_cmp);
> > +	list_for_each_entry_safe(rie, n, inode_records, list) {
> > +		error = xfs_repair_iallocbt_insert_rec(sc, rie);
> > +		if (error)
> > +			return error;
> > +
> > +		list_del(&rie->list);
> > +		kmem_free(rie);
> > +	}
> > +
> > +	/* Free the old inode btree blocks if they're not in use. */
> > +	return xfs_repair_reap_btree_extents(sc, old_iallocbt_blocks, oinfo,
> > +			XFS_AG_RESV_NONE);
> > +}
> > +
> > +/* Repair both inode btrees. */
> > +int
> > +xfs_repair_iallocbt(
> > +	struct xfs_scrub_context	*sc)
> > +{
> > +	struct xfs_owner_info		oinfo;
> > +	struct list_head		inode_records;
> > +	struct xfs_repair_extent_list	old_iallocbt_blocks;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	int				log_flags = 0;
> > +	int				error = 0;
> > +
> > +	/* We require the rmapbt to rebuild anything. */
> > +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> > +		return -EOPNOTSUPP;
> > +
> > +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> > +
> > +	/* Collect the free space data and find the old btree blocks. */
> > +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
> > +	INIT_LIST_HEAD(&inode_records);
> > +	xfs_repair_init_extent_list(&old_iallocbt_blocks);
> > +	error = xfs_repair_iallocbt_find_inodes(sc, &inode_records,
> > +			&old_iallocbt_blocks);
> > +	if (error)
> > +		goto out;
> > +
> > +	/*
> > +	 * Blow out the old inode btrees.  This is the point at which
> > +	 * we are no longer able to bail out gracefully.
> > +	 */
> > +	error = xfs_repair_iallocbt_reset_counters(sc, &inode_records,
> > +			&log_flags);
> > +	if (error)
> > +		goto out;
> > +	error = xfs_repair_iallocbt_reset_btrees(sc, &oinfo, &log_flags);
> > +	if (error)
> > +		goto out;
> > +	xfs_ialloc_log_agi(sc->tp, sc->sa.agi_bp, log_flags);
> > +
> > +	/* Invalidate all the inobt/finobt blocks in btlist. */
> > +	error = xfs_repair_invalidate_blocks(sc, &old_iallocbt_blocks);
> > +	if (error)
> > +		goto out;
> > +	error = xfs_repair_roll_ag_trans(sc);
> > +	if (error)
> > +		goto out;
> > +
> > +	/* Now rebuild the inode information. */
> > +	error = xfs_repair_iallocbt_rebuild_trees(sc, &inode_records, &oinfo,
> > +			&old_iallocbt_blocks);
> > +out:
> > +	xfs_repair_cancel_btree_extents(sc, &old_iallocbt_blocks);
> > +	xfs_repair_iallocbt_cancel_inorecs(&inode_records);
> > +	return error;
> > +}
> > diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
> > index e5f67fc68e9a..dcfa5eb18940 100644
> > --- a/fs/xfs/scrub/repair.h
> > +++ b/fs/xfs/scrub/repair.h
> > @@ -104,6 +104,7 @@ int xfs_repair_agf(struct xfs_scrub_context *sc);
> >   int xfs_repair_agfl(struct xfs_scrub_context *sc);
> >   int xfs_repair_agi(struct xfs_scrub_context *sc);
> >   int xfs_repair_allocbt(struct xfs_scrub_context *sc);
> > +int xfs_repair_iallocbt(struct xfs_scrub_context *sc);
> >   #else
> > @@ -131,6 +132,7 @@ xfs_repair_calc_ag_resblks(
> >   #define xfs_repair_agfl			xfs_repair_notsupported
> >   #define xfs_repair_agi			xfs_repair_notsupported
> >   #define xfs_repair_allocbt		xfs_repair_notsupported
> > +#define xfs_repair_iallocbt		xfs_repair_notsupported
> >   #endif /* CONFIG_XFS_ONLINE_REPAIR */
> > diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> > index 7a55b20b7e4e..fec0e130f19e 100644
> > --- a/fs/xfs/scrub/scrub.c
> > +++ b/fs/xfs/scrub/scrub.c
> > @@ -238,14 +238,14 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
> >   		.type	= ST_PERAG,
> >   		.setup	= xfs_scrub_setup_ag_iallocbt,
> >   		.scrub	= xfs_scrub_inobt,
> > -		.repair	= xfs_repair_notsupported,
> > +		.repair	= xfs_repair_iallocbt,
> >   	},
> >   	[XFS_SCRUB_TYPE_FINOBT] = {	/* finobt */
> >   		.type	= ST_PERAG,
> >   		.setup	= xfs_scrub_setup_ag_iallocbt,
> >   		.scrub	= xfs_scrub_finobt,
> >   		.has	= xfs_sb_version_hasfinobt,
> > -		.repair	= xfs_repair_notsupported,
> > +		.repair	= xfs_repair_iallocbt,
> >   	},
> >   	[XFS_SCRUB_TYPE_RMAPBT] = {	/* rmapbt */
> >   		.type	= ST_PERAG,
> > 
> 
> Ok, some parts took some time to figure out, but I think I understand
> the overall idea.  The comments help, and if you could add in a little
> extra detail describing the function parameters, I think it would help
> to add more supporting context to your comments.  Thx!

Every time I go wandering through the ialloc code my head also gets
twisted in knots over inode chunks and inode clusters.  I think for the
next round I'll try to make some ascii art diagrams that I can refer
back to the next time I have to go digging through here (which will
probably be not long from now; rumor has it the ialloc scrub doesn't
quite work right on systems with 64K page size).

--D

> Allison
> 


* Re: [PATCH 07/21] xfs: repair inode btrees
  2018-06-30 18:30     ` Darrick J. Wong
@ 2018-07-01  0:45       ` Allison Henderson
  0 siblings, 0 replies; 77+ messages in thread
From: Allison Henderson @ 2018-07-01  0:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 06/30/2018 11:30 AM, Darrick J. Wong wrote:
> On Sat, Jun 30, 2018 at 10:36:23AM -0700, Allison Henderson wrote:
>> On 06/24/2018 12:24 PM, Darrick J. Wong wrote:
>>> From: Darrick J. Wong <darrick.wong@oracle.com>
>>>
>>> Use the rmapbt to find inode chunks, query the chunks to compute
>>> hole and free masks, and with that information rebuild the inobt
>>> and finobt.
>>>
>>> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
>>> ---
>>>    fs/xfs/Makefile              |    1
>>>    fs/xfs/scrub/ialloc_repair.c |  585 ++++++++++++++++++++++++++++++++++++++++++
>>>    fs/xfs/scrub/repair.h        |    2
>>>    fs/xfs/scrub/scrub.c         |    4
>>>    4 files changed, 590 insertions(+), 2 deletions(-)
>>>    create mode 100644 fs/xfs/scrub/ialloc_repair.c
>>>
>>>
>>> diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
>>> index 841e0824eeb6..837fd4a95f6f 100644
>>> --- a/fs/xfs/Makefile
>>> +++ b/fs/xfs/Makefile
>>> @@ -165,6 +165,7 @@ ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
>>>    xfs-y				+= $(addprefix scrub/, \
>>>    				   agheader_repair.o \
>>>    				   alloc_repair.o \
>>> +				   ialloc_repair.o \
>>>    				   repair.o \
>>>    				   )
>>>    endif
>>> diff --git a/fs/xfs/scrub/ialloc_repair.c b/fs/xfs/scrub/ialloc_repair.c
>>> new file mode 100644
>>> index 000000000000..29c736466bba
>>> --- /dev/null
>>> +++ b/fs/xfs/scrub/ialloc_repair.c
>>> @@ -0,0 +1,585 @@
>>> +// SPDX-License-Identifier: GPL-2.0+
>>> +/*
>>> + * Copyright (C) 2018 Oracle.  All Rights Reserved.
>>> + * Author: Darrick J. Wong <darrick.wong@oracle.com>
>>> + */
>>> +#include "xfs.h"
>>> +#include "xfs_fs.h"
>>> +#include "xfs_shared.h"
>>> +#include "xfs_format.h"
>>> +#include "xfs_trans_resv.h"
>>> +#include "xfs_mount.h"
>>> +#include "xfs_defer.h"
>>> +#include "xfs_btree.h"
>>> +#include "xfs_bit.h"
>>> +#include "xfs_log_format.h"
>>> +#include "xfs_trans.h"
>>> +#include "xfs_sb.h"
>>> +#include "xfs_inode.h"
>>> +#include "xfs_alloc.h"
>>> +#include "xfs_ialloc.h"
>>> +#include "xfs_ialloc_btree.h"
>>> +#include "xfs_icache.h"
>>> +#include "xfs_rmap.h"
>>> +#include "xfs_rmap_btree.h"
>>> +#include "xfs_log.h"
>>> +#include "xfs_trans_priv.h"
>>> +#include "xfs_error.h"
>>> +#include "scrub/xfs_scrub.h"
>>> +#include "scrub/scrub.h"
>>> +#include "scrub/common.h"
>>> +#include "scrub/btree.h"
>>> +#include "scrub/trace.h"
>>> +#include "scrub/repair.h"
>>> +
>>> +/*
>>> + * Inode Btree Repair
>>> + * ==================
>>> + *
>>> + * Iterate the reverse mapping records looking for OWN_INODES and OWN_INOBT
>>> + * records.  The OWN_INOBT records are the old inode btree blocks and will be
>>> + * cleared out after we've rebuilt the tree.  Each possible inode chunk within
>>> + * an OWN_INODES record will be read in and the freemask calculated from the
>>> + * i_mode data in the inode chunk.  For sparse inodes the holemask will be
>>> + * calculated by creating the properly aligned inobt record and punching out
>>> + * any chunk that's missing.  Inode allocations and frees grab the AGI first,
>>> + * so repair protects itself from concurrent access by locking the AGI.
>>> + *
>>> + * Once we've reconstructed all the inode records, we can create new inode
>>> + * btree roots and reload the btrees.  We rebuild both inode trees at the same
>>> + * time because they have the same rmap owner and it would be more complex to
>>> + * figure out if the other tree isn't in need of a rebuild and which OWN_INOBT
>>> + * blocks it owns.  We have all the data we need to build both, so dump
>>> + * everything and start over.
>>> + */
>>> +
>>> +struct xfs_repair_ialloc_extent {
>>> +	struct list_head		list;
>>> +	xfs_inofree_t			freemask;
>>> +	xfs_agino_t			startino;
>>> +	unsigned int			count;
>>> +	unsigned int			usedcount;
>>> +	uint16_t			holemask;
>>> +};
>>> +
>>> +struct xfs_repair_ialloc {
>>> +	struct list_head		*extlist;
>>> +	struct xfs_repair_extent_list	*btlist;
>>> +	struct xfs_scrub_context	*sc;
>>> +	uint64_t			nr_records;
>>> +};
>>> +
>>> +/*
>>> + * Is this inode in use?  If the inode is in memory we can tell from i_mode,
>>> + * otherwise we have to check di_mode in the on-disk buffer.  We only care
>>> + * that the high (i.e. non-permission) bits of _mode are zero.  This should be
>>> + * safe because repair keeps all AG headers locked until the end, and any
>>> + * process trying to perform an inode allocation/free must lock the AGI.
>>> + */
>>> +STATIC int
>>> +xfs_repair_ialloc_check_free(
>>> +	struct xfs_scrub_context	*sc,
>>> +	struct xfs_buf			*bp,
>>> +	xfs_ino_t			fsino,
>>> +	xfs_agino_t			bpino,
>>> +	bool				*inuse)
>>> +{
>>> +	struct xfs_mount		*mp = sc->mp;
>>> +	struct xfs_dinode		*dip;
>>> +	int				error;
>>> +
>>> +	/* Will the in-core inode tell us if it's in use? */
>>> +	error = xfs_icache_inode_is_allocated(mp, sc->tp, fsino, inuse);
>>> +	if (!error)
>>> +		return 0;
>>> +
>>> +	/* Inode uncached or half assembled, read disk buffer */
>>> +	dip = xfs_buf_offset(bp, bpino * mp->m_sb.sb_inodesize);
>>> +	if (be16_to_cpu(dip->di_magic) != XFS_DINODE_MAGIC)
>>> +		return -EFSCORRUPTED;
>>> +
>>> +	if (dip->di_version >= 3 && be64_to_cpu(dip->di_ino) != fsino)
>>> +		return -EFSCORRUPTED;
>>> +
>>> +	*inuse = dip->di_mode != 0;
>>> +	return 0;
>>> +}
>>> +
>>> +/*
>>> + * For each cluster in this blob of inode, we must calculate the
>> Ok, so I've been over this one a few times, and I still don't feel
>> like I've figured out what a blob of an inode is. So I'm gonna have
>> to break and ask for clarification on that one?  Thx! :-)
> 
> Heh, sorry.
> 
> "For each inode cluster covering the physical extent recorded by the
> rmapbt, we must calculate..."
> 
>>> + * properly aligned startino of that cluster, then iterate each
>>> + * cluster to fill in used and filled masks appropriately.  We
>>> + * then use the (startino, used, filled) information to construct
>>> + * the appropriate inode records.
>>> + */
>>> +STATIC int
>>> +xfs_repair_ialloc_process_cluster(
>>> +	struct xfs_repair_ialloc	*ri,
>>> +	xfs_agblock_t			agbno,
>>> +	int				blks_per_cluster,
>>> +	xfs_agino_t			rec_agino)
>>> +{
>>> +	struct xfs_imap			imap;
>>> +	struct xfs_repair_ialloc_extent	*rie;
>>> +	struct xfs_dinode		*dip;
>>> +	struct xfs_buf			*bp;
>>> +	struct xfs_scrub_context	*sc = ri->sc;
>>> +	struct xfs_mount		*mp = sc->mp;
>>> +	xfs_ino_t			fsino;
>>> +	xfs_inofree_t			usedmask;
>>> +	xfs_agino_t			nr_inodes;
>>> +	xfs_agino_t			startino;
>>> +	xfs_agino_t			clusterino;
>>> +	xfs_agino_t			clusteroff;
>>> +	xfs_agino_t			agino;
>>> +	uint16_t			fillmask;
>>> +	bool				inuse;
>>> +	int				usedcount;
>>> +	int				error;
>>> +
>>> +	/* The per-AG inum of this inode cluster. */
>>> +	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
>>> +
>>> +	/* The per-AG inum of the inobt record. */
>>> +	startino = rec_agino + rounddown(agino - rec_agino,
>>> +			XFS_INODES_PER_CHUNK);
>>> +
>>> +	/* The per-AG inum of the cluster within the inobt record. */
>>> +	clusteroff = agino - startino;
>>> +
>>> +	/* Every inode in this holemask slot is filled. */
>>> +	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
>>> +	fillmask = xfs_inobt_maskn(clusteroff / XFS_INODES_PER_HOLEMASK_BIT,
>>> +			nr_inodes / XFS_INODES_PER_HOLEMASK_BIT);
>>> +
>>> +	/* Grab the inode cluster buffer. */
>>> +	imap.im_blkno = XFS_AGB_TO_DADDR(mp, sc->sa.agno, agbno);
>>> +	imap.im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
>>> +	imap.im_boffset = 0;
>>> +
>>> +	error = xfs_imap_to_bp(mp, sc->tp, &imap, &dip, &bp, 0,
>>> +			XFS_IGET_UNTRUSTED);
>>> +	if (error)
>>> +		return error;
>>> +
>>> +	usedmask = 0;
>>> +	usedcount = 0;
>>> +	/* Which inodes within this cluster are free? */
>>> +	for (clusterino = 0; clusterino < nr_inodes; clusterino++) {
>>> +		fsino = XFS_AGINO_TO_INO(mp, sc->sa.agno, agino + clusterino);
>>> +		error = xfs_repair_ialloc_check_free(sc, bp, fsino,
>>> +				clusterino, &inuse);
>>> +		if (error) {
>>> +			xfs_trans_brelse(sc->tp, bp);
>>> +			return error;
>>> +		}
>>> +		if (inuse) {
>>> +			usedcount++;
>>> +			usedmask |= XFS_INOBT_MASK(clusteroff + clusterino);
>>> +		}
>>> +	}
>>> +	xfs_trans_brelse(sc->tp, bp);
>>> +
>>> +	/*
>>> +	 * If the last item in the list is our chunk record,
>>> +	 * update that.
>>> +	 */
>>> +	if (!list_empty(ri->extlist)) {
>>> +		rie = list_last_entry(ri->extlist,
>>> +				struct xfs_repair_ialloc_extent, list);
>>> +		if (rie->startino + XFS_INODES_PER_CHUNK > startino) {
>>> +			rie->freemask &= ~usedmask;
>>> +			rie->holemask &= ~fillmask;
>>> +			rie->count += nr_inodes;
>>> +			rie->usedcount += usedcount;
>>> +			return 0;
>>> +		}
>>> +	}
>>> +
>>> +	/* New inode chunk; add to the list. */
>>> +	rie = kmem_alloc(sizeof(struct xfs_repair_ialloc_extent), KM_MAYFAIL);
>>> +	if (!rie)
>>> +		return -ENOMEM;
>>> +
>>> +	INIT_LIST_HEAD(&rie->list);
>>> +	rie->startino = startino;
>>> +	rie->freemask = XFS_INOBT_ALL_FREE & ~usedmask;
>>> +	rie->holemask = XFS_INOBT_ALL_FREE & ~fillmask;
>>> +	rie->count = nr_inodes;
>>> +	rie->usedcount = usedcount;
>>> +	list_add_tail(&rie->list, ri->extlist);
>>> +	ri->nr_records++;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/* Record extents that belong to inode btrees. */
>>> +STATIC int
>>> +xfs_repair_ialloc_extent_fn(
>>> +	struct xfs_btree_cur		*cur,
>>> +	struct xfs_rmap_irec		*rec,
>>> +	void				*priv)
>>> +{
>>> +	struct xfs_repair_ialloc	*ri = priv;
>>> +	struct xfs_mount		*mp = cur->bc_mp;
>>> +	xfs_fsblock_t			fsbno;
>>> +	xfs_agblock_t			agbno = rec->rm_startblock;
>>> +	xfs_agino_t			inoalign;
>>> +	xfs_agino_t			agino;
>>> +	xfs_agino_t			rec_agino;
>>> +	int				blks_per_cluster;
>>> +	int				error = 0;
>>> +
>>> +	if (xfs_scrub_should_terminate(ri->sc, &error))
>>> +		return error;
>>> +
>>> +	/* Fragment of the old btrees; dispose of them later. */
>>> +	if (rec->rm_owner == XFS_RMAP_OWN_INOBT) {
>>> +		fsbno = XFS_AGB_TO_FSB(mp, ri->sc->sa.agno, agbno);
>>> +		return xfs_repair_collect_btree_extent(ri->sc, ri->btlist,
>>> +				fsbno, rec->rm_blockcount);
>>> +	}
>>> +
>>> +	/* Skip extents which are not owned by this inode and fork. */
>>> +	if (rec->rm_owner != XFS_RMAP_OWN_INODES)
>>> +		return 0;
>>> +
>>> +	blks_per_cluster = xfs_icluster_size_fsb(mp);
>>> +
>>> +	if (agbno % blks_per_cluster != 0)
>>> +		return -EFSCORRUPTED;
>>> +
>>> +	trace_xfs_repair_ialloc_extent_fn(mp, ri->sc->sa.agno,
>>> +			rec->rm_startblock, rec->rm_blockcount, rec->rm_owner,
>>> +			rec->rm_offset, rec->rm_flags);
>>> +
>>> +	/*
>>> +	 * Determine the inode block alignment, and where the block
>>> +	 * ought to start if it's aligned properly.  On a sparse inode
>>> +	 * system the rmap doesn't have to start on an alignment boundary,
>>> +	 * but the record does.  On pre-sparse filesystems, we /must/
>>> +	 * start both rmap and inobt on an alignment boundary.
>>> +	 */
>>> +	inoalign = xfs_ialloc_cluster_alignment(mp);
>>> +	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
>>> +	rec_agino = XFS_OFFBNO_TO_AGINO(mp, rounddown(agbno, inoalign), 0);
>>> +	if (!xfs_sb_version_hassparseinodes(&mp->m_sb) && agino != rec_agino)
>>> +		return -EFSCORRUPTED;
>>> +
>>> +	/* Set up the free/hole masks for each cluster in this inode chunk. */
>> By chunk you did you mean record?  Please try to keep terminology
>> consistent as best you can.  Thx! :-)
> 
> Yikes, that /is/ a misleading comment.
> 
> "Set up the free/hole masks for each inode cluster that could be mapped
> by this rmap record."
> 
>>> +	for (;
>>> +	     agbno < rec->rm_startblock + rec->rm_blockcount;
>>> +	     agbno += blks_per_cluster) {
>>> +		error = xfs_repair_ialloc_process_cluster(ri, agbno,
>>> +				blks_per_cluster, rec_agino);
>>> +		if (error)
>>> +			return error;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/* Compare two ialloc extents. */
>>> +static int
>>> +xfs_repair_ialloc_extent_cmp(
>>> +	void				*priv,
>>> +	struct list_head		*a,
>>> +	struct list_head		*b)
>>> +{
>>> +	struct xfs_repair_ialloc_extent	*ap;
>>> +	struct xfs_repair_ialloc_extent	*bp;
>>> +
>>> +	ap = container_of(a, struct xfs_repair_ialloc_extent, list);
>>> +	bp = container_of(b, struct xfs_repair_ialloc_extent, list);
>>> +
>>> +	if (ap->startino > bp->startino)
>>> +		return 1;
>>> +	else if (ap->startino < bp->startino)
>>> +		return -1;
>>> +	return 0;
>>> +}
>>> +
>>> +/* Insert an inode chunk record into a given btree. */
>>> +static int
>>> +xfs_repair_iallocbt_insert_btrec(
>>> +	struct xfs_btree_cur		*cur,
>>> +	struct xfs_repair_ialloc_extent	*rie)
>>> +{
>>> +	int				stat;
>>> +	int				error;
>>> +
>>> +	error = xfs_inobt_lookup(cur, rie->startino, XFS_LOOKUP_EQ, &stat);
>>> +	if (error)
>>> +		return error;
>>> +	XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 0);
>>> +	error = xfs_inobt_insert_rec(cur, rie->holemask, rie->count,
>>> +			rie->count - rie->usedcount, rie->freemask, &stat);
>>> +	if (error)
>>> +		return error;
>>> +	XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 1);
>>> +	return error;
>>> +}
>>> +
>>> +/* Insert an inode chunk record into both inode btrees. */
>>> +static int
>>> +xfs_repair_iallocbt_insert_rec(
>>> +	struct xfs_scrub_context	*sc,
>>> +	struct xfs_repair_ialloc_extent	*rie)
>>> +{
>>> +	struct xfs_btree_cur		*cur;
>>> +	int				error;
>>> +
>>> +	trace_xfs_repair_ialloc_insert(sc->mp, sc->sa.agno, rie->startino,
>>> +			rie->holemask, rie->count, rie->count - rie->usedcount,
>>> +			rie->freemask);
>>> +
>>> +	/* Insert into the inobt. */
>>> +	cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp, sc->sa.agno,
>>> +			XFS_BTNUM_INO);
>>> +	error = xfs_repair_iallocbt_insert_btrec(cur, rie);
>>> +	if (error)
>>> +		goto out_cur;
>>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>>> +
>>> +	/* Insert into the finobt if chunk has free inodes. */
>>> +	if (xfs_sb_version_hasfinobt(&sc->mp->m_sb) &&
>>> +	    rie->count != rie->usedcount) {
>>> +		cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp,
>>> +				sc->sa.agno, XFS_BTNUM_FINO);
>>> +		error = xfs_repair_iallocbt_insert_btrec(cur, rie);
>>> +		if (error)
>>> +			goto out_cur;
>>> +		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>>> +	}
>>> +
>>> +	return xfs_repair_roll_ag_trans(sc);
>>> +out_cur:
>>> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
>>> +	return error;
>>> +}
>>> +
>>> +/* Free every record in the inode list. */
>>> +STATIC void
>>> +xfs_repair_iallocbt_cancel_inorecs(
>>> +	struct list_head		*reclist)
>>> +{
>>> +	struct xfs_repair_ialloc_extent	*rie;
>>> +	struct xfs_repair_ialloc_extent	*n;
>>> +
>>> +	list_for_each_entry_safe(rie, n, reclist, list) {
>>> +		list_del(&rie->list);
>>> +		kmem_free(rie);
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * Iterate all reverse mappings to find the inodes (OWN_INODES) and the inode
>>> + * btrees (OWN_INOBT).  Figure out if we have enough free space to reconstruct
>>> + * the inode btrees.  The caller must clean up the lists if anything goes
>>> + * wrong.
>>> + */
>>> +STATIC int
>>> +xfs_repair_iallocbt_find_inodes(
>>> +	struct xfs_scrub_context	*sc,
>>> +	struct list_head		*inode_records,
>>> +	struct xfs_repair_extent_list	*old_iallocbt_blocks)
>>> +{
>>> +	struct xfs_repair_ialloc	ri;
>>> +	struct xfs_mount		*mp = sc->mp;
>>> +	struct xfs_btree_cur		*cur;
>>> +	xfs_agblock_t			nr_blocks;
>>> +	int				error;
>>> +
>>> +	/* Collect all reverse mappings for inode blocks. */
>>> +	ri.extlist = inode_records;
>>> +	ri.btlist = old_iallocbt_blocks;
>>> +	ri.nr_records = 0;
>>> +	ri.sc = sc;
>>> +
>>> +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
>>> +	error = xfs_rmap_query_all(cur, xfs_repair_ialloc_extent_fn, &ri);
>>> +	if (error)
>>> +		goto err;
>>> +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>>> +
>>> +	/* Do we actually have enough space to do this? */
>>> +	nr_blocks = xfs_iallocbt_calc_size(mp, ri.nr_records);
>>> +	if (xfs_sb_version_hasfinobt(&mp->m_sb))
>>> +		nr_blocks *= 2;
>>> +	if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE))
>>> +		return -ENOSPC;
>>> +
>>> +	return 0;
>>> +
>>> +err:
>>> +	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
>>> +	return error;
>>> +}
>>> +
>>> +/* Update the AGI counters. */
>>> +STATIC int
>>> +xfs_repair_iallocbt_reset_counters(
>>> +	struct xfs_scrub_context	*sc,
>>> +	struct list_head		*inode_records,
>>> +	int				*log_flags)
>>> +{
>>> +	struct xfs_agi			*agi;
>>> +	struct xfs_repair_ialloc_extent	*rie;
>>> +	unsigned int			count = 0;
>>> +	unsigned int			usedcount = 0;
>>> +	unsigned int			freecount;
>>> +
>>> +	/* Figure out the new counters. */
>>> +	list_for_each_entry(rie, inode_records, list) {
>>> +		count += rie->count;
>>> +		usedcount += rie->usedcount;
>>> +	}
>>> +
>>> +	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
>>> +	freecount = count - usedcount;
>>> +
>>> +	/* XXX: trigger inode count recalculation */
>>> +
>>> +	/* Reset the per-AG info, both incore and ondisk. */
>>> +	sc->sa.pag->pagi_count = count;
>>> +	sc->sa.pag->pagi_freecount = freecount;
>>> +	agi->agi_count = cpu_to_be32(count);
>>> +	agi->agi_freecount = cpu_to_be32(freecount);
>>> +	*log_flags |= XFS_AGI_COUNT | XFS_AGI_FREECOUNT;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/* Initialize new inobt/finobt roots and implant them into the AGI. */
>>> +STATIC int
>>> +xfs_repair_iallocbt_reset_btrees(
>>> +	struct xfs_scrub_context	*sc,
>>> +	struct xfs_owner_info		*oinfo,
>>> +	int				*log_flags)
>>> +{
>>> +	struct xfs_agi			*agi;
>>> +	struct xfs_buf			*bp;
>>> +	struct xfs_mount		*mp = sc->mp;
>>> +	xfs_fsblock_t			inofsb;
>>> +	xfs_fsblock_t			finofsb;
>>> +	enum xfs_ag_resv_type		resv;
>>> +	int				error;
>>> +
>>> +	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
>>> +
>>> +	/* Initialize new inobt root. */
>>> +	resv = XFS_AG_RESV_NONE;
>>> +	error = xfs_repair_alloc_ag_block(sc, oinfo, &inofsb, resv);
>>> +	if (error)
>>> +		return error;
>>> +	error = xfs_repair_init_btblock(sc, inofsb, &bp, XFS_BTNUM_INO,
>>> +			&xfs_inobt_buf_ops);
>>> +	if (error)
>>> +		return error;
>>> +	agi->agi_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, inofsb));
>>> +	agi->agi_level = cpu_to_be32(1);
>>> +	*log_flags |= XFS_AGI_ROOT | XFS_AGI_LEVEL;
>>> +
>>> +	/* Initialize new finobt root. */
>>> +	if (!xfs_sb_version_hasfinobt(&mp->m_sb))
>>> +		return 0;
>>> +
>>> +	resv = mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA;
>>> +	error = xfs_repair_alloc_ag_block(sc, oinfo, &finofsb, resv);
>>> +	if (error)
>>> +		return error;
>>> +	error = xfs_repair_init_btblock(sc, finofsb, &bp, XFS_BTNUM_FINO,
>>> +			&xfs_inobt_buf_ops);
>>> +	if (error)
>>> +		return error;
>>> +	agi->agi_free_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, finofsb));
>>> +	agi->agi_free_level = cpu_to_be32(1);
>>> +	*log_flags |= XFS_AGI_FREE_ROOT | XFS_AGI_FREE_LEVEL;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/* Build new inode btrees and dispose of the old one. */
>>> +STATIC int
>>> +xfs_repair_iallocbt_rebuild_trees(
>>> +	struct xfs_scrub_context	*sc,
>>> +	struct list_head		*inode_records,
>>> +	struct xfs_owner_info		*oinfo,
>>> +	struct xfs_repair_extent_list	*old_iallocbt_blocks)
>>> +{
>>> +	struct xfs_repair_ialloc_extent	*rie;
>>> +	struct xfs_repair_ialloc_extent	*n;
>>> +	int				error;
>>> +
>>> +	/* Add all records. */
>>> +	list_sort(NULL, inode_records, xfs_repair_ialloc_extent_cmp);
>>> +	list_for_each_entry_safe(rie, n, inode_records, list) {
>>> +		error = xfs_repair_iallocbt_insert_rec(sc, rie);
>>> +		if (error)
>>> +			return error;
>>> +
>>> +		list_del(&rie->list);
>>> +		kmem_free(rie);
>>> +	}
>>> +
>>> +	/* Free the old inode btree blocks if they're not in use. */
>>> +	return xfs_repair_reap_btree_extents(sc, old_iallocbt_blocks, oinfo,
>>> +			XFS_AG_RESV_NONE);
>>> +}
>>> +
>>> +/* Repair both inode btrees. */
>>> +int
>>> +xfs_repair_iallocbt(
>>> +	struct xfs_scrub_context	*sc)
>>> +{
>>> +	struct xfs_owner_info		oinfo;
>>> +	struct list_head		inode_records;
>>> +	struct xfs_repair_extent_list	old_iallocbt_blocks;
>>> +	struct xfs_mount		*mp = sc->mp;
>>> +	int				log_flags = 0;
>>> +	int				error = 0;
>>> +
>>> +	/* We require the rmapbt to rebuild anything. */
>>> +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
>>> +		return -EOPNOTSUPP;
>>> +
>>> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
>>> +
>>> +	/* Collect the free space data and find the old btree blocks. */
>>> +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
>>> +	INIT_LIST_HEAD(&inode_records);
>>> +	xfs_repair_init_extent_list(&old_iallocbt_blocks);
>>> +	error = xfs_repair_iallocbt_find_inodes(sc, &inode_records,
>>> +			&old_iallocbt_blocks);
>>> +	if (error)
>>> +		goto out;
>>> +
>>> +	/*
>>> +	 * Blow out the old inode btrees.  This is the point at which
>>> +	 * we are no longer able to bail out gracefully.
>>> +	 */
>>> +	error = xfs_repair_iallocbt_reset_counters(sc, &inode_records,
>>> +			&log_flags);
>>> +	if (error)
>>> +		goto out;
>>> +	error = xfs_repair_iallocbt_reset_btrees(sc, &oinfo, &log_flags);
>>> +	if (error)
>>> +		goto out;
>>> +	xfs_ialloc_log_agi(sc->tp, sc->sa.agi_bp, log_flags);
>>> +
>>> +	/* Invalidate all the inobt/finobt blocks in btlist. */
>>> +	error = xfs_repair_invalidate_blocks(sc, &old_iallocbt_blocks);
>>> +	if (error)
>>> +		goto out;
>>> +	error = xfs_repair_roll_ag_trans(sc);
>>> +	if (error)
>>> +		goto out;
>>> +
>>> +	/* Now rebuild the inode information. */
>>> +	error = xfs_repair_iallocbt_rebuild_trees(sc, &inode_records, &oinfo,
>>> +			&old_iallocbt_blocks);
>>> +out:
>>> +	xfs_repair_cancel_btree_extents(sc, &old_iallocbt_blocks);
>>> +	xfs_repair_iallocbt_cancel_inorecs(&inode_records);
>>> +	return error;
>>> +}
>>> diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
>>> index e5f67fc68e9a..dcfa5eb18940 100644
>>> --- a/fs/xfs/scrub/repair.h
>>> +++ b/fs/xfs/scrub/repair.h
>>> @@ -104,6 +104,7 @@ int xfs_repair_agf(struct xfs_scrub_context *sc);
>>>    int xfs_repair_agfl(struct xfs_scrub_context *sc);
>>>    int xfs_repair_agi(struct xfs_scrub_context *sc);
>>>    int xfs_repair_allocbt(struct xfs_scrub_context *sc);
>>> +int xfs_repair_iallocbt(struct xfs_scrub_context *sc);
>>>    #else
>>> @@ -131,6 +132,7 @@ xfs_repair_calc_ag_resblks(
>>>    #define xfs_repair_agfl			xfs_repair_notsupported
>>>    #define xfs_repair_agi			xfs_repair_notsupported
>>>    #define xfs_repair_allocbt		xfs_repair_notsupported
>>> +#define xfs_repair_iallocbt		xfs_repair_notsupported
>>>    #endif /* CONFIG_XFS_ONLINE_REPAIR */
>>> diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
>>> index 7a55b20b7e4e..fec0e130f19e 100644
>>> --- a/fs/xfs/scrub/scrub.c
>>> +++ b/fs/xfs/scrub/scrub.c
>>> @@ -238,14 +238,14 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
>>>    		.type	= ST_PERAG,
>>>    		.setup	= xfs_scrub_setup_ag_iallocbt,
>>>    		.scrub	= xfs_scrub_inobt,
>>> -		.repair	= xfs_repair_notsupported,
>>> +		.repair	= xfs_repair_iallocbt,
>>>    	},
>>>    	[XFS_SCRUB_TYPE_FINOBT] = {	/* finobt */
>>>    		.type	= ST_PERAG,
>>>    		.setup	= xfs_scrub_setup_ag_iallocbt,
>>>    		.scrub	= xfs_scrub_finobt,
>>>    		.has	= xfs_sb_version_hasfinobt,
>>> -		.repair	= xfs_repair_notsupported,
>>> +		.repair	= xfs_repair_iallocbt,
>>>    	},
>>>    	[XFS_SCRUB_TYPE_RMAPBT] = {	/* rmapbt */
>>>    		.type	= ST_PERAG,
>>>
>>
>> Ok, some parts took some time to figure out, but I think I understand
>> the overall idea.  The comments help, and if you could add in a little
>> extra detail describing the function parameters, I think it would help
>> to add more supporting context to your comments.  Thx!
> 
> Every time I go wandering through the ialloc code my head also gets
> twisted in knots over inode chunks and inode clusters.  I think for the
> next round I'll try to make some ascii art diagrams that I can refer
> back to the next time I have to go digging through here (which will
> probably not be that long from now; rumor has it the ialloc scrub doesn't
> quite work right on systems with a 64K page size).
> 
> --D
> 
Alrighty, that sounds like it would be really helpful.  Thank you!!

Allison


>> Allison
>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 11/21] xfs: repair the rmapbt
  2018-06-24 19:24 ` [PATCH 11/21] xfs: repair the rmapbt Darrick J. Wong
@ 2018-07-03  5:32   ` Dave Chinner
  2018-07-03 23:59     ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-03  5:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:24:38PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Rebuild the reverse mapping btree from all primary metadata.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

....

> +static inline int xfs_repair_rmapbt_setup(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_inode		*ip)
> +{
> +	/* We don't support rmap repair, but we can still do a scan. */
> +	return xfs_scrub_setup_ag_btree(sc, ip, false);
> +}

This comment seems at odds with the commit message....

....

> +/*
> + * Reverse Mapping Btree Repair
> + * ============================
> + *
> + * This is the most involved of all the AG space btree rebuilds.  Everywhere
> + * else in XFS we lock inodes and then AG data structures, but generating the
> + * list of rmap records requires that we be able to scan both block mapping
> + * btrees of every inode in the filesystem to see if it owns any extents in
> + * this AG.  We can't tolerate any inode updates while we do this, so we
> + * freeze the filesystem to lock everyone else out, and grant ourselves
> + * special privileges to run transactions with regular background reclamation
> + * turned off.

Hmmm. This implies we are going to scan the entire filesystem for
every AG we need to rebuild the rmap tree in. That seems like an
awful lot of work if there's more than one rmap btree that needs
rebuilding.

> + * We also have to be very careful not to allow inode reclaim to start a
> + * transaction because all transactions (other than our own) will block.

What happens when we run out of memory? Inode reclaim will need to
run at that point, right?

> + * So basically we scan all primary per-AG metadata and all block maps of all
> + * inodes to generate a huge list of reverse map records.  Next we look for
> + * gaps in the rmap records to calculate all the unclaimed free space (1).
> + * Next, we scan all other OWN_AG metadata (bnobt, cntbt, agfl) and subtract
> + * the space used by those btrees from (1), and also subtract the free space
> + * listed in the bnobt from (1).  What's left are the gaps in assigned space
> + * that the new rmapbt knows about but the existing bnobt doesn't; these are
> + * the blocks from the old rmapbt and they can be freed.

This looks like a lot of repeated work. We've already scanned a
bunch of these trees to repair them, then thrown away the scan
results. Now we do another scan of what we've rebuilt.....

... hold on. Chicken and egg.

We verify and rebuild all the other trees from the rmap information
- how do we do determine that the rmap needs to rebuilt and that the
metadata it's being rebuilt from is valid?

Given that we've effetively got to shut down access to the
filesystem for the entire rmap rebuild while we do an entire
filesystem scan, why would we do this online? It's going to be
faster to do this rebuild offline (because of all the prefetching,
rebuilding all AG trees from the state gathered in the full
filesystem passes, etc) and we don't have to hack around potential
transaction and memory reclaim deadlock situations, either?

So why do rmap rebuilds online at all?

> + */
> +
> +/* Set us up to repair reverse mapping btrees. */
> +int
> +xfs_repair_rmapbt_setup(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_inode		*ip)
> +{
> +	int				error;
> +
> +	/*
> +	 * Freeze out anything that can lock an inode.  We reconstruct
> +	 * the rmapbt by reading inode bmaps with the AGF held, which is
> +	 * only safe w.r.t. ABBA deadlocks if we're the only ones locking
> +	 * inodes.
> +	 */
> +	error = xfs_scrub_fs_freeze(sc);
> +	if (error)
> +		return error;
> +
> +	/* Check the AG number and set up the scrub context. */
> +	error = xfs_scrub_setup_fs(sc, ip);
> +	if (error)
> +		return error;
> +
> +	/*
> +	 * Lock all the AG header buffers so that we can read all the
> +	 * per-AG metadata too.
> +	 */
> +	error = xfs_repair_grab_all_ag_headers(sc);
> +	if (error)
> +		return error;

So if we have thousands of AGs (think PB scale filesystems) then
we're going to hold many thousands of locked buffers here? Just so we
can rebuild the rmapbt in one AG?

What does holding these buffers locked protect us against that
an active freeze doesn't?

> +xfs_repair_rmapbt_new_rmap(
> +	struct xfs_repair_rmapbt	*rr,
> +	xfs_agblock_t			startblock,
> +	xfs_extlen_t			blockcount,
> +	uint64_t			owner,
> +	uint64_t			offset,
> +	unsigned int			flags)
> +{
> +	struct xfs_repair_rmapbt_extent	*rre;
> +	int				error = 0;
> +
> +	trace_xfs_repair_rmap_extent_fn(rr->sc->mp, rr->sc->sa.agno,
> +			startblock, blockcount, owner, offset, flags);
> +
> +	if (xfs_scrub_should_terminate(rr->sc, &error))
> +		return error;
> +
> +	rre = kmem_alloc(sizeof(struct xfs_repair_rmapbt_extent), KM_MAYFAIL);
> +	if (!rre)
> +		return -ENOMEM;

This seems like a likely thing to happen given the "no reclaim"
state of the filesystem and the memory demand a rmapbt rebuild
can have. If we've got GBs of rmap info in the AG that needs to be
rebuilt, how much RAM are we going to need to index it all as we
scan the filesystem?

> +xfs_repair_rmapbt_scan_ifork(
> +	struct xfs_repair_rmapbt	*rr,
> +	struct xfs_inode		*ip,
> +	int				whichfork)
> +{
> +	struct xfs_bmbt_irec		rec;
> +	struct xfs_iext_cursor		icur;
> +	struct xfs_mount		*mp = rr->sc->mp;
> +	struct xfs_btree_cur		*cur = NULL;
> +	struct xfs_ifork		*ifp;
> +	unsigned int			rflags;
> +	int				fmt;
> +	int				error = 0;
> +
> +	/* Do we even have data mapping extents? */
> +	fmt = XFS_IFORK_FORMAT(ip, whichfork);
> +	ifp = XFS_IFORK_PTR(ip, whichfork);
> +	switch (fmt) {
> +	case XFS_DINODE_FMT_BTREE:
> +		if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> +			error = xfs_iread_extents(rr->sc->tp, ip, whichfork);
> +			if (error)
> +				return error;
> +		}

Ok, so we need inodes locked to do this....

....
> +/* Iterate all the inodes in an AG group. */
> +STATIC int
> +xfs_repair_rmapbt_scan_inobt(
> +	struct xfs_btree_cur		*cur,
> +	union xfs_btree_rec		*rec,
> +	void				*priv)
> +{
> +	struct xfs_inobt_rec_incore	irec;
> +	struct xfs_repair_rmapbt	*rr = priv;
> +	struct xfs_mount		*mp = cur->bc_mp;
> +	struct xfs_inode		*ip = NULL;
> +	xfs_ino_t			ino;
> +	xfs_agino_t			agino;
> +	int				chunkidx;
> +	int				lock_mode = 0;
> +	int				error = 0;
> +
> +	xfs_inobt_btrec_to_irec(mp, rec, &irec);
> +
> +	for (chunkidx = 0, agino = irec.ir_startino;
> +	     chunkidx < XFS_INODES_PER_CHUNK;
> +	     chunkidx++, agino++) {
> +		bool	inuse;
> +
> +		/* Skip if this inode is free */
> +		if (XFS_INOBT_MASK(chunkidx) & irec.ir_free)
> +			continue;
> +		ino = XFS_AGINO_TO_INO(mp, cur->bc_private.a.agno, agino);
> +
> +		/* Back off and try again if an inode is being reclaimed */
> +		error = xfs_icache_inode_is_allocated(mp, cur->bc_tp, ino,
> +				&inuse);
> +		if (error == -EAGAIN)
> +			return -EDEADLOCK;

And we can get inode access errors here.....

FWIW, how is the inode being reclaimed if the filesystem is frozen?

> +
> +		/*
> +		 * Grab inode for scanning.  We cannot use DONTCACHE here
> +		 * because we already have a transaction so the iput must not
> +		 * trigger inode reclaim (which might allocate a transaction
> +		 * to clean up posteof blocks).
> +		 */
> +		error = xfs_iget(mp, cur->bc_tp, ino, 0, 0, &ip);

So if there are enough inodes in the AG, we'll run out of memory
here because we aren't reclaiming inodes from the cache but instead
putting them all on the deferred iput list?

> +		if (error)
> +			return error;
> +		trace_xfs_scrub_iget(ip, __this_address);
> +
> +		if ((ip->i_d.di_format == XFS_DINODE_FMT_BTREE &&
> +		     !(ip->i_df.if_flags & XFS_IFEXTENTS)) ||
> +		    (ip->i_d.di_aformat == XFS_DINODE_FMT_BTREE &&
> +		     !(ip->i_afp->if_flags & XFS_IFEXTENTS)))
> +			lock_mode = XFS_ILOCK_EXCL;
> +		else
> +			lock_mode = XFS_ILOCK_SHARED;
> +		if (!xfs_ilock_nowait(ip, lock_mode)) {
> +			error = -EBUSY;
> +			goto out_rele;
> +		}

And in what situation do we get inodes stuck with the ilock held on
frozen filesysetms?

....

> +out_unlock:
> +	xfs_iunlock(ip, lock_mode);
> +out_rele:
> +	iput(VFS_I(ip));
> +	return error;

calling iput in the error path is a bug - it will trigger all the
paths you're trying to avoid by using the deferred iput list.

....


> +/* Collect rmaps for all block mappings for every inode in this AG. */
> +STATIC int
> +xfs_repair_rmapbt_generate_aginode_rmaps(
> +	struct xfs_repair_rmapbt	*rr,
> +	xfs_agnumber_t			agno)
> +{
> +	struct xfs_scrub_context	*sc = rr->sc;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_btree_cur		*cur;
> +	struct xfs_buf			*agi_bp;
> +	int				error;
> +
> +	error = xfs_ialloc_read_agi(mp, sc->tp, agno, &agi_bp);
> +	if (error)
> +		return error;
> +	cur = xfs_inobt_init_cursor(mp, sc->tp, agi_bp, agno, XFS_BTNUM_INO);
> +	error = xfs_btree_query_all(cur, xfs_repair_rmapbt_scan_inobt, rr);

So if we get a locked or reclaiming inode anywhere in the
filesystem we see EDEADLOCK/EBUSY here without having scanned all
the inodes in the AG, right?

> +	xfs_btree_del_cursor(cur, error ? XFS_BTREE_ERROR : XFS_BTREE_NOERROR);
> +	xfs_trans_brelse(sc->tp, agi_bp);
> +	return error;
> +}
> +
> +/*
> + * Generate all the reverse-mappings for this AG, a list of the old rmapbt
> + * blocks, and the new btreeblks count.  Figure out if we have enough free
> + * space to reconstruct the inode btrees.  The caller must clean up the lists
> + * if anything goes wrong.
> + */
> +STATIC int
> +xfs_repair_rmapbt_find_rmaps(
> +	struct xfs_scrub_context	*sc,
> +	struct list_head		*rmap_records,
> +	xfs_agblock_t			*new_btreeblks)
> +{
> +	struct xfs_repair_rmapbt	rr;
> +	xfs_agnumber_t			agno;
> +	int				error;
> +
> +	rr.rmaplist = rmap_records;
> +	rr.sc = sc;
> +	rr.nr_records = 0;
> +
> +	/* Generate rmaps for AG space metadata */
> +	error = xfs_repair_rmapbt_generate_agheader_rmaps(&rr);
> +	if (error)
> +		return error;
> +	error = xfs_repair_rmapbt_generate_log_rmaps(&rr);
> +	if (error)
> +		return error;
> +	error = xfs_repair_rmapbt_generate_freesp_rmaps(&rr, new_btreeblks);
> +	if (error)
> +		return error;
> +	error = xfs_repair_rmapbt_generate_inobt_rmaps(&rr);
> +	if (error)
> +		return error;
> +	error = xfs_repair_rmapbt_generate_refcountbt_rmaps(&rr);
> +	if (error)
> +		return error;
> +
> +	/* Iterate all AGs for inodes rmaps. */
> +	for (agno = 0; agno < sc->mp->m_sb.sb_agcount; agno++) {
> +		error = xfs_repair_rmapbt_generate_aginode_rmaps(&rr, agno);
> +		if (error)
> +			return error;

And that means we abort here....

> +/* Repair the rmap btree for some AG. */
> +int
> +xfs_repair_rmapbt(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_owner_info		oinfo;
> +	struct list_head		rmap_records;
> +	xfs_extlen_t			new_btreeblks;
> +	int				log_flags = 0;
> +	int				error;
> +
> +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> +
> +	/* Collect rmaps for all AG headers. */
> +	INIT_LIST_HEAD(&rmap_records);
> +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_UNKNOWN);
> +	error = xfs_repair_rmapbt_find_rmaps(sc, &rmap_records, &new_btreeblks);
> +	if (error)
> +		goto out;

And we drop out here. So, essentially, any ENOMEM, locked inode or
inode in reclaim anywhere in the filesystem will prevent rmap
rebuild. Which says to me that rebuilding the rmap on
any substantial filesystem is likely to fail.

Which brings me back to my original question: why attempt to do
rmap rebuild online given how complex it is, the performance
implications of a full filesystem scan per AG that needs rebuild,
and all the ways it could easily fail?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 12/21] xfs: repair refcount btrees
  2018-06-24 19:24 ` [PATCH 12/21] xfs: repair refcount btrees Darrick J. Wong
@ 2018-07-03  5:50   ` Dave Chinner
  2018-07-04  2:23     ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-03  5:50 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:24:45PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Reconstruct the refcount data from the rmap btree.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Seems reasonable, though my brain turns to mush when trying to work
out the code that turned the rmap records into refcounts :/

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 13/21] xfs: repair inode records
  2018-06-24 19:24 ` [PATCH 13/21] xfs: repair inode records Darrick J. Wong
@ 2018-07-03  6:17   ` Dave Chinner
  2018-07-04  0:16     ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-03  6:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:24:51PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Try to reinitialize corrupt inodes, or clear the reflink flag
> if it's not needed.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

A comment somewhere that this is only attempting to repair inodes
that have failed verifier checks on read would be good.

......
> +/* Make sure this buffer can pass the inode buffer verifier. */
> +STATIC void
> +xfs_repair_inode_buf(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*bp)
> +{
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_trans		*tp = sc->tp;
> +	struct xfs_dinode		*dip;
> +	xfs_agnumber_t			agno;
> +	xfs_agino_t			agino;
> +	int				ioff;
> +	int				i;
> +	int				ni;
> +	int				di_ok;
> +	bool				unlinked_ok;
> +
> +	ni = XFS_BB_TO_FSB(mp, bp->b_length) * mp->m_sb.sb_inopblock;
> +	agno = xfs_daddr_to_agno(mp, XFS_BUF_ADDR(bp));
> +	for (i = 0; i < ni; i++) {
> +		ioff = i << mp->m_sb.sb_inodelog;
> +		dip = xfs_buf_offset(bp, ioff);
> +		agino = be32_to_cpu(dip->di_next_unlinked);
> +		unlinked_ok = (agino == NULLAGINO ||
> +			       xfs_verify_agino(sc->mp, agno, agino));
> +		di_ok = dip->di_magic == cpu_to_be16(XFS_DINODE_MAGIC) &&
> +			xfs_dinode_good_version(mp, dip->di_version);
> +		if (di_ok && unlinked_ok)
> +			continue;

Readability would be better with:

		unlinked_ok = false;
		if (agino == NULLAGINO || xfs_verify_agino(sc->mp, agno, agino))
			unlinked_ok = true;

		di_ok = false;
		if (dip->di_magic == cpu_to_be16(XFS_DINODE_MAGIC) &&
		    xfs_dinode_good_version(mp, dip->di_version))
			di_ok = true;

		if (di_ok && unlinked_ok)
			continue;


Also, is there a need to check the inode CRC here?

> +		dip->di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
> +		dip->di_version = 3;
> +		if (!unlinked_ok)
> +			dip->di_next_unlinked = cpu_to_be32(NULLAGINO);
> +		xfs_dinode_calc_crc(mp, dip);
> +		xfs_trans_buf_set_type(tp, bp, XFS_BLFT_DINO_BUF);
> +		xfs_trans_log_buf(tp, bp, ioff, ioff + sizeof(*dip) - 1);

Hmmmm. How does this interact with other transactions in repair that
might have logged changes to the same in-core inode? If it was just
changing the unlinked pointer, then that would be ok, but
magic/version are overwritten by the inode item recovery...

> +/* Reinitialize things that never change in an inode. */
> +STATIC void
> +xfs_repair_inode_header(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_dinode		*dip)
> +{
> +	dip->di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
> +	if (!xfs_dinode_good_version(sc->mp, dip->di_version))
> +		dip->di_version = 3;
> +	dip->di_ino = cpu_to_be64(sc->sm->sm_ino);
> +	uuid_copy(&dip->di_uuid, &sc->mp->m_sb.sb_meta_uuid);
> +	dip->di_gen = cpu_to_be32(sc->sm->sm_gen);
> +}
> +
> +/*
> + * Turn di_mode into /something/ recognizable.
> + *
> + * XXX: Ideally we'd try to read data block 0 to see if it's a directory.
> + */
> +STATIC void
> +xfs_repair_inode_mode(
> +	struct xfs_dinode	*dip)
> +{
> +	uint16_t		mode;
> +
> +	mode = be16_to_cpu(dip->di_mode);
> +	if (mode == 0 || xfs_mode_to_ftype(mode) != XFS_DIR3_FT_UNKNOWN)
> +		return;
> +
> +	/* bad mode, so we set it to a file that only root can read */
> +	mode = S_IFREG;
> +	dip->di_mode = cpu_to_be16(mode);
> +	dip->di_uid = 0;
> +	dip->di_gid = 0;

Not sure that's a good idea - if the mode is bad I don't think we
should expose it to anyone. Perhaps we need an orphan type instead.

> +}
> +
> +/* Fix any conflicting flags that the verifiers complain about. */
> +STATIC void
> +xfs_repair_inode_flags(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_dinode		*dip)
> +{
> +	struct xfs_mount		*mp = sc->mp;
> +	uint64_t			flags2;
> +	uint16_t			mode;
> +	uint16_t			flags;
> +
> +	mode = be16_to_cpu(dip->di_mode);
> +	flags = be16_to_cpu(dip->di_flags);
> +	flags2 = be64_to_cpu(dip->di_flags2);
> +
> +	if (xfs_sb_version_hasreflink(&mp->m_sb) && S_ISREG(mode))
> +		flags2 |= XFS_DIFLAG2_REFLINK;
> +	else
> +		flags2 &= ~(XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE);
> +	if (flags & XFS_DIFLAG_REALTIME)
> +		flags2 &= ~XFS_DIFLAG2_REFLINK;
> +	if (flags2 & XFS_DIFLAG2_REFLINK)
> +		flags2 &= ~XFS_DIFLAG2_DAX;
> +	dip->di_flags = cpu_to_be16(flags);
> +	dip->di_flags2 = cpu_to_be64(flags2);
> +}
> +
> +/* Make sure we don't have a garbage file size. */
> +STATIC void
> +xfs_repair_inode_size(
> +	struct xfs_dinode	*dip)
> +{
> +	uint64_t		size;
> +	uint16_t		mode;
> +
> +	mode = be16_to_cpu(dip->di_mode);
> +	size = be64_to_cpu(dip->di_size);
> +	switch (mode & S_IFMT) {
> +	case S_IFIFO:
> +	case S_IFCHR:
> +	case S_IFBLK:
> +	case S_IFSOCK:
> +		/* di_size can't be nonzero for special files */
> +		dip->di_size = 0;
> +		break;
> +	case S_IFREG:
> +		/* Regular files can't be larger than 2^63-1 bytes. */
> +		dip->di_size = cpu_to_be64(size & ~(1ULL << 63));
> +		break;
> +	case S_IFLNK:
> +		/* Catch over- or under-sized symlinks. */
> +		if (size > XFS_SYMLINK_MAXLEN)
> +			dip->di_size = cpu_to_be64(XFS_SYMLINK_MAXLEN);
> +		else if (size == 0)
> +			dip->di_size = cpu_to_be64(1);

Not sure this is valid - if the inode is in extent format then a
size of 1 is invalid and means the symlink will point to the
first byte in the data fork, and that could be anything....

> +		break;
> +	case S_IFDIR:
> +		/* Directories can't have a size larger than 32G. */
> +		if (size > XFS_DIR2_SPACE_SIZE)
> +			dip->di_size = cpu_to_be64(XFS_DIR2_SPACE_SIZE);
> +		else if (size == 0)
> +			dip->di_size = cpu_to_be64(1);

Similar. A size of 1 is not valid for a directory.

> +		break;
> +	}
> +}
.....
> +
> +/* Inode didn't pass verifiers, so fix the raw buffer and retry iget. */
> +STATIC int
> +xfs_repair_inode_core(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_imap			imap;
> +	struct xfs_buf			*bp;
> +	struct xfs_dinode		*dip;
> +	xfs_ino_t			ino;
> +	int				error;
> +
> +	/* Map & read inode. */
> +	ino = sc->sm->sm_ino;
> +	error = xfs_imap(sc->mp, sc->tp, ino, &imap, XFS_IGET_UNTRUSTED);
> +	if (error)
> +		return error;
> +
> +	error = xfs_trans_read_buf(sc->mp, sc->tp, sc->mp->m_ddev_targp,
> +			imap.im_blkno, imap.im_len, XBF_UNMAPPED, &bp, NULL);
> +	if (error)
> +		return error;

I'd like to see this check that the inode isn't in-core after we've read
and locked the inode buffer, just to ensure we haven't raced with
another access.

> +
> +	/* Make sure we can pass the inode buffer verifier. */
> +	xfs_repair_inode_buf(sc, bp);
> +	bp->b_ops = &xfs_inode_buf_ops;
> +
> +	/* Fix everything the verifier will complain about. */
> +	dip = xfs_buf_offset(bp, imap.im_boffset);
> +	xfs_repair_inode_header(sc, dip);
> +	xfs_repair_inode_mode(dip);
> +	xfs_repair_inode_flags(sc, dip);
> +	xfs_repair_inode_size(dip);
> +	xfs_repair_inode_extsize_hints(sc, dip);

what if the inode failed the fork verifiers rather than the dinode
verifier?

> + * Fix problems that the verifiers don't care about.  In general these are
> + * errors that don't cause problems elsewhere in the kernel that we can easily
> + * detect, so we don't check them all that rigorously.
> + */
> +
> +/* Make sure block and extent counts are ok. */
> +STATIC int
> +xfs_repair_inode_unchecked_blockcounts(
> +	struct xfs_scrub_context	*sc)
> +{
> +	xfs_filblks_t			count;
> +	xfs_filblks_t			acount;
> +	xfs_extnum_t			nextents;
> +	int				error;
> +
> +	/* di_nblocks/di_nextents/di_anextents don't match up? */
> +	error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_DATA_FORK,
> +			&nextents, &count);
> +	if (error)
> +		return error;
> +	sc->ip->i_d.di_nextents = nextents;
> +
> +	error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_ATTR_FORK,
> +			&nextents, &acount);
> +	if (error)
> +		return error;
> +	sc->ip->i_d.di_anextents = nextents;

Should the returned extent/block counts be validity checked?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 11/21] xfs: repair the rmapbt
  2018-07-03  5:32   ` Dave Chinner
@ 2018-07-03 23:59     ` Darrick J. Wong
  2018-07-04  8:44       ` Carlos Maiolino
  2018-07-04 23:21       ` Dave Chinner
  0 siblings, 2 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-03 23:59 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, Jul 03, 2018 at 03:32:00PM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:24:38PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Rebuild the reverse mapping btree from all primary metadata.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> ....
> 
> > +static inline int xfs_repair_rmapbt_setup(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_inode		*ip)
> > +{
> > +	/* We don't support rmap repair, but we can still do a scan. */
> > +	return xfs_scrub_setup_ag_btree(sc, ip, false);
> > +}
> 
> This comment seems at odds with the commit message....

This is the Kconfig shim needed if CONFIG_XFS_ONLINE_REPAIR=n.

> 
> ....
> 
> > +/*
> > + * Reverse Mapping Btree Repair
> > + * ============================
> > + *
> > + * This is the most involved of all the AG space btree rebuilds.  Everywhere
> > + * else in XFS we lock inodes and then AG data structures, but generating the
> > + * list of rmap records requires that we be able to scan both block mapping
> > + * btrees of every inode in the filesystem to see if it owns any extents in
> > + * this AG.  We can't tolerate any inode updates while we do this, so we
> > + * freeze the filesystem to lock everyone else out, and grant ourselves
> > + * special privileges to run transactions with regular background reclamation
> > + * turned off.
> 
> Hmmm. This implies we are going to scan the entire filesystem for
> every AG we need to rebuild the rmap tree in. That seems like an
> awful lot of work if there's more than one rmap btree that needs
> rebuild.

[some of this Dave and I discussed on IRC, so I'll summarize for
everyone else here...]

For this initial v0 iteration of the rmap repair code, yes, we have to
freeze the fs and iterate everything.  However, unless your computer and
storage are particularly untrustworthy, rmapbt reconstruction should be
a very infrequent thing.  Now that we have a FREEZE_OK flag, userspace
has to opt-in to slow repairs, and presumably it could choose instead to
unmount and run xfs_repair if that's too dear or there are too many
broken AGs, etc.  More on that later.

In the long run I don't see a need to freeze the filesystem to scan
every inode for bmbt entries in the damaged AG.  In fact, we can improve
the performance of all the AG repair functions in general with the
scheme I'm about to outline:

Create a "shut down this AG" primitive.  Once set, block and inode
allocation routines will bypass this AG.  Unlinked inodes are moved to
the unlinked list to avoid touching as much of the AGI as we practically
can.  Unmapped/freed blocks can be moved to a hidden inode (in another
AG) to be freed later.  Growfs operation in that AG can be rejected.

With AG shutdown in place, we no longer need to freeze the whole
filesystem.  Instead we traverse the live fs looking for mappings into
the afflicted AG, and when we're done rebuilding the rmapbt we can then
clean out the unlinked inodes and blocks, which catches us up to the
rest of the system.

(It would be awesome if we could capture the log intent items for the
removals until we're done rebuilding the rmapbt, but I suspect that
could fatally pin the log on us.)

However, given that adding AG shutdowns would be a lot of work I'd
prefer to get this launched even if it's reasonably likely early
versions will fail due to ENOMEM.  I tried to structure the rmapbt
repair so that we collect all the records and touch all the inodes
before we start writing anything so that if we run out of memory or
encounter a locked/reclaimable/whatever inode we'll just free all the
memory and bail out.  After that, the admin can take the fs offline and
run xfs_repair, so it shouldn't be much worse than today.

> > + * We also have to be very careful not to allow inode reclaim to start a
> > + * transaction because all transactions (other than our own) will block.
> 
> What happens when we run out of memory? Inode reclaim will need to
> run at that point, right?

Yes.

> > + * So basically we scan all primary per-AG metadata and all block maps of all
> > + * inodes to generate a huge list of reverse map records.  Next we look for
> > + * gaps in the rmap records to calculate all the unclaimed free space (1).
> > + * Next, we scan all other OWN_AG metadata (bnobt, cntbt, agfl) and subtract
> > + * the space used by those btrees from (1), and also subtract the free space
> > + * listed in the bnobt from (1).  What's left are the gaps in assigned space
> > + * that the new rmapbt knows about but the existing bnobt doesn't; these are
> > + * the blocks from the old rmapbt and they can be freed.
> 
This looks like a lot of repeated work. We've already scanned a
> bunch of these trees to repair them, then thrown away the scan
> results. Now we do another scan of what we've rebuilt.....
> 
> ... hold on. Chicken and egg.
> 
> We verify and rebuild all the other trees from the rmap information
- how do we determine that the rmap needs to be rebuilt and that the
> metadata it's being rebuilt from is valid?

Userspace should invoke the other scrubbers for the AG before deciding
to re-run the rmap scrubber with IFLAG_REPAIR set.  xfs_scrub won't try
to repair rmap btrees until after it's already checked everything else
in the filesystem.  Currently xfs_scrub will complain if it finds
corruption in the primary data & the rmapbt; as the code matures I will
probably change it to error out when this happens.

> Given that we've effectively got to shut down access to the
> filesystem for the entire rmap rebuild while we do an entire
> filesystem scan, why would we do this online? It's going to be
> faster to do this rebuild offline (because of all the prefetching,
> rebuilding all AG trees from the state gathered in the full
> filesystem passes, etc) and we don't have to hack around potential
> transaction and memory reclaim deadlock situations, either?
> 
> So why do rmap rebuilds online at all?

The thing is, xfs_scrub will warm the xfs_buf cache during phases 2 and
3 while it checks everything.  By the time it gets to rmapbt repairs
towards the end of phase 4 (if there's enough memory) those blocks will
still be in cache and online repair doesn't have to wait for the disk.

If instead you unmount and run xfs_repair then xfs_repair has to reload
all that metadata and recheck it, all of which happens with the fs
offline.

So except for the extra complexity of avoiding deadlocks (which I
readily admit is not a small task) I at least don't think it's a
clear-cut downtime win to rely on xfs_repair.

> > + */
> > +
> > +/* Set us up to repair reverse mapping btrees. */
> > +int
> > +xfs_repair_rmapbt_setup(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_inode		*ip)
> > +{
> > +	int				error;
> > +
> > +	/*
> > +	 * Freeze out anything that can lock an inode.  We reconstruct
> > +	 * the rmapbt by reading inode bmaps with the AGF held, which is
> > +	 * only safe w.r.t. ABBA deadlocks if we're the only ones locking
> > +	 * inodes.
> > +	 */
> > +	error = xfs_scrub_fs_freeze(sc);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Check the AG number and set up the scrub context. */
> > +	error = xfs_scrub_setup_fs(sc, ip);
> > +	if (error)
> > +		return error;
> > +
> > +	/*
> > +	 * Lock all the AG header buffers so that we can read all the
> > +	 * per-AG metadata too.
> > +	 */
> > +	error = xfs_repair_grab_all_ag_headers(sc);
> > +	if (error)
> > +		return error;
> 
> So if we have thousands of AGs (think PB scale filesystems) then
> we're going hold many thousands of locked buffers here? Just so we
> can rebuild the rmapbt in one AG?
> 
> What does holding these buffers locked protect us against that
> an active freeze doesn't?

Nothing, since we're frozen.  I think that's been around since before
rmapbt repair learned to freeze the fs.

> > +xfs_repair_rmapbt_new_rmap(
> > +	struct xfs_repair_rmapbt	*rr,
> > +	xfs_agblock_t			startblock,
> > +	xfs_extlen_t			blockcount,
> > +	uint64_t			owner,
> > +	uint64_t			offset,
> > +	unsigned int			flags)
> > +{
> > +	struct xfs_repair_rmapbt_extent	*rre;
> > +	int				error = 0;
> > +
> > +	trace_xfs_repair_rmap_extent_fn(rr->sc->mp, rr->sc->sa.agno,
> > +			startblock, blockcount, owner, offset, flags);
> > +
> > +	if (xfs_scrub_should_terminate(rr->sc, &error))
> > +		return error;
> > +
> > +	rre = kmem_alloc(sizeof(struct xfs_repair_rmapbt_extent), KM_MAYFAIL);
> > +	if (!rre)
> > +		return -ENOMEM;
> 
> This seems like a likely thing to happen given the "no reclaim"
> state of the filesystem and the memory demand a rmapbt rebuild
> can have. If we've got GBs of rmap info in the AG that needs to be
> rebuilt, how much RAM are we going to need to index it all as we
> scan the filesystem?

More than I'd like -- at least 24 bytes per record (which is no larger
than the on-disk btree record size) plus a list_head until I can move
the repairers away from creating huge lists.

> > +xfs_repair_rmapbt_scan_ifork(
> > +	struct xfs_repair_rmapbt	*rr,
> > +	struct xfs_inode		*ip,
> > +	int				whichfork)
> > +{
> > +	struct xfs_bmbt_irec		rec;
> > +	struct xfs_iext_cursor		icur;
> > +	struct xfs_mount		*mp = rr->sc->mp;
> > +	struct xfs_btree_cur		*cur = NULL;
> > +	struct xfs_ifork		*ifp;
> > +	unsigned int			rflags;
> > +	int				fmt;
> > +	int				error = 0;
> > +
> > +	/* Do we even have data mapping extents? */
> > +	fmt = XFS_IFORK_FORMAT(ip, whichfork);
> > +	ifp = XFS_IFORK_PTR(ip, whichfork);
> > +	switch (fmt) {
> > +	case XFS_DINODE_FMT_BTREE:
> > +		if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> > +			error = xfs_iread_extents(rr->sc->tp, ip, whichfork);
> > +			if (error)
> > +				return error;
> > +		}
> 
> Ok, so we need inodes locked to do this....
> 
> ....
> > +/* Iterate all the inodes in an AG group. */
> > +STATIC int
> > +xfs_repair_rmapbt_scan_inobt(
> > +	struct xfs_btree_cur		*cur,
> > +	union xfs_btree_rec		*rec,
> > +	void				*priv)
> > +{
> > +	struct xfs_inobt_rec_incore	irec;
> > +	struct xfs_repair_rmapbt	*rr = priv;
> > +	struct xfs_mount		*mp = cur->bc_mp;
> > +	struct xfs_inode		*ip = NULL;
> > +	xfs_ino_t			ino;
> > +	xfs_agino_t			agino;
> > +	int				chunkidx;
> > +	int				lock_mode = 0;
> > +	int				error = 0;
> > +
> > +	xfs_inobt_btrec_to_irec(mp, rec, &irec);
> > +
> > +	for (chunkidx = 0, agino = irec.ir_startino;
> > +	     chunkidx < XFS_INODES_PER_CHUNK;
> > +	     chunkidx++, agino++) {
> > +		bool	inuse;
> > +
> > +		/* Skip if this inode is free */
> > +		if (XFS_INOBT_MASK(chunkidx) & irec.ir_free)
> > +			continue;
> > +		ino = XFS_AGINO_TO_INO(mp, cur->bc_private.a.agno, agino);
> > +
> > +		/* Back off and try again if an inode is being reclaimed */
> > +		error = xfs_icache_inode_is_allocated(mp, cur->bc_tp, ino,
> > +				&inuse);
> > +		if (error == -EAGAIN)
> > +			return -EDEADLOCK;
> 
> And we can get inode access errors here.....
> 
> FWIW, how is the inode being reclaimed if the filesystem is frozen?

Memory reclaim from some other process?  Maybe? :)

I'll do some research to learn if other processes can get into inode
reclaim on our frozen fs.

> > +
> > +		/*
> > +		 * Grab inode for scanning.  We cannot use DONTCACHE here
> > +		 * because we already have a transaction so the iput must not
> > +		 * trigger inode reclaim (which might allocate a transaction
> > +		 * to clean up posteof blocks).
> > +		 */
> > +		error = xfs_iget(mp, cur->bc_tp, ino, 0, 0, &ip);
> 
> So if there are enough inodes in the AG, we'll run out of memory
> here because we aren't reclaiming inodes from the cache but instead
> > putting them all on the deferred iput list?

Enough inodes in the AG that also have post-eof blocks or cow blocks to
free.  If we don't have to do any inactive work then they just go away.

> > +		if (error)
> > +			return error;
> > +		trace_xfs_scrub_iget(ip, __this_address);
> > +
> > +		if ((ip->i_d.di_format == XFS_DINODE_FMT_BTREE &&
> > +		     !(ip->i_df.if_flags & XFS_IFEXTENTS)) ||
> > +		    (ip->i_d.di_aformat == XFS_DINODE_FMT_BTREE &&
> > +		     !(ip->i_afp->if_flags & XFS_IFEXTENTS)))
> > +			lock_mode = XFS_ILOCK_EXCL;
> > +		else
> > +			lock_mode = XFS_ILOCK_SHARED;
> > +		if (!xfs_ilock_nowait(ip, lock_mode)) {
> > +			error = -EBUSY;
> > +			goto out_rele;
> > +		}
> 
> And in what situation do we get inodes stuck with the ilock held on
> frozen filesystems?

I think we do not, and that this is a relic of pre-freeze rmap repair.

> ....
> 
> > +out_unlock:
> > +	xfs_iunlock(ip, lock_mode);
> > +out_rele:
> > +	iput(VFS_I(ip));
> > +	return error;
> 
> calling iput in the error path is a bug - it will trigger all the
> paths you're trying to avoid by using the deferred iput list.

Oops, that should be our xfs_scrub_repair_iput.

> ....
> 
> 
> > +/* Collect rmaps for all block mappings for every inode in this AG. */
> > +STATIC int
> > +xfs_repair_rmapbt_generate_aginode_rmaps(
> > +	struct xfs_repair_rmapbt	*rr,
> > +	xfs_agnumber_t			agno)
> > +{
> > +	struct xfs_scrub_context	*sc = rr->sc;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	struct xfs_btree_cur		*cur;
> > +	struct xfs_buf			*agi_bp;
> > +	int				error;
> > +
> > +	error = xfs_ialloc_read_agi(mp, sc->tp, agno, &agi_bp);
> > +	if (error)
> > +		return error;
> > +	cur = xfs_inobt_init_cursor(mp, sc->tp, agi_bp, agno, XFS_BTNUM_INO);
> > +	error = xfs_btree_query_all(cur, xfs_repair_rmapbt_scan_inobt, rr);
> 
> So if we get a locked or reclaiming inode anywhere in the
> filesystem we see EDEADLOCK/EBUSY here without having scanned all
> the inodes in the AG, right?

Right.

> > +	xfs_btree_del_cursor(cur, error ? XFS_BTREE_ERROR : XFS_BTREE_NOERROR);
> > +	xfs_trans_brelse(sc->tp, agi_bp);
> > +	return error;
> > +}
> > +
> > +/*
> > + * Generate all the reverse-mappings for this AG, a list of the old rmapbt
> > + * blocks, and the new btreeblks count.  Figure out if we have enough free
> > + * space to reconstruct the inode btrees.  The caller must clean up the lists
> > + * if anything goes wrong.
> > + */
> > +STATIC int
> > +xfs_repair_rmapbt_find_rmaps(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*rmap_records,
> > +	xfs_agblock_t			*new_btreeblks)
> > +{
> > +	struct xfs_repair_rmapbt	rr;
> > +	xfs_agnumber_t			agno;
> > +	int				error;
> > +
> > +	rr.rmaplist = rmap_records;
> > +	rr.sc = sc;
> > +	rr.nr_records = 0;
> > +
> > +	/* Generate rmaps for AG space metadata */
> > +	error = xfs_repair_rmapbt_generate_agheader_rmaps(&rr);
> > +	if (error)
> > +		return error;
> > +	error = xfs_repair_rmapbt_generate_log_rmaps(&rr);
> > +	if (error)
> > +		return error;
> > +	error = xfs_repair_rmapbt_generate_freesp_rmaps(&rr, new_btreeblks);
> > +	if (error)
> > +		return error;
> > +	error = xfs_repair_rmapbt_generate_inobt_rmaps(&rr);
> > +	if (error)
> > +		return error;
> > +	error = xfs_repair_rmapbt_generate_refcountbt_rmaps(&rr);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Iterate all AGs for inodes rmaps. */
> > +	for (agno = 0; agno < sc->mp->m_sb.sb_agcount; agno++) {
> > +		error = xfs_repair_rmapbt_generate_aginode_rmaps(&rr, agno);
> > +		if (error)
> > +			return error;
> 
> And that means we abort here....
> 
> > +/* Repair the rmap btree for some AG. */
> > +int
> > +xfs_repair_rmapbt(
> > +	struct xfs_scrub_context	*sc)
> > +{
> > +	struct xfs_owner_info		oinfo;
> > +	struct list_head		rmap_records;
> > +	xfs_extlen_t			new_btreeblks;
> > +	int				log_flags = 0;
> > +	int				error;
> > +
> > +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> > +
> > +	/* Collect rmaps for all AG headers. */
> > +	INIT_LIST_HEAD(&rmap_records);
> > +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_UNKNOWN);
> > +	error = xfs_repair_rmapbt_find_rmaps(sc, &rmap_records, &new_btreeblks);
> > +	if (error)
> > +		goto out;
> 
> And we drop out here. So, essentially, any ENOMEM, locked inode or
> inode in reclaim anywhere in the filesystem will prevent rmap
> rebuild. Which says to me that rebuilding the rmap on
> any substantial filesystem is likely to fail.
> 
> Which brings me back to my original question: why attempt to do
> rmap rebuild online given how complex it is, the performance
> implications of a full filesystem scan per AG that needs rebuild,
> and all the ways it could easily fail?

Right.  If we run out of memory or hit a locked/in-reclaim inode we'll
bounce back out to userspace having not touched anything.  Userspace can
decide if it wants to migrate or shut down other services and retry the
online scrub, or if it wants to unmount and run xfs_repair instead.
My mid-term goal for the repair patchset is to minimize the memory use
and minimize the amount of time we spend in the freezer, but for that I
need to add a few more (largeish) tools to XFS and am trying to avoid
snowballing a ton of code in front of repair. :)

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: [PATCH 13/21] xfs: repair inode records
  2018-07-03  6:17   ` Dave Chinner
@ 2018-07-04  0:16     ` Darrick J. Wong
  2018-07-04  1:03       ` Dave Chinner
  0 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-04  0:16 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, Jul 03, 2018 at 04:17:18PM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:24:51PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Try to reinitialize corrupt inodes, or clear the reflink flag
> > if it's not needed.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> A comment somewhere that this is only attempting to repair inodes
> that have failed verifier checks on read would be good.

There are a few comments in the callers, e.g.

"Repair all the things that the inode verifiers care about"

"Fix everything xfs_dinode_verify cares about."

"Make sure we can pass the inode buffer verifier."

Hmm, I think maybe you meant that I need to make it more obvious which
functions exist to make the verifiers happy (and so there won't be any
in-core inodes while they run) vs. which ones fix irregularities that
aren't caught as a condition for setting up in-core inodes?

xrep_inode_unchecked_* are the ones that run on in-core inodes; the rest
run on inodes so damaged they can't be _iget'd.

> ......
> > +/* Make sure this buffer can pass the inode buffer verifier. */
> > +STATIC void
> > +xfs_repair_inode_buf(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_buf			*bp)
> > +{
> > +	struct xfs_mount		*mp = sc->mp;
> > +	struct xfs_trans		*tp = sc->tp;
> > +	struct xfs_dinode		*dip;
> > +	xfs_agnumber_t			agno;
> > +	xfs_agino_t			agino;
> > +	int				ioff;
> > +	int				i;
> > +	int				ni;
> > +	int				di_ok;
> > +	bool				unlinked_ok;
> > +
> > +	ni = XFS_BB_TO_FSB(mp, bp->b_length) * mp->m_sb.sb_inopblock;
> > +	agno = xfs_daddr_to_agno(mp, XFS_BUF_ADDR(bp));
> > +	for (i = 0; i < ni; i++) {
> > +		ioff = i << mp->m_sb.sb_inodelog;
> > +		dip = xfs_buf_offset(bp, ioff);
> > +		agino = be32_to_cpu(dip->di_next_unlinked);
> > +		unlinked_ok = (agino == NULLAGINO ||
> > +			       xfs_verify_agino(sc->mp, agno, agino));
> > +		di_ok = dip->di_magic == cpu_to_be16(XFS_DINODE_MAGIC) &&
> > +			xfs_dinode_good_version(mp, dip->di_version);
> > +		if (di_ok && unlinked_ok)
> > +			continue;
> 
> Readability would be better with:
> 
> 		unlinked_ok = false;
> 		if (agino == NULLAGINO || xfs_verify_agino(sc->mp, agno, agino))
> 			unlinked_ok = true;
> 
> 		di_ok = false;
> 		if (dip->di_magic == cpu_to_be16(XFS_DINODE_MAGIC) &&
> 		    xfs_dinode_good_version(mp, dip->di_version))
> 			di_ok = true;
> 
> 		if (di_ok && unlinked_ok)
> 			continue;
> 

Ok.

> Also, is there a need to check the inode CRC here?

We already know the inode core is bad, so why not just reset it?

> > +		dip->di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
> > +		dip->di_version = 3;
> > +		if (!unlinked_ok)
> > +			dip->di_next_unlinked = cpu_to_be32(NULLAGINO);
> > +		xfs_dinode_calc_crc(mp, dip);
> > +		xfs_trans_buf_set_type(tp, bp, XFS_BLFT_DINO_BUF);
> > +		xfs_trans_log_buf(tp, bp, ioff, ioff + sizeof(*dip) - 1);
> 
> Hmmmm. how does this interact with other transactions in repair that
> might have logged changes to the same in-core inode? If it was just
> changing the unlinked pointer, then that would be ok, but
> magic/version are overwritten by the inode item recovery...

There shouldn't be an in-core inode; this function should only get
called if we failed to _iget the inode, which implies that nobody else
has an in-core inode.

> 
> > +/* Reinitialize things that never change in an inode. */
> > +STATIC void
> > +xfs_repair_inode_header(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_dinode		*dip)
> > +{
> > +	dip->di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
> > +	if (!xfs_dinode_good_version(sc->mp, dip->di_version))
> > +		dip->di_version = 3;
> > +	dip->di_ino = cpu_to_be64(sc->sm->sm_ino);
> > +	uuid_copy(&dip->di_uuid, &sc->mp->m_sb.sb_meta_uuid);
> > +	dip->di_gen = cpu_to_be32(sc->sm->sm_gen);
> > +}
> > +
> > +/*
> > + * Turn di_mode into /something/ recognizable.
> > + *
> > + * XXX: Ideally we'd try to read data block 0 to see if it's a directory.
> > + */
> > +STATIC void
> > +xfs_repair_inode_mode(
> > +	struct xfs_dinode	*dip)
> > +{
> > +	uint16_t		mode;
> > +
> > +	mode = be16_to_cpu(dip->di_mode);
> > +	if (mode == 0 || xfs_mode_to_ftype(mode) != XFS_DIR3_FT_UNKNOWN)
> > +		return;
> > +
> > +	/* bad mode, so we set it to a file that only root can read */
> > +	mode = S_IFREG;
> > +	dip->di_mode = cpu_to_be16(mode);
> > +	dip->di_uid = 0;
> > +	dip->di_gid = 0;
> 
> Not sure that's a good idea - if the mode is bad I don't think we
> should expose it to anyone. Perhaps we need an orphan type

Agreed.

> > +}
> > +
> > +/* Fix any conflicting flags that the verifiers complain about. */
> > +STATIC void
> > +xfs_repair_inode_flags(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_dinode		*dip)
> > +{
> > +	struct xfs_mount		*mp = sc->mp;
> > +	uint64_t			flags2;
> > +	uint16_t			mode;
> > +	uint16_t			flags;
> > +
> > +	mode = be16_to_cpu(dip->di_mode);
> > +	flags = be16_to_cpu(dip->di_flags);
> > +	flags2 = be64_to_cpu(dip->di_flags2);
> > +
> > +	if (xfs_sb_version_hasreflink(&mp->m_sb) && S_ISREG(mode))
> > +		flags2 |= XFS_DIFLAG2_REFLINK;
> > +	else
> > +		flags2 &= ~(XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE);
> > +	if (flags & XFS_DIFLAG_REALTIME)
> > +		flags2 &= ~XFS_DIFLAG2_REFLINK;
> > +	if (flags2 & XFS_DIFLAG2_REFLINK)
> > +		flags2 &= ~XFS_DIFLAG2_DAX;
> > +	dip->di_flags = cpu_to_be16(flags);
> > +	dip->di_flags2 = cpu_to_be64(flags2);
> > +}
> > +
> > +/* Make sure we don't have a garbage file size. */
> > +STATIC void
> > +xfs_repair_inode_size(
> > +	struct xfs_dinode	*dip)
> > +{
> > +	uint64_t		size;
> > +	uint16_t		mode;
> > +
> > +	mode = be16_to_cpu(dip->di_mode);
> > +	size = be64_to_cpu(dip->di_size);
> > +	switch (mode & S_IFMT) {
> > +	case S_IFIFO:
> > +	case S_IFCHR:
> > +	case S_IFBLK:
> > +	case S_IFSOCK:
> > +		/* di_size can't be nonzero for special files */
> > +		dip->di_size = 0;
> > +		break;
> > +	case S_IFREG:
> > +		/* Regular files can't be larger than 2^63-1 bytes. */
> > +		dip->di_size = cpu_to_be64(size & ~(1ULL << 63));
> > +		break;
> > +	case S_IFLNK:
> > +		/* Catch over- or under-sized symlinks. */
> > +		if (size > XFS_SYMLINK_MAXLEN)
> > +			dip->di_size = cpu_to_be64(XFS_SYMLINK_MAXLEN);
> > +		else if (size == 0)
> > +			dip->di_size = cpu_to_be64(1);
> 
> Not sure this is valid - if the inode is in extent format then a
> size of 1 is invalid and means the symlink will point to the
> first byte in the data fork, and that could be anything....

I picked these wonky looking formats so that we'd always trigger the
higher level repair functions to have a look at the link/dir without
blowing up elsewhere in the code if we tried to use them.  Not that we
can do much for broken symlinks, but directories could be rebuilt.

But maybe directories should simply be reset to an empty inline
directory, and eventually grow an iflag that will always trigger
directory reconstruction (when parent pointers become a thing).

> > +		break;
> > +	case S_IFDIR:
> > +		/* Directories can't have a size larger than 32G. */
> > +		if (size > XFS_DIR2_SPACE_SIZE)
> > +			dip->di_size = cpu_to_be64(XFS_DIR2_SPACE_SIZE);
> > +		else if (size == 0)
> > +			dip->di_size = cpu_to_be64(1);
> 
> Similar. A size of 1 is not valid for a directory.
> 
> > +		break;
> > +	}
> > +}
> .....
> > +
> > +/* Inode didn't pass verifiers, so fix the raw buffer and retry iget. */
> > +STATIC int
> > +xfs_repair_inode_core(
> > +	struct xfs_scrub_context	*sc)
> > +{
> > +	struct xfs_imap			imap;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_dinode		*dip;
> > +	xfs_ino_t			ino;
> > +	int				error;
> > +
> > +	/* Map & read inode. */
> > +	ino = sc->sm->sm_ino;
> > +	error = xfs_imap(sc->mp, sc->tp, ino, &imap, XFS_IGET_UNTRUSTED);
> > +	if (error)
> > +		return error;
> > +
> > +	error = xfs_trans_read_buf(sc->mp, sc->tp, sc->mp->m_ddev_targp,
> > +			imap.im_blkno, imap.im_len, XBF_UNMAPPED, &bp, NULL);
> > +	if (error)
> > +		return error;
> 
> I'd like to see this check the inode isn't in-core after we've read
> and locked the inode buffer, just to ensure we haven't raced with
> another access.

Ok.

> > +
> > +	/* Make sure we can pass the inode buffer verifier. */
> > +	xfs_repair_inode_buf(sc, bp);
> > +	bp->b_ops = &xfs_inode_buf_ops;
> > +
> > +	/* Fix everything the verifier will complain about. */
> > +	dip = xfs_buf_offset(bp, imap.im_boffset);
> > +	xfs_repair_inode_header(sc, dip);
> > +	xfs_repair_inode_mode(dip);
> > +	xfs_repair_inode_flags(sc, dip);
> > +	xfs_repair_inode_size(dip);
> > +	xfs_repair_inode_extsize_hints(sc, dip);
> 
> what if the inode failed the fork verifiers rather than the dinode
> verifier?

That's coming up in the next patch.  Want me to put in an XXX comment to
that effect?

> > + * Fix problems that the verifiers don't care about.  In general these are
> > + * errors that don't cause problems elsewhere in the kernel that we can easily
> > + * detect, so we don't check them all that rigorously.
> > + */
> > +
> > +/* Make sure block and extent counts are ok. */
> > +STATIC int
> > +xfs_repair_inode_unchecked_blockcounts(
> > +	struct xfs_scrub_context	*sc)
> > +{
> > +	xfs_filblks_t			count;
> > +	xfs_filblks_t			acount;
> > +	xfs_extnum_t			nextents;
> > +	int				error;
> > +
> > +	/* di_nblocks/di_nextents/di_anextents don't match up? */
> > +	error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_DATA_FORK,
> > +			&nextents, &count);
> > +	if (error)
> > +		return error;
> > +	sc->ip->i_d.di_nextents = nextents;
> > +
> > +	error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_ATTR_FORK,
> > +			&nextents, &acount);
> > +	if (error)
> > +		return error;
> > +	sc->ip->i_d.di_anextents = nextents;
> 
> Should the returned extent/block counts be validity checked?

Er... yes.  Good catch. :)

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 13/21] xfs: repair inode records
  2018-07-04  0:16     ` Darrick J. Wong
@ 2018-07-04  1:03       ` Dave Chinner
  2018-07-04  1:30         ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-04  1:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jul 03, 2018 at 05:16:12PM -0700, Darrick J. Wong wrote:
> On Tue, Jul 03, 2018 at 04:17:18PM +1000, Dave Chinner wrote:
> > On Sun, Jun 24, 2018 at 12:24:51PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Try to reinitialize corrupt inodes, or clear the reflink flag
> > > if it's not needed.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > A comment somewhere that this is only attempting to repair inodes
> > that have failed verifier checks on read would be good.
> 
> There are a few comments in the callers, e.g.
> 
> "Repair all the things that the inode verifiers care about"
> 
> "Fix everything xfs_dinode_verify cares about."
> 
> "Make sure we can pass the inode buffer verifier."
> 
> Hmm, I think maybe you meant that I need to make it more obvious which
> functions exist to make the verifiers happy (and so there won't be any
> in-core inodes while they run) vs. which ones fix irregularities that
> aren't caught as a condition for setting up in-core inodes?

Well, that too. My main point is that various one-liners don't
explain the overall picture of what, why and how the repair is being
done....

> xrep_inode_unchecked_* are the ones that run on in-core inodes; the rest
> run on inodes so damaged they can't be _iget'd.

It's completely ambiguous, though: "unchecked" by what, exactly? :P

> > Also, is there a need to check the inode CRC here?
> 
> We already know the inode core is bad, so why not just reset it?

But you don't recalculate it if di_ok and unlinked_ok are true. It
> only gets recalc'd when changes need to be made. Hence an inode
that failed the verifier because of a CRC error still won't pass the
verifier after going through this function.

> > > +		dip->di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
> > > +		dip->di_version = 3;
> > > +		if (!unlinked_ok)
> > > +			dip->di_next_unlinked = cpu_to_be32(NULLAGINO);
> > > +		xfs_dinode_calc_crc(mp, dip);
> > > +		xfs_trans_buf_set_type(tp, bp, XFS_BLFT_DINO_BUF);
> > > +		xfs_trans_log_buf(tp, bp, ioff, ioff + sizeof(*dip) - 1);
> > 
> > Hmmmm. how does this interact with other transactions in repair that
> > might have logged changes to the same in-core inode? If it was just
> > changing the unlinked pointer, then that would be ok, but
> > magic/version are overwritten by the inode item recovery...
> 
> There shouldn't be an in-core inode; this function should only get
> called if we failed to _iget the inode, which implies that nobody else
> has an in-core inode.

OK - so we've held the buffer locked across a check for in-core
inodes we are trying to repair?

> > > +	switch (mode & S_IFMT) {
> > > +	case S_IFIFO:
> > > +	case S_IFCHR:
> > > +	case S_IFBLK:
> > > +	case S_IFSOCK:
> > > +		/* di_size can't be nonzero for special files */
> > > +		dip->di_size = 0;
> > > +		break;
> > > +	case S_IFREG:
> > > +		/* Regular files can't be larger than 2^63-1 bytes. */
> > > +		dip->di_size = cpu_to_be64(size & ~(1ULL << 63));
> > > +		break;
> > > +	case S_IFLNK:
> > > +		/* Catch over- or under-sized symlinks. */
> > > +		if (size > XFS_SYMLINK_MAXLEN)
> > > +			dip->di_size = cpu_to_be64(XFS_SYMLINK_MAXLEN);
> > > +		else if (size == 0)
> > > +			dip->di_size = cpu_to_be64(1);
> > 
> > Not sure this is valid - if the inode is in extent format then a
> > size of 1 is invalid and means the symlink will point to the
> > first byte in the data fork, and that could be anything....
> 
> I picked these wonky looking formats so that we'd always trigger the
> higher level repair functions to have a look at the link/dir without
> blowing up elsewhere in the code if we tried to use them.  Not that we
> can do much for broken symlinks, but directories could be rebuilt.

Change the symlink to an inline symlink that points to "zero length
symlink repaired by online repair" and set the c/mtimes to the
current time?

> But maybe directories should simply be reset to an empty inline
> directory, and eventually grow an iflag that will always trigger
> directory reconstruction (when parent pointers become a thing).

Yeah, I think if we are going to do anything here we should be
setting the inodes to a valid "empty" state. Or at least comment that
it's being set to a state that will be detected and rebuilt by the
upcoming fork repair pass.

> > what if the inode failed the fork verifiers rather than the dinode
> > verifier?
> 
> That's coming up in the next patch.  Want me to put in an XXX comment to
> that effect?

Not if all you do is remove it in the next patch - better to
document where we are making changes that the fork rebuild will
detect and fix up. :P

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 13/21] xfs: repair inode records
  2018-07-04  1:03       ` Dave Chinner
@ 2018-07-04  1:30         ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-04  1:30 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Wed, Jul 04, 2018 at 11:03:44AM +1000, Dave Chinner wrote:
> On Tue, Jul 03, 2018 at 05:16:12PM -0700, Darrick J. Wong wrote:
> > On Tue, Jul 03, 2018 at 04:17:18PM +1000, Dave Chinner wrote:
> > > On Sun, Jun 24, 2018 at 12:24:51PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Try to reinitialize corrupt inodes, or clear the reflink flag
> > > > if it's not needed.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > A comment somewhere that this is only attempting to repair inodes
> > > that have failed verifier checks on read would be good.
> > 
> > There are a few comments in the callers, e.g.
> > 
> > "Repair all the things that the inode verifiers care about"
> > 
> > "Fix everything xfs_dinode_verify cares about."
> > 
> > "Make sure we can pass the inode buffer verifier."
> > 
> > Hmm, I think maybe you meant that I need to make it more obvious which
> > functions exist to make the verifiers happy (and so there won't be any
> > in-core inodes while they run) vs. which ones fix irregularities that
> > aren't caught as a condition for setting up in-core inodes?
> 
> Well, that too. My main point is that various one-liners don't
> explain the overall picture of what, why and how the repair is being
> done....

Ok.

> > xrep_inode_unchecked_* are the ones that run on in-core inodes; the rest
> > run on inodes so damaged they can't be _iget'd.
> 
> It's completely ambiguous, though: "unchecked" by what, exactly? :P
> 
> > > Also, is there a need to check the inode CRC here?
> > 
> > We already know the inode core is bad, so why not just reset it?
> 
> But you don't recalculate it if di_ok and unlinked_ok are true. It
> only gets recalc'd when changes need to be made. Hence an inode
> that failed the verifier because of a CRC error still won't pass the
> verifier after going through this function.

D'oh.  Thank you for catching that. :)

> > > > +		dip->di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
> > > > +		dip->di_version = 3;
> > > > +		if (!unlinked_ok)
> > > > +			dip->di_next_unlinked = cpu_to_be32(NULLAGINO);
> > > > +		xfs_dinode_calc_crc(mp, dip);
> > > > +		xfs_trans_buf_set_type(tp, bp, XFS_BLFT_DINO_BUF);
> > > > +		xfs_trans_log_buf(tp, bp, ioff, ioff + sizeof(*dip) - 1);
> > > 
> > > Hmmmm. how does this interact with other transactions in repair that
> > > might have logged changes to the same in-core inode? If it was just
> > > changing the unlinked pointer, then that would be ok, but
> > > magic/version are overwritten by the inode item recovery...
> > 
> > There shouldn't be an in-core inode; this function should only get
> > called if we failed to _iget the inode, which implies that nobody else
> > has an in-core inode.
> 
> OK - so we've held the buffer locked across a check for in-core
> inodes we are trying to repair?

That's the intent, anyway. :)

> > > > +	switch (mode & S_IFMT) {
> > > > +	case S_IFIFO:
> > > > +	case S_IFCHR:
> > > > +	case S_IFBLK:
> > > > +	case S_IFSOCK:
> > > > +		/* di_size can't be nonzero for special files */
> > > > +		dip->di_size = 0;
> > > > +		break;
> > > > +	case S_IFREG:
> > > > +		/* Regular files can't be larger than 2^63-1 bytes. */
> > > > +		dip->di_size = cpu_to_be64(size & ~(1ULL << 63));
> > > > +		break;
> > > > +	case S_IFLNK:
> > > > +		/* Catch over- or under-sized symlinks. */
> > > > +		if (size > XFS_SYMLINK_MAXLEN)
> > > > +			dip->di_size = cpu_to_be64(XFS_SYMLINK_MAXLEN);
> > > > +		else if (size == 0)
> > > > +			dip->di_size = cpu_to_be64(1);
> > > 
> > > Not sure this is valid - if the inode is in extent format then a
> > > size of 1 is invalid and means the symlink will point to the
> > > first byte in the data fork, and that could be anything....
> > 
> > I picked these wonky looking formats so that we'd always trigger the
> > higher level repair functions to have a look at the link/dir without
> > blowing up elsewhere in the code if we tried to use them.  Not that we
> > can do much for broken symlinks, but directories could be rebuilt.
> 
> Change the symlink to an inline symlink that points to "zero length
> symlink repaired by online repair" and set the c/mtimes to the
> current time?

Ok.

> > But maybe directories should simply be reset to an empty inline
> > directory, and eventually grow an iflag that will always trigger
> > directory reconstruction (when parent pointers become a thing).
> 
> Yeah, I think if we are going to do anything here we should be
> setting the inodes to a valid "empty" state. Or at least comment that
> it's being set to a state that will be detected and rebuilt by the
> upcoming fork repair pass.

Ok.

> > > what if the inode failed the fork verifiers rather than the dinode
> > > verifier?
> > 
> > That's coming up in the next patch.  Want me to put in an XXX comment to
> > that effect?
> 
> Not if all you do is remove it in the next patch - better to
> document where we are making changes that the fork rebuild will
> detect and fix up. :P

<nod>

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 14/21] xfs: zap broken inode forks
  2018-06-24 19:24 ` [PATCH 14/21] xfs: zap broken inode forks Darrick J. Wong
@ 2018-07-04  2:07   ` Dave Chinner
  2018-07-04  3:26     ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-04  2:07 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:24:57PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Determine if inode fork damage is responsible for the inode being unable
> to pass the ifork verifiers in xfs_iget and zap the fork contents if
> this is true.  Once this is done the fork will be empty but we'll be
> able to construct an in-core inode, and a subsequent call to the inode
> fork repair ioctl will search the rmapbt to rebuild the records that
> were in the fork.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr_leaf.c |   32 ++-
>  fs/xfs/libxfs/xfs_attr_leaf.h |    2 
>  fs/xfs/libxfs/xfs_bmap.c      |   21 ++
>  fs/xfs/libxfs/xfs_bmap.h      |    2 
>  fs/xfs/scrub/inode_repair.c   |  399 +++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 437 insertions(+), 19 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> index b3c19339e1b5..f6c458104934 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> @@ -894,23 +894,16 @@ xfs_attr_shortform_allfit(
>  	return xfs_attr_shortform_bytesfit(dp, bytes);
>  }
>  
> -/* Verify the consistency of an inline attribute fork. */
> +/* Verify the consistency of a raw inline attribute fork. */
>  xfs_failaddr_t
> -xfs_attr_shortform_verify(
> -	struct xfs_inode		*ip)
> +xfs_attr_shortform_verify_struct(
> +	struct xfs_attr_shortform	*sfp,
> +	size_t				size)

The internal structure checking functions in the directory code
use the naming convention xfs_dir3_<type>_check(). I think we should use
the same here. i.e. xfs_attr_sf_check().

> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index b7f094e19bab..b1254e6c17b5 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -6223,18 +6223,16 @@ xfs_bmap_finish_one(
>  	return error;
>  }
>  
> -/* Check that an inode's extent does not have invalid flags or bad ranges. */
> +/* Check that an extent does not have invalid flags or bad ranges. */
>  xfs_failaddr_t
> -xfs_bmap_validate_extent(
> -	struct xfs_inode	*ip,
> +xfs_bmbt_validate_extent(

xfs_bmbt_ prefixes should only appear in xfs_bmap_btree.c, not
xfs_bmap.c....

So either it needs to get moved or renamed to something like
xfs_bmap_validate_irec()?


> diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c
> index 4ac43c1b1eb0..b941f21d7667 100644
> --- a/fs/xfs/scrub/inode_repair.c
> +++ b/fs/xfs/scrub/inode_repair.c
> @@ -22,11 +22,15 @@
>  #include "xfs_ialloc.h"
>  #include "xfs_da_format.h"
>  #include "xfs_reflink.h"
> +#include "xfs_alloc.h"
>  #include "xfs_rmap.h"
> +#include "xfs_rmap_btree.h"
>  #include "xfs_bmap.h"
> +#include "xfs_bmap_btree.h"
>  #include "xfs_bmap_util.h"
>  #include "xfs_dir2.h"
>  #include "xfs_quota_defs.h"
> +#include "xfs_attr_leaf.h"
>  #include "scrub/xfs_scrub.h"
>  #include "scrub/scrub.h"
>  #include "scrub/common.h"
> @@ -113,7 +117,8 @@ xfs_repair_inode_mode(
>  STATIC void
>  xfs_repair_inode_flags(
>  	struct xfs_scrub_context	*sc,
> -	struct xfs_dinode		*dip)
> +	struct xfs_dinode		*dip,
> +	bool				is_rt_file)
>  {
>  	struct xfs_mount		*mp = sc->mp;
>  	uint64_t			flags2;
> @@ -132,6 +137,10 @@ xfs_repair_inode_flags(
>  		flags2 &= ~XFS_DIFLAG2_REFLINK;
>  	if (flags2 & XFS_DIFLAG2_REFLINK)
>  		flags2 &= ~XFS_DIFLAG2_DAX;
> +	if (is_rt_file)
> +		flags |= XFS_DIFLAG_REALTIME;
> +	else
> +		flags &= ~XFS_DIFLAG_REALTIME;

This needs to be done first. i.e. before we check things like rt vs
reflink flags.

> @@ -210,17 +219,402 @@ xfs_repair_inode_extsize_hints(
>  	}
>  }
>  
> +struct xfs_repair_inode_fork_counters {
> +	struct xfs_scrub_context	*sc;
> +	xfs_rfsblock_t			data_blocks;
> +	xfs_rfsblock_t			rt_blocks;

An inode is either data or rt, not both. Why do you need two separate
counters? Oh, bmbt blocks are always data, right? Comment, perhaps?

> +	xfs_rfsblock_t			attr_blocks;
> +	xfs_extnum_t			data_extents;
> +	xfs_extnum_t			rt_extents;

but bmbt blocks are not data extents, so only one counter here?

> +/* Count extents and blocks for a given inode from all rmap data. */
> +STATIC int
> +xfs_repair_inode_count_rmaps(
> +	struct xfs_repair_inode_fork_counters	*rifc)
> +{
> +	xfs_agnumber_t			agno;
> +	int				error;
> +
> +	if (!xfs_sb_version_hasrmapbt(&rifc->sc->mp->m_sb) ||
> +	    xfs_sb_version_hasrealtime(&rifc->sc->mp->m_sb))
> +		return -EOPNOTSUPP;
> +
> +	/* XXX: find rt blocks too */

Ok, needs a comment up front that realtime repair isn't supported,
rather than hiding it down here.

> +	for (agno = 0; agno < rifc->sc->mp->m_sb.sb_agcount; agno++) {
> +		error = xfs_repair_inode_count_ag_rmaps(rifc, agno);
> +		if (error)
> +			return error;
> +	}

	/* rt not supported yet */
	ASSERT(rifc->rt_extents == 0);

> +	/* Can't have extents on both the rt and the data device. */
> +	if (rifc->data_extents && rifc->rt_extents)
> +		return -EFSCORRUPTED;
> +
> +	return 0;
> +}
> +
> +/* Figure out if we need to zap this extents format fork. */
> +STATIC bool
> +xfs_repair_inode_core_check_extents_fork(


Urk. Not sure what this function is supposed to be doing from the
name. xrep_ifork_extent_check()? And then the next function
becomes xrep_ifork_btree_check()?

Also document the return values.

> +	struct xfs_scrub_context	*sc,
> +	struct xfs_dinode		*dip,
> +	int				dfork_size,
> +	int				whichfork)
> +{
> +	struct xfs_bmbt_irec		new;
> +	struct xfs_bmbt_rec		*dp;
> +	bool				isrt;
> +	int				i;
> +	int				nex;
> +	int				fork_size;
> +
> +	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> +	fork_size = nex * sizeof(struct xfs_bmbt_rec);
> +	if (fork_size < 0 || fork_size > dfork_size)
> +		return true;

Check nex against dip->di_nextents?

> +	dp = (struct xfs_bmbt_rec *)XFS_DFORK_PTR(dip, whichfork);
> +
> +	isrt = dip->di_flags & cpu_to_be16(XFS_DIFLAG_REALTIME);
> +	for (i = 0; i < nex; i++, dp++) {
> +		xfs_failaddr_t	fa;
> +
> +		xfs_bmbt_disk_get_all(dp, &new);
> +		fa = xfs_bmbt_validate_extent(sc->mp, isrt, whichfork, &new);
> +		if (fa)
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +/* Figure out if we need to zap this btree format fork. */
> +STATIC bool
> +xfs_repair_inode_core_check_btree_fork(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_dinode		*dip,
> +	int				dfork_size,
> +	int				whichfork)
> +{
> +	struct xfs_bmdr_block		*dfp;
> +	int				nrecs;
> +	int				level;
> +
> +	if (XFS_DFORK_NEXTENTS(dip, whichfork) <=
> +			dfork_size / sizeof(struct xfs_bmbt_irec))
> +		return true;

check against dip->di_nextents?

> +	dfp = (struct xfs_bmdr_block *)XFS_DFORK_PTR(dip, whichfork);
> +	nrecs = be16_to_cpu(dfp->bb_numrecs);
> +	level = be16_to_cpu(dfp->bb_level);
> +
> +	if (nrecs == 0 || XFS_BMDR_SPACE_CALC(nrecs) > dfork_size)
> +		return true;
> +	if (level == 0 || level > XFS_BTREE_MAXLEVELS)
> +		return true;

Should this visit the bmbt blocks to check the level is actually
correct?

> +	return false;
> +}
> +
> +/*
> + * Check the data fork for things that will fail the ifork verifiers or the
> + * ifork formatters.
> + */
> +STATIC bool
> +xfs_repair_inode_core_check_data_fork(

xrep_ifork_check_data()

> +	struct xfs_scrub_context	*sc,
> +	struct xfs_dinode		*dip,
> +	uint16_t			mode)
> +{
> +	uint64_t			size;
> +	int				dfork_size;
> +
> +	size = be64_to_cpu(dip->di_size);
> +	switch (mode & S_IFMT) {
> +	case S_IFIFO:
> +	case S_IFCHR:
> +	case S_IFBLK:
> +	case S_IFSOCK:
> +		if (XFS_DFORK_FORMAT(dip, XFS_DATA_FORK) != XFS_DINODE_FMT_DEV)
> +			return true;
> +		break;
> +	case S_IFREG:
> +	case S_IFLNK:
> +	case S_IFDIR:
> +		switch (XFS_DFORK_FORMAT(dip, XFS_DATA_FORK)) {
> +		case XFS_DINODE_FMT_LOCAL:

local format is not valid for S_IFREG.

> +		case XFS_DINODE_FMT_EXTENTS:
> +		case XFS_DINODE_FMT_BTREE:
> +			break;
> +		default:
> +			return true;
> +		}
> +		break;
> +	default:
> +		return true;
> +	}
> +	dfork_size = XFS_DFORK_SIZE(dip, sc->mp, XFS_DATA_FORK);
> +	switch (XFS_DFORK_FORMAT(dip, XFS_DATA_FORK)) {
> +	case XFS_DINODE_FMT_DEV:
> +		break;
> +	case XFS_DINODE_FMT_LOCAL:
> +		if (size > dfork_size)
> +			return true;
> +		break;
> +	case XFS_DINODE_FMT_EXTENTS:
> +		if (xfs_repair_inode_core_check_extents_fork(sc, dip,
> +				dfork_size, XFS_DATA_FORK))
> +			return true;
> +		break;
> +	case XFS_DINODE_FMT_BTREE:
> +		if (xfs_repair_inode_core_check_btree_fork(sc, dip,
> +				dfork_size, XFS_DATA_FORK))
> +			return true;
> +		break;
> +	default:
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +/* Reset the data fork to something sane. */
> +STATIC void
> +xfs_repair_inode_core_zap_data_fork(

xrep_ifork_zap_data()

(you get the idea :P)

> +	struct xfs_scrub_context	*sc,
> +	struct xfs_dinode		*dip,
> +	uint16_t			mode,
> +	struct xfs_repair_inode_fork_counters	*rifc)

(structure names can change, too :)

> +{
> +	char				*p;
> +	const struct xfs_dir_ops	*ops;
> +	struct xfs_dir2_sf_hdr		*sfp;
> +	int				i8count;
> +
> +	/* Special files always get reset to DEV */
> +	switch (mode & S_IFMT) {
> +	case S_IFIFO:
> +	case S_IFCHR:
> +	case S_IFBLK:
> +	case S_IFSOCK:
> +		dip->di_format = XFS_DINODE_FMT_DEV;
> +		dip->di_size = 0;
> +		return;
> +	}
> +
> +	/*
> +	 * If we have data extents, reset to an empty map and hope the user
> +	 * will run the bmapbtd checker next.
> +	 */
> +	if (rifc->data_extents || rifc->rt_extents || S_ISREG(mode)) {
> +		dip->di_format = XFS_DINODE_FMT_EXTENTS;
> +		dip->di_nextents = 0;
> +		return;
> +	}

Does the userspace tool run the bmapbtd checker next?

> +	/* Otherwise, reset the local format to the minimum. */
> +	switch (mode & S_IFMT) {
> +	case S_IFLNK:
> +		/* Blow out symlink; now it points to root dir */
> +		dip->di_format = XFS_DINODE_FMT_LOCAL;
> +		dip->di_size = cpu_to_be64(1);
> +		p = XFS_DFORK_PTR(dip, XFS_DATA_FORK);
> +		*p = '/';

Maybe factor this so zero length symlinks can be set directly to the
same thing? FWIW, would making it point at '.' be better than '/' so
it always points within the same filesystem?

> +		break;
> +	case S_IFDIR:
> +		/*
> +		 * Blow out dir, make it point to the root.  In the
> +		 * future the direction repair will reconstruct this
> +		 * dir for us.
> +		 */

s/the direction//

> +		dip->di_format = XFS_DINODE_FMT_LOCAL;
> +		i8count = sc->mp->m_sb.sb_rootino > XFS_DIR2_MAX_SHORT_INUM;
> +		ops = xfs_dir_get_ops(sc->mp, NULL);
> +		sfp = (struct xfs_dir2_sf_hdr *)XFS_DFORK_PTR(dip,
> +				XFS_DATA_FORK);
> +		sfp->count = 0;
> +		sfp->i8count = i8count;
> +		ops->sf_put_parent_ino(sfp, sc->mp->m_sb.sb_rootino);
> +		dip->di_size = cpu_to_be64(xfs_dir2_sf_hdr_size(i8count));

What happens now if this dir has an ancestor still pointing at it?
Haven't we just screwed the directory structure? How does this
interact with the dentry cache, esp. w.r.t. disconnected dentries
(filehandle lookups)?

> +/*
> + * Check the attr fork for things that will fail the ifork verifiers or the
> + * ifork formatters.
> + */
> +STATIC bool
> +xfs_repair_inode_core_check_attr_fork(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_dinode		*dip)
> +{
> +	struct xfs_attr_shortform	*sfp;
> +	int				size;
> +
> +	if (XFS_DFORK_BOFF(dip) == 0)
> +		return dip->di_aformat != XFS_DINODE_FMT_EXTENTS ||
> +		       dip->di_anextents != 0;
> +
> +	size = XFS_DFORK_SIZE(dip, sc->mp, XFS_ATTR_FORK);
> +	switch (XFS_DFORK_FORMAT(dip, XFS_ATTR_FORK)) {
> +	case XFS_DINODE_FMT_LOCAL:
> +		sfp = (struct xfs_attr_shortform *)XFS_DFORK_PTR(dip,
> +				XFS_ATTR_FORK);

As a side note, we should make XFS_DFORK_PTR() return a void * so we
don't need casts like this.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 06/21] xfs: repair free space btrees
  2018-06-27  3:21   ` Dave Chinner
@ 2018-07-04  2:15     ` Darrick J. Wong
  2018-07-04  2:25       ` Dave Chinner
  0 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-04  2:15 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Wed, Jun 27, 2018 at 01:21:23PM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:24:07PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Rebuild the free space btrees from the gaps in the rmap btree.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> ......
> > +
> > +/* Collect an AGFL block for the not-to-release list. */
> > +static int
> > +xfs_repair_collect_agfl_block(
> > +	struct xfs_mount		*mp,
> > +	xfs_agblock_t			bno,
> > +	void				*priv)

Whoah, I never replied to this.  Oops. :(

> /me now gets confused by agfl code (xfs_repair_agfl_...) collecting btree
> blocks, and now the btree code (xfs_repair_collect_agfl... )
> collecting agfl blocks.
> 
> The naming/namespace collision is not that nice. I think this needs
> to be xr_allocbt_collect_agfl_blocks().

> /me idly wonders about consistently renaming everything abt, bnbt, cnbt,
> fibt, ibt, rmbt and rcbt...

Hmm, I'll think about a mass rename. :)

xfs_repair_refcountbt_fiddle_faddle(...);

xrep_rcbt_fiddle_faddle(...);

xrpr_rcbt_fiddle_faddle(...);

xrprrcbt_fiddle_faddle(...);

Yeah, maybe that third one.

> 
> > +/*
> > + * Iterate all reverse mappings to find (1) the free extents, (2) the OWN_AG
> > + * extents, (3) the rmapbt blocks, and (4) the AGFL blocks.  The free space is
> > + * (1) + (2) - (3) - (4).  Figure out if we have enough free space to
> > + * reconstruct the free space btrees.  Caller must clean up the input lists
> > + * if something goes wrong.
> > + */
> > +STATIC int
> > +xfs_repair_allocbt_find_freespace(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*free_extents,
> > +	struct xfs_repair_extent_list	*old_allocbt_blocks)
> > +{
> > +	struct xfs_repair_alloc		ra;
> > +	struct xfs_repair_alloc_extent	*rae;
> > +	struct xfs_btree_cur		*cur;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	xfs_agblock_t			agend;
> > +	xfs_agblock_t			nr_blocks;
> > +	int				error;
> > +
> > +	ra.extlist = free_extents;
> > +	ra.btlist = old_allocbt_blocks;
> > +	xfs_repair_init_extent_list(&ra.nobtlist);
> > +	ra.next_bno = 0;
> > +	ra.nr_records = 0;
> > +	ra.nr_blocks = 0;
> > +	ra.sc = sc;
> > +
> > +	/*
> > +	 * Iterate all the reverse mappings to find gaps in the physical
> > +	 * mappings, all the OWN_AG blocks, and all the rmapbt extents.
> > +	 */
> > +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
> > +	error = xfs_rmap_query_all(cur, xfs_repair_alloc_extent_fn, &ra);
> > +	if (error)
> > +		goto err;
> > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > +	cur = NULL;
> > +
> > +	/* Insert a record for space between the last rmap and EOAG. */
> > +	agend = be32_to_cpu(XFS_BUF_TO_AGF(sc->sa.agf_bp)->agf_length);
> > +	if (ra.next_bno < agend) {
> > +		rae = kmem_alloc(sizeof(struct xfs_repair_alloc_extent),
> > +				KM_MAYFAIL);
> > +		if (!rae) {
> > +			error = -ENOMEM;
> > +			goto err;
> > +		}
> > +		INIT_LIST_HEAD(&rae->list);
> > +		rae->bno = ra.next_bno;
> > +		rae->len = agend - ra.next_bno;
> > +		list_add_tail(&rae->list, free_extents);
> > +		ra.nr_records++;
> > +	}
> > +
> > +	/* Collect all the AGFL blocks. */
> > +	error = xfs_agfl_walk(mp, XFS_BUF_TO_AGF(sc->sa.agf_bp),
> > +			sc->sa.agfl_bp, xfs_repair_collect_agfl_block, &ra);
> > +	if (error)
> > +		goto err;
> > +
> > +	/* Do we actually have enough space to do this? */
> > +	nr_blocks = 2 * xfs_allocbt_calc_size(mp, ra.nr_records);
> 
> 	/* Do we have enough space to rebuild both freespace trees? */
> 
> (explains the multiplication by 2)

Yep, will fix.
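To make the doubling concrete, here is a hedged userspace sketch — not the kernel's xfs_allocbt_calc_size(), whose exact per-block geometry isn't reproduced here — of a worst-case block estimate for one btree, which the repair code then doubles because the bnobt and cntbt index the same set of free extents:

```c
#include <assert.h>

/*
 * Hedged sketch, not the real xfs_allocbt_calc_size(): estimate the
 * worst-case number of blocks for a btree holding nr_records, given the
 * minimum records per leaf block and minimum keys per node block.  The
 * repair code doubles the result because the bnobt and cntbt both index
 * the same set of free extents.
 */
static unsigned long btree_worst_case_blocks(unsigned long nr_records,
					     unsigned long leaf_min,
					     unsigned long node_min)
{
	/* number of blocks needed at the leaf level */
	unsigned long level_blocks = (nr_records + leaf_min - 1) / leaf_min;
	unsigned long total = level_blocks;

	/* walk up the node levels until a single root block remains */
	while (level_blocks > 1) {
		level_blocks = (level_blocks + node_min - 1) / node_min;
		total += level_blocks;
	}
	return total;
}
```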

> > +	if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE) ||
> > +	    ra.nr_blocks < nr_blocks) {
> > +		error = -ENOSPC;
> > +		goto err;
> > +	}
> > +
> > +	/* Compute the old bnobt/cntbt blocks. */
> > +	error = xfs_repair_subtract_extents(sc, old_allocbt_blocks,
> > +			&ra.nobtlist);
> > +	if (error)
> > +		goto err;
> > +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> > +	return 0;
> > +
> > +err:
> > +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> > +	if (cur)
> > +		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> > +	return error;
> 
> Error stacking here can be cleaned up - we don't need an extra stack
> as the cursor is NULL when finished with. Hence it could just be:
> 
> 	/* Compute the old bnobt/cntbt blocks. */
> 	error = xfs_repair_subtract_extents(sc, old_allocbt_blocks,
> 			&ra.nobtlist);
> err:
> 	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> 	if (cur)
> 		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);

TBH I've been tempted for years to refactor this thing to take error
directly rather than require this XFS_BTREE_{NO,}ERROR business.
There are only two choices, and we nearly always decide using error == 0.

> 	return error;
> }
> 
> 
> > +}
> > +
> > +/*
> > + * Reset the global free block counter and the per-AG counters to make it look
> > + * like this AG has no free space.
> > + */
> > +STATIC int
> > +xfs_repair_allocbt_reset_counters(
> > +	struct xfs_scrub_context	*sc,
> > +	int				*log_flags)
> > +{
> > +	struct xfs_perag		*pag = sc->sa.pag;
> > +	struct xfs_agf			*agf;
> > +	xfs_extlen_t			oldf;
> > +	xfs_agblock_t			rmap_blocks;
> > +	int				error;
> > +
> > +	/*
> > +	 * Since we're abandoning the old bnobt/cntbt, we have to
> > +	 * decrease fdblocks by the # of blocks in those trees.
> > +	 * btreeblks counts the non-root blocks of the free space
> > +	 * and rmap btrees.  Do this before resetting the AGF counters.
> 
> Comment can use 80 columns.
> 
> > +	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
> > +	rmap_blocks = be32_to_cpu(agf->agf_rmap_blocks) - 1;
> > +	oldf = pag->pagf_btreeblks + 2;
> > +	oldf -= rmap_blocks;
> 
> Convoluted. The comment really didn't help me understand what oldf
> is accounting.
> 
> Ah, rmap_blocks is actually the new btreeblks count. OK.
> 
> 	/*
> 	 * Since we're abandoning the old bnobt/cntbt, we have to decrease
> 	 * fdblocks by the # of blocks in those trees.  btreeblks counts the
> 	 * non-root blocks of the free space and rmap btrees.  Do this before
> 	 * resetting the AGF counters.
> 	 */
> 
> 	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
> 
> 	/* rmap_blocks accounts root block, btreeblks doesn't */
> 	new_btblks = be32_to_cpu(agf->agf_rmap_blocks) - 1;
> 
> 	/* btreeblks doesn't account bno/cnt root blocks */
> 	to_free = pag->pagf_btreeblks + 2;
> 
> 	/* and don't account for the blocks we aren't freeing */
> 	to_free -= new_btblks;

Ok, I'll do that.
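For reference, the arithmetic spelled out in those suggested comments can be rendered as a standalone helper; a hedged userspace sketch (the helper and its parameter names are illustrative, borrowed from the quoted fields):

```c
#include <assert.h>

/*
 * Illustrative sketch of the accounting above, not kernel code: compute
 * how many blocks to subtract from fdblocks when abandoning the old
 * bnobt/cntbt.  agf_rmap_blocks counts the rmapbt root block while
 * pagf_btreeblks does not count the bno/cnt root blocks, hence the
 * -1 and +2 adjustments.
 */
static unsigned long allocbt_blocks_to_free(unsigned long pagf_btreeblks,
					    unsigned long agf_rmap_blocks)
{
	/* rmap_blocks accounts the root block, btreeblks doesn't */
	unsigned long new_btblks = agf_rmap_blocks - 1;

	/* btreeblks doesn't account the bno/cnt root blocks */
	unsigned long to_free = pagf_btreeblks + 2;

	/* and don't account for the rmapbt blocks we aren't freeing */
	return to_free - new_btblks;
}
```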

> 
> > +	error = xfs_mod_fdblocks(sc->mp, -(int64_t)oldf, false);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Reset the per-AG info, both incore and ondisk. */
> > +	pag->pagf_btreeblks = rmap_blocks;
> > +	pag->pagf_freeblks = 0;
> > +	pag->pagf_longest = 0;
> > +
> > +	agf->agf_btreeblks = cpu_to_be32(pag->pagf_btreeblks);
> 
> I'd prefer that you use new_btblks here, too. Easier to see at a
> glance that the on-disk agf is being set to the new value....

Ok.

> 
> 
> > +	agf->agf_freeblks = 0;
> > +	agf->agf_longest = 0;
> > +	*log_flags |= XFS_AGF_BTREEBLKS | XFS_AGF_LONGEST | XFS_AGF_FREEBLKS;
> > +
> > +	return 0;
> > +}
> > +
> > +/* Initialize new bnobt/cntbt roots and implant them into the AGF. */
> > +STATIC int
> > +xfs_repair_allocbt_reset_btrees(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*free_extents,
> > +	int				*log_flags)
> > +{
> > +	struct xfs_owner_info		oinfo;
> > +	struct xfs_repair_alloc_extent	*cached = NULL;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_perag		*pag = sc->sa.pag;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	struct xfs_agf			*agf;
> > +	xfs_fsblock_t			bnofsb;
> > +	xfs_fsblock_t			cntfsb;
> > +	int				error;
> > +
> > +	/* Allocate new bnobt root. */
> > +	bnofsb = xfs_repair_allocbt_alloc_block(sc, free_extents, &cached);
> > +	if (bnofsb == NULLFSBLOCK)
> > +		return -ENOSPC;
> 
> Does this happen after the free extent list has been sorted by bno
> order? It really should, that way the new root is as close to the
> the AGF as possible, and the new btree blocks will also tend to
> cluster towards the lower AG offsets.

Will do.
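A hedged userspace sketch of the idea: sort the candidate free extents by start block first, so that picking from the head of the list places the new roots (and the btree blocks allocated after them) at the lowest AG offsets, near the AGF. The struct and comparator below are illustrative stand-ins for the kernel's list_sort() comparison:

```c
#include <assert.h>
#include <stdlib.h>

/* illustrative stand-in for struct xfs_repair_alloc_extent */
struct free_ext {
	unsigned int	bno;	/* AG block number */
	unsigned int	len;	/* length in blocks */
};

/* qsort comparator: ascending start block, like a bno-order list_sort */
static int free_ext_cmp(const void *a, const void *b)
{
	const struct free_ext *x = a;
	const struct free_ext *y = b;

	return (x->bno > y->bno) - (x->bno < y->bno);
}
```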

> > +	/* Allocate new cntbt root. */
> > +	cntfsb = xfs_repair_allocbt_alloc_block(sc, free_extents, &cached);
> > +	if (cntfsb == NULLFSBLOCK)
> > +		return -ENOSPC;
> > +
> > +	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
> > +	/* Initialize new bnobt root. */
> > +	error = xfs_repair_init_btblock(sc, bnofsb, &bp, XFS_BTNUM_BNO,
> > +			&xfs_allocbt_buf_ops);
> > +	if (error)
> > +		return error;
> > +	agf->agf_roots[XFS_BTNUM_BNOi] =
> > +			cpu_to_be32(XFS_FSB_TO_AGBNO(mp, bnofsb));
> > +	agf->agf_levels[XFS_BTNUM_BNOi] = cpu_to_be32(1);
> > +
> > +	/* Initialize new cntbt root. */
> > +	error = xfs_repair_init_btblock(sc, cntfsb, &bp, XFS_BTNUM_CNT,
> > +			&xfs_allocbt_buf_ops);
> > +	if (error)
> > +		return error;
> > +	agf->agf_roots[XFS_BTNUM_CNTi] =
> > +			cpu_to_be32(XFS_FSB_TO_AGBNO(mp, cntfsb));
> > +	agf->agf_levels[XFS_BTNUM_CNTi] = cpu_to_be32(1);
> > +
> > +	/* Add rmap records for the btree roots */
> > +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
> > +	error = xfs_rmap_alloc(sc->tp, sc->sa.agf_bp, sc->sa.agno,
> > +			XFS_FSB_TO_AGBNO(mp, bnofsb), 1, &oinfo);
> > +	if (error)
> > +		return error;
> > +	error = xfs_rmap_alloc(sc->tp, sc->sa.agf_bp, sc->sa.agno,
> > +			XFS_FSB_TO_AGBNO(mp, cntfsb), 1, &oinfo);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Reset the incore state. */
> > +	pag->pagf_levels[XFS_BTNUM_BNOi] = 1;
> > +	pag->pagf_levels[XFS_BTNUM_CNTi] = 1;
> > +
> > +	*log_flags |=  XFS_AGF_ROOTS | XFS_AGF_LEVELS;
> > +	return 0;
> 
> Rather than duplicating all this init code twice, would factoring it
> make sense? The only difference between the alloc/init of the two
> btrees is the array index that info is stored in....

Yeah, it would.

> > +}
> > +
> > +/* Build new free space btrees and dispose of the old one. */
> > +STATIC int
> > +xfs_repair_allocbt_rebuild_trees(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*free_extents,
> > +	struct xfs_repair_extent_list	*old_allocbt_blocks)
> > +{
> > +	struct xfs_owner_info		oinfo;
> > +	struct xfs_repair_alloc_extent	*rae;
> > +	struct xfs_repair_alloc_extent	*n;
> > +	struct xfs_repair_alloc_extent	*longest;
> > +	int				error;
> > +
> > +	xfs_rmap_skip_owner_update(&oinfo);
> > +
> > +	/*
> > +	 * Insert the longest free extent in case it's necessary to
> > +	 * refresh the AGFL with multiple blocks.  If there is no longest
> > +	 * extent, we had exactly the free space we needed; we're done.
> > +	 */
> > +	longest = xfs_repair_allocbt_get_longest(free_extents);
> > +	if (!longest)
> > +		goto done;
> > +	error = xfs_repair_allocbt_free_extent(sc,
> > +			XFS_AGB_TO_FSB(sc->mp, sc->sa.agno, longest->bno),
> > +			longest->len, &oinfo);
> > +	list_del(&longest->list);
> > +	kmem_free(longest);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Insert records into the new btrees. */
> > +	list_sort(NULL, free_extents, xfs_repair_allocbt_extent_cmp);
> 
> Hmmm. I guess list sorting doesn't occur before allocating new root
> blocks. Can this get moved?

Certainly.

> ....
> 
> > +bool
> > +xfs_extent_busy_list_empty(
> > +	struct xfs_perag	*pag);
> 
> One line form for header prototypes, please.

Ok.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH 07/21] xfs: repair inode btrees
  2018-06-28  0:55   ` Dave Chinner
@ 2018-07-04  2:22     ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-04  2:22 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Thu, Jun 28, 2018 at 10:55:16AM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:24:13PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Use the rmapbt to find inode chunks, query the chunks to compute
> > hole and free masks, and with that information rebuild the inobt
> > and finobt.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> [....]
> 
> > +/*
> > + * For each cluster in this blob of inode, we must calculate the
> > + * properly aligned startino of that cluster, then iterate each
> > + * cluster to fill in used and filled masks appropriately.  We
> > + * then use the (startino, used, filled) information to construct
> > + * the appropriate inode records.
> > + */
> > +STATIC int
> > +xfs_repair_ialloc_process_cluster(
> > +	struct xfs_repair_ialloc	*ri,
> > +	xfs_agblock_t			agbno,
> > +	int				blks_per_cluster,
> > +	xfs_agino_t			rec_agino)
> > +{
> > +	struct xfs_imap			imap;
> > +	struct xfs_repair_ialloc_extent	*rie;
> > +	struct xfs_dinode		*dip;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_scrub_context	*sc = ri->sc;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	xfs_ino_t			fsino;
> > +	xfs_inofree_t			usedmask;
> > +	xfs_agino_t			nr_inodes;
> > +	xfs_agino_t			startino;
> > +	xfs_agino_t			clusterino;
> > +	xfs_agino_t			clusteroff;
> > +	xfs_agino_t			agino;
> > +	uint16_t			fillmask;
> > +	bool				inuse;
> > +	int				usedcount;
> > +	int				error;
> > +
> > +	/* The per-AG inum of this inode cluster. */
> > +	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
> > +
> > +	/* The per-AG inum of the inobt record. */
> > +	startino = rec_agino + rounddown(agino - rec_agino,
> > +			XFS_INODES_PER_CHUNK);
> > +
> > +	/* The per-AG inum of the cluster within the inobt record. */
> > +	clusteroff = agino - startino;
> > +
> > +	/* Every inode in this holemask slot is filled. */
> > +	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
> > +	fillmask = xfs_inobt_maskn(clusteroff / XFS_INODES_PER_HOLEMASK_BIT,
> > +			nr_inodes / XFS_INODES_PER_HOLEMASK_BIT);
> > +
> > +	/* Grab the inode cluster buffer. */
> > +	imap.im_blkno = XFS_AGB_TO_DADDR(mp, sc->sa.agno, agbno);
> > +	imap.im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
> > +	imap.im_boffset = 0;
> > +
> > +	error = xfs_imap_to_bp(mp, sc->tp, &imap, &dip, &bp, 0,
> > +			XFS_IGET_UNTRUSTED);
> 
> This is going to error out if the cluster we are asking to be mapped
> has no record in the inobt.

It does?  xfs_imap_to_bp is a straightforward wrapper around
xfs_trans_read_buf and xfs_buf_offset; it never consults the inobt.
If the inode buffer verifiers trigger then yes we'll blow out to
userspace, but the inobt can be totally trashed and that won't cause
this to fail.

<confused>

> Aren't we trying to rebuild the inobt here from the rmap's idea of
> on-disk clusters? So how do we rebuild the inobt record if we can't
> already find the chunk record in the inobt?
> 
> At minimum, this needs a comment explaining why it works.

/*
 * Having manually mapped part of a reverse-mapping record to an inode
 * cluster map, use the map to read the inode cluster directly off the
 * disk.
 */
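For what it's worth, the alignment math in the quoted hunk can be exercised in isolation; a hedged userspace sketch (helper name is illustrative; the chunk-size constant matches XFS_INODES_PER_CHUNK in the real format):

```c
#include <assert.h>

#define INODES_PER_CHUNK	64	/* matches XFS_INODES_PER_CHUNK */

/*
 * Illustrative sketch of the startino computation from the quoted hunk:
 * align the cluster's first AG inode number down to the inobt record
 * boundary.  The cluster's offset within the record (clusteroff) is then
 * agino - startino.
 */
static unsigned int cluster_startino(unsigned int rec_agino,
				     unsigned int agino)
{
	/* rec_agino + rounddown(agino - rec_agino, INODES_PER_CHUNK) */
	return rec_agino +
		((agino - rec_agino) / INODES_PER_CHUNK) * INODES_PER_CHUNK;
}
```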

> > +/* Initialize new inobt/finobt roots and implant them into the AGI. */
> > +STATIC int
> > +xfs_repair_iallocbt_reset_btrees(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_owner_info		*oinfo,
> > +	int				*log_flags)
> > +{
> > +	struct xfs_agi			*agi;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	xfs_fsblock_t			inofsb;
> > +	xfs_fsblock_t			finofsb;
> > +	enum xfs_ag_resv_type		resv;
> > +	int				error;
> > +
> > +	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
> > +
> > +	/* Initialize new inobt root. */
> > +	resv = XFS_AG_RESV_NONE;
> > +	error = xfs_repair_alloc_ag_block(sc, oinfo, &inofsb, resv);
> > +	if (error)
> > +		return error;
> > +	error = xfs_repair_init_btblock(sc, inofsb, &bp, XFS_BTNUM_INO,
> > +			&xfs_inobt_buf_ops);
> > +	if (error)
> > +		return error;
> > +	agi->agi_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, inofsb));
> > +	agi->agi_level = cpu_to_be32(1);
> > +	*log_flags |= XFS_AGI_ROOT | XFS_AGI_LEVEL;
> > +
> > +	/* Initialize new finobt root. */
> > +	if (!xfs_sb_version_hasfinobt(&mp->m_sb))
> > +		return 0;
> > +
> > +	resv = mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA;
> 
> Comment explaining this?

m_inotbt_nores (which, ugh, why isn't that xfs_finobt_nores?) indicates
whether we succeeded at making per-AG reservations for finobt expansion.  If
not, then don't bother.

/*
 * If we successfully reserved space for finobt expansion, use that
 * reservation for the rebuilt btree.
 */

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: [PATCH 12/21] xfs: repair refcount btrees
  2018-07-03  5:50   ` Dave Chinner
@ 2018-07-04  2:23     ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-04  2:23 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, Jul 03, 2018 at 03:50:04PM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:24:45PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Reconstruct the refcount data from the rmap btree.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Seems reasonable, though my brain turns to mush when trying to work
> out the code that turned the rmap records into refcounts :/

Ok, thanks for the review!  Enjoy your vacation if I don't make it back
on here before you go. :)

--D

> Reviewed-by: Dave Chinner <dchinner@redhat.com>
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: [PATCH 06/21] xfs: repair free space btrees
  2018-07-04  2:15     ` Darrick J. Wong
@ 2018-07-04  2:25       ` Dave Chinner
  0 siblings, 0 replies; 77+ messages in thread
From: Dave Chinner @ 2018-07-04  2:25 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jul 03, 2018 at 07:15:04PM -0700, Darrick J. Wong wrote:
> On Wed, Jun 27, 2018 at 01:21:23PM +1000, Dave Chinner wrote:
> > On Sun, Jun 24, 2018 at 12:24:07PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Rebuild the free space btrees from the gaps in the rmap btree.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > ......
> > > +	if (error)
> > > +		goto err;
> > > +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> > > +	return 0;
> > > +
> > > +err:
> > > +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> > > +	if (cur)
> > > +		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> > > +	return error;
> > 
> > Error stacking here can be cleaned up - we don't need an extra stack
> > as the cursor is NULL when finished with. Hence it could just be:
> > 
> > 	/* Compute the old bnobt/cntbt blocks. */
> > 	error = xfs_repair_subtract_extents(sc, old_allocbt_blocks,
> > 			&ra.nobtlist);
> > err:
> > 	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> > 	if (cur)
> > 		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> 
> TBH I've been tempted for years to refactor this thing to take error
> directly rather than require this XFS_BTREE_{NO,}ERROR business.
> There's only two choices, and we nearly always decide using error == 0.

Yeah, that would make for a nice cleanup. Add it to the TODO list?
Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 15/21] xfs: repair inode block maps
  2018-06-24 19:25 ` [PATCH 15/21] xfs: repair inode block maps Darrick J. Wong
@ 2018-07-04  3:00   ` Dave Chinner
  2018-07-04  3:41     ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-04  3:00 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:25:04PM -0700, Darrick J. Wong wrote:
> +#include "scrub/repair.h"
> +
> +/* Inode fork block mapping (BMBT) repair. */
> +
> +struct xfs_repair_bmap_extent {
> +	struct list_head		list;
> +	struct xfs_rmap_irec		rmap;
> +	xfs_agnumber_t			agno;
> +};
> +
> +struct xfs_repair_bmap {
> +	struct list_head		*extlist;
> +	struct xfs_repair_extent_list	*btlist;
> +	struct xfs_scrub_context	*sc;
> +	xfs_ino_t			ino;
> +	xfs_rfsblock_t			otherfork_blocks;
> +	xfs_rfsblock_t			bmbt_blocks;
> +	xfs_extnum_t			extents;
> +	int				whichfork;
> +};
> +
> +/* Record extents that belong to this inode's fork. */
> +STATIC int
> +xfs_repair_bmap_extent_fn(
> +	struct xfs_btree_cur		*cur,
> +	struct xfs_rmap_irec		*rec,
> +	void				*priv)
> +{
> +	struct xfs_repair_bmap		*rb = priv;
> +	struct xfs_repair_bmap_extent	*rbe;
> +	struct xfs_mount		*mp = cur->bc_mp;
> +	xfs_fsblock_t			fsbno;
> +	int				error = 0;
> +
> +	if (xfs_scrub_should_terminate(rb->sc, &error))
> +		return error;
> +
> +	/* Skip extents which are not owned by this inode and fork. */
> +	if (rec->rm_owner != rb->ino) {
> +		return 0;
> +	} else if (rb->whichfork == XFS_DATA_FORK &&
> +		 (rec->rm_flags & XFS_RMAP_ATTR_FORK)) {
> +		rb->otherfork_blocks += rec->rm_blockcount;
> +		return 0;
> +	} else if (rb->whichfork == XFS_ATTR_FORK &&
> +		 !(rec->rm_flags & XFS_RMAP_ATTR_FORK)) {
> +		rb->otherfork_blocks += rec->rm_blockcount;
> +		return 0;
> +	}
> +
> +	rb->extents++;

Shouldn't this be incremented after we've checked for and processed
old BMBT blocks?

> +	/* Delete the old bmbt blocks later. */
> +	if (rec->rm_flags & XFS_RMAP_BMBT_BLOCK) {
> +		fsbno = XFS_AGB_TO_FSB(mp, cur->bc_private.a.agno,
> +				rec->rm_startblock);
> +		rb->bmbt_blocks += rec->rm_blockcount;
> +		return xfs_repair_collect_btree_extent(rb->sc, rb->btlist,
> +				fsbno, rec->rm_blockcount);
> +	}
....
> +
> +/* Check for garbage inputs. */
> +STATIC int
> +xfs_repair_bmap_check_inputs(
> +	struct xfs_scrub_context	*sc,
> +	int				whichfork)
> +{
> +	ASSERT(whichfork == XFS_DATA_FORK || whichfork == XFS_ATTR_FORK);
> +
> +	/* Don't know how to repair the other fork formats. */
> +	if (XFS_IFORK_FORMAT(sc->ip, whichfork) != XFS_DINODE_FMT_EXTENTS &&
> +	    XFS_IFORK_FORMAT(sc->ip, whichfork) != XFS_DINODE_FMT_BTREE)
> +		return -EOPNOTSUPP;
> +
> +	/* Only files, symlinks, and directories get to have data forks. */
> +	if (whichfork == XFS_DATA_FORK && !S_ISREG(VFS_I(sc->ip)->i_mode) &&
> +	    !S_ISDIR(VFS_I(sc->ip)->i_mode) && !S_ISLNK(VFS_I(sc->ip)->i_mode))
> +		return -EINVAL;

That'd be nicer as a switch statement.

> +
> +	/* If we somehow have delalloc extents, forget it. */
> +	if (whichfork == XFS_DATA_FORK && sc->ip->i_delayed_blks)
> +		return -EBUSY;

and this can be rolled into the same if (datafork) branch.

....
> +	if (!xfs_sb_version_hasrmapbt(&sc->mp->m_sb))
> +		return -EOPNOTSUPP;

Do this first?

Hmmm, and if you do the attr fork check second then the rest
of the code is all data fork. i.e.

	if (!rmap)
		return -EOPNOTSUPP
	if (attrfork) {
		if (no attr fork)
			return ....
		return 0
	}
	/* now do all data fork checks */

This becomes a lot easier to follow.

> +/*
> + * Collect block mappings for this fork of this inode and decide if we have
> + * enough space to rebuild.  Caller is responsible for cleaning up the list if
> + * anything goes wrong.
> + */
> +STATIC int
> +xfs_repair_bmap_find_mappings(
> +	struct xfs_scrub_context	*sc,
> +	int				whichfork,
> +	struct list_head		*mapping_records,
> +	struct xfs_repair_extent_list	*old_bmbt_blocks,
> +	xfs_rfsblock_t			*old_bmbt_block_count,
> +	xfs_rfsblock_t			*otherfork_blocks)
> +{
> +	struct xfs_repair_bmap		rb;
> +	xfs_agnumber_t			agno;
> +	unsigned int			resblks;
> +	int				error;
> +
> +	memset(&rb, 0, sizeof(rb));
> +	rb.extlist = mapping_records;
> +	rb.btlist = old_bmbt_blocks;
> +	rb.ino = sc->ip->i_ino;
> +	rb.whichfork = whichfork;
> +	rb.sc = sc;
> +
> +	/* Iterate the rmaps for extents. */
> +	for (agno = 0; agno < sc->mp->m_sb.sb_agcount; agno++) {
> +		error = xfs_repair_bmap_scan_ag(&rb, agno);
> +		if (error)
> +			return error;
> +	}
> +
> +	/*
> +	 * Guess how many blocks we're going to need to rebuild an entire bmap
> +	 * from the number of extents we found, and pump up our transaction to
> +	 * have sufficient block reservation.
> +	 */
> +	resblks = xfs_bmbt_calc_size(sc->mp, rb.extents);
> +	error = xfs_trans_reserve_more(sc->tp, resblks, 0);
> +	if (error)
> +		return error;

I don't really like this, but I can't think of a way around needing
it at the moment.

> +
> +	*otherfork_blocks = rb.otherfork_blocks;
> +	*old_bmbt_block_count = rb.bmbt_blocks;
> +	return 0;
> +}
> +
> +/* Update the inode counters. */
> +STATIC int
> +xfs_repair_bmap_reset_counters(
> +	struct xfs_scrub_context	*sc,
> +	xfs_rfsblock_t			old_bmbt_block_count,
> +	xfs_rfsblock_t			otherfork_blocks,
> +	int				*log_flags)
> +{
> +	int				error;
> +
> +	xfs_trans_ijoin(sc->tp, sc->ip, 0);
> +
> +	/*
> +	 * Drop the block counts associated with this fork since we'll re-add
> +	 * them with the bmap routines later.
> +	 */
> +	sc->ip->i_d.di_nblocks = otherfork_blocks;

This needs a little more explanation. i.e. that the rmap walk we
just performed for this fork also counted all the data and bmbt
blocks for the other fork so this is really only zeroing the block
count for the fork we are about to rebuild.

> +/* Initialize a new fork and implant it in the inode. */
> +STATIC void
> +xfs_repair_bmap_reset_fork(
> +	struct xfs_scrub_context	*sc,
> +	int				whichfork,
> +	bool				has_mappings,
> +	int				*log_flags)
> +{
> +	/* Set us back to extents format with zero records. */
> +	XFS_IFORK_FMT_SET(sc->ip, whichfork, XFS_DINODE_FMT_EXTENTS);
> +	XFS_IFORK_NEXT_SET(sc->ip, whichfork, 0);
> +
> +	/* Reinitialize the on-disk fork. */

I don't think this touches the on-disk fork - it's re-initialising
the in-memory fork.

> +	if (XFS_IFORK_PTR(sc->ip, whichfork) != NULL)
> +		xfs_idestroy_fork(sc->ip, whichfork);
> +	if (whichfork == XFS_DATA_FORK) {
> +		memset(&sc->ip->i_df, 0, sizeof(struct xfs_ifork));
> +		sc->ip->i_df.if_flags |= XFS_IFEXTENTS;
> +	} else if (whichfork == XFS_ATTR_FORK) {
> +		if (has_mappings) {
> +			sc->ip->i_afp = NULL;
> +		} else {
> +			sc->ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone,
> +					KM_SLEEP);
> +			sc->ip->i_afp->if_flags |= XFS_IFEXTENTS;
> +		}
> +	}
> +	*log_flags |= XFS_ILOG_CORE;
> +}
......

> +/* Repair an inode fork. */
> +STATIC int
> +xfs_repair_bmap(
> +	struct xfs_scrub_context	*sc,
> +	int				whichfork)
> +{
> +	struct list_head		mapping_records;
> +	struct xfs_repair_extent_list	old_bmbt_blocks;
> +	struct xfs_inode		*ip = sc->ip;
> +	xfs_rfsblock_t			old_bmbt_block_count;
> +	xfs_rfsblock_t			otherfork_blocks;
> +	int				log_flags = 0;
> +	int				error = 0;
> +
> +	error = xfs_repair_bmap_check_inputs(sc, whichfork);
> +	if (error)
> +		return error;
> +
> +	/*
> +	 * If this is a file data fork, wait for all pending directio to
> +	 * complete, then tear everything out of the page cache.
> +	 */
> +	if (S_ISREG(VFS_I(ip)->i_mode) && whichfork == XFS_DATA_FORK) {
> +		inode_dio_wait(VFS_I(ip));
> +		truncate_inode_pages(VFS_I(ip)->i_mapping, 0);
> +	}

Why would we be waiting only for DIO here? Haven't we already locked
up the inode, flushed dirty data, waited for dio and invalidated the
page cache when we called xfs_scrub_setup_inode_bmap() prior to
doing this work?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 14/21] xfs: zap broken inode forks
  2018-07-04  2:07   ` Dave Chinner
@ 2018-07-04  3:26     ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-04  3:26 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Wed, Jul 04, 2018 at 12:07:06PM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:24:57PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Determine if inode fork damage is responsible for the inode being unable
> > to pass the ifork verifiers in xfs_iget and zap the fork contents if
> > this is true.  Once this is done the fork will be empty but we'll be
> > able to construct an in-core inode, and a subsequent call to the inode
> > fork repair ioctl will search the rmapbt to rebuild the records that
> > were in the fork.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_attr_leaf.c |   32 ++-
> >  fs/xfs/libxfs/xfs_attr_leaf.h |    2 
> >  fs/xfs/libxfs/xfs_bmap.c      |   21 ++
> >  fs/xfs/libxfs/xfs_bmap.h      |    2 
> >  fs/xfs/scrub/inode_repair.c   |  399 +++++++++++++++++++++++++++++++++++++++++
> >  5 files changed, 437 insertions(+), 19 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> > index b3c19339e1b5..f6c458104934 100644
> > --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> > +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> > @@ -894,23 +894,16 @@ xfs_attr_shortform_allfit(
> >  	return xfs_attr_shortform_bytesfit(dp, bytes);
> >  }
> >  
> > -/* Verify the consistency of an inline attribute fork. */
> > +/* Verify the consistency of a raw inline attribute fork. */
> >  xfs_failaddr_t
> > -xfs_attr_shortform_verify(
> > -	struct xfs_inode		*ip)
> > +xfs_attr_shortform_verify_struct(
> > +	struct xfs_attr_shortform	*sfp,
> > +	size_t				size)
> 
> The internal structure checking functions in the directory code
> use the naming convention xfs_dir3_<type>_check(). I think we should use
> the same here. i.e. xfs_attr_sf_check().

Ok.

> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index b7f094e19bab..b1254e6c17b5 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -6223,18 +6223,16 @@ xfs_bmap_finish_one(
> >  	return error;
> >  }
> >  
> > -/* Check that an inode's extent does not have invalid flags or bad ranges. */
> > +/* Check that an extent does not have invalid flags or bad ranges. */
> >  xfs_failaddr_t
> > -xfs_bmap_validate_extent(
> > -	struct xfs_inode	*ip,
> > +xfs_bmbt_validate_extent(
> 
> xfs_bmbt_ prefixes should only appear in xfs_bmap_btree.c, not
> xfs_bmap.c....
> 
> So either it needs to get moved or renamed to something like
> xfs_bmap_validate_irec()?

Hmm, well the only difference between the two functions is that one
gets what it needs out of struct xfs_inode and the other takes all the
raw inputs.  They both take xfs_bmbt_irec, so...

xfs_failaddr_t
__xfs_bmbt_validate_irec(
	struct xfs_mount	*mp,
	bool			isrt,
	int			whichfork,
	struct xfs_bmbt_irec	*irec)

xfs_failaddr_t
xfs_bmap_validate_irec(
	struct xfs_inode	*ip,
	int			whichfork,
	struct xfs_bmbt_irec	*irec)

?

> 
> > diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c
> > index 4ac43c1b1eb0..b941f21d7667 100644
> > --- a/fs/xfs/scrub/inode_repair.c
> > +++ b/fs/xfs/scrub/inode_repair.c
> > @@ -22,11 +22,15 @@
> >  #include "xfs_ialloc.h"
> >  #include "xfs_da_format.h"
> >  #include "xfs_reflink.h"
> > +#include "xfs_alloc.h"
> >  #include "xfs_rmap.h"
> > +#include "xfs_rmap_btree.h"
> >  #include "xfs_bmap.h"
> > +#include "xfs_bmap_btree.h"
> >  #include "xfs_bmap_util.h"
> >  #include "xfs_dir2.h"
> >  #include "xfs_quota_defs.h"
> > +#include "xfs_attr_leaf.h"
> >  #include "scrub/xfs_scrub.h"
> >  #include "scrub/scrub.h"
> >  #include "scrub/common.h"
> > @@ -113,7 +117,8 @@ xfs_repair_inode_mode(
> >  STATIC void
> >  xfs_repair_inode_flags(
> >  	struct xfs_scrub_context	*sc,
> > -	struct xfs_dinode		*dip)
> > +	struct xfs_dinode		*dip,
> > +	bool				is_rt_file)
> >  {
> >  	struct xfs_mount		*mp = sc->mp;
> >  	uint64_t			flags2;
> > @@ -132,6 +137,10 @@ xfs_repair_inode_flags(
> >  		flags2 &= ~XFS_DIFLAG2_REFLINK;
> >  	if (flags2 & XFS_DIFLAG2_REFLINK)
> >  		flags2 &= ~XFS_DIFLAG2_DAX;
> > +	if (is_rt_file)
> > +		flags |= XFS_DIFLAG_REALTIME;
> > +	else
> > +		flags &= ~XFS_DIFLAG_REALTIME;
> 
> This needs to be done first. i.e. before we check things like rt vs
> reflink flags.

Ok.

> > @@ -210,17 +219,402 @@ xfs_repair_inode_extsize_hints(
> >  	}
> >  }
> >  
> > +struct xfs_repair_inode_fork_counters {
> > +	struct xfs_scrub_context	*sc;
> > +	xfs_rfsblock_t			data_blocks;
> > +	xfs_rfsblock_t			rt_blocks;
> 
> inode is either data or rt, not both. Why do you need two separate
> counters? Oh, bmbt blocks are always data, right? Comment, perhaps?

Ok.

/* blocks on the data device, including bmbt blocks. */
	xfs_rfsblock_t			data_blocks;

/* rt_blocks: blocks on the realtime device, if any. */
	xfs_rfsblock_t			rt_blocks;

> > +	xfs_rfsblock_t			attr_blocks;
> > +	xfs_extnum_t			data_extents;
> > +	xfs_extnum_t			rt_extents;
> 
> but bmbt blocks are not data extents, so only one counter here?

In theory we'd look at all the rmapbts and error out if we find this
inode's data fork blocks on both devices...

> > +/* Count extents and blocks for a given inode from all rmap data. */
> > +STATIC int
> > +xfs_repair_inode_count_rmaps(
> > +	struct xfs_repair_inode_fork_counters	*rifc)
> > +{
> > +	xfs_agnumber_t			agno;
> > +	int				error;
> > +
> > +	if (!xfs_sb_version_hasrmapbt(&rifc->sc->mp->m_sb) ||
> > +	    xfs_sb_version_hasrealtime(&rifc->sc->mp->m_sb))
> > +		return -EOPNOTSUPP;
> > +
> > +	/* XXX: find rt blocks too */
> 
> Ok, needs a comment up front that realtime repair isn't supported,
> rather than hiding it down here.

...but all this is clumsily roped off while there's no rt rmap. :)

> > +	for (agno = 0; agno < rifc->sc->mp->m_sb.sb_agcount; agno++) {
> > +		error = xfs_repair_inode_count_ag_rmaps(rifc, agno);
> > +		if (error)
> > +			return error;
> > +	}
> 
> 	/* rt not supported yet */
> 	ASSERT(rifc->rt_extents == 0);

I'll move the feature checking up to the start of xfs_repair_inode.

> > +	/* Can't have extents on both the rt and the data device. */
> > +	if (rifc->data_extents && rifc->rt_extents)
> > +		return -EFSCORRUPTED;
> > +
> > +	return 0;
> > +}
> > +
> > +/* Figure out if we need to zap this extents format fork. */
> > +STATIC bool
> > +xfs_repair_inode_core_check_extents_fork(
> 
> 
> Urk. Not sure what this function is supposed to be doing from the
> name. xrep_ifork_extent_check()? And then the next function
> becomes xrep_ifork_btree_check()?
> 
> Also document the return values.


/*
 * Decide if this extents-format inode fork looks like garbage.  If so,
 * return true.
 */

> 
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_dinode		*dip,
> > +	int				dfork_size,
> > +	int				whichfork)
> > +{
> > +	struct xfs_bmbt_irec		new;
> > +	struct xfs_bmbt_rec		*dp;
> > +	bool				isrt;
> > +	int				i;
> > +	int				nex;
> > +	int				fork_size;
> > +
> > +	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> > +	fork_size = nex * sizeof(struct xfs_bmbt_rec);
> > +	if (fork_size < 0 || fork_size > dfork_size)
> > +		return true;
> 
> Check nex against dip->di_nextents?

Isn't XFS_DFORK_NEXTENTS just dip->di_nextents?

Nevertheless, it should be range checked.
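An overflow-safe version of that range check can be modeled standalone like so; the record size and extent limit here are made-up stand-ins, not the real XFS constants:

```c
#include <stdint.h>

/* Illustrative values only, not the real XFS limits. */
#define FAKE_BMBT_REC_SIZE	16
#define FAKE_MAX_EXTNUM		((int64_t)1 << 31)

/*
 * Returns nonzero if the extents-format fork looks like garbage.
 * Range-checking nex first means the size computation below cannot
 * wrap, unlike computing "nex * sizeof(rec)" into a signed int and
 * then testing the result for "< 0".
 */
static int
fake_extents_fork_is_garbage(int64_t nex, int64_t dfork_size)
{
	if (nex < 0 || nex > FAKE_MAX_EXTNUM)
		return 1;
	if (nex * FAKE_BMBT_REC_SIZE > dfork_size)
		return 1;
	return 0;
}
```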

> > +	dp = (struct xfs_bmbt_rec *)XFS_DFORK_PTR(dip, whichfork);
> > +
> > +	isrt = dip->di_flags & cpu_to_be16(XFS_DIFLAG_REALTIME);
> > +	for (i = 0; i < nex; i++, dp++) {
> > +		xfs_failaddr_t	fa;
> > +
> > +		xfs_bmbt_disk_get_all(dp, &new);
> > +		fa = xfs_bmbt_validate_extent(sc->mp, isrt, whichfork, &new);
> > +		if (fa)
> > +			return true;
> > +	}
> > +
> > +	return false;
> > +}
> > +
> > +/* Figure out if we need to zap this btree format fork. */
> > +STATIC bool
> > +xfs_repair_inode_core_check_btree_fork(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_dinode		*dip,
> > +	int				dfork_size,
> > +	int				whichfork)
> > +{
> > +	struct xfs_bmdr_block		*dfp;
> > +	int				nrecs;
> > +	int				level;
> > +
> > +	if (XFS_DFORK_NEXTENTS(dip, whichfork) <=
> > +			dfork_size / sizeof(struct xfs_bmbt_irec))
> > +		return true;
> 
> check against dip->di_nextents?

Yes.

> > +	dfp = (struct xfs_bmdr_block *)XFS_DFORK_PTR(dip, whichfork);
> > +	nrecs = be16_to_cpu(dfp->bb_numrecs);
> > +	level = be16_to_cpu(dfp->bb_level);
> > +
> > +	if (nrecs == 0 || XFS_BMDR_SPACE_CALC(nrecs) > dfork_size)
> > +		return true;
> > +	if (level == 0 || level > XFS_BTREE_MAXLEVELS)
> > +		return true;
> 
> Should this visit the bmbt blocks to check the level is actually
> correct?

The bmbt checker will detect and repair that.

> > +	return false;
> > +}
> > +
> > +/*
> > + * Check the data fork for things that will fail the ifork verifiers or the
> > + * ifork formatters.
> > + */
> > +STATIC bool
> > +xfs_repair_inode_core_check_data_fork(
> 
> xrep_ifork_check_data()

<nod>

> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_dinode		*dip,
> > +	uint16_t			mode)
> > +{
> > +	uint64_t			size;
> > +	int				dfork_size;
> > +
> > +	size = be64_to_cpu(dip->di_size);
> > +	switch (mode & S_IFMT) {
> > +	case S_IFIFO:
> > +	case S_IFCHR:
> > +	case S_IFBLK:
> > +	case S_IFSOCK:
> > +		if (XFS_DFORK_FORMAT(dip, XFS_DATA_FORK) != XFS_DINODE_FMT_DEV)
> > +			return true;
> > +		break;
> > +	case S_IFREG:
> > +	case S_IFLNK:
> > +	case S_IFDIR:
> > +		switch (XFS_DFORK_FORMAT(dip, XFS_DATA_FORK)) {
> > +		case XFS_DINODE_FMT_LOCAL:
> 
> local format is not valid for S_IFREG.

<nod>

> > +		case XFS_DINODE_FMT_EXTENTS:
> > +		case XFS_DINODE_FMT_BTREE:
> > +			break;
> > +		default:
> > +			return true;
> > +		}
> > +		break;
> > +	default:
> > +		return true;
> > +	}
> > +	dfork_size = XFS_DFORK_SIZE(dip, sc->mp, XFS_DATA_FORK);
> > +	switch (XFS_DFORK_FORMAT(dip, XFS_DATA_FORK)) {
> > +	case XFS_DINODE_FMT_DEV:
> > +		break;
> > +	case XFS_DINODE_FMT_LOCAL:
> > +		if (size > dfork_size)
> > +			return true;
> > +		break;
> > +	case XFS_DINODE_FMT_EXTENTS:
> > +		if (xfs_repair_inode_core_check_extents_fork(sc, dip,
> > +				dfork_size, XFS_DATA_FORK))
> > +			return true;
> > +		break;
> > +	case XFS_DINODE_FMT_BTREE:
> > +		if (xfs_repair_inode_core_check_btree_fork(sc, dip,
> > +				dfork_size, XFS_DATA_FORK))
> > +			return true;
> > +		break;
> > +	default:
> > +		return true;
> > +	}
> > +
> > +	return false;
> > +}
> > +
> > +/* Reset the data fork to something sane. */
> > +STATIC void
> > +xfs_repair_inode_core_zap_data_fork(
> 
> xrep_ifork_zap_data()
> 
> (you get the idea :P)
> 
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_dinode		*dip,
> > +	uint16_t			mode,
> > +	struct xfs_repair_inode_fork_counters	*rifc)
> 
> (structure names can change, too :)

Yay!

	struct xrep_ifork_counters	*rifc;

> > +{
> > +	char				*p;
> > +	const struct xfs_dir_ops	*ops;
> > +	struct xfs_dir2_sf_hdr		*sfp;
> > +	int				i8count;
> > +
> > +	/* Special files always get reset to DEV */
> > +	switch (mode & S_IFMT) {
> > +	case S_IFIFO:
> > +	case S_IFCHR:
> > +	case S_IFBLK:
> > +	case S_IFSOCK:
> > +		dip->di_format = XFS_DINODE_FMT_DEV;
> > +		dip->di_size = 0;
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * If we have data extents, reset to an empty map and hope the user
> > +	 * will run the bmapbtd checker next.
> > +	 */
> > +	if (rifc->data_extents || rifc->rt_extents || S_ISREG(mode)) {
> > +		dip->di_format = XFS_DINODE_FMT_EXTENTS;
> > +		dip->di_nextents = 0;
> > +		return;
> > +	}
> 
> Does the userspace tool run the bmapbtd checker next?

Yes.

> > +	/* Otherwise, reset the local format to the minimum. */
> > +	switch (mode & S_IFMT) {
> > +	case S_IFLNK:
> > +		/* Blow out symlink; now it points to root dir */
> > +		dip->di_format = XFS_DINODE_FMT_LOCAL;
> > +		dip->di_size = cpu_to_be64(1);
> > +		p = XFS_DFORK_PTR(dip, XFS_DATA_FORK);
> > +		*p = '/';
> 
> Maybe factor this so zero length symlinks can be set directly to the
> same thing? FWIW, would making it point at '.' be better than '/' so
> it always points within the same filesystem?

Perhaps?  I'll take any other suggestions the list has for appropriate
targets that don't point anywhere.

I thought your earlier suggestion of setting the link target to "broken
link repaired by xfs_scrub" was quite amusing. :)

> > +		break;
> > +	case S_IFDIR:
> > +		/*
> > +		 * Blow out dir, make it point to the root.  In the
> > +		 * future the direction repair will reconstruct this
> > +		 * dir for us.
> > +		 */
> 
> s/the direction//

Ok.

> > +		dip->di_format = XFS_DINODE_FMT_LOCAL;
> > +		i8count = sc->mp->m_sb.sb_rootino > XFS_DIR2_MAX_SHORT_INUM;
> > +		ops = xfs_dir_get_ops(sc->mp, NULL);
> > +		sfp = (struct xfs_dir2_sf_hdr *)XFS_DFORK_PTR(dip,
> > +				XFS_DATA_FORK);
> > +		sfp->count = 0;
> > +		sfp->i8count = i8count;
> > +		ops->sf_put_parent_ino(sfp, sc->mp->m_sb.sb_rootino);
> > +		dip->di_size = cpu_to_be64(xfs_dir2_sf_hdr_size(i8count));
> 
> What happens now if this dir has an ancestor still pointing at it?
> Haven't we just screwed the directory structure? How does this
> interact with the dentry cache, esp. w.r.t. disconnected dentries
> (filehandle lookups)?

Hmm, good question. :)

> > +/*
> > + * Check the attr fork for things that will fail the ifork verifiers or the
> > + * ifork formatters.
> > + */
> > +STATIC bool
> > +xfs_repair_inode_core_check_attr_fork(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_dinode		*dip)
> > +{
> > +	struct xfs_attr_shortform	*sfp;
> > +	int				size;
> > +
> > +	if (XFS_DFORK_BOFF(dip) == 0)
> > +		return dip->di_aformat != XFS_DINODE_FMT_EXTENTS ||
> > +		       dip->di_anextents != 0;
> > +
> > +	size = XFS_DFORK_SIZE(dip, sc->mp, XFS_ATTR_FORK);
> > +	switch (XFS_DFORK_FORMAT(dip, XFS_ATTR_FORK)) {
> > +	case XFS_DINODE_FMT_LOCAL:
> > +		sfp = (struct xfs_attr_shortform *)XFS_DFORK_PTR(dip,
> > +				XFS_ATTR_FORK);
> 
> As a side note, we should make XFS_DFORK_PTR() return a void * so we
> don't need casts like this.

Ok.

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 15/21] xfs: repair inode block maps
  2018-07-04  3:00   ` Dave Chinner
@ 2018-07-04  3:41     ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-04  3:41 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Wed, Jul 04, 2018 at 01:00:22PM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:25:04PM -0700, Darrick J. Wong wrote:
> > +#include "scrub/repair.h"
> > +
> > +/* Inode fork block mapping (BMBT) repair. */
> > +
> > +struct xfs_repair_bmap_extent {
> > +	struct list_head		list;
> > +	struct xfs_rmap_irec		rmap;
> > +	xfs_agnumber_t			agno;
> > +};
> > +
> > +struct xfs_repair_bmap {
> > +	struct list_head		*extlist;
> > +	struct xfs_repair_extent_list	*btlist;
> > +	struct xfs_scrub_context	*sc;
> > +	xfs_ino_t			ino;
> > +	xfs_rfsblock_t			otherfork_blocks;
> > +	xfs_rfsblock_t			bmbt_blocks;
> > +	xfs_extnum_t			extents;
> > +	int				whichfork;
> > +};
> > +
> > +/* Record extents that belong to this inode's fork. */
> > +STATIC int
> > +xfs_repair_bmap_extent_fn(
> > +	struct xfs_btree_cur		*cur,
> > +	struct xfs_rmap_irec		*rec,
> > +	void				*priv)
> > +{
> > +	struct xfs_repair_bmap		*rb = priv;
> > +	struct xfs_repair_bmap_extent	*rbe;
> > +	struct xfs_mount		*mp = cur->bc_mp;
> > +	xfs_fsblock_t			fsbno;
> > +	int				error = 0;
> > +
> > +	if (xfs_scrub_should_terminate(rb->sc, &error))
> > +		return error;
> > +
> > +	/* Skip extents which are not owned by this inode and fork. */
> > +	if (rec->rm_owner != rb->ino) {
> > +		return 0;
> > +	} else if (rb->whichfork == XFS_DATA_FORK &&
> > +		 (rec->rm_flags & XFS_RMAP_ATTR_FORK)) {
> > +		rb->otherfork_blocks += rec->rm_blockcount;
> > +		return 0;
> > +	} else if (rb->whichfork == XFS_ATTR_FORK &&
> > +		 !(rec->rm_flags & XFS_RMAP_ATTR_FORK)) {
> > +		rb->otherfork_blocks += rec->rm_blockcount;
> > +		return 0;
> > +	}
> > +
> > +	rb->extents++;
> 
> Shouldn't this be incremented after we've checked for and processed
> old BMBT blocks?

Yes.

> > +	/* Delete the old bmbt blocks later. */
> > +	if (rec->rm_flags & XFS_RMAP_BMBT_BLOCK) {
> > +		fsbno = XFS_AGB_TO_FSB(mp, cur->bc_private.a.agno,
> > +				rec->rm_startblock);
> > +		rb->bmbt_blocks += rec->rm_blockcount;
> > +		return xfs_repair_collect_btree_extent(rb->sc, rb->btlist,
> > +				fsbno, rec->rm_blockcount);
> > +	}
> ....
> > +
> > +/* Check for garbage inputs. */
> > +STATIC int
> > +xfs_repair_bmap_check_inputs(
> > +	struct xfs_scrub_context	*sc,
> > +	int				whichfork)
> > +{
> > +	ASSERT(whichfork == XFS_DATA_FORK || whichfork == XFS_ATTR_FORK);
> > +
> > +	/* Don't know how to repair the other fork formats. */
> > +	if (XFS_IFORK_FORMAT(sc->ip, whichfork) != XFS_DINODE_FMT_EXTENTS &&
> > +	    XFS_IFORK_FORMAT(sc->ip, whichfork) != XFS_DINODE_FMT_BTREE)
> > +		return -EOPNOTSUPP;
> > +
> > +	/* Only files, symlinks, and directories get to have data forks. */
> > +	if (whichfork == XFS_DATA_FORK && !S_ISREG(VFS_I(sc->ip)->i_mode) &&
> > +	    !S_ISDIR(VFS_I(sc->ip)->i_mode) && !S_ISLNK(VFS_I(sc->ip)->i_mode))
> > +		return -EINVAL;
> 
> That'd be nicer as a switch statement.

Will fix.

> > +
> > +	/* If we somehow have delalloc extents, forget it. */
> > +	if (whichfork == XFS_DATA_FORK && sc->ip->i_delayed_blks)
> > +		return -EBUSY;
> 
> and this can be rolled into the same if (datafork) branch.
> 
> ....
> > +	if (!xfs_sb_version_hasrmapbt(&sc->mp->m_sb))
> > +		return -EOPNOTSUPP;
> 
> Do this first?

It's redundant, see xfs_repair_bmap_check_inputs.  Will remove this one.

> Hmmm, and if you do the attr fork check second then the rest
> of the code is all data fork. i.e.
> 
> 	if (!rmap)
> 		return -EOPNOTSUPP
> 	if (attrfork) {
> 		if (no attr fork)
> 			return ....
> 		return 0
> 	}
> 	/* now do all data fork checks */
> 
> This becomes a lot easier to follow.

Ok.
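As a standalone model of the control flow Dave is suggesting, something like the sketch below; the predicate parameters and the -ENOENT for a missing attr fork are illustrative assumptions, not the real checks:

```c
#include <errno.h>
#include <stdbool.h>

enum fake_fork { FAKE_DATA_FORK, FAKE_ATTR_FORK };

/*
 * Model of the restructured input checks: feature gate first, then an
 * attr-fork early return, so everything after that point is data-fork
 * only and can be read linearly.
 */
static int
fake_repair_bmap_check_inputs(bool has_rmapbt, enum fake_fork whichfork,
			      bool has_attr_fork, bool fork_format_ok,
			      bool mode_has_data_fork, bool has_delalloc)
{
	if (!has_rmapbt)
		return -EOPNOTSUPP;

	if (whichfork == FAKE_ATTR_FORK) {
		if (!has_attr_fork)
			return -ENOENT;	/* assumed errno, for illustration */
		return 0;
	}

	/* From here down: data fork checks only. */
	if (!fork_format_ok)
		return -EOPNOTSUPP;
	if (!mode_has_data_fork)
		return -EINVAL;
	if (has_delalloc)
		return -EBUSY;
	return 0;
}
```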

> > +/*
> > + * Collect block mappings for this fork of this inode and decide if we have
> > + * enough space to rebuild.  Caller is responsible for cleaning up the list if
> > + * anything goes wrong.
> > + */
> > +STATIC int
> > +xfs_repair_bmap_find_mappings(
> > +	struct xfs_scrub_context	*sc,
> > +	int				whichfork,
> > +	struct list_head		*mapping_records,
> > +	struct xfs_repair_extent_list	*old_bmbt_blocks,
> > +	xfs_rfsblock_t			*old_bmbt_block_count,
> > +	xfs_rfsblock_t			*otherfork_blocks)
> > +{
> > +	struct xfs_repair_bmap		rb;
> > +	xfs_agnumber_t			agno;
> > +	unsigned int			resblks;
> > +	int				error;
> > +
> > +	memset(&rb, 0, sizeof(rb));
> > +	rb.extlist = mapping_records;
> > +	rb.btlist = old_bmbt_blocks;
> > +	rb.ino = sc->ip->i_ino;
> > +	rb.whichfork = whichfork;
> > +	rb.sc = sc;
> > +
> > +	/* Iterate the rmaps for extents. */
> > +	for (agno = 0; agno < sc->mp->m_sb.sb_agcount; agno++) {
> > +		error = xfs_repair_bmap_scan_ag(&rb, agno);
> > +		if (error)
> > +			return error;
> > +	}
> > +
> > +	/*
> > +	 * Guess how many blocks we're going to need to rebuild an entire bmap
> > +	 * from the number of extents we found, and pump up our transaction to
> > +	 * have sufficient block reservation.
> > +	 */
> > +	resblks = xfs_bmbt_calc_size(sc->mp, rb.extents);
> > +	error = xfs_trans_reserve_more(sc->tp, resblks, 0);
> > +	if (error)
> > +		return error;
> 
> I don't really like this, but I can't think of a way around needing
> it at the moment.

Me neither.

(That is to say, I can't think of a way around it that doesn't involve
backing all the way out to the setup function, which would be pretty
gruesome.)

> > +
> > +	*otherfork_blocks = rb.otherfork_blocks;
> > +	*old_bmbt_block_count = rb.bmbt_blocks;
> > +	return 0;
> > +}
> > +
> > +/* Update the inode counters. */
> > +STATIC int
> > +xfs_repair_bmap_reset_counters(
> > +	struct xfs_scrub_context	*sc,
> > +	xfs_rfsblock_t			old_bmbt_block_count,
> > +	xfs_rfsblock_t			otherfork_blocks,
> > +	int				*log_flags)
> > +{
> > +	int				error;
> > +
> > +	xfs_trans_ijoin(sc->tp, sc->ip, 0);
> > +
> > +	/*
> > +	 * Drop the block counts associated with this fork since we'll re-add
> > +	 * them with the bmap routines later.
> > +	 */
> > +	sc->ip->i_d.di_nblocks = otherfork_blocks;
> 
> This needs a little more explanation. i.e. that the rmap walk we
> just performed for this fork also counted all the data and bmbt
> blocks for the other fork so this is really only zeroing the block
> count for the fork we are about to rebuild.

/*
 * We're going to use the bmap routines to reconstruct a fork from rmap
 * records.  Those functions increment di_nblocks for us, so we need to
 * subtract out all the data and bmbt blocks from the fork we're about
 * to rebuild.  otherfork_blocks reflects all the data and bmbt blocks
 * for the other fork, so this assignment effectively performs the
 * subtraction for us.
 */

> 
> > +/* Initialize a new fork and implant it in the inode. */
> > +STATIC void
> > +xfs_repair_bmap_reset_fork(
> > +	struct xfs_scrub_context	*sc,
> > +	int				whichfork,
> > +	bool				has_mappings,
> > +	int				*log_flags)
> > +{
> > +	/* Set us back to extents format with zero records. */
> > +	XFS_IFORK_FMT_SET(sc->ip, whichfork, XFS_DINODE_FMT_EXTENTS);
> > +	XFS_IFORK_NEXT_SET(sc->ip, whichfork, 0);
> > +
> > +	/* Reinitialize the on-disk fork. */
> 
> I don't think this touches the on-disk fork - it's re-initialising
> the in-memory fork.

Will fix.

> > +	if (XFS_IFORK_PTR(sc->ip, whichfork) != NULL)
> > +		xfs_idestroy_fork(sc->ip, whichfork);
> > +	if (whichfork == XFS_DATA_FORK) {
> > +		memset(&sc->ip->i_df, 0, sizeof(struct xfs_ifork));
> > +		sc->ip->i_df.if_flags |= XFS_IFEXTENTS;
> > +	} else if (whichfork == XFS_ATTR_FORK) {
> > +		if (has_mappings) {
> > +			sc->ip->i_afp = NULL;
> > +		} else {
> > +			sc->ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone,
> > +					KM_SLEEP);
> > +			sc->ip->i_afp->if_flags |= XFS_IFEXTENTS;
> > +		}
> > +	}

/*
 * Now that we've reinitialized the in-memory fork and set the inode
 * back to extents format with zero extents, any extents that we
 * subsequently map into the file will reinitialize the on-disk fork
 * area for us.  All we have to do is log the inode core to preserve
 * the format and extent count fields.
 */

> > +	*log_flags |= XFS_ILOG_CORE;
> > +}
> ......
> 
> > +/* Repair an inode fork. */
> > +STATIC int
> > +xfs_repair_bmap(
> > +	struct xfs_scrub_context	*sc,
> > +	int				whichfork)
> > +{
> > +	struct list_head		mapping_records;
> > +	struct xfs_repair_extent_list	old_bmbt_blocks;
> > +	struct xfs_inode		*ip = sc->ip;
> > +	xfs_rfsblock_t			old_bmbt_block_count;
> > +	xfs_rfsblock_t			otherfork_blocks;
> > +	int				log_flags = 0;
> > +	int				error = 0;
> > +
> > +	error = xfs_repair_bmap_check_inputs(sc, whichfork);
> > +	if (error)
> > +		return error;
> > +
> > +	/*
> > +	 * If this is a file data fork, wait for all pending directio to
> > +	 * complete, then tear everything out of the page cache.
> > +	 */
> > +	if (S_ISREG(VFS_I(ip)->i_mode) && whichfork == XFS_DATA_FORK) {
> > +		inode_dio_wait(VFS_I(ip));
> > +		truncate_inode_pages(VFS_I(ip)->i_mapping, 0);
> > +	}
> 
> Why would we be waiting only for DIO here? Haven't we already locked
> up the inode, flushed dirty data, waited for dio and invalidated the
> page cache when we called xfs_scrub_setup_inode_bmap() prior to
> doing this work?

Extra paranoia?  IOWs I don't know why. :)

Probably we should xfs_break_layouts here though.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: [PATCH 16/21] xfs: repair damaged symlinks
  2018-06-24 19:25 ` [PATCH 16/21] xfs: repair damaged symlinks Darrick J. Wong
@ 2018-07-04  5:45   ` Dave Chinner
  2018-07-04 18:45     ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-04  5:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:25:16PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Repair inconsistent symbolic link data.
.....
> +/*
> + * Symbolic Link Repair
> + * ====================
> + *
> + * There's not much we can do to repair symbolic links -- we truncate them to
> + * the first NULL byte and fix up the remote target block headers if they're
> + * incorrect.  Zero-length symlinks are turned into links to /.
> + */
> +
> +/* Blow out the whole symlink; replace contents. */
> +STATIC int
> +xfs_repair_symlink_rewrite(
> +	struct xfs_trans	**tpp,
> +	struct xfs_inode	*ip,
> +	const char		*target_path,
> +	int			pathlen)
> +{
> +	struct xfs_defer_ops	dfops;
> +	struct xfs_bmbt_irec	mval[XFS_SYMLINK_MAPS];
> +	struct xfs_ifork	*ifp;
> +	const char		*cur_chunk;
> +	struct xfs_mount	*mp = (*tpp)->t_mountp;
> +	struct xfs_buf		*bp;
> +	xfs_fsblock_t		first_block;
> +	xfs_fileoff_t		first_fsb;
> +	xfs_filblks_t		fs_blocks;
> +	xfs_daddr_t		d;
> +	int			byte_cnt;
> +	int			n;
> +	int			nmaps;
> +	int			offset;
> +	int			error = 0;
> +
> +	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> +
> +	/* Truncate the whole data fork if it wasn't inline. */
> +	if (!(ifp->if_flags & XFS_IFINLINE)) {
> +		error = xfs_itruncate_extents(tpp, ip, XFS_DATA_FORK, 0);
> +		if (error)
> +			goto out;
> +	}
> +
> +	/* Blow out the in-core fork and zero the on-disk fork. */
> +	xfs_idestroy_fork(ip, XFS_DATA_FORK);
> +	ip->i_d.di_format = XFS_DINODE_FMT_EXTENTS;
> +	ip->i_d.di_nextents = 0;
> +	memset(&ip->i_df, 0, sizeof(struct xfs_ifork));
> +	ip->i_df.if_flags |= XFS_IFEXTENTS;

This looks familiar - doesn't the fork zapping code do exactly this,
too? factor into a helper?

> +
> +	/* Rewrite an inline symlink. */
> +	if (pathlen <= XFS_IFORK_DSIZE(ip)) {
> +		xfs_init_local_fork(ip, XFS_DATA_FORK, target_path, pathlen);
> +
> +		i_size_write(VFS_I(ip), pathlen);
> +		ip->i_d.di_size = pathlen;
> +		ip->i_d.di_format = XFS_DINODE_FMT_LOCAL;
> +		xfs_trans_log_inode(*tpp, ip, XFS_ILOG_DDATA | XFS_ILOG_CORE);
> +		goto out;
> +
> +	}

Might make sense to separate inline vs remote into separate
functions - we tend to do that everywhere else in the symlink code.

> +
> +	/* Rewrite a remote symlink. */
> +	fs_blocks = xfs_symlink_blocks(mp, pathlen);
> +	first_fsb = 0;
> +	nmaps = XFS_SYMLINK_MAPS;
> +
> +	/* Reserve quota for new blocks. */
> +	error = xfs_trans_reserve_quota_nblks(*tpp, ip, fs_blocks, 0,
> +			XFS_QMOPT_RES_REGBLKS);
> +	if (error)
> +		goto out;
> +
> +	/* Map blocks, write symlink target. */
> +	xfs_defer_init(&dfops, &first_block);
> +
> +	error = xfs_bmapi_write(*tpp, ip, first_fsb, fs_blocks,
> +			  XFS_BMAPI_METADATA, &first_block, fs_blocks,
> +			  mval, &nmaps, &dfops);
> +	if (error)
> +		goto out_bmap_cancel;
> +
> +	ip->i_d.di_size = pathlen;
> +	i_size_write(VFS_I(ip), pathlen);
> +	xfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE);
> +
> +	cur_chunk = target_path;
> +	offset = 0;
> +	for (n = 0; n < nmaps; n++) {
> +		char	*buf;
> +
> +		d = XFS_FSB_TO_DADDR(mp, mval[n].br_startblock);
> +		byte_cnt = XFS_FSB_TO_B(mp, mval[n].br_blockcount);
> +		bp = xfs_trans_get_buf(*tpp, mp->m_ddev_targp, d,
> +				       BTOBB(byte_cnt), 0);
> +		if (!bp) {
> +			error = -ENOMEM;
> +			goto out_bmap_cancel;
> +		}
> +		bp->b_ops = &xfs_symlink_buf_ops;
> +
> +		byte_cnt = XFS_SYMLINK_BUF_SPACE(mp, byte_cnt);
> +		byte_cnt = min(byte_cnt, pathlen);
> +
> +		buf = bp->b_addr;
> +		buf += xfs_symlink_hdr_set(mp, ip->i_ino, offset,
> +					   byte_cnt, bp);
> +
> +		memcpy(buf, cur_chunk, byte_cnt);
> +
> +		cur_chunk += byte_cnt;
> +		pathlen -= byte_cnt;
> +		offset += byte_cnt;
> +
> +		xfs_trans_buf_set_type(*tpp, bp, XFS_BLFT_SYMLINK_BUF);
> +		xfs_trans_log_buf(*tpp, bp, 0, (buf + byte_cnt - 1) -
> +						(char *)bp->b_addr);
> +	}
> +	ASSERT(pathlen == 0);

This just looks like a copy-and-paste of the main loop in xfs_symlink() -
can you factor that into a helper, please?

> +
> +	error = xfs_defer_finish(tpp, &dfops);
> +	if (error)
> +		goto out_bmap_cancel;
> +
> +	return 0;
> +
> +out_bmap_cancel:
> +	xfs_defer_cancel(&dfops);
> +out:
> +	return error;
> +}
> +
> +/* Fix everything that fails the verifiers in the remote blocks. */
> +STATIC int
> +xfs_repair_symlink_fix_remotes(
> +	struct xfs_scrub_context	*sc,
> +	loff_t				len)
> +{
> +	struct xfs_bmbt_irec		mval[XFS_SYMLINK_MAPS];
> +	struct xfs_buf			*bp;
> +	xfs_filblks_t			fsblocks;
> +	xfs_daddr_t			d;
> +	loff_t				offset;
> +	unsigned int			byte_cnt;
> +	int				n;
> +	int				nmaps = XFS_SYMLINK_MAPS;
> +	int				nr;
> +	int				error;
> +
> +	fsblocks = xfs_symlink_blocks(sc->mp, len);
> +	error = xfs_bmapi_read(sc->ip, 0, fsblocks, mval, &nmaps, 0);
> +	if (error)
> +		return error;
> +
> +	offset = 0;
> +	for (n = 0; n < nmaps; n++) {
> +		d = XFS_FSB_TO_DADDR(sc->mp, mval[n].br_startblock);
> +		byte_cnt = XFS_FSB_TO_B(sc->mp, mval[n].br_blockcount);
> +
> +		error = xfs_trans_read_buf(sc->mp, sc->tp, sc->mp->m_ddev_targp,
> +				d, BTOBB(byte_cnt), 0, &bp, NULL);
> +		if (error)
> +			return error;
> +		bp->b_ops = &xfs_symlink_buf_ops;
> +
> +		byte_cnt = XFS_SYMLINK_BUF_SPACE(sc->mp, byte_cnt);
> +		if (len < byte_cnt)
> +			byte_cnt = len;

can we make this the same as the other functions? i.e.

		byte_cnt = min(byte_cnt, len);

> +
> +		nr = xfs_symlink_hdr_set(sc->mp, sc->ip->i_ino, offset,
> +				byte_cnt, bp);
> +
> +		len -= byte_cnt;
> +		offset += byte_cnt;
> +
> +		xfs_trans_buf_set_type(sc->tp, bp, XFS_BLFT_SYMLINK_BUF);
> +		xfs_trans_log_buf(sc->tp, bp, 0, nr - 1);
> +		xfs_trans_brelse(sc->tp, bp);

xfs_trans_brelse() is a no-op here because the buffer has been
logged. It can be removed.

> +	}
> +	if (len != 0)
> +		return -EFSCORRUPTED;
> +
> +	return 0;
> +}
> +
> +/* Fix this inline symlink. */
> +STATIC int
> +xfs_repair_symlink_inline(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_inode		*ip = sc->ip;
> +	struct xfs_ifork		*ifp;
> +	loff_t				len;
> +	size_t				newlen;
> +
> +	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> +	len = i_size_read(VFS_I(ip));
> +	xfs_trans_ijoin(sc->tp, ip, 0);
> +
> +	if (ifp->if_u1.if_data) {
> +		newlen = strnlen(ifp->if_u1.if_data, XFS_IFORK_DSIZE(ip));
> +	} else {
> +		/* Zero length symlink becomes a root symlink. */
> +		ifp->if_u1.if_data = kmem_alloc(4, KM_SLEEP);
> +		snprintf(ifp->if_u1.if_data, 4, "/");
> +		newlen = 1;

helper function shared with the fork zapping code?

> +	}
> +
> +	if (len > newlen) {

shouldn't this be 'if (len != newlen) {' ?

> +		i_size_write(VFS_I(ip), newlen);
> +		ip->i_d.di_size = newlen;
> +		xfs_trans_log_inode(sc->tp, ip, XFS_ILOG_DDATA | XFS_ILOG_CORE);
> +	}
> +
> +	return 0;
> +}
> +
> +/* Repair a remote symlink. */
> +STATIC int
> +xfs_repair_symlink_remote(
> +	struct xfs_scrub_context	*sc)
> +{
> +	struct xfs_inode		*ip = sc->ip;
> +	loff_t				len;
> +	size_t				newlen;
> +	int				error = 0;
> +
> +	len = i_size_read(VFS_I(ip));
> +	xfs_trans_ijoin(sc->tp, ip, 0);
> +
> +	error = xfs_repair_symlink_fix_remotes(sc, len);
> +	if (error)
> +		return error;
> +
> +	/* Roll transaction, release buffers. */
> +	error = xfs_trans_roll_inode(&sc->tp, ip);
> +	if (error)
> +		return error;
> +
> +	/* Size set correctly? */
> +	len = i_size_read(VFS_I(ip));
> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> +	error = xfs_readlink(ip, sc->buf);
> +	xfs_ilock(ip, XFS_ILOCK_EXCL);

Can we pass a "need_lock" flag to xfs_readlink() rather than
creating a race condition with anything that might be blocked on the
ilock waiting for repair to complete?

> +	if (error)
> +		return error;
> +
> +	/*
> +	 * Figure out the new target length.  We can't handle zero-length
> +	 * symlinks, so make sure that we don't write that out.
> +	 */
> +	newlen = strnlen(sc->buf, XFS_SYMLINK_MAXLEN);
> +	if (newlen == 0) {
> +		*((char *)sc->buf) = '/';
> +		newlen = 1;

We really need to set the name of the repaired path for zero
length symlinks in only one place. It really seems to me that it
should be done in xfs_repair_symlink_rewrite() if newlen is 0, not
here.
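Centralizing it could look something like the helper below, shared by the fork-zapping and remote-repair paths. The name, the length limit, and the "/" fallback are just the scheme under discussion, not settled API:

```c
#include <string.h>

#define FAKE_SYMLINK_MAXLEN	1024	/* stand-in for XFS_SYMLINK_MAXLEN */

/*
 * Sketch of a single "pick the repaired symlink target" helper:
 * truncate at the first NUL, and fall back to "/" for a zero-length
 * target so we never write out an empty symlink.
 */
static const char *
fake_repair_symlink_target(const char *buf, size_t *lenp)
{
	size_t len = strnlen(buf, FAKE_SYMLINK_MAXLEN);

	if (len == 0) {
		*lenp = 1;
		return "/";
	}
	*lenp = len;
	return buf;
}
```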

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 11/21] xfs: repair the rmapbt
  2018-07-03 23:59     ` Darrick J. Wong
@ 2018-07-04  8:44       ` Carlos Maiolino
  2018-07-04 18:40         ` Darrick J. Wong
  2018-07-04 23:21       ` Dave Chinner
  1 sibling, 1 reply; 77+ messages in thread
From: Carlos Maiolino @ 2018-07-04  8:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Dave Chinner, linux-xfs

> [some of this Dave and I discussed on IRC, so I'll summarize for
> everyone else here...]
> 
> For this initial v0 iteration of the rmap repair code, yes, we have to
> freeze the fs and iterate everything.  However, unless your computer and
> storage are particularly untrustworthy, rmapbt reconstruction should be
> a very infrequent thing.  Now that we have a FREEZE_OK flag, userspace
> has to opt-in to slow repairs, and presumably it could choose instead to
> unmount and run xfs_repair if that's too dear or there are too many
> broken AGs, etc.  More on that later.
> 
> In the long run I don't see a need to freeze the filesystem to scan
> every inode for bmbt entries in the damaged AG.  In fact, we can improve
> the performance of all the AG repair functions in general with the
> scheme I'm about to outline:
> 
> Create a "shut down this AG" primitive.  Once set, block and inode
> allocation routines will bypass this AG.  Unlinked inodes are moved to
> the unlinked list to avoid touching as much of the AGI as we practically
> can.  Unmapped/freed blocks can be moved to a hidden inode (in another
> AG) to be freed later.  Growfs operation in that AG can be rejected.
> 

Does it mean that new block allocation requests for inodes already existing in
the frozen AG will block until the AG is thawed, or will these block allocations
be redirected to another AG? I'm just asking because in either case, we
should document it well. The repair case is certainly (or should be) a rare
case, but if there is any heavy workload going on in the frozen AG, and we
redirect it to another AG, it can end up heavily fragmenting the files in the
frozen AG.
So, I wonder if any operation to the AG under repair should actually be blocked
too?

Cheers


-- 
Carlos


* Re: [PATCH 11/21] xfs: repair the rmapbt
  2018-07-04  8:44       ` Carlos Maiolino
@ 2018-07-04 18:40         ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-04 18:40 UTC (permalink / raw)
  To: Dave Chinner, linux-xfs; +Cc: cmaiolino

On Wed, Jul 04, 2018 at 10:44:38AM +0200, Carlos Maiolino wrote:
> > [some of this Dave and I discussed on IRC, so I'll summarize for
> > everyone else here...]
> > 
> > For this initial v0 iteration of the rmap repair code, yes, we have to
> > freeze the fs and iterate everything.  However, unless your computer and
> > storage are particularly untrustworthy, rmapbt reconstruction should be
> > a very infrequent thing.  Now that we have a FREEZE_OK flag, userspace
> > has to opt-in to slow repairs, and presumably it could choose instead to
> > unmount and run xfs_repair if that's too dear or there are too many
> > broken AGs, etc.  More on that later.
> > 
> > In the long run I don't see a need to freeze the filesystem to scan
> > every inode for bmbt entries in the damaged AG.  In fact, we can improve
> > the performance of all the AG repair functions in general with the
> > scheme I'm about to outline:
> > 
> > Create a "shut down this AG" primitive.  Once set, block and inode
> > allocation routines will bypass this AG.  Unlinked inodes are moved to
> > the unlinked list to avoid touching as much of the AGI as we practically
> > can.  Unmapped/freed blocks can be moved to a hidden inode (in another
> > AG) to be freed later.  Growfs operation in that AG can be rejected.
> > 
> 
> Does it mean that new block allocation requests for inodes already existing in
> the frozen AG will block until the AG is thawed, or will these block allocations
> be redirected to another AG? I'm just asking because in either case, we
> should document it well. The repair case is certainly (or should be) a rare
> case, but if there is any heavy workload going on in the frozen AG, and we
> redirect it to another AG, it can end up heavily fragmenting the files in the
> frozen AG.
> So, I wonder if any operation to the AG under repair should actually be blocked
> too?

I don't think that will be possible for rmapbt repair -- we need to be
able to take locks in the wrong order (agf -> inodes) without
deadlocking with a regular operation that's blocked on the AG (inodes ->
agf).  The freezer mechanism eliminates the deadlock possibility by
eliminating the regular IO paths, so this proposed AG shutdown would
also have to protect against that by absorbing operations.

--D

> Cheers
> 
> 
> -- 
> Carlos
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 16/21] xfs: repair damaged symlinks
  2018-07-04  5:45   ` Dave Chinner
@ 2018-07-04 18:45     ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-04 18:45 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Wed, Jul 04, 2018 at 03:45:33PM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:25:16PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Repair inconsistent symbolic link data.
> .....
> > +/*
> > + * Symbolic Link Repair
> > + * ====================
> > + *
> > + * There's not much we can do to repair symbolic links -- we truncate them to
> > + * the first NULL byte and fix up the remote target block headers if they're
> > + * incorrect.  Zero-length symlinks are turned into links to /.
> > + */
> > +
> > +/* Blow out the whole symlink; replace contents. */
> > +STATIC int
> > +xfs_repair_symlink_rewrite(
> > +	struct xfs_trans	**tpp,
> > +	struct xfs_inode	*ip,
> > +	const char		*target_path,
> > +	int			pathlen)
> > +{
> > +	struct xfs_defer_ops	dfops;
> > +	struct xfs_bmbt_irec	mval[XFS_SYMLINK_MAPS];
> > +	struct xfs_ifork	*ifp;
> > +	const char		*cur_chunk;
> > +	struct xfs_mount	*mp = (*tpp)->t_mountp;
> > +	struct xfs_buf		*bp;
> > +	xfs_fsblock_t		first_block;
> > +	xfs_fileoff_t		first_fsb;
> > +	xfs_filblks_t		fs_blocks;
> > +	xfs_daddr_t		d;
> > +	int			byte_cnt;
> > +	int			n;
> > +	int			nmaps;
> > +	int			offset;
> > +	int			error = 0;
> > +
> > +	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> > +
> > +	/* Truncate the whole data fork if it wasn't inline. */
> > +	if (!(ifp->if_flags & XFS_IFINLINE)) {
> > +		error = xfs_itruncate_extents(tpp, ip, XFS_DATA_FORK, 0);
> > +		if (error)
> > +			goto out;
> > +	}
> > +
> > +	/* Blow out the in-core fork and zero the on-disk fork. */
> > +	xfs_idestroy_fork(ip, XFS_DATA_FORK);
> > +	ip->i_d.di_format = XFS_DINODE_FMT_EXTENTS;
> > +	ip->i_d.di_nextents = 0;
> > +	memset(&ip->i_df, 0, sizeof(struct xfs_ifork));
> > +	ip->i_df.if_flags |= XFS_IFEXTENTS;
> 
> This looks familiar - doesn't the fork zapping code do exactly this,
> too? factor into a helper?

Yeah, helpers clearly needed somewhere.

> > +
> > +	/* Rewrite an inline symlink. */
> > +	if (pathlen <= XFS_IFORK_DSIZE(ip)) {
> > +		xfs_init_local_fork(ip, XFS_DATA_FORK, target_path, pathlen);
> > +
> > +		i_size_write(VFS_I(ip), pathlen);
> > +		ip->i_d.di_size = pathlen;
> > +		ip->i_d.di_format = XFS_DINODE_FMT_LOCAL;
> > +		xfs_trans_log_inode(*tpp, ip, XFS_ILOG_DDATA | XFS_ILOG_CORE);
> > +		goto out;
> > +
> > +	}
> 
> Might make sense to separate inline vs remote into separate
> functions - we tend to do that everywhere else in the symlink code.

Ok.

> > +
> > +	/* Rewrite a remote symlink. */
> > +	fs_blocks = xfs_symlink_blocks(mp, pathlen);
> > +	first_fsb = 0;
> > +	nmaps = XFS_SYMLINK_MAPS;
> > +
> > +	/* Reserve quota for new blocks. */
> > +	error = xfs_trans_reserve_quota_nblks(*tpp, ip, fs_blocks, 0,
> > +			XFS_QMOPT_RES_REGBLKS);
> > +	if (error)
> > +		goto out;
> > +
> > +	/* Map blocks, write symlink target. */
> > +	xfs_defer_init(&dfops, &first_block);
> > +
> > +	error = xfs_bmapi_write(*tpp, ip, first_fsb, fs_blocks,
> > +			  XFS_BMAPI_METADATA, &first_block, fs_blocks,
> > +			  mval, &nmaps, &dfops);
> > +	if (error)
> > +		goto out_bmap_cancel;
> > +
> > +	ip->i_d.di_size = pathlen;
> > +	i_size_write(VFS_I(ip), pathlen);
> > +	xfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE);
> > +
> > +	cur_chunk = target_path;
> > +	offset = 0;
> > +	for (n = 0; n < nmaps; n++) {
> > +		char	*buf;
> > +
> > +		d = XFS_FSB_TO_DADDR(mp, mval[n].br_startblock);
> > +		byte_cnt = XFS_FSB_TO_B(mp, mval[n].br_blockcount);
> > +		bp = xfs_trans_get_buf(*tpp, mp->m_ddev_targp, d,
> > +				       BTOBB(byte_cnt), 0);
> > +		if (!bp) {
> > +			error = -ENOMEM;
> > +			goto out_bmap_cancel;
> > +		}
> > +		bp->b_ops = &xfs_symlink_buf_ops;
> > +
> > +		byte_cnt = XFS_SYMLINK_BUF_SPACE(mp, byte_cnt);
> > +		byte_cnt = min(byte_cnt, pathlen);
> > +
> > +		buf = bp->b_addr;
> > +		buf += xfs_symlink_hdr_set(mp, ip->i_ino, offset,
> > +					   byte_cnt, bp);
> > +
> > +		memcpy(buf, cur_chunk, byte_cnt);
> > +
> > +		cur_chunk += byte_cnt;
> > +		pathlen -= byte_cnt;
> > +		offset += byte_cnt;
> > +
> > +		xfs_trans_buf_set_type(*tpp, bp, XFS_BLFT_SYMLINK_BUF);
> > +		xfs_trans_log_buf(*tpp, bp, 0, (buf + byte_cnt - 1) -
> > +						(char *)bp->b_addr);
> > +	}
> > +	ASSERT(pathlen == 0);
> 
> This just looks like a copynpaste of main loop in xfs_symlink() -
> can you factor that into a helper, please?

Ok.

> > +
> > +	error = xfs_defer_finish(tpp, &dfops);
> > +	if (error)
> > +		goto out_bmap_cancel;
> > +
> > +	return 0;
> > +
> > +out_bmap_cancel:
> > +	xfs_defer_cancel(&dfops);
> > +out:
> > +	return error;
> > +}
> > +
> > +/* Fix everything that fails the verifiers in the remote blocks. */
> > +STATIC int
> > +xfs_repair_symlink_fix_remotes(
> > +	struct xfs_scrub_context	*sc,
> > +	loff_t				len)
> > +{
> > +	struct xfs_bmbt_irec		mval[XFS_SYMLINK_MAPS];
> > +	struct xfs_buf			*bp;
> > +	xfs_filblks_t			fsblocks;
> > +	xfs_daddr_t			d;
> > +	loff_t				offset;
> > +	unsigned int			byte_cnt;
> > +	int				n;
> > +	int				nmaps = XFS_SYMLINK_MAPS;
> > +	int				nr;
> > +	int				error;
> > +
> > +	fsblocks = xfs_symlink_blocks(sc->mp, len);
> > +	error = xfs_bmapi_read(sc->ip, 0, fsblocks, mval, &nmaps, 0);
> > +	if (error)
> > +		return error;
> > +
> > +	offset = 0;
> > +	for (n = 0; n < nmaps; n++) {
> > +		d = XFS_FSB_TO_DADDR(sc->mp, mval[n].br_startblock);
> > +		byte_cnt = XFS_FSB_TO_B(sc->mp, mval[n].br_blockcount);
> > +
> > +		error = xfs_trans_read_buf(sc->mp, sc->tp, sc->mp->m_ddev_targp,
> > +				d, BTOBB(byte_cnt), 0, &bp, NULL);
> > +		if (error)
> > +			return error;
> > +		bp->b_ops = &xfs_symlink_buf_ops;
> > +
> > +		byte_cnt = XFS_SYMLINK_BUF_SPACE(sc->mp, byte_cnt);
> > +		if (len < byte_cnt)
> > +			byte_cnt = len;
> 
> can we make this the same as the other functions? i.e.
> 
> 		byte_cnt = min(byte_cnt, len);

<nod>

> > +
> > +		nr = xfs_symlink_hdr_set(sc->mp, sc->ip->i_ino, offset,
> > +				byte_cnt, bp);
> > +
> > +		len -= byte_cnt;
> > +		offset += byte_cnt;
> > +
> > +		xfs_trans_buf_set_type(sc->tp, bp, XFS_BLFT_SYMLINK_BUF);
> > +		xfs_trans_log_buf(sc->tp, bp, 0, nr - 1);
> > +		xfs_trans_brelse(sc->tp, bp);
> 
> xfs_trans_brelse() is a no-op here because the buffer has been
> logged. It can be removed.

<nod>

> > +	}
> > +	if (len != 0)
> > +		return -EFSCORRUPTED;
> > +
> > +	return 0;
> > +}
> > +
> > +/* Fix this inline symlink. */
> > +STATIC int
> > +xfs_repair_symlink_inline(
> > +	struct xfs_scrub_context	*sc)
> > +{
> > +	struct xfs_inode		*ip = sc->ip;
> > +	struct xfs_ifork		*ifp;
> > +	loff_t				len;
> > +	size_t				newlen;
> > +
> > +	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> > +	len = i_size_read(VFS_I(ip));
> > +	xfs_trans_ijoin(sc->tp, ip, 0);
> > +
> > +	if (ifp->if_u1.if_data) {
> > +		newlen = strnlen(ifp->if_u1.if_data, XFS_IFORK_DSIZE(ip));
> > +	} else {
> > +		/* Zero length symlink becomes a root symlink. */
> > +		ifp->if_u1.if_data = kmem_alloc(4, KM_SLEEP);
> > +		snprintf(ifp->if_u1.if_data, 4, "/");
> > +		newlen = 1;
> 
> helper function shared with the fork zapping code?

Yes.

> > +	}
> > +
> > +	if (len > newlen) {
> 
> shouldn't this be 'if (len != newlen) {' ?

Yes.

> > +		i_size_write(VFS_I(ip), newlen);
> > +		ip->i_d.di_size = newlen;
> > +		xfs_trans_log_inode(sc->tp, ip, XFS_ILOG_DDATA | XFS_ILOG_CORE);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +/* Repair a remote symlink. */
> > +STATIC int
> > +xfs_repair_symlink_remote(
> > +	struct xfs_scrub_context	*sc)
> > +{
> > +	struct xfs_inode		*ip = sc->ip;
> > +	loff_t				len;
> > +	size_t				newlen;
> > +	int				error = 0;
> > +
> > +	len = i_size_read(VFS_I(ip));
> > +	xfs_trans_ijoin(sc->tp, ip, 0);
> > +
> > +	error = xfs_repair_symlink_fix_remotes(sc, len);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Roll transaction, release buffers. */
> > +	error = xfs_trans_roll_inode(&sc->tp, ip);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Size set correctly? */
> > +	len = i_size_read(VFS_I(ip));
> > +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > +	error = xfs_readlink(ip, sc->buf);
> > +	xfs_ilock(ip, XFS_ILOCK_EXCL);
> 
> Can we pass a "need_lock" flag to xfs_readlink() rather than
> creating a race condition with anything that might be blocked on the
> ilock waiting for repair to complete?

LOL, there already is a locked version of that, which I added two years
ago in preparation for online symlink scrub.  Will switch to
xfs_readlink_bmap_ilocked.

> > +	if (error)
> > +		return error;
> > +
> > +	/*
> > +	 * Figure out the new target length.  We can't handle zero-length
> > +	 * symlinks, so make sure that we don't write that out.
> > +	 */
> > +	newlen = strnlen(sc->buf, XFS_SYMLINK_MAXLEN);
> > +	if (newlen == 0) {
> > +		*((char *)sc->buf) = '/';
> > +		newlen = 1;
> 
> We really need to set the name of the repaired path for zero
> length symlinks in only one place. It really seems to me that it
> should be done in xfs_repair_symlink_rewrite() if newlen is 0, not
> here.

Agreed, will work on that.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 11/21] xfs: repair the rmapbt
  2018-07-03 23:59     ` Darrick J. Wong
  2018-07-04  8:44       ` Carlos Maiolino
@ 2018-07-04 23:21       ` Dave Chinner
  2018-07-05  3:48         ` Darrick J. Wong
  1 sibling, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-04 23:21 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jul 03, 2018 at 04:59:01PM -0700, Darrick J. Wong wrote:
> On Tue, Jul 03, 2018 at 03:32:00PM +1000, Dave Chinner wrote:
> > On Sun, Jun 24, 2018 at 12:24:38PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Rebuild the reverse mapping btree from all primary metadata.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > ....
> > 
> > > +static inline int xfs_repair_rmapbt_setup(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_inode		*ip)
> > > +{
> > > +	/* We don't support rmap repair, but we can still do a scan. */
> > > +	return xfs_scrub_setup_ag_btree(sc, ip, false);
> > > +}
> > 
> > This comment seems at odds with the commit message....
> 
> This is the Kconfig shim needed if CONFIG_XFS_ONLINE_REPAIR=n.

Ok, that wasn't clear from the patch context.

> > > + * This is the most involved of all the AG space btree rebuilds.  Everywhere
> > > + * else in XFS we lock inodes and then AG data structures, but generating the
> > > + * list of rmap records requires that we be able to scan both block mapping
> > > + * btrees of every inode in the filesystem to see if it owns any extents in
> > > + * this AG.  We can't tolerate any inode updates while we do this, so we
> > > + * freeze the filesystem to lock everyone else out, and grant ourselves
> > > + * special privileges to run transactions with regular background reclamation
> > > + * turned off.
> > 
> > Hmmm. This implies we are going to scan the entire filesystem for
> > every AG we need to rebuild the rmap tree in. That seems like an
> > awful lot of work if there's more than one rmap btree that needs
> > rebuild.
> 
> [some of this Dave and I discussed on IRC, so I'll summarize for
> everyone else here...]

....

> > Given that we've effectively got to shut down access to the
> > filesystem for the entire rmap rebuild while we do an entire
> > filesystem scan, why would we do this online? It's going to be
> > faster to do this rebuild offline (because of all the prefetching,
> > rebuilding all AG trees from the state gathered in the full
> > filesystem passes, etc) and we don't have to hack around potential
> > transaction and memory reclaim deadlock situations, either?
> > 
> > So why do rmap rebuilds online at all?
> 
> The thing is, xfs_scrub will warm the xfs_buf cache during phases 2 and
> 3 while it checks everything.  By the time it gets to rmapbt repairs
> towards the end of phase 4 (if there's enough memory) those blocks will
> still be in cache and online repair doesn't have to wait for the disk.

Therein lies the problem: "if there's enough memory". If there's
enough memory to cache all the filesystem metadata, track all the
bits repair needs to track, and there's no other memory pressure
then it will hit the cache. But populating that cache is still going
to be slower than an offline repair because IO patterns (see below)
and there is competing IO from other work being done on the
system (i.e. online repair competes for IO resources and memory
resources).

As such, I don't see that we're going to have everything we need
cached for any significantly sized or busy filesystem, and that
means we actually have to care about how much IO online repair
algorithms require. We also have to take into account that much of
that IO is going to be synchronous single metadata block reads.
This will be a limitation on any sort of high IO latency storage
(spinning rust, network based block devices, slow SSDs, etc).

> If instead you unmount and run xfs_repair then xfs_repair has to reload
> all that metadata and recheck it, all of which happens with the fs
> offline.

xfs_repair has all sorts of concurrency and prefetching
optimisations that allow it to scan and process metadata orders of
magnitude faster than online repair, especially on slow storage.
i.e. online repair is going to be IO seek bound, while offline
repair is typically IO bandwidth and/or CPU bound.  Offline repair
can do full filesystem metadata scans measured in GB/s; as long as
online repair does serialised synchronous single structure walks it
will be orders of magnitude slower than an offline repair.

> So except for the extra complexity of avoiding deadlocks (which I
> readily admit is not a small task) I at least don't think it's a
> clear-cut downtime win to rely on xfs_repair.

Back then - as it is still now - I couldn't see how the IO load
required by synchronous full filesystem scans one structure at a
time was going to reduce filesystem downtime compared to an offline
repair doing optimised "all metadata types at once" concurrent
linear AG scans.

Keep in mind that online repair will never guarantee that it can fix
all problems, so we're always going to have to offline repair. What
we want to achieve is minimising downtime for users when a repair is
required. With the above IO limitations in mind, I've always
considered that online repair would just be for all the simple,
quick, easy to fix stuff, because complex stuff that required huge
amounts of RAM and full filesystem scans to resolve would always be
done faster offline.

That's why I think that offline repair will be a better choice for
users for the forseeable future if repairing the damage requires
full filesystem metadata scans.

> > > +
> > > +	rre = kmem_alloc(sizeof(struct xfs_repair_rmapbt_extent), KM_MAYFAIL);
> > > +	if (!rre)
> > > +		return -ENOMEM;
> > 
> > This seems like a likely thing to happen given the "no reclaim"
> > state of the filesystem and the memory demand a rmapbt rebuild
> > can have. If we've got GBs of rmap info in the AG that needs to be
> > rebuilt, how much RAM are we going to need to index it all as we
> > scan the filesystem?
> 
> More than I'd like -- at least 24 bytes per record (which at least is no
> larger than the size of the on-disk btree) plus a list_head until I can
> move the repairers away from creating huge lists.

Ok, it kinda sounds a bit like we need to be able to create the new
btree on the fly, rather than as a single operation at the end. e.g.
if the list builds up to, say, 100k records, we push them into the
new tree and can free them. e.g. can we iteratively build the new
tree on disk as we go, then do a root block swap at the end to
switch from the old tree to the new tree?

If that's a potential direction, then maybe we can look at this as a
future direction? It also leads to the possibility of
pausing/continuing repair from where the last chunk of records were
processed, so if we do run out of memory we don't have to start from
the beginning again?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 11/21] xfs: repair the rmapbt
  2018-07-04 23:21       ` Dave Chinner
@ 2018-07-05  3:48         ` Darrick J. Wong
  2018-07-05  7:03           ` Dave Chinner
  0 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-05  3:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Thu, Jul 05, 2018 at 09:21:15AM +1000, Dave Chinner wrote:
> On Tue, Jul 03, 2018 at 04:59:01PM -0700, Darrick J. Wong wrote:
> > On Tue, Jul 03, 2018 at 03:32:00PM +1000, Dave Chinner wrote:
> > > On Sun, Jun 24, 2018 at 12:24:38PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Rebuild the reverse mapping btree from all primary metadata.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > ....
> > > 
> > > > +static inline int xfs_repair_rmapbt_setup(
> > > > +	struct xfs_scrub_context	*sc,
> > > > +	struct xfs_inode		*ip)
> > > > +{
> > > > +	/* We don't support rmap repair, but we can still do a scan. */
> > > > +	return xfs_scrub_setup_ag_btree(sc, ip, false);
> > > > +}
> > > 
> > > This comment seems at odds with the commit message....
> > 
> > This is the Kconfig shim needed if CONFIG_XFS_ONLINE_REPAIR=n.
> 
> Ok, that wasn't clear from the patch context.

I'll add a noisier comment in that section about ...

/*
 * Compatibility shims for CONFIG_XFS_ONLINE_REPAIR=n.
 */

> > > > + * This is the most involved of all the AG space btree rebuilds.  Everywhere
> > > > + * else in XFS we lock inodes and then AG data structures, but generating the
> > > > + * list of rmap records requires that we be able to scan both block mapping
> > > > + * btrees of every inode in the filesystem to see if it owns any extents in
> > > > + * this AG.  We can't tolerate any inode updates while we do this, so we
> > > > + * freeze the filesystem to lock everyone else out, and grant ourselves
> > > > + * special privileges to run transactions with regular background reclamation
> > > > + * turned off.
> > > 
> > > Hmmm. This implies we are going to scan the entire filesystem for
> > > every AG we need to rebuild the rmap tree in. That seems like an
> > > awful lot of work if there's more than one rmap btree that needs
> > > rebuild.
> > 
> > [some of this Dave and I discussed on IRC, so I'll summarize for
> > everyone else here...]
> 
> ....
> 
> > > Given that we've effectively got to shut down access to the
> > > filesystem for the entire rmap rebuild while we do an entire
> > > filesystem scan, why would we do this online? It's going to be
> > > faster to do this rebuild offline (because of all the prefetching,
> > > rebuilding all AG trees from the state gathered in the full
> > > filesystem passes, etc) and we don't have to hack around potential
> > > transaction and memory reclaim deadlock situations, either?
> > > 
> > > So why do rmap rebuilds online at all?
> > 
> > The thing is, xfs_scrub will warm the xfs_buf cache during phases 2 and
> > 3 while it checks everything.  By the time it gets to rmapbt repairs
> > towards the end of phase 4 (if there's enough memory) those blocks will
> > still be in cache and online repair doesn't have to wait for the disk.
> 
> Therein lies the problem: "if there's enough memory". If there's
> enough memory to cache all the filesystem metadata, track all the
> bits repair needs to track, and there's no other memory pressure
> then it will hit the cache.

Fair enough, it is true that the memory requirements of online repair
are high (and higher than they could be), though online repair only
requires enough memory to store all the incore rmap records for the AG
it's repairing, whereas xfs_repair will try to store all rmaps for the
entire filesystem.  It's not a straightforward comparison since
xfs_repair memory can be swapped and its slabs are much more efficient
than the linked lists that online repair has to use, but some of this
seems solvable.

> But populating that cache is still going to be slower than an offline
> repair because IO patterns (see below) and there is competing IO from
> other work being done on the system (i.e. online repair competes for
> IO resources and memory resources).
>
> As such, I don't see that we're going to have everything we need
> cached for any significantly sized or busy filesystem, and that
> means we actually have to care about how much IO online repair
> algorithms require. We also have to take into account that much of
> that IO is going to be synchronous single metadata block reads.
> This will be a limitation on any sort of high IO latency storage
> (spinning rust, network based block devices, slow SSDs, etc).

TBH I'm not sure online repair is a good match for spinning rust, nbd,
or usb sticks since there's a general presumption that it won't consume
so much storage time that everything else sees huge latency spikes.
Repairs will cause much larger spikes but again should be infrequent and
still an improvement over the fs exploding.

> > If instead you unmount and run xfs_repair then xfs_repair has to reload
> > all that metadata and recheck it, all of which happens with the fs
> > offline.
> 
> xfs_repair has all sorts of concurrency and prefetching
> optimisations that allow it to scan and process metadata orders of
> magnitude faster than online repair, especially on slow storage.
> i.e. online repair is going to be IO seek bound, while offline
> repair is typically IO bandwidth and/or CPU bound.  Offline repair
> can do full filesystem metadata scans measured in GB/s; as long as
> online repair does serialised synchronous single structure walks it
> will be orders of magnitude slower than an offline repair.

My long view of online repair is that it satisfies a different set of
constraints than xfs_repair.  Offline repair needs to be as efficient
with resources (particularly CPU) as it can be, because the fs is 100%
offline while it runs.  It needs to run in as little time as possible
and we optimize heavily for that.

Compare this to the situation facing online repair -- we've sharded the
FS into a number of allocation groups, and (in the future) we can take
an AG offline to repair the rmapbt.  Assuming there's sufficient free
space in the surviving AGs that the filesystem can handle everything
asked of it, there are users who prefer to lose 1/4 (or 1/32, or
whatever) of the FS for 10x the time that they would lose all of it.

Even if online repair is bound by uncached synchronous random reads,
it can still be a win.  I have a few customers in mind who have told me
that losing pieces of the storage for long(er) periods of time are
easier for them to handle than all of it, even for a short time.

> > So except for the extra complexity of avoiding deadlocks (which I
> > readily admit is not a small task) I at least don't think it's a
> > clear-cut downtime win to rely on xfs_repair.
> 
> Back then - as it is still now - I couldn't see how the IO load
> required by synchronous full filesystem scans one structure at a
> time was going to reduce filesystem downtime compared to an offline
> repair doing optimised "all metadata types at once" concurrent
> linear AG scans.

<nod> For the scrub half of this story I deliberately avoided
anything that required hard fs downtime, at least until we got to the
fscounters thing.  I'm hoping that background scrubs will be gentle
enough that it will be a better option than what ext4 online scrub does
(which is to say it creates lvm snapshots and fscks them).

> Keep in mind that online repair will never guarantee that it can fix
> all problems, so we're always going to have to offline repair. What
> we want to achieve is minimising downtime for users when a repair is
> required. With the above IO limitations in mind, I've always
> considered that online repair would just be for all the simple,
> quick, easy to fix stuff, because complex stuff that required huge
> amounts of RAM and full filesystem scans to resolve would always be
> done faster offline.
>
> That's why I think that offline repair will be a better choice for
> users for the forseeable future if repairing the damage requires
> full filesystem metadata scans.

That might be, but how else can we determine that than to merge it under
an experimental banner, iterate it for a while, and try to persuade a
wider band of users than ourselves to try it on a non-production system
to see if it better solves their problems? :)

At worst, if future us decide that we will never figure out how to make
online rmapbt repair robust I'm fully prepared to withdraw it.

> > > > +
> > > > +	rre = kmem_alloc(sizeof(struct xfs_repair_rmapbt_extent), KM_MAYFAIL);
> > > > +	if (!rre)
> > > > +		return -ENOMEM;
> > > 
> > > This seems like a likely thing to happen given the "no reclaim"
> > > state of the filesystem and the memory demand a rmapbt rebuild
> > > can have. If we've got GBs of rmap info in the AG that needs to be
> > > rebuilt, how much RAM are we going to need to index it all as we
> > > scan the filesystem?
> > 
> > More than I'd like -- at least 24 bytes per record (which at least is no
> > larger than the size of the on-disk btree) plus a list_head until I can
> > move the repairers away from creating huge lists.
> 
> Ok, it kinda sounds a bit like we need to be able to create the new
> btree on the fly, rather than as a single operation at the end. e.g.
> if the list builds up to, say, 100k records, we push them into the
> new tree and can free them. e.g. can we iteratively build the new
> tree on disk as we go, then do a root block swap at the end to
> switch from the old tree to the new tree?

Seeing as we have a bunch of storage at our disposal, I think we could
push the records out to disk (or just write them straight into a new
btree) when we detect low memory conditions or consume more than some
threshold of memory.  For v0 I was focusing on getting it to work at all
even with a largeish memory cost. :)

> If that's a potential direction, then maybe we can look at this as a
> future direction? It also leads to the possibility of
> pausing/continuing repair from where the last chunk of records were
> processed, so if we do run out of memory we don't have to start from
> the beginning again?

That could be difficult to do in practice -- we'll continue to
accumulate deferred frees for that AG in the meantime.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 11/21] xfs: repair the rmapbt
  2018-07-05  3:48         ` Darrick J. Wong
@ 2018-07-05  7:03           ` Dave Chinner
  2018-07-06  0:47             ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-05  7:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, Jul 04, 2018 at 08:48:58PM -0700, Darrick J. Wong wrote:
> On Thu, Jul 05, 2018 at 09:21:15AM +1000, Dave Chinner wrote:
> > On Tue, Jul 03, 2018 at 04:59:01PM -0700, Darrick J. Wong wrote:
> > > On Tue, Jul 03, 2018 at 03:32:00PM +1000, Dave Chinner wrote:
> > Keep in mind that online repair will never guarantee that it can fix
> > all problems, so we're always going to have to offline repair. What
> > we want to achieve is minimising downtime for users when a repair is
> > required. With the above IO limitations in mind, I've always
> > considered that online repair would just be for all the simple,
> > quick, easy to fix stuff, because complex stuff that required huge
> > amounts of RAM and full filesystem scans to resolve would always be
> > done faster offline.
> >
> > That's why I think that offline repair will be a better choice for
> > users for the forseeable future if repairing the damage requires
> > full filesystem metadata scans.
> 
> That might be, but how else can we determine that than to merge it under
> an experimental banner, iterate it for a while, and try to persuade a
> wider band of users than ourselves to try it on a non-production system
> to see if it better solves their problems? :)

I think "deploy an idea to users and see if they have problems" is
a horrible development model. That's where btrfs went off the rails
almost 10 years ago....

That aside, I'd like to make sure we're on the same page - we don't
need online repair to fix /everything/ before it gets merged. Even
without rmap rebuild, it will still be usable and useful to the vast
majority of users wanting the filesystem to self repair the typical
one-off corruption problems filesystems encounter during their
typical production life.

What I'm worried about is that the online repair algorithms have a
fundamental dependence on rmap lookups to avoid the need to build the
global cross-references needed to isolate and repair corruptions.
This use of rmap, from one angle, can be seen as a performance
optimisation, but as I've come to understand more of the algorithms the
online repair code uses, it looks more like an intractable problem
than one of performance optimisation.

That is, if the rmap is corrupt, or we even suspect that it's
corrupt, then we cannot use it to validate the state and contents of
the other on-disk structures. We could propagate rmap corruptions to
those other structures and it would go undetected. And because we
can't determine the validity of all the other structures, the only
way we can correctly repair the filesystem is to build a global
cross-reference, resolve all the inconsistencies, and then rewrite
all the structures. i.e. we need to do what xfs_repair does in phase
3, 4 and 5.

IOWs, ISTM that the online scrub/repair algorithms break down in the
face of rmap corruption and it is correctable only by building a
global cross-reference which we can then reconcile and use to
rebuild all the per-AG metadata btrees in the filesystem. Trying to
do it online via scanning of unverifiable structures risks making
the problem worse and there is a good possibility that we can't
repair the inconsistencies in a single pass.

So at this point, I'd prefer that we stop, discuss and work out a
solution to the rmap rebuild problem rather than merge a rmap
rebuild algorithm prematurely. This doesn't need to hold up any of
the other online repair functionality - none of that is dependent on
this particular piece of the puzzle being implemented.

> At worst, if future us decide that we will never figure out how to make
> online rmapbt repair robust I'm fully prepared to withdraw it.

Merging new features before sorting out the fundamental principles,
behaviours and algorithms of those features is how btrfs ended up
with a pile of stuff that doesn't quite work which no-one
understands quite well enough to fix.

> > > > This seems like a likely thing to happen given the "no reclaim"
> > > > state of the filesystem and the memory demand a rmapbt rebuild
> > > > can have. If we've got GBs of rmap info in the AG that needs to be
> > > > rebuilt, how much RAM are we going to need to index it all as we
> > > > scan the filesystem?
> > > 
> > > More than I'd like -- at least 24 bytes per record (which at least is no
> > > larger than the size of the on-disk btree) plus a list_head until I can
> > > move the repairers away from creating huge lists.
> > 
> > Ok, it kinda sounds a bit like we need to be able to create the new
> > btree on the fly, rather than as a single operation at the end. e.g.
> > if the list builds up to, say, 100k records, we push them into the
> > new tree and can free them. e.g. can we iteratively build the new
> > tree on disk as we go, then do a root block swap at the end to
> > switch from the old tree to the new tree?
> 
> Seeing as we have a bunch of storage at our disposal, I think we could
> push the records out to disk (or just write them straight into a new
> btree) when we detect low memory conditions or consume more than some
> threshold of memory.

*nod* That's kinda along the lines of what I was thinking of.
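
[Editorial aside: the direction in the quoted exchange — buffer records in
memory and spill them into the new btree once a threshold is crossed — can
be sketched in a few lines of host-side C.  Every name below is
hypothetical; none of these are existing XFS interfaces.]

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical record accumulator: stash rmap-like records in memory and
 * spill them to a flush step (e.g. "write into the new btree") once a
 * count threshold is crossed, so memory use stays bounded during the scan.
 */
struct fake_rmap_rec {
	unsigned long	startblock;
	unsigned long	blockcount;
};

struct rec_accumulator {
	struct fake_rmap_rec	*recs;
	size_t			nr;
	size_t			threshold;	/* spill when nr hits this */
	size_t			nr_spilled;	/* total records flushed */
	unsigned int		nr_flushes;	/* number of spill events */
};

/* Pretend to write all buffered records into the new on-disk btree. */
static void acc_spill(struct rec_accumulator *acc)
{
	acc->nr_spilled += acc->nr;
	acc->nr_flushes++;
	acc->nr = 0;	/* buffered memory can now be reused */
}

static int acc_add(struct rec_accumulator *acc, unsigned long start,
		   unsigned long count)
{
	acc->recs[acc->nr].startblock = start;
	acc->recs[acc->nr].blockcount = count;
	acc->nr++;
	if (acc->nr >= acc->threshold)
		acc_spill(acc);	/* bound memory use during the scan */
	return 0;
}
```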

> For v0 I was focusing on getting it to work at all
> even with a largeish memory cost. :)

Right, I'm not suggesting that this needs to be done before the
initial merge - I just want to make sure we don't back ourselves
into a corner we can't get out of.

> > If that's a potential direction, then maybe we can look at this as a
> > future direction? It also leads to the possibility of
> > pausing/continuing repair from where the last chunk of records were
> > processed, so if we do run out of memory we don't have to start from
> > the beginning again?
> 
> That could be difficult to do in practice -- we'll continue to
> accumulate deferred frees for that AG in the meantime.

I think it depends on the way we record deferred frees, but it's a
bit premature to be discussing this right now :P

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 11/21] xfs: repair the rmapbt
  2018-07-05  7:03           ` Dave Chinner
@ 2018-07-06  0:47             ` Darrick J. Wong
  2018-07-06  1:08               ` Dave Chinner
  0 siblings, 1 reply; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-06  0:47 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Thu, Jul 05, 2018 at 05:03:24PM +1000, Dave Chinner wrote:
> On Wed, Jul 04, 2018 at 08:48:58PM -0700, Darrick J. Wong wrote:
> > On Thu, Jul 05, 2018 at 09:21:15AM +1000, Dave Chinner wrote:
> > > On Tue, Jul 03, 2018 at 04:59:01PM -0700, Darrick J. Wong wrote:
> > > > On Tue, Jul 03, 2018 at 03:32:00PM +1000, Dave Chinner wrote:
> > > Keep in mind that online repair will never guarantee that it can fix
> > > all problems, so we're always going to have to offline repair. What
> > > we want to achieve is minimising downtime for users when a repair is
> > > required. With the above IO limitations in mind, I've always
> > > considered that online repair would just be for all the simple,
> > > quick, easy to fix stuff, because complex stuff that required huge
> > > amounts of RAM and full filesystem scans to resolve would always be
> > > done faster offline.
> > >
> > > That's why I think that offline repair will be a better choice for
> > > users for the foreseeable future if repairing the damage requires
> > > full filesystem metadata scans.
> > 
> > That might be, but how else can we determine that than to merge it under
> > an experimental banner, iterate it for a while, and try to persuade a
> > wider band of users than ourselves to try it on a non-production system
> > to see if it better solves their problems? :)
> 
> I think "deploy an idea to users and see if they have problems" is
> a horrible development model. That's where btrfs went off the rails
> almost 10 years ago....

I would only test-deploy to a very small and carefully selected group of
tightly-controlled internal users.  For everyone else, it hides behind a
default=N kconfig option and we disavow any support of EXPERIMENTAL
features. :)

> That aside, I'd like to make sure we're on the same page - we don't
> need online repair to fix /everything/ before it gets merged. Even

Indeed.  I'm fine with merging the non-freezing repairers in their
current form.  I regularly run the repair fuzzing xfstests to compare
the effectiveness of xfs_scrub and xfs_repair.  They both have
deficiencies here and there, which will keep us busy for years to come.
:P

> without rmap rebuild, it will still be usable and useful to the vast
> majority of users wanting the filesystem to self repair the typical
> one-off corruption problems filesystems encounter during their
> typical production life.
> 
> What I'm worried about is that the online repair algorithms have a
> fundamental dependence on rmap lookups to avoid the need to build the
> global cross-references needed to isolate and repair corruptions.

Yes, the primary metadata repairers absolutely depend on secondary
metadata to be faster than a full scan, and repairing the secondary
metadata itself was never going to be easy to do online.

> This use of rmap, from one angle, can be seen as a performance
> optimisation, but as I've come to understand more of the algorithms the
> online repair code uses, it looks more like an intractable problem
> than one of performance optimisation.
> 
> That is, if the rmap is corrupt, or we even suspect that it's
> corrupt, then we cannot use it to validate the state and contents of
> the other on-disk structures. We could propagate rmap corruptions to
> those other structures and it would go undetected. And because we
> can't determine the validity of all the other structures, the only
> way we can correctly repair the filesystem is to build a global
> cross-reference, resolve all the inconsistencies, and then rewrite
> all the structures. i.e. we need to do what xfs_repair does in phase
> 3, 4 and 5.
> 
> IOWs, ISTM that the online scrub/repair algorithms break down in the
> face of rmap corruption and it is correctable only by building a
> global cross-reference which we can then reconcile and use to
> rebuild all the per-AG metadata btrees in the filesystem. Trying to
> do it online via scanning of unverifiable structures risks making
> the problem worse and there is a good possibility that we can't
> repair the inconsistencies in a single pass.

Yeah.  I've pondered whether or not the primary repairers ought to
require at least a quick rmap scan before they touch anything, but up
till now I've preferred to keep that knowledge in xfs_scrub.

> So at this point, I'd prefer that we stop, discuss and work out a
> solution to the rmap rebuild problem rather than merge a rmap
> rebuild algorithm prematurely. This doesn't need to hold up any of
> the other online repair functionality - none of that is dependent on
> this particular piece of the puzzle being implemented.

I didn't think it would hold up the other repairers.  I always knew
that the rmapbt repair was going to be a tough project. :)

So, seeing as you're about to head off on vacation, let's set a rough
time to discuss the rmap rebuild problem in detail once you're back.
Does that sound good?

> > At worst, if future us decide that we will never figure out how to make
> > online rmapbt repair robust I'm fully prepared to withdraw it.
> 
> Merging new features before sorting out the fundamental principles,
> behaviours and algorithms of those features is how btrfs ended up
> with a pile of stuff that doesn't quite work which no-one
> understands quite well enough to fix.

Ok, ok, it can stay out of tree until we're all fairly confident it'll
survive our bombardment tests.  I do have a test, xfs/422, to
simulate repairing an rmapbt under heavy load by running fsstress,
fsfreeze, and xfs_io injecting and force-repairing rmapbts in a loop.
So far it hasn't blown up or failed to regenerate the rmapbt...

> > > > > This seems like a likely thing to happen given the "no reclaim"
> > > > > state of the filesystem and the memory demand a rmapbt rebuild
> > > > > can have. If we've got GBs of rmap info in the AG that needs to be
> > > > > rebuilt, how much RAM are we going to need to index it all as we
> > > > > scan the filesystem?
> > > > 
> > > > More than I'd like -- at least 24 bytes per record (which at least is no
> > > > larger than the size of the on-disk btree) plus a list_head until I can
> > > > move the repairers away from creating huge lists.
> > > 
> > > Ok, it kinda sounds a bit like we need to be able to create the new
> > > btree on the fly, rather than as a single operation at the end. e.g.
> > > if the list builds up to, say, 100k records, we push them into the
> > > new tree and can free them. e.g. can we iteratively build the new
> > > tree on disk as we go, then do a root block swap at the end to
> > > switch from the old tree to the new tree?
> > 
> > Seeing as we have a bunch of storage at our disposal, I think we could
> > push the records out to disk (or just write them straight into a new
> > btree) when we detect low memory conditions or consume more than some
> > threshold of memory.
> 
> *nod* That's kinda along the lines of what I was thinking of.
> 
> > For v0 I was focusing on getting it to work at all
> > even with a largeish memory cost. :)
> 
> Right, I'm not suggesting that this needs to be done before the
> initial merge - I just want to make sure we don't back ourselves
> into a corner we can't get out of.

Agreed.  I don't want to get stuck in a corner either, but I don't know
that I know where all those corners are. :)

> > > If that's a potential direction, then maybe we can look at this as a
> > > future direction? It also leads to the possibility of
> > > pausing/continuing repair from where the last chunk of records were
> > > processed, so if we do run out of memory we don't have to start from
> > > the beginning again?
> > 
> > That could be difficult to do in practice -- we'll continue to
> > accumulate deferred frees for that AG in the meantime.
> 
> I think it depends on the way we record deferred frees, but it's a
> bit premature to be discussing this right now :P

<nod>

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 17/21] xfs: repair extended attributes
  2018-06-24 19:25 ` [PATCH 17/21] xfs: repair extended attributes Darrick J. Wong
@ 2018-07-06  1:03   ` Dave Chinner
  2018-07-06  3:10     ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-06  1:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:25:23PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If the extended attributes look bad, try to sift through the rubble to
> find whatever keys/values we can, zap the attr tree, and re-add the
> values.
.....
> 
> +/*
> + * Extended Attribute Repair
> + * =========================
> + *
> + * We repair extended attributes by reading the attribute fork blocks looking
> + * for keys and values, then truncate the entire attr fork and reinsert all
> + * the attributes.  Unfortunately, there's no secondary copy of most extended
> + * attribute data, which means that if we blow up midway through there's
> + * little we can do.
> + */
> +
> +struct xfs_attr_key {
> +	struct list_head		list;
> +	unsigned char			*value;
> +	int				valuelen;
> +	int				flags;
> +	int				namelen;
> +	unsigned char			name[0];
> +};
> +
> +#define XFS_ATTR_KEY_LEN(namelen) (sizeof(struct xfs_attr_key) + (namelen) + 1)
> +
> +struct xfs_repair_xattr {
> +	struct list_head		*attrlist;
> +	struct xfs_scrub_context	*sc;
> +};
> +
> +/* Iterate each block in an attr fork extent */
> +#define for_each_xfs_attr_block(mp, irec, dabno) \
> +	for ((dabno) = roundup((xfs_dablk_t)(irec)->br_startoff, \
> +			(mp)->m_attr_geo->fsbcount); \
> +	     (dabno) < (irec)->br_startoff + (irec)->br_blockcount; \
> +	     (dabno) += (mp)->m_attr_geo->fsbcount)

What's the roundup() for? The attribute fsbcount is only ever going
to be 1 (single block), so it's not obvious what this is doing...

> +/*
> + * Record an extended attribute key & value for later reinsertion into the
> + * inode.  Use the helpers below, don't call this directly.
> + */
> +STATIC int
> +__xfs_repair_xattr_salvage_attr(
> +	struct xfs_repair_xattr		*rx,
> +	struct xfs_buf			*bp,
> +	int				flags,
> +	int				idx,
> +	unsigned char			*name,
> +	int				namelen,
> +	unsigned char			*value,
> +	int				valuelen)
> +{
> +	struct xfs_attr_key		*key;
> +	struct xfs_da_args		args;
> +	int				error = -ENOMEM;
> +
> +	/* Ignore incomplete or oversized attributes. */
> +	if ((flags & XFS_ATTR_INCOMPLETE) ||
> +	    namelen > XATTR_NAME_MAX || namelen < 0 ||
> +	    valuelen > XATTR_SIZE_MAX || valuelen < 0)
> +		return 0;
> +
> +	/* Store attr key. */
> +	key = kmem_alloc(XFS_ATTR_KEY_LEN(namelen), KM_MAYFAIL);
> +	if (!key)
> +		goto err;
> +	INIT_LIST_HEAD(&key->list);
> +	key->value = kmem_zalloc_large(valuelen, KM_MAYFAIL);

Why zero this? Also, it looks like valuelen can be zero? Should we
be allocating a buffer in that case?

> +	if (!key->value)
> +		goto err_key;
> +	key->valuelen = valuelen;
> +	key->flags = flags & (ATTR_ROOT | ATTR_SECURE);
> +	key->namelen = namelen;
> +	key->name[namelen] = 0;
> +	memcpy(key->name, name, namelen);
> +
> +	/* Caller already had the value, so copy it and exit. */
> +	if (value) {
> +		memcpy(key->value, value, valuelen);
> +		goto out_ok;

memcpy of a zero length buffer into a zero length pointer does what?

> +	}
> +
> +	/* Otherwise look up the remote value directly. */

It's not at all obvious why we are looking up a remote xattr at this
point in the function.

> +	memset(&args, 0, sizeof(args));
> +	args.geo = rx->sc->mp->m_attr_geo;
> +	args.index = idx;
> +	args.namelen = namelen;
> +	args.name = key->name;
> +	args.valuelen = valuelen;
> +	args.value = key->value;
> +	args.dp = rx->sc->ip;
> +	args.trans = rx->sc->tp;
> +	error = xfs_attr3_leaf_getvalue(bp, &args);
> +	if (error || args.rmtblkno == 0)
> +		goto err_value;
> +
> +	error = xfs_attr_rmtval_get(&args);
> +	switch (error) {
> +	case 0:
> +		break;
> +	case -EFSBADCRC:
> +	case -EFSCORRUPTED:
> +		error = 0;
> +		/* fall through */
> +	default:
> +		goto err_value;

So here we can return with error = 0, but no actual extended
attribute. Isn't this a silent failure?

> +	}
> +
> +out_ok:
> +	list_add_tail(&key->list, rx->attrlist);
> +	return 0;
> +
> +err_value:
> +	kmem_free(key->value);
> +err_key:
> +	kmem_free(key);
> +err:
> +	return error;
> +}
> +
> +/*
> + * Record a local format extended attribute key & value for later reinsertion
> + * into the inode.
> + */
> +static inline int
> +xfs_repair_xattr_salvage_local_attr(
> +	struct xfs_repair_xattr		*rx,
> +	int				flags,
> +	unsigned char			*name,
> +	int				namelen,
> +	unsigned char			*value,
> +	int				valuelen)
> +{
> +	return __xfs_repair_xattr_salvage_attr(rx, NULL, flags, 0, name,
> +			namelen, value, valuelen);
> +}
> +
> +/*
> + * Record a remote format extended attribute key & value for later reinsertion
> + * into the inode.
> + */
> +static inline int
> +xfs_repair_xattr_salvage_remote_attr(
> +	struct xfs_repair_xattr		*rx,
> +	int				flags,
> +	unsigned char			*name,
> +	int				namelen,
> +	struct xfs_buf			*leaf_bp,
> +	int				idx,
> +	int				valuelen)
> +{
> +	return __xfs_repair_xattr_salvage_attr(rx, leaf_bp, flags, idx,
> +			name, namelen, NULL, valuelen);
> +}

Oh, this is why __xfs_repair_xattr_salvage_attr() has two completely
separate sets of code in it. Can we factor this differently? i.e a
helper function to do all the validity checking and key allocation,
and then leave the local versus remote attr handling in these
functions?
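
[Editorial aside: a rough shape for that factoring, as a host-side C
sketch only — all names, limits, and layouts here are simplified
stand-ins for the ones in the patch, not the actual kernel code.]

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the limits and key layout in the patch. */
#define XATTR_NAME_MAX	255
#define XATTR_SIZE_MAX	65536

struct fake_attr_key {
	unsigned char	*value;
	int		valuelen;
	int		namelen;
	unsigned char	name[];
};

/*
 * Shared half: validate the lengths and allocate the key + value
 * buffers.  Returns NULL if the attr should be skipped or if
 * allocation fails.
 */
static struct fake_attr_key *salvage_key_init(const unsigned char *name,
					      int namelen, int valuelen)
{
	struct fake_attr_key *key;

	if (namelen <= 0 || namelen > XATTR_NAME_MAX ||
	    valuelen < 0 || valuelen > XATTR_SIZE_MAX)
		return NULL;
	key = malloc(sizeof(*key) + namelen + 1);
	if (!key)
		return NULL;
	key->value = valuelen ? malloc(valuelen) : NULL;
	if (valuelen && !key->value) {
		free(key);
		return NULL;
	}
	key->valuelen = valuelen;
	key->namelen = namelen;
	memcpy(key->name, name, namelen);
	key->name[namelen] = 0;
	return key;
}

/* Local-format caller: value bytes are already in hand, just copy. */
static struct fake_attr_key *salvage_local(const unsigned char *name,
					   int namelen,
					   const unsigned char *value,
					   int valuelen)
{
	struct fake_attr_key *key;

	key = salvage_key_init(name, namelen, valuelen);
	if (key && valuelen)
		memcpy(key->value, value, valuelen);
	return key;
}

/* A remote-format caller would fetch the value via the leaf buffer here. */
```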

> +
> +/* Extract every xattr key that we can from this attr fork block. */
> +STATIC int
> +xfs_repair_xattr_recover_leaf(
> +	struct xfs_repair_xattr		*rx,
> +	struct xfs_buf			*bp)
> +{
> +	struct xfs_attr3_icleaf_hdr	leafhdr;
> +	struct xfs_scrub_context	*sc = rx->sc;
> +	struct xfs_mount		*mp = sc->mp;
> +	struct xfs_attr_leafblock	*leaf;
> +	unsigned long			*usedmap = sc->buf;
> +	struct xfs_attr_leaf_name_local	*lentry;
> +	struct xfs_attr_leaf_name_remote *rentry;
> +	struct xfs_attr_leaf_entry	*ent;
> +	struct xfs_attr_leaf_entry	*entries;
> +	char				*buf_end;
> +	char				*name;
> +	char				*name_end;
> +	char				*value;
> +	size_t				off;
> +	unsigned int			nameidx;
> +	unsigned int			namesize;
> +	unsigned int			hdrsize;
> +	unsigned int			namelen;
> +	unsigned int			valuelen;
> +	int				i;
> +	int				error;

Can we scope all these variables inside the blocks that use them?

> +
> +	bitmap_zero(usedmap, mp->m_attr_geo->blksize);
> +
> +	/* Check the leaf header */
> +	leaf = bp->b_addr;
> +	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
> +	hdrsize = xfs_attr3_leaf_hdr_size(leaf);
> +	xfs_scrub_xattr_set_map(sc, usedmap, 0, hdrsize);
> +	entries = xfs_attr3_leaf_entryp(leaf);
> +
> +	buf_end = (char *)bp->b_addr + mp->m_attr_geo->blksize;
> +	for (i = 0, ent = entries; i < leafhdr.count; ent++, i++) {
> +		/* Skip key if it conflicts with something else? */
> +		off = (char *)ent - (char *)leaf;
> +		if (!xfs_scrub_xattr_set_map(sc, usedmap, off,
> +				sizeof(xfs_attr_leaf_entry_t)))
> +			continue;
> +
> +		/* Check the name information. */
> +		nameidx = be16_to_cpu(ent->nameidx);
> +		if (nameidx < leafhdr.firstused ||
> +		    nameidx >= mp->m_attr_geo->blksize)
> +			continue;
> +
> +		if (ent->flags & XFS_ATTR_LOCAL) {
> +			lentry = xfs_attr3_leaf_name_local(leaf, i);
> +			namesize = xfs_attr_leaf_entsize_local(lentry->namelen,
> +					be16_to_cpu(lentry->valuelen));
> +			name_end = (char *)lentry + namesize;
> +			if (lentry->namelen == 0)
> +				continue;
> +			name = lentry->nameval;
> +			namelen = lentry->namelen;
> +			valuelen = be16_to_cpu(lentry->valuelen);
> +			value = &name[namelen];

It seems cumbersome to do a bunch of special local/remote attr
decoding into a set of semi-common variables, only to then pass the
specific local/remote variables back to specific local/remote
processing functions.

i.e. I'd prefer to see the attr decoding done inside the salvage
function so this looks something like:

	if (ent->flags & XFS_ATTR_LOCAL) {
		lentry = xfs_attr3_leaf_name_local(leaf, i);
		error = xfs_repair_xattr_salvage_local_attr(rx,
					lentry, ...);
	} else {
		rentry = xfs_attr3_leaf_name_remote(leaf, i);
		error = xfs_repair_xattr_salvage_remote_attr(rx,
					rentry, ....);
	}

......

> +
> +/* Try to recover shortform attrs. */
> +STATIC int
> +xfs_repair_xattr_recover_sf(
> +	struct xfs_repair_xattr		*rx)
> +{
> +	struct xfs_attr_shortform	*sf;
> +	struct xfs_attr_sf_entry	*sfe;
> +	struct xfs_attr_sf_entry	*next;
> +	struct xfs_ifork		*ifp;
> +	unsigned char			*end;
> +	int				i;
> +	int				error;
> +
> +	ifp = XFS_IFORK_PTR(rx->sc->ip, XFS_ATTR_FORK);
> +	sf = (struct xfs_attr_shortform *)rx->sc->ip->i_afp->if_u1.if_data;

	sf = (struct xfs_attr_shortform *)ifp->if_u1.if_data;

....
> +/*
> + * Repair the extended attribute metadata.
> + *
> + * XXX: Remote attribute value buffers encompass the entire (up to 64k) buffer
> + * and we can't handle those 100% until the buffer cache learns how to deal
> + * with that.

I'm not sure what this comment means/implies.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 11/21] xfs: repair the rmapbt
  2018-07-06  0:47             ` Darrick J. Wong
@ 2018-07-06  1:08               ` Dave Chinner
  0 siblings, 0 replies; 77+ messages in thread
From: Dave Chinner @ 2018-07-06  1:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Jul 05, 2018 at 05:47:48PM -0700, Darrick J. Wong wrote:
> On Thu, Jul 05, 2018 at 05:03:24PM +1000, Dave Chinner wrote:
> > IOWs, ISTM that the online scrub/repair algorithms break down in the
> > face of rmap corruption and it is correctable only by building a
> > global cross-reference which we can then reconcile and use to
> > rebuild all the per-AG metadata btrees in the filesystem. Trying to
> > do it online via scanning of unverifiable structures risks making
> > the problem worse and there is a good possibility that we can't
> > repair the inconsistencies in a single pass.
> 
> Yeah.  I've pondered whether or not the primary repairers ought to
> require at least a quick rmap scan before they touch anything, but up
> till now I've preferred to keep that knowledge in xfs_scrub.

I think that's fine - scrub is most likely going to be the trigger
to run online repair, anyway.

> > So at this point, I'd prefer that we stop, discuss and work out a
> > solution to the rmap rebuild problem rather than merge a rmap
> > rebuild algorithm prematurely. This doesn't need to hold up any of
> > the other online repair functionality - none of that is dependent on
> > this particular piece of the puzzle being implemented.
> 
> I didn't think it would hold up the other repairers.  I always knew
> that the rmapbt repair was going to be a tough project. :)
> 
> So, seeing as you're about to head off on vacation, let's set a rough
> time to discuss the rmap rebuild problem in detail once you're back.
> Does that sound good?

Yes. Gives us some thinking time, too :P

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 19/21] xfs: repair quotas
  2018-06-24 19:25 ` [PATCH 19/21] xfs: repair quotas Darrick J. Wong
@ 2018-07-06  1:50   ` Dave Chinner
  2018-07-06  3:16     ` Darrick J. Wong
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Chinner @ 2018-07-06  1:50 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 24, 2018 at 12:25:35PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Fix anything that causes the quota verifiers to fail.
> +/* Fix anything the verifiers complain about. */
> +STATIC int
> +xfs_repair_quota_block(
> +	struct xfs_scrub_context	*sc,
> +	struct xfs_buf			*bp,
> +	uint				dqtype,
> +	xfs_dqid_t			id)
> +{
> +	struct xfs_dqblk		*d = (struct xfs_dqblk *)bp->b_addr;
> +	struct xfs_disk_dquot		*ddq;
> +	struct xfs_quotainfo		*qi = sc->mp->m_quotainfo;
> +	enum xfs_blft			buftype = 0;
> +	int				i;
> +
> +	bp->b_ops = &xfs_dquot_buf_ops;
> +	for (i = 0; i < qi->qi_dqperchunk; i++) {
> +		ddq = &d[i].dd_diskdq;
> +
> +		ddq->d_magic = cpu_to_be16(XFS_DQUOT_MAGIC);
> +		ddq->d_version = XFS_DQUOT_VERSION;
> +		ddq->d_flags = dqtype;
> +		ddq->d_id = cpu_to_be32(id + i);
> +
> +		xfs_repair_quota_fix_timer(ddq->d_blk_softlimit,
> +				ddq->d_bcount, &ddq->d_btimer,
> +				qi->qi_btimelimit);
> +		xfs_repair_quota_fix_timer(ddq->d_ino_softlimit,
> +				ddq->d_icount, &ddq->d_itimer,
> +				qi->qi_itimelimit);
> +		xfs_repair_quota_fix_timer(ddq->d_rtb_softlimit,
> +				ddq->d_rtbcount, &ddq->d_rtbtimer,
> +				qi->qi_rtbtimelimit);
> +
> +		if (xfs_sb_version_hascrc(&sc->mp->m_sb)) {
> +			uuid_copy(&d->dd_uuid, &sc->mp->m_sb.sb_meta_uuid);
> +			xfs_update_cksum((char *)d, sizeof(struct xfs_dqblk),
> +					 XFS_DQUOT_CRC_OFF);

Hmmm - Do we need to reset the lsn here, too? 

Which makes me wonder - do we check that the lsn is valid in verifiers
and so catch/fix problems where the lsn is outside the valid range
of the current log LSNs?

> +STATIC int
> +xfs_repair_quota_data_fork(
> +	struct xfs_scrub_context	*sc,
> +	uint				dqtype)
> +{
> +	struct xfs_bmbt_irec		irec = { 0 };
> +	struct xfs_iext_cursor		icur;
> +	struct xfs_quotainfo		*qi = sc->mp->m_quotainfo;
> +	struct xfs_ifork		*ifp;
> +	struct xfs_buf			*bp;
> +	struct xfs_dqblk		*d;
> +	xfs_dqid_t			id;
> +	xfs_fileoff_t			max_dqid_off;
> +	xfs_fileoff_t			off;
> +	xfs_fsblock_t			fsbno;
> +	bool				truncate = false;
> +	int				error = 0;
> +
> +	error = xfs_repair_metadata_inode_forks(sc);
> +	if (error)
> +		goto out;
> +
> +	/* Check for data fork problems that apply only to quota files. */
> +	max_dqid_off = ((xfs_dqid_t)-1) / qi->qi_dqperchunk;
> +	ifp = XFS_IFORK_PTR(sc->ip, XFS_DATA_FORK);
> +	for_each_xfs_iext(ifp, &icur, &irec) {
> +		if (isnullstartblock(irec.br_startblock)) {
> +			error = -EFSCORRUPTED;
> +			goto out;
> +		}
> +
> +		if (irec.br_startoff > max_dqid_off ||
> +		    irec.br_startoff + irec.br_blockcount - 1 > max_dqid_off) {
> +			truncate = true;
> +			break;
> +		}
> +	}
> +	if (truncate) {
> +		error = xfs_itruncate_extents(&sc->tp, sc->ip, XFS_DATA_FORK,
> +				max_dqid_off * sc->mp->m_sb.sb_blocksize);
> +		if (error)
> +			goto out;
> +	}
> +
> +	/* Now go fix anything that fails the verifiers. */
> +	for_each_xfs_iext(ifp, &icur, &irec) {
> +		for (fsbno = irec.br_startblock, off = irec.br_startoff;
> +		     fsbno < irec.br_startblock + irec.br_blockcount;
> +		     fsbno += XFS_DQUOT_CLUSTER_SIZE_FSB,
> +				off += XFS_DQUOT_CLUSTER_SIZE_FSB) {
> +			id = off * qi->qi_dqperchunk;
> +			error = xfs_trans_read_buf(sc->mp, sc->tp,
> +					sc->mp->m_ddev_targp,
> +					XFS_FSB_TO_DADDR(sc->mp, fsbno),
> +					qi->qi_dqchunklen,
> +					0, &bp, &xfs_dquot_buf_ops);
> +			if (error == 0) {
> +				d = (struct xfs_dqblk *)bp->b_addr;
> +				if (id == be32_to_cpu(d->dd_diskdq.d_id))
> +					continue;

Need to release the buffer here - it's clean and passes the
verifier, so no need to hold on to it, as we may have thousands of
these buffers to walk here.
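
[Editorial aside: the dquot-id/file-offset arithmetic in the quoted loop
can be sanity-checked in isolation with a host-side sketch.  The
dqperchunk value in the test is illustrative, not taken from the patch.]

```c
#include <stdint.h>

typedef uint32_t xfs_dqid_t;

/*
 * Each quota file block holds qi_dqperchunk dquot records, so the
 * file-offset <-> dquot-id conversion is a simple multiply/divide.
 * The highest offset that can hold a valid id is bounded by the
 * largest representable xfs_dqid_t, matching max_dqid_off above.
 */
static uint64_t dqid_for_offset(uint64_t off, unsigned int dqperchunk)
{
	return off * dqperchunk;
}

static uint64_t max_dqid_off(unsigned int dqperchunk)
{
	return (uint64_t)(xfs_dqid_t)-1 / dqperchunk;
}
```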


Otherwise looks ok.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 17/21] xfs: repair extended attributes
  2018-07-06  1:03   ` Dave Chinner
@ 2018-07-06  3:10     ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-06  3:10 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Fri, Jul 06, 2018 at 11:03:24AM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:25:23PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > If the extended attributes look bad, try to sift through the rubble to
> > find whatever keys/values we can, zap the attr tree, and re-add the
> > values.
> .....
> > 
> > +/*
> > + * Extended Attribute Repair
> > + * =========================
> > + *
> > + * We repair extended attributes by reading the attribute fork blocks looking
> > + * for keys and values, then truncate the entire attr fork and reinsert all
> > + * the attributes.  Unfortunately, there's no secondary copy of most extended
> > + * attribute data, which means that if we blow up midway through there's
> > + * little we can do.
> > + */
> > +
> > +struct xfs_attr_key {
> > +	struct list_head		list;
> > +	unsigned char			*value;
> > +	int				valuelen;
> > +	int				flags;
> > +	int				namelen;
> > +	unsigned char			name[0];
> > +};
> > +
> > +#define XFS_ATTR_KEY_LEN(namelen) (sizeof(struct xfs_attr_key) + (namelen) + 1)
> > +
> > +struct xfs_repair_xattr {
> > +	struct list_head		*attrlist;
> > +	struct xfs_scrub_context	*sc;
> > +};
> > +
> > +/* Iterate each block in an attr fork extent */
> > +#define for_each_xfs_attr_block(mp, irec, dabno) \
> > +	for ((dabno) = roundup((xfs_dablk_t)(irec)->br_startoff, \
> > +			(mp)->m_attr_geo->fsbcount); \
> > +	     (dabno) < (irec)->br_startoff + (irec)->br_blockcount; \
> > +	     (dabno) += (mp)->m_attr_geo->fsbcount)
> 
> What's the roundup() for? The attribute fsbcount is only ever going
> to be 1 (single block), so it's not obvious what this is doing...

I was trying to write defensively in case the attribute fsbcount ever
/does/ become larger than 1.
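
[Editorial aside: the effect of that roundup() is easy to see with a
userspace stand-in for the kernel macro; the fsbcount values below are
illustrative.]

```c
#include <assert.h>

/* Userspace stand-in for the kernel's roundup() macro. */
#define roundup(x, y)	((((x) + ((y) - 1)) / (y)) * (y))

/*
 * With single-block attr geometry (fsbcount == 1) the roundup is a
 * no-op; with a hypothetical multi-block geometry it skips any partial
 * leading blocks of an extent so iteration stays aligned to whole
 * attr blocks.
 */
```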

> > +/*
> > + * Record an extended attribute key & value for later reinsertion into the
> > + * inode.  Use the helpers below, don't call this directly.
> > + */
> > +STATIC int
> > +__xfs_repair_xattr_salvage_attr(
> > +	struct xfs_repair_xattr		*rx,
> > +	struct xfs_buf			*bp,
> > +	int				flags,
> > +	int				idx,
> > +	unsigned char			*name,
> > +	int				namelen,
> > +	unsigned char			*value,
> > +	int				valuelen)
> > +{
> > +	struct xfs_attr_key		*key;
> > +	struct xfs_da_args		args;
> > +	int				error = -ENOMEM;
> > +
> > +	/* Ignore incomplete or oversized attributes. */
> > +	if ((flags & XFS_ATTR_INCOMPLETE) ||
> > +	    namelen > XATTR_NAME_MAX || namelen < 0 ||
> > +	    valuelen > XATTR_SIZE_MAX || valuelen < 0)
> > +		return 0;
> > +
> > +	/* Store attr key. */
> > +	key = kmem_alloc(XFS_ATTR_KEY_LEN(namelen), KM_MAYFAIL);
> > +	if (!key)
> > +		goto err;
> > +	INIT_LIST_HEAD(&key->list);
> > +	key->value = kmem_zalloc_large(valuelen, KM_MAYFAIL);
> 
> Why zero this? Also, it looks like valuelen can be zero? Should we
> be allocating a buffer in that case?

Good point, we don't need to zero it and this does need to handle the
zero-length attr value case.
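
[Editorial aside: a minimal way to handle the zero-length case — a
hypothetical helper, not the actual patch — is to skip the value
allocation and copy entirely when valuelen is 0, which also avoids the
zero-length memcpy through a null pointer raised above.]

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: allocate and copy an attr value only when there
 * is one.  A zero-length value is represented by a NULL pointer and
 * valuelen == 0, so no buffer is allocated and no memcpy is issued.
 */
static int stash_value(unsigned char **dst, const unsigned char *src,
		       int valuelen)
{
	*dst = NULL;
	if (valuelen == 0)
		return 0;		/* nothing to copy */
	*dst = malloc(valuelen);	/* no need to zero; fully overwritten */
	if (!*dst)
		return -1;
	memcpy(*dst, src, valuelen);
	return 0;
}
```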

> > +	if (!key->value)
> > +		goto err_key;
> > +	key->valuelen = valuelen;
> > +	key->flags = flags & (ATTR_ROOT | ATTR_SECURE);
> > +	key->namelen = namelen;
> > +	key->name[namelen] = 0;
> > +	memcpy(key->name, name, namelen);
> > +
> > +	/* Caller already had the value, so copy it and exit. */
> > +	if (value) {
> > +		memcpy(key->value, value, valuelen);
> > +		goto out_ok;
> 
> memcpy of a zero length buffer into a zero length pointer does what?
> 
> > +	}
> > +
> > +	/* Otherwise look up the remote value directly. */
> 
> It's not at all obvious why we are looking up a remote xattr at this
> point in the function.

We're iterating the attr leaves looking for key/values that can be
stashed in memory while we reset the attr fork and re-add the
salvageable key/value pairs.

Sooooo, reword comment as such:

/*
 * This attribute has a remote value.  Look up the remote value so that
 * we can stash it for later reconstruction.
 */

> > +	memset(&args, 0, sizeof(args));
> > +	args.geo = rx->sc->mp->m_attr_geo;
> > +	args.index = idx;
> > +	args.namelen = namelen;
> > +	args.name = key->name;
> > +	args.valuelen = valuelen;
> > +	args.value = key->value;
> > +	args.dp = rx->sc->ip;
> > +	args.trans = rx->sc->tp;
> > +	error = xfs_attr3_leaf_getvalue(bp, &args);
> > +	if (error || args.rmtblkno == 0)
> > +		goto err_value;
> > +
> > +	error = xfs_attr_rmtval_get(&args);
> > +	switch (error) {
> > +	case 0:
> > +		break;
> > +	case -EFSBADCRC:
> > +	case -EFSCORRUPTED:
> > +		error = 0;
> > +		/* fall through */
> > +	default:
> > +		goto err_value;
> 
> So here we can return with error = 0, but no actual extended
> attribute. Isn't this a silent failure?

We're trying to salvage whatever attributes are readable in the attr
fork, so if we can't retrieve a remote value then we don't bother
reconstructing it later.

Granted there ought to be /some/ notification, maybe it's time for a new
OFLAG that says we couldn't recover everything(?)

> > +	}
> > +
> > +out_ok:
> > +	list_add_tail(&key->list, rx->attrlist);
> > +	return 0;
> > +
> > +err_value:
> > +	kmem_free(key->value);
> > +err_key:
> > +	kmem_free(key);
> > +err:
> > +	return error;
> > +}
> > +
> > +/*
> > + * Record a local format extended attribute key & value for later reinsertion
> > + * into the inode.
> > + */
> > +static inline int
> > +xfs_repair_xattr_salvage_local_attr(
> > +	struct xfs_repair_xattr		*rx,
> > +	int				flags,
> > +	unsigned char			*name,
> > +	int				namelen,
> > +	unsigned char			*value,
> > +	int				valuelen)
> > +{
> > +	return __xfs_repair_xattr_salvage_attr(rx, NULL, flags, 0, name,
> > +			namelen, value, valuelen);
> > +}
> > +
> > +/*
> > + * Record a remote format extended attribute key & value for later reinsertion
> > + * into the inode.
> > + */
> > +static inline int
> > +xfs_repair_xattr_salvage_remote_attr(
> > +	struct xfs_repair_xattr		*rx,
> > +	int				flags,
> > +	unsigned char			*name,
> > +	int				namelen,
> > +	struct xfs_buf			*leaf_bp,
> > +	int				idx,
> > +	int				valuelen)
> > +{
> > +	return __xfs_repair_xattr_salvage_attr(rx, leaf_bp, flags, idx,
> > +			name, namelen, NULL, valuelen);
> > +}
> 
> Oh, this is why __xfs_repair_xattr_salvage_attr() has two completely
> separate sets of code in it. Can we factor this differently? i.e a
> helper function to do all the validity checking and key allocation,
> and then leave the local versus remote attr handling in these
> functions?

Ok.

> > +
> > +/* Extract every xattr key that we can from this attr fork block. */
> > +STATIC int
> > +xfs_repair_xattr_recover_leaf(
> > +	struct xfs_repair_xattr		*rx,
> > +	struct xfs_buf			*bp)
> > +{
> > +	struct xfs_attr3_icleaf_hdr	leafhdr;
> > +	struct xfs_scrub_context	*sc = rx->sc;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	struct xfs_attr_leafblock	*leaf;
> > +	unsigned long			*usedmap = sc->buf;
> > +	struct xfs_attr_leaf_name_local	*lentry;
> > +	struct xfs_attr_leaf_name_remote *rentry;
> > +	struct xfs_attr_leaf_entry	*ent;
> > +	struct xfs_attr_leaf_entry	*entries;
> > +	char				*buf_end;
> > +	char				*name;
> > +	char				*name_end;
> > +	char				*value;
> > +	size_t				off;
> > +	unsigned int			nameidx;
> > +	unsigned int			namesize;
> > +	unsigned int			hdrsize;
> > +	unsigned int			namelen;
> > +	unsigned int			valuelen;
> > +	int				i;
> > +	int				error;
> 
> Can we scope all these variables inside the blocks that use them?

I'll see if I can split this function up too.

> > +
> > +	bitmap_zero(usedmap, mp->m_attr_geo->blksize);
> > +
> > +	/* Check the leaf header */
> > +	leaf = bp->b_addr;
> > +	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
> > +	hdrsize = xfs_attr3_leaf_hdr_size(leaf);
> > +	xfs_scrub_xattr_set_map(sc, usedmap, 0, hdrsize);
> > +	entries = xfs_attr3_leaf_entryp(leaf);
> > +
> > +	buf_end = (char *)bp->b_addr + mp->m_attr_geo->blksize;
> > +	for (i = 0, ent = entries; i < leafhdr.count; ent++, i++) {
> > +		/* Skip key if it conflicts with something else? */
> > +		off = (char *)ent - (char *)leaf;
> > +		if (!xfs_scrub_xattr_set_map(sc, usedmap, off,
> > +				sizeof(xfs_attr_leaf_entry_t)))
> > +			continue;
> > +
> > +		/* Check the name information. */
> > +		nameidx = be16_to_cpu(ent->nameidx);
> > +		if (nameidx < leafhdr.firstused ||
> > +		    nameidx >= mp->m_attr_geo->blksize)
> > +			continue;
> > +
> > +		if (ent->flags & XFS_ATTR_LOCAL) {
> > +			lentry = xfs_attr3_leaf_name_local(leaf, i);
> > +			namesize = xfs_attr_leaf_entsize_local(lentry->namelen,
> > +					be16_to_cpu(lentry->valuelen));
> > +			name_end = (char *)lentry + namesize;
> > +			if (lentry->namelen == 0)
> > +				continue;
> > +			name = lentry->nameval;
> > +			namelen = lentry->namelen;
> > +			valuelen = be16_to_cpu(lentry->valuelen);
> > +			value = &name[namelen];
> 
> It seems cumbersome to do a bunch of special local/remote attr
> decoding into a set of semi-common variables, only to then pass the
> specific local/remote variables back to specific local/remote
> processing functions.
> 
> i.e. I'd prefer to see the attr decoding done inside the salvage
> function so this looks something like:
> 
> 	if (ent->flags & XFS_ATTR_LOCAL) {
> 		lentry = xfs_attr3_leaf_name_local(leaf, i);
> 		error = xfs_repair_xattr_salvage_local_attr(rx,
> 					lentry, ...);
> 	} else {
> 		rentry = xfs_attr3_leaf_name_remote(leaf, i);
> 		error = xfs_repair_xattr_salvage_local_attr(rx,
> 					rentry, ....);
> 	}
> 
> ......

Seems like it'd be better than the current mess. :0

> > +
> > +/* Try to recover shortform attrs. */
> > +STATIC int
> > +xfs_repair_xattr_recover_sf(
> > +	struct xfs_repair_xattr		*rx)
> > +{
> > +	struct xfs_attr_shortform	*sf;
> > +	struct xfs_attr_sf_entry	*sfe;
> > +	struct xfs_attr_sf_entry	*next;
> > +	struct xfs_ifork		*ifp;
> > +	unsigned char			*end;
> > +	int				i;
> > +	int				error;
> > +
> > +	ifp = XFS_IFORK_PTR(rx->sc->ip, XFS_ATTR_FORK);
> > +	sf = (struct xfs_attr_shortform *)rx->sc->ip->i_afp->if_u1.if_data;
> 
> 	sf = (struct xfs_attr_shortform *)ifp->if_u1.if_data;
> 
> ....
> > +/*
> > + * Repair the extended attribute metadata.
> > + *
> > + * XXX: Remote attribute value buffers encompass the entire (up to 64k) buffer
> > + * and we can't handle those 100% until the buffer cache learns how to deal
> > + * with that.
> 
> I'm not sure what this comment means/implies.

It's another manifestation of the problem that the xfs_buf cache doesn't
do a very good job of handling multiblock xfs_bufs that alias another
xfs_buf that's already in the cache.  If the attr fork is crosslinked
with something else we'll have problems, but that's not fixed so
easily...

/*
 * XXX: Remote attribute value buffers encompass the entire (up to 64k)
 * buffer.  The buffer cache in XFS can't handle aliased multiblock
 * buffers, so this might misbehave if the attr fork is crosslinked with
 * other filesystem metadata.
 */

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 19/21] xfs: repair quotas
  2018-07-06  1:50   ` Dave Chinner
@ 2018-07-06  3:16     ` Darrick J. Wong
  0 siblings, 0 replies; 77+ messages in thread
From: Darrick J. Wong @ 2018-07-06  3:16 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Fri, Jul 06, 2018 at 11:50:25AM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:25:35PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Fix anything that causes the quota verifiers to fail.
> > +/* Fix anything the verifiers complain about. */
> > +STATIC int
> > +xfs_repair_quota_block(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_buf			*bp,
> > +	uint				dqtype,
> > +	xfs_dqid_t			id)
> > +{
> > +	struct xfs_dqblk		*d = (struct xfs_dqblk *)bp->b_addr;
> > +	struct xfs_disk_dquot		*ddq;
> > +	struct xfs_quotainfo		*qi = sc->mp->m_quotainfo;
> > +	enum xfs_blft			buftype = 0;
> > +	int				i;
> > +
> > +	bp->b_ops = &xfs_dquot_buf_ops;
> > +	for (i = 0; i < qi->qi_dqperchunk; i++) {
> > +		ddq = &d[i].dd_diskdq;
> > +
> > +		ddq->d_magic = cpu_to_be16(XFS_DQUOT_MAGIC);
> > +		ddq->d_version = XFS_DQUOT_VERSION;
> > +		ddq->d_flags = dqtype;
> > +		ddq->d_id = cpu_to_be32(id + i);
> > +
> > +		xfs_repair_quota_fix_timer(ddq->d_blk_softlimit,
> > +				ddq->d_bcount, &ddq->d_btimer,
> > +				qi->qi_btimelimit);
> > +		xfs_repair_quota_fix_timer(ddq->d_ino_softlimit,
> > +				ddq->d_icount, &ddq->d_itimer,
> > +				qi->qi_itimelimit);
> > +		xfs_repair_quota_fix_timer(ddq->d_rtb_softlimit,
> > +				ddq->d_rtbcount, &ddq->d_rtbtimer,
> > +				qi->qi_rtbtimelimit);
> > +
> > +		if (xfs_sb_version_hascrc(&sc->mp->m_sb)) {
> > +			uuid_copy(&d->dd_uuid, &sc->mp->m_sb.sb_meta_uuid);
> > +			xfs_update_cksum((char *)d, sizeof(struct xfs_dqblk),
> > +					 XFS_DQUOT_CRC_OFF);
> 
> Hmmm - Do we need to reset the lsn here, too? 

Probably, but...

> What makes me think - do we check that the lsn is valid in verifiers
> and so catch/fix problems where the lsn is outside the valid range
> of the current log LSNs?

...while we use them to skip old dquot logs during log recovery, we
don't check them otherwise AFAICT.

> > +STATIC int
> > +xfs_repair_quota_data_fork(
> > +	struct xfs_scrub_context	*sc,
> > +	uint				dqtype)
> > +{
> > +	struct xfs_bmbt_irec		irec = { 0 };
> > +	struct xfs_iext_cursor		icur;
> > +	struct xfs_quotainfo		*qi = sc->mp->m_quotainfo;
> > +	struct xfs_ifork		*ifp;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_dqblk		*d;
> > +	xfs_dqid_t			id;
> > +	xfs_fileoff_t			max_dqid_off;
> > +	xfs_fileoff_t			off;
> > +	xfs_fsblock_t			fsbno;
> > +	bool				truncate = false;
> > +	int				error = 0;
> > +
> > +	error = xfs_repair_metadata_inode_forks(sc);
> > +	if (error)
> > +		goto out;
> > +
> > +	/* Check for data fork problems that apply only to quota files. */
> > +	max_dqid_off = ((xfs_dqid_t)-1) / qi->qi_dqperchunk;
> > +	ifp = XFS_IFORK_PTR(sc->ip, XFS_DATA_FORK);
> > +	for_each_xfs_iext(ifp, &icur, &irec) {
> > +		if (isnullstartblock(irec.br_startblock)) {
> > +			error = -EFSCORRUPTED;
> > +			goto out;
> > +		}
> > +
> > +		if (irec.br_startoff > max_dqid_off ||
> > +		    irec.br_startoff + irec.br_blockcount - 1 > max_dqid_off) {
> > +			truncate = true;
> > +			break;
> > +		}
> > +	}
> > +	if (truncate) {
> > +		error = xfs_itruncate_extents(&sc->tp, sc->ip, XFS_DATA_FORK,
> > +				max_dqid_off * sc->mp->m_sb.sb_blocksize);
> > +		if (error)
> > +			goto out;
> > +	}
> > +
> > +	/* Now go fix anything that fails the verifiers. */
> > +	for_each_xfs_iext(ifp, &icur, &irec) {
> > +		for (fsbno = irec.br_startblock, off = irec.br_startoff;
> > +		     fsbno < irec.br_startblock + irec.br_blockcount;
> > +		     fsbno += XFS_DQUOT_CLUSTER_SIZE_FSB,
> > +				off += XFS_DQUOT_CLUSTER_SIZE_FSB) {
> > +			id = off * qi->qi_dqperchunk;
> > +			error = xfs_trans_read_buf(sc->mp, sc->tp,
> > +					sc->mp->m_ddev_targp,
> > +					XFS_FSB_TO_DADDR(sc->mp, fsbno),
> > +					qi->qi_dqchunklen,
> > +					0, &bp, &xfs_dquot_buf_ops);
> > +			if (error == 0) {
> > +				d = (struct xfs_dqblk *)bp->b_addr;
> > +				if (id == be32_to_cpu(d->dd_diskdq.d_id))
> > +					continue;
> 
> Need to release the buffer here - it's clean and passes the
> verifier, so no need to hold on to them as we may have thousands of
> them to walk here.

Ok.

> Otherwise looks ok.

Thank you for the review!

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


end of thread, other threads:[~2018-07-06  3:16 UTC | newest]

Thread overview: 77+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
2018-06-24 19:23 ` [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap Darrick J. Wong
2018-06-27  0:54   ` Dave Chinner
2018-06-28 21:11   ` Allison Henderson
2018-06-29 14:39     ` Darrick J. Wong
2018-06-24 19:23 ` [PATCH 02/21] xfs: add helper to decide if an inode has allocated cow blocks Darrick J. Wong
2018-06-27  1:02   ` Dave Chinner
2018-06-28 21:12   ` Allison Henderson
2018-06-24 19:23 ` [PATCH 03/21] xfs: refactor part of xfs_free_eofblocks Darrick J. Wong
2018-06-28 21:13   ` Allison Henderson
2018-06-24 19:23 ` [PATCH 04/21] xfs: repair the AGF and AGFL Darrick J. Wong
2018-06-27  2:19   ` Dave Chinner
2018-06-27 16:44     ` Allison Henderson
2018-06-27 23:37       ` Dave Chinner
2018-06-29 15:14         ` Darrick J. Wong
2018-06-28 17:25     ` Allison Henderson
2018-06-29 15:08       ` Darrick J. Wong
2018-06-28 21:14   ` Allison Henderson
2018-06-28 23:21     ` Dave Chinner
2018-06-29  1:35       ` Allison Henderson
2018-06-29 14:55         ` Darrick J. Wong
2018-06-24 19:24 ` [PATCH 05/21] xfs: repair the AGI Darrick J. Wong
2018-06-27  2:22   ` Dave Chinner
2018-06-28 21:15   ` Allison Henderson
2018-06-24 19:24 ` [PATCH 06/21] xfs: repair free space btrees Darrick J. Wong
2018-06-27  3:21   ` Dave Chinner
2018-07-04  2:15     ` Darrick J. Wong
2018-07-04  2:25       ` Dave Chinner
2018-06-30 17:36   ` Allison Henderson
2018-06-24 19:24 ` [PATCH 07/21] xfs: repair inode btrees Darrick J. Wong
2018-06-28  0:55   ` Dave Chinner
2018-07-04  2:22     ` Darrick J. Wong
2018-06-30 17:36   ` Allison Henderson
2018-06-30 18:30     ` Darrick J. Wong
2018-07-01  0:45       ` Allison Henderson
2018-06-24 19:24 ` [PATCH 08/21] xfs: defer iput on certain inodes while scrub / repair are running Darrick J. Wong
2018-06-28 23:37   ` Dave Chinner
2018-06-29 14:49     ` Darrick J. Wong
2018-06-24 19:24 ` [PATCH 09/21] xfs: finish our set of inode get/put tracepoints for scrub Darrick J. Wong
2018-06-24 19:24 ` [PATCH 10/21] xfs: introduce online scrub freeze Darrick J. Wong
2018-06-24 19:24 ` [PATCH 11/21] xfs: repair the rmapbt Darrick J. Wong
2018-07-03  5:32   ` Dave Chinner
2018-07-03 23:59     ` Darrick J. Wong
2018-07-04  8:44       ` Carlos Maiolino
2018-07-04 18:40         ` Darrick J. Wong
2018-07-04 23:21       ` Dave Chinner
2018-07-05  3:48         ` Darrick J. Wong
2018-07-05  7:03           ` Dave Chinner
2018-07-06  0:47             ` Darrick J. Wong
2018-07-06  1:08               ` Dave Chinner
2018-06-24 19:24 ` [PATCH 12/21] xfs: repair refcount btrees Darrick J. Wong
2018-07-03  5:50   ` Dave Chinner
2018-07-04  2:23     ` Darrick J. Wong
2018-06-24 19:24 ` [PATCH 13/21] xfs: repair inode records Darrick J. Wong
2018-07-03  6:17   ` Dave Chinner
2018-07-04  0:16     ` Darrick J. Wong
2018-07-04  1:03       ` Dave Chinner
2018-07-04  1:30         ` Darrick J. Wong
2018-06-24 19:24 ` [PATCH 14/21] xfs: zap broken inode forks Darrick J. Wong
2018-07-04  2:07   ` Dave Chinner
2018-07-04  3:26     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 15/21] xfs: repair inode block maps Darrick J. Wong
2018-07-04  3:00   ` Dave Chinner
2018-07-04  3:41     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 16/21] xfs: repair damaged symlinks Darrick J. Wong
2018-07-04  5:45   ` Dave Chinner
2018-07-04 18:45     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 17/21] xfs: repair extended attributes Darrick J. Wong
2018-07-06  1:03   ` Dave Chinner
2018-07-06  3:10     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 18/21] xfs: scrub should set preen if attr leaf has holes Darrick J. Wong
2018-06-29  2:52   ` Dave Chinner
2018-06-24 19:25 ` [PATCH 19/21] xfs: repair quotas Darrick J. Wong
2018-07-06  1:50   ` Dave Chinner
2018-07-06  3:16     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 20/21] xfs: implement live quotacheck as part of quota repair Darrick J. Wong
2018-06-24 19:25 ` [PATCH 21/21] xfs: add online scrub/repair for superblock counters Darrick J. Wong
