All of lore.kernel.org
 help / color / mirror / Atom feed
From: Allison Henderson <allison.henderson@oracle.com>
To: "djwong@kernel.org" <djwong@kernel.org>
Cc: Catherine Hoang <catherine.hoang@oracle.com>,
	"david@fromorbit.com" <david@fromorbit.com>,
	"willy@infradead.org" <willy@infradead.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	Chandan Babu <chandan.babu@oracle.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"hch@infradead.org" <hch@infradead.org>
Subject: Re: [PATCH 10/14] xfs: document full filesystem scans for online fsck
Date: Sat, 25 Feb 2023 07:33:38 +0000	[thread overview]
Message-ID: <04811ecc7a2e7abc7e14e248f7389a37ee4d8ded.camel@oracle.com> (raw)
In-Reply-To: <Y+6ywAO0fdddf79C@magnolia>

On Thu, 2023-02-16 at 14:48 -0800, Darrick J. Wong wrote:
> On Thu, Feb 16, 2023 at 03:47:20PM +0000, Allison Henderson wrote:
> > On Fri, 2022-12-30 at 14:10 -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > Certain parts of the online fsck code need to scan every file in
> > > the
> > > entire filesystem.  It is not acceptable to block the entire
> > > filesystem
> > > while this happens, which means that we need to be clever in
> > > allowing
> > > scans to coordinate with ongoing filesystem updates.  We also
> > > need to
> > > hook the filesystem so that regular updates propagate to the
> > > staging
> > > records.
> > > 
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > ---
> > >  .../filesystems/xfs-online-fsck-design.rst         |  677
> > > ++++++++++++++++++++
> > >  1 file changed, 677 insertions(+)
> > > 
> > > 
> > > diff --git a/Documentation/filesystems/xfs-online-fsck-design.rst
> > > b/Documentation/filesystems/xfs-online-fsck-design.rst
> > > index a658da8fe4ae..c0f08a773f08 100644
> > > --- a/Documentation/filesystems/xfs-online-fsck-design.rst
> > > +++ b/Documentation/filesystems/xfs-online-fsck-design.rst
> > > @@ -3018,3 +3018,680 @@ The proposed patchset is the
> > >  `summary counter cleanup
> > >  <
> > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > > log/?h=repair-fscounters>`_
> > >  series.
> > > +
> > > +Full Filesystem Scans
> > > +---------------------
> > > +
> > > +Certain types of metadata can only be checked by walking every
> > > file
> > > in the
> > > +entire filesystem to record observations and comparing the
> > > observations against
> > > +what's recorded on disk.
> > > +Like every other type of online repair, repairs are made by
> > > writing
> > > those
> > > +observations to disk in a replacement structure and committing
> > > it
> > > atomically.
> > > +However, it is not practical to shut down the entire filesystem
> > > to
> > > examine
> > > +hundreds of billions of files because the downtime would be
> > > excessive.
> > > +Therefore, online fsck must build the infrastructure to manage a
> > > live scan of
> > > +all the files in the filesystem.
> > > +There are two questions that need to be solved to perform a live
> > > walk:
> > > +
> > > +- How does scrub manage the scan while it is collecting data?
> > > +
> > > +- How does the scan keep abreast of changes being made to the
> > > system
> > > by other
> > > +  threads?
> > > +
> > > +.. _iscan:
> > > +
> > > +Coordinated Inode Scans
> > > +```````````````````````
> > > +
> > > +In the original Unix filesystems of the 1970s, each directory
> > > entry
> > > contained
> > > +an index number (*inumber*) which was used as an index into on
> > > ondisk array
> > > +(*itable*) of fixed-size records (*inodes*) describing a file's
> > > attributes and
> > > +its data block mapping.
> > > +This system is described by J. Lions, `"inode (5659)"
> > > +<http://www.lemis.com/grog/Documentation/Lions/>`_ in *Lions'
> > > Commentary on
> > > +UNIX, 6th Edition*, (Dept. of Computer Science, the University
> > > of
> > > New South
> > > +Wales, November 1977), pp. 18-2; and later by D. Ritchie and K.
> > > Thompson,
> > > +`"Implementation of the File System"
> > > +<https://archive.org/details/bstj57-6-1905/page/n8/mode/1up>`_,
> > > from
> > > *The UNIX
> > > +Time-Sharing System*, (The Bell System Technical Journal, July
> > > 1978), pp.
> > > +1913-4.
> > > +
> > > +XFS retains most of this design, except now inumbers are search
> > > keys
> > > over all
> > > +the space in the data section filesystem.
> > > +They form a continuous keyspace that can be expressed as a 64-
> > > bit
> > > integer,
> > > +though the inodes themselves are sparsely distributed within the
> > > keyspace.
> > > +Scans proceed in a linear fashion across the inumber keyspace,
> > > starting from
> > > +``0x0`` and ending at ``0xFFFFFFFFFFFFFFFF``.
> > > +Naturally, a scan through a keyspace requires a scan cursor
> > > object
> > > to track the
> > > +scan progress.
> > > +Because this keyspace is sparse, this cursor contains two parts.
> > > +The first part of this scan cursor object tracks the inode that
> > > will
> > > be
> > > +examined next; call this the examination cursor.
> > > +Somewhat less obviously, the scan cursor object must also track
> > > which parts of
> > > +the keyspace have already been visited, which is critical for
> > > deciding if a
> > > +concurrent filesystem update needs to be incorporated into the
> > > scan
> > > data.
> > > +Call this the visited inode cursor.
> > > +
> > > +Advancing the scan cursor is a multi-step process encapsulated
> > > in
> > > +``xchk_iscan_iter``:
> > > +
> > > +1. Lock the AGI buffer of the AG containing the inode pointed to
> > > by
> > > the visited
> > > +   inode cursor.
> > > +   This guarantee that inodes in this AG cannot be allocated or
> > > freed while
> > > +   advancing the cursor.
> > > +
> > > +2. Use the per-AG inode btree to look up the next inumber after
> > > the
> > > one that
> > > +   was just visited, since it may not be keyspace adjacent.
> > > +
> > > +3. If there are no more inodes left in this AG:
> > > +
> > > +   a. Move the examination cursor to the point of the inumber
> > > keyspace that
> > > +      corresponds to the start of the next AG.
> > > +
> > > +   b. Adjust the visited inode cursor to indicate that it has
> > > "visited" the
> > > +      last possible inode in the current AG's inode keyspace.
> > > +      XFS inumbers are segmented, so the cursor needs to be
> > > marked
> > > as having
> > > +      visited the entire keyspace up to just before the start of
> > > the
> > > next AG's
> > > +      inode keyspace.
> > > +
> > > +   c. Unlock the AGI and return to step 1 if there are
> > > unexamined
> > > AGs in the
> > > +      filesystem.
> > > +
> > > +   d. If there are no more AGs to examine, set both cursors to
> > > the
> > > end of the
> > > +      inumber keyspace.
> > > +      The scan is now complete.
> > > +
> > > +4. Otherwise, there is at least one more inode to scan in this
> > > AG:
> > > +
> > > +   a. Move the examination cursor ahead to the next inode marked
> > > as
> > > allocated
> > > +      by the inode btree.
> > > +
> > > +   b. Adjust the visited inode cursor to point to the inode just
> > > prior to where
> > > +      the examination cursor is now.
> > > +      Because the scanner holds the AGI buffer lock, no inodes
> > > could
> > > have been
> > > +      created in the part of the inode keyspace that the visited
> > > inode cursor
> > > +      just advanced.
> > > +
> > > +5. Get the incore inode for the inumber of the examination
> > > cursor.
> > > +   By maintaining the AGI buffer lock until this point, the
> > > scanner
> > > knows that
> > > +   it was safe to advance the examination cursor across the
> > > entire
> > > keyspace,
> > > +   and that it has stabilized this next inode so that it cannot
> > > disappear from
> > > +   the filesystem until the scan releases the incore inode.
> > > +
> > > +6. Drop the AGI lock and return the incore inode to the caller.
> > > +
> > > +Online fsck functions scan all files in the filesystem as
> > > follows:
> > > +
> > > +1. Start a scan by calling ``xchk_iscan_start``.
> > Hmm, I actually did not find xchk_iscan_start in the below branch,
> > I
> > found xchk_iscan_iter in "xfs: implement live inode scan for
> > scrub",
> > but it doesnt look like anything uses it yet, at least not in that
> > branch.
> 
> <nod> The topic branch linked below has the implementation, but no
> users.  The first user is online quotacheck, which is in the next
> branch
> after that:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quotacheck
> 
> Specifically, this patch:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/commit/?h=repair-quotacheck&id=3640515b9282514d91a407b6aa8d8b73caa123c5
> 
> I'll restate what you probably saw in the commit message for this
> email discussion:
> 
> This "one branch to introduce a new infrastructure and a second
> branch
> to actually use it" pattern is a result of reviewer requests for
> smaller
> more focused branches.  This has turned out to be useful in practice
> because it's easier to move just these pieces up and down in the
> branch
> as needed.  The inode scan was originally developed for rmapbt repair
> (which comes *much* later) and moved it up once I realized that
> quotacheck has far fewer dependencies and hence all of this could
> come
> earlier.
> 
> You're right that this section ought to point to an actual user of
> the
> functionality.  Will fix. :)

Alrighty then, sounds good

> 
> > Also, it took me a bit to figure out that "initial user" meant
> > "calling
> > function"
> 
> Er... are you talking about the sentence "...new code is split out as
> a
> separate patch from its initial user" in the patch commit message?
> 
> Maybe I should reword that:
> 
> "This new code is a separate patch from the patches adding callers
> for
> the sake of enabling the author to move patches around his tree..."
Yes, I think that's clearer :-)

> 
> > > +
> > > +2. Advance the scan cursor (``xchk_iscan_iter``) to get the next
> > > inode.
> > > +   If one is provided:
> > > +
> > > +   a. Lock the inode to prevent updates during the scan.
> > > +
> > > +   b. Scan the inode.
> > > +
> > > +   c. While still holding the inode lock, adjust the visited
> > > inode
> > > cursor
> > > +      (``xchk_iscan_mark_visited``) to point to this inode.
> > > +
> > > +   d. Unlock and release the inode.
> > > +
> > > +8. Call ``xchk_iscan_finish`` to complete the scan.
> > > +
> > > +There are subtleties with the inode cache that complicate
> > > grabbing
> > > the incore
> > > +inode for the caller.
> > > +Obviously, it is an absolute requirement that the inode metadata
> > > be
> > > consistent
> > > +enough to load it into the inode cache.
> > > +Second, if the incore inode is stuck in some intermediate state,
> > > the
> > > scan
> > > +coordinator must release the AGI and push the main filesystem to
> > > get
> > > the inode
> > > +back into a loadable state.
> > > +
> > > +The proposed patches are the
> > > +`inode scanner
> > > +<
> > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > > log/?h=scrub-iscan>`_
> > > +series.
> > > +
> > > +Inode Management
> > > +````````````````
> > > +
> > > +In regular filesystem code, references to allocated XFS incore
> > > inodes are
> > > +always obtained (``xfs_iget``) outside of transaction context
> > > because the
> > > +creation of the incore context for ane xisting file does not
> > > require
> > an existing
> 
> Corrected, thank you.
> 
> > > metadata
> > > +updates.
> > > +However, it is important to note that references to incore
> > > inodes
> > > obtained as
> > > +part of file creation must be performed in transaction context
> > > because the
> > > +filesystem must ensure the atomicity of the ondisk inode btree
> > > index
> > > updates
> > > +and the initialization of the actual ondisk inode.
> > > +
> > > +References to incore inodes are always released (``xfs_irele``)
> > > outside of
> > > +transaction context because there are a handful of activities
> > > that
> > > might
> > > +require ondisk updates:
> > > +
> > > +- The VFS may decide to kick off writeback as part of a
> > > ``DONTCACHE`` inode
> > > +  release.
> > > +
> > > +- Speculative preallocations need to be unreserved.
> > > +
> > > +- An unlinked file may have lost its last reference, in which
> > > case
> > > the entire
> > > +  file must be inactivated, which involves releasing all of its
> > > resources in
> > > +  the ondisk metadata and freeing the inode.
> > > +
> > > +These activities are collectively called inode inactivation.
> > > +Inactivation has two parts -- the VFS part, which initiates
> > > writeback on all
> > > +dirty file pages, and the XFS part, which cleans up XFS-specific
> > > information
> > > +and frees the inode if it was unlinked.
> > > +If the inode is unlinked (or unconnected after a file handle
> > > operation), the
> > > +kernel drops the inode into the inactivation machinery
> > > immediately.
> > > +
> > > +During normal operation, resource acquisition for an update
> > > follows
> > > this order
> > > +to avoid deadlocks:
> > > +
> > > +1. Inode reference (``iget``).
> > > +
> > > +2. Filesystem freeze protection, if repairing
> > > (``mnt_want_write_file``).
> > > +
> > > +3. Inode ``IOLOCK`` (VFS ``i_rwsem``) lock to control file IO.
> > > +
> > > +4. Inode ``MMAPLOCK`` (page cache ``invalidate_lock``) lock for
> > > operations that
> > > +   can update page cache mappings.
> > > +
> > > +5. Log feature enablement.
> > > +
> > > +6. Transaction log space grant.
> > > +
> > > +7. Space on the data and realtime devices for the transaction.
> > > +
> > > +8. Incore dquot references, if a file is being repaired.
> > > +   Note that they are not locked, merely acquired.
> > > +
> > > +9. Inode ``ILOCK`` for file metadata updates.
> > > +
> > > +10. AG header buffer locks / Realtime metadata inode ILOCK.
> > > +
> > > +11. Realtime metadata buffer locks, if applicable.
> > > +
> > > +12. Extent mapping btree blocks, if applicable.
> > > +
> > > +Resources are often released in the reverse order, though this
> > > is
> > > not required.
> > > +However, online fsck differs from regular XFS operations because
> > > it
> > > may examine
> > > +an object that normally is acquired in a later stage of the
> > > locking
> > > order, and
> > > +then decide to cross-reference the object with an object that is
> > > acquired
> > > +earlier in the order.
> > > +The next few sections detail the specific ways in which online
> > > fsck
> > > takes care
> > > +to avoid deadlocks.
> > > +
> > > +iget and irele During a Scrub
> > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > +
> > > +An inode scan performed on behalf of a scrub operation runs in
> > > transaction
> > > +context, and possibly with resources already locked and bound to
> > > it.
> > > +This isn't much of a problem for ``iget`` since it can operate
> > > in
> > > the context
> > > +of an existing transaction, as long as all of the bound
> > > resources
> > > are acquired
> > > +before the inode reference in the regular filesystem.
> > > +
> > > +When the VFS ``iput`` function is given a linked inode with no
> > > other
> > > +references, it normally puts the inode on an LRU list in the
> > > hope
> > > that it can
> > > +save time if another process re-opens the file before the system
> > > runs out
> > > +of memory and frees it.
> > > +Filesystem callers can short-circuit the LRU process by setting
> > > a
> > > ``DONTCACHE``
> > > +flag on the inode to cause the kernel to try to drop the inode
> > > into
> > > the
> > > +inactivation machinery immediately.
> > > +
> > > +In the past, inactivation was always done from the process that
> > > dropped the
> > > +inode, which was a problem for scrub because scrub may already
> > > hold
> > > a
> > > +transaction, and XFS does not support nesting transactions.
> > > +On the other hand, if there is no scrub transaction, it is
> > > desirable
> > > to drop
> > > +otherwise unused inodes immediately to avoid polluting caches.
> > > +To capture these nuances, the online fsck code has a separate
> > > ``xchk_irele``
> > > +function to set or clear the ``DONTCACHE`` flag to get the
> > > required
> > > release
> > > +behavior.
> > > +
> > > +Proposed patchsets include fixing
> > > +`scrub iget usage
> > > +<
> > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > > log/?h=scrub-iget-fixes>`_ and
> > > +`dir iget usage
> > > +<
> > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > > log/?h=scrub-dir-iget-fixes>`_.
> > > +
> > > +Locking Inodes
> > > +^^^^^^^^^^^^^^
> > > +
> > > +In regular filesystem code, the VFS and XFS will acquire
> > > multiple
> > > IOLOCK locks
> > > +in a well-known order: parent → child when updating the
> > > directory
> > > tree, and
> > > +``struct inode`` address order otherwise.
> > > +For regular files, the MMAPLOCK can be acquired after the IOLOCK
> > > to
> > > stop page
> > > +faults.
> > > +If two MMAPLOCKs must be acquired, they are acquired in 
> > 
> > 
> > > ``struct
> > > +address_space`` order.
> > the order of their memory address
> > 
> > ?
> 
> Urghg.  I think I need to clarify this more:
> 
> "...they are acquired in numerical order of the addresses of their
> ``struct address_space`` objects."
> 
> See filemap_invalidate_lock_two.
> 
Yep, I think that works

> > > +Due to the structure of existing filesystem code, IOLOCKs and
> > > MMAPLOCKs must be
> > > +acquired before transactions are allocated.
> > > +If two ILOCKs must be acquired, they are acquired in inumber
> > > order.
> > > +
> > > +Inode lock acquisition must be done carefully during a
> > > coordinated
> > > inode scan.
> > > +Online fsck cannot abide these conventions, because for a
> > > directory
> > > tree
> > > +scanner, the scrub process holds the IOLOCK of the file being
> > > scanned and it
> > > +needs to take the IOLOCK of the file at the other end of the
> > > directory link.
> > > +If the directory tree is corrupt because it contains a cycle,
> > > ``xfs_scrub``
> > > +cannot use the regular inode locking functions and avoid
> > > becoming
> > > trapped in an
> > > +ABBA deadlock.
> > > +
> > > +Solving both of these problems is straightforward -- any time
> > > online
> > > fsck
> > > +needs to take a second lock of the same class, it uses trylock
> > > to
> > > avoid an ABBA
> > > +deadlock.
> > > +If the trylock fails, scrub drops all inode locks and use
> > > trylock
> > > loops to
> > > +(re)acquire all necessary resources.
> > > +Trylock loops enable scrub to check for pending fatal signals,
> > > which
> > > is how
> > > +scrub avoids deadlocking the filesystem or becoming an
> > > unresponsive
> > > process.
> > > +However, trylock loops means that online fsck must be prepared
> > > to
> > > measure the
> > > +resource being scrubbed before and after the lock cycle to
> > > detect
> > > changes and
> > > +react accordingly.
> > > +
> > > +.. _dirparent:
> > > +
> > > +Case Study: Finding a Directory Parent
> > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > +
> > > +Consider the directory parent pointer repair code as an example.
> > > +Online fsck must verify that the dotdot dirent of a directory
> > > points
> > > up to a
> > > +parent directory, and that the parent directory contains exactly
> > > one
> > > dirent
> > > +pointing down to the child directory.
> > > +Fully validating this relationship (and repairing it if
> > > possible)
> > > requires a
> > > +walk of every directory on the filesystem while holding the
> > > child
> > > locked, and
> > > +while updates to the directory tree are being made.
> > > +The coordinated inode scan provides a way to walk the filesystem
> > > without the
> > > +possibility of missing an inode.
> > > +The child directory is kept locked to prevent updates to the
> > > dotdot
> > > dirent, but
> > > +if the scanner fails to lock a parent, it can drop and relock
> > > both
> > > the child
> > > +and the prospective parent.
> > > +If the dotdot entry changes while the directory is unlocked,
> > > then a
> > > move or
> > > +rename operation must have changed the child's parentage, and
> > > the
> > > scan can
> > > +exit early.
> > > +
> > > +The proposed patchset is the
> > > +`directory repair
> > > +<
> > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > > log/?h=repair-dirs>`_
> > > +series.
> > > +
> > > +.. _fshooks:
> > > +
> > > +Filesystem Hooks
> > > +`````````````````
> > > +
> > > +The second piece of support that online fsck functions need
> > > during a
> > > full
> > > +filesystem scan is the ability to stay informed about updates
> > > being
> > > made by
> > > +other threads in the filesystem, since comparisons against the
> > > past
> > > are useless
> > > +in a dynamic environment.
> > > +Two pieces of Linux kernel infrastructure enable online fsck to
> > > monitor regular
> > > +filesystem operations: filesystem hooks and :ref:`static
> > > keys<jump_labels>`.
> > > +
> > > +Filesystem hooks convey information about an ongoing filesystem
> > > operation to
> > > +a downstream consumer.
> > > +In this case, the downstream consumer is always an online fsck
> > > function.
> > > +Because multiple fsck functions can run in parallel, online fsck
> > > uses the Linux
> > > +notifier call chain facility to dispatch updates to any number
> > > of
> > > interested
> > > +fsck processes.
> > > +Call chains are a dynamic list, which means that they can be
> > > configured at
> > > +run time.
> > > +Because these hooks are private to the XFS module, the
> > > information
> > > passed along
> > > +contains exactly what the checking function needs to update its
> > > observations.
> > > +
> > > +The current implementation of XFS hooks uses SRCU notifier
> > > chains to
> > > reduce the
> > > +impact to highly threaded workloads.
> > > +Regular blocking notifier chains use a rwsem and seem to have a
> > > much
> > > lower
> > > +overhead for single-threaded applications.
> > > +However, it may turn out that the combination of blocking chains
> > > and
> > > static
> > > +keys are a more performant combination; more study is needed
> > > here.
> > > +
> > > +The following pieces are necessary to hook a certain point in
> > > the
> > > filesystem:
> > > +
> > > +- A ``struct xfs_hooks`` object must be embedded in a convenient
> > > place such as
> > > +  a well-known incore filesystem object.
> > > +
> > > +- Each hook must define an action code and a structure
> > > containing
> > > more context
> > > +  about the action.
> > > +
> > > +- Hook providers should provide appropriate wrapper functions
> > > and
> > > structs
> > > +  around the ``xfs_hooks`` and ``xfs_hook`` objects to take
> > > advantage of type
> > > +  checking to ensure correct usage.
> > > +
> > > +- A callsite in the regular filesystem code must be chosen to
> > > call
> > > +  ``xfs_hooks_call`` with the action code and data structure.
> > > +  This place should be adjacent to (and not earlier than) the
> > > place
> > > where
> > > +  the filesystem update is committed to the transaction.
> > > +  In general, when the filesystem calls a hook chain, it should
> > > be
> > > able to
> > > +  handle sleeping and should not be vulnerable to memory reclaim
> > > or
> > > locking
> > > +  recursion.
> > > +  However, the exact requirements are very dependent on the
> > > context
> > > of the hook
> > > +  caller and the callee.
> > > +
> > > +- The online fsck function should define a structure to hold
> > > scan
> > > data, a lock
> > > +  to coordinate access to the scan data, and a ``struct
> > > xfs_hook``
> > > object.
> > > +  The scanner function and the regular filesystem code must
> > > acquire
> > > resources
> > > +  in the same order; see the next section for details.
> > > +
> > > +- The online fsck code must contain a C function to catch the
> > > hook
> > > action code
> > > +  and data structure.
> > > +  If the object being updated has already been visited by the
> > > scan,
> > > then the
> > > +  hook information must be applied to the scan data.
> > > +
> > > +- Prior to unlocking inodes to start the scan, online fsck must
> > > call
> > > +  ``xfs_hooks_setup`` to initialize the ``struct xfs_hook``, and
> > > +  ``xfs_hooks_add`` to enable the hook.
> > > +
> > > +- Online fsck must call ``xfs_hooks_del`` to disable the hook
> > > once
> > > the scan is
> > > +  complete.
> > > +
> > > +The number of hooks should be kept to a minimum to reduce
> > > complexity.
> > > +Static keys are used to reduce the overhead of filesystem hooks
> > > to
> > > nearly
> > > +zero when online fsck is not running.
> > > +
> > > +.. _liveupdate:
> > > +
> > > +Live Updates During a Scan
> > > +``````````````````````````
> > > +
> > > +The code paths of the online fsck scanning code and the
> > > :ref:`hooked<fshooks>`
> > > +filesystem code look like this::
> > > +
> > > +            other program
> > > +                  ↓
> > > +            inode lock ←────────────────────┐
> > > +                  ↓                         │
> > > +            AG header lock                  │
> > > +                  ↓                         │
> > > +            filesystem function             │
> > > +                  ↓                         │
> > > +            notifier call chain             │    same
> > > +                  ↓                         ├─── inode
> > > +            scrub hook function             │    lock
> > > +                  ↓                         │
> > > +            scan data mutex ←──┐    same    │
> > > +                  ↓            ├─── scan    │
> > > +            update scan data   │    lock    │
> > > +                  ↑            │            │
> > > +            scan data mutex ←──┘            │
> > > +                  ↑                         │
> > > +            inode lock ←────────────────────┘
> > > +                  ↑
> > > +            scrub function
> > > +                  ↑
> > > +            inode scanner
> > > +                  ↑
> > > +            xfs_scrub
> > > +
> > > +These rules must be followed to ensure correct interactions
> > > between
> > > the
> > > +checking code and the code making an update to the filesystem:
> > > +
> > > +- Prior to invoking the notifier call chain, the filesystem
> > > function
> > > being
> > > +  hooked must acquire the same lock that the scrub scanning
> > > function
> > > acquires
> > > +  to scan the inode.
> > > +
> > > +- The scanning function and the scrub hook function must
> > > coordinate
> > > access to
> > > +  the scan data by acquiring a lock on the scan data.
> > > +
> > > +- Scrub hook function must not add the live update information
> > > to
> > > the scan
> > > +  observations unless the inode being updated has already been
> > > scanned.
> > > +  The scan coordinator has a helper predicate
> > > (``xchk_iscan_want_live_update``)
> > > +  for this.
> > > +
> > > +- Scrub hook functions must not change the caller's state,
> > > including
> > > the
> > > +  transaction that it is running.
> > > +  They must not acquire any resources that might conflict with
> > > the
> > > filesystem
> > > +  function being hooked.
> > > +
> > > +- The hook function can abort the inode scan to avoid breaking
> > > the
> > > other rules.
> > > +
> > > +The inode scan APIs are pretty simple:
> > > +
> > > +- ``xchk_iscan_start`` starts a scan
> > > +
> > > +- ``xchk_iscan_iter`` grabs a reference to the next inode in the
> > > scan or
> > > +  returns zero if there is nothing left to scan
> > > +
> > > +- ``xchk_iscan_want_live_update`` to decide if an inode has
> > > already
> > > been
> > > +  visited in the scan.
> > > +  This is critical for hook functions to decide if they need to
> > > update the
> > > +  in-memory scan information.
> > > +
> > > +- ``xchk_iscan_mark_visited`` to mark an inode as having been
> > > visited in the
> > > +  scan
> > > +
> > > +- ``xchk_iscan_finish`` to finish the scan
> > > +
> > > +The proposed patches are at the start of the
> > > +`online quotacheck
> > > +<
> > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > > log/?h=repair-quota>`_
> > > +series.
> > Wrong link?  This looks like it goes to the section below.
> 
> Oops.  This one should link to scrub-iscan, and the next one should
> link
> to repair-quotacheck.
> 
> > > +
> > > +.. _quotacheck:
> > > +
> > > +Case Study: Quota Counter Checking
> > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > +
> > > +It is useful to compare the mount time quotacheck code to the
> > > online
> > > repair
> > > +quotacheck code.
> > > +Mount time quotacheck does not have to contend with concurrent
> > > operations, so
> > > +it does the following:
> > > +
> > > +1. Make sure the ondisk dquots are in good enough shape that all
> > > the
> > > incore
> > > +   dquots will actually load, and zero the resource usage
> > > counters
> > > in the
> > > +   ondisk buffer.
> > > +
> > > +2. Walk every inode in the filesystem.
> > > +   Add each file's resource usage to the incore dquot.
> > > +
> > > +3. Walk each incore dquot.
> > > +   If the incore dquot is not being flushed, add the ondisk
> > > buffer
> > > backing the
> > > +   incore dquot to a delayed write (delwri) list.
> > > +
> > > +4. Write the buffer list to disk.
> > > +
> > > +Like most online fsck functions, online quotacheck can't write
> > > to
> > > regular
> > > +filesystem objects until the newly collected metadata reflect
> > > all
> > > filesystem
> > > +state.
> > > +Therefore, online quotacheck records file resource usage to a
> > > shadow
> > > dquot
> > > +index implemented with a sparse ``xfarray``, and only writes to
> > > the
> > > real dquots
> > > +once the scan is complete.
> > > +Handling transactional updates is tricky because quota resource
> > > usage updates
> > > +are handled in phases to minimize contention on dquots:
> > > +
> > > +1. The inodes involved are joined and locked to a transaction.
> > > +
> > > +2. For each dquot attached to the file:
> > > +
> > > +   a. The dquot is locked.
> > > +
> > > +   b. A quota reservation is added to the dquot's resource
> > > usage.
> > > +      The reservation is recorded in the transaction.
> > > +
> > > +   c. The dquot is unlocked.
> > > +
> > > +3. Changes in actual quota usage are tracked in the transaction.
> > > +
> > > +4. At transaction commit time, each dquot is examined again:
> > > +
> > > +   a. The dquot is locked again.
> > > +
> > > +   b. Quota usage changes are logged and unused reservation is
> > > given
> > > back to
> > > +      the dquot.
> > > +
> > > +   c. The dquot is unlocked.
> > > +
> > > +For online quotacheck, hooks are placed in steps 2 and 4.
> > > +The step 2 hook creates a shadow version of the transaction
> > > dquot
> > > context
> > > +(``dqtrx``) that operates in a similar manner to the regular
> > > code.
> > > +The step 4 hook commits the shadow ``dqtrx`` changes to the
> > > shadow
> > > dquots.
> > > +Notice that both hooks are called with the inode locked, which
> > > is
> > > how the
> > > +live update coordinates with the inode scanner.
> > > +
> > > +The quotacheck scan looks like this:
> > > +
> > > +1. Set up a coordinated inode scan.
> > > +
> > > +2. For each inode returned by the inode scan iterator:
> > > +
> > > +   a. Grab and lock the inode.
> > > +
> > > +   b. Determine that inode's resource usage (data blocks, inode
> > > counts,
> > > +      realtime blocks) 
> > nit: move this list to the first appearance of "resource usage". 
> > Step
> > 2 of the first list I think
> 
> I don't understand this proposed change.  Are you talking about "2.
> For
> each dquot attached to the file:" above?  That list describes the
> steps
> taken by regular code wanting to allocate file space that's accounted
> to
> quotas.  This list describes what online quotacheck does.  The two
> don't
> mix.
Oh, youre right, disregard this one

> 
> > > and add that to the shadow dquots for the user, group,
> > > +      and project ids associated with the inode.
> > > +
> > > +   c. Unlock and release the inode.
> > > +
> > > +3. For each dquot in the system:
> > > +
> > > +   a. Grab and lock the dquot.
> > > +
> > > +   b. Check the dquot against the shadow dquots created by the
> > > scan
> > > and updated
> > > +      by the live hooks.
> > > +
> > > +Live updates are key to being able to walk every quota record
> > > without
> > > +needing to hold any locks for a long duration.
> > > +If repairs are desired, the real and shadow dquots are locked
> > > and
> > > their
> > > +resource counts are set to the values in the shadow dquot.
> > > +
> > > +The proposed patchset is the
> > > +`online quotacheck
> > > +<
> > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > > log/?h=repair-quota>`_
> 
> Changed from repair-quota to repair-quotacheck.
> 
> > > +series.
> > > +
> > > +.. _nlinks:
> > > +
> > > +Case Study: File Link Count Checking
> > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > +
> > > +File link count checking also uses live update hooks.
> > > +The coordinated inode scanner is used to visit all directories
> > > on
> > > the
> > > +filesystem, and per-file link count records are stored in a
> > > sparse
> > > ``xfarray``
> > > +indexed by inumber.
> > > +During the scanning phase, each entry in a directory generates
> > > observation
> > > +data as follows:
> > > +
> > > +1. If the entry is a dotdot (``'..'``) entry of the root
> > > directory,
> > > the
> > > +   directory's parent link count is bumped because the root
> > > directory's dotdot
> > > +   entry is self referential.
> > > +
> > > +2. If the entry is a dotdot entry of a subdirectory, the
> > > parent's
> > > backref
> > > +   count is bumped.
> > > +
> > > +3. If the entry is neither a dot nor a dotdot entry, the target
> > > file's parent
> > > +   count is bumped.
> > > +
> > > +4. If the target is a subdirectory, the parent's child link
> > > count is
> > > bumped.
> > > +
> > > +A crucial point to understand about how the link count inode
> > > scanner
> > > interacts
> > > +with the live update hooks is that the scan cursor tracks which
> > > *parent*
> > > +directories have been scanned.
> > > +In other words, the live updates ignore any update about ``A →
> > > B``
> > > when A has
> > > +not been scanned, even if B has been scanned.
> > > +Furthermore, a subdirectory A with a dotdot entry pointing back
> > > to B
> > > is
> > > +accounted as a backref counter in the shadow data for A, since
> > > child
> > > dotdot
> > > +entries affect the parent's link count.
> > > +Live update hooks are carefully placed in all parts of the
> > > filesystem that
> > > +create, change, or remove directory entries, since those
> > > operations
> > > involve
> > > +bumplink and droplink.
> > > +
> > > +For any file, the correct link count is the number of parents
> > > plus
> > > the number
> > > +of child subdirectories.
> > > +Non-directories never have children of any kind.
> > > +The backref information is used to detect inconsistencies in the
> > > number of
> > > +links pointing to child subdirectories and the number of dotdot
> > > entries
> > > +pointing back.
> > > +
> > > +After the scan completes, the link count of each file can be
> > > checked
> > > by locking
> > > +both the inode and the shadow data, and comparing the link
> > > counts.
> > > +A second coordinated inode scan cursor is used for comparisons.
> > > +Live updates are key to being able to walk every inode without
> > > needing to hold
> > > +any locks between inodes.
> > > +If repairs are desired, the inode's link count is set to the
> > > value
> > > in the
> > > +shadow information.
> > > +If no parents are found, the file must be :ref:`reparented
> > > <orphanage>` to the
> > > +orphanage to prevent the file from being lost forever.
> > > +
> > > +The proposed patchset is the
> > > +`file link count repair
> > > +<
> > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > > log/?h=scrub-nlinks>`_
> > > +series.
> > > +
> > > +.. _rmap_repair:
> > > +
> > > +Case Study: Rebuilding Reverse Mapping Records
> > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > +
> > > +Most repair functions follow the same pattern: lock filesystem
> > > resources,
> > > +walk the surviving ondisk metadata looking for replacement
> > > metadata
> > > records,
> > > +and use an :ref:`in-memory array <xfarray>` to store the
> > > gathered
> > > observations.
> > > +The primary advantage of this approach is the simplicity and
> > > modularity of the
> > > +repair code -- code and data are entirely contained within the
> > > scrub
> > > module,
> > > +do not require hooks in the main filesystem, and are usually the
> > > most efficient
> > > +in memory use.
> > > +A secondary advantage of this repair approach is atomicity --
> > > once
> > > the kernel
> > > +decides a structure is corrupt, no other threads can access the
> > > metadata until
> > > +the kernel finishes repairing and revalidating the metadata.
> > > +
> > > +For repairs going on within a shard of the filesystem, these
> > > advantages
> > > +outweigh the delays inherent in locking the shard while
> > > repairing
> > > parts of the
> > > +shard.
> > > +Unfortunately, repairs to the reverse mapping btree cannot use
> > > the
> > > "standard"
> > > +btree repair strategy because it must scan every space mapping
> > > of
> > > every fork of
> > > +every file in the filesystem, and the filesystem cannot stop.
> > > +Therefore, rmap repair foregoes atomicity between scrub and
> > > repair.
> > > +It combines a :ref:`coordinated inode scanner <iscan>`,
> > > :ref:`live
> > > update hooks
> > > +<liveupdate>`, and an :ref:`in-memory rmap btree <xfbtree>` to
> > > complete the
> > > +scan for reverse mapping records.
> > > +
> > > +1. Set up an xfbtree to stage rmap records.
> > > +
> > > +2. While holding the locks on the AGI and AGF buffers acquired
> > > during the
> > > +   scrub, generate reverse mappings for all AG metadata: inodes,
> > > btrees, CoW
> > > +   staging extents, and the internal log.
> > > +
> > > +3. Set up an inode scanner.
> > > +
> > > +4. Hook into rmap updates for the AG being repaired so that the
> > > live
> > > scan data
> > > +   can receive updates to the rmap btree from the rest of the
> > > filesystem during
> > > +   the file scan.
> > > +
> > > +5. For each space mapping found in either fork of each file
> > > scanned,
> > > +   decide if the mapping matches the AG of interest.
> > > +   If so:
> > > +
> > > +   a. Create a btree cursor for the in-memory btree.
> > > +
> > > +   b. Use the rmap code to add the record to the in-memory
> > > btree.
> > > +
> > > +   c. Use the :ref:`special commit function <xfbtree_commit>` to
> > > write the
> > > +      xfbtree changes to the xfile.
> > > +
> > > +6. For each live update received via the hook, decide if the
> > > owner
> > > has already
> > > +   been scanned.
> > > +   If so, apply the live update into the scan data:
> > > +
> > > +   a. Create a btree cursor for the in-memory btree.
> > > +
> > > +   b. Replay the operation into the in-memory btree.
> > > +
> > > +   c. Use the :ref:`special commit function <xfbtree_commit>` to
> > > write the
> > > +      xfbtree changes to the xfile.
> > > +      This is performed with an empty transaction to avoid
> > > changing
> > > the
> > > +      caller's state.
> > > +
> > > +7. When the inode scan finishes, create a new scrub transaction
> > > and
> > > relock the
> > > +   two AG headers.
> > > +
> > > +8. Compute the new btree geometry using the number of rmap
> > > records
> > > in the
> > > +   shadow btree, like all other btree rebuilding functions.
> > > +
> > > +9. Allocate the number of blocks computed in the previous step.
> > > +
> > > +10. Perform the usual btree bulk loading and commit to install
> > > the
> > > new rmap
> > > +    btree.
> > > +
> > > +11. Reap the old rmap btree blocks as discussed in the case
> > > study
> > > about how
> > > +    to :ref:`reap after rmap btree repair <rmap_reap>`.
> > > +
> > > +12. Free the xfbtree now that it not needed.
> > > +
> > > +The proposed patchset is the
> > > +`rmap repair
> > > +<
> > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > > log/?h=repair-rmap-btree>`_
> > > +series.
> > > 
> > 
> > Mostly looks good nits aside, I do sort of wonder if this patch
> > would
> > do better to appear before patch 6 (or move 6 down), since it gets
> > into
> > more challenges concerning locks and hooks, where as here we are
> > mostly
> > discussing what they are and how they work.  So it might build
> > better
> > to move this patch up a little.
> 
> (I might be a tad confused here, bear with me.)
> 
> Patch 6, the section about eventual consistency?
> 
> Hmm.  The intent drains exist to quiesce intent chains targeting
> specific AGs.  It briefly mentions "fshooks" in the context of using
> jump labels to avoid the overhead of calling notify_all on the drain
> waitqueue when scrub isn't running.  That's perhaps bad naming on my
> part, since the other "fshooks" are jump labels to avoid bouncing
> through the notifier chain code when scrub isn't running.  The jump
> labels themselves are not hooks, they're structured dynamic code
> patching.
> 
> I probably should've named those something else.  fsgates?
Oh, i see, yes I did sort of try to correlate them, so maybe the
different name would help.
> 
> Or maybe you were talking specifically about "Case Study: Rebuilding
> Reverse Mapping Records"?  In which case I remark that the case study
> needs both the intent drains to quiesce the AG and the live scans to
> work properly, which is why the case study of it couldn't come
> earlier.
> The intent drains section still ought to come before the refcountbt
> section, because it's the refcountbt scrubber that first hit the
> coordination problem.
> 
> Things are getting pretty awkward like this because there are sooo
> many
> interdependent pieces. :(

I see, ok no worries then, I think people will figure it out either
way.  I mostly look for ways to make the presentation easier but it is
getting harder to move stuff with chicken and egg dependencies.

> 
> Regardless, thank you very much for slogging through.
> 
> --D
> 
> > Allison
> > 


  reply	other threads:[~2023-02-25  7:34 UTC|newest]

Thread overview: 218+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-30 21:13 [NYE DELUGE 1/4] xfs: all pending online scrub improvements Darrick J. Wong
2022-12-30 22:10 ` [PATCHSET v24.0 00/14] xfs: design documentation for online fsck Darrick J. Wong
2022-12-30 22:10   ` [PATCH 02/14] xfs: document the general theory underlying online fsck design Darrick J. Wong
2023-01-11  1:25     ` Allison Henderson
2023-01-11 23:39       ` Darrick J. Wong
2023-01-12  0:29         ` Dave Chinner
2023-01-18  0:03         ` Allison Henderson
2023-01-18  2:35           ` Darrick J. Wong
2022-12-30 22:10   ` [PATCH 01/14] xfs: document the motivation for " Darrick J. Wong
2023-01-07  5:01     ` Allison Henderson
2023-01-11 19:10       ` Darrick J. Wong
2023-01-18  0:03         ` Allison Henderson
2023-01-18  1:29           ` Darrick J. Wong
2023-01-12  0:10       ` Darrick J. Wong
2022-12-30 22:10   ` [PATCH 08/14] xfs: document btree bulk loading Darrick J. Wong
2023-02-09  5:47     ` Allison Henderson
2023-02-10  0:24       ` Darrick J. Wong
2023-02-16 15:46         ` Allison Henderson
2023-02-16 21:08           ` Darrick J. Wong
2022-12-30 22:10   ` [PATCH 04/14] xfs: document the user interface for online fsck Darrick J. Wong
2023-01-18  0:03     ` Allison Henderson
2023-01-18  2:42       ` Darrick J. Wong
2022-12-30 22:10   ` [PATCH 03/14] xfs: document the testing plan " Darrick J. Wong
2023-01-18  0:03     ` Allison Henderson
2023-01-18  2:38       ` Darrick J. Wong
2022-12-30 22:10   ` [PATCH 05/14] xfs: document the filesystem metadata checking strategy Darrick J. Wong
2023-01-21  1:38     ` Allison Henderson
2023-02-02 19:04       ` Darrick J. Wong
2023-02-09  5:41         ` Allison Henderson
2022-12-30 22:10   ` [PATCH 07/14] xfs: document pageable kernel memory Darrick J. Wong
2023-02-02  7:14     ` Allison Henderson
2023-02-02 23:14       ` Darrick J. Wong
2023-02-09  5:41         ` Allison Henderson
2023-02-09 23:14           ` Darrick J. Wong
2023-02-25  7:32             ` Allison Henderson
2022-12-30 22:10   ` [PATCH 09/14] xfs: document online file metadata repair code Darrick J. Wong
2022-12-30 22:10   ` [PATCH 06/14] xfs: document how online fsck deals with eventual consistency Darrick J. Wong
2023-01-05  9:08     ` Amir Goldstein
2023-01-05 19:40       ` Darrick J. Wong
2023-01-06  3:33         ` Amir Goldstein
2023-01-11 17:54           ` Darrick J. Wong
2023-01-31  6:11     ` Allison Henderson
2023-02-02 19:55       ` Darrick J. Wong
2023-02-09  5:41         ` Allison Henderson
2022-12-30 22:10   ` [PATCH 13/14] xfs: document the userspace fsck driver program Darrick J. Wong
2023-03-01  5:36     ` Allison Henderson
2023-03-02  0:27       ` Darrick J. Wong
2023-03-03 23:51         ` Allison Henderson
2023-03-04  2:25           ` Darrick J. Wong
2022-12-30 22:10   ` [PATCH 14/14] xfs: document future directions of online fsck Darrick J. Wong
2023-03-01  5:37     ` Allison Henderson
2023-03-02  0:39       ` Darrick J. Wong
2023-03-03 23:51         ` Allison Henderson
2023-03-04  2:28           ` Darrick J. Wong
2022-12-30 22:10   ` [PATCH 11/14] xfs: document metadata file repair Darrick J. Wong
2023-02-25  7:33     ` Allison Henderson
2023-03-01  2:42       ` Darrick J. Wong
2022-12-30 22:10   ` [PATCH 10/14] xfs: document full filesystem scans for online fsck Darrick J. Wong
2023-02-16 15:47     ` Allison Henderson
2023-02-16 22:48       ` Darrick J. Wong
2023-02-25  7:33         ` Allison Henderson [this message]
2023-03-01 22:09           ` Darrick J. Wong
2022-12-30 22:10   ` [PATCH 12/14] xfs: document directory tree repairs Darrick J. Wong
2023-01-14  2:32     ` [PATCH v24.2 " Darrick J. Wong
2023-02-03  2:12     ` [PATCH v24.3 " Darrick J. Wong
2023-02-25  7:33       ` Allison Henderson
2023-03-02  0:14         ` Darrick J. Wong
2023-03-03 23:50           ` Allison Henderson
2023-03-04  2:19             ` Darrick J. Wong
2023-03-07  1:30   ` [PATCHSET v24.3 00/14] xfs: design documentation for online fsck Darrick J. Wong
2023-03-07  1:30   ` Darrick J. Wong
2023-03-07  1:30     ` [PATCH 01/14] xfs: document the motivation for online fsck design Darrick J. Wong
2023-03-07  1:31     ` [PATCH 02/14] xfs: document the general theory underlying " Darrick J. Wong
2023-03-07  1:31     ` [PATCH 03/14] xfs: document the testing plan for online fsck Darrick J. Wong
2023-03-07  1:31     ` [PATCH 04/14] xfs: document the user interface " Darrick J. Wong
2023-03-07  1:31     ` [PATCH 05/14] xfs: document the filesystem metadata checking strategy Darrick J. Wong
2023-03-07  1:31     ` [PATCH 06/14] xfs: document how online fsck deals with eventual consistency Darrick J. Wong
2023-03-07  1:31     ` [PATCH 07/14] xfs: document pageable kernel memory Darrick J. Wong
2023-03-07  1:31     ` [PATCH 08/14] xfs: document btree bulk loading Darrick J. Wong
2023-03-07  1:31     ` [PATCH 09/14] xfs: document online file metadata repair code Darrick J. Wong
2023-03-07  1:31     ` [PATCH 10/14] xfs: document full filesystem scans for online fsck Darrick J. Wong
2023-03-07  1:31     ` [PATCH 11/14] xfs: document metadata file repair Darrick J. Wong
2023-03-07  1:31     ` [PATCH 12/14] xfs: document directory tree repairs Darrick J. Wong
2023-03-07  1:32     ` [PATCH 13/14] xfs: document the userspace fsck driver program Darrick J. Wong
2023-03-07  1:32     ` [PATCH 14/14] xfs: document future directions of online fsck Darrick J. Wong
2022-12-30 22:10 ` [PATCHSET v24.0 0/8] xfs: variable and naming cleanups for intent items Darrick J. Wong
2022-12-30 22:10   ` [PATCH 1/8] xfs: pass the xfs_bmbt_irec directly through the log intent code Darrick J. Wong
2022-12-30 22:10   ` [PATCH 2/8] xfs: fix confusing variable names in xfs_bmap_item.c Darrick J. Wong
2022-12-30 22:10   ` [PATCH 8/8] xfs: fix confusing variable names in xfs_refcount_item.c Darrick J. Wong
2022-12-30 22:10   ` [PATCH 3/8] xfs: pass xfs_extent_free_item directly through the log intent code Darrick J. Wong
2022-12-30 22:10   ` [PATCH 5/8] xfs: pass rmap space mapping " Darrick J. Wong
2022-12-30 22:10   ` [PATCH 4/8] xfs: fix confusing xfs_extent_item variable names Darrick J. Wong
2022-12-30 22:10   ` [PATCH 6/8] xfs: fix confusing variable names in xfs_rmap_item.c Darrick J. Wong
2022-12-30 22:10   ` [PATCH 7/8] xfs: pass refcount intent directly through the log intent code Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/5] xfs: make intent items take a perag reference Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/5] xfs: give xfs_bmap_intent its own " Darrick J. Wong
2022-12-30 22:11   ` [PATCH 5/5] xfs: give xfs_refcount_intent " Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/5] xfs: pass per-ag references to xfs_free_extent Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/5] xfs: give xfs_extfree_intent its own perag reference Darrick J. Wong
2022-12-30 22:11   ` [PATCH 4/5] xfs: give xfs_rmap_intent " Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/1] xfs: pass perag references around when possible Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/1] xfs: create a function to duplicate an active perag reference Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/5] xfs: drain deferred work items when scrubbing Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/5] xfs: clean up scrub context if scrub setup returns -EDEADLOCK Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/5] xfs: allow queued AG intents to drain before scrubbing Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/5] xfs: add a tracepoint to report incorrect extent refcounts Darrick J. Wong
2022-12-30 22:11   ` [PATCH 4/5] xfs: minimize overhead of drain wakeups by using jump labels Darrick J. Wong
2022-12-30 22:11   ` [PATCH 5/5] xfs: scrub should use ECHRNG to signal that the drain is needed Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/8] xfs: standardize btree record checking code Darrick J. Wong
2022-12-30 22:11   ` [PATCH 4/8] xfs: return a failure address from xfs_rmap_irec_offset_unpack Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/8] xfs: standardize ondisk to incore conversion for inode btrees Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/8] xfs: standardize ondisk to incore conversion for free space btrees Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/8] xfs: standardize ondisk to incore conversion for refcount btrees Darrick J. Wong
2022-12-30 22:11   ` [PATCH 7/8] xfs: complain about bad records in query_range helpers Darrick J. Wong
2022-12-30 22:11   ` [PATCH 6/8] xfs: standardize ondisk to incore conversion for bmap btrees Darrick J. Wong
2022-12-30 22:11   ` [PATCH 8/8] xfs: complain about bad file mapping records in the ondisk bmbt Darrick J. Wong
2022-12-30 22:11   ` [PATCH 5/8] xfs: standardize ondisk to incore conversion for rmap btrees Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/3] xfs: hoist scrub record checks into libxfs Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/3] xfs: hoist inode record alignment checks from scrub Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/3] xfs: hoist rmap record flag " Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/3] " Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/2] xfs: fix rmap btree key flag handling Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/2] xfs: fix rm_offset flag handling in rmap keys Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/2] xfs: detect unwritten bit set in rmapbt node block keys Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/2] xfs: enhance btree key scrubbing Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/2] xfs: always scrub record/key order of interior records Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/2] xfs: check btree keys reflect the child block Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/6] xfs: detect incorrect gaps in refcount btree Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/6] xfs: replace xfs_btree_has_record with a general keyspace scanner Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/6] xfs: refactor ->diff_two_keys callsites Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/6] xfs: refactor converting btree irec to btree key Darrick J. Wong
2022-12-30 22:11   ` [PATCH 5/6] xfs: check the reference counts of gaps in the refcount btree Darrick J. Wong
2022-12-30 22:11   ` [PATCH 4/6] xfs: implement masked btree key comparisons for _has_records scans Darrick J. Wong
2022-12-30 22:11   ` [PATCH 6/6] xfs: ensure that all metadata and data blocks are not cow staging extents Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/4] xfs: detect incorrect gaps in inode btree Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/4] xfs: clean up broken eearly-exit code in the inode btree scrubber Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/4] xfs: remove pointless shadow variable from xfs_difree_inobt Darrick J. Wong
2022-12-30 22:11   ` [PATCH 4/4] xfs: convert xfs_ialloc_has_inodes_at_extent to return keyfill scan results Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/4] xfs: directly cross-reference the inode btrees with each other Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/2] xfs: detect incorrect gaps in rmap btree Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/2] xfs: teach scrub to check for sole ownership of metadata objects Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/2] xfs: ensure that single-owner file blocks are not owned by others Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/4] xfs: fix iget/irele usage in online fsck Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/4] xfs: fix an inode lookup race in xchk_get_inode Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/4] xfs: manage inode DONTCACHE status at irele time Darrick J. Wong
2022-12-30 22:11   ` [PATCH 4/4] xfs: retain the AGI when we can't iget an inode to scrub the core Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/4] xfs: rename xchk_get_inode -> xchk_iget_for_scrubbing Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/3] xfs: fix iget usage in directory scrub Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/3] xfs: xfs_iget in the directory scrubber needs to use UNTRUSTED Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/3] xfs: make checking directory dotdot entries more reliable Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/3] xfs: always check the existence of a dirent's child inode Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/6] xfs: detect mergeable and overlapping btree records Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/6] xfs: change bmap scrubber to store the previous mapping Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/6] xfs: alert the user about data/attr fork mappings that could be merged Darrick J. Wong
2022-12-30 22:11   ` [PATCH 5/6] xfs: check overlapping rmap btree records Darrick J. Wong
2022-12-30 22:11   ` [PATCH 4/6] xfs: flag refcount btree records that could be merged Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/6] xfs: flag free space " Darrick J. Wong
2022-12-30 22:11   ` [PATCH 6/6] xfs: check for reverse mapping " Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 00/11] xfs: clean up memory management in xattr scrub Darrick J. Wong
2022-12-30 22:11   ` [PATCH 04/11] xfs: split freemap from xchk_xattr_buf.buf Darrick J. Wong
2022-12-30 22:11   ` [PATCH 06/11] xfs: split valuebuf " Darrick J. Wong
2022-12-30 22:11   ` [PATCH 02/11] xfs: don't shadow @leaf in xchk_xattr_block Darrick J. Wong
2022-12-30 22:11   ` [PATCH 01/11] xfs: xattr scrub should ensure one namespace bit per name Darrick J. Wong
2022-12-30 22:11   ` [PATCH 03/11] xfs: remove unnecessary dstmap in xattr scrubber Darrick J. Wong
2022-12-30 22:11   ` [PATCH 05/11] xfs: split usedmap from xchk_xattr_buf.buf Darrick J. Wong
2022-12-30 22:11   ` [PATCH 07/11] xfs: remove flags argument from xchk_setup_xattr_buf Darrick J. Wong
2022-12-30 22:11   ` [PATCH 09/11] xfs: check used space of shortform xattr structures Darrick J. Wong
2022-12-30 22:11   ` [PATCH 10/11] xfs: clean up xattr scrub initialization Darrick J. Wong
2022-12-30 22:11   ` [PATCH 11/11] xfs: only allocate free space bitmap for xattr scrub if needed Darrick J. Wong
2022-12-30 22:11   ` [PATCH 08/11] xfs: move xattr scrub buffer allocation to top level function Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/3] xfs: rework online fsck incore bitmap Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/3] xfs: drop the _safe behavior from the xbitmap foreach macro Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/3] xfs: convert xbitmap to interval tree Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/3] xfs: remove the for_each_xbitmap_ helpers Darrick J. Wong
2022-12-30 22:11 ` [PATCHSET v24.0 0/5] xfs: strengthen rmapbt scrubbing Darrick J. Wong
2022-12-30 22:11   ` [PATCH 1/5] xfs: introduce bitmap type for AG blocks Darrick J. Wong
2022-12-30 22:11   ` [PATCH 5/5] xfs: cross-reference rmap records with refcount btrees Darrick J. Wong
2022-12-30 22:11   ` [PATCH 4/5] xfs: cross-reference rmap records with inode btrees Darrick J. Wong
2022-12-30 22:11   ` [PATCH 3/5] xfs: cross-reference rmap records with free space btrees Darrick J. Wong
2022-12-30 22:11   ` [PATCH 2/5] xfs: cross-reference rmap records with ag btrees Darrick J. Wong
2022-12-30 22:12 ` [PATCHSET v24.0 0/4] xfs: fix rmap btree key flag handling Darrick J. Wong
2022-12-30 22:12   ` [PATCH 1/4] xfs: fix rm_offset flag handling in rmap keys Darrick J. Wong
2022-12-30 22:12   ` [PATCH 2/4] xfs_repair: check low keys of rmap btrees Darrick J. Wong
2022-12-30 22:12   ` [PATCH 4/4] xfs_db: expose the unwritten flag in rmapbt keys Darrick J. Wong
2022-12-30 22:12   ` [PATCH 3/4] xfs_repair: warn about unwritten bits set in rmap btree keys Darrick J. Wong
2022-12-30 22:12 ` [PATCHSET v24.0 00/16] fstests: refactor online fsck stress tests Darrick J. Wong
2022-12-30 22:12   ` [PATCH 03/16] xfs/422: rework feature detection so we only test-format scratch once Darrick J. Wong
2022-12-30 22:12   ` [PATCH 01/16] xfs/422: create a new test group for fsstress/repair racers Darrick J. Wong
2022-12-30 22:12   ` [PATCH 07/16] fuzzy: give each test local control over what scrub stress tests get run Darrick J. Wong
2022-12-30 22:12   ` [PATCH 04/16] fuzzy: clean up scrub stress programs quietly Darrick J. Wong
2022-12-30 22:12   ` [PATCH 05/16] fuzzy: rework scrub stress output filtering Darrick J. Wong
2022-12-30 22:12   ` [PATCH 06/16] fuzzy: explicitly check for common/inject in _require_xfs_stress_online_repair Darrick J. Wong
2022-12-30 22:12   ` [PATCH 02/16] xfs/422: move the fsstress/freeze/scrub racing logic to common/fuzzy Darrick J. Wong
2022-12-30 22:12   ` [PATCH 12/16] fuzzy: increase operation count for each fsstress invocation Darrick J. Wong
2023-01-13 19:55     ` Zorro Lang
2023-01-13 21:28       ` Darrick J. Wong
2022-12-30 22:12   ` [PATCH 11/16] fuzzy: clear out the scratch filesystem if it's too full Darrick J. Wong
2022-12-30 22:12   ` [PATCH 09/16] fuzzy: make scrub stress loop control more robust Darrick J. Wong
2022-12-30 22:12   ` [PATCH 08/16] fuzzy: test the scrub stress subcommands before looping Darrick J. Wong
2022-12-30 22:12   ` [PATCH 13/16] fuzzy: clean up frozen fses after scrub stress testing Darrick J. Wong
2022-12-30 22:12   ` [PATCH 10/16] fuzzy: abort scrub stress testing if the scratch fs went down Darrick J. Wong
2022-12-30 22:12   ` [PATCH 14/16] fuzzy: make freezing optional for scrub stress tests Darrick J. Wong
2022-12-30 22:12   ` [PATCH 15/16] fuzzy: allow substitution of AG numbers when configuring scrub stress test Darrick J. Wong
2022-12-30 22:12   ` [PATCH 16/16] fuzzy: delay the start of the scrub loop when stress-testing scrub Darrick J. Wong
2022-12-30 22:12 ` [PATCHSET v24.0 0/3] fstests: refactor GETFSMAP stress tests Darrick J. Wong
2022-12-30 22:12   ` [PATCH 1/3] fuzzy: enhance scrub stress testing to use fsx Darrick J. Wong
2023-01-05  5:49     ` Zorro Lang
2023-01-05 18:28       ` Darrick J. Wong
2023-01-05 18:28     ` [PATCH v24.1 " Darrick J. Wong
2022-12-30 22:12   ` [PATCH 3/3] xfs: race fsmap with readonly remounts to detect crash or livelock Darrick J. Wong
2022-12-30 22:12   ` [PATCH 2/3] fuzzy: refactor fsmap stress test to use our helper functions Darrick J. Wong
2022-12-30 22:13 ` [PATCHSET v24.0 0/2] fstests: race online scrub with mount state changes Darrick J. Wong
2022-12-30 22:13   ` [PATCH 1/2] xfs: stress test xfs_scrub(8) with fsstress Darrick J. Wong
2022-12-30 22:13   ` [PATCH 2/2] xfs: stress test xfs_scrub(8) with freeze and ro-remount loops Darrick J. Wong
2023-01-13 20:10 ` [NYE DELUGE 1/4] xfs: all pending online scrub improvements Zorro Lang
2023-01-13 21:28   ` Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2022-10-02 18:19 [PATCHSET v23.3 00/14] xfs: design documentation for online fsck Darrick J. Wong
2022-10-02 18:19 ` [PATCH 10/14] xfs: document full filesystem scans " Darrick J. Wong
2022-08-07 18:30 [PATCHSET v2 00/14] xfs: design documentation " Darrick J. Wong
2022-08-07 18:31 ` [PATCH 10/14] xfs: document full filesystem scans " Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=04811ecc7a2e7abc7e14e248f7389a37ee4d8ded.camel@oracle.com \
    --to=allison.henderson@oracle.com \
    --cc=catherine.hoang@oracle.com \
    --cc=chandan.babu@oracle.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.