All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHBOMB v13.3] xfs: tweaks and fixes to online repair, part 2
@ 2024-04-17 23:10 Darrick J. Wong
  2024-04-17 23:13 ` [PATCHSET v13.3 1/2] xfs: reduce iget overhead in scrub Darrick J. Wong
  2024-04-17 23:13 ` [PATCHSET v13.3 2/2] xfs: minor fixes to online repair Darrick J. Wong
  0 siblings, 2 replies; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-17 23:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, Chandan Babu R

Hi everyone,

As most of you are aware, any large body of code naturally attracts
bugs.  This online repair patchbomb tries to address a few things that I
noticed during the review of parent pointers v13.2 -- we're working the
inode cache recycler harder than we need to, there were some bugs in the
code that unlocks on failure, and we need to be a bit more aggressive
about invalidating dentries when moving files to the lost+found.

There's also a cleanup to remove the code that used to turn on
exchange-range dynamically since it's now a permanent feature, which
means that we bail out of repair on unsupported filesystems earlier.

Finally, there's a speed optimization for the vectorized scrub path that
has us iget the file being scrubbed and hold it across all the scrub
vectors so that we amortize the overhead of untrusted iget lookups
especially if reclaim is being aggressive with the icache.

These are the last few pieces of part 2 of online repair.

Full versions are here:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-fixes
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=repair-fixes

--D

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHSET v13.3 1/2] xfs: reduce iget overhead in scrub
  2024-04-17 23:10 [PATCHBOMB v13.3] xfs: tweaks and fixes to online repair, part 2 Darrick J. Wong
@ 2024-04-17 23:13 ` Darrick J. Wong
  2024-04-17 23:13   ` [PATCH 1/2] xfs: use dontcache for grabbing inodes during scrub Darrick J. Wong
  2024-04-17 23:14   ` [PATCH 2/2] xfs: only iget the file once when doing vectored scrub-by-handle Darrick J. Wong
  2024-04-17 23:13 ` [PATCHSET v13.3 2/2] xfs: minor fixes to online repair Darrick J. Wong
  1 sibling, 2 replies; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-17 23:13 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs, hch

Hi all,

This patchset looks to reduce iget overhead in two ways: First, a
previous patch conditionally set DONTCACHE on inodes during xchk_irele
on the grounds that we knew better at irele time if an inode should be
dropped.  Unfortunately, over time that patch morphed into a call to
d_mark_dontcache, which resulted in inodes being dropped even if they
were referenced by the dcache.  This actually caused *more* recycle
overhead than if we'd simply called xfs_iget to set DONTCACHE only on
misses.

The second patch reduces the cost of untrusted iget for a vectored scrub
call by having the scrubv code maintain a separate refcount to the inode
so that the cache will always hit.

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been running on the djcloud for months with no problems.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=reduce-scrub-iget-overhead
---
Commits in this patchset:
 * xfs: use dontcache for grabbing inodes during scrub
 * xfs: only iget the file once when doing vectored scrub-by-handle
---
 fs/xfs/scrub/common.c |   12 +++---------
 fs/xfs/scrub/iscan.c  |   13 +++++++++++--
 fs/xfs/scrub/scrub.c  |   45 +++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/scrub.h  |    7 +++++++
 4 files changed, 66 insertions(+), 11 deletions(-)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHSET v13.3 2/2] xfs: minor fixes to online repair
  2024-04-17 23:10 [PATCHBOMB v13.3] xfs: tweaks and fixes to online repair, part 2 Darrick J. Wong
  2024-04-17 23:13 ` [PATCHSET v13.3 1/2] xfs: reduce iget overhead in scrub Darrick J. Wong
@ 2024-04-17 23:13 ` Darrick J. Wong
  2024-04-17 23:14   ` [PATCH 1/4] xfs: drop the scrub file's iolock when transaction allocation fails Darrick J. Wong
                     ` (3 more replies)
  1 sibling, 4 replies; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-17 23:13 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs, hch

Hi all,

Here are some miscellaneous bug fixes for the online repair code.

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been running on the djcloud for months with no problems.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-fixes

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=repair-fixes
---
Commits in this patchset:
 * xfs: drop the scrub file's iolock when transaction allocation fails
 * xfs: fix iunlock calls in xrep_adoption_trans_alloc
 * xfs: exchange-range for repairs is no longer dynamic
 * xfs: invalidate dentries for a file before moving it to the orphanage
---
 fs/xfs/scrub/attr_repair.c      |    3 ++
 fs/xfs/scrub/dir_repair.c       |    3 ++
 fs/xfs/scrub/nlinks_repair.c    |    4 ++-
 fs/xfs/scrub/orphanage.c        |   49 +++++++++++++++++----------------------
 fs/xfs/scrub/parent_repair.c    |   10 ++++++--
 fs/xfs/scrub/rtsummary_repair.c |   10 +++-----
 fs/xfs/scrub/scrub.c            |    8 ++----
 fs/xfs/scrub/scrub.h            |    7 ------
 fs/xfs/scrub/symlink_repair.c   |    3 ++
 fs/xfs/scrub/tempexch.h         |    1 -
 fs/xfs/scrub/tempfile.c         |   24 ++-----------------
 fs/xfs/scrub/trace.h            |    3 --
 12 files changed, 49 insertions(+), 76 deletions(-)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/2] xfs: use dontcache for grabbing inodes during scrub
  2024-04-17 23:13 ` [PATCHSET v13.3 1/2] xfs: reduce iget overhead in scrub Darrick J. Wong
@ 2024-04-17 23:13   ` Darrick J. Wong
  2024-04-18  4:22     ` Christoph Hellwig
  2024-04-17 23:14   ` [PATCH 2/2] xfs: only iget the file once when doing vectored scrub-by-handle Darrick J. Wong
  1 sibling, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-17 23:13 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs, hch

From: Darrick J. Wong <djwong@kernel.org>

Back when I wrote commit a03297a0ca9f2, I had thought that we'd be doing
users a favor by only marking inodes dontcache at the end of a scrub
operation, and only if there's only one reference to that inode.  This
was more or less true back when I_DONTCACHE was an XFS iflag and the
only thing it did was change the outcome of xfs_fs_drop_inode to 1.

Note: If there are dentries pointing to the inode when scrub finishes,
the inode will have positive i_count and stay around in cache until
dentry reclaim.

But now we have d_mark_dontcache, which cause the inode *and* the
dentries attached to it all to be marked I_DONTCACHE, which means that
we drop the dentries ASAP, which drops the inode ASAP.

This is bad if scrub found problems with the inode, because now they can
be scheduled for inactivation, which can cause inodegc to trip on it and
shut down the filesystem.

Even if the inode isn't bad, this is still suboptimal because phases 3-7
each initiate inode scans.  Dropping the inode immediately during phase
3 is silly because phase 5 will reload it and drop it immediately, etc.
It's fine to mark the inodes dontcache, but if there have been accesses
to the file that set up dentries, we should keep them.

I validated this by setting up ftrace to capture xfs_iget_recycle*
tracepoints and ran xfs/285 for 30 seconds.  With current djwong-wtf I
saw ~30,000 recycle events.  I then dropped the d_mark_dontcache calls
and set XFS_IGET_DONTCACHE, and the recycle events dropped to ~5,000 per
30 seconds.

Therefore, grab the inode with XFS_IGET_DONTCACHE, which only has the
effect of setting I_DONTCACHE for cache misses.  Remove the
d_mark_dontcache call that can happen in xchk_irele.

Fixes: a03297a0ca9f2 ("xfs: manage inode DONTCACHE status at irele time")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/common.c |   12 +++---------
 fs/xfs/scrub/iscan.c  |   13 +++++++++++--
 fs/xfs/scrub/scrub.h  |    7 +++++++
 3 files changed, 21 insertions(+), 11 deletions(-)


diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 48302532d10d1..c3612419c9f8a 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -782,7 +782,7 @@ xchk_iget(
 {
 	ASSERT(sc->tp != NULL);
 
-	return xfs_iget(sc->mp, sc->tp, inum, XFS_IGET_UNTRUSTED, 0, ipp);
+	return xfs_iget(sc->mp, sc->tp, inum, XCHK_IGET_FLAGS, 0, ipp);
 }
 
 /*
@@ -833,8 +833,8 @@ xchk_iget_agi(
 	if (error)
 		return error;
 
-	error = xfs_iget(mp, tp, inum,
-			XFS_IGET_NORETRY | XFS_IGET_UNTRUSTED, 0, ipp);
+	error = xfs_iget(mp, tp, inum, XFS_IGET_NORETRY | XCHK_IGET_FLAGS, 0,
+			ipp);
 	if (error == -EAGAIN) {
 		/*
 		 * The inode may be in core but temporarily unavailable and may
@@ -1061,12 +1061,6 @@ xchk_irele(
 		spin_lock(&VFS_I(ip)->i_lock);
 		VFS_I(ip)->i_state &= ~I_DONTCACHE;
 		spin_unlock(&VFS_I(ip)->i_lock);
-	} else if (atomic_read(&VFS_I(ip)->i_count) == 1) {
-		/*
-		 * If this is the last reference to the inode and the caller
-		 * permits it, set DONTCACHE to avoid thrashing.
-		 */
-		d_mark_dontcache(VFS_I(ip));
 	}
 
 	xfs_irele(ip);
diff --git a/fs/xfs/scrub/iscan.c b/fs/xfs/scrub/iscan.c
index c380207702e26..cf9d983667cec 100644
--- a/fs/xfs/scrub/iscan.c
+++ b/fs/xfs/scrub/iscan.c
@@ -407,6 +407,15 @@ xchk_iscan_iget_retry(
 	return -EAGAIN;
 }
 
+/*
+ * For an inode scan, we hold the AGI and want to try to grab a batch of
+ * inodes.  Holding the AGI prevents inodegc from clearing freed inodes,
+ * so we must use noretry here.  For every inode after the first one in the
+ * batch, we don't want to wait, so we use retry there too.  Finally, use
+ * dontcache to avoid polluting the cache.
+ */
+#define ISCAN_IGET_FLAGS	(XFS_IGET_NORETRY | XFS_IGET_DONTCACHE)
+
 /*
  * Grab an inode as part of an inode scan.  While scanning this inode, the
  * caller must ensure that no other threads can modify the inode until a call
@@ -434,7 +443,7 @@ xchk_iscan_iget(
 	ASSERT(iscan->__inodes[0] == NULL);
 
 	/* Fill the first slot in the inode array. */
-	error = xfs_iget(sc->mp, sc->tp, ino, XFS_IGET_NORETRY, 0,
+	error = xfs_iget(sc->mp, sc->tp, ino, ISCAN_IGET_FLAGS, 0,
 			&iscan->__inodes[idx]);
 
 	trace_xchk_iscan_iget(iscan, error);
@@ -507,7 +516,7 @@ xchk_iscan_iget(
 
 		ASSERT(iscan->__inodes[idx] == NULL);
 
-		error = xfs_iget(sc->mp, sc->tp, ino, XFS_IGET_NORETRY, 0,
+		error = xfs_iget(sc->mp, sc->tp, ino, ISCAN_IGET_FLAGS, 0,
 				&iscan->__inodes[idx]);
 		if (error)
 			break;
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index 4e7e3edb6350c..1da10182f7f42 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -60,6 +60,13 @@ static inline int xchk_maybe_relax(struct xchk_relax *widget)
 #define XCHK_GFP_FLAGS	((__force gfp_t)(GFP_KERNEL | __GFP_NOWARN | \
 					 __GFP_RETRY_MAYFAIL))
 
+/*
+ * For opening files by handle for fsck operations, we don't trust the inumber
+ * or the allocation state; therefore, perform an untrusted lookup.  We don't
+ * want these inodes to pollute the cache, so mark them for immediate removal.
+ */
+#define XCHK_IGET_FLAGS	(XFS_IGET_UNTRUSTED | XFS_IGET_DONTCACHE)
+
 /* Type info and names for the scrub types. */
 enum xchk_type {
 	ST_NONE = 1,	/* disabled */


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/2] xfs: only iget the file once when doing vectored scrub-by-handle
  2024-04-17 23:13 ` [PATCHSET v13.3 1/2] xfs: reduce iget overhead in scrub Darrick J. Wong
  2024-04-17 23:13   ` [PATCH 1/2] xfs: use dontcache for grabbing inodes during scrub Darrick J. Wong
@ 2024-04-17 23:14   ` Darrick J. Wong
  2024-04-18  4:22     ` Christoph Hellwig
  1 sibling, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-17 23:14 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs, hch

From: Darrick J. Wong <djwong@kernel.org>

If a program wants us to perform a scrub on a file handle and the fd
passed to ioctl() is not the file referenced in the handle, iget the
file once and pass it into the scrub code.  This amortizes the untrusted
iget lookup over /all/ the scrubbers mentioned in the scrubv call.

When running fstests in "rebuild all metadata after each test" mode, I
observed a 10% reduction in runtime on account of avoiding repeated
inobt lookups.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/scrub.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)


diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 78b00ab85c9c9..43af5ce1f99f0 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -792,6 +792,37 @@ xfs_scrubv_check_barrier(
 	return 0;
 }
 
+/*
+ * If the caller provided us with a nonzero inode number that isn't the ioctl
+ * file, try to grab a reference to it to eliminate all further untrusted inode
+ * lookups.  If we can't get the inode, let each scrub function try again.
+ */
+STATIC struct xfs_inode *
+xchk_scrubv_open_by_handle(
+	struct xfs_mount		*mp,
+	const struct xfs_scrub_vec_head	*head)
+{
+	struct xfs_trans		*tp;
+	struct xfs_inode		*ip;
+	int				error;
+
+	error = xfs_trans_alloc_empty(mp, &tp);
+	if (error)
+		return NULL;
+
+	error = xfs_iget(mp, tp, head->svh_ino, XCHK_IGET_FLAGS, 0, &ip);
+	xfs_trans_cancel(tp);
+	if (error)
+		return NULL;
+
+	if (VFS_I(ip)->i_generation != head->svh_gen) {
+		xfs_irele(ip);
+		return NULL;
+	}
+
+	return ip;
+}
+
 /* Vectored scrub implementation to reduce ioctl calls. */
 int
 xfs_ioc_scrubv_metadata(
@@ -804,6 +835,7 @@ xfs_ioc_scrubv_metadata(
 	struct xfs_scrub_vec		__user *uvectors;
 	struct xfs_inode		*ip_in = XFS_I(file_inode(file));
 	struct xfs_mount		*mp = ip_in->i_mount;
+	struct xfs_inode		*handle_ip = NULL;
 	struct xfs_scrub_vec		*v;
 	size_t				vec_bytes;
 	unsigned int			i;
@@ -848,6 +880,17 @@ xfs_ioc_scrubv_metadata(
 		trace_xchk_scrubv_item(mp, &head, i, v);
 	}
 
+	/*
+	 * If the caller wants us to do a scrub-by-handle and the file used to
+	 * call the ioctl is not the same file, load the incore inode and pin
+	 * it across all the scrubv actions to avoid repeated UNTRUSTED
+	 * lookups.  The reference is not passed to deeper layers of scrub
+	 * because each scrubber gets to decide its own strategy and return
+	 * values for getting an inode.
+	 */
+	if (head.svh_ino && head.svh_ino != ip_in->i_ino)
+		handle_ip = xchk_scrubv_open_by_handle(mp, &head);
+
 	/* Run all the scrubbers. */
 	for (i = 0, v = vectors; i < head.svh_nr; i++, v++) {
 		struct xfs_scrub_metadata	sm = {
@@ -895,6 +938,8 @@ xfs_ioc_scrubv_metadata(
 	}
 
 out_free:
+	if (handle_ip)
+		xfs_irele(handle_ip);
 	kfree(vectors);
 	return error;
 }


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 1/4] xfs: drop the scrub file's iolock when transaction allocation fails
  2024-04-17 23:13 ` [PATCHSET v13.3 2/2] xfs: minor fixes to online repair Darrick J. Wong
@ 2024-04-17 23:14   ` Darrick J. Wong
  2024-04-18  4:23     ` Christoph Hellwig
  2024-04-17 23:14   ` [PATCH 2/4] xfs: fix iunlock calls in xrep_adoption_trans_alloc Darrick J. Wong
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-17 23:14 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs, hch

From: Darrick J. Wong <djwong@kernel.org>

If the transaction allocation in the !orphanage_available case of
xrep_nlinks_repair_inode fails, we need to drop the IOLOCK of the file
being scrubbed before exiting.

Found by fuzzing u3.sfdir3.list[1].name = zeroes in xfs/1546.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/nlinks_repair.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/nlinks_repair.c b/fs/xfs/scrub/nlinks_repair.c
index 78d0f650fe897..b3e707f47b7b5 100644
--- a/fs/xfs/scrub/nlinks_repair.c
+++ b/fs/xfs/scrub/nlinks_repair.c
@@ -138,8 +138,10 @@ xrep_nlinks_repair_inode(
 
 		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, 0, 0, 0,
 				&sc->tp);
-		if (error)
+		if (error) {
+			xchk_iunlock(sc, XFS_IOLOCK_EXCL);
 			return error;
+		}
 
 		xchk_ilock(sc, XFS_ILOCK_EXCL);
 		xfs_trans_ijoin(sc->tp, ip, 0);


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] xfs: fix iunlock calls in xrep_adoption_trans_alloc
  2024-04-17 23:13 ` [PATCHSET v13.3 2/2] xfs: minor fixes to online repair Darrick J. Wong
  2024-04-17 23:14   ` [PATCH 1/4] xfs: drop the scrub file's iolock when transaction allocation fails Darrick J. Wong
@ 2024-04-17 23:14   ` Darrick J. Wong
  2024-04-18  4:23     ` Christoph Hellwig
  2024-04-17 23:14   ` [PATCH 3/4] xfs: exchange-range for repairs is no longer dynamic Darrick J. Wong
  2024-04-17 23:15   ` [PATCH 4/4] xfs: invalidate dentries for a file before moving it to the orphanage Darrick J. Wong
  3 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-17 23:14 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs, hch

From: Darrick J. Wong <djwong@kernel.org>

If the transaction allocation in xrep_adoption_trans_alloc fails, we
should drop only the locks that we took.  In this case this is
ILOCK_EXCL of both the orphanage and the file being repaired.  Dropping
any IOLOCK here is incorrect.

Found by fuzzing u3.sfdir3.list[1].name = zeroes in xfs/1546.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/orphanage.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
index b1c6c60ee1da6..2b142e6de8f3f 100644
--- a/fs/xfs/scrub/orphanage.c
+++ b/fs/xfs/scrub/orphanage.c
@@ -382,7 +382,7 @@ xrep_adoption_trans_alloc(
 out_cancel:
 	xchk_trans_cancel(sc);
 	xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL);
-	xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL);
+	xchk_iunlock(sc, XFS_ILOCK_EXCL);
 	return error;
 }
 


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] xfs: exchange-range for repairs is no longer dynamic
  2024-04-17 23:13 ` [PATCHSET v13.3 2/2] xfs: minor fixes to online repair Darrick J. Wong
  2024-04-17 23:14   ` [PATCH 1/4] xfs: drop the scrub file's iolock when transaction allocation fails Darrick J. Wong
  2024-04-17 23:14   ` [PATCH 2/4] xfs: fix iunlock calls in xrep_adoption_trans_alloc Darrick J. Wong
@ 2024-04-17 23:14   ` Darrick J. Wong
  2024-04-18  4:24     ` Christoph Hellwig
  2024-04-17 23:15   ` [PATCH 4/4] xfs: invalidate dentries for a file before moving it to the orphanage Darrick J. Wong
  3 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-17 23:14 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs, hch

From: Darrick J. Wong <djwong@kernel.org>

The atomic file exchange-range functionality is now a permanent
filesystem feature instead of a dynamic log-incompat feature.  It cannot
be turned on at runtime, so we no longer need the XCHK_FSGATES flags and
whatnot that supported it.  Remove the flag and the enable function, and
move the xfs_has_exchange_range checks to the start of the repair
functions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/attr_repair.c      |    3 +++
 fs/xfs/scrub/dir_repair.c       |    3 +++
 fs/xfs/scrub/parent_repair.c    |   10 +++++++---
 fs/xfs/scrub/rtsummary_repair.c |   10 ++++------
 fs/xfs/scrub/scrub.c            |    8 +++-----
 fs/xfs/scrub/scrub.h            |    7 -------
 fs/xfs/scrub/symlink_repair.c   |    3 +++
 fs/xfs/scrub/tempexch.h         |    1 -
 fs/xfs/scrub/tempfile.c         |   24 ++----------------------
 fs/xfs/scrub/trace.h            |    1 -
 10 files changed, 25 insertions(+), 45 deletions(-)


diff --git a/fs/xfs/scrub/attr_repair.c b/fs/xfs/scrub/attr_repair.c
index e059813b92b70..c7eb94069cafc 100644
--- a/fs/xfs/scrub/attr_repair.c
+++ b/fs/xfs/scrub/attr_repair.c
@@ -1630,6 +1630,9 @@ xrep_xattr(
 	/* The rmapbt is required to reap the old attr fork. */
 	if (!xfs_has_rmapbt(sc->mp))
 		return -EOPNOTSUPP;
+	/* We require atomic file exchange range to rebuild anything. */
+	if (!xfs_has_exchange_range(sc->mp))
+		return -EOPNOTSUPP;
 
 	error = xrep_xattr_setup_scan(sc, &rx);
 	if (error)
diff --git a/fs/xfs/scrub/dir_repair.c b/fs/xfs/scrub/dir_repair.c
index e968150fe0f06..6ad40f8aafb82 100644
--- a/fs/xfs/scrub/dir_repair.c
+++ b/fs/xfs/scrub/dir_repair.c
@@ -1993,6 +1993,9 @@ xrep_directory(
 	/* The rmapbt is required to reap the old data fork. */
 	if (!xfs_has_rmapbt(sc->mp))
 		return -EOPNOTSUPP;
+	/* We require atomic file exchange range to rebuild anything. */
+	if (!xfs_has_exchange_range(sc->mp))
+		return -EOPNOTSUPP;
 
 	error = xrep_dir_setup_scan(rd);
 	if (error)
diff --git a/fs/xfs/scrub/parent_repair.c b/fs/xfs/scrub/parent_repair.c
index ee88ce5a12b83..7b42b7f65a0bd 100644
--- a/fs/xfs/scrub/parent_repair.c
+++ b/fs/xfs/scrub/parent_repair.c
@@ -1571,10 +1571,14 @@ xrep_parent(
 	/*
 	 * When the parent pointers feature is enabled, repairs are committed
 	 * by atomically committing a new xattr structure and reaping the old
-	 * attr fork.  Reaping requires rmap to be enabled.
+	 * attr fork.  Reaping requires rmap and exchange-range to be enabled.
 	 */
-	if (xfs_has_parent(sc->mp) && !xfs_has_rmapbt(sc->mp))
-		return -EOPNOTSUPP;
+	if (xfs_has_parent(sc->mp)) {
+		if (!xfs_has_rmapbt(sc->mp))
+			return -EOPNOTSUPP;
+		if (!xfs_has_exchange_range(sc->mp))
+			return -EOPNOTSUPP;
+	}
 
 	error = xrep_parent_setup_scan(rp);
 	if (error)
diff --git a/fs/xfs/scrub/rtsummary_repair.c b/fs/xfs/scrub/rtsummary_repair.c
index c8bb6c4f15d05..d9e971c4c79fb 100644
--- a/fs/xfs/scrub/rtsummary_repair.c
+++ b/fs/xfs/scrub/rtsummary_repair.c
@@ -62,12 +62,7 @@ xrep_setup_rtsummary(
 		return -EOPNOTSUPP;
 
 	rts->resblks += blocks;
-
-	/*
-	 * Grab support for atomic file content exchanges before we allocate
-	 * any transactions or grab ILOCKs.
-	 */
-	return xrep_tempexch_enable(sc);
+	return 0;
 }
 
 static int
@@ -111,6 +106,9 @@ xrep_rtsummary(
 	/* We require the rmapbt to rebuild anything. */
 	if (!xfs_has_rmapbt(mp))
 		return -EOPNOTSUPP;
+	/* We require atomic file exchange range to rebuild anything. */
+	if (!xfs_has_exchange_range(mp))
+		return -EOPNOTSUPP;
 
 	/* Walk away if we disagree on the size of the rt bitmap. */
 	if (rts->rbmblocks != mp->m_sb.sb_rbmblocks)
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 43af5ce1f99f0..c013f0ba4f36b 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -154,15 +154,14 @@ xchk_probe(
 
 /* Scrub setup and teardown */
 
-#define FSGATES_MASK	(XCHK_FSGATES_ALL | XREP_FSGATES_ALL)
 static inline void
 xchk_fsgates_disable(
 	struct xfs_scrub	*sc)
 {
-	if (!(sc->flags & FSGATES_MASK))
+	if (!(sc->flags & XCHK_FSGATES_ALL))
 		return;
 
-	trace_xchk_fsgates_disable(sc, sc->flags & FSGATES_MASK);
+	trace_xchk_fsgates_disable(sc, sc->flags & XCHK_FSGATES_ALL);
 
 	if (sc->flags & XCHK_FSGATES_DRAIN)
 		xfs_drain_wait_disable();
@@ -176,9 +175,8 @@ xchk_fsgates_disable(
 	if (sc->flags & XCHK_FSGATES_RMAP)
 		xfs_rmap_hook_disable();
 
-	sc->flags &= ~FSGATES_MASK;
+	sc->flags &= ~XCHK_FSGATES_ALL;
 }
-#undef FSGATES_MASK
 
 /* Free the resources associated with a scrub subtype. */
 void
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index 1da10182f7f42..1bc33f010d0e7 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -188,7 +188,6 @@ struct xfs_scrub {
 #define XCHK_FSGATES_QUOTA	(1U << 4)  /* quota live update enabled */
 #define XCHK_FSGATES_DIRENTS	(1U << 5)  /* directory live update enabled */
 #define XCHK_FSGATES_RMAP	(1U << 6)  /* rmapbt live update enabled */
-#define XREP_FSGATES_EXCHANGE_RANGE (1U << 29) /* uses file content exchange */
 #define XREP_RESET_PERAG_RESV	(1U << 30) /* must reset AG space reservation */
 #define XREP_ALREADY_FIXED	(1U << 31) /* checking our repair work */
 
@@ -203,12 +202,6 @@ struct xfs_scrub {
 				 XCHK_FSGATES_DIRENTS | \
 				 XCHK_FSGATES_RMAP)
 
-/*
- * The sole XREP_FSGATES* flag reflects a log intent item that is protected
- * by a log-incompat feature flag.  No code patching in use here.
- */
-#define XREP_FSGATES_ALL	(XREP_FSGATES_EXCHANGE_RANGE)
-
 struct xfs_scrub_subord {
 	struct xfs_scrub	sc;
 	struct xfs_scrub	*parent_sc;
diff --git a/fs/xfs/scrub/symlink_repair.c b/fs/xfs/scrub/symlink_repair.c
index c8b5a5b878ac9..d015a86ef460f 100644
--- a/fs/xfs/scrub/symlink_repair.c
+++ b/fs/xfs/scrub/symlink_repair.c
@@ -490,6 +490,9 @@ xrep_symlink(
 	/* The rmapbt is required to reap the old data fork. */
 	if (!xfs_has_rmapbt(sc->mp))
 		return -EOPNOTSUPP;
+	/* We require atomic file exchange range to rebuild anything. */
+	if (!xfs_has_exchange_range(sc->mp))
+		return -EOPNOTSUPP;
 
 	ASSERT(sc->ilock_flags & XFS_ILOCK_EXCL);
 
diff --git a/fs/xfs/scrub/tempexch.h b/fs/xfs/scrub/tempexch.h
index c1dd4adec4f11..995ba187c5aa6 100644
--- a/fs/xfs/scrub/tempexch.h
+++ b/fs/xfs/scrub/tempexch.h
@@ -11,7 +11,6 @@ struct xrep_tempexch {
 	struct xfs_exchmaps_req	req;
 };
 
-int xrep_tempexch_enable(struct xfs_scrub *sc);
 int xrep_tempexch_trans_reserve(struct xfs_scrub *sc, int whichfork,
 		struct xrep_tempexch *ti);
 int xrep_tempexch_trans_alloc(struct xfs_scrub *sc, int whichfork,
diff --git a/fs/xfs/scrub/tempfile.c b/fs/xfs/scrub/tempfile.c
index ddbcccb3dba13..b747b625c5ee4 100644
--- a/fs/xfs/scrub/tempfile.c
+++ b/fs/xfs/scrub/tempfile.c
@@ -486,23 +486,6 @@ xrep_tempfile_roll_trans(
 	return 0;
 }
 
-/* Enable file content exchanges. */
-int
-xrep_tempexch_enable(
-	struct xfs_scrub	*sc)
-{
-	if (sc->flags & XREP_FSGATES_EXCHANGE_RANGE)
-		return 0;
-
-	if (!xfs_has_exchange_range(sc->mp))
-		return -EOPNOTSUPP;
-
-	trace_xchk_fsgates_enable(sc, XREP_FSGATES_EXCHANGE_RANGE);
-
-	sc->flags |= XREP_FSGATES_EXCHANGE_RANGE;
-	return 0;
-}
-
 /*
  * Fill out the mapping exchange request in preparation for atomically
  * committing the contents of a metadata file that we've rebuilt in the temp
@@ -745,6 +728,7 @@ xrep_tempexch_trans_alloc(
 	int			error;
 
 	ASSERT(sc->tp == NULL);
+	ASSERT(xfs_has_exchange_range(sc->mp));
 
 	error = xrep_tempexch_prep_request(sc, whichfork, tx);
 	if (error)
@@ -757,10 +741,6 @@ xrep_tempexch_trans_alloc(
 	if (xfs_has_lazysbcount(sc->mp))
 		flags |= XFS_TRANS_RES_FDBLKS;
 
-	error = xrep_tempexch_enable(sc);
-	if (error)
-		return error;
-
 	error = xfs_trans_alloc(sc->mp, &M_RES(sc->mp)->tr_itruncate,
 			tx->req.resblks, 0, flags, &sc->tp);
 	if (error)
@@ -785,7 +765,7 @@ xrep_tempexch_contents(
 {
 	int			error;
 
-	ASSERT(sc->flags & XREP_FSGATES_EXCHANGE_RANGE);
+	ASSERT(xfs_has_exchange_range(sc->mp));
 
 	xfs_exchange_mappings(sc->tp, &tx->req);
 	error = xfs_defer_finish(&sc->tp);
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 8ce74bd8530a6..4b945007ca6ca 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -122,7 +122,6 @@ TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_BARRIER);
 	{ XCHK_FSGATES_QUOTA,			"fsgates_quota" }, \
 	{ XCHK_FSGATES_DIRENTS,			"fsgates_dirents" }, \
 	{ XCHK_FSGATES_RMAP,			"fsgates_rmap" }, \
-	{ XREP_FSGATES_EXCHANGE_RANGE,		"fsgates_exchrange" }, \
 	{ XREP_RESET_PERAG_RESV,		"reset_perag_resv" }, \
 	{ XREP_ALREADY_FIXED,			"already_fixed" }
 


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] xfs: invalidate dentries for a file before moving it to the orphanage
  2024-04-17 23:13 ` [PATCHSET v13.3 2/2] xfs: minor fixes to online repair Darrick J. Wong
                     ` (2 preceding siblings ...)
  2024-04-17 23:14   ` [PATCH 3/4] xfs: exchange-range for repairs is no longer dynamic Darrick J. Wong
@ 2024-04-17 23:15   ` Darrick J. Wong
  2024-04-18  4:25     ` Christoph Hellwig
  3 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-17 23:15 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs, hch

From: Darrick J. Wong <djwong@kernel.org>

Invalidate the cached dentries that point to the file that we're moving
to lost+found before we actually move it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/orphanage.c |   47 ++++++++++++++++++++--------------------------
 fs/xfs/scrub/trace.h     |    2 --
 2 files changed, 20 insertions(+), 29 deletions(-)


diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
index 2b142e6de8f3f..7148d8362db83 100644
--- a/fs/xfs/scrub/orphanage.c
+++ b/fs/xfs/scrub/orphanage.c
@@ -434,16 +434,17 @@ xrep_adoption_check_dcache(
 {
 	struct qstr		qname = QSTR_INIT(adopt->xname->name,
 						  adopt->xname->len);
+	struct xfs_scrub	*sc = adopt->sc;
 	struct dentry		*d_orphanage, *d_child;
 	int			error = 0;
 
-	d_orphanage = d_find_alias(VFS_I(adopt->sc->orphanage));
+	d_orphanage = d_find_alias(VFS_I(sc->orphanage));
 	if (!d_orphanage)
 		return 0;
 
 	d_child = d_hash_and_lookup(d_orphanage, &qname);
 	if (d_child) {
-		trace_xrep_adoption_check_child(adopt->sc->mp, d_child);
+		trace_xrep_adoption_check_child(sc->mp, d_child);
 
 		if (d_is_positive(d_child)) {
 			ASSERT(d_is_negative(d_child));
@@ -454,33 +455,15 @@ xrep_adoption_check_dcache(
 	}
 
 	dput(d_orphanage);
-	if (error)
-		return error;
-
-	/*
-	 * Do we need to update d_parent of the dentry for the file being
-	 * repaired?  There shouldn't be a hashed dentry with a parent since
-	 * the file had nonzero nlink but wasn't connected to any parent dir.
-	 */
-	d_child = d_find_alias(VFS_I(adopt->sc->ip));
-	if (!d_child)
-		return 0;
-
-	trace_xrep_adoption_check_alias(adopt->sc->mp, d_child);
-
-	if (d_child->d_parent && !d_unhashed(d_child)) {
-		ASSERT(d_child->d_parent == NULL || d_unhashed(d_child));
-		error = -EFSCORRUPTED;
-	}
-
-	dput(d_child);
 	return error;
 }
 
 /*
- * Remove all negative dentries from the dcache.  There should not be any
- * positive entries, since we've maintained our lock on the orphanage
- * directory.
+ * Invalidate all dentries for the name that was added to the orphanage
+ * directory, and all dentries pointing to the child inode that was moved.
+ *
+ * There should not be any positive entries for the name, since we've
+ * maintained our lock on the orphanage directory.
  */
 static void
 xrep_adoption_zap_dcache(
@@ -488,15 +471,17 @@ xrep_adoption_zap_dcache(
 {
 	struct qstr		qname = QSTR_INIT(adopt->xname->name,
 						  adopt->xname->len);
+	struct xfs_scrub	*sc = adopt->sc;
 	struct dentry		*d_orphanage, *d_child;
 
-	d_orphanage = d_find_alias(VFS_I(adopt->sc->orphanage));
+	/* Invalidate all dentries for the adoption name */
+	d_orphanage = d_find_alias(VFS_I(sc->orphanage));
 	if (!d_orphanage)
 		return;
 
 	d_child = d_hash_and_lookup(d_orphanage, &qname);
 	while (d_child != NULL) {
-		trace_xrep_adoption_invalidate_child(adopt->sc->mp, d_child);
+		trace_xrep_adoption_invalidate_child(sc->mp, d_child);
 
 		ASSERT(d_is_negative(d_child));
 		d_invalidate(d_child);
@@ -505,6 +490,14 @@ xrep_adoption_zap_dcache(
 	}
 
 	dput(d_orphanage);
+
+	/* Invalidate all the dentries pointing down to this file. */
+	while ((d_child = d_find_alias(VFS_I(sc->ip))) != NULL) {
+		trace_xrep_adoption_invalidate_child(sc->mp, d_child);
+
+		d_invalidate(d_child);
+		dput(d_child);
+	}
 }
 
 /*
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 4b945007ca6ca..e27daa51cab64 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -3265,8 +3265,6 @@ DEFINE_EVENT(xrep_dentry_class, name, \
 	TP_PROTO(struct xfs_mount *mp, const struct dentry *dentry), \
 	TP_ARGS(mp, dentry))
 DEFINE_REPAIR_DENTRY_EVENT(xrep_adoption_check_child);
-DEFINE_REPAIR_DENTRY_EVENT(xrep_adoption_check_alias);
-DEFINE_REPAIR_DENTRY_EVENT(xrep_adoption_check_dentry);
 DEFINE_REPAIR_DENTRY_EVENT(xrep_adoption_invalidate_child);
 DEFINE_REPAIR_DENTRY_EVENT(xrep_dirtree_delete_child);
 


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] xfs: use dontcache for grabbing inodes during scrub
  2024-04-17 23:13   ` [PATCH 1/2] xfs: use dontcache for grabbing inodes during scrub Darrick J. Wong
@ 2024-04-18  4:22     ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2024-04-18  4:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: hch, linux-xfs, hch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] xfs: only iget the file once when doing vectored scrub-by-handle
  2024-04-17 23:14   ` [PATCH 2/2] xfs: only iget the file once when doing vectored scrub-by-handle Darrick J. Wong
@ 2024-04-18  4:22     ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2024-04-18  4:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: hch, linux-xfs, hch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] xfs: drop the scrub file's iolock when transaction allocation fails
  2024-04-17 23:14   ` [PATCH 1/4] xfs: drop the scrub file's iolock when transaction allocation fails Darrick J. Wong
@ 2024-04-18  4:23     ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2024-04-18  4:23 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: hch, linux-xfs, hch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] xfs: fix iunlock calls in xrep_adoption_trans_alloc
  2024-04-17 23:14   ` [PATCH 2/4] xfs: fix iunlock calls in xrep_adoption_trans_alloc Darrick J. Wong
@ 2024-04-18  4:23     ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2024-04-18  4:23 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: hch, linux-xfs, hch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] xfs: exchange-range for repairs is no longer dynamic
  2024-04-17 23:14   ` [PATCH 3/4] xfs: exchange-range for repairs is no longer dynamic Darrick J. Wong
@ 2024-04-18  4:24     ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2024-04-18  4:24 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: hch, linux-xfs, hch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] xfs: invalidate dentries for a file before moving it to the orphanage
  2024-04-17 23:15   ` [PATCH 4/4] xfs: invalidate dentries for a file before moving it to the orphanage Darrick J. Wong
@ 2024-04-18  4:25     ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2024-04-18  4:25 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: hch, linux-xfs, hch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/2] xfs: use dontcache for grabbing inodes during scrub
  2024-04-24  3:07 [PATCHSET v13.4 8/9] xfs: reduce iget overhead in scrub Darrick J. Wong
@ 2024-04-24  3:28 ` Darrick J. Wong
  0 siblings, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2024-04-24  3:28 UTC (permalink / raw)
  To: djwong, chandanbabu; +Cc: Christoph Hellwig, hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Back when I wrote commit a03297a0ca9f2, I had thought that we'd be doing
users a favor by only marking inodes dontcache at the end of a scrub
operation, and only if there's only one reference to that inode.  This
was more or less true back when I_DONTCACHE was an XFS iflag and the
only thing it did was change the outcome of xfs_fs_drop_inode to 1.

Note: If there are dentries pointing to the inode when scrub finishes,
the inode will have positive i_count and stay around in cache until
dentry reclaim.

But now we have d_mark_dontcache, which cause the inode *and* the
dentries attached to it all to be marked I_DONTCACHE, which means that
we drop the dentries ASAP, which drops the inode ASAP.

This is bad if scrub found problems with the inode, because now they can
be scheduled for inactivation, which can cause inodegc to trip on it and
shut down the filesystem.

Even if the inode isn't bad, this is still suboptimal because phases 3-7
each initiate inode scans.  Dropping the inode immediately during phase
3 is silly because phase 5 will reload it and drop it immediately, etc.
It's fine to mark the inodes dontcache, but if there have been accesses
to the file that set up dentries, we should keep them.

I validated this by setting up ftrace to capture xfs_iget_recycle*
tracepoints and ran xfs/285 for 30 seconds.  With current djwong-wtf I
saw ~30,000 recycle events.  I then dropped the d_mark_dontcache calls
and set XFS_IGET_DONTCACHE, and the recycle events dropped to ~5,000 per
30 seconds.

Therefore, grab the inode with XFS_IGET_DONTCACHE, which only has the
effect of setting I_DONTCACHE for cache misses.  Remove the
d_mark_dontcache call that can happen in xchk_irele.

Fixes: a03297a0ca9f2 ("xfs: manage inode DONTCACHE status at irele time")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/scrub/common.c |   12 +++---------
 fs/xfs/scrub/iscan.c  |   13 +++++++++++--
 fs/xfs/scrub/scrub.h  |    7 +++++++
 3 files changed, 21 insertions(+), 11 deletions(-)


diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index a7d3a2279662..1ad8ec63a7f4 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -783,7 +783,7 @@ xchk_iget(
 {
 	ASSERT(sc->tp != NULL);
 
-	return xfs_iget(sc->mp, sc->tp, inum, XFS_IGET_UNTRUSTED, 0, ipp);
+	return xfs_iget(sc->mp, sc->tp, inum, XCHK_IGET_FLAGS, 0, ipp);
 }
 
 /*
@@ -834,8 +834,8 @@ xchk_iget_agi(
 	if (error)
 		return error;
 
-	error = xfs_iget(mp, tp, inum,
-			XFS_IGET_NORETRY | XFS_IGET_UNTRUSTED, 0, ipp);
+	error = xfs_iget(mp, tp, inum, XFS_IGET_NORETRY | XCHK_IGET_FLAGS, 0,
+			ipp);
 	if (error == -EAGAIN) {
 		/*
 		 * The inode may be in core but temporarily unavailable and may
@@ -1062,12 +1062,6 @@ xchk_irele(
 		spin_lock(&VFS_I(ip)->i_lock);
 		VFS_I(ip)->i_state &= ~I_DONTCACHE;
 		spin_unlock(&VFS_I(ip)->i_lock);
-	} else if (atomic_read(&VFS_I(ip)->i_count) == 1) {
-		/*
-		 * If this is the last reference to the inode and the caller
-		 * permits it, set DONTCACHE to avoid thrashing.
-		 */
-		d_mark_dontcache(VFS_I(ip));
 	}
 
 	xfs_irele(ip);
diff --git a/fs/xfs/scrub/iscan.c b/fs/xfs/scrub/iscan.c
index c380207702e2..cf9d983667ce 100644
--- a/fs/xfs/scrub/iscan.c
+++ b/fs/xfs/scrub/iscan.c
@@ -407,6 +407,15 @@ xchk_iscan_iget_retry(
 	return -EAGAIN;
 }
 
+/*
+ * For an inode scan, we hold the AGI and want to try to grab a batch of
+ * inodes.  Holding the AGI prevents inodegc from clearing freed inodes,
+ * so we must use noretry here.  For every inode after the first one in the
+ * batch, we don't want to wait, so we use retry there too.  Finally, use
+ * dontcache to avoid polluting the cache.
+ */
+#define ISCAN_IGET_FLAGS	(XFS_IGET_NORETRY | XFS_IGET_DONTCACHE)
+
 /*
  * Grab an inode as part of an inode scan.  While scanning this inode, the
  * caller must ensure that no other threads can modify the inode until a call
@@ -434,7 +443,7 @@ xchk_iscan_iget(
 	ASSERT(iscan->__inodes[0] == NULL);
 
 	/* Fill the first slot in the inode array. */
-	error = xfs_iget(sc->mp, sc->tp, ino, XFS_IGET_NORETRY, 0,
+	error = xfs_iget(sc->mp, sc->tp, ino, ISCAN_IGET_FLAGS, 0,
 			&iscan->__inodes[idx]);
 
 	trace_xchk_iscan_iget(iscan, error);
@@ -507,7 +516,7 @@ xchk_iscan_iget(
 
 		ASSERT(iscan->__inodes[idx] == NULL);
 
-		error = xfs_iget(sc->mp, sc->tp, ino, XFS_IGET_NORETRY, 0,
+		error = xfs_iget(sc->mp, sc->tp, ino, ISCAN_IGET_FLAGS, 0,
 				&iscan->__inodes[idx]);
 		if (error)
 			break;
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index 4e7e3edb6350..1da10182f7f4 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -60,6 +60,13 @@ static inline int xchk_maybe_relax(struct xchk_relax *widget)
 #define XCHK_GFP_FLAGS	((__force gfp_t)(GFP_KERNEL | __GFP_NOWARN | \
 					 __GFP_RETRY_MAYFAIL))
 
+/*
+ * For opening files by handle for fsck operations, we don't trust the inumber
+ * or the allocation state; therefore, perform an untrusted lookup.  We don't
+ * want these inodes to pollute the cache, so mark them for immediate removal.
+ */
+#define XCHK_IGET_FLAGS	(XFS_IGET_UNTRUSTED | XFS_IGET_DONTCACHE)
+
 /* Type info and names for the scrub types. */
 enum xchk_type {
 	ST_NONE = 1,	/* disabled */


^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-04-24  3:28 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-17 23:10 [PATCHBOMB v13.3] xfs: tweaks and fixes to online repair, part 2 Darrick J. Wong
2024-04-17 23:13 ` [PATCHSET v13.3 1/2] xfs: reduce iget overhead in scrub Darrick J. Wong
2024-04-17 23:13   ` [PATCH 1/2] xfs: use dontcache for grabbing inodes during scrub Darrick J. Wong
2024-04-18  4:22     ` Christoph Hellwig
2024-04-17 23:14   ` [PATCH 2/2] xfs: only iget the file once when doing vectored scrub-by-handle Darrick J. Wong
2024-04-18  4:22     ` Christoph Hellwig
2024-04-17 23:13 ` [PATCHSET v13.3 2/2] xfs: minor fixes to online repair Darrick J. Wong
2024-04-17 23:14   ` [PATCH 1/4] xfs: drop the scrub file's iolock when transaction allocation fails Darrick J. Wong
2024-04-18  4:23     ` Christoph Hellwig
2024-04-17 23:14   ` [PATCH 2/4] xfs: fix iunlock calls in xrep_adoption_trans_alloc Darrick J. Wong
2024-04-18  4:23     ` Christoph Hellwig
2024-04-17 23:14   ` [PATCH 3/4] xfs: exchange-range for repairs is no longer dynamic Darrick J. Wong
2024-04-18  4:24     ` Christoph Hellwig
2024-04-17 23:15   ` [PATCH 4/4] xfs: invalidate dentries for a file before moving it to the orphanage Darrick J. Wong
2024-04-18  4:25     ` Christoph Hellwig
2024-04-24  3:07 [PATCHSET v13.4 8/9] xfs: reduce iget overhead in scrub Darrick J. Wong
2024-04-24  3:28 ` [PATCH 1/2] xfs: use dontcache for grabbing inodes during scrub Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.