* [PATCHSET v3 0/3] xfs: preserve inode health reports for longer
@ 2021-06-06 17:54 Darrick J. Wong
  2021-06-06 17:54 ` [PATCH 1/3] xfs: only reset incore inode health state flags when reclaiming an inode Darrick J. Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Darrick J. Wong @ 2021-06-06 17:54 UTC (permalink / raw)
  To: djwong; +Cc: Brian Foster, Dave Chinner, linux-xfs, david, bfoster

Hi all,

This is a quick series to make sure that inode sickness reports stick
around in memory for some amount of time.

v2: rebase to 5.13-rc4
v3: require explicit request to reclaim sick inodes, drop weird icache
    miss interaction with DONTCACHE

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fix-inode-health-reports-5.14
---
 fs/xfs/xfs_health.c |    9 +++++++++
 fs/xfs/xfs_icache.c |   50 +++++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 50 insertions(+), 9 deletions(-)



* [PATCH 1/3] xfs: only reset incore inode health state flags when reclaiming an inode
  2021-06-06 17:54 [PATCHSET v3 0/3] xfs: preserve inode health reports for longer Darrick J. Wong
@ 2021-06-06 17:54 ` Darrick J. Wong
  2021-06-08 14:55   ` Carlos Maiolino
  2021-06-06 17:54 ` [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick Darrick J. Wong
  2021-06-06 17:54 ` [PATCH 3/3] xfs: selectively keep sick inodes in memory Darrick J. Wong
  2 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2021-06-06 17:54 UTC (permalink / raw)
  To: djwong; +Cc: Brian Foster, Dave Chinner, linux-xfs, david, bfoster

From: Darrick J. Wong <djwong@kernel.org>

While running some fuzz tests on inode metadata, I noticed that the
filesystem health report (as provided by xfs_spaceman) failed to report
the file corruption even when spaceman was run immediately after running
xfs_scrub to detect the corruption.  That isn't the intended behavior;
one ought to be able to run scrub to detect errors in the ondisk
metadata and be able to access those reports for some time after the
scrub.

After running the same sequence through an instrumented kernel, I
discovered the reason why -- scrub igets the file, scans it, marks it
sick, and ireleases the inode.  When the VFS lets go of the incore
inode, it moves to RECLAIMABLE state.  If spaceman igets the incore
inode before it moves to RECLAIM state, iget reinitializes the VFS
state, clears the sick and checked masks, and hands back the inode.  At
this point, the caller has the exact same incore inode, but with all the
health state erased.

In other words, we're erasing the incore inode's health state flags when
we've decided NOT to sever the link between the incore inode and the
ondisk inode.  This is wrong, so we need to remove the lines that zero
the fields from xfs_iget_cache_hit.

As a precaution, we add the same lines into xfs_reclaim_inode just after
we sever the link between incore and ondisk inode.  Strictly speaking
this isn't necessary because once an inode has gone through reclaim it
must go through xfs_inode_alloc (which also zeroes the state) and
xfs_iget is careful to check for mismatches between the inode it pulls
out of the radix tree and the one it wants.

Fixes: 6772c1f11206 ("xfs: track metadata health status")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_icache.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)


diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 396cc54ca03f..c3f912a9231b 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -523,9 +523,6 @@ xfs_iget_cache_hit(
 				XFS_INO_TO_AGINO(pag->pag_mount, ino),
 				XFS_ICI_RECLAIM_TAG);
 		inode->i_state = I_NEW;
-		ip->i_sick = 0;
-		ip->i_checked = 0;
-
 		spin_unlock(&ip->i_flags_lock);
 		spin_unlock(&pag->pag_ici_lock);
 	} else {
@@ -979,6 +976,8 @@ xfs_reclaim_inode(
 	spin_lock(&ip->i_flags_lock);
 	ip->i_flags = XFS_IRECLAIM;
 	ip->i_ino = 0;
+	ip->i_sick = 0;
+	ip->i_checked = 0;
 	spin_unlock(&ip->i_flags_lock);
 
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);



* [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-06 17:54 [PATCHSET v3 0/3] xfs: preserve inode health reports for longer Darrick J. Wong
  2021-06-06 17:54 ` [PATCH 1/3] xfs: only reset incore inode health state flags when reclaiming an inode Darrick J. Wong
@ 2021-06-06 17:54 ` Darrick J. Wong
  2021-06-07  1:40   ` Dave Chinner
  2021-06-08 14:59   ` Carlos Maiolino
  2021-06-06 17:54 ` [PATCH 3/3] xfs: selectively keep sick inodes in memory Darrick J. Wong
  2 siblings, 2 replies; 16+ messages in thread
From: Darrick J. Wong @ 2021-06-06 17:54 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, david, bfoster

From: Darrick J. Wong <djwong@kernel.org>

When we decide to mark an inode sick, clear the DONTCACHE flag so that
the incore inode will be kept around until memory pressure forces it out
of memory.  This increases the chances that the sick status will be
caught by someone compiling a health report later on.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_health.c |    9 +++++++++
 1 file changed, 9 insertions(+)


diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index 8e0cb05a7142..806be8a93ea3 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -231,6 +231,15 @@ xfs_inode_mark_sick(
 	ip->i_sick |= mask;
 	ip->i_checked |= mask;
 	spin_unlock(&ip->i_flags_lock);
+
+	/*
+	 * Keep this inode around so we don't lose the sickness report.  Scrub
+	 * grabs inodes with DONTCACHE assuming that most inodes are ok, which
+	 * is not the case here.
+	 */
+	spin_lock(&VFS_I(ip)->i_lock);
+	VFS_I(ip)->i_state &= ~I_DONTCACHE;
+	spin_unlock(&VFS_I(ip)->i_lock);
 }
 
 /* Mark parts of an inode healed. */



* [PATCH 3/3] xfs: selectively keep sick inodes in memory
  2021-06-06 17:54 [PATCHSET v3 0/3] xfs: preserve inode health reports for longer Darrick J. Wong
  2021-06-06 17:54 ` [PATCH 1/3] xfs: only reset incore inode health state flags when reclaiming an inode Darrick J. Wong
  2021-06-06 17:54 ` [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick Darrick J. Wong
@ 2021-06-06 17:54 ` Darrick J. Wong
  2021-06-07  1:43   ` Dave Chinner
  2021-06-08 15:02   ` Carlos Maiolino
  2 siblings, 2 replies; 16+ messages in thread
From: Darrick J. Wong @ 2021-06-06 17:54 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, david, bfoster

From: Darrick J. Wong <djwong@kernel.org>

It's important that the filesystem retain its memory of sick inodes for
a little while after problems are found so that reports can be collected
about what was wrong.  Don't let inode reclamation free sick inodes
unless we're unmounting or the fs already went down.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_icache.c |   45 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 39 insertions(+), 6 deletions(-)


diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index c3f912a9231b..53dab8959e1d 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -71,10 +71,13 @@ static int xfs_icwalk_ag(struct xfs_perag *pag,
 /* Stop scanning after icw_scan_limit inodes. */
 #define XFS_ICWALK_FLAG_SCAN_LIMIT	(1U << 28)
 
+#define XFS_ICWALK_FLAG_RECLAIM_SICK	(1U << 27)
+
 #define XFS_ICWALK_PRIVATE_FLAGS	(XFS_ICWALK_FLAG_DROP_UDQUOT | \
 					 XFS_ICWALK_FLAG_DROP_GDQUOT | \
 					 XFS_ICWALK_FLAG_DROP_PDQUOT | \
-					 XFS_ICWALK_FLAG_SCAN_LIMIT)
+					 XFS_ICWALK_FLAG_SCAN_LIMIT | \
+					 XFS_ICWALK_FLAG_RECLAIM_SICK)
 
 /*
  * Allocate and initialise an xfs_inode.
@@ -910,7 +913,8 @@ xfs_dqrele_all_inodes(
  */
 static bool
 xfs_reclaim_igrab(
-	struct xfs_inode	*ip)
+	struct xfs_inode	*ip,
+	struct xfs_eofblocks	*eofb)
 {
 	ASSERT(rcu_read_lock_held());
 
@@ -921,6 +925,14 @@ xfs_reclaim_igrab(
 		spin_unlock(&ip->i_flags_lock);
 		return false;
 	}
+
+	/* Don't reclaim a sick inode unless the caller asked for it. */
+	if (ip->i_sick &&
+	    (!eofb || !(eofb->eof_flags & XFS_ICWALK_FLAG_RECLAIM_SICK))) {
+		spin_unlock(&ip->i_flags_lock);
+		return false;
+	}
+
 	__xfs_iflags_set(ip, XFS_IRECLAIM);
 	spin_unlock(&ip->i_flags_lock);
 	return true;
@@ -1021,13 +1033,30 @@ xfs_reclaim_inode(
 	xfs_iflags_clear(ip, XFS_IRECLAIM);
 }
 
+/* Reclaim sick inodes if we're unmounting or the fs went down. */
+static inline bool
+xfs_want_reclaim_sick(
+	struct xfs_mount	*mp)
+{
+	return (mp->m_flags & XFS_MOUNT_UNMOUNTING) ||
+	       (mp->m_flags & XFS_MOUNT_NORECOVERY) ||
+	       XFS_FORCED_SHUTDOWN(mp);
+}
+
 void
 xfs_reclaim_inodes(
 	struct xfs_mount	*mp)
 {
+	struct xfs_eofblocks	eofb = {
+		.eof_flags	= 0,
+	};
+
+	if (xfs_want_reclaim_sick(mp))
+		eofb.eof_flags |= XFS_ICWALK_FLAG_RECLAIM_SICK;
+
 	while (radix_tree_tagged(&mp->m_perag_tree, XFS_ICI_RECLAIM_TAG)) {
 		xfs_ail_push_all_sync(mp->m_ail);
-		xfs_icwalk(mp, XFS_ICWALK_RECLAIM, NULL);
+		xfs_icwalk(mp, XFS_ICWALK_RECLAIM, &eofb);
 	}
 }
 
@@ -1048,6 +1077,9 @@ xfs_reclaim_inodes_nr(
 		.icw_scan_limit	= nr_to_scan,
 	};
 
+	if (xfs_want_reclaim_sick(mp))
+		eofb.eof_flags |= XFS_ICWALK_FLAG_RECLAIM_SICK;
+
 	/* kick background reclaimer and push the AIL */
 	xfs_reclaim_work_queue(mp);
 	xfs_ail_push_all(mp->m_ail);
@@ -1605,7 +1637,8 @@ xfs_blockgc_free_quota(
 static inline bool
 xfs_icwalk_igrab(
 	enum xfs_icwalk_goal	goal,
-	struct xfs_inode	*ip)
+	struct xfs_inode	*ip,
+	struct xfs_eofblocks	*eofb)
 {
 	switch (goal) {
 	case XFS_ICWALK_DQRELE:
@@ -1613,7 +1646,7 @@ xfs_icwalk_igrab(
 	case XFS_ICWALK_BLOCKGC:
 		return xfs_blockgc_igrab(ip);
 	case XFS_ICWALK_RECLAIM:
-		return xfs_reclaim_igrab(ip);
+		return xfs_reclaim_igrab(ip, eofb);
 	default:
 		return false;
 	}
@@ -1702,7 +1735,7 @@ xfs_icwalk_ag(
 		for (i = 0; i < nr_found; i++) {
 			struct xfs_inode *ip = batch[i];
 
-			if (done || !xfs_icwalk_igrab(goal, ip))
+			if (done || !xfs_icwalk_igrab(goal, ip, eofb))
 				batch[i] = NULL;
 
 			/*



* Re: [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-06 17:54 ` [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick Darrick J. Wong
@ 2021-06-07  1:40   ` Dave Chinner
  2021-06-08 14:59   ` Carlos Maiolino
  1 sibling, 0 replies; 16+ messages in thread
From: Dave Chinner @ 2021-06-07  1:40 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, bfoster

On Sun, Jun 06, 2021 at 10:54:17AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> When we decide to mark an inode sick, clear the DONTCACHE flag so that
> the incore inode will be kept around until memory pressure forces it out
> of memory.  This increases the chances that the sick status will be
> caught by someone compiling a health report later on.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/xfs_health.c |    9 +++++++++
>  1 file changed, 9 insertions(+)

LGTM.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 3/3] xfs: selectively keep sick inodes in memory
  2021-06-06 17:54 ` [PATCH 3/3] xfs: selectively keep sick inodes in memory Darrick J. Wong
@ 2021-06-07  1:43   ` Dave Chinner
  2021-06-08 15:02   ` Carlos Maiolino
  1 sibling, 0 replies; 16+ messages in thread
From: Dave Chinner @ 2021-06-07  1:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, bfoster

On Sun, Jun 06, 2021 at 10:54:22AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> It's important that the filesystem retain its memory of sick inodes for
> a little while after problems are found so that reports can be collected
> about what was wrong.  Don't let inode reclamation free sick inodes
> unless we're unmounting or the fs already went down.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/xfs_icache.c |   45 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 39 insertions(+), 6 deletions(-)

Looks fine.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 1/3] xfs: only reset incore inode health state flags when reclaiming an inode
  2021-06-06 17:54 ` [PATCH 1/3] xfs: only reset incore inode health state flags when reclaiming an inode Darrick J. Wong
@ 2021-06-08 14:55   ` Carlos Maiolino
  0 siblings, 0 replies; 16+ messages in thread
From: Carlos Maiolino @ 2021-06-08 14:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Brian Foster, Dave Chinner, linux-xfs, david

On Sun, Jun 06, 2021 at 10:54:11AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> While running some fuzz tests on inode metadata, I noticed that the
> filesystem health report (as provided by xfs_spaceman) failed to report
> the file corruption even when spaceman was run immediately after running
> xfs_scrub to detect the corruption.  That isn't the intended behavior;
> one ought to be able to run scrub to detect errors in the ondisk
> metadata and be able to access those reports for some time after the
> scrub.
> 
> After running the same sequence through an instrumented kernel, I
> discovered the reason why -- scrub igets the file, scans it, marks it
> sick, and ireleases the inode.  When the VFS lets go of the incore
> inode, it moves to RECLAIMABLE state.  If spaceman igets the incore
> inode before it moves to RECLAIM state, iget reinitializes the VFS
> state, clears the sick and checked masks, and hands back the inode.  At
> this point, the caller has the exact same incore inode, but with all the
> health state erased.
> 
> In other words, we're erasing the incore inode's health state flags when
> we've decided NOT to sever the link between the incore inode and the
> ondisk inode.  This is wrong, so we need to remove the lines that zero
> the fields from xfs_iget_cache_hit.
> 
> As a precaution, we add the same lines into xfs_reclaim_inode just after
> we sever the link between incore and ondisk inode.  Strictly speaking
> this isn't necessary because once an inode has gone through reclaim it
> must go through xfs_inode_alloc (which also zeroes the state) and
> xfs_iget is careful to check for mismatches between the inode it pulls
> out of the radix tree and the one it wants.
> 
> ---

Looks good.
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

>  fs/xfs/xfs_icache.c |    5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index 396cc54ca03f..c3f912a9231b 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -523,9 +523,6 @@ xfs_iget_cache_hit(
>  				XFS_INO_TO_AGINO(pag->pag_mount, ino),
>  				XFS_ICI_RECLAIM_TAG);
>  		inode->i_state = I_NEW;
> -		ip->i_sick = 0;
> -		ip->i_checked = 0;
> -
>  		spin_unlock(&ip->i_flags_lock);
>  		spin_unlock(&pag->pag_ici_lock);
>  	} else {
> @@ -979,6 +976,8 @@ xfs_reclaim_inode(
>  	spin_lock(&ip->i_flags_lock);
>  	ip->i_flags = XFS_IRECLAIM;
>  	ip->i_ino = 0;
> +	ip->i_sick = 0;
> +	ip->i_checked = 0;
>  	spin_unlock(&ip->i_flags_lock);
>  
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> 

-- 
Carlos



* Re: [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-06 17:54 ` [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick Darrick J. Wong
  2021-06-07  1:40   ` Dave Chinner
@ 2021-06-08 14:59   ` Carlos Maiolino
  2021-06-08 15:21     ` Darrick J. Wong
  1 sibling, 1 reply; 16+ messages in thread
From: Carlos Maiolino @ 2021-06-08 14:59 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, bfoster

Hi,

On Sun, Jun 06, 2021 at 10:54:17AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> When we decide to mark an inode sick, clear the DONTCACHE flag so that
> the incore inode will be kept around until memory pressure forces it out
> of memory.  This increases the chances that the sick status will be
> caught by someone compiling a health report later on.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

The patch looks ok, so you can add:

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>


Now, I have a probably dumb question about this.

By removing the I_DONTCACHE flag, as you said, we are increasing the
chances that the sick status will be caught, so in either case it seems
not fully reliable.  So, my dumb question is: is there a reason why
having these inodes around will benefit us somehow?  I haven't read the
whole code, but I assume it can be used as a
fast path while scrubbing the FS?

Cheers.

> ---
>  fs/xfs/xfs_health.c |    9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> 
> diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> index 8e0cb05a7142..806be8a93ea3 100644
> --- a/fs/xfs/xfs_health.c
> +++ b/fs/xfs/xfs_health.c
> @@ -231,6 +231,15 @@ xfs_inode_mark_sick(
>  	ip->i_sick |= mask;
>  	ip->i_checked |= mask;
>  	spin_unlock(&ip->i_flags_lock);
> +
> +	/*
> +	 * Keep this inode around so we don't lose the sickness report.  Scrub
> +	 * grabs inodes with DONTCACHE assuming that most inodes are ok, which
> +	 * is not the case here.
> +	 */
> +	spin_lock(&VFS_I(ip)->i_lock);
> +	VFS_I(ip)->i_state &= ~I_DONTCACHE;
> +	spin_unlock(&VFS_I(ip)->i_lock);
>  }
>  
>  /* Mark parts of an inode healed. */
> 

-- 
Carlos



* Re: [PATCH 3/3] xfs: selectively keep sick inodes in memory
  2021-06-06 17:54 ` [PATCH 3/3] xfs: selectively keep sick inodes in memory Darrick J. Wong
  2021-06-07  1:43   ` Dave Chinner
@ 2021-06-08 15:02   ` Carlos Maiolino
  1 sibling, 0 replies; 16+ messages in thread
From: Carlos Maiolino @ 2021-06-08 15:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, bfoster

On Sun, Jun 06, 2021 at 10:54:22AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> It's important that the filesystem retain its memory of sick inodes for
> a little while after problems are found so that reports can be collected
> about what was wrong.  Don't let inode reclamation free sick inodes
> unless we're unmounting or the fs already went down.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good.

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

> ---
>  fs/xfs/xfs_icache.c |   45 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 39 insertions(+), 6 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index c3f912a9231b..53dab8959e1d 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -71,10 +71,13 @@ static int xfs_icwalk_ag(struct xfs_perag *pag,
>  /* Stop scanning after icw_scan_limit inodes. */
>  #define XFS_ICWALK_FLAG_SCAN_LIMIT	(1U << 28)
>  
> +#define XFS_ICWALK_FLAG_RECLAIM_SICK	(1U << 27)
> +
>  #define XFS_ICWALK_PRIVATE_FLAGS	(XFS_ICWALK_FLAG_DROP_UDQUOT | \
>  					 XFS_ICWALK_FLAG_DROP_GDQUOT | \
>  					 XFS_ICWALK_FLAG_DROP_PDQUOT | \
> -					 XFS_ICWALK_FLAG_SCAN_LIMIT)
> +					 XFS_ICWALK_FLAG_SCAN_LIMIT | \
> +					 XFS_ICWALK_FLAG_RECLAIM_SICK)
>  
>  /*
>   * Allocate and initialise an xfs_inode.
> @@ -910,7 +913,8 @@ xfs_dqrele_all_inodes(
>   */
>  static bool
>  xfs_reclaim_igrab(
> -	struct xfs_inode	*ip)
> +	struct xfs_inode	*ip,
> +	struct xfs_eofblocks	*eofb)
>  {
>  	ASSERT(rcu_read_lock_held());
>  
> @@ -921,6 +925,14 @@ xfs_reclaim_igrab(
>  		spin_unlock(&ip->i_flags_lock);
>  		return false;
>  	}
> +
> +	/* Don't reclaim a sick inode unless the caller asked for it. */
> +	if (ip->i_sick &&
> +	    (!eofb || !(eofb->eof_flags & XFS_ICWALK_FLAG_RECLAIM_SICK))) {
> +		spin_unlock(&ip->i_flags_lock);
> +		return false;
> +	}
> +
>  	__xfs_iflags_set(ip, XFS_IRECLAIM);
>  	spin_unlock(&ip->i_flags_lock);
>  	return true;
> @@ -1021,13 +1033,30 @@ xfs_reclaim_inode(
>  	xfs_iflags_clear(ip, XFS_IRECLAIM);
>  }
>  
> +/* Reclaim sick inodes if we're unmounting or the fs went down. */
> +static inline bool
> +xfs_want_reclaim_sick(
> +	struct xfs_mount	*mp)
> +{
> +	return (mp->m_flags & XFS_MOUNT_UNMOUNTING) ||
> +	       (mp->m_flags & XFS_MOUNT_NORECOVERY) ||
> +	       XFS_FORCED_SHUTDOWN(mp);
> +}
> +
>  void
>  xfs_reclaim_inodes(
>  	struct xfs_mount	*mp)
>  {
> +	struct xfs_eofblocks	eofb = {
> +		.eof_flags	= 0,
> +	};
> +
> +	if (xfs_want_reclaim_sick(mp))
> +		eofb.eof_flags |= XFS_ICWALK_FLAG_RECLAIM_SICK;
> +
>  	while (radix_tree_tagged(&mp->m_perag_tree, XFS_ICI_RECLAIM_TAG)) {
>  		xfs_ail_push_all_sync(mp->m_ail);
> -		xfs_icwalk(mp, XFS_ICWALK_RECLAIM, NULL);
> +		xfs_icwalk(mp, XFS_ICWALK_RECLAIM, &eofb);
>  	}
>  }
>  
> @@ -1048,6 +1077,9 @@ xfs_reclaim_inodes_nr(
>  		.icw_scan_limit	= nr_to_scan,
>  	};
>  
> +	if (xfs_want_reclaim_sick(mp))
> +		eofb.eof_flags |= XFS_ICWALK_FLAG_RECLAIM_SICK;
> +
>  	/* kick background reclaimer and push the AIL */
>  	xfs_reclaim_work_queue(mp);
>  	xfs_ail_push_all(mp->m_ail);
> @@ -1605,7 +1637,8 @@ xfs_blockgc_free_quota(
>  static inline bool
>  xfs_icwalk_igrab(
>  	enum xfs_icwalk_goal	goal,
> -	struct xfs_inode	*ip)
> +	struct xfs_inode	*ip,
> +	struct xfs_eofblocks	*eofb)
>  {
>  	switch (goal) {
>  	case XFS_ICWALK_DQRELE:
> @@ -1613,7 +1646,7 @@ xfs_icwalk_igrab(
>  	case XFS_ICWALK_BLOCKGC:
>  		return xfs_blockgc_igrab(ip);
>  	case XFS_ICWALK_RECLAIM:
> -		return xfs_reclaim_igrab(ip);
> +		return xfs_reclaim_igrab(ip, eofb);
>  	default:
>  		return false;
>  	}
> @@ -1702,7 +1735,7 @@ xfs_icwalk_ag(
>  		for (i = 0; i < nr_found; i++) {
>  			struct xfs_inode *ip = batch[i];
>  
> -			if (done || !xfs_icwalk_igrab(goal, ip))
> +			if (done || !xfs_icwalk_igrab(goal, ip, eofb))
>  				batch[i] = NULL;
>  
>  			/*
> 

-- 
Carlos



* Re: [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-08 14:59   ` Carlos Maiolino
@ 2021-06-08 15:21     ` Darrick J. Wong
  2021-06-09  8:55       ` Carlos Maiolino
  0 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2021-06-08 15:21 UTC (permalink / raw)
  To: linux-xfs, david, bfoster

On Tue, Jun 08, 2021 at 04:59:48PM +0200, Carlos Maiolino wrote:
> Hi,
> 
> On Sun, Jun 06, 2021 at 10:54:17AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > When we decide to mark an inode sick, clear the DONTCACHE flag so that
> > the incore inode will be kept around until memory pressure forces it out
> > of memory.  This increases the chances that the sick status will be
> > caught by someone compiling a health report later on.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> 
> The patch looks ok, so you can add:
> 
> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
> 
> 
> Now, I have a probably dumb question about this.
> 
> By removing the I_DONTCACHE flag, as you said, we are increasing the
> chances that the sick status will be caught, so in either case it seems
> not fully reliable.  So, my dumb question is: is there a reason why
> having these inodes around will benefit us somehow?  I haven't read the
> whole code, but I assume it can be used as a
> fast path while scrubbing the FS?

Two answers to your question: In the short term, preserving the incore
inode means that a subsequent reporting run (xfs_spaceman -c 'health')
is more likely to pick up the sickness report.

In the longer term, I intend to re-enable reclamation of sick inodes
by aggregating the per-inode sick bit in the per-AG health status so
that reporting won't be interrupted by memory demand:

[1] https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=indirect-health-reporting

(I haven't rebased that part in quite a while though.)
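
For illustration only, that aggregation might look something like the
sketch below.  xfs_ag_mark_sick() and the xfs_perag_get()/put() helpers
are existing code, but XFS_SICK_AG_INODES is a made-up flag name and
none of this is lifted from the branch above:

	/*
	 * Sketch: fold a sick inode's state into its AG health status so
	 * the report survives the incore inode being reclaimed.
	 */
	static void
	xfs_inode_propagate_sick(
		struct xfs_inode	*ip)
	{
		struct xfs_mount	*mp = ip->i_mount;
		struct xfs_perag	*pag;

		if (!ip->i_sick)
			return;

		pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
		xfs_ag_mark_sick(pag, XFS_SICK_AG_INODES);	/* hypothetical flag */
		xfs_perag_put(pag);
	}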

--D

> 
> Cheers.
> 
> > ---
> >  fs/xfs/xfs_health.c |    9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > index 8e0cb05a7142..806be8a93ea3 100644
> > --- a/fs/xfs/xfs_health.c
> > +++ b/fs/xfs/xfs_health.c
> > @@ -231,6 +231,15 @@ xfs_inode_mark_sick(
> >  	ip->i_sick |= mask;
> >  	ip->i_checked |= mask;
> >  	spin_unlock(&ip->i_flags_lock);
> > +
> > +	/*
> > +	 * Keep this inode around so we don't lose the sickness report.  Scrub
> > +	 * grabs inodes with DONTCACHE assuming that most inodes are ok, which
> > +	 * is not the case here.
> > +	 */
> > +	spin_lock(&VFS_I(ip)->i_lock);
> > +	VFS_I(ip)->i_state &= ~I_DONTCACHE;
> > +	spin_unlock(&VFS_I(ip)->i_lock);
> >  }
> >  
> >  /* Mark parts of an inode healed. */
> > 
> 
> -- 
> Carlos
> 


* Re: [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-08 15:21     ` Darrick J. Wong
@ 2021-06-09  8:55       ` Carlos Maiolino
  0 siblings, 0 replies; 16+ messages in thread
From: Carlos Maiolino @ 2021-06-09  8:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, bfoster

On Tue, Jun 08, 2021 at 08:21:46AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 08, 2021 at 04:59:48PM +0200, Carlos Maiolino wrote:
> > Hi,
> > 
> > On Sun, Jun 06, 2021 at 10:54:17AM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > When we decide to mark an inode sick, clear the DONTCACHE flag so that
> > > the incore inode will be kept around until memory pressure forces it out
> > > of memory.  This increases the chances that the sick status will be
> > > caught by someone compiling a health report later on.
> > > 
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > 
> > The patch looks ok, so you can add:
> > 
> > Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
> > 
> > 
> > Now, I have a probably dumb question about this.
> > 
> > By removing the I_DONTCACHE flag, as you said, we are increasing the
> > chances that the sick status will be caught, so in either case it seems
> > not fully reliable.  So, my dumb question is: is there a reason why
> > having these inodes around will benefit us somehow?  I haven't read the
> > whole code, but I assume it can be used as a
> > fast path while scrubbing the FS?
> 
> Two answers to your question: In the short term, preserving the incore
> inode means that a subsequent reporting run (xfs_spaceman -c 'health')
> is more likely to pick up the sickness report.
> 
> In the longer term, I intend to re-enable reclamation of sick inodes
> by aggregating the per-inode sick bit in the per-AG health status so
> that reporting won't be interrupted by memory demand:
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=indirect-health-reporting
> 
> (I haven't rebased that part in quite a while though.)

Thanks!

> 
> --D
> 
> > 
> > Cheers.
> > 
> > > ---
> > >  fs/xfs/xfs_health.c |    9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > > 
> > > 
> > > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > > index 8e0cb05a7142..806be8a93ea3 100644
> > > --- a/fs/xfs/xfs_health.c
> > > +++ b/fs/xfs/xfs_health.c
> > > @@ -231,6 +231,15 @@ xfs_inode_mark_sick(
> > >  	ip->i_sick |= mask;
> > >  	ip->i_checked |= mask;
> > >  	spin_unlock(&ip->i_flags_lock);
> > > +
> > > +	/*
> > > +	 * Keep this inode around so we don't lose the sickness report.  Scrub
> > > +	 * grabs inodes with DONTCACHE assuming that most inodes are ok, which
> > > +	 * is not the case here.
> > > +	 */
> > > +	spin_lock(&VFS_I(ip)->i_lock);
> > > +	VFS_I(ip)->i_state &= ~I_DONTCACHE;
> > > +	spin_unlock(&VFS_I(ip)->i_lock);
> > >  }
> > >  
> > >  /* Mark parts of an inode healed. */
> > > 
> > 
> > -- 
> > Carlos
> > 
> 

-- 
Carlos



* Re: [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-03  4:34   ` Dave Chinner
@ 2021-06-03 20:49     ` Darrick J. Wong
  0 siblings, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2021-06-03 20:49 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs, bfoster

On Thu, Jun 03, 2021 at 02:34:46PM +1000, Dave Chinner wrote:
> On Wed, Jun 02, 2021 at 08:12:46PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > When we decide to mark an inode sick, clear the DONTCACHE flag so that
> > the incore inode will be kept around until memory pressure forces it out
> > of memory.  This increases the chances that the sick status will be
> > caught by someone compiling a health report later on.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  fs/xfs/xfs_health.c |    5 +++++
> >  fs/xfs/xfs_icache.c |    3 ++-
> >  2 files changed, 7 insertions(+), 1 deletion(-)
> > 
> > 
> > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > index 8e0cb05a7142..824e0b781290 100644
> > --- a/fs/xfs/xfs_health.c
> > +++ b/fs/xfs/xfs_health.c
> > @@ -231,6 +231,11 @@ xfs_inode_mark_sick(
> >  	ip->i_sick |= mask;
> >  	ip->i_checked |= mask;
> >  	spin_unlock(&ip->i_flags_lock);
> > +
> > +	/* Keep this inode around so we don't lose the sickness report. */
> > +	spin_lock(&VFS_I(ip)->i_lock);
> > +	VFS_I(ip)->i_state &= ~I_DONTCACHE;
> > +	spin_unlock(&VFS_I(ip)->i_lock);
> >  }
> 
> Dentries will still be reclaimed, but the VFS will at least hold on
> to the inode in this case.

Right.

> >  /* Mark parts of an inode healed. */
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index c3f912a9231b..0e2b6c05e604 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -23,6 +23,7 @@
> >  #include "xfs_dquot.h"
> >  #include "xfs_reflink.h"
> >  #include "xfs_ialloc.h"
> > +#include "xfs_health.h"
> >  
> >  #include <linux/iversion.h>
> >  
> > @@ -648,7 +649,7 @@ xfs_iget_cache_miss(
> >  	 * time.
> >  	 */
> >  	iflags = XFS_INEW;
> > -	if (flags & XFS_IGET_DONTCACHE)
> > +	if ((flags & XFS_IGET_DONTCACHE) && xfs_inode_is_healthy(ip))
> 
> Hmmmm. xfs_inode_is_healthy() is kind of heavyweight for just
> checking that ip->i_sick == 0. At this point, nobody else can be
> accessing the inode, so we don't need masks nor a spinlock for
> checking the sick field.
> 
> So why not:
> 
> 	if ((flags & XFS_IGET_DONTCACHE) && !READ_ONCE(ip->i_sick))
> 
> Or maybe still use xfs_inode_is_healthy() but convert it to the
> simpler, lockless sick check?

As Brian points out, it's totally unnecessary.

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-03 12:23   ` Brian Foster
@ 2021-06-03 20:48     ` Darrick J. Wong
  0 siblings, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2021-06-03 20:48 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs, david

On Thu, Jun 03, 2021 at 08:23:21AM -0400, Brian Foster wrote:
> On Wed, Jun 02, 2021 at 08:12:46PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > When we decide to mark an inode sick, clear the DONTCACHE flag so that
> > the incore inode will be kept around until memory pressure forces it out
> > of memory.  This increases the chances that the sick status will be
> > caught by someone compiling a health report later on.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  fs/xfs/xfs_health.c |    5 +++++
> >  fs/xfs/xfs_icache.c |    3 ++-
> >  2 files changed, 7 insertions(+), 1 deletion(-)
> > 
> > 
> > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > index 8e0cb05a7142..824e0b781290 100644
> > --- a/fs/xfs/xfs_health.c
> > +++ b/fs/xfs/xfs_health.c
> > @@ -231,6 +231,11 @@ xfs_inode_mark_sick(
> >  	ip->i_sick |= mask;
> >  	ip->i_checked |= mask;
> >  	spin_unlock(&ip->i_flags_lock);
> > +
> > +	/* Keep this inode around so we don't lose the sickness report. */
> > +	spin_lock(&VFS_I(ip)->i_lock);
> > +	VFS_I(ip)->i_state &= ~I_DONTCACHE;
> > +	spin_unlock(&VFS_I(ip)->i_lock);
> 
> If I follow the scrub code correctly, it will grab a dontcache reference
> on the inode, so presumably the intent here is to clear that status once
> we've identified some problem to keep the inode around. Seems
> reasonable.

<nod> I'll expand the comment:

	/*
	 * Keep this inode around so we don't lose the sickness report.
	 * Scrub grabs inodes with DONTCACHE assuming that most inodes
	 * are ok, which is not the case here.
	 */

> 
> >  }
> >  
> >  /* Mark parts of an inode healed. */
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index c3f912a9231b..0e2b6c05e604 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -23,6 +23,7 @@
> >  #include "xfs_dquot.h"
> >  #include "xfs_reflink.h"
> >  #include "xfs_ialloc.h"
> > +#include "xfs_health.h"
> >  
> >  #include <linux/iversion.h>
> >  
> > @@ -648,7 +649,7 @@ xfs_iget_cache_miss(
> >  	 * time.
> >  	 */
> >  	iflags = XFS_INEW;
> > -	if (flags & XFS_IGET_DONTCACHE)
> > +	if ((flags & XFS_IGET_DONTCACHE) && xfs_inode_is_healthy(ip))
> >  		d_mark_dontcache(VFS_I(ip));
> 
> This one I'm less clear on.. we've just allocated ip above and haven't
> made it accessible yet. What's the use case for finding an unhealthy
> inode here?

Hm.  I think I went overboard looking for DONTCACHE here, and it doesn't
make any sense to make this change.  Ok, dropped.

--D

> 
> Brian
> 
> >  	ip->i_udquot = NULL;
> >  	ip->i_gdquot = NULL;
> > 
> 


* Re: [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-03  3:12 ` [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick Darrick J. Wong
  2021-06-03  4:34   ` Dave Chinner
@ 2021-06-03 12:23   ` Brian Foster
  2021-06-03 20:48     ` Darrick J. Wong
  1 sibling, 1 reply; 16+ messages in thread
From: Brian Foster @ 2021-06-03 12:23 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david

On Wed, Jun 02, 2021 at 08:12:46PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> When we decide to mark an inode sick, clear the DONTCACHE flag so that
> the incore inode will be kept around until memory pressure forces it out
> of memory.  This increases the chances that the sick status will be
> caught by someone compiling a health report later on.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/xfs_health.c |    5 +++++
>  fs/xfs/xfs_icache.c |    3 ++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> index 8e0cb05a7142..824e0b781290 100644
> --- a/fs/xfs/xfs_health.c
> +++ b/fs/xfs/xfs_health.c
> @@ -231,6 +231,11 @@ xfs_inode_mark_sick(
>  	ip->i_sick |= mask;
>  	ip->i_checked |= mask;
>  	spin_unlock(&ip->i_flags_lock);
> +
> +	/* Keep this inode around so we don't lose the sickness report. */
> +	spin_lock(&VFS_I(ip)->i_lock);
> +	VFS_I(ip)->i_state &= ~I_DONTCACHE;
> +	spin_unlock(&VFS_I(ip)->i_lock);

If I follow the scrub code correctly, it will grab a dontcache reference
on the inode, so presumably the intent here is to clear that status once
we've identified some problem to keep the inode around. Seems
reasonable.
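
For reference, that grab looks roughly like the call below -- paraphrased
from xchk_get_inode() rather than quoted exactly, so treat the details as
approximate:

	/* scrub brings inodes in with DONTCACHE on the cache miss path */
	error = xfs_iget(mp, NULL, sc->sm->sm_ino,
			XFS_IGET_UNTRUSTED | XFS_IGET_DONTCACHE, 0, &ip);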

>  }
>  
>  /* Mark parts of an inode healed. */
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index c3f912a9231b..0e2b6c05e604 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -23,6 +23,7 @@
>  #include "xfs_dquot.h"
>  #include "xfs_reflink.h"
>  #include "xfs_ialloc.h"
> +#include "xfs_health.h"
>  
>  #include <linux/iversion.h>
>  
> @@ -648,7 +649,7 @@ xfs_iget_cache_miss(
>  	 * time.
>  	 */
>  	iflags = XFS_INEW;
> -	if (flags & XFS_IGET_DONTCACHE)
> +	if ((flags & XFS_IGET_DONTCACHE) && xfs_inode_is_healthy(ip))
>  		d_mark_dontcache(VFS_I(ip));

This one I'm less clear on.. we've just allocated ip above and haven't
made it accessible yet. What's the use case for finding an unhealthy
inode here?

Brian

>  	ip->i_udquot = NULL;
>  	ip->i_gdquot = NULL;
> 



* Re: [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-03  3:12 ` [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick Darrick J. Wong
@ 2021-06-03  4:34   ` Dave Chinner
  2021-06-03 20:49     ` Darrick J. Wong
  2021-06-03 12:23   ` Brian Foster
  1 sibling, 1 reply; 16+ messages in thread
From: Dave Chinner @ 2021-06-03  4:34 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, bfoster

On Wed, Jun 02, 2021 at 08:12:46PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> When we decide to mark an inode sick, clear the DONTCACHE flag so that
> the incore inode will be kept around until memory pressure forces it out
> of memory.  This increases the chances that the sick status will be
> caught by someone compiling a health report later on.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/xfs_health.c |    5 +++++
>  fs/xfs/xfs_icache.c |    3 ++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> index 8e0cb05a7142..824e0b781290 100644
> --- a/fs/xfs/xfs_health.c
> +++ b/fs/xfs/xfs_health.c
> @@ -231,6 +231,11 @@ xfs_inode_mark_sick(
>  	ip->i_sick |= mask;
>  	ip->i_checked |= mask;
>  	spin_unlock(&ip->i_flags_lock);
> +
> +	/* Keep this inode around so we don't lose the sickness report. */
> +	spin_lock(&VFS_I(ip)->i_lock);
> +	VFS_I(ip)->i_state &= ~I_DONTCACHE;
> +	spin_unlock(&VFS_I(ip)->i_lock);
>  }

Dentries will still be reclaimed, but the VFS will at least hold on
to the inode in this case.

>  /* Mark parts of an inode healed. */
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index c3f912a9231b..0e2b6c05e604 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -23,6 +23,7 @@
>  #include "xfs_dquot.h"
>  #include "xfs_reflink.h"
>  #include "xfs_ialloc.h"
> +#include "xfs_health.h"
>  
>  #include <linux/iversion.h>
>  
> @@ -648,7 +649,7 @@ xfs_iget_cache_miss(
>  	 * time.
>  	 */
>  	iflags = XFS_INEW;
> -	if (flags & XFS_IGET_DONTCACHE)
> +	if ((flags & XFS_IGET_DONTCACHE) && xfs_inode_is_healthy(ip))

Hmmmm. xfs_inode_is_healthy() is kind of heavyweight for just
checking that ip->i_sick == 0. At this point, nobody else can be
accessing the inode, so we don't need masks nor a spinlock for
checking the sick field.

So why not:

	if ((flags & XFS_IGET_DONTCACHE) && !READ_ONCE(ip->i_sick))

Or maybe still use xfs_inode_is_healthy() but convert it to the
simpler, lockless sick check?
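
Something like this minimal sketch, assuming the callers only ever need
to know whether anything at all has been marked sick:

	static inline bool
	xfs_inode_is_healthy(
		struct xfs_inode	*ip)
	{
		return READ_ONCE(ip->i_sick) == 0;
	}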

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* [PATCH 2/3] xfs: drop IDONTCACHE on inodes when we mark them sick
  2021-06-03  3:12 [PATCHSET v2 0/3] xfs: preserve inode health reports for longer Darrick J. Wong
@ 2021-06-03  3:12 ` Darrick J. Wong
  2021-06-03  4:34   ` Dave Chinner
  2021-06-03 12:23   ` Brian Foster
  0 siblings, 2 replies; 16+ messages in thread
From: Darrick J. Wong @ 2021-06-03  3:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, david, bfoster

From: Darrick J. Wong <djwong@kernel.org>

When we decide to mark an inode sick, clear the DONTCACHE flag so that
the incore inode will be kept around until memory pressure forces it out
of memory.  This increases the chances that the sick status will be
caught by someone compiling a health report later on.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_health.c |    5 +++++
 fs/xfs/xfs_icache.c |    3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index 8e0cb05a7142..824e0b781290 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -231,6 +231,11 @@ xfs_inode_mark_sick(
 	ip->i_sick |= mask;
 	ip->i_checked |= mask;
 	spin_unlock(&ip->i_flags_lock);
+
+	/* Keep this inode around so we don't lose the sickness report. */
+	spin_lock(&VFS_I(ip)->i_lock);
+	VFS_I(ip)->i_state &= ~I_DONTCACHE;
+	spin_unlock(&VFS_I(ip)->i_lock);
 }
 
 /* Mark parts of an inode healed. */
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index c3f912a9231b..0e2b6c05e604 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -23,6 +23,7 @@
 #include "xfs_dquot.h"
 #include "xfs_reflink.h"
 #include "xfs_ialloc.h"
+#include "xfs_health.h"
 
 #include <linux/iversion.h>
 
@@ -648,7 +649,7 @@ xfs_iget_cache_miss(
 	 * time.
 	 */
 	iflags = XFS_INEW;
-	if (flags & XFS_IGET_DONTCACHE)
+	if ((flags & XFS_IGET_DONTCACHE) && xfs_inode_is_healthy(ip))
 		d_mark_dontcache(VFS_I(ip));
 	ip->i_udquot = NULL;
 	ip->i_gdquot = NULL;

