linux-xfs.vger.kernel.org archive mirror
* [PATCH v4 0/9] xfs: report corruption to the health trackers
@ 2019-11-14 18:19 Darrick J. Wong
  2019-11-14 18:19 ` [PATCH 1/9] xfs: separate the marking of sick and checked metadata Darrick J. Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:19 UTC
  To: darrick.wong; +Cc: linux-xfs

Hi all,

This series adds the ability to record hints of corruption, found during
normal operation, in the health tracking system so that administrators
can gather reports about the status of live filesystems.  In the future,
the online fsck subsystem will be able to use this information to scan
metadata in a more targeted way.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This has been lightly tested with fstests.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=corruption-health-reports

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=corruption-health-reports


* [PATCH 1/9] xfs: separate the marking of sick and checked metadata
  2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
@ 2019-11-14 18:19 ` Darrick J. Wong
  2019-11-20 14:20   ` Brian Foster
  2019-11-14 18:19 ` [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system Darrick J. Wong
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:19 UTC
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Split the setting of the sick and checked masks into separate functions
as part of preparing to add the ability for regular runtime fs code
(i.e. not scrub) to mark metadata structures sick when corruptions are
found.  Improve the documentation of libxfs' requirements for helper
behavior.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
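The four sick/checked states and the helper contract documented in
xfs_health.h can be modelled in a few lines of userspace C.  This is only
a sketch: `struct health`, the `health_*` functions, `scrub_update` and the
`HEALTH_*` names are hypothetical stand-ins, and the spinlocks that the
kernel helpers take are left out.

```c
#include <assert.h>

/* Hypothetical stand-in for one health-tracked object (fs, AG, inode). */
struct health {
	unsigned int	sick;		/* bits set when corruption was observed */
	unsigned int	checked;	/* bits set when a checker examined it */
};

/* The four combinations described in the xfs_health.h comment. */
enum health_state {
	HEALTH_UNEXAMINED,	/* !checked && !sick */
	HEALTH_SUSPECT,		/* !checked && sick: runtime code saw errors */
	HEALTH_OK,		/* checked && !sick */
	HEALTH_NEEDS_REPAIR,	/* checked && sick */
};

/* Set the sick flags and leave the checked flags alone. */
void health_mark_sick(struct health *h, unsigned int mask)
{
	h->sick |= mask;
}

/* Set the checked flags only. */
void health_mark_checked(struct health *h, unsigned int mask)
{
	h->checked |= mask;
}

/* Clear the sick flags and set the checked flags. */
void health_mark_healthy(struct health *h, unsigned int mask)
{
	h->sick &= ~mask;
	h->checked |= mask;
}

/* Return both masks through the out parameters. */
void health_measure(const struct health *h, unsigned int *sick,
		unsigned int *checked)
{
	*sick = h->sick;
	*checked = h->checked;
}

/* Classify a single flag bit into one of the four states. */
enum health_state health_classify(const struct health *h, unsigned int bit)
{
	int	is_sick = (h->sick & bit) != 0;
	int	is_checked = (h->checked & bit) != 0;

	if (is_checked)
		return is_sick ? HEALTH_NEEDS_REPAIR : HEALTH_OK;
	return is_sick ? HEALTH_SUSPECT : HEALTH_UNEXAMINED;
}

/* What a scrubber does after a full check: it always sets checked. */
void scrub_update(struct health *h, unsigned int mask, int bad)
{
	if (bad) {
		health_mark_sick(h, mask);
		health_mark_checked(h, mask);
	} else {
		health_mark_healthy(h, mask);
	}
}
```

In this model, runtime code that trips over corruption would call only
`health_mark_sick`, which is what produces the new "!checked && sick"
state; a scrubber goes through `scrub_update` and therefore always ends up
in one of the two checked states.
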
 fs/xfs/libxfs/xfs_health.h |   24 ++++++++++++++++++----
 fs/xfs/scrub/health.c      |   20 +++++++++++-------
 fs/xfs/xfs_health.c        |   49 ++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_mount.c         |    5 ++++
 4 files changed, 85 insertions(+), 13 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
index 272005ac8c88..3657a9cb8490 100644
--- a/fs/xfs/libxfs/xfs_health.h
+++ b/fs/xfs/libxfs/xfs_health.h
@@ -26,9 +26,11 @@
  * and the "sick" field tells us if that piece was found to need repairs.
  * Therefore we can conclude that for a given sick flag value:
  *
- *  - checked && sick  => metadata needs repair
- *  - checked && !sick => metadata is ok
- *  - !checked         => has not been examined since mount
+ *  - checked && sick   => metadata needs repair
+ *  - checked && !sick  => metadata is ok
+ *  - !checked && sick  => errors have been observed during normal operation,
+ *                         but the metadata has not been checked thoroughly
+ *  - !checked && !sick => has not been examined since mount
  */
 
 struct xfs_mount;
@@ -97,24 +99,38 @@ struct xfs_fsop_geom;
 				 XFS_SICK_INO_SYMLINK | \
 				 XFS_SICK_INO_PARENT)
 
-/* These functions must be provided by the xfs implementation. */
+/*
+ * These functions must be provided by the xfs implementation.  Function
+ * behavior with respect to the first argument should be as follows:
+ *
+ * xfs_*_mark_sick:    set the sick flags and do not set checked flags.
+ * xfs_*_mark_checked: set the checked flags.
+ * xfs_*_mark_healthy: clear the sick flags and set the checked flags.
+ *
+ * xfs_*_measure_sickness: return the sick and checked status in the provided
+ * out parameters.
+ */
 
 void xfs_fs_mark_sick(struct xfs_mount *mp, unsigned int mask);
+void xfs_fs_mark_checked(struct xfs_mount *mp, unsigned int mask);
 void xfs_fs_mark_healthy(struct xfs_mount *mp, unsigned int mask);
 void xfs_fs_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
 		unsigned int *checked);
 
 void xfs_rt_mark_sick(struct xfs_mount *mp, unsigned int mask);
+void xfs_rt_mark_checked(struct xfs_mount *mp, unsigned int mask);
 void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
 void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
 		unsigned int *checked);
 
 void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
+void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
 void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
 void xfs_ag_measure_sickness(struct xfs_perag *pag, unsigned int *sick,
 		unsigned int *checked);
 
 void xfs_inode_mark_sick(struct xfs_inode *ip, unsigned int mask);
+void xfs_inode_mark_checked(struct xfs_inode *ip, unsigned int mask);
 void xfs_inode_mark_healthy(struct xfs_inode *ip, unsigned int mask);
 void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
 		unsigned int *checked);
diff --git a/fs/xfs/scrub/health.c b/fs/xfs/scrub/health.c
index 83d27cdf579b..a402f9026d5f 100644
--- a/fs/xfs/scrub/health.c
+++ b/fs/xfs/scrub/health.c
@@ -137,30 +137,34 @@ xchk_update_health(
 	switch (type_to_health_flag[sc->sm->sm_type].group) {
 	case XHG_AG:
 		pag = xfs_perag_get(sc->mp, sc->sm->sm_agno);
-		if (bad)
+		if (bad) {
 			xfs_ag_mark_sick(pag, sc->sick_mask);
-		else
+			xfs_ag_mark_checked(pag, sc->sick_mask);
+		} else
 			xfs_ag_mark_healthy(pag, sc->sick_mask);
 		xfs_perag_put(pag);
 		break;
 	case XHG_INO:
 		if (!sc->ip)
 			return;
-		if (bad)
+		if (bad) {
 			xfs_inode_mark_sick(sc->ip, sc->sick_mask);
-		else
+			xfs_inode_mark_checked(sc->ip, sc->sick_mask);
+		} else
 			xfs_inode_mark_healthy(sc->ip, sc->sick_mask);
 		break;
 	case XHG_FS:
-		if (bad)
+		if (bad) {
 			xfs_fs_mark_sick(sc->mp, sc->sick_mask);
-		else
+			xfs_fs_mark_checked(sc->mp, sc->sick_mask);
+		} else
 			xfs_fs_mark_healthy(sc->mp, sc->sick_mask);
 		break;
 	case XHG_RT:
-		if (bad)
+		if (bad) {
 			xfs_rt_mark_sick(sc->mp, sc->sick_mask);
-		else
+			xfs_rt_mark_checked(sc->mp, sc->sick_mask);
+		} else
 			xfs_rt_mark_healthy(sc->mp, sc->sick_mask);
 		break;
 	default:
diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index 8e0cb05a7142..860dc70c99e7 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -100,6 +100,18 @@ xfs_fs_mark_sick(
 
 	spin_lock(&mp->m_sb_lock);
 	mp->m_fs_sick |= mask;
+	spin_unlock(&mp->m_sb_lock);
+}
+
+/* Mark per-fs metadata as having been checked. */
+void
+xfs_fs_mark_checked(
+	struct xfs_mount	*mp,
+	unsigned int		mask)
+{
+	ASSERT(!(mask & ~XFS_SICK_FS_PRIMARY));
+
+	spin_lock(&mp->m_sb_lock);
 	mp->m_fs_checked |= mask;
 	spin_unlock(&mp->m_sb_lock);
 }
@@ -143,6 +155,19 @@ xfs_rt_mark_sick(
 
 	spin_lock(&mp->m_sb_lock);
 	mp->m_rt_sick |= mask;
+	spin_unlock(&mp->m_sb_lock);
+}
+
+/* Mark realtime metadata as having been checked. */
+void
+xfs_rt_mark_checked(
+	struct xfs_mount	*mp,
+	unsigned int		mask)
+{
+	ASSERT(!(mask & ~XFS_SICK_RT_PRIMARY));
+	trace_xfs_rt_mark_sick(mp, mask);
+
+	spin_lock(&mp->m_sb_lock);
 	mp->m_rt_checked |= mask;
 	spin_unlock(&mp->m_sb_lock);
 }
@@ -186,6 +211,18 @@ xfs_ag_mark_sick(
 
 	spin_lock(&pag->pag_state_lock);
 	pag->pag_sick |= mask;
+	spin_unlock(&pag->pag_state_lock);
+}
+
+/* Mark per-ag metadata as having been checked. */
+void
+xfs_ag_mark_checked(
+	struct xfs_perag	*pag,
+	unsigned int		mask)
+{
+	ASSERT(!(mask & ~XFS_SICK_AG_PRIMARY));
+
+	spin_lock(&pag->pag_state_lock);
 	pag->pag_checked |= mask;
 	spin_unlock(&pag->pag_state_lock);
 }
@@ -229,6 +266,18 @@ xfs_inode_mark_sick(
 
 	spin_lock(&ip->i_flags_lock);
 	ip->i_sick |= mask;
+	spin_unlock(&ip->i_flags_lock);
+}
+
+/* Mark inode metadata as having been checked. */
+void
+xfs_inode_mark_checked(
+	struct xfs_inode	*ip,
+	unsigned int		mask)
+{
+	ASSERT(!(mask & ~XFS_SICK_INO_PRIMARY));
+
+	spin_lock(&ip->i_flags_lock);
 	ip->i_checked |= mask;
 	spin_unlock(&ip->i_flags_lock);
 }
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index fca65109cf24..27aa143d524b 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -555,8 +555,10 @@ xfs_check_summary_counts(
 	if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) &&
 	    (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks ||
 	     !xfs_verify_icount(mp, mp->m_sb.sb_icount) ||
-	     mp->m_sb.sb_ifree > mp->m_sb.sb_icount))
+	     mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) {
 		xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS);
+		xfs_fs_mark_checked(mp, XFS_SICK_FS_COUNTERS);
+	}
 
 	/*
 	 * We can safely re-initialise incore superblock counters from the
@@ -1322,6 +1324,7 @@ xfs_force_summary_recalc(
 		return;
 
 	xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS);
+	xfs_fs_mark_checked(mp, XFS_SICK_FS_COUNTERS);
 }
 
 /*



* [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system
  2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
  2019-11-14 18:19 ` [PATCH 1/9] xfs: separate the marking of sick and checked metadata Darrick J. Wong
@ 2019-11-14 18:19 ` Darrick J. Wong
  2019-11-20 14:20   ` Brian Foster
  2019-11-14 18:19 ` [PATCH 3/9] xfs: report block map " Darrick J. Wong
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:19 UTC
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Whenever we encounter a corrupt AG header, record that in the health
tracking system so that it can be reported later.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
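The pattern this patch repeats at each AG header read is "classify the
error, mark the AG sick, then return the error unchanged".  A minimal
userspace sketch of that pattern follows.  The names `metadata_is_sick`,
`ag_lookup`, `agno_mark_sick` and `read_ag_header` are hypothetical, the
fixed-size `ags` table stands in for the per-AG structures, and the two
errno aliases mirror how the kernel's XFS code defines its corruption
codes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* EUCLEAN is Linux-specific; fall back to the Linux value if missing. */
#ifndef EUCLEAN
#define EUCLEAN		117
#endif

/* The kernel's XFS code aliases its corruption codes to these errnos. */
#define EFSCORRUPTED	EUCLEAN
#define EFSBADCRC	EBADMSG

/* Same three error codes as the xfs_metadata_is_sick() macro. */
int metadata_is_sick(int error)
{
	return error == -EFSCORRUPTED || error == -EIO ||
	       error == -EFSBADCRC;
}

/* Hypothetical per-AG state; only the sick mask is modelled. */
struct ag_health {
	unsigned int	sick;
};

#define NR_AGS	4
struct ag_health ags[NR_AGS];

/* Stand-in for a per-AG lookup that can fail. */
struct ag_health *ag_lookup(unsigned int agno)
{
	return agno < NR_AGS ? &ags[agno] : NULL;
}

/* Mark an AG sick by number; quietly do nothing if it is not set up. */
void agno_mark_sick(unsigned int agno, unsigned int mask)
{
	struct ag_health	*ag = ag_lookup(agno);

	assert(mask != 0);
	if (!ag)
		return;
	ag->sick |= mask;
}

/* Record the failure first, then hand the original error back. */
int read_ag_header(unsigned int agno, int io_result, unsigned int mask)
{
	if (metadata_is_sick(io_result))
		agno_mark_sick(agno, mask);
	return io_result;
}
```

Errors such as -ENOMEM pass through without touching the health state,
since they say nothing about the condition of the metadata on disk.
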
 fs/xfs/libxfs/xfs_alloc.c    |    6 ++++++
 fs/xfs/libxfs/xfs_health.h   |    6 ++++++
 fs/xfs/libxfs/xfs_ialloc.c   |    3 +++
 fs/xfs/libxfs/xfs_refcount.c |    5 ++++-
 fs/xfs/libxfs/xfs_rmap.c     |    5 ++++-
 fs/xfs/libxfs/xfs_sb.c       |    2 ++
 fs/xfs/xfs_health.c          |   17 +++++++++++++++++
 fs/xfs/xfs_inode.c           |    9 +++++++++
 8 files changed, 51 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index c284e10af491..e75e3ae6c912 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -26,6 +26,7 @@
 #include "xfs_log.h"
 #include "xfs_ag_resv.h"
 #include "xfs_bmap.h"
+#include "xfs_health.h"
 
 extern kmem_zone_t	*xfs_bmap_free_item_zone;
 
@@ -699,6 +700,8 @@ xfs_alloc_read_agfl(
 			mp, tp, mp->m_ddev_targp,
 			XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
 			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_agfl_buf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGFL);
 	if (error)
 		return error;
 	xfs_buf_set_ref(bp, XFS_AGFL_REF);
@@ -722,6 +725,7 @@ xfs_alloc_update_counters(
 	if (unlikely(be32_to_cpu(agf->agf_freeblks) >
 		     be32_to_cpu(agf->agf_length))) {
 		xfs_buf_corruption_error(agbp);
+		xfs_ag_mark_sick(pag, XFS_SICK_AG_AGF);
 		return -EFSCORRUPTED;
 	}
 
@@ -2952,6 +2956,8 @@ xfs_read_agf(
 			mp, tp, mp->m_ddev_targp,
 			XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
 			XFS_FSS_TO_BB(mp, 1), flags, bpp, &xfs_agf_buf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGF);
 	if (error)
 		return error;
 	if (!*bpp)
diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
index 3657a9cb8490..ce8954a10c66 100644
--- a/fs/xfs/libxfs/xfs_health.h
+++ b/fs/xfs/libxfs/xfs_health.h
@@ -123,6 +123,8 @@ void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
 void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
 		unsigned int *checked);
 
+void xfs_agno_mark_sick(struct xfs_mount *mp, xfs_agnumber_t agno,
+		unsigned int mask);
 void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
 void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
 void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
@@ -203,4 +205,8 @@ void xfs_fsop_geom_health(struct xfs_mount *mp, struct xfs_fsop_geom *geo);
 void xfs_ag_geom_health(struct xfs_perag *pag, struct xfs_ag_geometry *ageo);
 void xfs_bulkstat_health(struct xfs_inode *ip, struct xfs_bulkstat *bs);
 
+#define xfs_metadata_is_sick(error) \
+	(unlikely((error) == -EFSCORRUPTED || (error) == -EIO || \
+		  (error) == -EFSBADCRC))
+
 #endif	/* __XFS_HEALTH_H__ */
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index 988cde7744e6..c401512a4350 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -27,6 +27,7 @@
 #include "xfs_trace.h"
 #include "xfs_log.h"
 #include "xfs_rmap.h"
+#include "xfs_health.h"
 
 /*
  * Lookup a record by ino in the btree given by cur.
@@ -2635,6 +2636,8 @@ xfs_read_agi(
 	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
 			XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
 			XFS_FSS_TO_BB(mp, 1), 0, bpp, &xfs_agi_buf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
 	if (error)
 		return error;
 	if (tp)
diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
index d7d702ee4d1a..25c87834e42a 100644
--- a/fs/xfs/libxfs/xfs_refcount.c
+++ b/fs/xfs/libxfs/xfs_refcount.c
@@ -22,6 +22,7 @@
 #include "xfs_bit.h"
 #include "xfs_refcount.h"
 #include "xfs_rmap.h"
+#include "xfs_health.h"
 
 /* Allowable refcount adjustment amounts. */
 enum xfs_refc_adjust_op {
@@ -1177,8 +1178,10 @@ xfs_refcount_finish_one(
 				XFS_ALLOC_FLAG_FREEING, &agbp);
 		if (error)
 			return error;
-		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
+		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
+			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
 			return -EFSCORRUPTED;
+		}
 
 		rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, agno);
 		if (!rcur) {
diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
index ff9412f113c4..a54a3c129cce 100644
--- a/fs/xfs/libxfs/xfs_rmap.c
+++ b/fs/xfs/libxfs/xfs_rmap.c
@@ -21,6 +21,7 @@
 #include "xfs_errortag.h"
 #include "xfs_error.h"
 #include "xfs_inode.h"
+#include "xfs_health.h"
 
 /*
  * Lookup the first record less than or equal to [bno, len, owner, offset]
@@ -2400,8 +2401,10 @@ xfs_rmap_finish_one(
 		error = xfs_free_extent_fix_freelist(tp, agno, &agbp);
 		if (error)
 			return error;
-		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
+		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
+			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
 			return -EFSCORRUPTED;
+		}
 
 		rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, agno);
 		if (!rcur) {
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index 0ac69751fe85..4a923545465d 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -1169,6 +1169,8 @@ xfs_sb_read_secondary(
 	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
 			XFS_AG_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
 			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_SB);
 	if (error)
 		return error;
 	xfs_buf_set_ref(bp, XFS_SSB_REF);
diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index 860dc70c99e7..36c32b108b39 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -200,6 +200,23 @@ xfs_rt_measure_sickness(
 	spin_unlock(&mp->m_sb_lock);
 }
 
+/* Mark unhealthy per-ag metadata given a raw AG number. */
+void
+xfs_agno_mark_sick(
+	struct xfs_mount	*mp,
+	xfs_agnumber_t		agno,
+	unsigned int		mask)
+{
+	struct xfs_perag	*pag = xfs_perag_get(mp, agno);
+
+	/* per-ag structure not set up yet? */
+	if (!pag)
+		return;
+
+	xfs_ag_mark_sick(pag, mask);
+	xfs_perag_put(pag);
+}
+
 /* Mark unhealthy per-ag metadata. */
 void
 xfs_ag_mark_sick(
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 401da197f012..a2812cea748d 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -35,6 +35,7 @@
 #include "xfs_log.h"
 #include "xfs_bmap_btree.h"
 #include "xfs_reflink.h"
+#include "xfs_health.h"
 
 kmem_zone_t *xfs_inode_zone;
 
@@ -787,6 +788,8 @@ xfs_ialloc(
 	 */
 	if ((pip && ino == pip->i_ino) || !xfs_verify_dir_ino(mp, ino)) {
 		xfs_alert(mp, "Allocated a known in-use inode 0x%llx!", ino);
+		xfs_agno_mark_sick(mp, XFS_INO_TO_AGNO(mp, ino),
+				XFS_SICK_AG_INOBT);
 		return -EFSCORRUPTED;
 	}
 
@@ -2137,6 +2140,7 @@ xfs_iunlink_update_bucket(
 	 */
 	if (old_value == new_agino) {
 		xfs_buf_corruption_error(agibp);
+		xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGI);
 		return -EFSCORRUPTED;
 	}
 
@@ -2203,6 +2207,7 @@ xfs_iunlink_update_inode(
 	if (!xfs_verify_agino_or_null(mp, agno, old_value)) {
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
 				sizeof(*dip), __this_address);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 		error = -EFSCORRUPTED;
 		goto out;
 	}
@@ -2217,6 +2222,7 @@ xfs_iunlink_update_inode(
 		if (next_agino != NULLAGINO) {
 			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
 					dip, sizeof(*dip), __this_address);
+			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 			error = -EFSCORRUPTED;
 		}
 		goto out;
@@ -2271,6 +2277,7 @@ xfs_iunlink(
 	if (next_agino == agino ||
 	    !xfs_verify_agino_or_null(mp, agno, next_agino)) {
 		xfs_buf_corruption_error(agibp);
+		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
 		return -EFSCORRUPTED;
 	}
 
@@ -2408,6 +2415,7 @@ xfs_iunlink_map_prev(
 			XFS_CORRUPTION_ERROR(__func__,
 					XFS_ERRLEVEL_LOW, mp,
 					*dipp, sizeof(**dipp));
+			xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
 			error = -EFSCORRUPTED;
 			return error;
 		}
@@ -2454,6 +2462,7 @@ xfs_iunlink_remove(
 	if (!xfs_verify_agino(mp, agno, head_agino)) {
 		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
 				agi, sizeof(*agi));
+		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
 		return -EFSCORRUPTED;
 	}
 



* [PATCH 3/9] xfs: report block map corruption errors to the health tracking system
  2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
  2019-11-14 18:19 ` [PATCH 1/9] xfs: separate the marking of sick and checked metadata Darrick J. Wong
  2019-11-14 18:19 ` [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system Darrick J. Wong
@ 2019-11-14 18:19 ` Darrick J. Wong
  2019-11-20 14:21   ` Brian Foster
  2019-11-14 18:19 ` [PATCH 4/9] xfs: report btree block corruption errors to the health system Darrick J. Wong
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:19 UTC
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Whenever we encounter a corrupt block mapping, record that in the
health tracking system so that it can be reported later.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
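The new xfs_bmap_mark_sick() helper is essentially a translation from a
fork identifier to a per-inode sick flag.  The sketch below shows that
mapping in isolation.  The enum, the `SICK_BMBT*` bit values and
`struct inode_health` are placeholders chosen for illustration, not the
kernel's definitions, and an unknown fork is ignored here rather than
tripping an assertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fork identifiers. */
enum fork_kind {
	DATA_FORK,
	ATTR_FORK,
	COW_FORK,
};

/* Placeholder flag bits, one per fork's block mapping. */
#define SICK_BMBTD	(1u << 1)	/* data fork mapping */
#define SICK_BMBTA	(1u << 2)	/* attr fork mapping */
#define SICK_BMBTC	(1u << 3)	/* CoW fork mapping */

/* Hypothetical per-inode state; only the sick mask is modelled. */
struct inode_health {
	unsigned int	sick;
};

/* Translate a fork into its sick flag; 0 means "unknown fork". */
unsigned int fork_to_sick_mask(int whichfork)
{
	switch (whichfork) {
	case DATA_FORK:
		return SICK_BMBTD;
	case ATTR_FORK:
		return SICK_BMBTA;
	case COW_FORK:
		return SICK_BMBTC;
	default:
		return 0;
	}
}

/* Mark the block mapping of one fork sick. */
void bmap_mark_sick(struct inode_health *ip, int whichfork)
{
	assert(ip != NULL);
	ip->sick |= fork_to_sick_mask(whichfork);
}
```

Keeping the translation in one place means callers only need to know
which fork they were working on, not which flag belongs to it.
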
 fs/xfs/libxfs/xfs_bmap.c   |   39 +++++++++++++++++++++++++++++++++------
 fs/xfs/libxfs/xfs_health.h |    1 +
 fs/xfs/xfs_health.c        |   26 ++++++++++++++++++++++++++
 fs/xfs/xfs_iomap.c         |   15 +++++++++++----
 4 files changed, 71 insertions(+), 10 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 4acc6e37c31d..c4674fb0bfb4 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -35,7 +35,7 @@
 #include "xfs_refcount.h"
 #include "xfs_icache.h"
 #include "xfs_iomap.h"
-
+#include "xfs_health.h"
 
 kmem_zone_t		*xfs_bmap_free_item_zone;
 
@@ -732,6 +732,7 @@ xfs_bmap_extents_to_btree(
 	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, 1L);
 	abp = xfs_btree_get_bufl(mp, tp, args.fsbno);
 	if (XFS_IS_CORRUPT(mp, !abp)) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		error = -EFSCORRUPTED;
 		goto out_unreserve_dquot;
 	}
@@ -1021,6 +1022,7 @@ xfs_bmap_add_attrfork_local(
 
 	/* should only be called for types that support local format data */
 	ASSERT(0);
+	xfs_bmap_mark_sick(ip, XFS_ATTR_FORK);
 	return -EFSCORRUPTED;
 }
 
@@ -1090,6 +1092,7 @@ xfs_bmap_add_attrfork(
 	if (XFS_IFORK_Q(ip))
 		goto trans_cancel;
 	if (XFS_IS_CORRUPT(mp, ip->i_d.di_anextents != 0)) {
+		xfs_bmap_mark_sick(ip, XFS_ATTR_FORK);
 		error = -EFSCORRUPTED;
 		goto trans_cancel;
 	}
@@ -1192,6 +1195,7 @@ xfs_iread_bmbt_block(
 				(unsigned long long)ip->i_ino);
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, block,
 				sizeof(*block), __this_address);
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
 	}
 
@@ -1207,6 +1211,7 @@ xfs_iread_bmbt_block(
 			xfs_inode_verifier_error(ip, -EFSCORRUPTED,
 					"xfs_iread_extents(2)", frp,
 					sizeof(*frp), fa);
+			xfs_bmap_mark_sick(ip, whichfork);
 			return -EFSCORRUPTED;
 		}
 		xfs_iext_insert(ip, &ir->icur, &new,
@@ -1239,6 +1244,7 @@ xfs_iread_extents(
 	if (XFS_IS_CORRUPT(mp,
 			   XFS_IFORK_FORMAT(ip, whichfork) !=
 			   XFS_DINODE_FMT_BTREE)) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		error = -EFSCORRUPTED;
 		goto out;
 	}
@@ -1254,6 +1260,7 @@ xfs_iread_extents(
 
 	if (XFS_IS_CORRUPT(mp,
 			   ir.loaded != XFS_IFORK_NEXTENTS(ip, whichfork))) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		error = -EFSCORRUPTED;
 		goto out;
 	}
@@ -1262,6 +1269,8 @@ xfs_iread_extents(
 	ifp->if_flags |= XFS_IFEXTENTS;
 	return 0;
 out:
+	if (xfs_metadata_is_sick(error))
+		xfs_bmap_mark_sick(ip, whichfork);
 	xfs_iext_destroy(ifp);
 	return error;
 }
@@ -1344,6 +1353,7 @@ xfs_bmap_last_before(
 		break;
 	default:
 		ASSERT(0);
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
 	}
 
@@ -1443,8 +1453,11 @@ xfs_bmap_last_offset(
 	if (XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_LOCAL)
 		return 0;
 
-	if (XFS_IS_CORRUPT(ip->i_mount, !xfs_ifork_has_extents(ip, whichfork)))
+	if (XFS_IS_CORRUPT(ip->i_mount,
+	    !xfs_ifork_has_extents(ip, whichfork))) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
+	}
 
 	error = xfs_bmap_last_extent(NULL, ip, whichfork, &rec, &is_empty);
 	if (error || is_empty)
@@ -3905,6 +3918,7 @@ xfs_bmapi_read(
 
 	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
 	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
 	}
 
@@ -3935,6 +3949,7 @@ xfs_bmapi_read(
 		xfs_alert(mp, "%s: inode %llu missing fork %d",
 				__func__, ip->i_ino, whichfork);
 #endif /* DEBUG */
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
 	}
 
@@ -4414,6 +4429,7 @@ xfs_bmapi_write(
 
 	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
 	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
 	}
 
@@ -4621,9 +4637,11 @@ xfs_bmapi_convert_delalloc(
 	error = -ENOSPC;
 	if (WARN_ON_ONCE(bma.blkno == NULLFSBLOCK))
 		goto out_finish;
-	error = -EFSCORRUPTED;
-	if (WARN_ON_ONCE(!xfs_valid_startblock(ip, bma.got.br_startblock)))
+	if (WARN_ON_ONCE(!xfs_valid_startblock(ip, bma.got.br_startblock))) {
+		xfs_bmap_mark_sick(ip, whichfork);
+		error = -EFSCORRUPTED;
 		goto out_finish;
+	}
 
 	XFS_STATS_ADD(mp, xs_xstrat_bytes, XFS_FSB_TO_B(mp, bma.length));
 	XFS_STATS_INC(mp, xs_xstrat_quick);
@@ -4681,6 +4699,7 @@ xfs_bmapi_remap(
 
 	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
 	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
 	}
 
@@ -5319,8 +5338,10 @@ __xfs_bunmapi(
 	whichfork = xfs_bmapi_whichfork(flags);
 	ASSERT(whichfork != XFS_COW_FORK);
 	ifp = XFS_IFORK_PTR(ip, whichfork);
-	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)))
+	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork))) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
+	}
 	if (XFS_FORCED_SHUTDOWN(mp))
 		return -EIO;
 
@@ -5815,6 +5836,7 @@ xfs_bmap_collapse_extents(
 
 	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
 	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
 	}
 
@@ -5932,6 +5954,7 @@ xfs_bmap_insert_extents(
 
 	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
 	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
 	}
 
@@ -6038,6 +6061,7 @@ xfs_bmap_split_extent_at(
 
 	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
 	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
 	}
 
@@ -6253,8 +6277,10 @@ xfs_bmap_finish_one(
 			XFS_FSB_TO_AGBNO(tp->t_mountp, startblock),
 			ip->i_ino, whichfork, startoff, *blockcount, state);
 
-	if (WARN_ON_ONCE(whichfork != XFS_DATA_FORK))
+	if (WARN_ON_ONCE(whichfork != XFS_DATA_FORK)) {
+		xfs_bmap_mark_sick(ip, whichfork);
 		return -EFSCORRUPTED;
+	}
 
 	if (XFS_TEST_ERROR(false, tp->t_mountp,
 			XFS_ERRTAG_BMAP_FINISH_ONE))
@@ -6272,6 +6298,7 @@ xfs_bmap_finish_one(
 		break;
 	default:
 		ASSERT(0);
+		xfs_bmap_mark_sick(ip, whichfork);
 		error = -EFSCORRUPTED;
 	}
 
diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
index ce8954a10c66..25b61180b562 100644
--- a/fs/xfs/libxfs/xfs_health.h
+++ b/fs/xfs/libxfs/xfs_health.h
@@ -138,6 +138,7 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
 		unsigned int *checked);
 
 void xfs_health_unmount(struct xfs_mount *mp);
+void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
 
 /* Now some helpers. */
 
diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index 36c32b108b39..5e5de5338476 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -452,3 +452,29 @@ xfs_bulkstat_health(
 			bs->bs_sick |= m->ioctl_mask;
 	}
 }
+
+/* Mark a block mapping sick. */
+void
+xfs_bmap_mark_sick(
+	struct xfs_inode	*ip,
+	int			whichfork)
+{
+	unsigned int		mask;
+
+	switch (whichfork) {
+	case XFS_DATA_FORK:
+		mask = XFS_SICK_INO_BMBTD;
+		break;
+	case XFS_ATTR_FORK:
+		mask = XFS_SICK_INO_BMBTA;
+		break;
+	case XFS_COW_FORK:
+		mask = XFS_SICK_INO_BMBTC;
+		break;
+	default:
+		ASSERT(0);
+		return;
+	}
+
+	xfs_inode_mark_sick(ip, mask);
+}
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 28e2d1f37267..c1befb899911 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -27,7 +27,7 @@
 #include "xfs_dquot_item.h"
 #include "xfs_dquot.h"
 #include "xfs_reflink.h"
-
+#include "xfs_health.h"
 
 #define XFS_ALLOC_ALIGN(mp, off) \
 	(((off) >> mp->m_allocsize_log) << mp->m_allocsize_log)
@@ -59,8 +59,10 @@ xfs_bmbt_to_iomap(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
 
-	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock)))
+	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock))) {
+		xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
 		return xfs_alert_fsblock_zero(ip, imap);
+	}
 
 	if (imap->br_startblock == HOLESTARTBLOCK) {
 		iomap->addr = IOMAP_NULL_ADDR;
@@ -277,8 +279,10 @@ xfs_iomap_write_direct(
 		goto out_unlock;
 	}
 
-	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock)))
+	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock))) {
+		xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
 		error = xfs_alert_fsblock_zero(ip, imap);
+	}
 
 out_unlock:
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
@@ -598,8 +602,10 @@ xfs_iomap_write_unwritten(
 		if (error)
 			return error;
 
-		if (unlikely(!xfs_valid_startblock(ip, imap.br_startblock)))
+		if (unlikely(!xfs_valid_startblock(ip, imap.br_startblock))) {
+			xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
 			return xfs_alert_fsblock_zero(ip, &imap);
+		}
 
 		if ((numblks_fsb = imap.br_blockcount) == 0) {
 			/*
@@ -858,6 +864,7 @@ xfs_buffered_write_iomap_begin(
 
 	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, XFS_DATA_FORK)) ||
 	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
+		xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
 		error = -EFSCORRUPTED;
 		goto out_unlock;
 	}



* [PATCH 4/9] xfs: report btree block corruption errors to the health system
  2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
                   ` (2 preceding siblings ...)
  2019-11-14 18:19 ` [PATCH 3/9] xfs: report block map " Darrick J. Wong
@ 2019-11-14 18:19 ` Darrick J. Wong
  2019-11-14 18:19 ` [PATCH 5/9] xfs: report dir/attr " Darrick J. Wong
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:19 UTC
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Whenever a btree cursor tells us that the btree is corrupt, record that
in the health tracking system so that it can be reported to the
administrator later.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
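Most hunks in this patch have the same shape: a btree operation reports
a status value, and if that value is not the expected one, the cursor's
btree is marked sick before -EFSCORRUPTED is returned.  The sketch below
models that shape.  How xfs_btree_mark_sick() maps a cursor to a health
flag is not shown in this excerpt, so the `btree_kind` enum, the flag
table and `btree_expect_stat` are assumptions made purely for
illustration.

```c
#include <assert.h>
#include <errno.h>

/* EUCLEAN is Linux-specific; fall back to the Linux value if missing. */
#ifndef EUCLEAN
#define EUCLEAN		117
#endif
#define EFSCORRUPTED	EUCLEAN

/* Hypothetical btree types that a cursor can walk. */
enum btree_kind {
	BT_BNO,		/* free space by block number */
	BT_CNT,		/* free space by size */
	BT_NR,
};

/* Hypothetical per-AG state; only the sick mask is modelled. */
struct ag_state {
	unsigned int	sick;
};

/* A cursor knows which btree it walks and which AG owns that btree. */
struct btree_cur {
	enum btree_kind	kind;
	struct ag_state	*ag;
};

/* Assumed mapping: one placeholder sick bit per btree type. */
void btree_mark_sick(struct btree_cur *cur)
{
	static const unsigned int kind_to_mask[BT_NR] = {
		[BT_BNO] = 1u << 0,
		[BT_CNT] = 1u << 1,
	};

	assert(cur->kind < BT_NR);
	cur->ag->sick |= kind_to_mask[cur->kind];
}

/* Check an operation's status; mark the btree sick on a mismatch. */
int btree_expect_stat(struct btree_cur *cur, int stat, int expected)
{
	if (stat != expected) {
		btree_mark_sick(cur);
		return -EFSCORRUPTED;
	}
	return 0;
}
```

Because the cursor carries the btree type, call sites in this model never
need to name a health flag themselves; they only pass the cursor they
were already holding.
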
 fs/xfs/libxfs/xfs_alloc.c    |  102 +++++++++++++++++++++++++++++++++++-------
 fs/xfs/libxfs/xfs_bmap.c     |   83 ++++++++++++++++++++++++++++++++--
 fs/xfs/libxfs/xfs_btree.c    |   30 ++++++++++++
 fs/xfs/libxfs/xfs_health.h   |    2 +
 fs/xfs/libxfs/xfs_ialloc.c   |   53 ++++++++++++++++++----
 fs/xfs/libxfs/xfs_refcount.c |   35 ++++++++++++++
 fs/xfs/libxfs/xfs_rmap.c     |   85 +++++++++++++++++++++++++++++++++--
 fs/xfs/libxfs/xfs_rmap.h     |    2 -
 fs/xfs/scrub/rmap.c          |    2 -
 fs/xfs/xfs_discard.c         |    2 +
 fs/xfs/xfs_health.c          |   39 ++++++++++++++++
 fs/xfs/xfs_iwalk.c           |    4 +-
 12 files changed, 397 insertions(+), 42 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index e75e3ae6c912..1456b61eaa09 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -261,6 +261,7 @@ xfs_alloc_get_rec(
 		cur->bc_btnum == XFS_BTNUM_BNO ? "Block" : "Size", agno);
 	xfs_warn(mp,
 		"start block 0x%x block count 0x%x", *bno, *len);
+	xfs_btree_mark_sick(cur);
 	return -EFSCORRUPTED;
 }
 
@@ -455,14 +456,18 @@ xfs_alloc_fixup_trees(
 		if (XFS_IS_CORRUPT(mp,
 				   i != 1 ||
 				   nfbno1 != fbno ||
-				   nflen1 != flen))
+				   nflen1 != flen)) {
+			xfs_btree_mark_sick(cnt_cur);
 			return -EFSCORRUPTED;
+		}
 #endif
 	} else {
 		if ((error = xfs_alloc_lookup_eq(cnt_cur, fbno, flen, &i)))
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 1))
+		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			return -EFSCORRUPTED;
+		}
 	}
 	/*
 	 * Look up the record in the by-block tree if necessary.
@@ -474,14 +479,18 @@ xfs_alloc_fixup_trees(
 		if (XFS_IS_CORRUPT(mp,
 				   i != 1 ||
 				   nfbno1 != fbno ||
-				   nflen1 != flen))
+				   nflen1 != flen)) {
+			xfs_btree_mark_sick(bno_cur);
 			return -EFSCORRUPTED;
+		}
 #endif
 	} else {
 		if ((error = xfs_alloc_lookup_eq(bno_cur, fbno, flen, &i)))
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 1))
+		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(bno_cur);
 			return -EFSCORRUPTED;
+		}
 	}
 
 #ifdef DEBUG
@@ -494,8 +503,10 @@ xfs_alloc_fixup_trees(
 
 		if (XFS_IS_CORRUPT(mp,
 				   bnoblock->bb_numrecs !=
-				   cntblock->bb_numrecs))
+				   cntblock->bb_numrecs)) {
+			xfs_btree_mark_sick(bno_cur);
 			return -EFSCORRUPTED;
+		}
 	}
 #endif
 
@@ -525,30 +536,40 @@ xfs_alloc_fixup_trees(
 	 */
 	if ((error = xfs_btree_delete(cnt_cur, &i)))
 		return error;
-	if (XFS_IS_CORRUPT(mp, i != 1))
+	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cnt_cur);
 		return -EFSCORRUPTED;
+	}
 	/*
 	 * Add new by-size btree entry(s).
 	 */
 	if (nfbno1 != NULLAGBLOCK) {
 		if ((error = xfs_alloc_lookup_eq(cnt_cur, nfbno1, nflen1, &i)))
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 0))
+		if (XFS_IS_CORRUPT(mp, i != 0)) {
+			xfs_btree_mark_sick(cnt_cur);
 			return -EFSCORRUPTED;
+		}
 		if ((error = xfs_btree_insert(cnt_cur, &i)))
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 1))
+		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			return -EFSCORRUPTED;
+		}
 	}
 	if (nfbno2 != NULLAGBLOCK) {
 		if ((error = xfs_alloc_lookup_eq(cnt_cur, nfbno2, nflen2, &i)))
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 0))
+		if (XFS_IS_CORRUPT(mp, i != 0)) {
+			xfs_btree_mark_sick(cnt_cur);
 			return -EFSCORRUPTED;
+		}
 		if ((error = xfs_btree_insert(cnt_cur, &i)))
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 1))
+		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			return -EFSCORRUPTED;
+		}
 	}
 	/*
 	 * Fix up the by-block btree entry(s).
@@ -559,8 +580,10 @@ xfs_alloc_fixup_trees(
 		 */
 		if ((error = xfs_btree_delete(bno_cur, &i)))
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 1))
+		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(bno_cur);
 			return -EFSCORRUPTED;
+		}
 	} else {
 		/*
 		 * Update the by-block entry to start later|be shorter.
@@ -574,12 +597,16 @@ xfs_alloc_fixup_trees(
 		 */
 		if ((error = xfs_alloc_lookup_eq(bno_cur, nfbno2, nflen2, &i)))
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 0))
+		if (XFS_IS_CORRUPT(mp, i != 0)) {
+			xfs_btree_mark_sick(bno_cur);
 			return -EFSCORRUPTED;
+		}
 		if ((error = xfs_btree_insert(bno_cur, &i)))
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 1))
+		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(bno_cur);
 			return -EFSCORRUPTED;
+		}
 	}
 	return 0;
 }
@@ -843,8 +870,10 @@ xfs_alloc_cur_check(
 	error = xfs_alloc_get_rec(cur, &bno, &len, &i);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(args->mp, i != 1))
+	if (XFS_IS_CORRUPT(args->mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	/*
 	 * Check minlen and deactivate a cntbt cursor if out of acceptable size
@@ -1050,6 +1079,7 @@ xfs_alloc_ag_vextent_small(
 		if (error)
 			goto error;
 		if (XFS_IS_CORRUPT(args->mp, i != 1)) {
+			xfs_btree_mark_sick(ccur);
 			error = -EFSCORRUPTED;
 			goto error;
 		}
@@ -1076,6 +1106,7 @@ xfs_alloc_ag_vextent_small(
 
 		bp = xfs_btree_get_bufs(args->mp, args->tp, args->agno, fbno);
 		if (XFS_IS_CORRUPT(args->mp, !bp)) {
+			xfs_btree_mark_sick(ccur);
 			error = -EFSCORRUPTED;
 			goto error;
 		}
@@ -1086,6 +1117,7 @@ xfs_alloc_ag_vextent_small(
 	if (XFS_IS_CORRUPT(args->mp,
 			   fbno >= be32_to_cpu(
 				   XFS_BUF_TO_AGF(args->agbp)->agf_length))) {
+		xfs_btree_mark_sick(ccur);
 		error = -EFSCORRUPTED;
 		goto error;
 	}
@@ -1244,6 +1276,7 @@ xfs_alloc_ag_vextent_exact(
 	if (error)
 		goto error0;
 	if (XFS_IS_CORRUPT(args->mp, i != 1)) {
+		xfs_btree_mark_sick(bno_cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -1525,8 +1558,10 @@ xfs_alloc_ag_vextent_lastblock(
 			error = xfs_alloc_get_rec(acur->cnt, bno, len, &i);
 			if (error)
 				return error;
-			if (XFS_IS_CORRUPT(args->mp, i != 1))
+			if (XFS_IS_CORRUPT(args->mp, i != 1)) {
+				xfs_btree_mark_sick(acur->cnt);
 				return -EFSCORRUPTED;
+			}
 			if (*len >= args->minlen)
 				break;
 			error = xfs_btree_increment(acur->cnt, 0, &i);
@@ -1721,6 +1756,7 @@ xfs_alloc_ag_vextent_size(
 			if (error)
 				goto error0;
 			if (XFS_IS_CORRUPT(args->mp, i != 1)) {
+				xfs_btree_mark_sick(cnt_cur);
 				error = -EFSCORRUPTED;
 				goto error0;
 			}
@@ -1761,6 +1797,7 @@ xfs_alloc_ag_vextent_size(
 			   rlen != 0 &&
 			   (rlen > flen ||
 			    rbno + rlen > fbno + flen))) {
+		xfs_btree_mark_sick(cnt_cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -1783,6 +1820,7 @@ xfs_alloc_ag_vextent_size(
 					&i)))
 				goto error0;
 			if (XFS_IS_CORRUPT(args->mp, i != 1)) {
+				xfs_btree_mark_sick(cnt_cur);
 				error = -EFSCORRUPTED;
 				goto error0;
 			}
@@ -1795,6 +1833,7 @@ xfs_alloc_ag_vextent_size(
 					   rlen != 0 &&
 					   (rlen > flen ||
 					    rbno + rlen > fbno + flen))) {
+				xfs_btree_mark_sick(cnt_cur);
 				error = -EFSCORRUPTED;
 				goto error0;
 			}
@@ -1811,6 +1850,7 @@ xfs_alloc_ag_vextent_size(
 				&i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(args->mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -1837,6 +1877,7 @@ xfs_alloc_ag_vextent_size(
 
 	rlen = args->len;
 	if (XFS_IS_CORRUPT(args->mp, rlen > flen)) {
+		xfs_btree_mark_sick(cnt_cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -1857,6 +1898,7 @@ xfs_alloc_ag_vextent_size(
 			   args->agbno + args->len >
 			   be32_to_cpu(
 				   XFS_BUF_TO_AGF(args->agbp)->agf_length))) {
+		xfs_ag_mark_sick(args->pag, XFS_SICK_AG_BNOBT);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -1932,6 +1974,7 @@ xfs_free_ag_extent(
 		if ((error = xfs_alloc_get_rec(bno_cur, &ltbno, &ltlen, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(bno_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -1947,6 +1990,7 @@ xfs_free_ag_extent(
 			 * Very bad.
 			 */
 			if (XFS_IS_CORRUPT(mp, ltbno + ltlen > bno)) {
+				xfs_btree_mark_sick(bno_cur);
 				error = -EFSCORRUPTED;
 				goto error0;
 			}
@@ -1965,6 +2009,7 @@ xfs_free_ag_extent(
 		if ((error = xfs_alloc_get_rec(bno_cur, &gtbno, &gtlen, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(bno_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -1980,6 +2025,7 @@ xfs_free_ag_extent(
 			 * Very bad.
 			 */
 			if (XFS_IS_CORRUPT(mp, bno + len > gtbno)) {
+				xfs_btree_mark_sick(bno_cur);
 				error = -EFSCORRUPTED;
 				goto error0;
 			}
@@ -2000,12 +2046,14 @@ xfs_free_ag_extent(
 		if ((error = xfs_alloc_lookup_eq(cnt_cur, ltbno, ltlen, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
 		if ((error = xfs_btree_delete(cnt_cur, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -2015,12 +2063,14 @@ xfs_free_ag_extent(
 		if ((error = xfs_alloc_lookup_eq(cnt_cur, gtbno, gtlen, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
 		if ((error = xfs_btree_delete(cnt_cur, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -2030,6 +2080,7 @@ xfs_free_ag_extent(
 		if ((error = xfs_btree_delete(bno_cur, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(bno_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -2039,6 +2090,7 @@ xfs_free_ag_extent(
 		if ((error = xfs_btree_decrement(bno_cur, 0, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(bno_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -2058,6 +2110,7 @@ xfs_free_ag_extent(
 					   i != 1 ||
 					   xxbno != ltbno ||
 					   xxlen != ltlen)) {
+				xfs_btree_mark_sick(bno_cur);
 				error = -EFSCORRUPTED;
 				goto error0;
 			}
@@ -2082,12 +2135,14 @@ xfs_free_ag_extent(
 		if ((error = xfs_alloc_lookup_eq(cnt_cur, ltbno, ltlen, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
 		if ((error = xfs_btree_delete(cnt_cur, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -2098,6 +2153,7 @@ xfs_free_ag_extent(
 		if ((error = xfs_btree_decrement(bno_cur, 0, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(bno_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -2117,12 +2173,14 @@ xfs_free_ag_extent(
 		if ((error = xfs_alloc_lookup_eq(cnt_cur, gtbno, gtlen, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
 		if ((error = xfs_btree_delete(cnt_cur, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cnt_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -2145,6 +2203,7 @@ xfs_free_ag_extent(
 		if ((error = xfs_btree_insert(bno_cur, &i)))
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(bno_cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -2157,12 +2216,14 @@ xfs_free_ag_extent(
 	if ((error = xfs_alloc_lookup_eq(cnt_cur, nbno, nlen, &i)))
 		goto error0;
 	if (XFS_IS_CORRUPT(mp, i != 0)) {
+		xfs_btree_mark_sick(cnt_cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
 	if ((error = xfs_btree_insert(cnt_cur, &i)))
 		goto error0;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cnt_cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -2344,8 +2405,10 @@ xfs_free_agfl_block(
 		return error;
 
 	bp = xfs_btree_get_bufs(tp->t_mountp, tp, agno, agbno);
-	if (XFS_IS_CORRUPT(tp->t_mountp, !bp))
+	if (XFS_IS_CORRUPT(tp->t_mountp, !bp)) {
+		xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGFL);
 		return -EFSCORRUPTED;
+	}
 	xfs_trans_binval(tp, bp);
 
 	return 0;
@@ -3287,10 +3350,14 @@ __xfs_free_extent(
 		return -EIO;
 
 	error = xfs_free_extent_fix_freelist(tp, agno, &agbp);
-	if (error)
+	if (error) {
+		if (xfs_metadata_is_sick(error))
+			xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_BNOBT);
 		return error;
+	}
 
 	if (XFS_IS_CORRUPT(mp, agbno >= mp->m_sb.sb_agblocks)) {
+		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_BNOBT);
 		error = -EFSCORRUPTED;
 		goto err;
 	}
@@ -3299,6 +3366,7 @@ __xfs_free_extent(
 	if (XFS_IS_CORRUPT(mp,
 			   agbno + len >
 			   be32_to_cpu(XFS_BUF_TO_AGF(agbp)->agf_length))) {
+		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_BNOBT);
 		error = -EFSCORRUPTED;
 		goto err;
 	}
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index c4674fb0bfb4..4a13db25054b 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -369,6 +369,8 @@ xfs_bmap_check_leaf_extents(
 			error = xfs_btree_read_bufl(mp, NULL, bno, &bp,
 						XFS_BMAP_BTREE_REF,
 						&xfs_bmbt_buf_ops);
+			if (xfs_metadata_is_sick(error))
+				xfs_btree_mark_sick(cur);
 			if (error)
 				goto error_norelse;
 		}
@@ -385,6 +387,7 @@ xfs_bmap_check_leaf_extents(
 		pp = XFS_BMBT_PTR_ADDR(mp, block, 1, mp->m_bmap_dmxr[1]);
 		bno = be64_to_cpu(*pp);
 		if (XFS_IS_CORRUPT(mp, !xfs_verify_fsbno(mp, bno))) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -455,6 +458,8 @@ xfs_bmap_check_leaf_extents(
 			error = xfs_btree_read_bufl(mp, NULL, bno, &bp,
 						XFS_BMAP_BTREE_REF,
 						&xfs_bmbt_buf_ops);
+			if (xfs_metadata_is_sick(error))
+				xfs_btree_mark_sick(cur);
 			if (error)
 				goto error_norelse;
 		}
@@ -614,11 +619,15 @@ xfs_bmap_btree_to_extents(
 	pp = XFS_BMAP_BROOT_PTR_ADDR(mp, rblock, 1, ifp->if_broot_bytes);
 	cbno = be64_to_cpu(*pp);
 #ifdef DEBUG
-	if (XFS_IS_CORRUPT(cur->bc_mp, !xfs_btree_check_lptr(cur, cbno, 1)))
+	if (XFS_IS_CORRUPT(cur->bc_mp, !xfs_btree_check_lptr(cur, cbno, 1))) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 #endif
 	error = xfs_btree_read_bufl(mp, tp, cbno, &cbp, XFS_BMAP_BTREE_REF,
 				&xfs_bmbt_buf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_btree_mark_sick(cur);
 	if (error)
 		return error;
 	cblock = XFS_BUF_TO_BLOCK(cbp);
@@ -941,6 +950,7 @@ xfs_bmap_add_attrfork_btree(
 			goto error0;
 		/* must be at least one entry */
 		if (XFS_IS_CORRUPT(mp, stat != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -1635,6 +1645,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1642,6 +1653,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1649,6 +1661,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1678,6 +1691,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1711,6 +1725,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1739,6 +1754,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 0)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1746,6 +1762,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1780,6 +1797,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1804,6 +1822,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 0)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1811,6 +1830,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1855,6 +1875,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1890,6 +1911,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 0)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1897,6 +1919,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1976,6 +1999,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 0)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -1983,6 +2007,7 @@ xfs_bmap_add_extent_delay_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(bma->cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2180,30 +2205,35 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
 			if ((error = xfs_btree_delete(cur, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
 			if ((error = xfs_btree_decrement(cur, 0, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
 			if ((error = xfs_btree_delete(cur, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
 			if ((error = xfs_btree_decrement(cur, 0, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2233,18 +2263,21 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
 			if ((error = xfs_btree_delete(cur, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
 			if ((error = xfs_btree_decrement(cur, 0, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2277,18 +2310,21 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
 			if ((error = xfs_btree_delete(cur, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
 			if ((error = xfs_btree_decrement(cur, 0, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2315,6 +2351,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2348,6 +2385,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2385,6 +2423,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2395,6 +2434,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if ((error = xfs_btree_insert(cur, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2425,6 +2465,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2462,6 +2503,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2472,12 +2514,14 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 0)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
 			if ((error = xfs_btree_insert(cur, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2515,6 +2559,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2527,6 +2572,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if ((error = xfs_btree_insert(cur, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2539,6 +2585,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 0)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2546,6 +2593,7 @@ xfs_bmap_add_extent_unwritten_real(
 			if ((error = xfs_btree_insert(cur, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2832,6 +2880,7 @@ xfs_bmap_add_extent_hole_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2839,6 +2888,7 @@ xfs_bmap_add_extent_hole_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2846,6 +2896,7 @@ xfs_bmap_add_extent_hole_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2875,6 +2926,7 @@ xfs_bmap_add_extent_hole_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2905,6 +2957,7 @@ xfs_bmap_add_extent_hole_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2931,6 +2984,7 @@ xfs_bmap_add_extent_hole_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 0)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -2938,6 +2992,7 @@ xfs_bmap_add_extent_hole_real(
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -5131,6 +5186,7 @@ xfs_bmap_del_extent_real(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -5158,6 +5214,7 @@ xfs_bmap_del_extent_real(
 		if ((error = xfs_btree_delete(cur, &i)))
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -5232,6 +5289,7 @@ xfs_bmap_del_extent_real(
 				if (error)
 					goto done;
 				if (XFS_IS_CORRUPT(mp, i != 1)) {
+					xfs_btree_mark_sick(cur);
 					error = -EFSCORRUPTED;
 					goto done;
 				}
@@ -5252,6 +5310,7 @@ xfs_bmap_del_extent_real(
 				goto done;
 			}
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto done;
 			}
@@ -5735,21 +5794,27 @@ xfs_bmse_merge(
 	error = xfs_bmbt_lookup_eq(cur, got, &i);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(mp, i != 1))
+	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	error = xfs_btree_delete(cur, &i);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(mp, i != 1))
+	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	/* lookup and update size of the previous extent */
 	error = xfs_bmbt_lookup_eq(cur, left, &i);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(mp, i != 1))
+	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	error = xfs_bmbt_update(cur, &new);
 	if (error)
@@ -5797,8 +5862,10 @@ xfs_bmap_shift_update_extent(
 		error = xfs_bmbt_lookup_eq(cur, &prev, &i);
 		if (error)
 			return error;
-		if (XFS_IS_CORRUPT(mp, i != 1))
+		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			return -EFSCORRUPTED;
+		}
 
 		error = xfs_bmbt_update(cur, got);
 		if (error)
@@ -5861,6 +5928,7 @@ xfs_bmap_collapse_extents(
 		goto del_cursor;
 	}
 	if (XFS_IS_CORRUPT(mp, isnullstartblock(got.br_startblock))) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto del_cursor;
 	}
@@ -5988,12 +6056,14 @@ xfs_bmap_insert_extents(
 		}
 	}
 	if (XFS_IS_CORRUPT(mp, isnullstartblock(got.br_startblock))) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto del_cursor;
 	}
 
 	if (XFS_IS_CORRUPT(mp,
 			   stop_fsb >= got.br_startoff + got.br_blockcount)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto del_cursor;
 	}
@@ -6096,6 +6166,7 @@ xfs_bmap_split_extent_at(
 		if (error)
 			goto del_cursor;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto del_cursor;
 		}
@@ -6124,6 +6195,7 @@ xfs_bmap_split_extent_at(
 		if (error)
 			goto del_cursor;
 		if (XFS_IS_CORRUPT(mp, i != 0)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto del_cursor;
 		}
@@ -6131,6 +6203,7 @@ xfs_bmap_split_extent_at(
 		if (error)
 			goto del_cursor;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto del_cursor;
 		}
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 8f0e3a368f38..893368357cce 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -20,6 +20,7 @@
 #include "xfs_trace.h"
 #include "xfs_alloc.h"
 #include "xfs_log.h"
+#include "xfs_health.h"
 
 /*
  * Cursor allocation zone.
@@ -109,6 +110,7 @@ xfs_btree_check_lblock(
 	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BTREE_CHECK_LBLOCK)) {
 		if (bp)
 			trace_xfs_btree_corrupt(bp, _RET_IP_);
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
 	}
 	return 0;
@@ -172,6 +174,7 @@ xfs_btree_check_sblock(
 	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BTREE_CHECK_SBLOCK)) {
 		if (bp)
 			trace_xfs_btree_corrupt(bp, _RET_IP_);
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
 	}
 	return 0;
@@ -247,6 +250,7 @@ xfs_btree_check_ptr(
 				level, index);
 	}
 
+	xfs_btree_mark_sick(cur);
 	return -EFSCORRUPTED;
 }
 
@@ -426,6 +430,8 @@ xfs_btree_dup_cursor(
 						   XFS_BUF_ADDR(bp), mp->m_bsize,
 						   0, &bp,
 						   cur->bc_ops->buf_ops);
+			if (xfs_metadata_is_sick(error))
+				xfs_btree_mark_sick(new);
 			if (error) {
 				xfs_btree_del_cursor(new, error);
 				*ncur = NULL;
@@ -1306,6 +1312,8 @@ xfs_btree_read_buf_block(
 	error = xfs_trans_read_buf(mp, cur->bc_tp, mp->m_ddev_targp, d,
 				   mp->m_bsize, flags, bpp,
 				   cur->bc_ops->buf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_btree_mark_sick(cur);
 	if (error)
 		return error;
 
@@ -1616,6 +1624,7 @@ xfs_btree_increment(
 		if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
 			goto out0;
 		ASSERT(0);
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -1709,6 +1718,7 @@ xfs_btree_decrement(
 		if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
 			goto out0;
 		ASSERT(0);
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -1801,6 +1811,7 @@ xfs_btree_lookup_get_block(
 	*blkp = NULL;
 	xfs_buf_corruption_error(bp);
 	xfs_trans_brelse(cur->bc_tp, bp);
+	xfs_btree_mark_sick(cur);
 	return -EFSCORRUPTED;
 }
 
@@ -1847,8 +1858,10 @@ xfs_btree_lookup(
 	XFS_BTREE_STATS_INC(cur, lookup);
 
 	/* No such thing as a zero-level tree. */
-	if (XFS_IS_CORRUPT(cur->bc_mp, cur->bc_nlevels == 0))
+	if (XFS_IS_CORRUPT(cur->bc_mp, cur->bc_nlevels == 0)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	block = NULL;
 	keyno = 0;
@@ -1891,6 +1904,7 @@ xfs_btree_lookup(
 							XFS_ERRLEVEL_LOW,
 							cur->bc_mp, block,
 							sizeof(*block));
+					xfs_btree_mark_sick(cur);
 					return -EFSCORRUPTED;
 				}
 
@@ -1967,8 +1981,10 @@ xfs_btree_lookup(
 			error = xfs_btree_increment(cur, 0, &i);
 			if (error)
 				goto error0;
-			if (XFS_IS_CORRUPT(cur->bc_mp, i != 1))
+			if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				return -EFSCORRUPTED;
+			}
 			*stat = 1;
 			return 0;
 		}
@@ -2424,6 +2440,7 @@ xfs_btree_lshift(
 			goto error0;
 		i = xfs_btree_firstrec(tcur, level);
 		if (XFS_IS_CORRUPT(tcur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -2594,6 +2611,7 @@ xfs_btree_rshift(
 		goto error0;
 	i = xfs_btree_lastrec(tcur, level);
 	if (XFS_IS_CORRUPT(tcur->bc_mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -3451,6 +3469,7 @@ xfs_btree_insert(
 		}
 
 		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -3858,6 +3877,7 @@ xfs_btree_delrec(
 		 */
 		i = xfs_btree_lastrec(tcur, level);
 		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -3866,12 +3886,14 @@ xfs_btree_delrec(
 		if (error)
 			goto error0;
 		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
 
 		i = xfs_btree_lastrec(tcur, level);
 		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -3919,6 +3941,7 @@ xfs_btree_delrec(
 		if (!xfs_btree_ptr_is_null(cur, &lptr)) {
 			i = xfs_btree_firstrec(tcur, level);
 			if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto error0;
 			}
@@ -3927,6 +3950,7 @@ xfs_btree_delrec(
 			if (error)
 				goto error0;
 			if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto error0;
 			}
@@ -3944,6 +3968,7 @@ xfs_btree_delrec(
 		 */
 		i = xfs_btree_firstrec(tcur, level);
 		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -3953,6 +3978,7 @@ xfs_btree_delrec(
 			goto error0;
 		i = xfs_btree_firstrec(tcur, level);
 		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
index 25b61180b562..2049419e9555 100644
--- a/fs/xfs/libxfs/xfs_health.h
+++ b/fs/xfs/libxfs/xfs_health.h
@@ -37,6 +37,7 @@ struct xfs_mount;
 struct xfs_perag;
 struct xfs_inode;
 struct xfs_fsop_geom;
+struct xfs_btree_cur;
 
 /* Observable health issues for metadata spanning the entire filesystem. */
 #define XFS_SICK_FS_COUNTERS	(1 << 0)  /* summary counters */
@@ -139,6 +140,7 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
 
 void xfs_health_unmount(struct xfs_mount *mp);
 void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
+void xfs_btree_mark_sick(struct xfs_btree_cur *cur);
 
 /* Now some helpers. */
 
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index c401512a4350..3571fe2f3113 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -143,6 +143,7 @@ xfs_inobt_get_rec(
 "start inode 0x%x, count 0x%x, free 0x%x freemask 0x%llx, holemask 0x%x",
 		irec->ir_startino, irec->ir_count, irec->ir_freecount,
 		irec->ir_free, irec->ir_holemask);
+	xfs_btree_mark_sick(cur);
 	return -EFSCORRUPTED;
 }
 
@@ -546,6 +547,7 @@ xfs_inobt_insert_sprec(
 		if (error)
 			goto error;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error;
 		}
@@ -562,10 +564,12 @@ xfs_inobt_insert_sprec(
 		if (error)
 			goto error;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error;
 		}
 		if (XFS_IS_CORRUPT(mp, rec.ir_startino != nrec->ir_startino)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error;
 		}
@@ -575,6 +579,7 @@ xfs_inobt_insert_sprec(
 		 * cannot merge, something is seriously wrong.
 		 */
 		if (XFS_IS_CORRUPT(mp, !__xfs_inobt_can_merge(nrec, &rec))) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error;
 		}
@@ -1067,8 +1072,10 @@ xfs_ialloc_next_rec(
 		error = xfs_inobt_get_rec(cur, rec, &i);
 		if (error)
 			return error;
-		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1))
+		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			return -EFSCORRUPTED;
+		}
 	}
 
 	return 0;
@@ -1092,8 +1099,10 @@ xfs_ialloc_get_rec(
 		error = xfs_inobt_get_rec(cur, rec, &i);
 		if (error)
 			return error;
-		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1))
+		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			return -EFSCORRUPTED;
+		}
 	}
 
 	return 0;
@@ -1174,6 +1183,7 @@ xfs_dialloc_ag_inobt(
 		if (error)
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -1182,6 +1192,7 @@ xfs_dialloc_ag_inobt(
 		if (error)
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, j != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -1340,6 +1351,7 @@ xfs_dialloc_ag_inobt(
 	if (error)
 		goto error0;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -1349,6 +1361,7 @@ xfs_dialloc_ag_inobt(
 		if (error)
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -1358,6 +1371,7 @@ xfs_dialloc_ag_inobt(
 		if (error)
 			goto error0;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error0;
 		}
@@ -1420,8 +1434,10 @@ xfs_dialloc_ag_finobt_near(
 		error = xfs_inobt_get_rec(lcur, rec, &i);
 		if (error)
 			return error;
-		if (XFS_IS_CORRUPT(lcur->bc_mp, i != 1))
+		if (XFS_IS_CORRUPT(lcur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(lcur);
 			return -EFSCORRUPTED;
+		}
 
 		/*
 		 * See if we've landed in the parent inode record. The finobt
@@ -1445,12 +1461,14 @@ xfs_dialloc_ag_finobt_near(
 		if (error)
 			goto error_rcur;
 		if (XFS_IS_CORRUPT(lcur->bc_mp, j != 1)) {
+			xfs_btree_mark_sick(lcur);
 			error = -EFSCORRUPTED;
 			goto error_rcur;
 		}
 	}
 
 	if (XFS_IS_CORRUPT(lcur->bc_mp, i != 1 && j != 1)) {
+		xfs_btree_mark_sick(lcur);
 		error = -EFSCORRUPTED;
 		goto error_rcur;
 	}
@@ -1506,8 +1524,10 @@ xfs_dialloc_ag_finobt_newino(
 			error = xfs_inobt_get_rec(cur, rec, &i);
 			if (error)
 				return error;
-			if (XFS_IS_CORRUPT(cur->bc_mp, i != 1))
+			if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				return -EFSCORRUPTED;
+			}
 			return 0;
 		}
 	}
@@ -1518,14 +1538,18 @@ xfs_dialloc_ag_finobt_newino(
 	error = xfs_inobt_lookup(cur, 0, XFS_LOOKUP_GE, &i);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(cur->bc_mp, i != 1))
+	if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	error = xfs_inobt_get_rec(cur, rec, &i);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(cur->bc_mp, i != 1))
+	if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	return 0;
 }
@@ -1547,14 +1571,18 @@ xfs_dialloc_ag_update_inobt(
 	error = xfs_inobt_lookup(cur, frec->ir_startino, XFS_LOOKUP_EQ, &i);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(cur->bc_mp, i != 1))
+	if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	error = xfs_inobt_get_rec(cur, &rec, &i);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(cur->bc_mp, i != 1))
+	if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 	ASSERT((XFS_AGINO_TO_OFFSET(cur->bc_mp, rec.ir_startino) %
 				   XFS_INODES_PER_CHUNK) == 0);
 
@@ -1563,8 +1591,10 @@ xfs_dialloc_ag_update_inobt(
 
 	if (XFS_IS_CORRUPT(cur->bc_mp,
 			   rec.ir_free != frec->ir_free ||
-			   rec.ir_freecount != frec->ir_freecount))
+			   rec.ir_freecount != frec->ir_freecount)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	return xfs_inobt_update(cur, &rec);
 }
@@ -1975,6 +2005,7 @@ xfs_difree_inobt(
 		goto error0;
 	}
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -1985,6 +2016,7 @@ xfs_difree_inobt(
 		goto error0;
 	}
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto error0;
 	}
@@ -2100,6 +2132,7 @@ xfs_difree_finobt(
 		 * something is out of sync.
 		 */
 		if (XFS_IS_CORRUPT(mp, ibtrec->ir_freecount != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto error;
 		}
@@ -2126,6 +2159,7 @@ xfs_difree_finobt(
 	if (error)
 		goto error;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto error;
 	}
@@ -2136,6 +2170,7 @@ xfs_difree_finobt(
 	if (XFS_IS_CORRUPT(mp,
 			   rec.ir_free != ibtrec->ir_free ||
 			   rec.ir_freecount != ibtrec->ir_freecount)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto error;
 	}
diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
index 25c87834e42a..8e90eb1aedcd 100644
--- a/fs/xfs/libxfs/xfs_refcount.c
+++ b/fs/xfs/libxfs/xfs_refcount.c
@@ -154,6 +154,7 @@ xfs_refcount_get_rec(
 	xfs_warn(mp,
 		"Start block 0x%x, block count 0x%x, references 0x%x",
 		irec->rc_startblock, irec->rc_blockcount, irec->rc_refcount);
+	xfs_btree_mark_sick(cur);
 	return -EFSCORRUPTED;
 }
 
@@ -202,6 +203,7 @@ xfs_refcount_insert(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, *i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -232,12 +234,14 @@ xfs_refcount_delete(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
 	trace_xfs_refcount_delete(cur->bc_mp, cur->bc_private.a.agno, &irec);
 	error = xfs_btree_delete(cur, i);
 	if (XFS_IS_CORRUPT(cur->bc_mp, *i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -360,6 +364,7 @@ xfs_refcount_split_extent(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -385,6 +390,7 @@ xfs_refcount_split_extent(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -427,6 +433,7 @@ xfs_refcount_merge_center_extents(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -435,6 +442,7 @@ xfs_refcount_merge_center_extents(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -444,6 +452,7 @@ xfs_refcount_merge_center_extents(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -455,6 +464,7 @@ xfs_refcount_merge_center_extents(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -497,6 +507,7 @@ xfs_refcount_merge_left_extent(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -505,6 +516,7 @@ xfs_refcount_merge_left_extent(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -516,6 +528,7 @@ xfs_refcount_merge_left_extent(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -561,6 +574,7 @@ xfs_refcount_merge_right_extent(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -569,6 +583,7 @@ xfs_refcount_merge_right_extent(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -580,6 +595,7 @@ xfs_refcount_merge_right_extent(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -629,6 +645,7 @@ xfs_refcount_find_left_extents(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -650,6 +667,7 @@ xfs_refcount_find_left_extents(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -718,6 +736,7 @@ xfs_refcount_find_right_extents(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -739,6 +758,7 @@ xfs_refcount_find_right_extents(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -966,6 +986,7 @@ xfs_refcount_adjust_extents(
 					goto out_error;
 				if (XFS_IS_CORRUPT(cur->bc_mp,
 						   found_tmp != 1)) {
+					xfs_btree_mark_sick(cur);
 					error = -EFSCORRUPTED;
 					goto out_error;
 				}
@@ -1010,6 +1031,7 @@ xfs_refcount_adjust_extents(
 			if (error)
 				goto out_error;
 			if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto out_error;
 			}
@@ -1331,6 +1353,7 @@ xfs_refcount_find_shared(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -1346,6 +1369,7 @@ xfs_refcount_find_shared(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -1377,6 +1401,7 @@ xfs_refcount_find_shared(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -1482,6 +1507,7 @@ xfs_refcount_adjust_cow_extents(
 		/* Adding a CoW reservation, there should be nothing here. */
 		if (XFS_IS_CORRUPT(cur->bc_mp,
 				   agbno + aglen > ext.rc_startblock)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -1497,6 +1523,7 @@ xfs_refcount_adjust_cow_extents(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, found_tmp != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -1504,14 +1531,17 @@ xfs_refcount_adjust_cow_extents(
 	case XFS_REFCOUNT_ADJUST_COW_FREE:
 		/* Removing a CoW reservation, there should be one extent. */
 		if (XFS_IS_CORRUPT(cur->bc_mp, ext.rc_startblock != agbno)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
 		if (XFS_IS_CORRUPT(cur->bc_mp, ext.rc_blockcount != aglen)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
 		if (XFS_IS_CORRUPT(cur->bc_mp, ext.rc_refcount != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -1523,6 +1553,7 @@ xfs_refcount_adjust_cow_extents(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(cur->bc_mp, found_rec != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -1672,8 +1703,10 @@ xfs_refcount_recover_extent(
 	struct xfs_refcount_recovery	*rr;
 
 	if (XFS_IS_CORRUPT(cur->bc_mp,
-			   be32_to_cpu(rec->refc.rc_refcount) != 1))
+			   be32_to_cpu(rec->refc.rc_refcount) != 1)) {
+		xfs_btree_mark_sick(cur);
 		return -EFSCORRUPTED;
+	}
 
 	rr = kmem_alloc(sizeof(struct xfs_refcount_recovery), 0);
 	xfs_refcount_btrec_to_irec(rec, &rr->rr_rrec);
diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
index a54a3c129cce..41d6b20e370c 100644
--- a/fs/xfs/libxfs/xfs_rmap.c
+++ b/fs/xfs/libxfs/xfs_rmap.c
@@ -115,6 +115,7 @@ xfs_rmap_insert(
 	if (error)
 		goto done;
 	if (XFS_IS_CORRUPT(rcur->bc_mp, i != 0)) {
+		xfs_btree_mark_sick(rcur);
 		error = -EFSCORRUPTED;
 		goto done;
 	}
@@ -128,6 +129,7 @@ xfs_rmap_insert(
 	if (error)
 		goto done;
 	if (XFS_IS_CORRUPT(rcur->bc_mp, i != 1)) {
+		xfs_btree_mark_sick(rcur);
 		error = -EFSCORRUPTED;
 		goto done;
 	}
@@ -157,6 +159,7 @@ xfs_rmap_delete(
 	if (error)
 		goto done;
 	if (XFS_IS_CORRUPT(rcur->bc_mp, i != 1)) {
+		xfs_btree_mark_sick(rcur);
 		error = -EFSCORRUPTED;
 		goto done;
 	}
@@ -165,6 +168,7 @@ xfs_rmap_delete(
 	if (error)
 		goto done;
 	if (XFS_IS_CORRUPT(rcur->bc_mp, i != 1)) {
+		xfs_btree_mark_sick(rcur);
 		error = -EFSCORRUPTED;
 		goto done;
 	}
@@ -178,14 +182,20 @@ xfs_rmap_delete(
 /* Convert an internal btree record to an rmap record. */
 int
 xfs_rmap_btrec_to_irec(
+	struct xfs_btree_cur	*cur,
 	union xfs_btree_rec	*rec,
 	struct xfs_rmap_irec	*irec)
 {
+	int			error;
+
 	irec->rm_startblock = be32_to_cpu(rec->rmap.rm_startblock);
 	irec->rm_blockcount = be32_to_cpu(rec->rmap.rm_blockcount);
 	irec->rm_owner = be64_to_cpu(rec->rmap.rm_owner);
-	return xfs_rmap_irec_offset_unpack(be64_to_cpu(rec->rmap.rm_offset),
+	error = xfs_rmap_irec_offset_unpack(be64_to_cpu(rec->rmap.rm_offset),
 			irec);
+	if (xfs_metadata_is_sick(error))
+		xfs_btree_mark_sick(cur);
+	return error;
 }
 
 /*
@@ -206,7 +216,7 @@ xfs_rmap_get_rec(
 	if (error || !*stat)
 		return error;
 
-	if (xfs_rmap_btrec_to_irec(rec, irec))
+	if (xfs_rmap_btrec_to_irec(cur, rec, irec))
 		goto out_bad_rec;
 
 	if (irec->rm_blockcount == 0)
@@ -242,6 +252,7 @@ xfs_rmap_get_rec(
 		"Owner 0x%llx, flags 0x%x, start block 0x%x block count 0x%x",
 		irec->rm_owner, irec->rm_flags, irec->rm_startblock,
 		irec->rm_blockcount);
+	xfs_btree_mark_sick(cur);
 	return -EFSCORRUPTED;
 }
 
@@ -405,7 +416,7 @@ xfs_rmap_lookup_le_range(
  */
 static int
 xfs_rmap_free_check_owner(
-	struct xfs_mount	*mp,
+	struct xfs_btree_cur	*cur,
 	uint64_t		ltoff,
 	struct xfs_rmap_irec	*rec,
 	xfs_filblks_t		len,
@@ -413,6 +424,7 @@ xfs_rmap_free_check_owner(
 	uint64_t		offset,
 	unsigned int		flags)
 {
+	struct xfs_mount	*mp = cur->bc_mp;
 	int			error = 0;
 
 	if (owner == XFS_RMAP_OWN_UNKNOWN)
@@ -422,12 +434,14 @@ xfs_rmap_free_check_owner(
 	if (XFS_IS_CORRUPT(mp,
 			   (flags & XFS_RMAP_UNWRITTEN) !=
 			   (rec->rm_flags & XFS_RMAP_UNWRITTEN))) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out;
 	}
 
 	/* Make sure the owner matches what we expect to find in the tree. */
 	if (XFS_IS_CORRUPT(mp, owner != rec->rm_owner)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out;
 	}
@@ -439,16 +453,19 @@ xfs_rmap_free_check_owner(
 	if (flags & XFS_RMAP_BMBT_BLOCK) {
 		if (XFS_IS_CORRUPT(mp,
 				   !(rec->rm_flags & XFS_RMAP_BMBT_BLOCK))) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out;
 		}
 	} else {
 		if (XFS_IS_CORRUPT(mp, rec->rm_offset > offset)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out;
 		}
 		if (XFS_IS_CORRUPT(mp,
 				   offset + len > ltoff + rec->rm_blockcount)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out;
 		}
@@ -511,6 +528,7 @@ xfs_rmap_unmap(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -519,6 +537,7 @@ xfs_rmap_unmap(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -539,6 +558,7 @@ xfs_rmap_unmap(
 		if (XFS_IS_CORRUPT(mp,
 				   bno <
 				   ltrec.rm_startblock + ltrec.rm_blockcount)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -565,6 +585,7 @@ xfs_rmap_unmap(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -577,12 +598,13 @@ xfs_rmap_unmap(
 			   ltrec.rm_startblock > bno ||
 			   ltrec.rm_startblock + ltrec.rm_blockcount <
 			   bno + len)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
 
 	/* Check owner information. */
-	error = xfs_rmap_free_check_owner(mp, ltoff, &ltrec, len, owner,
+	error = xfs_rmap_free_check_owner(cur, ltoff, &ltrec, len, owner,
 			offset, flags);
 	if (error)
 		goto out_error;
@@ -597,6 +619,7 @@ xfs_rmap_unmap(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -792,6 +815,7 @@ xfs_rmap_map(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(mp, have_lt != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -807,6 +831,7 @@ xfs_rmap_map(
 	if (XFS_IS_CORRUPT(mp,
 			   have_lt != 0 &&
 			   ltrec.rm_startblock + ltrec.rm_blockcount > bno)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -824,10 +849,12 @@ xfs_rmap_map(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(mp, have_gt != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
 		if (XFS_IS_CORRUPT(mp, bno + len > gtrec.rm_startblock)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -881,6 +908,7 @@ xfs_rmap_map(
 			if (error)
 				goto out_error;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
+				xfs_btree_mark_sick(cur);
 				error = -EFSCORRUPTED;
 				goto out_error;
 			}
@@ -928,6 +956,7 @@ xfs_rmap_map(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -1023,6 +1052,7 @@ xfs_rmap_convert(
 	if (error)
 		goto done;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto done;
 	}
@@ -1031,6 +1061,7 @@ xfs_rmap_convert(
 	if (error)
 		goto done;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto done;
 	}
@@ -1067,12 +1098,14 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
 		if (XFS_IS_CORRUPT(mp,
 				   LEFT.rm_startblock + LEFT.rm_blockcount >
 				   bno)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1095,6 +1128,7 @@ xfs_rmap_convert(
 	if (error)
 		goto done;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto done;
 	}
@@ -1107,10 +1141,12 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
 		if (XFS_IS_CORRUPT(mp, bno + len > RIGHT.rm_startblock)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1141,6 +1177,7 @@ xfs_rmap_convert(
 	if (error)
 		goto done;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto done;
 	}
@@ -1160,6 +1197,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1171,6 +1209,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1178,6 +1217,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1189,6 +1229,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1196,6 +1237,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1219,6 +1261,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1226,6 +1269,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1245,6 +1289,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1256,6 +1301,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1263,6 +1309,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1333,6 +1380,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1375,6 +1423,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 0)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1390,6 +1439,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1423,6 +1473,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1436,6 +1487,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 0)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1448,6 +1500,7 @@ xfs_rmap_convert(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1520,6 +1573,7 @@ xfs_rmap_convert_shared(
 	if (error)
 		goto done;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto done;
 	}
@@ -1548,6 +1602,7 @@ xfs_rmap_convert_shared(
 		if (XFS_IS_CORRUPT(mp,
 				   LEFT.rm_startblock + LEFT.rm_blockcount >
 				   bno)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1566,10 +1621,12 @@ xfs_rmap_convert_shared(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
 		if (XFS_IS_CORRUPT(mp, bno + len > RIGHT.rm_startblock)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1620,6 +1677,7 @@ xfs_rmap_convert_shared(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1646,6 +1704,7 @@ xfs_rmap_convert_shared(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1672,6 +1731,7 @@ xfs_rmap_convert_shared(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1695,6 +1755,7 @@ xfs_rmap_convert_shared(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1730,6 +1791,7 @@ xfs_rmap_convert_shared(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1775,6 +1837,7 @@ xfs_rmap_convert_shared(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1810,6 +1873,7 @@ xfs_rmap_convert_shared(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1848,6 +1912,7 @@ xfs_rmap_convert_shared(
 		if (error)
 			goto done;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -1937,6 +2002,7 @@ xfs_rmap_unmap_shared(
 	if (error)
 		goto out_error;
 	if (XFS_IS_CORRUPT(mp, i != 1)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -1947,12 +2013,14 @@ xfs_rmap_unmap_shared(
 			   ltrec.rm_startblock > bno ||
 			   ltrec.rm_startblock + ltrec.rm_blockcount <
 			   bno + len)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
 
 	/* Make sure the owner matches what we expect to find in the tree. */
 	if (XFS_IS_CORRUPT(mp, owner != ltrec.rm_owner)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -1961,16 +2029,19 @@ xfs_rmap_unmap_shared(
 	if (XFS_IS_CORRUPT(mp,
 			   (flags & XFS_RMAP_UNWRITTEN) !=
 			   (ltrec.rm_flags & XFS_RMAP_UNWRITTEN))) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
 
 	/* Check the offset. */
 	if (XFS_IS_CORRUPT(mp, ltrec.rm_offset > offset)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
 	if (XFS_IS_CORRUPT(mp, offset > ltoff + ltrec.rm_blockcount)) {
+		xfs_btree_mark_sick(cur);
 		error = -EFSCORRUPTED;
 		goto out_error;
 	}
@@ -2027,6 +2098,7 @@ xfs_rmap_unmap_shared(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -2056,6 +2128,7 @@ xfs_rmap_unmap_shared(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -2135,6 +2208,7 @@ xfs_rmap_map_shared(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(mp, have_gt != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -2187,6 +2261,7 @@ xfs_rmap_map_shared(
 		if (error)
 			goto out_error;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_error;
 		}
@@ -2285,7 +2360,7 @@ xfs_rmap_query_range_helper(
 	struct xfs_rmap_irec			irec;
 	int					error;
 
-	error = xfs_rmap_btrec_to_irec(rec, &irec);
+	error = xfs_rmap_btrec_to_irec(cur, rec, &irec);
 	if (error)
 		return error;
 	return query->fn(cur, &irec, query->priv);
diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h
index abe633403fd1..e756989d0da5 100644
--- a/fs/xfs/libxfs/xfs_rmap.h
+++ b/fs/xfs/libxfs/xfs_rmap.h
@@ -190,7 +190,7 @@ int xfs_rmap_lookup_le_range(struct xfs_btree_cur *cur, xfs_agblock_t bno,
 int xfs_rmap_compare(const struct xfs_rmap_irec *a,
 		const struct xfs_rmap_irec *b);
 union xfs_btree_rec;
-int xfs_rmap_btrec_to_irec(union xfs_btree_rec *rec,
+int xfs_rmap_btrec_to_irec(struct xfs_btree_cur *cur, union xfs_btree_rec *rec,
 		struct xfs_rmap_irec *irec);
 int xfs_rmap_has_record(struct xfs_btree_cur *cur, xfs_agblock_t bno,
 		xfs_extlen_t len, bool *exists);
diff --git a/fs/xfs/scrub/rmap.c b/fs/xfs/scrub/rmap.c
index 8d4cefd761c1..eb92ccb67a98 100644
--- a/fs/xfs/scrub/rmap.c
+++ b/fs/xfs/scrub/rmap.c
@@ -99,7 +99,7 @@ xchk_rmapbt_rec(
 	bool			is_attr;
 	int			error;
 
-	error = xfs_rmap_btrec_to_irec(rec, &irec);
+	error = xfs_rmap_btrec_to_irec(bs->cur, rec, &irec);
 	if (!xchk_btree_process_error(bs->sc, bs->cur, 0, &error))
 		goto out;
 
diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index cae613620175..c6a43b4bd9c2 100644
--- a/fs/xfs/xfs_discard.c
+++ b/fs/xfs/xfs_discard.c
@@ -18,6 +18,7 @@
 #include "xfs_extent_busy.h"
 #include "xfs_trace.h"
 #include "xfs_log.h"
+#include "xfs_health.h"
 
 STATIC int
 xfs_trim_extents(
@@ -72,6 +73,7 @@ xfs_trim_extents(
 		if (error)
 			goto out_del_cursor;
 		if (XFS_IS_CORRUPT(mp, i != 1)) {
+			xfs_btree_mark_sick(cur);
 			error = -EFSCORRUPTED;
 			goto out_del_cursor;
 		}
diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index 5e5de5338476..1f09027c55ad 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -14,6 +14,7 @@
 #include "xfs_inode.h"
 #include "xfs_trace.h"
 #include "xfs_health.h"
+#include "xfs_btree.h"
 
 /*
  * Warn about metadata corruption that we detected but haven't fixed, and
@@ -478,3 +479,41 @@ xfs_bmap_mark_sick(
 
 	xfs_inode_mark_sick(ip, mask);
 }
+
+/* Record observations of btree corruption with the health tracking system. */
+void
+xfs_btree_mark_sick(
+	struct xfs_btree_cur		*cur)
+{
+	unsigned int			mask;
+
+	switch (cur->bc_btnum) {
+	case XFS_BTNUM_BMAP:
+		xfs_bmap_mark_sick(cur->bc_private.b.ip,
+				   cur->bc_private.b.whichfork);
+		return;
+	case XFS_BTNUM_BNO:
+		mask = XFS_SICK_AG_BNOBT;
+		break;
+	case XFS_BTNUM_CNT:
+		mask = XFS_SICK_AG_CNTBT;
+		break;
+	case XFS_BTNUM_INO:
+		mask = XFS_SICK_AG_INOBT;
+		break;
+	case XFS_BTNUM_FINO:
+		mask = XFS_SICK_AG_FINOBT;
+		break;
+	case XFS_BTNUM_RMAP:
+		mask = XFS_SICK_AG_RMAPBT;
+		break;
+	case XFS_BTNUM_REFC:
+		mask = XFS_SICK_AG_REFCNTBT;
+		break;
+	default:
+		ASSERT(0);
+		return;
+	}
+
+	xfs_agno_mark_sick(cur->bc_mp, cur->bc_private.a.agno, mask);
+}
diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c
index 233dcc8784db..5981beb179ca 100644
--- a/fs/xfs/xfs_iwalk.c
+++ b/fs/xfs/xfs_iwalk.c
@@ -298,8 +298,10 @@ xfs_iwalk_ag_start(
 	error = xfs_inobt_get_rec(*curpp, irec, has_more);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(mp, *has_more != 1))
+	if (XFS_IS_CORRUPT(mp, *has_more != 1)) {
+		xfs_btree_mark_sick(*curpp);
 		return -EFSCORRUPTED;
+	}
 
 	/*
 	 * If the LE lookup yielded an inobt record before the cursor position,



* [PATCH 5/9] xfs: report dir/attr block corruption errors to the health system
  2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
                   ` (3 preceding siblings ...)
  2019-11-14 18:19 ` [PATCH 4/9] xfs: report btree block corruption errors to the health system Darrick J. Wong
@ 2019-11-14 18:19 ` Darrick J. Wong
  2019-11-20 16:11   ` Brian Foster
  2019-11-14 18:19 ` [PATCH 6/9] xfs: report symlink " Darrick J. Wong
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:19 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Whenever we encounter a corrupt directory or extended attribute block, we
should record that in the health tracking system so that it can be
reported later.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
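For readers skimming the hunks below, here is a minimal userspace sketch of
the pattern this patch repeats: detect the corruption, set a health bit on
the owning inode, then return the corruption error.  Every model_* and
MODEL_* name is a hypothetical stand-in invented for this note, not a kernel
symbol.  The fork-to-bit mapping is an assumption, since the real mask names
live in the xfs_health.c hunk, and the error value is only illustrative.
The count check mirrors the first xfs_attr3_leaf_lookup_int hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for this note; none of these are kernel symbols. */
#define MODEL_EFSCORRUPTED	117	/* illustrative error value only */

enum model_fork { MODEL_DATA_FORK, MODEL_ATTR_FORK };

#define MODEL_SICK_INO_DIR	(1u << 0)	/* assumed: directory blocks */
#define MODEL_SICK_INO_XATTR	(1u << 1)	/* assumed: xattr blocks */

struct model_inode {
	uint64_t	ino;
	unsigned int	sick;	/* bitmask of metadata observed to be corrupt */
};

/* Pick a health bit based on which fork the corrupt block belongs to. */
static void
model_dirattr_mark_sick(
	struct model_inode	*ip,
	enum model_fork		whichfork)
{
	switch (whichfork) {
	case MODEL_DATA_FORK:
		ip->sick |= MODEL_SICK_INO_DIR;
		break;
	case MODEL_ATTR_FORK:
		ip->sick |= MODEL_SICK_INO_XATTR;
		break;
	default:
		assert(0);
		break;
	}
}

/*
 * Reject an impossible entry count, as the attr leaf lookup hunk does
 * (count >= blksize / 8), and record the observation before returning.
 */
static int
model_leaf_check_count(
	struct model_inode	*ip,
	enum model_fork		whichfork,
	unsigned int		count,
	unsigned int		blksize)
{
	if (count >= blksize / 8) {
		model_dirattr_mark_sick(ip, whichfork);
		return -MODEL_EFSCORRUPTED;
	}
	return 0;
}
```

The point of the sketch is that the sick bit is sticky: it is set at the
point of detection and is not cleared when the error is returned, so a later
health query can still see it even if the caller discards the error.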
 fs/xfs/libxfs/xfs_attr_leaf.c   |    5 ++++-
 fs/xfs/libxfs/xfs_attr_remote.c |   27 ++++++++++++++++-----------
 fs/xfs/libxfs/xfs_da_btree.c    |   29 ++++++++++++++++++++++++++---
 fs/xfs/libxfs/xfs_dir2.c        |    5 ++++-
 fs/xfs/libxfs/xfs_dir2_data.c   |    2 ++
 fs/xfs/libxfs/xfs_dir2_leaf.c   |    3 +++
 fs/xfs/libxfs/xfs_dir2_node.c   |    7 +++++++
 fs/xfs/libxfs/xfs_health.h      |    3 +++
 fs/xfs/xfs_attr_inactive.c      |    4 ++++
 fs/xfs/xfs_attr_list.c          |   16 +++++++++++++---
 fs/xfs/xfs_dir2_readdir.c       |    6 +++++-
 fs/xfs/xfs_health.c             |   39 +++++++++++++++++++++++++++++++++++++++
 12 files changed, 126 insertions(+), 20 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index 85ec5945d29f..e347b5dabded 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -27,7 +27,7 @@
 #include "xfs_buf_item.h"
 #include "xfs_dir2.h"
 #include "xfs_log.h"
-
+#include "xfs_health.h"
 
 /*
  * xfs_attr_leaf.c
@@ -2346,6 +2346,7 @@ xfs_attr3_leaf_lookup_int(
 	entries = xfs_attr3_leaf_entryp(leaf);
 	if (ichdr.count >= args->geo->blksize / 8) {
 		xfs_buf_corruption_error(bp);
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
 	}
 
@@ -2365,10 +2366,12 @@ xfs_attr3_leaf_lookup_int(
 	}
 	if (!(probe >= 0 && (!ichdr.count || probe < ichdr.count))) {
 		xfs_buf_corruption_error(bp);
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
 	}
 	if (!(span <= 4 || be32_to_cpu(entry->hashval) == hashval)) {
 		xfs_buf_corruption_error(bp);
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
 	}
 
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index a6ef5df42669..ef34d90501a7 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -22,6 +22,7 @@
 #include "xfs_attr_remote.h"
 #include "xfs_trace.h"
 #include "xfs_error.h"
+#include "xfs_health.h"
 
 #define ATTR_RMTVALUE_MAPSIZE	1	/* # of map entries at once */
 
@@ -261,17 +262,18 @@ xfs_attr3_rmt_hdr_set(
  */
 STATIC int
 xfs_attr_rmtval_copyout(
-	struct xfs_mount *mp,
-	struct xfs_buf	*bp,
-	xfs_ino_t	ino,
-	int		*offset,
-	int		*valuelen,
-	uint8_t		**dst)
+	struct xfs_mount	*mp,
+	struct xfs_buf		*bp,
+	struct xfs_inode	*dp,
+	int			*offset,
+	int			*valuelen,
+	uint8_t			**dst)
 {
-	char		*src = bp->b_addr;
-	xfs_daddr_t	bno = bp->b_bn;
-	int		len = BBTOB(bp->b_length);
-	int		blksize = mp->m_attr_geo->blksize;
+	char			*src = bp->b_addr;
+	xfs_ino_t		ino = dp->i_ino;
+	xfs_daddr_t		bno = bp->b_bn;
+	int			len = BBTOB(bp->b_length);
+	int			blksize = mp->m_attr_geo->blksize;
 
 	ASSERT(len >= blksize);
 
@@ -287,6 +289,7 @@ xfs_attr_rmtval_copyout(
 				xfs_alert(mp,
 "remote attribute header mismatch bno/off/len/owner (0x%llx/0x%x/Ox%x/0x%llx)",
 					bno, *offset, byte_cnt, ino);
+				xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
 				return -EFSCORRUPTED;
 			}
 			hdr_size = sizeof(struct xfs_attr3_rmt_hdr);
@@ -405,10 +408,12 @@ xfs_attr_rmtval_get(
 						   mp->m_ddev_targp,
 						   dblkno, dblkcnt, 0, &bp,
 						   &xfs_attr3_rmt_buf_ops);
+			if (xfs_metadata_is_sick(error))
+				xfs_da_mark_sick(args);
 			if (error)
 				return error;
 
-			error = xfs_attr_rmtval_copyout(mp, bp, args->dp->i_ino,
+			error = xfs_attr_rmtval_copyout(mp, bp, args->dp,
 							&offset, &valuelen,
 							&dst);
 			xfs_trans_brelse(args->trans, bp);
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index e424b004e3cb..a17622dadf00 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -22,6 +22,7 @@
 #include "xfs_trace.h"
 #include "xfs_buf_item.h"
 #include "xfs_log.h"
+#include "xfs_health.h"
 
 /*
  * xfs_da_btree.c
@@ -359,6 +360,7 @@ xfs_da3_node_read(
 					tp->t_mountp, info, sizeof(*info));
 			xfs_trans_brelse(tp, *bpp);
 			*bpp = NULL;
+			xfs_dirattr_mark_sick(dp, which_fork);
 			return -EFSCORRUPTED;
 		}
 		xfs_trans_buf_set_type(tp, *bpp, type);
@@ -554,6 +556,7 @@ xfs_da3_split(
 	if (node->hdr.info.forw) {
 		if (be32_to_cpu(node->hdr.info.forw) != addblk->blkno) {
 			xfs_buf_corruption_error(oldblk->bp);
+			xfs_da_mark_sick(state->args);
 			error = -EFSCORRUPTED;
 			goto out;
 		}
@@ -567,6 +570,7 @@ xfs_da3_split(
 	if (node->hdr.info.back) {
 		if (be32_to_cpu(node->hdr.info.back) != addblk->blkno) {
 			xfs_buf_corruption_error(oldblk->bp);
+			xfs_da_mark_sick(state->args);
 			error = -EFSCORRUPTED;
 			goto out;
 		}
@@ -1589,6 +1593,7 @@ xfs_da3_node_lookup_int(
 
 		if (magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC) {
 			xfs_buf_corruption_error(blk->bp);
+			xfs_da_mark_sick(args);
 			return -EFSCORRUPTED;
 		}
 
@@ -1604,6 +1609,7 @@ xfs_da3_node_lookup_int(
 		/* Tree taller than we can handle; bail out! */
 		if (nodehdr.level >= XFS_DA_NODE_MAXDEPTH) {
 			xfs_buf_corruption_error(blk->bp);
+			xfs_da_mark_sick(args);
 			return -EFSCORRUPTED;
 		}
 
@@ -1612,6 +1618,7 @@ xfs_da3_node_lookup_int(
 			expected_level = nodehdr.level - 1;
 		else if (expected_level != nodehdr.level) {
 			xfs_buf_corruption_error(blk->bp);
+			xfs_da_mark_sick(args);
 			return -EFSCORRUPTED;
 		} else
 			expected_level--;
@@ -1663,12 +1670,16 @@ xfs_da3_node_lookup_int(
 		}
 
 		/* We can't point back to the root. */
-		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk))
+		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk)) {
+			xfs_da_mark_sick(args);
 			return -EFSCORRUPTED;
+		}
 	}
 
-	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0))
+	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0)) {
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
+	}
 
 	/*
 	 * A leaf block that ends in the hashval that we are interested in
@@ -1686,6 +1697,7 @@ xfs_da3_node_lookup_int(
 			args->blkno = blk->blkno;
 		} else {
 			ASSERT(0);
+			xfs_da_mark_sick(args);
 			return -EFSCORRUPTED;
 		}
 		if (((retval == -ENOENT) || (retval == -ENOATTR)) &&
@@ -2250,8 +2262,10 @@ xfs_da3_swap_lastblock(
 	error = xfs_bmap_last_before(tp, dp, &lastoff, w);
 	if (error)
 		return error;
-	if (XFS_IS_CORRUPT(mp, lastoff == 0))
+	if (XFS_IS_CORRUPT(mp, lastoff == 0)) {
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
+	}
 	/*
 	 * Read the last block in the btree space.
 	 */
@@ -2300,6 +2314,7 @@ xfs_da3_swap_lastblock(
 		if (XFS_IS_CORRUPT(mp,
 				   be32_to_cpu(sib_info->forw) != last_blkno ||
 				   sib_info->magic != dead_info->magic)) {
+			xfs_da_mark_sick(args);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -2320,6 +2335,7 @@ xfs_da3_swap_lastblock(
 		if (XFS_IS_CORRUPT(mp,
 				   be32_to_cpu(sib_info->back) != last_blkno ||
 				   sib_info->magic != dead_info->magic)) {
+			xfs_da_mark_sick(args);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -2342,6 +2358,7 @@ xfs_da3_swap_lastblock(
 		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
 		if (XFS_IS_CORRUPT(mp,
 				   level >= 0 && level != par_hdr.level + 1)) {
+			xfs_da_mark_sick(args);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -2353,6 +2370,7 @@ xfs_da3_swap_lastblock(
 		     entno++)
 			continue;
 		if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) {
+			xfs_da_mark_sick(args);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -2378,6 +2396,7 @@ xfs_da3_swap_lastblock(
 		xfs_trans_brelse(tp, par_buf);
 		par_buf = NULL;
 		if (XFS_IS_CORRUPT(mp, par_blkno == 0)) {
+			xfs_da_mark_sick(args);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -2387,6 +2406,7 @@ xfs_da3_swap_lastblock(
 		par_node = par_buf->b_addr;
 		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
 		if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) {
+			xfs_da_mark_sick(args);
 			error = -EFSCORRUPTED;
 			goto done;
 		}
@@ -2601,6 +2621,7 @@ xfs_dabuf_map(
 					irecs[i].br_state);
 			}
 		}
+		xfs_dirattr_mark_sick(dp, whichfork);
 		error = -EFSCORRUPTED;
 		goto out;
 	}
@@ -2693,6 +2714,8 @@ xfs_da_read_buf(
 	error = xfs_trans_read_buf_map(dp->i_mount, trans,
 					dp->i_mount->m_ddev_targp,
 					mapp, nmap, 0, &bp, ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_dirattr_mark_sick(dp, whichfork);
 	if (error)
 		goto out_free;
 
diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index 0aa87cbde49e..e1aa411a1b8b 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -18,6 +18,7 @@
 #include "xfs_errortag.h"
 #include "xfs_error.h"
 #include "xfs_trace.h"
+#include "xfs_health.h"
 
 struct xfs_name xfs_name_dotdot = { (unsigned char *)"..", 2, XFS_DIR3_FT_DIR };
 
@@ -608,8 +609,10 @@ xfs_dir2_isblock(
 	rval = XFS_FSB_TO_B(args->dp->i_mount, last) == args->geo->blksize;
 	if (XFS_IS_CORRUPT(args->dp->i_mount,
 			   rval != 0 &&
-			   args->dp->i_d.di_size != args->geo->blksize))
+			   args->dp->i_d.di_size != args->geo->blksize)) {
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
+	}
 	*vp = rval;
 	return 0;
 }
diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c
index a6eb71a62b53..80cc9c7ea4e5 100644
--- a/fs/xfs/libxfs/xfs_dir2_data.c
+++ b/fs/xfs/libxfs/xfs_dir2_data.c
@@ -18,6 +18,7 @@
 #include "xfs_trans.h"
 #include "xfs_buf_item.h"
 #include "xfs_log.h"
+#include "xfs_health.h"
 
 static xfs_failaddr_t xfs_dir2_data_freefind_verify(
 		struct xfs_dir2_data_hdr *hdr, struct xfs_dir2_data_free *bf,
@@ -1170,6 +1171,7 @@ xfs_dir2_data_use_free(
 corrupt:
 	xfs_corruption_error(__func__, XFS_ERRLEVEL_LOW, args->dp->i_mount,
 			hdr, sizeof(*hdr), __FILE__, __LINE__, fa);
+	xfs_da_mark_sick(args);
 	return -EFSCORRUPTED;
 }
 
diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
index 73edd96ce0ac..32d17420fff3 100644
--- a/fs/xfs/libxfs/xfs_dir2_leaf.c
+++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
@@ -19,6 +19,7 @@
 #include "xfs_trace.h"
 #include "xfs_trans.h"
 #include "xfs_buf_item.h"
+#include "xfs_health.h"
 
 /*
  * Local function declarations.
@@ -1386,8 +1387,10 @@ xfs_dir2_leaf_removename(
 	bestsp = xfs_dir2_leaf_bests_p(ltp);
 	if (be16_to_cpu(bestsp[db]) != oldbest) {
 		xfs_buf_corruption_error(lbp);
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
 	}
+
 	/*
 	 * Mark the former data entry unused.
 	 */
diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
index 3a8b0625a08b..e0f3ab254a1a 100644
--- a/fs/xfs/libxfs/xfs_dir2_node.c
+++ b/fs/xfs/libxfs/xfs_dir2_node.c
@@ -20,6 +20,7 @@
 #include "xfs_trans.h"
 #include "xfs_buf_item.h"
 #include "xfs_log.h"
+#include "xfs_health.h"
 
 /*
  * Function declarations.
@@ -228,6 +229,7 @@ __xfs_dir3_free_read(
 	if (fa) {
 		xfs_verifier_error(*bpp, -EFSCORRUPTED, fa);
 		xfs_trans_brelse(tp, *bpp);
+		xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
 		return -EFSCORRUPTED;
 	}
 
@@ -440,6 +442,7 @@ xfs_dir2_leaf_to_node(
 	if (be32_to_cpu(ltp->bestcount) >
 				(uint)dp->i_d.di_size / args->geo->blksize) {
 		xfs_buf_corruption_error(lbp);
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
 	}
 
@@ -514,6 +517,7 @@ xfs_dir2_leafn_add(
 	 */
 	if (index < 0) {
 		xfs_buf_corruption_error(bp);
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
 	}
 
@@ -733,6 +737,7 @@ xfs_dir2_leafn_lookup_for_addname(
 					   cpu_to_be16(NULLDATAOFF))) {
 				if (curfdb != newfdb)
 					xfs_trans_brelse(tp, curbp);
+				xfs_da_mark_sick(args);
 				return -EFSCORRUPTED;
 			}
 			curfdb = newfdb;
@@ -801,6 +806,7 @@ xfs_dir2_leafn_lookup_for_entry(
 	xfs_dir3_leaf_check(dp, bp);
 	if (leafhdr.count <= 0) {
 		xfs_buf_corruption_error(bp);
+		xfs_da_mark_sick(args);
 		return -EFSCORRUPTED;
 	}
 
@@ -1737,6 +1743,7 @@ xfs_dir2_node_add_datablk(
 			} else {
 				xfs_alert(mp, " ... fblk is NULL");
 			}
+			xfs_da_mark_sick(args);
 			return -EFSCORRUPTED;
 		}
 
diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
index 2049419e9555..d9404cd3d09b 100644
--- a/fs/xfs/libxfs/xfs_health.h
+++ b/fs/xfs/libxfs/xfs_health.h
@@ -38,6 +38,7 @@ struct xfs_perag;
 struct xfs_inode;
 struct xfs_fsop_geom;
 struct xfs_btree_cur;
+struct xfs_da_args;
 
 /* Observable health issues for metadata spanning the entire filesystem. */
 #define XFS_SICK_FS_COUNTERS	(1 << 0)  /* summary counters */
@@ -141,6 +142,8 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
 void xfs_health_unmount(struct xfs_mount *mp);
 void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
 void xfs_btree_mark_sick(struct xfs_btree_cur *cur);
+void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork);
+void xfs_da_mark_sick(struct xfs_da_args *args);
 
 /* Now some helpers. */
 
diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
index a78c501f6fb1..429a97494ffa 100644
--- a/fs/xfs/xfs_attr_inactive.c
+++ b/fs/xfs/xfs_attr_inactive.c
@@ -23,6 +23,7 @@
 #include "xfs_quota.h"
 #include "xfs_dir2.h"
 #include "xfs_error.h"
+#include "xfs_health.h"
 
 /*
  * Look at all the extents for this logical region,
@@ -209,6 +210,7 @@ xfs_attr3_node_inactive(
 	if (level > XFS_DA_NODE_MAXDEPTH) {
 		xfs_trans_brelse(*trans, bp);	/* no locks for later trans */
 		xfs_buf_corruption_error(bp);
+		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
 		return -EFSCORRUPTED;
 	}
 
@@ -256,6 +258,7 @@ xfs_attr3_node_inactive(
 			error = xfs_attr3_leaf_inactive(trans, dp, child_bp);
 			break;
 		default:
+			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
 			xfs_buf_corruption_error(child_bp);
 			xfs_trans_brelse(*trans, child_bp);
 			error = -EFSCORRUPTED;
@@ -342,6 +345,7 @@ xfs_attr3_root_inactive(
 		error = xfs_attr3_leaf_inactive(trans, dp, bp);
 		break;
 	default:
+		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
 		error = -EFSCORRUPTED;
 		xfs_buf_corruption_error(bp);
 		xfs_trans_brelse(*trans, bp);
diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
index 7a099df88a0c..1a2a3d4ce422 100644
--- a/fs/xfs/xfs_attr_list.c
+++ b/fs/xfs/xfs_attr_list.c
@@ -21,6 +21,7 @@
 #include "xfs_error.h"
 #include "xfs_trace.h"
 #include "xfs_dir2.h"
+#include "xfs_health.h"
 
 STATIC int
 xfs_attr_shortform_compare(const void *a, const void *b)
@@ -88,8 +89,10 @@ xfs_attr_shortform_list(
 		for (i = 0, sfe = &sf->list[0]; i < sf->hdr.count; i++) {
 			if (XFS_IS_CORRUPT(context->dp->i_mount,
 					   !xfs_attr_namecheck(sfe->nameval,
-							       sfe->namelen)))
+							       sfe->namelen))) {
+				xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
 				return -EFSCORRUPTED;
+			}
 			context->put_listent(context,
 					     sfe->flags,
 					     sfe->nameval,
@@ -131,6 +134,7 @@ xfs_attr_shortform_list(
 					     context->dp->i_mount, sfe,
 					     sizeof(*sfe));
 			kmem_free(sbuf);
+			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
 			return -EFSCORRUPTED;
 		}
 
@@ -181,6 +185,7 @@ xfs_attr_shortform_list(
 		if (XFS_IS_CORRUPT(context->dp->i_mount,
 				   !xfs_attr_namecheck(sbp->name,
 						       sbp->namelen))) {
+			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
 			error = -EFSCORRUPTED;
 			goto out;
 		}
@@ -268,8 +273,10 @@ xfs_attr_node_list_lookup(
 			return 0;
 
 		/* We can't point back to the root. */
-		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0))
+		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0)) {
+			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
 			return -EFSCORRUPTED;
+		}
 	}
 
 	if (expected_level != 0)
@@ -281,6 +288,7 @@ xfs_attr_node_list_lookup(
 out_corruptbuf:
 	xfs_buf_corruption_error(bp);
 	xfs_trans_brelse(tp, bp);
+	xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
 	return -EFSCORRUPTED;
 }
 
@@ -471,8 +479,10 @@ xfs_attr3_leaf_list_int(
 		}
 
 		if (XFS_IS_CORRUPT(context->dp->i_mount,
-				   !xfs_attr_namecheck(name, namelen)))
+				   !xfs_attr_namecheck(name, namelen))) {
+			xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
 			return -EFSCORRUPTED;
+		}
 		context->put_listent(context, entry->flags,
 					      name, namelen, valuelen);
 		if (context->seen_enough)
diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
index 95bc9ef8f5f9..715ded503334 100644
--- a/fs/xfs/xfs_dir2_readdir.c
+++ b/fs/xfs/xfs_dir2_readdir.c
@@ -18,6 +18,7 @@
 #include "xfs_bmap.h"
 #include "xfs_trans.h"
 #include "xfs_error.h"
+#include "xfs_health.h"
 
 /*
  * Directory file type support functions
@@ -119,8 +120,10 @@ xfs_dir2_sf_getdents(
 		ctx->pos = off & 0x7fffffff;
 		if (XFS_IS_CORRUPT(dp->i_mount,
 				   !xfs_dir2_namecheck(sfep->name,
-						       sfep->namelen)))
+						       sfep->namelen))) {
+			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
 			return -EFSCORRUPTED;
+		}
 		if (!dir_emit(ctx, (char *)sfep->name, sfep->namelen, ino,
 			    xfs_dir3_get_dtype(mp, filetype)))
 			return 0;
@@ -461,6 +464,7 @@ xfs_dir2_leaf_getdents(
 		if (XFS_IS_CORRUPT(dp->i_mount,
 				   !xfs_dir2_namecheck(dep->name,
 						       dep->namelen))) {
+			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
 			error = -EFSCORRUPTED;
 			break;
 		}
diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index 1f09027c55ad..c1b6e8fb72ec 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -15,6 +15,8 @@
 #include "xfs_trace.h"
 #include "xfs_health.h"
 #include "xfs_btree.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 
 /*
  * Warn about metadata corruption that we detected but haven't fixed, and
@@ -517,3 +519,40 @@ xfs_btree_mark_sick(
 
 	xfs_agno_mark_sick(cur->bc_mp, cur->bc_private.a.agno, mask);
 }
+
+/*
+ * Record observations of dir/attr btree corruption with the health tracking
+ * system.
+ */
+void
+xfs_dirattr_mark_sick(
+	struct xfs_inode	*ip,
+	int			whichfork)
+{
+	unsigned int		mask;
+
+	switch (whichfork) {
+	case XFS_DATA_FORK:
+		mask = XFS_SICK_INO_DIR;
+		break;
+	case XFS_ATTR_FORK:
+		mask = XFS_SICK_INO_XATTR;
+		break;
+	default:
+		ASSERT(0);
+		return;
+	}
+
+	xfs_inode_mark_sick(ip, mask);
+}
+
+/*
+ * Record observations of dir/attr btree corruption with the health tracking
+ * system.
+ */
+void
+xfs_da_mark_sick(
+	struct xfs_da_args	*args)
+{
+	xfs_dirattr_mark_sick(args->dp, args->whichfork);
+}


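The two helpers added to xfs_health.c in the patch above follow a small
dispatch pattern: translate the fork selector into a health bit, then let
callers that only hold the da args structure reach it through a thin
wrapper.  A minimal userspace sketch of that shape follows.  Every demo_*
name, the struct layouts, and the bit values are stand-ins invented for
illustration; none of them are the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in values; the real constants live in the kernel headers. */
enum { DEMO_DATA_FORK = 0, DEMO_ATTR_FORK = 1 };
#define DEMO_SICK_INO_DIR	(1u << 0)
#define DEMO_SICK_INO_XATTR	(1u << 1)

/* Hypothetical, pared-down inode: only the accumulated sick mask. */
struct demo_inode {
	unsigned int		sick;
};

/* Hypothetical, pared-down da args: owning inode plus fork selector. */
struct demo_da_args {
	struct demo_inode	*dp;
	int			whichfork;
};

/* Translate a fork selector into the health bit to set, then set it. */
static void
demo_dirattr_mark_sick(struct demo_inode *ip, int whichfork)
{
	assert(ip != NULL);

	switch (whichfork) {
	case DEMO_DATA_FORK:
		ip->sick |= DEMO_SICK_INO_DIR;
		break;
	case DEMO_ATTR_FORK:
		ip->sick |= DEMO_SICK_INO_XATTR;
		break;
	default:
		/* Unknown fork: record nothing (the patch asserts here). */
		break;
	}
}

/* Convenience wrapper for callers that only hold the args structure. */
static void
demo_da_mark_sick(struct demo_da_args *args)
{
	demo_dirattr_mark_sick(args->dp, args->whichfork);
}
```

In this sketch the wrapper keeps call sites to a single line, which matters
because the patch adds the call next to dozens of -EFSCORRUPTED returns.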

* [PATCH 6/9] xfs: report symlink block corruption errors to the health system
  2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
                   ` (4 preceding siblings ...)
  2019-11-14 18:19 ` [PATCH 5/9] xfs: report dir/attr " Darrick J. Wong
@ 2019-11-14 18:19 ` Darrick J. Wong
  2019-11-14 18:19 ` [PATCH 7/9] xfs: report inode " Darrick J. Wong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:19 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Whenever we encounter corrupt symbolic link blocks, we should report
that to the health monitoring system for later reporting.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_iops.c    |    5 ++++-
 fs/xfs/xfs_symlink.c |    6 ++++++
 2 files changed, 10 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 8afe69ca188b..f698351cee5d 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -21,6 +21,7 @@
 #include "xfs_dir2.h"
 #include "xfs_iomap.h"
 #include "xfs_error.h"
+#include "xfs_health.h"
 
 #include <linux/xattr.h>
 #include <linux/posix_acl.h>
@@ -481,8 +482,10 @@ xfs_vn_get_link_inline(
 	 * if_data is junk.
 	 */
 	link = ip->i_df.if_u1.if_data;
-	if (XFS_IS_CORRUPT(ip->i_mount, !link))
+	if (XFS_IS_CORRUPT(ip->i_mount, !link)) {
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_SYMLINK);
 		return ERR_PTR(-EFSCORRUPTED);
+	}
 	return link;
 }
 
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index a25502bc2071..f5926985d24f 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -21,6 +21,7 @@
 #include "xfs_trans_space.h"
 #include "xfs_trace.h"
 #include "xfs_trans.h"
+#include "xfs_health.h"
 
 /* ----- Kernel only functions below ----- */
 int
@@ -63,6 +64,8 @@ xfs_readlink_bmap_ilocked(
 			xfs_buf_relse(bp);
 
 			/* bad CRC means corrupted metadata */
+			if (xfs_metadata_is_sick(error))
+				xfs_inode_mark_sick(ip, XFS_SICK_INO_SYMLINK);
 			if (error == -EFSBADCRC)
 				error = -EFSCORRUPTED;
 			goto out;
@@ -75,6 +78,7 @@ xfs_readlink_bmap_ilocked(
 		if (xfs_sb_version_hascrc(&mp->m_sb)) {
 			if (!xfs_symlink_hdr_ok(ip->i_ino, offset,
 							byte_cnt, bp)) {
+				xfs_inode_mark_sick(ip, XFS_SICK_INO_SYMLINK);
 				error = -EFSCORRUPTED;
 				xfs_alert(mp,
 "symlink header does not match required off/len/owner (0x%x/Ox%x,0x%llx)",
@@ -130,6 +134,7 @@ xfs_readlink(
 			 __func__, (unsigned long long) ip->i_ino,
 			 (long long) pathlen);
 		ASSERT(0);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_SYMLINK);
 		error = -EFSCORRUPTED;
 		goto out;
 	}
@@ -502,6 +507,7 @@ xfs_inactive_symlink(
 			 __func__, (unsigned long long)ip->i_ino, pathlen);
 		xfs_iunlock(ip, XFS_ILOCK_EXCL);
 		ASSERT(0);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_SYMLINK);
 		return -EFSCORRUPTED;
 	}
 


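The xfs_readlink_bmap_ilocked hunk above depends on ordering: the sickness
is recorded while the error code still distinguishes a bad CRC, and only
then is the bad CRC folded into "corrupted".  A minimal userspace sketch of
that ordering follows.  It assumes that xfs_metadata_is_sick() is true for
exactly the checksum and corruption errors; that helper's body is not shown
in this part of the thread, so treat the predicate as a guess.  All demo_*
names and numeric values are stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in error codes; the kernel maps these onto errno values. */
#define DEMO_EIO		5
#define DEMO_EFSBADCRC		74
#define DEMO_EFSCORRUPTED	117

#define DEMO_SICK_INO_SYMLINK	(1u << 0)

/* Hypothetical, pared-down inode: only the accumulated sick mask. */
struct demo_inode {
	unsigned int	sick;
};

/* Assumption: only checksum and corruption errors mean sick metadata. */
static bool
demo_metadata_is_sick(int error)
{
	return error == -DEMO_EFSBADCRC || error == -DEMO_EFSCORRUPTED;
}

/*
 * Post-process a buffer read result the way the readlink hunk does:
 * note the sickness first, then fold a bad CRC into "corrupted".
 */
static int
demo_handle_read_error(struct demo_inode *ip, int error)
{
	assert(ip != NULL);

	if (demo_metadata_is_sick(error))
		ip->sick |= DEMO_SICK_INO_SYMLINK;
	if (error == -DEMO_EFSBADCRC)
		error = -DEMO_EFSCORRUPTED;
	return error;
}
```

Under that assumption, a plain I/O error passes through without touching
the health state, so transient media failures are not reported as
corruption.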

* [PATCH 7/9] xfs: report inode corruption errors to the health system
  2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
                   ` (5 preceding siblings ...)
  2019-11-14 18:19 ` [PATCH 6/9] xfs: report symlink " Darrick J. Wong
@ 2019-11-14 18:19 ` Darrick J. Wong
  2019-11-14 18:20 ` [PATCH 8/9] xfs: report quota block " Darrick J. Wong
  2019-11-14 18:20 ` [PATCH 9/9] xfs: report realtime metadata " Darrick J. Wong
  8 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:19 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Whenever we encounter corrupt inode records, we should report that to
the health monitoring system for later reporting.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_inode_buf.c  |    5 +++++
 fs/xfs/libxfs/xfs_inode_fork.c |    8 ++++++++
 fs/xfs/xfs_icache.c            |    4 ++++
 fs/xfs/xfs_inode.c             |    2 ++
 4 files changed, 19 insertions(+)


diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 8afacfe4be0a..10dab755abe0 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -17,6 +17,7 @@
 #include "xfs_trans.h"
 #include "xfs_ialloc.h"
 #include "xfs_dir2.h"
+#include "xfs_health.h"
 
 #include <linux/iversion.h>
 
@@ -182,6 +183,9 @@ xfs_imap_to_bp(
 	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, imap->im_blkno,
 				   (int)imap->im_len, buf_flags, &bp,
 				   &xfs_inode_buf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_agno_mark_sick(mp, xfs_daddr_to_agno(mp, imap->im_blkno),
+				XFS_SICK_AG_INOBT);
 	if (error) {
 		if (error == -EAGAIN) {
 			ASSERT(buf_flags & XBF_TRYLOCK);
@@ -648,6 +652,7 @@ xfs_iread(
 	if (fa) {
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", dip,
 				sizeof(*dip), fa);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 		error = -EFSCORRUPTED;
 		goto out_brelse;
 	}
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 15d6f947620f..6698161b581b 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -23,6 +23,7 @@
 #include "xfs_da_btree.h"
 #include "xfs_dir2_priv.h"
 #include "xfs_attr_leaf.h"
+#include "xfs_health.h"
 
 kmem_zone_t *xfs_ifork_zone;
 
@@ -77,6 +78,7 @@ xfs_iformat_fork(
 		default:
 			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
 					dip, sizeof(*dip), __this_address);
+			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 			return -EFSCORRUPTED;
 		}
 		break;
@@ -84,6 +86,7 @@ xfs_iformat_fork(
 	default:
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
 				sizeof(*dip), __this_address);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 		return -EFSCORRUPTED;
 	}
 	if (error)
@@ -116,6 +119,7 @@ xfs_iformat_fork(
 	default:
 		xfs_inode_verifier_error(ip, error, __func__, dip,
 				sizeof(*dip), __this_address);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 		error = -EFSCORRUPTED;
 		break;
 	}
@@ -189,6 +193,7 @@ xfs_iformat_local(
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED,
 				"xfs_iformat_local", dip, sizeof(*dip),
 				__this_address);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 		return -EFSCORRUPTED;
 	}
 
@@ -226,6 +231,7 @@ xfs_iformat_extents(
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED,
 				"xfs_iformat_extents(1)", dip, sizeof(*dip),
 				__this_address);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 		return -EFSCORRUPTED;
 	}
 
@@ -245,6 +251,7 @@ xfs_iformat_extents(
 				xfs_inode_verifier_error(ip, -EFSCORRUPTED,
 						"xfs_iformat_extents(2)",
 						dp, sizeof(*dp), fa);
+				xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 				return -EFSCORRUPTED;
 			}
 
@@ -304,6 +311,7 @@ xfs_iformat_btree(
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED,
 				"xfs_iformat_btree", dfp, size,
 				__this_address);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 		return -EFSCORRUPTED;
 	}
 
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index ec302b7e48f3..3e6c27d69132 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -22,6 +22,7 @@
 #include "xfs_dquot_item.h"
 #include "xfs_dquot.h"
 #include "xfs_reflink.h"
+#include "xfs_health.h"
 
 #include <linux/iversion.h>
 
@@ -321,6 +322,7 @@ xfs_iget_check_free_state(
 			xfs_warn(ip->i_mount,
 "Corruption detected! Free inode 0x%llx not marked free! (mode 0x%x)",
 				ip->i_ino, VFS_I(ip)->i_mode);
+			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 			return -EFSCORRUPTED;
 		}
 
@@ -328,6 +330,7 @@ xfs_iget_check_free_state(
 			xfs_warn(ip->i_mount,
 "Corruption detected! Free inode 0x%llx has blocks allocated!",
 				ip->i_ino);
+			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 			return -EFSCORRUPTED;
 		}
 		return 0;
@@ -511,6 +514,7 @@ xfs_iget_cache_miss(
 		goto out_destroy;
 
 	if (!xfs_inode_verify_forks(ip)) {
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 		error = -EFSCORRUPTED;
 		goto out_destroy;
 	}
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index a2812cea748d..e49712f073ff 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3653,6 +3653,7 @@ xfs_iflush_cluster(
 	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
 
 	/* abort the corrupt inode, as it was not attached to the buffer */
+	xfs_inode_mark_sick(cip, XFS_SICK_INO_CORE);
 	xfs_iflush_abort(cip, false);
 	kmem_free(cilist);
 	xfs_perag_put(pag);
@@ -3950,6 +3951,7 @@ xfs_iflush_int(
 	return 0;
 
 corrupt_out:
+	xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 	return -EFSCORRUPTED;
 }
 



* [PATCH 8/9] xfs: report quota block corruption errors to the health system
  2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
                   ` (6 preceding siblings ...)
  2019-11-14 18:19 ` [PATCH 7/9] xfs: report inode " Darrick J. Wong
@ 2019-11-14 18:20 ` Darrick J. Wong
  2019-11-14 18:20 ` [PATCH 9/9] xfs: report realtime metadata " Darrick J. Wong
  8 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:20 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Whenever we encounter corrupt quota blocks, we should report that to the
health monitoring system for later reporting.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_health.h |    1 +
 fs/xfs/xfs_dquot.c         |    6 ++++++
 fs/xfs/xfs_health.c        |   15 +++++++++++++++
 fs/xfs/xfs_qm.c            |    9 +++++++--
 4 files changed, 29 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
index d9404cd3d09b..69e7d97ed480 100644
--- a/fs/xfs/libxfs/xfs_health.h
+++ b/fs/xfs/libxfs/xfs_health.h
@@ -144,6 +144,7 @@ void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
 void xfs_btree_mark_sick(struct xfs_btree_cur *cur);
 void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork);
 void xfs_da_mark_sick(struct xfs_da_args *args);
+void xfs_quota_mark_sick(struct xfs_mount *mp, uint dq_flags);
 
 /* Now some helpers. */
 
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 1d97e897ebde..35f1d794f952 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -23,6 +23,7 @@
 #include "xfs_trace.h"
 #include "xfs_log.h"
 #include "xfs_bmap_btree.h"
+#include "xfs_health.h"
 
 /*
  * Lock order:
@@ -419,6 +420,8 @@ xfs_dquot_disk_read(
 	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
 			mp->m_quotainfo->qi_dqchunklen, 0, &bp,
 			&xfs_dquot_buf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_quota_mark_sick(mp, dqp->dq_flags);
 	if (error) {
 		ASSERT(bp == NULL);
 		return error;
@@ -1107,6 +1110,8 @@ xfs_qm_dqflush(
 	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
 				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
 				   &xfs_dquot_buf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_quota_mark_sick(mp, dqp->dq_flags);
 	if (error)
 		goto out_unlock;
 
@@ -1126,6 +1131,7 @@ xfs_qm_dqflush(
 		xfs_buf_relse(bp);
 		xfs_dqfunlock(dqp);
 		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
+		xfs_quota_mark_sick(mp, dqp->dq_flags);
 		return -EFSCORRUPTED;
 	}
 
diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index c1b6e8fb72ec..2d3da765722e 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -17,6 +17,7 @@
 #include "xfs_btree.h"
 #include "xfs_da_format.h"
 #include "xfs_da_btree.h"
+#include "xfs_quota_defs.h"
 
 /*
  * Warn about metadata corruption that we detected but haven't fixed, and
@@ -556,3 +557,17 @@ xfs_da_mark_sick(
 {
 	xfs_dirattr_mark_sick(args->dp, args->whichfork);
 }
+
+/* Record observations of quota corruption with the health tracking system. */
+void
+xfs_quota_mark_sick(
+	struct xfs_mount	*mp,
+	uint			dq_flags)
+{
+	if (dq_flags & XFS_DQ_USER)
+		xfs_fs_mark_sick(mp, XFS_SICK_FS_UQUOTA);
+	if (dq_flags & XFS_DQ_GROUP)
+		xfs_fs_mark_sick(mp, XFS_SICK_FS_GQUOTA);
+	if (dq_flags & XFS_DQ_PROJ)
+		xfs_fs_mark_sick(mp, XFS_SICK_FS_PQUOTA);
+}
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 0b0909657bad..ed6cc943db92 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -23,6 +23,7 @@
 #include "xfs_trace.h"
 #include "xfs_icache.h"
 #include "xfs_error.h"
+#include "xfs_health.h"
 
 /*
  * The global quota manager. There is only one of these for the entire
@@ -756,14 +757,18 @@ xfs_qm_qino_alloc(
 			     (mp->m_sb.sb_gquotino != NULLFSINO)) {
 			ino = mp->m_sb.sb_gquotino;
 			if (XFS_IS_CORRUPT(mp,
-					   mp->m_sb.sb_pquotino != NULLFSINO))
+					   mp->m_sb.sb_pquotino != NULLFSINO)) {
+				xfs_quota_mark_sick(mp, XFS_DQ_PROJ);
 				return -EFSCORRUPTED;
+			}
 		} else if ((flags & XFS_QMOPT_GQUOTA) &&
 			     (mp->m_sb.sb_pquotino != NULLFSINO)) {
 			ino = mp->m_sb.sb_pquotino;
 			if (XFS_IS_CORRUPT(mp,
-					   mp->m_sb.sb_gquotino != NULLFSINO))
+					   mp->m_sb.sb_gquotino != NULLFSINO)) {
+				xfs_quota_mark_sick(mp, XFS_DQ_GROUP);
 				return -EFSCORRUPTED;
+			}
 		}
 		if (ino != NULLFSINO) {
 			error = xfs_iget(mp, NULL, ino, 0, 0, ip);


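xfs_quota_mark_sick() in the patch above tests each quota type bit
independently rather than with if/else, so a flags word carrying more than
one type marks more than one health bit, and unrelated bits are ignored.
A minimal userspace sketch of that fan-out follows.  The demo_* names, the
struct, and all bit values are stand-ins invented for illustration, not the
kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in dquot type flags. */
#define DEMO_DQ_USER		(1u << 0)
#define DEMO_DQ_PROJ		(1u << 1)
#define DEMO_DQ_GROUP		(1u << 2)

/* Stand-in filesystem-wide health bits. */
#define DEMO_SICK_FS_UQUOTA	(1u << 0)
#define DEMO_SICK_FS_GQUOTA	(1u << 1)
#define DEMO_SICK_FS_PQUOTA	(1u << 2)

/* Hypothetical, pared-down mount: only the accumulated sick mask. */
struct demo_mount {
	unsigned int	sick;
};

/* Map each quota type bit present in dq_flags onto its health bit. */
static void
demo_quota_mark_sick(struct demo_mount *mp, unsigned int dq_flags)
{
	assert(mp != NULL);

	if (dq_flags & DEMO_DQ_USER)
		mp->sick |= DEMO_SICK_FS_UQUOTA;
	if (dq_flags & DEMO_DQ_GROUP)
		mp->sick |= DEMO_SICK_FS_GQUOTA;
	if (dq_flags & DEMO_DQ_PROJ)
		mp->sick |= DEMO_SICK_FS_PQUOTA;
}
```

Because non-type bits fall through every test, a caller can pass a whole
flags word (as the xfs_dquot.c hunks do with dqp->dq_flags) without masking
it first.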

* [PATCH 9/9] xfs: report realtime metadata corruption errors to the health system
  2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
                   ` (7 preceding siblings ...)
  2019-11-14 18:20 ` [PATCH 8/9] xfs: report quota block " Darrick J. Wong
@ 2019-11-14 18:20 ` Darrick J. Wong
  8 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-14 18:20 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Whenever we encounter corrupt realtime metadata blocks, we should report
that to the health monitoring system for later reporting.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_rtbitmap.c |    9 ++++++++-
 fs/xfs/xfs_rtalloc.c         |    6 +++++-
 2 files changed, 13 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index f42c74cb8be5..88a87526280e 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -16,6 +16,7 @@
 #include "xfs_trans.h"
 #include "xfs_rtalloc.h"
 #include "xfs_error.h"
+#include "xfs_health.h"
 
 /*
  * Realtime allocator bitmap functions shared with userspace.
@@ -70,13 +71,19 @@ xfs_rtbuf_get(
 	if (error)
 		return error;
 
-	if (XFS_IS_CORRUPT(mp, nmap == 0 || !xfs_bmap_is_real_extent(&map)))
+	if (XFS_IS_CORRUPT(mp, nmap == 0 || !xfs_bmap_is_real_extent(&map))) {
+		xfs_rt_mark_sick(mp, issum ? XFS_SICK_RT_SUMMARY :
+					     XFS_SICK_RT_BITMAP);
 		return -EFSCORRUPTED;
+	}
 
 	ASSERT(map.br_startblock != NULLFSBLOCK);
 	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
 				   XFS_FSB_TO_DADDR(mp, map.br_startblock),
 				   mp->m_bsize, 0, &bp, &xfs_rtbuf_ops);
+	if (xfs_metadata_is_sick(error))
+		xfs_rt_mark_sick(mp, issum ? XFS_SICK_RT_SUMMARY :
+					     XFS_SICK_RT_BITMAP);
 	if (error)
 		return error;
 
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index d42b5a2047e0..4ec0fead3177 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -18,7 +18,7 @@
 #include "xfs_trans_space.h"
 #include "xfs_icache.h"
 #include "xfs_rtalloc.h"
-
+#include "xfs_health.h"
 
 /*
  * Read and return the summary information for a given extent size,
@@ -1235,11 +1235,15 @@ xfs_rtmount_inodes(
 
 	sbp = &mp->m_sb;
 	error = xfs_iget(mp, NULL, sbp->sb_rbmino, 0, 0, &mp->m_rbmip);
+	if (xfs_metadata_is_sick(error))
+		xfs_rt_mark_sick(mp, XFS_SICK_RT_BITMAP);
 	if (error)
 		return error;
 	ASSERT(mp->m_rbmip != NULL);
 
 	error = xfs_iget(mp, NULL, sbp->sb_rsumino, 0, 0, &mp->m_rsumip);
+	if (xfs_metadata_is_sick(error))
+		xfs_rt_mark_sick(mp, XFS_SICK_RT_SUMMARY);
 	if (error) {
 		xfs_irele(mp->m_rbmip);
 		return error;



* Re: [PATCH 1/9] xfs: separate the marking of sick and checked metadata
  2019-11-14 18:19 ` [PATCH 1/9] xfs: separate the marking of sick and checked metadata Darrick J. Wong
@ 2019-11-20 14:20   ` Brian Foster
  2019-11-20 16:12     ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Brian Foster @ 2019-11-20 14:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Nov 14, 2019 at 10:19:20AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Split the setting of the sick and checked masks into separate functions
> as part of preparing to add the ability for regular runtime fs code
> (i.e. not scrub) to mark metadata structures sick when corruptions are
> found.  Improve the documentation of libxfs' requirements for helper
> behavior.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_health.h |   24 ++++++++++++++++++----
>  fs/xfs/scrub/health.c      |   20 +++++++++++-------
>  fs/xfs/xfs_health.c        |   49 ++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_mount.c         |    5 ++++
>  4 files changed, 85 insertions(+), 13 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> index 272005ac8c88..3657a9cb8490 100644
> --- a/fs/xfs/libxfs/xfs_health.h
> +++ b/fs/xfs/libxfs/xfs_health.h
> @@ -26,9 +26,11 @@
>   * and the "sick" field tells us if that piece was found to need repairs.
>   * Therefore we can conclude that for a given sick flag value:
>   *
> - *  - checked && sick  => metadata needs repair
> - *  - checked && !sick => metadata is ok
> - *  - !checked         => has not been examined since mount
> + *  - checked && sick   => metadata needs repair
> + *  - checked && !sick  => metadata is ok
> + *  - !checked && sick  => errors have been observed during normal operation,
> + *                         but the metadata has not been checked thoroughly
> + *  - !checked && !sick => has not been examined since mount
>   */
>  

I don't see this change in the provided repo. Which is the right patch?
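
For reference, the four combinations in the quoted comment can be
written out as a small userspace sketch. This is only an illustration:
struct toy_health, enum toy_verdict and toy_classify() are hypothetical
names, not part of the patch, and the real fields live in the mount,
perag and inode structures under their respective spinlocks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one health-tracked object (fs, rt, AG, inode). */
struct toy_health {
	unsigned int	sick;		/* bits set when corruption was observed */
	unsigned int	checked;	/* bits set once scrub examined the metadata */
};

enum toy_verdict {
	TOY_UNKNOWN,		/* !checked && !sick: not examined since mount */
	TOY_SUSPECT,		/* !checked && sick: runtime error, not yet scrubbed */
	TOY_OK,			/* checked && !sick: metadata is ok */
	TOY_NEEDS_REPAIR,	/* checked && sick: metadata needs repair */
};

/* Classify a single flag bit according to the table in xfs_health.h. */
static enum toy_verdict
toy_classify(
	const struct toy_health	*h,
	unsigned int		flag)
{
	int			sick, checked;

	assert(h != NULL);
	sick = (h->sick & flag) != 0;
	checked = (h->checked & flag) != 0;

	if (checked)
		return sick ? TOY_NEEDS_REPAIR : TOY_OK;
	return sick ? TOY_SUSPECT : TOY_UNKNOWN;
}
```

The new third state (TOY_SUSPECT here) is the one that runtime code can
produce by calling a mark_sick helper without involving scrub.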

>  struct xfs_mount;
> @@ -97,24 +99,38 @@ struct xfs_fsop_geom;
>  				 XFS_SICK_INO_SYMLINK | \
>  				 XFS_SICK_INO_PARENT)
>  
> -/* These functions must be provided by the xfs implementation. */
> +/*
> + * These functions must be provided by the xfs implementation.  Function
> + * behavior with respect to the first argument should be as follows:
> + *
> + * xfs_*_mark_sick:    set the sick flags and do not set checked flags.

Nit: It's probably not necessary to say that we don't set the checked
flags here given the comment/function below.

Brian
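
A minimal sketch of the helper semantics described in the quoted
comment, assuming a single toy structure instead of the four real
object types. All toy_* names are hypothetical, and locking and
tracepoints are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the sick/checked pair of one object. */
struct toy_health {
	unsigned int	sick;
	unsigned int	checked;
};

/* Set the sick flags; the checked flags are left alone. */
static void
toy_mark_sick(struct toy_health *h, unsigned int mask)
{
	assert(h != NULL);
	h->sick |= mask;
}

/* Set the checked flags only. */
static void
toy_mark_checked(struct toy_health *h, unsigned int mask)
{
	assert(h != NULL);
	h->checked |= mask;
}

/* Clear the sick flags and set the checked flags. */
static void
toy_mark_healthy(struct toy_health *h, unsigned int mask)
{
	assert(h != NULL);
	h->sick &= ~mask;
	h->checked |= mask;
}

/* Return the sick and checked status in the out parameters. */
static void
toy_measure_sickness(const struct toy_health *h, unsigned int *sick,
		unsigned int *checked)
{
	assert(h != NULL);
	*sick = h->sick;
	*checked = h->checked;
}

/* Mirrors the shape of the xchk_update_health() hunk in this patch. */
static void
toy_scrub_update(struct toy_health *h, unsigned int mask, int bad)
{
	if (bad) {
		toy_mark_sick(h, mask);
		toy_mark_checked(h, mask);
	} else {
		toy_mark_healthy(h, mask);
	}
}
```

With this split, a runtime caller uses only toy_mark_sick(), while a
scrub-like caller pairs it with toy_mark_checked() to keep the old
"checked and sick" result.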

> + * xfs_*_mark_checked: set the checked flags.
> + * xfs_*_mark_healthy: clear the sick flags and set the checked flags.
> + *
> + * xfs_*_measure_sickness: return the sick and check status in the provided
> + * out parameters.
> + */
>  
>  void xfs_fs_mark_sick(struct xfs_mount *mp, unsigned int mask);
> +void xfs_fs_mark_checked(struct xfs_mount *mp, unsigned int mask);
>  void xfs_fs_mark_healthy(struct xfs_mount *mp, unsigned int mask);
>  void xfs_fs_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
>  		unsigned int *checked);
>  
>  void xfs_rt_mark_sick(struct xfs_mount *mp, unsigned int mask);
> +void xfs_rt_mark_checked(struct xfs_mount *mp, unsigned int mask);
>  void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
>  void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
>  		unsigned int *checked);
>  
>  void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
> +void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
>  void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
>  void xfs_ag_measure_sickness(struct xfs_perag *pag, unsigned int *sick,
>  		unsigned int *checked);
>  
>  void xfs_inode_mark_sick(struct xfs_inode *ip, unsigned int mask);
> +void xfs_inode_mark_checked(struct xfs_inode *ip, unsigned int mask);
>  void xfs_inode_mark_healthy(struct xfs_inode *ip, unsigned int mask);
>  void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
>  		unsigned int *checked);
> diff --git a/fs/xfs/scrub/health.c b/fs/xfs/scrub/health.c
> index 83d27cdf579b..a402f9026d5f 100644
> --- a/fs/xfs/scrub/health.c
> +++ b/fs/xfs/scrub/health.c
> @@ -137,30 +137,34 @@ xchk_update_health(
>  	switch (type_to_health_flag[sc->sm->sm_type].group) {
>  	case XHG_AG:
>  		pag = xfs_perag_get(sc->mp, sc->sm->sm_agno);
> -		if (bad)
> +		if (bad) {
>  			xfs_ag_mark_sick(pag, sc->sick_mask);
> -		else
> +			xfs_ag_mark_checked(pag, sc->sick_mask);
> +		} else
>  			xfs_ag_mark_healthy(pag, sc->sick_mask);
>  		xfs_perag_put(pag);
>  		break;
>  	case XHG_INO:
>  		if (!sc->ip)
>  			return;
> -		if (bad)
> +		if (bad) {
>  			xfs_inode_mark_sick(sc->ip, sc->sick_mask);
> -		else
> +			xfs_inode_mark_checked(sc->ip, sc->sick_mask);
> +		} else
>  			xfs_inode_mark_healthy(sc->ip, sc->sick_mask);
>  		break;
>  	case XHG_FS:
> -		if (bad)
> +		if (bad) {
>  			xfs_fs_mark_sick(sc->mp, sc->sick_mask);
> -		else
> +			xfs_fs_mark_checked(sc->mp, sc->sick_mask);
> +		} else
>  			xfs_fs_mark_healthy(sc->mp, sc->sick_mask);
>  		break;
>  	case XHG_RT:
> -		if (bad)
> +		if (bad) {
>  			xfs_rt_mark_sick(sc->mp, sc->sick_mask);
> -		else
> +			xfs_rt_mark_checked(sc->mp, sc->sick_mask);
> +		} else
>  			xfs_rt_mark_healthy(sc->mp, sc->sick_mask);
>  		break;
>  	default:
> diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> index 8e0cb05a7142..860dc70c99e7 100644
> --- a/fs/xfs/xfs_health.c
> +++ b/fs/xfs/xfs_health.c
> @@ -100,6 +100,18 @@ xfs_fs_mark_sick(
>  
>  	spin_lock(&mp->m_sb_lock);
>  	mp->m_fs_sick |= mask;
> +	spin_unlock(&mp->m_sb_lock);
> +}
> +
> +/* Mark per-fs metadata as having been checked. */
> +void
> +xfs_fs_mark_checked(
> +	struct xfs_mount	*mp,
> +	unsigned int		mask)
> +{
> +	ASSERT(!(mask & ~XFS_SICK_FS_PRIMARY));
> +
> +	spin_lock(&mp->m_sb_lock);
>  	mp->m_fs_checked |= mask;
>  	spin_unlock(&mp->m_sb_lock);
>  }
> @@ -143,6 +155,19 @@ xfs_rt_mark_sick(
>  
>  	spin_lock(&mp->m_sb_lock);
>  	mp->m_rt_sick |= mask;
> +	spin_unlock(&mp->m_sb_lock);
> +}
> +
> +/* Mark realtime metadata as having been checked. */
> +void
> +xfs_rt_mark_checked(
> +	struct xfs_mount	*mp,
> +	unsigned int		mask)
> +{
> +	ASSERT(!(mask & ~XFS_SICK_RT_PRIMARY));
> +	trace_xfs_rt_mark_sick(mp, mask);
> +
> +	spin_lock(&mp->m_sb_lock);
>  	mp->m_rt_checked |= mask;
>  	spin_unlock(&mp->m_sb_lock);
>  }
> @@ -186,6 +211,18 @@ xfs_ag_mark_sick(
>  
>  	spin_lock(&pag->pag_state_lock);
>  	pag->pag_sick |= mask;
> +	spin_unlock(&pag->pag_state_lock);
> +}
> +
> +/* Mark per-ag metadata as having been checked. */
> +void
> +xfs_ag_mark_checked(
> +	struct xfs_perag	*pag,
> +	unsigned int		mask)
> +{
> +	ASSERT(!(mask & ~XFS_SICK_AG_PRIMARY));
> +
> +	spin_lock(&pag->pag_state_lock);
>  	pag->pag_checked |= mask;
>  	spin_unlock(&pag->pag_state_lock);
>  }
> @@ -229,6 +266,18 @@ xfs_inode_mark_sick(
>  
>  	spin_lock(&ip->i_flags_lock);
>  	ip->i_sick |= mask;
> +	spin_unlock(&ip->i_flags_lock);
> +}
> +
> +/* Mark inode metadata as having been checked. */
> +void
> +xfs_inode_mark_checked(
> +	struct xfs_inode	*ip,
> +	unsigned int		mask)
> +{
> +	ASSERT(!(mask & ~XFS_SICK_INO_PRIMARY));
> +
> +	spin_lock(&ip->i_flags_lock);
>  	ip->i_checked |= mask;
>  	spin_unlock(&ip->i_flags_lock);
>  }
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index fca65109cf24..27aa143d524b 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -555,8 +555,10 @@ xfs_check_summary_counts(
>  	if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) &&
>  	    (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks ||
>  	     !xfs_verify_icount(mp, mp->m_sb.sb_icount) ||
> -	     mp->m_sb.sb_ifree > mp->m_sb.sb_icount))
> +	     mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) {
>  		xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS);
> +		xfs_fs_mark_checked(mp, XFS_SICK_FS_COUNTERS);
> +	}
>  
>  	/*
>  	 * We can safely re-initialise incore superblock counters from the
> @@ -1322,6 +1324,7 @@ xfs_force_summary_recalc(
>  		return;
>  
>  	xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS);
> +	xfs_fs_mark_checked(mp, XFS_SICK_FS_COUNTERS);
>  }
>  
>  /*
> 



* Re: [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system
  2019-11-14 18:19 ` [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system Darrick J. Wong
@ 2019-11-20 14:20   ` Brian Foster
  2019-11-20 16:43     ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Brian Foster @ 2019-11-20 14:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Nov 14, 2019 at 10:19:26AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Whenever we encounter a corrupt AG header, we should report that to the
> health monitoring system for later reporting.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_alloc.c    |    6 ++++++
>  fs/xfs/libxfs/xfs_health.h   |    6 ++++++
>  fs/xfs/libxfs/xfs_ialloc.c   |    3 +++
>  fs/xfs/libxfs/xfs_refcount.c |    5 ++++-
>  fs/xfs/libxfs/xfs_rmap.c     |    5 ++++-
>  fs/xfs/libxfs/xfs_sb.c       |    2 ++
>  fs/xfs/xfs_health.c          |   17 +++++++++++++++++
>  fs/xfs/xfs_inode.c           |    9 +++++++++
>  8 files changed, 51 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index c284e10af491..e75e3ae6c912 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -26,6 +26,7 @@
>  #include "xfs_log.h"
>  #include "xfs_ag_resv.h"
>  #include "xfs_bmap.h"
> +#include "xfs_health.h"
>  
>  extern kmem_zone_t	*xfs_bmap_free_item_zone;
>  
> @@ -699,6 +700,8 @@ xfs_alloc_read_agfl(
>  			mp, tp, mp->m_ddev_targp,
>  			XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
>  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_agfl_buf_ops);
> +	if (xfs_metadata_is_sick(error))
> +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGFL);

Any reason we couldn't do some of these in verifiers? I'm assuming we'd
still need calls in various external corruption checks, but at least we
wouldn't add a requirement to check all future buffer reads, etc.
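
A rough sketch of the verifier idea, under the assumption that a buffer
could carry enough context (owner and health flag) for its read
verifier to do the marking. Nothing here is the real xfs_buf or
xfs_buf_ops interface; every toy_* name, field and constant is
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EUCLEAN
#define EUCLEAN	117	/* placeholder for non-Linux builds of this sketch */
#endif
#define TOY_EFSCORRUPTED	EUCLEAN

struct toy_ag {
	unsigned int		sick;
};

struct toy_buf {
	struct toy_ag		*owner;		/* hypothetical back pointer */
	unsigned int		sick_flag;	/* flag to set on failure */
	unsigned int		magic;
	int			error;
};

struct toy_buf_ops {
	unsigned int		expected_magic;
	void			(*verify_read)(struct toy_buf *bp,
					       const struct toy_buf_ops *ops);
};

/* The verifier records the error and marks the owner sick in one place. */
static void
toy_verify_read(struct toy_buf *bp, const struct toy_buf_ops *ops)
{
	if (bp->magic != ops->expected_magic) {
		bp->error = -TOY_EFSCORRUPTED;
		if (bp->owner)
			bp->owner->sick |= bp->sick_flag;
	}
}

/* Callers only look at the return value; no per-call-site marking. */
static int
toy_read_buf(struct toy_buf *bp, const struct toy_buf_ops *ops)
{
	assert(bp != NULL && ops != NULL);
	bp->error = 0;
	ops->verify_read(bp, ops);
	return bp->error;
}
```

Whether the real verifiers have the owner and flag available at that
point is exactly the open question; the sketch only shows what the call
sites would save if they did.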

>  	if (error)
>  		return error;
>  	xfs_buf_set_ref(bp, XFS_AGFL_REF);
> @@ -722,6 +725,7 @@ xfs_alloc_update_counters(
>  	if (unlikely(be32_to_cpu(agf->agf_freeblks) >
>  		     be32_to_cpu(agf->agf_length))) {
>  		xfs_buf_corruption_error(agbp);
> +		xfs_ag_mark_sick(pag, XFS_SICK_AG_AGF);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -2952,6 +2956,8 @@ xfs_read_agf(
>  			mp, tp, mp->m_ddev_targp,
>  			XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
>  			XFS_FSS_TO_BB(mp, 1), flags, bpp, &xfs_agf_buf_ops);
> +	if (xfs_metadata_is_sick(error))
> +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGF);
>  	if (error)
>  		return error;
>  	if (!*bpp)
> diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> index 3657a9cb8490..ce8954a10c66 100644
> --- a/fs/xfs/libxfs/xfs_health.h
> +++ b/fs/xfs/libxfs/xfs_health.h
> @@ -123,6 +123,8 @@ void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
>  void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
>  		unsigned int *checked);
>  
> +void xfs_agno_mark_sick(struct xfs_mount *mp, xfs_agnumber_t agno,
> +		unsigned int mask);
>  void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
>  void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
>  void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
> @@ -203,4 +205,8 @@ void xfs_fsop_geom_health(struct xfs_mount *mp, struct xfs_fsop_geom *geo);
>  void xfs_ag_geom_health(struct xfs_perag *pag, struct xfs_ag_geometry *ageo);
>  void xfs_bulkstat_health(struct xfs_inode *ip, struct xfs_bulkstat *bs);
>  
> +#define xfs_metadata_is_sick(error) \
> +	(unlikely((error) == -EFSCORRUPTED || (error) == -EIO || \
> +		  (error) == -EFSBADCRC))

Why is -EIO considered sick? My understanding is that once something is
marked sick, scrub is the only way to clear that state. -EIO can be
transient, so afaict that means we could mark a persistent in-core state
based on a transient/resolved issue.

Along similar lines, what's the expected behavior in the event of any of
these errors for a kernel that might not support
CONFIG_XFS_ONLINE_[SCRUB|REPAIR]? Do we just set states that are never
used for anything? If so, that seems OK I suppose, but it's a little
awkward if we'd still see the tracepoints and such associated with the
state changes.

Brian

> +
>  #endif	/* __XFS_HEALTH_H__ */
> diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> index 988cde7744e6..c401512a4350 100644
> --- a/fs/xfs/libxfs/xfs_ialloc.c
> +++ b/fs/xfs/libxfs/xfs_ialloc.c
> @@ -27,6 +27,7 @@
>  #include "xfs_trace.h"
>  #include "xfs_log.h"
>  #include "xfs_rmap.h"
> +#include "xfs_health.h"
>  
>  /*
>   * Lookup a record by ino in the btree given by cur.
> @@ -2635,6 +2636,8 @@ xfs_read_agi(
>  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
>  			XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
>  			XFS_FSS_TO_BB(mp, 1), 0, bpp, &xfs_agi_buf_ops);
> +	if (xfs_metadata_is_sick(error))
> +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
>  	if (error)
>  		return error;
>  	if (tp)
> diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
> index d7d702ee4d1a..25c87834e42a 100644
> --- a/fs/xfs/libxfs/xfs_refcount.c
> +++ b/fs/xfs/libxfs/xfs_refcount.c
> @@ -22,6 +22,7 @@
>  #include "xfs_bit.h"
>  #include "xfs_refcount.h"
>  #include "xfs_rmap.h"
> +#include "xfs_health.h"
>  
>  /* Allowable refcount adjustment amounts. */
>  enum xfs_refc_adjust_op {
> @@ -1177,8 +1178,10 @@ xfs_refcount_finish_one(
>  				XFS_ALLOC_FLAG_FREEING, &agbp);
>  		if (error)
>  			return error;
> -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
>  			return -EFSCORRUPTED;
> +		}
>  
>  		rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, agno);
>  		if (!rcur) {
> diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> index ff9412f113c4..a54a3c129cce 100644
> --- a/fs/xfs/libxfs/xfs_rmap.c
> +++ b/fs/xfs/libxfs/xfs_rmap.c
> @@ -21,6 +21,7 @@
>  #include "xfs_errortag.h"
>  #include "xfs_error.h"
>  #include "xfs_inode.h"
> +#include "xfs_health.h"
>  
>  /*
>   * Lookup the first record less than or equal to [bno, len, owner, offset]
> @@ -2400,8 +2401,10 @@ xfs_rmap_finish_one(
>  		error = xfs_free_extent_fix_freelist(tp, agno, &agbp);
>  		if (error)
>  			return error;
> -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
>  			return -EFSCORRUPTED;
> +		}
>  
>  		rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, agno);
>  		if (!rcur) {
> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index 0ac69751fe85..4a923545465d 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -1169,6 +1169,8 @@ xfs_sb_read_secondary(
>  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
>  			XFS_AG_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
>  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
> +	if (xfs_metadata_is_sick(error))
> +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_SB);
>  	if (error)
>  		return error;
>  	xfs_buf_set_ref(bp, XFS_SSB_REF);
> diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> index 860dc70c99e7..36c32b108b39 100644
> --- a/fs/xfs/xfs_health.c
> +++ b/fs/xfs/xfs_health.c
> @@ -200,6 +200,23 @@ xfs_rt_measure_sickness(
>  	spin_unlock(&mp->m_sb_lock);
>  }
>  
> +/* Mark unhealthy per-ag metadata given a raw AG number. */
> +void
> +xfs_agno_mark_sick(
> +	struct xfs_mount	*mp,
> +	xfs_agnumber_t		agno,
> +	unsigned int		mask)
> +{
> +	struct xfs_perag	*pag = xfs_perag_get(mp, agno);
> +
> +	/* per-ag structure not set up yet? */
> +	if (!pag)
> +		return;
> +
> +	xfs_ag_mark_sick(pag, mask);
> +	xfs_perag_put(pag);
> +}
> +
>  /* Mark unhealthy per-ag metadata. */
>  void
>  xfs_ag_mark_sick(
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 401da197f012..a2812cea748d 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -35,6 +35,7 @@
>  #include "xfs_log.h"
>  #include "xfs_bmap_btree.h"
>  #include "xfs_reflink.h"
> +#include "xfs_health.h"
>  
>  kmem_zone_t *xfs_inode_zone;
>  
> @@ -787,6 +788,8 @@ xfs_ialloc(
>  	 */
>  	if ((pip && ino == pip->i_ino) || !xfs_verify_dir_ino(mp, ino)) {
>  		xfs_alert(mp, "Allocated a known in-use inode 0x%llx!", ino);
> +		xfs_agno_mark_sick(mp, XFS_INO_TO_AGNO(mp, ino),
> +				XFS_SICK_AG_INOBT);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -2137,6 +2140,7 @@ xfs_iunlink_update_bucket(
>  	 */
>  	if (old_value == new_agino) {
>  		xfs_buf_corruption_error(agibp);
> +		xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGI);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -2203,6 +2207,7 @@ xfs_iunlink_update_inode(
>  	if (!xfs_verify_agino_or_null(mp, agno, old_value)) {
>  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
>  				sizeof(*dip), __this_address);
> +		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
>  		error = -EFSCORRUPTED;
>  		goto out;
>  	}
> @@ -2217,6 +2222,7 @@ xfs_iunlink_update_inode(
>  		if (next_agino != NULLAGINO) {
>  			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
>  					dip, sizeof(*dip), __this_address);
> +			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
>  			error = -EFSCORRUPTED;
>  		}
>  		goto out;
> @@ -2271,6 +2277,7 @@ xfs_iunlink(
>  	if (next_agino == agino ||
>  	    !xfs_verify_agino_or_null(mp, agno, next_agino)) {
>  		xfs_buf_corruption_error(agibp);
> +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -2408,6 +2415,7 @@ xfs_iunlink_map_prev(
>  			XFS_CORRUPTION_ERROR(__func__,
>  					XFS_ERRLEVEL_LOW, mp,
>  					*dipp, sizeof(**dipp));
> +			xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
>  			error = -EFSCORRUPTED;
>  			return error;
>  		}
> @@ -2454,6 +2462,7 @@ xfs_iunlink_remove(
>  	if (!xfs_verify_agino(mp, agno, head_agino)) {
>  		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
>  				agi, sizeof(*agi));
> +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
>  		return -EFSCORRUPTED;
>  	}
>  
> 



* Re: [PATCH 3/9] xfs: report block map corruption errors to the health tracking system
  2019-11-14 18:19 ` [PATCH 3/9] xfs: report block map " Darrick J. Wong
@ 2019-11-20 14:21   ` Brian Foster
  2019-11-20 16:57     ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Brian Foster @ 2019-11-20 14:21 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Nov 14, 2019 at 10:19:33AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Whenever we encounter a corrupt block mapping, we should report that to
> the health monitoring system for later reporting.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_bmap.c   |   39 +++++++++++++++++++++++++++++++++------
>  fs/xfs/libxfs/xfs_health.h |    1 +
>  fs/xfs/xfs_health.c        |   26 ++++++++++++++++++++++++++
>  fs/xfs/xfs_iomap.c         |   15 +++++++++++----
>  4 files changed, 71 insertions(+), 10 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 4acc6e37c31d..c4674fb0bfb4 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -35,7 +35,7 @@
>  #include "xfs_refcount.h"
>  #include "xfs_icache.h"
>  #include "xfs_iomap.h"
> -
> +#include "xfs_health.h"
>  
>  kmem_zone_t		*xfs_bmap_free_item_zone;
>  
> @@ -732,6 +732,7 @@ xfs_bmap_extents_to_btree(
>  	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, 1L);
>  	abp = xfs_btree_get_bufl(mp, tp, args.fsbno);
>  	if (XFS_IS_CORRUPT(mp, !abp)) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		error = -EFSCORRUPTED;
>  		goto out_unreserve_dquot;
>  	}
> @@ -1021,6 +1022,7 @@ xfs_bmap_add_attrfork_local(
>  
>  	/* should only be called for types that support local format data */
>  	ASSERT(0);
> +	xfs_bmap_mark_sick(ip, XFS_ATTR_FORK);
>  	return -EFSCORRUPTED;
>  }

Is it really the attr fork that's corrupt if we get here?

>  
> @@ -1090,6 +1092,7 @@ xfs_bmap_add_attrfork(
>  	if (XFS_IFORK_Q(ip))
>  		goto trans_cancel;
>  	if (XFS_IS_CORRUPT(mp, ip->i_d.di_anextents != 0)) {
> +		xfs_bmap_mark_sick(ip, XFS_ATTR_FORK);

Similar question here given we haven't added the fork yet. di_anextents
is at least related I suppose, but it's not clear that
scrubbing/repairing the attr fork is what needs to happen.

>  		error = -EFSCORRUPTED;
>  		goto trans_cancel;
>  	}
...
> @@ -1239,6 +1244,7 @@ xfs_iread_extents(
>  	if (XFS_IS_CORRUPT(mp,
>  			   XFS_IFORK_FORMAT(ip, whichfork) !=
>  			   XFS_DINODE_FMT_BTREE)) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		error = -EFSCORRUPTED;
>  		goto out;
>  	}
> @@ -1254,6 +1260,7 @@ xfs_iread_extents(
>  
>  	if (XFS_IS_CORRUPT(mp,
>  			   ir.loaded != XFS_IFORK_NEXTENTS(ip, whichfork))) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		error = -EFSCORRUPTED;
>  		goto out;
>  	}
> @@ -1262,6 +1269,8 @@ xfs_iread_extents(
>  	ifp->if_flags |= XFS_IFEXTENTS;
>  	return 0;
>  out:
> +	if (xfs_metadata_is_sick(error))
> +		xfs_bmap_mark_sick(ip, whichfork);
>  	xfs_iext_destroy(ifp);
>  	return error;
>  }

Duplicate calls in xfs_iread_extents()?

Brian

> @@ -1344,6 +1353,7 @@ xfs_bmap_last_before(
>  		break;
>  	default:
>  		ASSERT(0);
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -1443,8 +1453,11 @@ xfs_bmap_last_offset(
>  	if (XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_LOCAL)
>  		return 0;
>  
> -	if (XFS_IS_CORRUPT(ip->i_mount, !xfs_ifork_has_extents(ip, whichfork)))
> +	if (XFS_IS_CORRUPT(ip->i_mount,
> +	    !xfs_ifork_has_extents(ip, whichfork))) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
> +	}
>  
>  	error = xfs_bmap_last_extent(NULL, ip, whichfork, &rec, &is_empty);
>  	if (error || is_empty)
> @@ -3905,6 +3918,7 @@ xfs_bmapi_read(
>  
>  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
>  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -3935,6 +3949,7 @@ xfs_bmapi_read(
>  		xfs_alert(mp, "%s: inode %llu missing fork %d",
>  				__func__, ip->i_ino, whichfork);
>  #endif /* DEBUG */
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -4414,6 +4429,7 @@ xfs_bmapi_write(
>  
>  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
>  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -4621,9 +4637,11 @@ xfs_bmapi_convert_delalloc(
>  	error = -ENOSPC;
>  	if (WARN_ON_ONCE(bma.blkno == NULLFSBLOCK))
>  		goto out_finish;
> -	error = -EFSCORRUPTED;
> -	if (WARN_ON_ONCE(!xfs_valid_startblock(ip, bma.got.br_startblock)))
> +	if (WARN_ON_ONCE(!xfs_valid_startblock(ip, bma.got.br_startblock))) {
> +		xfs_bmap_mark_sick(ip, whichfork);
> +		error = -EFSCORRUPTED;
>  		goto out_finish;
> +	}
>  
>  	XFS_STATS_ADD(mp, xs_xstrat_bytes, XFS_FSB_TO_B(mp, bma.length));
>  	XFS_STATS_INC(mp, xs_xstrat_quick);
> @@ -4681,6 +4699,7 @@ xfs_bmapi_remap(
>  
>  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
>  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -5319,8 +5338,10 @@ __xfs_bunmapi(
>  	whichfork = xfs_bmapi_whichfork(flags);
>  	ASSERT(whichfork != XFS_COW_FORK);
>  	ifp = XFS_IFORK_PTR(ip, whichfork);
> -	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)))
> +	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork))) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
> +	}
>  	if (XFS_FORCED_SHUTDOWN(mp))
>  		return -EIO;
>  
> @@ -5815,6 +5836,7 @@ xfs_bmap_collapse_extents(
>  
>  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
>  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -5932,6 +5954,7 @@ xfs_bmap_insert_extents(
>  
>  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
>  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -6038,6 +6061,7 @@ xfs_bmap_split_extent_at(
>  
>  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
>  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -6253,8 +6277,10 @@ xfs_bmap_finish_one(
>  			XFS_FSB_TO_AGBNO(tp->t_mountp, startblock),
>  			ip->i_ino, whichfork, startoff, *blockcount, state);
>  
> -	if (WARN_ON_ONCE(whichfork != XFS_DATA_FORK))
> +	if (WARN_ON_ONCE(whichfork != XFS_DATA_FORK)) {
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		return -EFSCORRUPTED;
> +	}
>  
>  	if (XFS_TEST_ERROR(false, tp->t_mountp,
>  			XFS_ERRTAG_BMAP_FINISH_ONE))
> @@ -6272,6 +6298,7 @@ xfs_bmap_finish_one(
>  		break;
>  	default:
>  		ASSERT(0);
> +		xfs_bmap_mark_sick(ip, whichfork);
>  		error = -EFSCORRUPTED;
>  	}
>  
> diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> index ce8954a10c66..25b61180b562 100644
> --- a/fs/xfs/libxfs/xfs_health.h
> +++ b/fs/xfs/libxfs/xfs_health.h
> @@ -138,6 +138,7 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
>  		unsigned int *checked);
>  
>  void xfs_health_unmount(struct xfs_mount *mp);
> +void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
>  
>  /* Now some helpers. */
>  
> diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> index 36c32b108b39..5e5de5338476 100644
> --- a/fs/xfs/xfs_health.c
> +++ b/fs/xfs/xfs_health.c
> @@ -452,3 +452,29 @@ xfs_bulkstat_health(
>  			bs->bs_sick |= m->ioctl_mask;
>  	}
>  }
> +
> +/* Mark a block mapping sick. */
> +void
> +xfs_bmap_mark_sick(
> +	struct xfs_inode	*ip,
> +	int			whichfork)
> +{
> +	unsigned int		mask;
> +
> +	switch (whichfork) {
> +	case XFS_DATA_FORK:
> +		mask = XFS_SICK_INO_BMBTD;
> +		break;
> +	case XFS_ATTR_FORK:
> +		mask = XFS_SICK_INO_BMBTA;
> +		break;
> +	case XFS_COW_FORK:
> +		mask = XFS_SICK_INO_BMBTC;
> +		break;
> +	default:
> +		ASSERT(0);
> +		return;
> +	}
> +
> +	xfs_inode_mark_sick(ip, mask);
> +}
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 28e2d1f37267..c1befb899911 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -27,7 +27,7 @@
>  #include "xfs_dquot_item.h"
>  #include "xfs_dquot.h"
>  #include "xfs_reflink.h"
> -
> +#include "xfs_health.h"
>  
>  #define XFS_ALLOC_ALIGN(mp, off) \
>  	(((off) >> mp->m_allocsize_log) << mp->m_allocsize_log)
> @@ -59,8 +59,10 @@ xfs_bmbt_to_iomap(
>  	struct xfs_mount	*mp = ip->i_mount;
>  	struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
>  
> -	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock)))
> +	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock))) {
> +		xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
>  		return xfs_alert_fsblock_zero(ip, imap);
> +	}
>  
>  	if (imap->br_startblock == HOLESTARTBLOCK) {
>  		iomap->addr = IOMAP_NULL_ADDR;
> @@ -277,8 +279,10 @@ xfs_iomap_write_direct(
>  		goto out_unlock;
>  	}
>  
> -	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock)))
> +	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock))) {
> +		xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
>  		error = xfs_alert_fsblock_zero(ip, imap);
> +	}
>  
>  out_unlock:
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> @@ -598,8 +602,10 @@ xfs_iomap_write_unwritten(
>  		if (error)
>  			return error;
>  
> -		if (unlikely(!xfs_valid_startblock(ip, imap.br_startblock)))
> +		if (unlikely(!xfs_valid_startblock(ip, imap.br_startblock))) {
> +			xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
>  			return xfs_alert_fsblock_zero(ip, &imap);
> +		}
>  
>  		if ((numblks_fsb = imap.br_blockcount) == 0) {
>  			/*
> @@ -858,6 +864,7 @@ xfs_buffered_write_iomap_begin(
>  
>  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, XFS_DATA_FORK)) ||
>  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> +		xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
>  		error = -EFSCORRUPTED;
>  		goto out_unlock;
>  	}
> 



* Re: [PATCH 5/9] xfs: report dir/attr block corruption errors to the health system
  2019-11-14 18:19 ` [PATCH 5/9] xfs: report dir/attr " Darrick J. Wong
@ 2019-11-20 16:11   ` Brian Foster
  2019-11-20 16:55     ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Brian Foster @ 2019-11-20 16:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Nov 14, 2019 at 10:19:46AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Whenever we encounter corrupt directory or extended attribute blocks, we
> should report that to the health monitoring system for later reporting.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr_leaf.c   |    5 ++++-
>  fs/xfs/libxfs/xfs_attr_remote.c |   27 ++++++++++++++++-----------
>  fs/xfs/libxfs/xfs_da_btree.c    |   29 ++++++++++++++++++++++++++---
>  fs/xfs/libxfs/xfs_dir2.c        |    5 ++++-
>  fs/xfs/libxfs/xfs_dir2_data.c   |    2 ++
>  fs/xfs/libxfs/xfs_dir2_leaf.c   |    3 +++
>  fs/xfs/libxfs/xfs_dir2_node.c   |    7 +++++++
>  fs/xfs/libxfs/xfs_health.h      |    3 +++
>  fs/xfs/xfs_attr_inactive.c      |    4 ++++
>  fs/xfs/xfs_attr_list.c          |   16 +++++++++++++---
>  fs/xfs/xfs_dir2_readdir.c       |    6 +++++-
>  fs/xfs/xfs_health.c             |   39 +++++++++++++++++++++++++++++++++++++++
>  12 files changed, 126 insertions(+), 20 deletions(-)
> 
> 
...
> diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
> index e424b004e3cb..a17622dadf00 100644
> --- a/fs/xfs/libxfs/xfs_da_btree.c
> +++ b/fs/xfs/libxfs/xfs_da_btree.c
...
> @@ -1589,6 +1593,7 @@ xfs_da3_node_lookup_int(
>  
>  		if (magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC) {
>  			xfs_buf_corruption_error(blk->bp);
> +			xfs_da_mark_sick(args);
>  			return -EFSCORRUPTED;
>  		}
>  
> @@ -1604,6 +1609,7 @@ xfs_da3_node_lookup_int(
>  		/* Tree taller than we can handle; bail out! */
>  		if (nodehdr.level >= XFS_DA_NODE_MAXDEPTH) {
>  			xfs_buf_corruption_error(blk->bp);
> +			xfs_da_mark_sick(args);
>  			return -EFSCORRUPTED;
>  		}
>  
> @@ -1612,6 +1618,7 @@ xfs_da3_node_lookup_int(
>  			expected_level = nodehdr.level - 1;
>  		else if (expected_level != nodehdr.level) {
>  			xfs_buf_corruption_error(blk->bp);
> +			xfs_da_mark_sick(args);
>  			return -EFSCORRUPTED;
>  		} else
>  			expected_level--;
> @@ -1663,12 +1670,16 @@ xfs_da3_node_lookup_int(
>  		}
>  
>  		/* We can't point back to the root. */
> -		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk))
> +		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk)) {
> +			xfs_da_mark_sick(args);
>  			return -EFSCORRUPTED;
> +		}
>  	}
>  
> -	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0))
> +	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0)) {
> +		xfs_da_mark_sick(args);
>  		return -EFSCORRUPTED;
> +	}
>  
>  	/*
>  	 * A leaf block that ends in the hashval that we are interested in
> @@ -1686,6 +1697,7 @@ xfs_da3_node_lookup_int(
>  			args->blkno = blk->blkno;
>  		} else {
>  			ASSERT(0);
> +			xfs_da_mark_sick(args);
>  			return -EFSCORRUPTED;
>  		}

I'm just kind of skimming through the rest for general feedback at this
point given previous comments, but it might be nice to start using exit
labels at some of these places where we're enlarging and duplicating the
error path for particular errors. It's not so much about the code in
these patches, but rather to hopefully ease maintaining these state bits
properly in new code where devs/reviewers might not know much about
scrub state or have it in mind. Short of having some kind of generic
helper to handle corruption state, ISTM that the combination of using
verifiers where possible and common exit labels anywhere else we
generate -EFSCORRUPTED at multiple places within some function could
shrink these patches a bit.
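
To make the exit-label idea concrete, here is a minimal standalone sketch.
Everything named demo_* (and the numeric DEMO_EFSCORRUPTED value) is a
stand-in invented for illustration, not kernel code; the point is only that
each check jumps to one label that both records the sick state and returns
the error, so a new check cannot forget the marking step.

```c
#include <stdio.h>

#define DEMO_EFSCORRUPTED	117	/* stand-in error number for this sketch */
#define DEMO_SICK_DIR		(1u << 0)	/* hypothetical sick flag */

/* Hypothetical cut-down analogue of struct xfs_da_args. */
struct demo_args {
	unsigned int	sick;
};

/* Hypothetical analogue of xfs_da_mark_sick(). */
static void demo_mark_sick(struct demo_args *args)
{
	args->sick |= DEMO_SICK_DIR;
}

/*
 * Several independent corruption checks share one exit label, so the
 * "mark sick + return -EFSCORRUPTED" pair is written exactly once.
 */
static int demo_lookup(struct demo_args *args, int magic_ok, int level,
		       int maxdepth)
{
	if (!magic_ok)
		goto out_corrupt;
	if (level >= maxdepth)
		goto out_corrupt;
	return 0;

out_corrupt:
	demo_mark_sick(args);
	return -DEMO_EFSCORRUPTED;
}
```

Whether this is actually smaller than the open-coded version depends on how
many of the checks also need to log or release a buffer before bailing out.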

Brian

>  		if (((retval == -ENOENT) || (retval == -ENOATTR)) &&
> @@ -2250,8 +2262,10 @@ xfs_da3_swap_lastblock(
>  	error = xfs_bmap_last_before(tp, dp, &lastoff, w);
>  	if (error)
>  		return error;
> -	if (XFS_IS_CORRUPT(mp, lastoff == 0))
> +	if (XFS_IS_CORRUPT(mp, lastoff == 0)) {
> +		xfs_da_mark_sick(args);
>  		return -EFSCORRUPTED;
> +	}
>  	/*
>  	 * Read the last block in the btree space.
>  	 */
> @@ -2300,6 +2314,7 @@ xfs_da3_swap_lastblock(
>  		if (XFS_IS_CORRUPT(mp,
>  				   be32_to_cpu(sib_info->forw) != last_blkno ||
>  				   sib_info->magic != dead_info->magic)) {
> +			xfs_da_mark_sick(args);
>  			error = -EFSCORRUPTED;
>  			goto done;
>  		}
> @@ -2320,6 +2335,7 @@ xfs_da3_swap_lastblock(
>  		if (XFS_IS_CORRUPT(mp,
>  				   be32_to_cpu(sib_info->back) != last_blkno ||
>  				   sib_info->magic != dead_info->magic)) {
> +			xfs_da_mark_sick(args);
>  			error = -EFSCORRUPTED;
>  			goto done;
>  		}
> @@ -2342,6 +2358,7 @@ xfs_da3_swap_lastblock(
>  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
>  		if (XFS_IS_CORRUPT(mp,
>  				   level >= 0 && level != par_hdr.level + 1)) {
> +			xfs_da_mark_sick(args);
>  			error = -EFSCORRUPTED;
>  			goto done;
>  		}
> @@ -2353,6 +2370,7 @@ xfs_da3_swap_lastblock(
>  		     entno++)
>  			continue;
>  		if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) {
> +			xfs_da_mark_sick(args);
>  			error = -EFSCORRUPTED;
>  			goto done;
>  		}
> @@ -2378,6 +2396,7 @@ xfs_da3_swap_lastblock(
>  		xfs_trans_brelse(tp, par_buf);
>  		par_buf = NULL;
>  		if (XFS_IS_CORRUPT(mp, par_blkno == 0)) {
> +			xfs_da_mark_sick(args);
>  			error = -EFSCORRUPTED;
>  			goto done;
>  		}
> @@ -2387,6 +2406,7 @@ xfs_da3_swap_lastblock(
>  		par_node = par_buf->b_addr;
>  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
>  		if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) {
> +			xfs_da_mark_sick(args);
>  			error = -EFSCORRUPTED;
>  			goto done;
>  		}
> @@ -2601,6 +2621,7 @@ xfs_dabuf_map(
>  					irecs[i].br_state);
>  			}
>  		}
> +		xfs_dirattr_mark_sick(dp, whichfork);
>  		error = -EFSCORRUPTED;
>  		goto out;
>  	}
> @@ -2693,6 +2714,8 @@ xfs_da_read_buf(
>  	error = xfs_trans_read_buf_map(dp->i_mount, trans,
>  					dp->i_mount->m_ddev_targp,
>  					mapp, nmap, 0, &bp, ops);
> +	if (xfs_metadata_is_sick(error))
> +		xfs_dirattr_mark_sick(dp, whichfork);
>  	if (error)
>  		goto out_free;
>  
> diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> index 0aa87cbde49e..e1aa411a1b8b 100644
> --- a/fs/xfs/libxfs/xfs_dir2.c
> +++ b/fs/xfs/libxfs/xfs_dir2.c
> @@ -18,6 +18,7 @@
>  #include "xfs_errortag.h"
>  #include "xfs_error.h"
>  #include "xfs_trace.h"
> +#include "xfs_health.h"
>  
>  struct xfs_name xfs_name_dotdot = { (unsigned char *)"..", 2, XFS_DIR3_FT_DIR };
>  
> @@ -608,8 +609,10 @@ xfs_dir2_isblock(
>  	rval = XFS_FSB_TO_B(args->dp->i_mount, last) == args->geo->blksize;
>  	if (XFS_IS_CORRUPT(args->dp->i_mount,
>  			   rval != 0 &&
> -			   args->dp->i_d.di_size != args->geo->blksize))
> +			   args->dp->i_d.di_size != args->geo->blksize)) {
> +		xfs_da_mark_sick(args);
>  		return -EFSCORRUPTED;
> +	}
>  	*vp = rval;
>  	return 0;
>  }
> diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c
> index a6eb71a62b53..80cc9c7ea4e5 100644
> --- a/fs/xfs/libxfs/xfs_dir2_data.c
> +++ b/fs/xfs/libxfs/xfs_dir2_data.c
> @@ -18,6 +18,7 @@
>  #include "xfs_trans.h"
>  #include "xfs_buf_item.h"
>  #include "xfs_log.h"
> +#include "xfs_health.h"
>  
>  static xfs_failaddr_t xfs_dir2_data_freefind_verify(
>  		struct xfs_dir2_data_hdr *hdr, struct xfs_dir2_data_free *bf,
> @@ -1170,6 +1171,7 @@ xfs_dir2_data_use_free(
>  corrupt:
>  	xfs_corruption_error(__func__, XFS_ERRLEVEL_LOW, args->dp->i_mount,
>  			hdr, sizeof(*hdr), __FILE__, __LINE__, fa);
> +	xfs_da_mark_sick(args);
>  	return -EFSCORRUPTED;
>  }
>  
> diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
> index 73edd96ce0ac..32d17420fff3 100644
> --- a/fs/xfs/libxfs/xfs_dir2_leaf.c
> +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
> @@ -19,6 +19,7 @@
>  #include "xfs_trace.h"
>  #include "xfs_trans.h"
>  #include "xfs_buf_item.h"
> +#include "xfs_health.h"
>  
>  /*
>   * Local function declarations.
> @@ -1386,8 +1387,10 @@ xfs_dir2_leaf_removename(
>  	bestsp = xfs_dir2_leaf_bests_p(ltp);
>  	if (be16_to_cpu(bestsp[db]) != oldbest) {
>  		xfs_buf_corruption_error(lbp);
> +		xfs_da_mark_sick(args);
>  		return -EFSCORRUPTED;
>  	}
> +
>  	/*
>  	 * Mark the former data entry unused.
>  	 */
> diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> index 3a8b0625a08b..e0f3ab254a1a 100644
> --- a/fs/xfs/libxfs/xfs_dir2_node.c
> +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> @@ -20,6 +20,7 @@
>  #include "xfs_trans.h"
>  #include "xfs_buf_item.h"
>  #include "xfs_log.h"
> +#include "xfs_health.h"
>  
>  /*
>   * Function declarations.
> @@ -228,6 +229,7 @@ __xfs_dir3_free_read(
>  	if (fa) {
>  		xfs_verifier_error(*bpp, -EFSCORRUPTED, fa);
>  		xfs_trans_brelse(tp, *bpp);
> +		xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -440,6 +442,7 @@ xfs_dir2_leaf_to_node(
>  	if (be32_to_cpu(ltp->bestcount) >
>  				(uint)dp->i_d.di_size / args->geo->blksize) {
>  		xfs_buf_corruption_error(lbp);
> +		xfs_da_mark_sick(args);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -514,6 +517,7 @@ xfs_dir2_leafn_add(
>  	 */
>  	if (index < 0) {
>  		xfs_buf_corruption_error(bp);
> +		xfs_da_mark_sick(args);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -733,6 +737,7 @@ xfs_dir2_leafn_lookup_for_addname(
>  					   cpu_to_be16(NULLDATAOFF))) {
>  				if (curfdb != newfdb)
>  					xfs_trans_brelse(tp, curbp);
> +				xfs_da_mark_sick(args);
>  				return -EFSCORRUPTED;
>  			}
>  			curfdb = newfdb;
> @@ -801,6 +806,7 @@ xfs_dir2_leafn_lookup_for_entry(
>  	xfs_dir3_leaf_check(dp, bp);
>  	if (leafhdr.count <= 0) {
>  		xfs_buf_corruption_error(bp);
> +		xfs_da_mark_sick(args);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -1737,6 +1743,7 @@ xfs_dir2_node_add_datablk(
>  			} else {
>  				xfs_alert(mp, " ... fblk is NULL");
>  			}
> +			xfs_da_mark_sick(args);
>  			return -EFSCORRUPTED;
>  		}
>  
> diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> index 2049419e9555..d9404cd3d09b 100644
> --- a/fs/xfs/libxfs/xfs_health.h
> +++ b/fs/xfs/libxfs/xfs_health.h
> @@ -38,6 +38,7 @@ struct xfs_perag;
>  struct xfs_inode;
>  struct xfs_fsop_geom;
>  struct xfs_btree_cur;
> +struct xfs_da_args;
>  
>  /* Observable health issues for metadata spanning the entire filesystem. */
>  #define XFS_SICK_FS_COUNTERS	(1 << 0)  /* summary counters */
> @@ -141,6 +142,8 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
>  void xfs_health_unmount(struct xfs_mount *mp);
>  void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
>  void xfs_btree_mark_sick(struct xfs_btree_cur *cur);
> +void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork);
> +void xfs_da_mark_sick(struct xfs_da_args *args);
>  
>  /* Now some helpers. */
>  
> diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> index a78c501f6fb1..429a97494ffa 100644
> --- a/fs/xfs/xfs_attr_inactive.c
> +++ b/fs/xfs/xfs_attr_inactive.c
> @@ -23,6 +23,7 @@
>  #include "xfs_quota.h"
>  #include "xfs_dir2.h"
>  #include "xfs_error.h"
> +#include "xfs_health.h"
>  
>  /*
>   * Look at all the extents for this logical region,
> @@ -209,6 +210,7 @@ xfs_attr3_node_inactive(
>  	if (level > XFS_DA_NODE_MAXDEPTH) {
>  		xfs_trans_brelse(*trans, bp);	/* no locks for later trans */
>  		xfs_buf_corruption_error(bp);
> +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
>  		return -EFSCORRUPTED;
>  	}
>  
> @@ -256,6 +258,7 @@ xfs_attr3_node_inactive(
>  			error = xfs_attr3_leaf_inactive(trans, dp, child_bp);
>  			break;
>  		default:
> +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
>  			xfs_buf_corruption_error(child_bp);
>  			xfs_trans_brelse(*trans, child_bp);
>  			error = -EFSCORRUPTED;
> @@ -342,6 +345,7 @@ xfs_attr3_root_inactive(
>  		error = xfs_attr3_leaf_inactive(trans, dp, bp);
>  		break;
>  	default:
> +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
>  		error = -EFSCORRUPTED;
>  		xfs_buf_corruption_error(bp);
>  		xfs_trans_brelse(*trans, bp);
> diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> index 7a099df88a0c..1a2a3d4ce422 100644
> --- a/fs/xfs/xfs_attr_list.c
> +++ b/fs/xfs/xfs_attr_list.c
> @@ -21,6 +21,7 @@
>  #include "xfs_error.h"
>  #include "xfs_trace.h"
>  #include "xfs_dir2.h"
> +#include "xfs_health.h"
>  
>  STATIC int
>  xfs_attr_shortform_compare(const void *a, const void *b)
> @@ -88,8 +89,10 @@ xfs_attr_shortform_list(
>  		for (i = 0, sfe = &sf->list[0]; i < sf->hdr.count; i++) {
>  			if (XFS_IS_CORRUPT(context->dp->i_mount,
>  					   !xfs_attr_namecheck(sfe->nameval,
> -							       sfe->namelen)))
> +							       sfe->namelen))) {
> +				xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
>  				return -EFSCORRUPTED;
> +			}
>  			context->put_listent(context,
>  					     sfe->flags,
>  					     sfe->nameval,
> @@ -131,6 +134,7 @@ xfs_attr_shortform_list(
>  					     context->dp->i_mount, sfe,
>  					     sizeof(*sfe));
>  			kmem_free(sbuf);
> +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
>  			return -EFSCORRUPTED;
>  		}
>  
> @@ -181,6 +185,7 @@ xfs_attr_shortform_list(
>  		if (XFS_IS_CORRUPT(context->dp->i_mount,
>  				   !xfs_attr_namecheck(sbp->name,
>  						       sbp->namelen))) {
> +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
>  			error = -EFSCORRUPTED;
>  			goto out;
>  		}
> @@ -268,8 +273,10 @@ xfs_attr_node_list_lookup(
>  			return 0;
>  
>  		/* We can't point back to the root. */
> -		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0))
> +		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0)) {
> +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
>  			return -EFSCORRUPTED;
> +		}
>  	}
>  
>  	if (expected_level != 0)
> @@ -281,6 +288,7 @@ xfs_attr_node_list_lookup(
>  out_corruptbuf:
>  	xfs_buf_corruption_error(bp);
>  	xfs_trans_brelse(tp, bp);
> +	xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
>  	return -EFSCORRUPTED;
>  }
>  
> @@ -471,8 +479,10 @@ xfs_attr3_leaf_list_int(
>  		}
>  
>  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> -				   !xfs_attr_namecheck(name, namelen)))
> +				   !xfs_attr_namecheck(name, namelen))) {
> +			xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
>  			return -EFSCORRUPTED;
> +		}
>  		context->put_listent(context, entry->flags,
>  					      name, namelen, valuelen);
>  		if (context->seen_enough)
> diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> index 95bc9ef8f5f9..715ded503334 100644
> --- a/fs/xfs/xfs_dir2_readdir.c
> +++ b/fs/xfs/xfs_dir2_readdir.c
> @@ -18,6 +18,7 @@
>  #include "xfs_bmap.h"
>  #include "xfs_trans.h"
>  #include "xfs_error.h"
> +#include "xfs_health.h"
>  
>  /*
>   * Directory file type support functions
> @@ -119,8 +120,10 @@ xfs_dir2_sf_getdents(
>  		ctx->pos = off & 0x7fffffff;
>  		if (XFS_IS_CORRUPT(dp->i_mount,
>  				   !xfs_dir2_namecheck(sfep->name,
> -						       sfep->namelen)))
> +						       sfep->namelen))) {
> +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
>  			return -EFSCORRUPTED;
> +		}
>  		if (!dir_emit(ctx, (char *)sfep->name, sfep->namelen, ino,
>  			    xfs_dir3_get_dtype(mp, filetype)))
>  			return 0;
> @@ -461,6 +464,7 @@ xfs_dir2_leaf_getdents(
>  		if (XFS_IS_CORRUPT(dp->i_mount,
>  				   !xfs_dir2_namecheck(dep->name,
>  						       dep->namelen))) {
> +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
>  			error = -EFSCORRUPTED;
>  			break;
>  		}
> diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> index 1f09027c55ad..c1b6e8fb72ec 100644
> --- a/fs/xfs/xfs_health.c
> +++ b/fs/xfs/xfs_health.c
> @@ -15,6 +15,8 @@
>  #include "xfs_trace.h"
>  #include "xfs_health.h"
>  #include "xfs_btree.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
>  
>  /*
>   * Warn about metadata corruption that we detected but haven't fixed, and
> @@ -517,3 +519,40 @@ xfs_btree_mark_sick(
>  
>  	xfs_agno_mark_sick(cur->bc_mp, cur->bc_private.a.agno, mask);
>  }
> +
> +/*
> + * Record observations of dir/attr btree corruption with the health tracking
> + * system.
> + */
> +void
> +xfs_dirattr_mark_sick(
> +	struct xfs_inode	*ip,
> +	int			whichfork)
> +{
> +	unsigned int		mask;
> +
> +	switch (whichfork) {
> +	case XFS_DATA_FORK:
> +		mask = XFS_SICK_INO_DIR;
> +		break;
> +	case XFS_ATTR_FORK:
> +		mask = XFS_SICK_INO_XATTR;
> +		break;
> +	default:
> +		ASSERT(0);
> +		return;
> +	}
> +
> +	xfs_inode_mark_sick(ip, mask);
> +}
> +
> +/*
> + * Record observations of dir/attr btree corruption with the health tracking
> + * system.
> + */
> +void
> +xfs_da_mark_sick(
> +	struct xfs_da_args	*args)
> +{
> +	xfs_dirattr_mark_sick(args->dp, args->whichfork);
> +}
> 



* Re: [PATCH 1/9] xfs: separate the marking of sick and checked metadata
  2019-11-20 14:20   ` Brian Foster
@ 2019-11-20 16:12     ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-20 16:12 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Nov 20, 2019 at 09:20:35AM -0500, Brian Foster wrote:
> On Thu, Nov 14, 2019 at 10:19:20AM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Split the setting of the sick and checked masks into separate functions
> > as part of preparing to add the ability for regular runtime fs code
> > (i.e. not scrub) to mark metadata structures sick when corruptions are
> > found.  Improve the documentation of libxfs' requirements for helper
> > behavior.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_health.h |   24 ++++++++++++++++++----
> >  fs/xfs/scrub/health.c      |   20 +++++++++++-------
> >  fs/xfs/xfs_health.c        |   49 ++++++++++++++++++++++++++++++++++++++++++++
> >  fs/xfs/xfs_mount.c         |    5 ++++
> >  4 files changed, 85 insertions(+), 13 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > index 272005ac8c88..3657a9cb8490 100644
> > --- a/fs/xfs/libxfs/xfs_health.h
> > +++ b/fs/xfs/libxfs/xfs_health.h
> > @@ -26,9 +26,11 @@
> >   * and the "sick" field tells us if that piece was found to need repairs.
> >   * Therefore we can conclude that for a given sick flag value:
> >   *
> > - *  - checked && sick  => metadata needs repair
> > - *  - checked && !sick => metadata is ok
> > - *  - !checked         => has not been examined since mount
> > + *  - checked && sick   => metadata needs repair
> > + *  - checked && !sick  => metadata is ok
> > + *  - !checked && sick  => errors have been observed during normal operation,
> > + *                         but the metadata has not been checked thoroughly
> > + *  - !checked && !sick => has not been examined since mount
> >   */
> >  
> 
> I don't see this change in the provided repo. Which is the right patch?

Hmm, I guess I need to update the repo again. :/
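
For reference, the four-state reading of the (checked, sick) pair in the
quoted comment can be sketched as a small classifier.  The enum and
function names below are hypothetical and exist only for this example;
the patch itself just stores the two bitmasks.

```c
#include <stdbool.h>

/* Hypothetical labels for the four combinations described in the comment. */
enum demo_health {
	DEMO_UNEXAMINED,	/* !checked && !sick */
	DEMO_SUSPECT,		/* !checked && sick: seen at runtime, not scrubbed */
	DEMO_OK,		/* checked && !sick */
	DEMO_NEEDS_REPAIR,	/* checked && sick */
};

/* Classify one flag given the sick and checked bitmasks. */
static enum demo_health demo_classify(unsigned int sick, unsigned int checked,
				      unsigned int flag)
{
	bool is_sick = (sick & flag) != 0;
	bool is_checked = (checked & flag) != 0;

	if (is_checked)
		return is_sick ? DEMO_NEEDS_REPAIR : DEMO_OK;
	return is_sick ? DEMO_SUSPECT : DEMO_UNEXAMINED;
}
```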

> >  struct xfs_mount;
> > @@ -97,24 +99,38 @@ struct xfs_fsop_geom;
> >  				 XFS_SICK_INO_SYMLINK | \
> >  				 XFS_SICK_INO_PARENT)
> >  
> > -/* These functions must be provided by the xfs implementation. */
> > +/*
> > + * These functions must be provided by the xfs implementation.  Function
> > + * behavior with respect to the first argument should be as follows:
> > + *
> > + * xfs_*_mark_sick:    set the sick flags and do not set checked flags.
> 
> Nit: It's probably not necessary to say that we don't set the checked
> flags here given the comment/function below.

Ok.

--D

> Brian
> 
> > + * xfs_*_mark_checked: set the checked flags.
> > + * xfs_*_mark_healthy: clear the sick flags and set the checked flags.
> > + *
> > + * xfs_*_measure_sickness: return the sick and check status in the provided
> > + * out parameters.
> > + */
> >  
> >  void xfs_fs_mark_sick(struct xfs_mount *mp, unsigned int mask);
> > +void xfs_fs_mark_checked(struct xfs_mount *mp, unsigned int mask);
> >  void xfs_fs_mark_healthy(struct xfs_mount *mp, unsigned int mask);
> >  void xfs_fs_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
> >  		unsigned int *checked);
> >  
> >  void xfs_rt_mark_sick(struct xfs_mount *mp, unsigned int mask);
> > +void xfs_rt_mark_checked(struct xfs_mount *mp, unsigned int mask);
> >  void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
> >  void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
> >  		unsigned int *checked);
> >  
> >  void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
> > +void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
> >  void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
> >  void xfs_ag_measure_sickness(struct xfs_perag *pag, unsigned int *sick,
> >  		unsigned int *checked);
> >  
> >  void xfs_inode_mark_sick(struct xfs_inode *ip, unsigned int mask);
> > +void xfs_inode_mark_checked(struct xfs_inode *ip, unsigned int mask);
> >  void xfs_inode_mark_healthy(struct xfs_inode *ip, unsigned int mask);
> >  void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
> >  		unsigned int *checked);
> > diff --git a/fs/xfs/scrub/health.c b/fs/xfs/scrub/health.c
> > index 83d27cdf579b..a402f9026d5f 100644
> > --- a/fs/xfs/scrub/health.c
> > +++ b/fs/xfs/scrub/health.c
> > @@ -137,30 +137,34 @@ xchk_update_health(
> >  	switch (type_to_health_flag[sc->sm->sm_type].group) {
> >  	case XHG_AG:
> >  		pag = xfs_perag_get(sc->mp, sc->sm->sm_agno);
> > -		if (bad)
> > +		if (bad) {
> >  			xfs_ag_mark_sick(pag, sc->sick_mask);
> > -		else
> > +			xfs_ag_mark_checked(pag, sc->sick_mask);
> > +		} else
> >  			xfs_ag_mark_healthy(pag, sc->sick_mask);
> >  		xfs_perag_put(pag);
> >  		break;
> >  	case XHG_INO:
> >  		if (!sc->ip)
> >  			return;
> > -		if (bad)
> > +		if (bad) {
> >  			xfs_inode_mark_sick(sc->ip, sc->sick_mask);
> > -		else
> > +			xfs_inode_mark_checked(sc->ip, sc->sick_mask);
> > +		} else
> >  			xfs_inode_mark_healthy(sc->ip, sc->sick_mask);
> >  		break;
> >  	case XHG_FS:
> > -		if (bad)
> > +		if (bad) {
> >  			xfs_fs_mark_sick(sc->mp, sc->sick_mask);
> > -		else
> > +			xfs_fs_mark_checked(sc->mp, sc->sick_mask);
> > +		} else
> >  			xfs_fs_mark_healthy(sc->mp, sc->sick_mask);
> >  		break;
> >  	case XHG_RT:
> > -		if (bad)
> > +		if (bad) {
> >  			xfs_rt_mark_sick(sc->mp, sc->sick_mask);
> > -		else
> > +			xfs_rt_mark_checked(sc->mp, sc->sick_mask);
> > +		} else
> >  			xfs_rt_mark_healthy(sc->mp, sc->sick_mask);
> >  		break;
> >  	default:
> > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > index 8e0cb05a7142..860dc70c99e7 100644
> > --- a/fs/xfs/xfs_health.c
> > +++ b/fs/xfs/xfs_health.c
> > @@ -100,6 +100,18 @@ xfs_fs_mark_sick(
> >  
> >  	spin_lock(&mp->m_sb_lock);
> >  	mp->m_fs_sick |= mask;
> > +	spin_unlock(&mp->m_sb_lock);
> > +}
> > +
> > +/* Mark per-fs metadata as having been checked. */
> > +void
> > +xfs_fs_mark_checked(
> > +	struct xfs_mount	*mp,
> > +	unsigned int		mask)
> > +{
> > +	ASSERT(!(mask & ~XFS_SICK_FS_PRIMARY));
> > +
> > +	spin_lock(&mp->m_sb_lock);
> >  	mp->m_fs_checked |= mask;
> >  	spin_unlock(&mp->m_sb_lock);
> >  }
> > @@ -143,6 +155,19 @@ xfs_rt_mark_sick(
> >  
> >  	spin_lock(&mp->m_sb_lock);
> >  	mp->m_rt_sick |= mask;
> > +	spin_unlock(&mp->m_sb_lock);
> > +}
> > +
> > +/* Mark realtime metadata as having been checked. */
> > +void
> > +xfs_rt_mark_checked(
> > +	struct xfs_mount	*mp,
> > +	unsigned int		mask)
> > +{
> > +	ASSERT(!(mask & ~XFS_SICK_RT_PRIMARY));
> > +	trace_xfs_rt_mark_sick(mp, mask);
> > +
> > +	spin_lock(&mp->m_sb_lock);
> >  	mp->m_rt_checked |= mask;
> >  	spin_unlock(&mp->m_sb_lock);
> >  }
> > @@ -186,6 +211,18 @@ xfs_ag_mark_sick(
> >  
> >  	spin_lock(&pag->pag_state_lock);
> >  	pag->pag_sick |= mask;
> > +	spin_unlock(&pag->pag_state_lock);
> > +}
> > +
> > +/* Mark per-ag metadata as having been checked. */
> > +void
> > +xfs_ag_mark_checked(
> > +	struct xfs_perag	*pag,
> > +	unsigned int		mask)
> > +{
> > +	ASSERT(!(mask & ~XFS_SICK_AG_PRIMARY));
> > +
> > +	spin_lock(&pag->pag_state_lock);
> >  	pag->pag_checked |= mask;
> >  	spin_unlock(&pag->pag_state_lock);
> >  }
> > @@ -229,6 +266,18 @@ xfs_inode_mark_sick(
> >  
> >  	spin_lock(&ip->i_flags_lock);
> >  	ip->i_sick |= mask;
> > +	spin_unlock(&ip->i_flags_lock);
> > +}
> > +
> > +/* Mark inode metadata as having been checked. */
> > +void
> > +xfs_inode_mark_checked(
> > +	struct xfs_inode	*ip,
> > +	unsigned int		mask)
> > +{
> > +	ASSERT(!(mask & ~XFS_SICK_INO_PRIMARY));
> > +
> > +	spin_lock(&ip->i_flags_lock);
> >  	ip->i_checked |= mask;
> >  	spin_unlock(&ip->i_flags_lock);
> >  }
> > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> > index fca65109cf24..27aa143d524b 100644
> > --- a/fs/xfs/xfs_mount.c
> > +++ b/fs/xfs/xfs_mount.c
> > @@ -555,8 +555,10 @@ xfs_check_summary_counts(
> >  	if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) &&
> >  	    (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks ||
> >  	     !xfs_verify_icount(mp, mp->m_sb.sb_icount) ||
> > -	     mp->m_sb.sb_ifree > mp->m_sb.sb_icount))
> > +	     mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) {
> >  		xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS);
> > +		xfs_fs_mark_checked(mp, XFS_SICK_FS_COUNTERS);
> > +	}
> >  
> >  	/*
> >  	 * We can safely re-initialise incore superblock counters from the
> > @@ -1322,6 +1324,7 @@ xfs_force_summary_recalc(
> >  		return;
> >  
> >  	xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS);
> > +	xfs_fs_mark_checked(mp, XFS_SICK_FS_COUNTERS);
> >  }
> >  
> >  /*
> > 
> 


* Re: [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system
  2019-11-20 14:20   ` Brian Foster
@ 2019-11-20 16:43     ` Darrick J. Wong
  2019-11-21 13:26       ` Brian Foster
  0 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-20 16:43 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Nov 20, 2019 at 09:20:47AM -0500, Brian Foster wrote:
> On Thu, Nov 14, 2019 at 10:19:26AM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Whenever we encounter a corrupt AG header, we should report that to the
> > health monitoring system for later reporting.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_alloc.c    |    6 ++++++
> >  fs/xfs/libxfs/xfs_health.h   |    6 ++++++
> >  fs/xfs/libxfs/xfs_ialloc.c   |    3 +++
> >  fs/xfs/libxfs/xfs_refcount.c |    5 ++++-
> >  fs/xfs/libxfs/xfs_rmap.c     |    5 ++++-
> >  fs/xfs/libxfs/xfs_sb.c       |    2 ++
> >  fs/xfs/xfs_health.c          |   17 +++++++++++++++++
> >  fs/xfs/xfs_inode.c           |    9 +++++++++
> >  8 files changed, 51 insertions(+), 2 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > index c284e10af491..e75e3ae6c912 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.c
> > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > @@ -26,6 +26,7 @@
> >  #include "xfs_log.h"
> >  #include "xfs_ag_resv.h"
> >  #include "xfs_bmap.h"
> > +#include "xfs_health.h"
> >  
> >  extern kmem_zone_t	*xfs_bmap_free_item_zone;
> >  
> > @@ -699,6 +700,8 @@ xfs_alloc_read_agfl(
> >  			mp, tp, mp->m_ddev_targp,
> >  			XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
> >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_agfl_buf_ops);
> > +	if (xfs_metadata_is_sick(error))
> > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGFL);
> 
> Any reason we couldn't do some of these in verifiers? I'm assuming we'd
> still need calls in various external corruption checks, but at least we
> wouldn't add a requirement to check all future buffer reads, etc.

I thought about that.  It would be wonderful if C had a syntactically
slick method to package a function + execution scope and pass that
through other functions to be called later. :)

For the per-AG stuff it wouldn't be hard to make the verifier functions
derive the AG number and call xfs_agno_mark_sick directly in the
verifier.  For per-inode metadata, we'd have to find a way to pass the
struct xfs_inode pointer to the verifier, which means that we'd have to
add that to struct xfs_buf.

xfs_buf is ~384 bytes so maybe adding another pointer for read context
wouldn't be terrible?  That would add a fair amount of ugly special
casing in the btree code to decide if we have an inode to pass through,
though it would solve the problem of the bmbt verifier not being able to
check the owner field in the btree block header.

OTOH that's 8 bytes of overhead that we can never get rid of even though
we only really need it the first time the buffer gets read in from disk.
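
Roughly what I have in mind, as a standalone sketch rather than real
code: the demo_* structures, the read_ctx field, and the flag value are
all made up for illustration and do not exist in struct xfs_buf today.
The context pointer is only valid while the first read is being
verified and is cleared afterwards, which is exactly the "8 bytes we
only need once" problem.

```c
#include <stddef.h>

#define DEMO_SICK_INO_BMBT	(1u << 0)	/* hypothetical sick flag */

/* Hypothetical cut-down inode: just an inode number and a sick mask. */
struct demo_inode {
	unsigned int		sick;
	unsigned long long	ino;
};

/* Hypothetical cut-down buffer with an extra read-context pointer. */
struct demo_buf {
	unsigned long long	owner;		/* owner from the block header */
	struct demo_inode	*read_ctx;	/* only set during the first read */
};

/* Verifier: with a context it can check the owner and mark the inode. */
static int demo_verify(struct demo_buf *bp)
{
	struct demo_inode *ip = bp->read_ctx;

	if (ip && bp->owner != ip->ino) {
		ip->sick |= DEMO_SICK_INO_BMBT;
		return -1;
	}
	return 0;
}

/* Read path: attach the context, verify, then drop the context again. */
static int demo_read_buf(struct demo_buf *bp, struct demo_inode *ip)
{
	int error;

	bp->read_ctx = ip;
	error = demo_verify(bp);
	bp->read_ctx = NULL;
	return error;
}
```

Callers without an inode pass NULL and the owner check is skipped, which
is where the special casing in the btree code would come from.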

Thoughts?

> >  	if (error)
> >  		return error;
> >  	xfs_buf_set_ref(bp, XFS_AGFL_REF);
> > @@ -722,6 +725,7 @@ xfs_alloc_update_counters(
> >  	if (unlikely(be32_to_cpu(agf->agf_freeblks) >
> >  		     be32_to_cpu(agf->agf_length))) {
> >  		xfs_buf_corruption_error(agbp);
> > +		xfs_ag_mark_sick(pag, XFS_SICK_AG_AGF);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -2952,6 +2956,8 @@ xfs_read_agf(
> >  			mp, tp, mp->m_ddev_targp,
> >  			XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
> >  			XFS_FSS_TO_BB(mp, 1), flags, bpp, &xfs_agf_buf_ops);
> > +	if (xfs_metadata_is_sick(error))
> > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGF);
> >  	if (error)
> >  		return error;
> >  	if (!*bpp)
> > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > index 3657a9cb8490..ce8954a10c66 100644
> > --- a/fs/xfs/libxfs/xfs_health.h
> > +++ b/fs/xfs/libxfs/xfs_health.h
> > @@ -123,6 +123,8 @@ void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
> >  void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
> >  		unsigned int *checked);
> >  
> > +void xfs_agno_mark_sick(struct xfs_mount *mp, xfs_agnumber_t agno,
> > +		unsigned int mask);
> >  void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
> >  void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
> >  void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
> > @@ -203,4 +205,8 @@ void xfs_fsop_geom_health(struct xfs_mount *mp, struct xfs_fsop_geom *geo);
> >  void xfs_ag_geom_health(struct xfs_perag *pag, struct xfs_ag_geometry *ageo);
> >  void xfs_bulkstat_health(struct xfs_inode *ip, struct xfs_bulkstat *bs);
> >  
> > +#define xfs_metadata_is_sick(error) \
> > +	(unlikely((error) == -EFSCORRUPTED || (error) == -EIO || \
> > +		  (error) == -EFSBADCRC))
> 
> Why is -EIO considered sick? My understanding is that once something is
> marked sick, scrub is the only way to clear that state. -EIO can be
> transient, so afaict that means we could mark a persistent in-core state
> based on a transient/resolved issue.

I think it sounds reasonable that if the fs hits a metadata IO error
then the administrator should scrub that data structure to make sure
it's ok, and if so, clear the sick state.

Though I realized just now that if scrub isn't enabled then it's an
unfixable dead end so the EIO check should be gated on
CONFIG_XFS_ONLINE_SCRUB=y.
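
Something along these lines, as an untested standalone sketch.  The
DEMO_* names and error numbers are placeholders for this example; in
the kernel the gate would presumably be
IS_ENABLED(CONFIG_XFS_ONLINE_SCRUB) rather than a local macro.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_EFSCORRUPTED	117	/* stand-in for -EFSCORRUPTED */
#define DEMO_EFSBADCRC		74	/* stand-in for -EFSBADCRC */

/* Stand-in for the scrub config option; build with -DDEMO_ONLINE_SCRUB=0 to disable. */
#ifndef DEMO_ONLINE_SCRUB
#define DEMO_ONLINE_SCRUB	1
#endif

/*
 * Corruption and CRC errors always count as sick.  -EIO only counts
 * when scrub is available to re-check and clear the state later.
 */
static bool demo_metadata_is_sick(int error)
{
	if (error == -DEMO_EFSCORRUPTED || error == -DEMO_EFSBADCRC)
		return true;
#if DEMO_ONLINE_SCRUB
	if (error == -EIO)
		return true;
#endif
	return false;
}
```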

> Along similar lines, what's the expected behavior in the event of any of
> these errors for a kernel that might not support
> CONFIG_XFS_ONLINE_[SCRUB|REPAIR]? Just set the states that are never
> used for anything? If so, that seems Ok I suppose.. but it's a little
> awkward if we'd see the tracepoints and such associated with the state
> changes.

Even if scrub is disabled, the kernel will still set the sick state, and
later the administrator can query the filesystem with xfs_spaceman to
observe that sick state.

In the future, I will also use the per-AG sick states to steer
allocations away from known problematic AGs to try to avoid
unexpected shutdown in the middle of a transaction.

--D

> 
> Brian
> 
> > +
> >  #endif	/* __XFS_HEALTH_H__ */
> > diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> > index 988cde7744e6..c401512a4350 100644
> > --- a/fs/xfs/libxfs/xfs_ialloc.c
> > +++ b/fs/xfs/libxfs/xfs_ialloc.c
> > @@ -27,6 +27,7 @@
> >  #include "xfs_trace.h"
> >  #include "xfs_log.h"
> >  #include "xfs_rmap.h"
> > +#include "xfs_health.h"
> >  
> >  /*
> >   * Lookup a record by ino in the btree given by cur.
> > @@ -2635,6 +2636,8 @@ xfs_read_agi(
> >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> >  			XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
> >  			XFS_FSS_TO_BB(mp, 1), 0, bpp, &xfs_agi_buf_ops);
> > +	if (xfs_metadata_is_sick(error))
> > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> >  	if (error)
> >  		return error;
> >  	if (tp)
> > diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
> > index d7d702ee4d1a..25c87834e42a 100644
> > --- a/fs/xfs/libxfs/xfs_refcount.c
> > +++ b/fs/xfs/libxfs/xfs_refcount.c
> > @@ -22,6 +22,7 @@
> >  #include "xfs_bit.h"
> >  #include "xfs_refcount.h"
> >  #include "xfs_rmap.h"
> > +#include "xfs_health.h"
> >  
> >  /* Allowable refcount adjustment amounts. */
> >  enum xfs_refc_adjust_op {
> > @@ -1177,8 +1178,10 @@ xfs_refcount_finish_one(
> >  				XFS_ALLOC_FLAG_FREEING, &agbp);
> >  		if (error)
> >  			return error;
> > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> >  			return -EFSCORRUPTED;
> > +		}
> >  
> >  		rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, agno);
> >  		if (!rcur) {
> > diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> > index ff9412f113c4..a54a3c129cce 100644
> > --- a/fs/xfs/libxfs/xfs_rmap.c
> > +++ b/fs/xfs/libxfs/xfs_rmap.c
> > @@ -21,6 +21,7 @@
> >  #include "xfs_errortag.h"
> >  #include "xfs_error.h"
> >  #include "xfs_inode.h"
> > +#include "xfs_health.h"
> >  
> >  /*
> >   * Lookup the first record less than or equal to [bno, len, owner, offset]
> > @@ -2400,8 +2401,10 @@ xfs_rmap_finish_one(
> >  		error = xfs_free_extent_fix_freelist(tp, agno, &agbp);
> >  		if (error)
> >  			return error;
> > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> >  			return -EFSCORRUPTED;
> > +		}
> >  
> >  		rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, agno);
> >  		if (!rcur) {
> > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > index 0ac69751fe85..4a923545465d 100644
> > --- a/fs/xfs/libxfs/xfs_sb.c
> > +++ b/fs/xfs/libxfs/xfs_sb.c
> > @@ -1169,6 +1169,8 @@ xfs_sb_read_secondary(
> >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> >  			XFS_AG_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
> >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
> > +	if (xfs_metadata_is_sick(error))
> > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_SB);
> >  	if (error)
> >  		return error;
> >  	xfs_buf_set_ref(bp, XFS_SSB_REF);
> > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > index 860dc70c99e7..36c32b108b39 100644
> > --- a/fs/xfs/xfs_health.c
> > +++ b/fs/xfs/xfs_health.c
> > @@ -200,6 +200,23 @@ xfs_rt_measure_sickness(
> >  	spin_unlock(&mp->m_sb_lock);
> >  }
> >  
> > +/* Mark unhealthy per-ag metadata given a raw AG number. */
> > +void
> > +xfs_agno_mark_sick(
> > +	struct xfs_mount	*mp,
> > +	xfs_agnumber_t		agno,
> > +	unsigned int		mask)
> > +{
> > +	struct xfs_perag	*pag = xfs_perag_get(mp, agno);
> > +
> > +	/* per-ag structure not set up yet? */
> > +	if (!pag)
> > +		return;
> > +
> > +	xfs_ag_mark_sick(pag, mask);
> > +	xfs_perag_put(pag);
> > +}
> > +
> >  /* Mark unhealthy per-ag metadata. */
> >  void
> >  xfs_ag_mark_sick(
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index 401da197f012..a2812cea748d 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -35,6 +35,7 @@
> >  #include "xfs_log.h"
> >  #include "xfs_bmap_btree.h"
> >  #include "xfs_reflink.h"
> > +#include "xfs_health.h"
> >  
> >  kmem_zone_t *xfs_inode_zone;
> >  
> > @@ -787,6 +788,8 @@ xfs_ialloc(
> >  	 */
> >  	if ((pip && ino == pip->i_ino) || !xfs_verify_dir_ino(mp, ino)) {
> >  		xfs_alert(mp, "Allocated a known in-use inode 0x%llx!", ino);
> > +		xfs_agno_mark_sick(mp, XFS_INO_TO_AGNO(mp, ino),
> > +				XFS_SICK_AG_INOBT);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -2137,6 +2140,7 @@ xfs_iunlink_update_bucket(
> >  	 */
> >  	if (old_value == new_agino) {
> >  		xfs_buf_corruption_error(agibp);
> > +		xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGI);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -2203,6 +2207,7 @@ xfs_iunlink_update_inode(
> >  	if (!xfs_verify_agino_or_null(mp, agno, old_value)) {
> >  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
> >  				sizeof(*dip), __this_address);
> > +		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> >  		error = -EFSCORRUPTED;
> >  		goto out;
> >  	}
> > @@ -2217,6 +2222,7 @@ xfs_iunlink_update_inode(
> >  		if (next_agino != NULLAGINO) {
> >  			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
> >  					dip, sizeof(*dip), __this_address);
> > +			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> >  			error = -EFSCORRUPTED;
> >  		}
> >  		goto out;
> > @@ -2271,6 +2277,7 @@ xfs_iunlink(
> >  	if (next_agino == agino ||
> >  	    !xfs_verify_agino_or_null(mp, agno, next_agino)) {
> >  		xfs_buf_corruption_error(agibp);
> > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -2408,6 +2415,7 @@ xfs_iunlink_map_prev(
> >  			XFS_CORRUPTION_ERROR(__func__,
> >  					XFS_ERRLEVEL_LOW, mp,
> >  					*dipp, sizeof(**dipp));
> > +			xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
> >  			error = -EFSCORRUPTED;
> >  			return error;
> >  		}
> > @@ -2454,6 +2462,7 @@ xfs_iunlink_remove(
> >  	if (!xfs_verify_agino(mp, agno, head_agino)) {
> >  		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
> >  				agi, sizeof(*agi));
> > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > 
> 

* Re: [PATCH 5/9] xfs: report dir/attr block corruption errors to the health system
  2019-11-20 16:11   ` Brian Foster
@ 2019-11-20 16:55     ` Darrick J. Wong
  2019-11-21 13:26       ` Brian Foster
  0 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-20 16:55 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Nov 20, 2019 at 11:11:47AM -0500, Brian Foster wrote:
> On Thu, Nov 14, 2019 at 10:19:46AM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Whenever we encounter corrupt directory or extended attribute blocks, we
> > should report that to the health monitoring system for later reporting.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_attr_leaf.c   |    5 ++++-
> >  fs/xfs/libxfs/xfs_attr_remote.c |   27 ++++++++++++++++-----------
> >  fs/xfs/libxfs/xfs_da_btree.c    |   29 ++++++++++++++++++++++++++---
> >  fs/xfs/libxfs/xfs_dir2.c        |    5 ++++-
> >  fs/xfs/libxfs/xfs_dir2_data.c   |    2 ++
> >  fs/xfs/libxfs/xfs_dir2_leaf.c   |    3 +++
> >  fs/xfs/libxfs/xfs_dir2_node.c   |    7 +++++++
> >  fs/xfs/libxfs/xfs_health.h      |    3 +++
> >  fs/xfs/xfs_attr_inactive.c      |    4 ++++
> >  fs/xfs/xfs_attr_list.c          |   16 +++++++++++++---
> >  fs/xfs/xfs_dir2_readdir.c       |    6 +++++-
> >  fs/xfs/xfs_health.c             |   39 +++++++++++++++++++++++++++++++++++++++
> >  12 files changed, 126 insertions(+), 20 deletions(-)
> > 
> > 
> ...
> > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
> > index e424b004e3cb..a17622dadf00 100644
> > --- a/fs/xfs/libxfs/xfs_da_btree.c
> > +++ b/fs/xfs/libxfs/xfs_da_btree.c
> ...
> > @@ -1589,6 +1593,7 @@ xfs_da3_node_lookup_int(
> >  
> >  		if (magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC) {
> >  			xfs_buf_corruption_error(blk->bp);
> > +			xfs_da_mark_sick(args);
> >  			return -EFSCORRUPTED;
> >  		}
> >  
> > @@ -1604,6 +1609,7 @@ xfs_da3_node_lookup_int(
> >  		/* Tree taller than we can handle; bail out! */
> >  		if (nodehdr.level >= XFS_DA_NODE_MAXDEPTH) {
> >  			xfs_buf_corruption_error(blk->bp);
> > +			xfs_da_mark_sick(args);
> >  			return -EFSCORRUPTED;
> >  		}
> >  
> > @@ -1612,6 +1618,7 @@ xfs_da3_node_lookup_int(
> >  			expected_level = nodehdr.level - 1;
> >  		else if (expected_level != nodehdr.level) {
> >  			xfs_buf_corruption_error(blk->bp);
> > +			xfs_da_mark_sick(args);
> >  			return -EFSCORRUPTED;
> >  		} else
> >  			expected_level--;
> > @@ -1663,12 +1670,16 @@ xfs_da3_node_lookup_int(
> >  		}
> >  
> >  		/* We can't point back to the root. */
> > -		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk))
> > +		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk)) {
> > +			xfs_da_mark_sick(args);
> >  			return -EFSCORRUPTED;
> > +		}
> >  	}
> >  
> > -	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0))
> > +	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0)) {
> > +		xfs_da_mark_sick(args);
> >  		return -EFSCORRUPTED;
> > +	}
> >  
> >  	/*
> >  	 * A leaf block that ends in the hashval that we are interested in
> > @@ -1686,6 +1697,7 @@ xfs_da3_node_lookup_int(
> >  			args->blkno = blk->blkno;
> >  		} else {
> >  			ASSERT(0);
> > +			xfs_da_mark_sick(args);
> >  			return -EFSCORRUPTED;
> >  		}
> 
> I'm just kind of skimming through the rest for general feedback at this
> point given previous comments, but it might be nice to start using exit
> labels at some of these places where we're enlarging and duplicating the
> error path for particular errors.

Yeah.  This current iteration is pretty wordy since I used coccinelle to
find all the EFSCORRUPTED clauses and inject the appropriate _mark_sick
call.

> It's not so much about the code in
> these patches, but rather to hopefully ease maintaining these state bits
> properly in new code where devs/reviewers might not know much about
> scrub state or have it in mind. Short of having some kind of generic
> helper to handle corruption state, ISTM that the combination of using
> verifiers where possible and common exit labels anywhere else we
> generate -EFSCORRUPTED at multiple places within some function could
> shrink these patches a bit.

<nod> Eric suggested on IRC that maybe the _mark_sick functions should
return EFSCORRUPTED so that we could at least collapse that to:

if (XFS_IS_CORRUPT(...)) {
	error = xfs_da_mark_sick(...);
	goto barf;
}

However, doing it the wordy way I've done it has the neat effect (IMHO)
that you can find all the places where xfs decides some metadata is
corrupt by grepping for EFSCORRUPTED, and confirm that each place it
does that also has a corresponding _mark_sick call.

I guess you could create a dorky shouty wrapper to maintain that greppy
property:

#define XFS_DA_EFSCORRUPTED(...) \
	(xfs_da_mark_sick(...), -EFSCORRUPTED)

But... that might be stylistically undesirable.  OTOH I guess it
wouldn't be so bad either to do:

	if (XFS_IS_CORRUPT(...)) {
		error = -EFSCORRUPTED;
		goto bad;
	}

	if (XFS_IS_CORRUPT(...)) {
		error = -EFSCORRUPTED;
		goto bad;
	}

	return 0;
bad:
	if (error == -EFSCORRUPTED)
		xfs_da_mark_sick(...);
	return error;

Or using the shouty macro above:

	if (XFS_IS_CORRUPT(...)) {
		error = XFS_DA_EFSCORRUPTED(...);
		goto bad;
	}

	if (XFS_IS_CORRUPT(...)) {
		error = XFS_DA_EFSCORRUPTED(...);
		goto bad;
	}

bad:
	return error;

I'll think about that.  It doesn't sound so bad when coding it up in
this email.
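
For comparison, here is a minimal userspace sketch of the two exit-label
styles above.  Everything prefixed demo_/DEMO_ is a hypothetical stand-in
(the struct, the bit value, the errno constant, and the macro taking an
explicit argument are all assumptions for illustration); it is not the
kernel code, only a way to check that both styles mark the structure
exactly once per corruption and return the same error.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define DEMO_EFSCORRUPTED	117
#define DEMO_SICK_DIR		(1u << 0)

struct demo_args {
	unsigned int	sick;		/* accumulated sickness bits */
	unsigned int	mark_calls;	/* how often we were marked */
};

static void
demo_da_mark_sick(
	struct demo_args	*args)
{
	args->sick |= DEMO_SICK_DIR;
	args->mark_calls++;
}

/* Comma operator: mark first, then evaluate to the error code. */
#define DEMO_DA_EFSCORRUPTED(args) \
	(demo_da_mark_sick(args), -DEMO_EFSCORRUPTED)

/* Style 1: plain error code at each site, a single mark at the label. */
static int
demo_check_label(
	struct demo_args	*args,
	bool			bad_magic,
	bool			bad_level)
{
	int			error = 0;

	if (bad_magic) {
		error = -DEMO_EFSCORRUPTED;
		goto bad;
	}
	if (bad_level) {
		error = -DEMO_EFSCORRUPTED;
		goto bad;
	}
	return 0;
bad:
	if (error == -DEMO_EFSCORRUPTED)
		demo_da_mark_sick(args);
	return error;
}

/* Style 2: the macro marks at each site; the label only returns. */
static int
demo_check_macro(
	struct demo_args	*args,
	bool			bad_magic,
	bool			bad_level)
{
	int			error = 0;

	if (bad_magic) {
		error = DEMO_DA_EFSCORRUPTED(args);
		goto bad;
	}
	if (bad_level) {
		error = DEMO_DA_EFSCORRUPTED(args);
		goto bad;
	}
bad:
	return error;
}
```

The macro variant keeps the property that every grep hit for the
corruption code sits next to its mark call, while the label variant
keeps the call sites shorter at the cost of a test on the exit path.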

--D

> 
> Brian
> 
> >  		if (((retval == -ENOENT) || (retval == -ENOATTR)) &&
> > @@ -2250,8 +2262,10 @@ xfs_da3_swap_lastblock(
> >  	error = xfs_bmap_last_before(tp, dp, &lastoff, w);
> >  	if (error)
> >  		return error;
> > -	if (XFS_IS_CORRUPT(mp, lastoff == 0))
> > +	if (XFS_IS_CORRUPT(mp, lastoff == 0)) {
> > +		xfs_da_mark_sick(args);
> >  		return -EFSCORRUPTED;
> > +	}
> >  	/*
> >  	 * Read the last block in the btree space.
> >  	 */
> > @@ -2300,6 +2314,7 @@ xfs_da3_swap_lastblock(
> >  		if (XFS_IS_CORRUPT(mp,
> >  				   be32_to_cpu(sib_info->forw) != last_blkno ||
> >  				   sib_info->magic != dead_info->magic)) {
> > +			xfs_da_mark_sick(args);
> >  			error = -EFSCORRUPTED;
> >  			goto done;
> >  		}
> > @@ -2320,6 +2335,7 @@ xfs_da3_swap_lastblock(
> >  		if (XFS_IS_CORRUPT(mp,
> >  				   be32_to_cpu(sib_info->back) != last_blkno ||
> >  				   sib_info->magic != dead_info->magic)) {
> > +			xfs_da_mark_sick(args);
> >  			error = -EFSCORRUPTED;
> >  			goto done;
> >  		}
> > @@ -2342,6 +2358,7 @@ xfs_da3_swap_lastblock(
> >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> >  		if (XFS_IS_CORRUPT(mp,
> >  				   level >= 0 && level != par_hdr.level + 1)) {
> > +			xfs_da_mark_sick(args);
> >  			error = -EFSCORRUPTED;
> >  			goto done;
> >  		}
> > @@ -2353,6 +2370,7 @@ xfs_da3_swap_lastblock(
> >  		     entno++)
> >  			continue;
> >  		if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) {
> > +			xfs_da_mark_sick(args);
> >  			error = -EFSCORRUPTED;
> >  			goto done;
> >  		}
> > @@ -2378,6 +2396,7 @@ xfs_da3_swap_lastblock(
> >  		xfs_trans_brelse(tp, par_buf);
> >  		par_buf = NULL;
> >  		if (XFS_IS_CORRUPT(mp, par_blkno == 0)) {
> > +			xfs_da_mark_sick(args);
> >  			error = -EFSCORRUPTED;
> >  			goto done;
> >  		}
> > @@ -2387,6 +2406,7 @@ xfs_da3_swap_lastblock(
> >  		par_node = par_buf->b_addr;
> >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> >  		if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) {
> > +			xfs_da_mark_sick(args);
> >  			error = -EFSCORRUPTED;
> >  			goto done;
> >  		}
> > @@ -2601,6 +2621,7 @@ xfs_dabuf_map(
> >  					irecs[i].br_state);
> >  			}
> >  		}
> > +		xfs_dirattr_mark_sick(dp, whichfork);
> >  		error = -EFSCORRUPTED;
> >  		goto out;
> >  	}
> > @@ -2693,6 +2714,8 @@ xfs_da_read_buf(
> >  	error = xfs_trans_read_buf_map(dp->i_mount, trans,
> >  					dp->i_mount->m_ddev_targp,
> >  					mapp, nmap, 0, &bp, ops);
> > +	if (xfs_metadata_is_sick(error))
> > +		xfs_dirattr_mark_sick(dp, whichfork);
> >  	if (error)
> >  		goto out_free;
> >  
> > diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> > index 0aa87cbde49e..e1aa411a1b8b 100644
> > --- a/fs/xfs/libxfs/xfs_dir2.c
> > +++ b/fs/xfs/libxfs/xfs_dir2.c
> > @@ -18,6 +18,7 @@
> >  #include "xfs_errortag.h"
> >  #include "xfs_error.h"
> >  #include "xfs_trace.h"
> > +#include "xfs_health.h"
> >  
> >  struct xfs_name xfs_name_dotdot = { (unsigned char *)"..", 2, XFS_DIR3_FT_DIR };
> >  
> > @@ -608,8 +609,10 @@ xfs_dir2_isblock(
> >  	rval = XFS_FSB_TO_B(args->dp->i_mount, last) == args->geo->blksize;
> >  	if (XFS_IS_CORRUPT(args->dp->i_mount,
> >  			   rval != 0 &&
> > -			   args->dp->i_d.di_size != args->geo->blksize))
> > +			   args->dp->i_d.di_size != args->geo->blksize)) {
> > +		xfs_da_mark_sick(args);
> >  		return -EFSCORRUPTED;
> > +	}
> >  	*vp = rval;
> >  	return 0;
> >  }
> > diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c
> > index a6eb71a62b53..80cc9c7ea4e5 100644
> > --- a/fs/xfs/libxfs/xfs_dir2_data.c
> > +++ b/fs/xfs/libxfs/xfs_dir2_data.c
> > @@ -18,6 +18,7 @@
> >  #include "xfs_trans.h"
> >  #include "xfs_buf_item.h"
> >  #include "xfs_log.h"
> > +#include "xfs_health.h"
> >  
> >  static xfs_failaddr_t xfs_dir2_data_freefind_verify(
> >  		struct xfs_dir2_data_hdr *hdr, struct xfs_dir2_data_free *bf,
> > @@ -1170,6 +1171,7 @@ xfs_dir2_data_use_free(
> >  corrupt:
> >  	xfs_corruption_error(__func__, XFS_ERRLEVEL_LOW, args->dp->i_mount,
> >  			hdr, sizeof(*hdr), __FILE__, __LINE__, fa);
> > +	xfs_da_mark_sick(args);
> >  	return -EFSCORRUPTED;
> >  }
> >  
> > diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > index 73edd96ce0ac..32d17420fff3 100644
> > --- a/fs/xfs/libxfs/xfs_dir2_leaf.c
> > +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > @@ -19,6 +19,7 @@
> >  #include "xfs_trace.h"
> >  #include "xfs_trans.h"
> >  #include "xfs_buf_item.h"
> > +#include "xfs_health.h"
> >  
> >  /*
> >   * Local function declarations.
> > @@ -1386,8 +1387,10 @@ xfs_dir2_leaf_removename(
> >  	bestsp = xfs_dir2_leaf_bests_p(ltp);
> >  	if (be16_to_cpu(bestsp[db]) != oldbest) {
> >  		xfs_buf_corruption_error(lbp);
> > +		xfs_da_mark_sick(args);
> >  		return -EFSCORRUPTED;
> >  	}
> > +
> >  	/*
> >  	 * Mark the former data entry unused.
> >  	 */
> > diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> > index 3a8b0625a08b..e0f3ab254a1a 100644
> > --- a/fs/xfs/libxfs/xfs_dir2_node.c
> > +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> > @@ -20,6 +20,7 @@
> >  #include "xfs_trans.h"
> >  #include "xfs_buf_item.h"
> >  #include "xfs_log.h"
> > +#include "xfs_health.h"
> >  
> >  /*
> >   * Function declarations.
> > @@ -228,6 +229,7 @@ __xfs_dir3_free_read(
> >  	if (fa) {
> >  		xfs_verifier_error(*bpp, -EFSCORRUPTED, fa);
> >  		xfs_trans_brelse(tp, *bpp);
> > +		xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -440,6 +442,7 @@ xfs_dir2_leaf_to_node(
> >  	if (be32_to_cpu(ltp->bestcount) >
> >  				(uint)dp->i_d.di_size / args->geo->blksize) {
> >  		xfs_buf_corruption_error(lbp);
> > +		xfs_da_mark_sick(args);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -514,6 +517,7 @@ xfs_dir2_leafn_add(
> >  	 */
> >  	if (index < 0) {
> >  		xfs_buf_corruption_error(bp);
> > +		xfs_da_mark_sick(args);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -733,6 +737,7 @@ xfs_dir2_leafn_lookup_for_addname(
> >  					   cpu_to_be16(NULLDATAOFF))) {
> >  				if (curfdb != newfdb)
> >  					xfs_trans_brelse(tp, curbp);
> > +				xfs_da_mark_sick(args);
> >  				return -EFSCORRUPTED;
> >  			}
> >  			curfdb = newfdb;
> > @@ -801,6 +806,7 @@ xfs_dir2_leafn_lookup_for_entry(
> >  	xfs_dir3_leaf_check(dp, bp);
> >  	if (leafhdr.count <= 0) {
> >  		xfs_buf_corruption_error(bp);
> > +		xfs_da_mark_sick(args);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -1737,6 +1743,7 @@ xfs_dir2_node_add_datablk(
> >  			} else {
> >  				xfs_alert(mp, " ... fblk is NULL");
> >  			}
> > +			xfs_da_mark_sick(args);
> >  			return -EFSCORRUPTED;
> >  		}
> >  
> > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > index 2049419e9555..d9404cd3d09b 100644
> > --- a/fs/xfs/libxfs/xfs_health.h
> > +++ b/fs/xfs/libxfs/xfs_health.h
> > @@ -38,6 +38,7 @@ struct xfs_perag;
> >  struct xfs_inode;
> >  struct xfs_fsop_geom;
> >  struct xfs_btree_cur;
> > +struct xfs_da_args;
> >  
> >  /* Observable health issues for metadata spanning the entire filesystem. */
> >  #define XFS_SICK_FS_COUNTERS	(1 << 0)  /* summary counters */
> > @@ -141,6 +142,8 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
> >  void xfs_health_unmount(struct xfs_mount *mp);
> >  void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
> >  void xfs_btree_mark_sick(struct xfs_btree_cur *cur);
> > +void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork);
> > +void xfs_da_mark_sick(struct xfs_da_args *args);
> >  
> >  /* Now some helpers. */
> >  
> > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> > index a78c501f6fb1..429a97494ffa 100644
> > --- a/fs/xfs/xfs_attr_inactive.c
> > +++ b/fs/xfs/xfs_attr_inactive.c
> > @@ -23,6 +23,7 @@
> >  #include "xfs_quota.h"
> >  #include "xfs_dir2.h"
> >  #include "xfs_error.h"
> > +#include "xfs_health.h"
> >  
> >  /*
> >   * Look at all the extents for this logical region,
> > @@ -209,6 +210,7 @@ xfs_attr3_node_inactive(
> >  	if (level > XFS_DA_NODE_MAXDEPTH) {
> >  		xfs_trans_brelse(*trans, bp);	/* no locks for later trans */
> >  		xfs_buf_corruption_error(bp);
> > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -256,6 +258,7 @@ xfs_attr3_node_inactive(
> >  			error = xfs_attr3_leaf_inactive(trans, dp, child_bp);
> >  			break;
> >  		default:
> > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> >  			xfs_buf_corruption_error(child_bp);
> >  			xfs_trans_brelse(*trans, child_bp);
> >  			error = -EFSCORRUPTED;
> > @@ -342,6 +345,7 @@ xfs_attr3_root_inactive(
> >  		error = xfs_attr3_leaf_inactive(trans, dp, bp);
> >  		break;
> >  	default:
> > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> >  		error = -EFSCORRUPTED;
> >  		xfs_buf_corruption_error(bp);
> >  		xfs_trans_brelse(*trans, bp);
> > diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> > index 7a099df88a0c..1a2a3d4ce422 100644
> > --- a/fs/xfs/xfs_attr_list.c
> > +++ b/fs/xfs/xfs_attr_list.c
> > @@ -21,6 +21,7 @@
> >  #include "xfs_error.h"
> >  #include "xfs_trace.h"
> >  #include "xfs_dir2.h"
> > +#include "xfs_health.h"
> >  
> >  STATIC int
> >  xfs_attr_shortform_compare(const void *a, const void *b)
> > @@ -88,8 +89,10 @@ xfs_attr_shortform_list(
> >  		for (i = 0, sfe = &sf->list[0]; i < sf->hdr.count; i++) {
> >  			if (XFS_IS_CORRUPT(context->dp->i_mount,
> >  					   !xfs_attr_namecheck(sfe->nameval,
> > -							       sfe->namelen)))
> > +							       sfe->namelen))) {
> > +				xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> >  				return -EFSCORRUPTED;
> > +			}
> >  			context->put_listent(context,
> >  					     sfe->flags,
> >  					     sfe->nameval,
> > @@ -131,6 +134,7 @@ xfs_attr_shortform_list(
> >  					     context->dp->i_mount, sfe,
> >  					     sizeof(*sfe));
> >  			kmem_free(sbuf);
> > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> >  			return -EFSCORRUPTED;
> >  		}
> >  
> > @@ -181,6 +185,7 @@ xfs_attr_shortform_list(
> >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> >  				   !xfs_attr_namecheck(sbp->name,
> >  						       sbp->namelen))) {
> > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> >  			error = -EFSCORRUPTED;
> >  			goto out;
> >  		}
> > @@ -268,8 +273,10 @@ xfs_attr_node_list_lookup(
> >  			return 0;
> >  
> >  		/* We can't point back to the root. */
> > -		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0))
> > +		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0)) {
> > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> >  			return -EFSCORRUPTED;
> > +		}
> >  	}
> >  
> >  	if (expected_level != 0)
> > @@ -281,6 +288,7 @@ xfs_attr_node_list_lookup(
> >  out_corruptbuf:
> >  	xfs_buf_corruption_error(bp);
> >  	xfs_trans_brelse(tp, bp);
> > +	xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> >  	return -EFSCORRUPTED;
> >  }
> >  
> > @@ -471,8 +479,10 @@ xfs_attr3_leaf_list_int(
> >  		}
> >  
> >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> > -				   !xfs_attr_namecheck(name, namelen)))
> > +				   !xfs_attr_namecheck(name, namelen))) {
> > +			xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
> >  			return -EFSCORRUPTED;
> > +		}
> >  		context->put_listent(context, entry->flags,
> >  					      name, namelen, valuelen);
> >  		if (context->seen_enough)
> > diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> > index 95bc9ef8f5f9..715ded503334 100644
> > --- a/fs/xfs/xfs_dir2_readdir.c
> > +++ b/fs/xfs/xfs_dir2_readdir.c
> > @@ -18,6 +18,7 @@
> >  #include "xfs_bmap.h"
> >  #include "xfs_trans.h"
> >  #include "xfs_error.h"
> > +#include "xfs_health.h"
> >  
> >  /*
> >   * Directory file type support functions
> > @@ -119,8 +120,10 @@ xfs_dir2_sf_getdents(
> >  		ctx->pos = off & 0x7fffffff;
> >  		if (XFS_IS_CORRUPT(dp->i_mount,
> >  				   !xfs_dir2_namecheck(sfep->name,
> > -						       sfep->namelen)))
> > +						       sfep->namelen))) {
> > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> >  			return -EFSCORRUPTED;
> > +		}
> >  		if (!dir_emit(ctx, (char *)sfep->name, sfep->namelen, ino,
> >  			    xfs_dir3_get_dtype(mp, filetype)))
> >  			return 0;
> > @@ -461,6 +464,7 @@ xfs_dir2_leaf_getdents(
> >  		if (XFS_IS_CORRUPT(dp->i_mount,
> >  				   !xfs_dir2_namecheck(dep->name,
> >  						       dep->namelen))) {
> > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> >  			error = -EFSCORRUPTED;
> >  			break;
> >  		}
> > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > index 1f09027c55ad..c1b6e8fb72ec 100644
> > --- a/fs/xfs/xfs_health.c
> > +++ b/fs/xfs/xfs_health.c
> > @@ -15,6 +15,8 @@
> >  #include "xfs_trace.h"
> >  #include "xfs_health.h"
> >  #include "xfs_btree.h"
> > +#include "xfs_da_format.h"
> > +#include "xfs_da_btree.h"
> >  
> >  /*
> >   * Warn about metadata corruption that we detected but haven't fixed, and
> > @@ -517,3 +519,40 @@ xfs_btree_mark_sick(
> >  
> >  	xfs_agno_mark_sick(cur->bc_mp, cur->bc_private.a.agno, mask);
> >  }
> > +
> > +/*
> > + * Record observations of dir/attr btree corruption with the health tracking
> > + * system.
> > + */
> > +void
> > +xfs_dirattr_mark_sick(
> > +	struct xfs_inode	*ip,
> > +	int			whichfork)
> > +{
> > +	unsigned int		mask;
> > +
> > +	switch (whichfork) {
> > +	case XFS_DATA_FORK:
> > +		mask = XFS_SICK_INO_DIR;
> > +		break;
> > +	case XFS_ATTR_FORK:
> > +		mask = XFS_SICK_INO_XATTR;
> > +		break;
> > +	default:
> > +		ASSERT(0);
> > +		return;
> > +	}
> > +
> > +	xfs_inode_mark_sick(ip, mask);
> > +}
> > +
> > +/*
> > + * Record observations of dir/attr btree corruption with the health tracking
> > + * system.
> > + */
> > +void
> > +xfs_da_mark_sick(
> > +	struct xfs_da_args	*args)
> > +{
> > +	xfs_dirattr_mark_sick(args->dp, args->whichfork);
> > +}
> > 
> 

* Re: [PATCH 3/9] xfs: report block map corruption errors to the health tracking system
  2019-11-20 14:21   ` Brian Foster
@ 2019-11-20 16:57     ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-20 16:57 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Nov 20, 2019 at 09:21:19AM -0500, Brian Foster wrote:
> On Thu, Nov 14, 2019 at 10:19:33AM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Whenever we encounter a corrupt block mapping, we should report that to
> > the health monitoring system for later reporting.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_bmap.c   |   39 +++++++++++++++++++++++++++++++++------
> >  fs/xfs/libxfs/xfs_health.h |    1 +
> >  fs/xfs/xfs_health.c        |   26 ++++++++++++++++++++++++++
> >  fs/xfs/xfs_iomap.c         |   15 +++++++++++----
> >  4 files changed, 71 insertions(+), 10 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index 4acc6e37c31d..c4674fb0bfb4 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -35,7 +35,7 @@
> >  #include "xfs_refcount.h"
> >  #include "xfs_icache.h"
> >  #include "xfs_iomap.h"
> > -
> > +#include "xfs_health.h"
> >  
> >  kmem_zone_t		*xfs_bmap_free_item_zone;
> >  
> > @@ -732,6 +732,7 @@ xfs_bmap_extents_to_btree(
> >  	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, 1L);
> >  	abp = xfs_btree_get_bufl(mp, tp, args.fsbno);
> >  	if (XFS_IS_CORRUPT(mp, !abp)) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		error = -EFSCORRUPTED;
> >  		goto out_unreserve_dquot;
> >  	}
> > @@ -1021,6 +1022,7 @@ xfs_bmap_add_attrfork_local(
> >  
> >  	/* should only be called for types that support local format data */
> >  	ASSERT(0);
> > +	xfs_bmap_mark_sick(ip, XFS_ATTR_FORK);
> >  	return -EFSCORRUPTED;
> >  }
> 
> Is it really the attr fork that's corrupt if we get here?
> 
> >  
> > @@ -1090,6 +1092,7 @@ xfs_bmap_add_attrfork(
> >  	if (XFS_IFORK_Q(ip))
> >  		goto trans_cancel;
> >  	if (XFS_IS_CORRUPT(mp, ip->i_d.di_anextents != 0)) {
> > +		xfs_bmap_mark_sick(ip, XFS_ATTR_FORK);
> 
> Similar question here given we haven't added the fork yet. di_anextents
> is at least related I suppose, but it's not clear that
> scrubbing/repairing the attr fork is what needs to happen.

Hm, you're right, it's scrub/inode*.c that deals with anextents and
aformat, so these ought to mark the inode core sick, not the attr fork.
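
One possible shape of that change, as a userspace sketch only: the
demo_ names, bit values, and struct layout below are hypothetical, and
the real follow-up patch may well look different.  The point is just
that a stale attr extent count on an inode with no attr fork is charged
to the inode core bit rather than to the attr fork mapping bit.

```c
/* Hypothetical stand-ins; not the kernel's definitions or bit values. */
#define DEMO_EFSCORRUPTED	117
#define DEMO_SICK_INO_CORE	(1u << 0)	/* inode core */
#define DEMO_SICK_INO_BMBTA	(1u << 2)	/* attr fork mappings */

struct demo_inode {
	unsigned int	sick;		/* accumulated sickness bits */
	unsigned int	anextents;	/* attr fork extent count */
};

static void
demo_inode_mark_sick(
	struct demo_inode	*ip,
	unsigned int		mask)
{
	ip->sick |= mask;
}

/*
 * Precondition check before adding an attr fork: the fork does not
 * exist yet, so a nonzero attr extent count points at the inode core.
 */
static int
demo_add_attrfork_check(
	struct demo_inode	*ip)
{
	if (ip->anextents != 0) {
		demo_inode_mark_sick(ip, DEMO_SICK_INO_CORE);
		return -DEMO_EFSCORRUPTED;
	}
	return 0;
}
```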

> >  		error = -EFSCORRUPTED;
> >  		goto trans_cancel;
> >  	}
> ...
> > @@ -1239,6 +1244,7 @@ xfs_iread_extents(
> >  	if (XFS_IS_CORRUPT(mp,
> >  			   XFS_IFORK_FORMAT(ip, whichfork) !=
> >  			   XFS_DINODE_FMT_BTREE)) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		error = -EFSCORRUPTED;
> >  		goto out;
> >  	}
> > @@ -1254,6 +1260,7 @@ xfs_iread_extents(
> >  
> >  	if (XFS_IS_CORRUPT(mp,
> >  			   ir.loaded != XFS_IFORK_NEXTENTS(ip, whichfork))) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		error = -EFSCORRUPTED;
> >  		goto out;
> >  	}
> > @@ -1262,6 +1269,8 @@ xfs_iread_extents(
> >  	ifp->if_flags |= XFS_IFEXTENTS;
> >  	return 0;
> >  out:
> > +	if (xfs_metadata_is_sick(error))
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  	xfs_iext_destroy(ifp);
> >  	return error;
> >  }
> 
> Duplicate calls in xfs_iread_extents()?

Oops, yeah.
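
To illustrate the duplication, here is a small userspace sketch.  It
assumes that marking is a bitwise OR into a per-inode mask and that
xfs_metadata_is_sick() tests for a corruption-class error code; both
are modelled with hypothetical demo_ stand-ins rather than the real
helpers.  Under those assumptions the second call is harmless but
redundant, so the per-site calls can go once the exit label marks.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; not the kernel's definitions or bit values. */
#define DEMO_EFSCORRUPTED	117
#define DEMO_SICK_BMBTD		(1u << 1)

struct demo_inode {
	unsigned int	sick;		/* accumulated sickness bits */
	unsigned int	mark_calls;	/* how often we were marked */
};

static void
demo_bmap_mark_sick(
	struct demo_inode	*ip)
{
	/* OR-ing in a bit that is already set changes nothing. */
	ip->sick |= DEMO_SICK_BMBTD;
	ip->mark_calls++;
}

/* Assumed behaviour of the error classifier. */
static bool
demo_metadata_is_sick(
	int			error)
{
	return error == -DEMO_EFSCORRUPTED;
}

/* As posted: marked at the check site and again at the exit label. */
static int
demo_iread_extents_posted(
	struct demo_inode	*ip,
	bool			bad_format)
{
	int			error = 0;

	if (bad_format) {
		demo_bmap_mark_sick(ip);
		error = -DEMO_EFSCORRUPTED;
		goto out;
	}
	return 0;
out:
	if (demo_metadata_is_sick(error))
		demo_bmap_mark_sick(ip);
	return error;
}

/* Deduplicated: only the exit label marks. */
static int
demo_iread_extents_dedup(
	struct demo_inode	*ip,
	bool			bad_format)
{
	int			error = 0;

	if (bad_format) {
		error = -DEMO_EFSCORRUPTED;
		goto out;
	}
	return 0;
out:
	if (demo_metadata_is_sick(error))
		demo_bmap_mark_sick(ip);
	return error;
}
```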

> Brian
> 
> > @@ -1344,6 +1353,7 @@ xfs_bmap_last_before(
> >  		break;
> >  	default:
> >  		ASSERT(0);
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -1443,8 +1453,11 @@ xfs_bmap_last_offset(
> >  	if (XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_LOCAL)
> >  		return 0;
> >  
> > -	if (XFS_IS_CORRUPT(ip->i_mount, !xfs_ifork_has_extents(ip, whichfork)))
> > +	if (XFS_IS_CORRUPT(ip->i_mount,
> > +	    !xfs_ifork_has_extents(ip, whichfork))) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> > +	}
> >  
> >  	error = xfs_bmap_last_extent(NULL, ip, whichfork, &rec, &is_empty);
> >  	if (error || is_empty)
> > @@ -3905,6 +3918,7 @@ xfs_bmapi_read(
> >  
> >  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
> >  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -3935,6 +3949,7 @@ xfs_bmapi_read(
> >  		xfs_alert(mp, "%s: inode %llu missing fork %d",
> >  				__func__, ip->i_ino, whichfork);
> >  #endif /* DEBUG */
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -4414,6 +4429,7 @@ xfs_bmapi_write(
> >  
> >  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
> >  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -4621,9 +4637,11 @@ xfs_bmapi_convert_delalloc(
> >  	error = -ENOSPC;
> >  	if (WARN_ON_ONCE(bma.blkno == NULLFSBLOCK))
> >  		goto out_finish;
> > -	error = -EFSCORRUPTED;
> > -	if (WARN_ON_ONCE(!xfs_valid_startblock(ip, bma.got.br_startblock)))
> > +	if (WARN_ON_ONCE(!xfs_valid_startblock(ip, bma.got.br_startblock))) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> > +		error = -EFSCORRUPTED;
> >  		goto out_finish;
> > +	}
> >  
> >  	XFS_STATS_ADD(mp, xs_xstrat_bytes, XFS_FSB_TO_B(mp, bma.length));
> >  	XFS_STATS_INC(mp, xs_xstrat_quick);
> > @@ -4681,6 +4699,7 @@ xfs_bmapi_remap(
> >  
> >  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
> >  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -5319,8 +5338,10 @@ __xfs_bunmapi(
> >  	whichfork = xfs_bmapi_whichfork(flags);
> >  	ASSERT(whichfork != XFS_COW_FORK);
> >  	ifp = XFS_IFORK_PTR(ip, whichfork);
> > -	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)))
> > +	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork))) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> > +	}
> >  	if (XFS_FORCED_SHUTDOWN(mp))
> >  		return -EIO;
> >  
> > @@ -5815,6 +5836,7 @@ xfs_bmap_collapse_extents(
> >  
> >  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
> >  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -5932,6 +5954,7 @@ xfs_bmap_insert_extents(
> >  
> >  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
> >  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -6038,6 +6061,7 @@ xfs_bmap_split_extent_at(
> >  
> >  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
> >  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > @@ -6253,8 +6277,10 @@ xfs_bmap_finish_one(
> >  			XFS_FSB_TO_AGBNO(tp->t_mountp, startblock),
> >  			ip->i_ino, whichfork, startoff, *blockcount, state);
> >  
> > -	if (WARN_ON_ONCE(whichfork != XFS_DATA_FORK))
> > +	if (WARN_ON_ONCE(whichfork != XFS_DATA_FORK)) {
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		return -EFSCORRUPTED;
> > +	}
> >  
> >  	if (XFS_TEST_ERROR(false, tp->t_mountp,
> >  			XFS_ERRTAG_BMAP_FINISH_ONE))
> > @@ -6272,6 +6298,7 @@ xfs_bmap_finish_one(
> >  		break;
> >  	default:
> >  		ASSERT(0);
> > +		xfs_bmap_mark_sick(ip, whichfork);
> >  		error = -EFSCORRUPTED;
> >  	}
> >  
> > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > index ce8954a10c66..25b61180b562 100644
> > --- a/fs/xfs/libxfs/xfs_health.h
> > +++ b/fs/xfs/libxfs/xfs_health.h
> > @@ -138,6 +138,7 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
> >  		unsigned int *checked);
> >  
> >  void xfs_health_unmount(struct xfs_mount *mp);
> > +void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
> >  
> >  /* Now some helpers. */
> >  
> > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > index 36c32b108b39..5e5de5338476 100644
> > --- a/fs/xfs/xfs_health.c
> > +++ b/fs/xfs/xfs_health.c
> > @@ -452,3 +452,29 @@ xfs_bulkstat_health(
> >  			bs->bs_sick |= m->ioctl_mask;
> >  	}
> >  }
> > +
> > +/* Mark a block mapping sick. */
> > +void
> > +xfs_bmap_mark_sick(
> > +	struct xfs_inode	*ip,
> > +	int			whichfork)
> > +{
> > +	unsigned int		mask;
> > +
> > +	switch (whichfork) {
> > +	case XFS_DATA_FORK:
> > +		mask = XFS_SICK_INO_BMBTD;
> > +		break;
> > +	case XFS_ATTR_FORK:
> > +		mask = XFS_SICK_INO_BMBTA;
> > +		break;
> > +	case XFS_COW_FORK:
> > +		mask = XFS_SICK_INO_BMBTC;
> > +		break;
> > +	default:
> > +		ASSERT(0);
> > +		return;
> > +	}
> > +
> > +	xfs_inode_mark_sick(ip, mask);
> > +}
> > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> > index 28e2d1f37267..c1befb899911 100644
> > --- a/fs/xfs/xfs_iomap.c
> > +++ b/fs/xfs/xfs_iomap.c
> > @@ -27,7 +27,7 @@
> >  #include "xfs_dquot_item.h"
> >  #include "xfs_dquot.h"
> >  #include "xfs_reflink.h"
> > -
> > +#include "xfs_health.h"
> >  
> >  #define XFS_ALLOC_ALIGN(mp, off) \
> >  	(((off) >> mp->m_allocsize_log) << mp->m_allocsize_log)
> > @@ -59,8 +59,10 @@ xfs_bmbt_to_iomap(
> >  	struct xfs_mount	*mp = ip->i_mount;
> >  	struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
> >  
> > -	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock)))
> > +	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock))) {
> > +		xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
> >  		return xfs_alert_fsblock_zero(ip, imap);
> > +	}
> >  
> >  	if (imap->br_startblock == HOLESTARTBLOCK) {
> >  		iomap->addr = IOMAP_NULL_ADDR;
> > @@ -277,8 +279,10 @@ xfs_iomap_write_direct(
> >  		goto out_unlock;
> >  	}
> >  
> > -	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock)))
> > +	if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock))) {
> > +		xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
> >  		error = xfs_alert_fsblock_zero(ip, imap);
> > +	}
> >  
> >  out_unlock:
> >  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > @@ -598,8 +602,10 @@ xfs_iomap_write_unwritten(
> >  		if (error)
> >  			return error;
> >  
> > -		if (unlikely(!xfs_valid_startblock(ip, imap.br_startblock)))
> > +		if (unlikely(!xfs_valid_startblock(ip, imap.br_startblock))) {
> > +			xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
> >  			return xfs_alert_fsblock_zero(ip, &imap);
> > +		}
> >  
> >  		if ((numblks_fsb = imap.br_blockcount) == 0) {
> >  			/*
> > @@ -858,6 +864,7 @@ xfs_buffered_write_iomap_begin(
> >  
> >  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, XFS_DATA_FORK)) ||
> >  	    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> > +		xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
> >  		error = -EFSCORRUPTED;
> >  		goto out_unlock;
> >  	}
> > 
> 


* Re: [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system
  2019-11-20 16:43     ` Darrick J. Wong
@ 2019-11-21 13:26       ` Brian Foster
  2019-11-22  0:53         ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Brian Foster @ 2019-11-21 13:26 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, Nov 20, 2019 at 08:43:23AM -0800, Darrick J. Wong wrote:
> On Wed, Nov 20, 2019 at 09:20:47AM -0500, Brian Foster wrote:
> > On Thu, Nov 14, 2019 at 10:19:26AM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Whenever we encounter a corrupt AG header, we should report that to the
> > > health monitoring system for later reporting.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/libxfs/xfs_alloc.c    |    6 ++++++
> > >  fs/xfs/libxfs/xfs_health.h   |    6 ++++++
> > >  fs/xfs/libxfs/xfs_ialloc.c   |    3 +++
> > >  fs/xfs/libxfs/xfs_refcount.c |    5 ++++-
> > >  fs/xfs/libxfs/xfs_rmap.c     |    5 ++++-
> > >  fs/xfs/libxfs/xfs_sb.c       |    2 ++
> > >  fs/xfs/xfs_health.c          |   17 +++++++++++++++++
> > >  fs/xfs/xfs_inode.c           |    9 +++++++++
> > >  8 files changed, 51 insertions(+), 2 deletions(-)
> > > 
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > index c284e10af491..e75e3ae6c912 100644
> > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > @@ -26,6 +26,7 @@
> > >  #include "xfs_log.h"
> > >  #include "xfs_ag_resv.h"
> > >  #include "xfs_bmap.h"
> > > +#include "xfs_health.h"
> > >  
> > >  extern kmem_zone_t	*xfs_bmap_free_item_zone;
> > >  
> > > @@ -699,6 +700,8 @@ xfs_alloc_read_agfl(
> > >  			mp, tp, mp->m_ddev_targp,
> > >  			XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
> > >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_agfl_buf_ops);
> > > +	if (xfs_metadata_is_sick(error))
> > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGFL);
> > 
> > Any reason we couldn't do some of these in verifiers? I'm assuming we'd
> > still need calls in various external corruption checks, but at least we
> > wouldn't add a requirement to check all future buffer reads, etc.
> 
> I thought about that.  It would be wonderful if C had a syntactically
> slick method to package a function + execution scope and pass that
> through other functions to be called later. :)
> 
> For the per-AG stuff it wouldn't be hard to make the verifier functions
> derive the AG number and call xfs_agno_mark_sick directly in the
> verifier.  For per-inode metadata, we'd have to find a way to pass the
> struct xfs_inode pointer to the verifier, which means that we'd have to
> add that to struct xfs_buf.
> 
> xfs_buf is ~384 bytes so maybe adding another pointer for read context
> wouldn't be terrible?  That would add a fair amount of ugly special
> casing in the btree code to decide if we have an inode to pass through,
> though it would solve the problem of the bmbt verifier not being able to
> check the owner field in the btree block header.
> 
> OTOH that's 8 bytes of overhead that we can never get rid of even though
> we only really need it the first time the buffer gets read in from disk.
> 
> Thoughts?
> 

That doesn't seem too unreasonable, but I guess I'd have to think about
it some more. Maybe it's worth defining a private pointer in the buffer
that callers can use to pass specific context to verifiers for health
processing. I suppose such a field could also be conditionally defined
on scrub enabled kernels (at least initially), so the overhead would be
opt-in.

Anyway, I think for this series it might be reasonable to push things
down into verifiers opportunistically where we can do so without any
core mechanism changes. We can follow up with changes to do the rest if
we can come up with something elegant.
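
To make the "private pointer" idea a bit more concrete, here is a minimal
userspace sketch, not kernel code. Every name in it (struct sketch_buf,
b_health_ctx, sketch_agf_verify, and so on) is hypothetical and only models
the shape of the idea: the reader attaches a health context to the buffer,
and the verifier records sickness through it when a check fails.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-AG health state. */
struct sketch_perag {
	unsigned int	sick;		/* bitmask of sick metadata */
};

#define SKETCH_SICK_AG_AGF	(1u << 1)

/* Hypothetical stand-in for struct xfs_buf with an opt-in context pointer. */
struct sketch_buf {
	uint32_t	magic;		/* first field of the on-disk header */
	void		*b_health_ctx;	/* set by the reader; may be NULL */
};

#define SKETCH_AGF_MAGIC	0x58414746u	/* "XAGF" */

/*
 * Read verifier: returns true if the buffer looks sane.  On failure it
 * records sickness through the caller-supplied context, if there is one.
 */
static bool
sketch_agf_verify(
	struct sketch_buf	*bp)
{
	struct sketch_perag	*pag = bp->b_health_ctx;

	if (bp->magic == SKETCH_AGF_MAGIC)
		return true;
	if (pag)
		pag->sick |= SKETCH_SICK_AG_AGF;
	return false;
}
```

With something along these lines, a caller that has no context to offer just
leaves the pointer NULL and the verifier behaves as it does today.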

> > >  	if (error)
> > >  		return error;
> > >  	xfs_buf_set_ref(bp, XFS_AGFL_REF);
> > > @@ -722,6 +725,7 @@ xfs_alloc_update_counters(
> > >  	if (unlikely(be32_to_cpu(agf->agf_freeblks) >
> > >  		     be32_to_cpu(agf->agf_length))) {
> > >  		xfs_buf_corruption_error(agbp);
> > > +		xfs_ag_mark_sick(pag, XFS_SICK_AG_AGF);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > @@ -2952,6 +2956,8 @@ xfs_read_agf(
> > >  			mp, tp, mp->m_ddev_targp,
> > >  			XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
> > >  			XFS_FSS_TO_BB(mp, 1), flags, bpp, &xfs_agf_buf_ops);
> > > +	if (xfs_metadata_is_sick(error))
> > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGF);
> > >  	if (error)
> > >  		return error;
> > >  	if (!*bpp)
> > > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > > index 3657a9cb8490..ce8954a10c66 100644
> > > --- a/fs/xfs/libxfs/xfs_health.h
> > > +++ b/fs/xfs/libxfs/xfs_health.h
> > > @@ -123,6 +123,8 @@ void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
> > >  void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
> > >  		unsigned int *checked);
> > >  
> > > +void xfs_agno_mark_sick(struct xfs_mount *mp, xfs_agnumber_t agno,
> > > +		unsigned int mask);
> > >  void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
> > >  void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
> > >  void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
> > > @@ -203,4 +205,8 @@ void xfs_fsop_geom_health(struct xfs_mount *mp, struct xfs_fsop_geom *geo);
> > >  void xfs_ag_geom_health(struct xfs_perag *pag, struct xfs_ag_geometry *ageo);
> > >  void xfs_bulkstat_health(struct xfs_inode *ip, struct xfs_bulkstat *bs);
> > >  
> > > +#define xfs_metadata_is_sick(error) \
> > > +	(unlikely((error) == -EFSCORRUPTED || (error) == -EIO || \
> > > +		  (error) == -EFSBADCRC))
> > 
> > Why is -EIO considered sick? My understanding is that once something is
> > marked sick, scrub is the only way to clear that state. -EIO can be
> > transient, so afaict that means we could mark a persistent in-core state
> > based on a transient/resolved issue.
> 
> I think it sounds reasonable that if the fs hits a metadata IO error
> then the administrator should scrub that data structure to make sure
> it's ok, and if so, clear the sick state.
> 

I'm not totally convinced... I thought we had configurations where I/O
errors can be reasonably expected and recovered from. For example,
consider the thin provisioning + infinite metadata writeback error retry
mechanism. IIRC, the whole purpose of that was to facilitate the use
case where the thin pool runs out of space, but the admin wants some
window of time to expand and keep the filesystem alive.

I don't necessarily think it's a bad thing to suggest a scrub any time
errors have occurred, but for something like the above where an
environment may have been thoroughly tested and verified through that
particular error->expand sequence, it seems that flagging bits as sick
might be unnecessarily ominous.

> Though I realized just now that if scrub isn't enabled then it's an
> unfixable dead end so the EIO check should be gated on
> CONFIG_XFS_ONLINE_SCRUB=y.
> 

Yeah, that was my initial concern...
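
For illustration, the gating could look roughly like the userspace sketch
below. This is an assumption about the shape of the change, not the posted
patch: SKETCH_ONLINE_SCRUB is a hypothetical stand-in for
CONFIG_XFS_ONLINE_SCRUB=y, sketch_metadata_is_sick is a made-up name, and
the errno mappings are spelled out locally so the fragment stands alone.

```c
#include <errno.h>
#include <stdbool.h>

/* Local definitions so the sketch builds outside the kernel tree. */
#ifndef EUCLEAN
#define EUCLEAN		117	/* Linux value; fallback for other libcs */
#endif
#define EFSCORRUPTED	EUCLEAN
#define EFSBADCRC	EBADMSG

/* Hypothetical knob standing in for CONFIG_XFS_ONLINE_SCRUB=y. */
#ifndef SKETCH_ONLINE_SCRUB
#define SKETCH_ONLINE_SCRUB	1
#endif

static bool
sketch_metadata_is_sick(
	int	error)
{
	/* Corruption and bad CRCs always count as sickness. */
	if (error == -EFSCORRUPTED || error == -EFSBADCRC)
		return true;
	/* Only treat I/O errors as sickness if scrub can clear the state. */
	if (SKETCH_ONLINE_SCRUB && error == -EIO)
		return true;
	return false;
}
```

Building with -DSKETCH_ONLINE_SCRUB=0 leaves -EIO out of the sick set, so a
transient error can't leave behind a state that nothing can clear.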

> > Along similar lines, what's the expected behavior in the event of any of
> > these errors for a kernel that might not support
> > CONFIG_XFS_ONLINE_[SCRUB|REPAIR]? Just set the states that are never
> > used for anything? If so, that seems Ok I suppose.. but it's a little
> > awkward if we'd see the tracepoints and such associated with the state
> > changes.
> 
> Even if scrub is disabled, the kernel will still set the sick state, and
> later the administrator can query the filesystem with xfs_spaceman to
> observe that sick state.
> 

Ok, so it's intended to be a valid health state independent of scrub.
That seems reasonable in principle and can always be used to indicate
offline repair is necessary too.

> In the future, I will also use the per-AG sick states to steer
> allocations away from known problematic AGs to try to avoid
> unexpected shutdown in the middle of a transaction.
> 

Hmm... I'm a little curious about how much we should steer away from
traditional behavior on kernels that might not support scrub. I suppose
I could see arguments for going either way, but this is getting a bit
ahead of this patch anyways. ;)

Brian

> --D
> 
> > 
> > Brian
> > 
> > > +
> > >  #endif	/* __XFS_HEALTH_H__ */
> > > diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> > > index 988cde7744e6..c401512a4350 100644
> > > --- a/fs/xfs/libxfs/xfs_ialloc.c
> > > +++ b/fs/xfs/libxfs/xfs_ialloc.c
> > > @@ -27,6 +27,7 @@
> > >  #include "xfs_trace.h"
> > >  #include "xfs_log.h"
> > >  #include "xfs_rmap.h"
> > > +#include "xfs_health.h"
> > >  
> > >  /*
> > >   * Lookup a record by ino in the btree given by cur.
> > > @@ -2635,6 +2636,8 @@ xfs_read_agi(
> > >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> > >  			XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
> > >  			XFS_FSS_TO_BB(mp, 1), 0, bpp, &xfs_agi_buf_ops);
> > > +	if (xfs_metadata_is_sick(error))
> > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > >  	if (error)
> > >  		return error;
> > >  	if (tp)
> > > diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
> > > index d7d702ee4d1a..25c87834e42a 100644
> > > --- a/fs/xfs/libxfs/xfs_refcount.c
> > > +++ b/fs/xfs/libxfs/xfs_refcount.c
> > > @@ -22,6 +22,7 @@
> > >  #include "xfs_bit.h"
> > >  #include "xfs_refcount.h"
> > >  #include "xfs_rmap.h"
> > > +#include "xfs_health.h"
> > >  
> > >  /* Allowable refcount adjustment amounts. */
> > >  enum xfs_refc_adjust_op {
> > > @@ -1177,8 +1178,10 @@ xfs_refcount_finish_one(
> > >  				XFS_ALLOC_FLAG_FREEING, &agbp);
> > >  		if (error)
> > >  			return error;
> > > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> > >  			return -EFSCORRUPTED;
> > > +		}
> > >  
> > >  		rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, agno);
> > >  		if (!rcur) {
> > > diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> > > index ff9412f113c4..a54a3c129cce 100644
> > > --- a/fs/xfs/libxfs/xfs_rmap.c
> > > +++ b/fs/xfs/libxfs/xfs_rmap.c
> > > @@ -21,6 +21,7 @@
> > >  #include "xfs_errortag.h"
> > >  #include "xfs_error.h"
> > >  #include "xfs_inode.h"
> > > +#include "xfs_health.h"
> > >  
> > >  /*
> > >   * Lookup the first record less than or equal to [bno, len, owner, offset]
> > > @@ -2400,8 +2401,10 @@ xfs_rmap_finish_one(
> > >  		error = xfs_free_extent_fix_freelist(tp, agno, &agbp);
> > >  		if (error)
> > >  			return error;
> > > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> > >  			return -EFSCORRUPTED;
> > > +		}
> > >  
> > >  		rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, agno);
> > >  		if (!rcur) {
> > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > > index 0ac69751fe85..4a923545465d 100644
> > > --- a/fs/xfs/libxfs/xfs_sb.c
> > > +++ b/fs/xfs/libxfs/xfs_sb.c
> > > @@ -1169,6 +1169,8 @@ xfs_sb_read_secondary(
> > >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> > >  			XFS_AG_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
> > >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
> > > +	if (xfs_metadata_is_sick(error))
> > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_SB);
> > >  	if (error)
> > >  		return error;
> > >  	xfs_buf_set_ref(bp, XFS_SSB_REF);
> > > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > > index 860dc70c99e7..36c32b108b39 100644
> > > --- a/fs/xfs/xfs_health.c
> > > +++ b/fs/xfs/xfs_health.c
> > > @@ -200,6 +200,23 @@ xfs_rt_measure_sickness(
> > >  	spin_unlock(&mp->m_sb_lock);
> > >  }
> > >  
> > > +/* Mark unhealthy per-ag metadata given a raw AG number. */
> > > +void
> > > +xfs_agno_mark_sick(
> > > +	struct xfs_mount	*mp,
> > > +	xfs_agnumber_t		agno,
> > > +	unsigned int		mask)
> > > +{
> > > +	struct xfs_perag	*pag = xfs_perag_get(mp, agno);
> > > +
> > > +	/* per-ag structure not set up yet? */
> > > +	if (!pag)
> > > +		return;
> > > +
> > > +	xfs_ag_mark_sick(pag, mask);
> > > +	xfs_perag_put(pag);
> > > +}
> > > +
> > >  /* Mark unhealthy per-ag metadata. */
> > >  void
> > >  xfs_ag_mark_sick(
> > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > index 401da197f012..a2812cea748d 100644
> > > --- a/fs/xfs/xfs_inode.c
> > > +++ b/fs/xfs/xfs_inode.c
> > > @@ -35,6 +35,7 @@
> > >  #include "xfs_log.h"
> > >  #include "xfs_bmap_btree.h"
> > >  #include "xfs_reflink.h"
> > > +#include "xfs_health.h"
> > >  
> > >  kmem_zone_t *xfs_inode_zone;
> > >  
> > > @@ -787,6 +788,8 @@ xfs_ialloc(
> > >  	 */
> > >  	if ((pip && ino == pip->i_ino) || !xfs_verify_dir_ino(mp, ino)) {
> > >  		xfs_alert(mp, "Allocated a known in-use inode 0x%llx!", ino);
> > > +		xfs_agno_mark_sick(mp, XFS_INO_TO_AGNO(mp, ino),
> > > +				XFS_SICK_AG_INOBT);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > @@ -2137,6 +2140,7 @@ xfs_iunlink_update_bucket(
> > >  	 */
> > >  	if (old_value == new_agino) {
> > >  		xfs_buf_corruption_error(agibp);
> > > +		xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGI);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > @@ -2203,6 +2207,7 @@ xfs_iunlink_update_inode(
> > >  	if (!xfs_verify_agino_or_null(mp, agno, old_value)) {
> > >  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
> > >  				sizeof(*dip), __this_address);
> > > +		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> > >  		error = -EFSCORRUPTED;
> > >  		goto out;
> > >  	}
> > > @@ -2217,6 +2222,7 @@ xfs_iunlink_update_inode(
> > >  		if (next_agino != NULLAGINO) {
> > >  			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
> > >  					dip, sizeof(*dip), __this_address);
> > > +			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> > >  			error = -EFSCORRUPTED;
> > >  		}
> > >  		goto out;
> > > @@ -2271,6 +2277,7 @@ xfs_iunlink(
> > >  	if (next_agino == agino ||
> > >  	    !xfs_verify_agino_or_null(mp, agno, next_agino)) {
> > >  		xfs_buf_corruption_error(agibp);
> > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > @@ -2408,6 +2415,7 @@ xfs_iunlink_map_prev(
> > >  			XFS_CORRUPTION_ERROR(__func__,
> > >  					XFS_ERRLEVEL_LOW, mp,
> > >  					*dipp, sizeof(**dipp));
> > > +			xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
> > >  			error = -EFSCORRUPTED;
> > >  			return error;
> > >  		}
> > > @@ -2454,6 +2462,7 @@ xfs_iunlink_remove(
> > >  	if (!xfs_verify_agino(mp, agno, head_agino)) {
> > >  		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
> > >  				agi, sizeof(*agi));
> > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > 
> > 
> 



* Re: [PATCH 5/9] xfs: report dir/attr block corruption errors to the health system
  2019-11-20 16:55     ` Darrick J. Wong
@ 2019-11-21 13:26       ` Brian Foster
  2019-11-22  1:03         ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Brian Foster @ 2019-11-21 13:26 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, Nov 20, 2019 at 08:55:08AM -0800, Darrick J. Wong wrote:
> On Wed, Nov 20, 2019 at 11:11:47AM -0500, Brian Foster wrote:
> > On Thu, Nov 14, 2019 at 10:19:46AM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Whenever we encounter corrupt directory or extended attribute blocks, we
> > > should report that to the health monitoring system for later reporting.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/libxfs/xfs_attr_leaf.c   |    5 ++++-
> > >  fs/xfs/libxfs/xfs_attr_remote.c |   27 ++++++++++++++++-----------
> > >  fs/xfs/libxfs/xfs_da_btree.c    |   29 ++++++++++++++++++++++++++---
> > >  fs/xfs/libxfs/xfs_dir2.c        |    5 ++++-
> > >  fs/xfs/libxfs/xfs_dir2_data.c   |    2 ++
> > >  fs/xfs/libxfs/xfs_dir2_leaf.c   |    3 +++
> > >  fs/xfs/libxfs/xfs_dir2_node.c   |    7 +++++++
> > >  fs/xfs/libxfs/xfs_health.h      |    3 +++
> > >  fs/xfs/xfs_attr_inactive.c      |    4 ++++
> > >  fs/xfs/xfs_attr_list.c          |   16 +++++++++++++---
> > >  fs/xfs/xfs_dir2_readdir.c       |    6 +++++-
> > >  fs/xfs/xfs_health.c             |   39 +++++++++++++++++++++++++++++++++++++++
> > >  12 files changed, 126 insertions(+), 20 deletions(-)
> > > 
> > > 
> > ...
> > > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
> > > index e424b004e3cb..a17622dadf00 100644
> > > --- a/fs/xfs/libxfs/xfs_da_btree.c
> > > +++ b/fs/xfs/libxfs/xfs_da_btree.c
> > ...
> > > @@ -1589,6 +1593,7 @@ xfs_da3_node_lookup_int(
> > >  
> > >  		if (magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC) {
> > >  			xfs_buf_corruption_error(blk->bp);
> > > +			xfs_da_mark_sick(args);
> > >  			return -EFSCORRUPTED;
> > >  		}
> > >  
> > > @@ -1604,6 +1609,7 @@ xfs_da3_node_lookup_int(
> > >  		/* Tree taller than we can handle; bail out! */
> > >  		if (nodehdr.level >= XFS_DA_NODE_MAXDEPTH) {
> > >  			xfs_buf_corruption_error(blk->bp);
> > > +			xfs_da_mark_sick(args);
> > >  			return -EFSCORRUPTED;
> > >  		}
> > >  
> > > @@ -1612,6 +1618,7 @@ xfs_da3_node_lookup_int(
> > >  			expected_level = nodehdr.level - 1;
> > >  		else if (expected_level != nodehdr.level) {
> > >  			xfs_buf_corruption_error(blk->bp);
> > > +			xfs_da_mark_sick(args);
> > >  			return -EFSCORRUPTED;
> > >  		} else
> > >  			expected_level--;
> > > @@ -1663,12 +1670,16 @@ xfs_da3_node_lookup_int(
> > >  		}
> > >  
> > >  		/* We can't point back to the root. */
> > > -		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk))
> > > +		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk)) {
> > > +			xfs_da_mark_sick(args);
> > >  			return -EFSCORRUPTED;
> > > +		}
> > >  	}
> > >  
> > > -	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0))
> > > +	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0)) {
> > > +		xfs_da_mark_sick(args);
> > >  		return -EFSCORRUPTED;
> > > +	}
> > >  
> > >  	/*
> > >  	 * A leaf block that ends in the hashval that we are interested in
> > > @@ -1686,6 +1697,7 @@ xfs_da3_node_lookup_int(
> > >  			args->blkno = blk->blkno;
> > >  		} else {
> > >  			ASSERT(0);
> > > +			xfs_da_mark_sick(args);
> > >  			return -EFSCORRUPTED;
> > >  		}
> > 
> > I'm just kind of skimming through the rest for general feedback at this
> > point given previous comments, but it might be nice to start using exit
> > labels at some of these places where we're enlarging and duplicating the
> > error path for particular errors.
> 
> Yeah.  This current iteration is pretty wordy since I used coccinelle to
> find all the EFSCORRUPTED clauses and inject the appropriate _mark_sick
> call.
> 
> > It's not so much about the code in
> > these patches, but rather to hopefully ease maintaining these state bits
> > properly in new code where devs/reviewers might not know much about
> > scrub state or have it in mind. Short of having some kind of generic
> > helper to handle corruption state, ISTM that the combination of using
> > verifiers where possible and common exit labels anywhere else we
> > generate -EFSCORRUPTED at multiple places within some function could
> > shrink these patches a bit..
> 
> <nod> Eric suggested on IRC that maybe the _mark_sick functions should
> return EFSCORRUPTED so that we could at least collapse that to:
> 
> if (XFS_IS_CORRUPT(...)) {
> 	error = xfs_da_mark_sick(...);
> 	goto barf;
> }
> 
> However, doing it the wordy way I've done it has the neat effects (IMHO)
> that you can find all the places where xfs decides some metadata is
> corrupt by grepping for EFSCORRUPTED, and confirm that each place it
> does that also has a corresponding _mark_sick call.
> 

Yeah, that was actually my thought process in suggesting pushing the
mark_sick() calls down into verifiers as well. It seems a little more
clear (and open to future cleanups) with a strict pattern of setting
sickness in the locations that generate corruption errors. Of course
that likely means some special macro or something like you propose
below, but I didn't want to quite go there until we could put the state
updates in the right places.

> I guess you could create a dorky shouty wrapper to maintain that greppy
> property:
> 
> #define XFS_DA_EFSCORRUPTED(...) \
> 	(xfs_da_mark_sick(...), -EFSCORRUPTED)
> 
> But... that might be stylistically undesirable.  OTOH I guess it
> wouldn't be so bad either to do:
> 
> 	if (XFS_IS_CORRUPT(...)) {
> 		error = -EFSCORRUPTED;
> 		goto bad;
> 	}
> 
> 	if (XFS_IS_CORRUPT(...)) {
> 		error = -EFSCORRUPTED;
> 		goto bad;
> 	}
> 
> 	return 0;
> bad:
> 	if (error == -EFSCORRUPTED)
> 		xfs_da_mark_sick(...);
> 	return error;
> 
> Or using the shouty macro above:
> 
> 	if (XFS_IS_CORRUPT(...)) {
> 		error = XFS_DA_EFSCORRUPTED(...);
> 		goto bad;
> 	}
> 
> 	if (XFS_IS_CORRUPT(...)) {
> 		error = XFS_DA_EFSCORRUPTED(...);
> 		goto bad;
> 	}
> 
> bad:
> 	return error;
> 
> I'll think about that.  It doesn't sound so bad when coding it up in
> this email.
> 

I suppose a macro is nice in that it enforces sickness is updated
wherever -EFSCORRUPTED occurs, or at least can easily be verified by
grepping. I find the separate macros pattern a little confusing, FWIW,
simply because at a glance it looks like a garbled bunch of logic to me.
I.e. I see 'if (IS_CORRUPT()) SOMETHING_CORRUPTED(); ...' and wonder wtf
that is doing, for one. It's also not immediately obvious when we should
use one or not the other, etc. This is getting into bikeshedding
territory though and I don't have much of a better suggestion atm...
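
For comparison, the common exit label variant quoted above can be written
out as a small userspace sketch. All names here (sketch_da_state,
sketch_da_mark_sick, sketch_da_check) are hypothetical and the two checks
are placeholders; the point is only that the sickness update happens in one
place no matter how many checks feed the label.

```c
#include <errno.h>

/* Local definitions so the sketch builds outside the kernel tree. */
#ifndef EUCLEAN
#define EUCLEAN		117	/* Linux value; fallback for other libcs */
#endif
#define EFSCORRUPTED	EUCLEAN

/* Hypothetical stand-in for whatever state _mark_sick would update. */
struct sketch_da_state {
	unsigned int	sick_marks;	/* how many times we were marked sick */
};

static void
sketch_da_mark_sick(
	struct sketch_da_state	*st)
{
	st->sick_marks++;
}

/* Two placeholder checks: magic must be non-zero, level must be < max. */
static int
sketch_da_check(
	struct sketch_da_state	*st,
	unsigned int		magic,
	unsigned int		level,
	unsigned int		max_level)
{
	int			error = 0;

	if (magic == 0) {
		error = -EFSCORRUPTED;
		goto bad;
	}
	if (level >= max_level) {
		error = -EFSCORRUPTED;
		goto bad;
	}
	return 0;
bad:
	/* Single place where corruption turns into a health update. */
	if (error == -EFSCORRUPTED)
		sketch_da_mark_sick(st);
	return error;
}
```

The tradeoff is the one already noted in the thread: this keeps the error
paths short, but a grep for EFSCORRUPTED no longer lands right next to the
matching _mark_sick call.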

Brian

> --D
> 
> > 
> > Brian
> > 
> > >  		if (((retval == -ENOENT) || (retval == -ENOATTR)) &&
> > > @@ -2250,8 +2262,10 @@ xfs_da3_swap_lastblock(
> > >  	error = xfs_bmap_last_before(tp, dp, &lastoff, w);
> > >  	if (error)
> > >  		return error;
> > > -	if (XFS_IS_CORRUPT(mp, lastoff == 0))
> > > +	if (XFS_IS_CORRUPT(mp, lastoff == 0)) {
> > > +		xfs_da_mark_sick(args);
> > >  		return -EFSCORRUPTED;
> > > +	}
> > >  	/*
> > >  	 * Read the last block in the btree space.
> > >  	 */
> > > @@ -2300,6 +2314,7 @@ xfs_da3_swap_lastblock(
> > >  		if (XFS_IS_CORRUPT(mp,
> > >  				   be32_to_cpu(sib_info->forw) != last_blkno ||
> > >  				   sib_info->magic != dead_info->magic)) {
> > > +			xfs_da_mark_sick(args);
> > >  			error = -EFSCORRUPTED;
> > >  			goto done;
> > >  		}
> > > @@ -2320,6 +2335,7 @@ xfs_da3_swap_lastblock(
> > >  		if (XFS_IS_CORRUPT(mp,
> > >  				   be32_to_cpu(sib_info->back) != last_blkno ||
> > >  				   sib_info->magic != dead_info->magic)) {
> > > +			xfs_da_mark_sick(args);
> > >  			error = -EFSCORRUPTED;
> > >  			goto done;
> > >  		}
> > > @@ -2342,6 +2358,7 @@ xfs_da3_swap_lastblock(
> > >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> > >  		if (XFS_IS_CORRUPT(mp,
> > >  				   level >= 0 && level != par_hdr.level + 1)) {
> > > +			xfs_da_mark_sick(args);
> > >  			error = -EFSCORRUPTED;
> > >  			goto done;
> > >  		}
> > > @@ -2353,6 +2370,7 @@ xfs_da3_swap_lastblock(
> > >  		     entno++)
> > >  			continue;
> > >  		if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) {
> > > +			xfs_da_mark_sick(args);
> > >  			error = -EFSCORRUPTED;
> > >  			goto done;
> > >  		}
> > > @@ -2378,6 +2396,7 @@ xfs_da3_swap_lastblock(
> > >  		xfs_trans_brelse(tp, par_buf);
> > >  		par_buf = NULL;
> > >  		if (XFS_IS_CORRUPT(mp, par_blkno == 0)) {
> > > +			xfs_da_mark_sick(args);
> > >  			error = -EFSCORRUPTED;
> > >  			goto done;
> > >  		}
> > > @@ -2387,6 +2406,7 @@ xfs_da3_swap_lastblock(
> > >  		par_node = par_buf->b_addr;
> > >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> > >  		if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) {
> > > +			xfs_da_mark_sick(args);
> > >  			error = -EFSCORRUPTED;
> > >  			goto done;
> > >  		}
> > > @@ -2601,6 +2621,7 @@ xfs_dabuf_map(
> > >  					irecs[i].br_state);
> > >  			}
> > >  		}
> > > +		xfs_dirattr_mark_sick(dp, whichfork);
> > >  		error = -EFSCORRUPTED;
> > >  		goto out;
> > >  	}
> > > @@ -2693,6 +2714,8 @@ xfs_da_read_buf(
> > >  	error = xfs_trans_read_buf_map(dp->i_mount, trans,
> > >  					dp->i_mount->m_ddev_targp,
> > >  					mapp, nmap, 0, &bp, ops);
> > > +	if (xfs_metadata_is_sick(error))
> > > +		xfs_dirattr_mark_sick(dp, whichfork);
> > >  	if (error)
> > >  		goto out_free;
> > >  
> > > diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> > > index 0aa87cbde49e..e1aa411a1b8b 100644
> > > --- a/fs/xfs/libxfs/xfs_dir2.c
> > > +++ b/fs/xfs/libxfs/xfs_dir2.c
> > > @@ -18,6 +18,7 @@
> > >  #include "xfs_errortag.h"
> > >  #include "xfs_error.h"
> > >  #include "xfs_trace.h"
> > > +#include "xfs_health.h"
> > >  
> > >  struct xfs_name xfs_name_dotdot = { (unsigned char *)"..", 2, XFS_DIR3_FT_DIR };
> > >  
> > > @@ -608,8 +609,10 @@ xfs_dir2_isblock(
> > >  	rval = XFS_FSB_TO_B(args->dp->i_mount, last) == args->geo->blksize;
> > >  	if (XFS_IS_CORRUPT(args->dp->i_mount,
> > >  			   rval != 0 &&
> > > -			   args->dp->i_d.di_size != args->geo->blksize))
> > > +			   args->dp->i_d.di_size != args->geo->blksize)) {
> > > +		xfs_da_mark_sick(args);
> > >  		return -EFSCORRUPTED;
> > > +	}
> > >  	*vp = rval;
> > >  	return 0;
> > >  }
> > > diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c
> > > index a6eb71a62b53..80cc9c7ea4e5 100644
> > > --- a/fs/xfs/libxfs/xfs_dir2_data.c
> > > +++ b/fs/xfs/libxfs/xfs_dir2_data.c
> > > @@ -18,6 +18,7 @@
> > >  #include "xfs_trans.h"
> > >  #include "xfs_buf_item.h"
> > >  #include "xfs_log.h"
> > > +#include "xfs_health.h"
> > >  
> > >  static xfs_failaddr_t xfs_dir2_data_freefind_verify(
> > >  		struct xfs_dir2_data_hdr *hdr, struct xfs_dir2_data_free *bf,
> > > @@ -1170,6 +1171,7 @@ xfs_dir2_data_use_free(
> > >  corrupt:
> > >  	xfs_corruption_error(__func__, XFS_ERRLEVEL_LOW, args->dp->i_mount,
> > >  			hdr, sizeof(*hdr), __FILE__, __LINE__, fa);
> > > +	xfs_da_mark_sick(args);
> > >  	return -EFSCORRUPTED;
> > >  }
> > >  
> > > diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > index 73edd96ce0ac..32d17420fff3 100644
> > > --- a/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > @@ -19,6 +19,7 @@
> > >  #include "xfs_trace.h"
> > >  #include "xfs_trans.h"
> > >  #include "xfs_buf_item.h"
> > > +#include "xfs_health.h"
> > >  
> > >  /*
> > >   * Local function declarations.
> > > @@ -1386,8 +1387,10 @@ xfs_dir2_leaf_removename(
> > >  	bestsp = xfs_dir2_leaf_bests_p(ltp);
> > >  	if (be16_to_cpu(bestsp[db]) != oldbest) {
> > >  		xfs_buf_corruption_error(lbp);
> > > +		xfs_da_mark_sick(args);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > > +
> > >  	/*
> > >  	 * Mark the former data entry unused.
> > >  	 */
> > > diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> > > index 3a8b0625a08b..e0f3ab254a1a 100644
> > > --- a/fs/xfs/libxfs/xfs_dir2_node.c
> > > +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> > > @@ -20,6 +20,7 @@
> > >  #include "xfs_trans.h"
> > >  #include "xfs_buf_item.h"
> > >  #include "xfs_log.h"
> > > +#include "xfs_health.h"
> > >  
> > >  /*
> > >   * Function declarations.
> > > @@ -228,6 +229,7 @@ __xfs_dir3_free_read(
> > >  	if (fa) {
> > >  		xfs_verifier_error(*bpp, -EFSCORRUPTED, fa);
> > >  		xfs_trans_brelse(tp, *bpp);
> > > +		xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > @@ -440,6 +442,7 @@ xfs_dir2_leaf_to_node(
> > >  	if (be32_to_cpu(ltp->bestcount) >
> > >  				(uint)dp->i_d.di_size / args->geo->blksize) {
> > >  		xfs_buf_corruption_error(lbp);
> > > +		xfs_da_mark_sick(args);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > @@ -514,6 +517,7 @@ xfs_dir2_leafn_add(
> > >  	 */
> > >  	if (index < 0) {
> > >  		xfs_buf_corruption_error(bp);
> > > +		xfs_da_mark_sick(args);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > @@ -733,6 +737,7 @@ xfs_dir2_leafn_lookup_for_addname(
> > >  					   cpu_to_be16(NULLDATAOFF))) {
> > >  				if (curfdb != newfdb)
> > >  					xfs_trans_brelse(tp, curbp);
> > > +				xfs_da_mark_sick(args);
> > >  				return -EFSCORRUPTED;
> > >  			}
> > >  			curfdb = newfdb;
> > > @@ -801,6 +806,7 @@ xfs_dir2_leafn_lookup_for_entry(
> > >  	xfs_dir3_leaf_check(dp, bp);
> > >  	if (leafhdr.count <= 0) {
> > >  		xfs_buf_corruption_error(bp);
> > > +		xfs_da_mark_sick(args);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > @@ -1737,6 +1743,7 @@ xfs_dir2_node_add_datablk(
> > >  			} else {
> > >  				xfs_alert(mp, " ... fblk is NULL");
> > >  			}
> > > +			xfs_da_mark_sick(args);
> > >  			return -EFSCORRUPTED;
> > >  		}
> > >  
> > > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > > index 2049419e9555..d9404cd3d09b 100644
> > > --- a/fs/xfs/libxfs/xfs_health.h
> > > +++ b/fs/xfs/libxfs/xfs_health.h
> > > @@ -38,6 +38,7 @@ struct xfs_perag;
> > >  struct xfs_inode;
> > >  struct xfs_fsop_geom;
> > >  struct xfs_btree_cur;
> > > +struct xfs_da_args;
> > >  
> > >  /* Observable health issues for metadata spanning the entire filesystem. */
> > >  #define XFS_SICK_FS_COUNTERS	(1 << 0)  /* summary counters */
> > > @@ -141,6 +142,8 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
> > >  void xfs_health_unmount(struct xfs_mount *mp);
> > >  void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
> > >  void xfs_btree_mark_sick(struct xfs_btree_cur *cur);
> > > +void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork);
> > > +void xfs_da_mark_sick(struct xfs_da_args *args);
> > >  
> > >  /* Now some helpers. */
> > >  
> > > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> > > index a78c501f6fb1..429a97494ffa 100644
> > > --- a/fs/xfs/xfs_attr_inactive.c
> > > +++ b/fs/xfs/xfs_attr_inactive.c
> > > @@ -23,6 +23,7 @@
> > >  #include "xfs_quota.h"
> > >  #include "xfs_dir2.h"
> > >  #include "xfs_error.h"
> > > +#include "xfs_health.h"
> > >  
> > >  /*
> > >   * Look at all the extents for this logical region,
> > > @@ -209,6 +210,7 @@ xfs_attr3_node_inactive(
> > >  	if (level > XFS_DA_NODE_MAXDEPTH) {
> > >  		xfs_trans_brelse(*trans, bp);	/* no locks for later trans */
> > >  		xfs_buf_corruption_error(bp);
> > > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > >  		return -EFSCORRUPTED;
> > >  	}
> > >  
> > > @@ -256,6 +258,7 @@ xfs_attr3_node_inactive(
> > >  			error = xfs_attr3_leaf_inactive(trans, dp, child_bp);
> > >  			break;
> > >  		default:
> > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > >  			xfs_buf_corruption_error(child_bp);
> > >  			xfs_trans_brelse(*trans, child_bp);
> > >  			error = -EFSCORRUPTED;
> > > @@ -342,6 +345,7 @@ xfs_attr3_root_inactive(
> > >  		error = xfs_attr3_leaf_inactive(trans, dp, bp);
> > >  		break;
> > >  	default:
> > > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > >  		error = -EFSCORRUPTED;
> > >  		xfs_buf_corruption_error(bp);
> > >  		xfs_trans_brelse(*trans, bp);
> > > diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> > > index 7a099df88a0c..1a2a3d4ce422 100644
> > > --- a/fs/xfs/xfs_attr_list.c
> > > +++ b/fs/xfs/xfs_attr_list.c
> > > @@ -21,6 +21,7 @@
> > >  #include "xfs_error.h"
> > >  #include "xfs_trace.h"
> > >  #include "xfs_dir2.h"
> > > +#include "xfs_health.h"
> > >  
> > >  STATIC int
> > >  xfs_attr_shortform_compare(const void *a, const void *b)
> > > @@ -88,8 +89,10 @@ xfs_attr_shortform_list(
> > >  		for (i = 0, sfe = &sf->list[0]; i < sf->hdr.count; i++) {
> > >  			if (XFS_IS_CORRUPT(context->dp->i_mount,
> > >  					   !xfs_attr_namecheck(sfe->nameval,
> > > -							       sfe->namelen)))
> > > +							       sfe->namelen))) {
> > > +				xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > >  				return -EFSCORRUPTED;
> > > +			}
> > >  			context->put_listent(context,
> > >  					     sfe->flags,
> > >  					     sfe->nameval,
> > > @@ -131,6 +134,7 @@ xfs_attr_shortform_list(
> > >  					     context->dp->i_mount, sfe,
> > >  					     sizeof(*sfe));
> > >  			kmem_free(sbuf);
> > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > >  			return -EFSCORRUPTED;
> > >  		}
> > >  
> > > @@ -181,6 +185,7 @@ xfs_attr_shortform_list(
> > >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> > >  				   !xfs_attr_namecheck(sbp->name,
> > >  						       sbp->namelen))) {
> > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > >  			error = -EFSCORRUPTED;
> > >  			goto out;
> > >  		}
> > > @@ -268,8 +273,10 @@ xfs_attr_node_list_lookup(
> > >  			return 0;
> > >  
> > >  		/* We can't point back to the root. */
> > > -		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0))
> > > +		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0)) {
> > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > >  			return -EFSCORRUPTED;
> > > +		}
> > >  	}
> > >  
> > >  	if (expected_level != 0)
> > > @@ -281,6 +288,7 @@ xfs_attr_node_list_lookup(
> > >  out_corruptbuf:
> > >  	xfs_buf_corruption_error(bp);
> > >  	xfs_trans_brelse(tp, bp);
> > > +	xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > >  	return -EFSCORRUPTED;
> > >  }
> > >  
> > > @@ -471,8 +479,10 @@ xfs_attr3_leaf_list_int(
> > >  		}
> > >  
> > >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > -				   !xfs_attr_namecheck(name, namelen)))
> > > +				   !xfs_attr_namecheck(name, namelen))) {
> > > +			xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
> > >  			return -EFSCORRUPTED;
> > > +		}
> > >  		context->put_listent(context, entry->flags,
> > >  					      name, namelen, valuelen);
> > >  		if (context->seen_enough)
> > > diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> > > index 95bc9ef8f5f9..715ded503334 100644
> > > --- a/fs/xfs/xfs_dir2_readdir.c
> > > +++ b/fs/xfs/xfs_dir2_readdir.c
> > > @@ -18,6 +18,7 @@
> > >  #include "xfs_bmap.h"
> > >  #include "xfs_trans.h"
> > >  #include "xfs_error.h"
> > > +#include "xfs_health.h"
> > >  
> > >  /*
> > >   * Directory file type support functions
> > > @@ -119,8 +120,10 @@ xfs_dir2_sf_getdents(
> > >  		ctx->pos = off & 0x7fffffff;
> > >  		if (XFS_IS_CORRUPT(dp->i_mount,
> > >  				   !xfs_dir2_namecheck(sfep->name,
> > > -						       sfep->namelen)))
> > > +						       sfep->namelen))) {
> > > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > >  			return -EFSCORRUPTED;
> > > +		}
> > >  		if (!dir_emit(ctx, (char *)sfep->name, sfep->namelen, ino,
> > >  			    xfs_dir3_get_dtype(mp, filetype)))
> > >  			return 0;
> > > @@ -461,6 +464,7 @@ xfs_dir2_leaf_getdents(
> > >  		if (XFS_IS_CORRUPT(dp->i_mount,
> > >  				   !xfs_dir2_namecheck(dep->name,
> > >  						       dep->namelen))) {
> > > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > >  			error = -EFSCORRUPTED;
> > >  			break;
> > >  		}
> > > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > > index 1f09027c55ad..c1b6e8fb72ec 100644
> > > --- a/fs/xfs/xfs_health.c
> > > +++ b/fs/xfs/xfs_health.c
> > > @@ -15,6 +15,8 @@
> > >  #include "xfs_trace.h"
> > >  #include "xfs_health.h"
> > >  #include "xfs_btree.h"
> > > +#include "xfs_da_format.h"
> > > +#include "xfs_da_btree.h"
> > >  
> > >  /*
> > >   * Warn about metadata corruption that we detected but haven't fixed, and
> > > @@ -517,3 +519,40 @@ xfs_btree_mark_sick(
> > >  
> > >  	xfs_agno_mark_sick(cur->bc_mp, cur->bc_private.a.agno, mask);
> > >  }
> > > +
> > > +/*
> > > + * Record observations of dir/attr btree corruption with the health tracking
> > > + * system.
> > > + */
> > > +void
> > > +xfs_dirattr_mark_sick(
> > > +	struct xfs_inode	*ip,
> > > +	int			whichfork)
> > > +{
> > > +	unsigned int		mask;
> > > +
> > > +	switch (whichfork) {
> > > +	case XFS_DATA_FORK:
> > > +		mask = XFS_SICK_INO_DIR;
> > > +		break;
> > > +	case XFS_ATTR_FORK:
> > > +		mask = XFS_SICK_INO_XATTR;
> > > +		break;
> > > +	default:
> > > +		ASSERT(0);
> > > +		return;
> > > +	}
> > > +
> > > +	xfs_inode_mark_sick(ip, mask);
> > > +}
> > > +
> > > +/*
> > > + * Record observations of dir/attr btree corruption with the health tracking
> > > + * system.
> > > + */
> > > +void
> > > +xfs_da_mark_sick(
> > > +	struct xfs_da_args	*args)
> > > +{
> > > +	xfs_dirattr_mark_sick(args->dp, args->whichfork);
> > > +}
> > > 
> > 
> 



* Re: [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system
  2019-11-21 13:26       ` Brian Foster
@ 2019-11-22  0:53         ` Darrick J. Wong
  2019-11-22 11:57           ` Brian Foster
  0 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-22  0:53 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Nov 21, 2019 at 08:26:03AM -0500, Brian Foster wrote:
> On Wed, Nov 20, 2019 at 08:43:23AM -0800, Darrick J. Wong wrote:
> > On Wed, Nov 20, 2019 at 09:20:47AM -0500, Brian Foster wrote:
> > > On Thu, Nov 14, 2019 at 10:19:26AM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Whenever we encounter a corrupt AG header, we should report that to the
> > > > health monitoring system for later reporting.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  fs/xfs/libxfs/xfs_alloc.c    |    6 ++++++
> > > >  fs/xfs/libxfs/xfs_health.h   |    6 ++++++
> > > >  fs/xfs/libxfs/xfs_ialloc.c   |    3 +++
> > > >  fs/xfs/libxfs/xfs_refcount.c |    5 ++++-
> > > >  fs/xfs/libxfs/xfs_rmap.c     |    5 ++++-
> > > >  fs/xfs/libxfs/xfs_sb.c       |    2 ++
> > > >  fs/xfs/xfs_health.c          |   17 +++++++++++++++++
> > > >  fs/xfs/xfs_inode.c           |    9 +++++++++
> > > >  8 files changed, 51 insertions(+), 2 deletions(-)
> > > > 
> > > > 
> > > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > > index c284e10af491..e75e3ae6c912 100644
> > > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > > @@ -26,6 +26,7 @@
> > > >  #include "xfs_log.h"
> > > >  #include "xfs_ag_resv.h"
> > > >  #include "xfs_bmap.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  extern kmem_zone_t	*xfs_bmap_free_item_zone;
> > > >  
> > > > @@ -699,6 +700,8 @@ xfs_alloc_read_agfl(
> > > >  			mp, tp, mp->m_ddev_targp,
> > > >  			XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
> > > >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_agfl_buf_ops);
> > > > +	if (xfs_metadata_is_sick(error))
> > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGFL);
> > > 
> > > Any reason we couldn't do some of these in verifiers? I'm assuming we'd
> > > still need calls in various external corruption checks, but at least we
> > > wouldn't add a requirement to check all future buffer reads, etc.
> > 
> > I thought about that.  It would be wonderful if C had a syntactically
> > slick method to package a function + execution scope and pass that
> > through other functions to be called later. :)
> > 
> > For the per-AG stuff it wouldn't be hard to make the verifier functions
> > derive the AG number and call xfs_agno_mark_sick directly in the
> > verifier.  For per-inode metadata, we'd have to find a way to pass the
> > struct xfs_inode pointer to the verifier, which means that we'd have to
> > add that to struct xfs_buf.
> > 
> > xfs_buf is ~384 bytes so maybe adding another pointer for read context
> > wouldn't be terrible?  That would add a fair amount of ugly special
> > casing in the btree code to decide if we have an inode to pass through,
> > though it would solve the problem of the bmbt verifier not being able to
> > check the owner field in the btree block header.
> > 
> > OTOH that's 8 bytes of overhead that we can never get rid of even though
> > we only really need it the first time the buffer gets read in from disk.
> > 
> > Thoughts?
> > 
> 
> That doesn't seem too unreasonable, but I guess I'd have to think about
> it some more. Maybe it's worth defining a private pointer in the buffer
> that callers can use to pass specific context to verifiers for health
> processing. I suppose such a field could also be conditionally defined
> on scrub enabled kernels (at least initially), so the overhead would be
> opt-in.

Looking further into this, what if we did something like the
following:

struct xfs_buf_verify {
	const struct xfs_buf_ops	*ops;
	struct xfs_inode		*ip;
	unsigned int			sick_flags;
	/* whatever else */
};

...then we change the _read_buf and _trans_read_buf functions to take as
the final argument a (struct xfs_buf_verify *).  In the xfs_buf_reverify
cases, we can pass this context straight through to the ->read_verify
function.

To handle the !DONE case where the buffer read completion can happen
asynchronously, we change the b_ops field definition to:

	union {
		struct xfs_buf_ops	*b_ops;
		struct xfs_buf_verify	*b_vctx;
	};

Next we define a new XBF_HAVE_VERIFY_CTX flag that means b_vctx is
active and b_ops is not.  xfs_buf_read_map can set the flag and b_vctx
for any synchronous (!XBF_ASYNC) read because we know the caller will be
asleep waiting for b_iowait and therefore cannot kill the verifier
context structure.  Once we get to xfs_buf_ioend we can set b_ops, drop
the XBF_HAVE_VERIFY_CTX flag, and call ->verify_read.

Now we actually /can/ pass the inode pointer into the verifier, along
with pretty much anything else we can think of.

Does that sound reasonable?  Or totally heinous? :)

> Anyways, I think for this series it might be reasonable to push things
> down into verifiers opportunistically where we can do so without any
> core mechanism changes. We can follow up with changes to do the rest if
> we can come up with something elegant.

Ok.  I think I will try to implement such a beast for 5.6 and then put
this series after it.

> > > >  	if (error)
> > > >  		return error;
> > > >  	xfs_buf_set_ref(bp, XFS_AGFL_REF);
> > > > @@ -722,6 +725,7 @@ xfs_alloc_update_counters(
> > > >  	if (unlikely(be32_to_cpu(agf->agf_freeblks) >
> > > >  		     be32_to_cpu(agf->agf_length))) {
> > > >  		xfs_buf_corruption_error(agbp);
> > > > +		xfs_ag_mark_sick(pag, XFS_SICK_AG_AGF);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > @@ -2952,6 +2956,8 @@ xfs_read_agf(
> > > >  			mp, tp, mp->m_ddev_targp,
> > > >  			XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
> > > >  			XFS_FSS_TO_BB(mp, 1), flags, bpp, &xfs_agf_buf_ops);
> > > > +	if (xfs_metadata_is_sick(error))
> > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGF);
> > > >  	if (error)
> > > >  		return error;
> > > >  	if (!*bpp)
> > > > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > > > index 3657a9cb8490..ce8954a10c66 100644
> > > > --- a/fs/xfs/libxfs/xfs_health.h
> > > > +++ b/fs/xfs/libxfs/xfs_health.h
> > > > @@ -123,6 +123,8 @@ void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
> > > >  void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
> > > >  		unsigned int *checked);
> > > >  
> > > > +void xfs_agno_mark_sick(struct xfs_mount *mp, xfs_agnumber_t agno,
> > > > +		unsigned int mask);
> > > >  void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
> > > >  void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
> > > >  void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
> > > > @@ -203,4 +205,8 @@ void xfs_fsop_geom_health(struct xfs_mount *mp, struct xfs_fsop_geom *geo);
> > > >  void xfs_ag_geom_health(struct xfs_perag *pag, struct xfs_ag_geometry *ageo);
> > > >  void xfs_bulkstat_health(struct xfs_inode *ip, struct xfs_bulkstat *bs);
> > > >  
> > > > +#define xfs_metadata_is_sick(error) \
> > > > +	(unlikely((error) == -EFSCORRUPTED || (error) == -EIO || \
> > > > +		  (error) == -EFSBADCRC))
> > > 
> > > Why is -EIO considered sick? My understanding is that once something is
> > > marked sick, scrub is the only way to clear that state. -EIO can be
> > > transient, so afaict that means we could mark a persistent in-core state
> > > based on a transient/resolved issue.
> > 
> > I think it sounds reasonable that if the fs hits a metadata IO error
> > then the administrator should scrub that data structure to make sure
> > it's ok, and if so, clear the sick state.
> > 
> 
> I'm not totally convinced... I thought we had configurations where I/O
> errors can be reasonably expected and recovered from. For example,
> consider the thin provisioning + infinite metadata writeback error retry
> mechanism. IIRC, the whole purpose of that was to facilitate the use
> case where the thin pool runs out of space, but the admin wants some
> window of time to expand and keep the filesystem alive.

Aha, I just realized that it's not clear from the macro definition that
I was only intending it to be called from the read path.

Though I guess there's always the possibility that the PFY trips over
the PCIE cable in the datacenter and XFS hits an EIO, but the disk will
be fine a moment later when he shoves it back in.  The disk media is
fine, and by that point either we returned a read error to userspace or
the transaction got cancelled and it's too late to do anything anyway.

I'll drop the EIO check for now and we'll see if I get around to
revisiting it.

> I don't necessarily think it's a bad thing to suggest a scrub any time
> errors have occurred, but for something like the above where an
> environment may have been thoroughly tested and verified through that
> particular error->expand sequence, it seems that flagging bits as sick
> might be unnecessarily ominous.

<shrug> Yeah, (sick && !checked) is a weird passive-aggressive state
like that.

> > Though I realized just now that if scrub isn't enabled then it's an
> > unfixable dead end so the EIO check should be gated on
> > CONFIG_XFS_ONLINE_SCRUB=y.
> > 
> 
> Yeah, that was my initial concern..
> 
> > > Along similar lines, what's the expected behavior in the event of any of
> > > these errors for a kernel that might not support
> > > CONFIG_XFS_ONLINE_[SCRUB|REPAIR]? Just set the states that are never
> > > used for anything? If so, that seems Ok I suppose.. but it's a little
> > > awkward if we'd see the tracepoints and such associated with the state
> > > changes.
> > 
> > Even if scrub is disabled, the kernel will still set the sick state, and
> > later the administrator can query the filesystem with xfs_spaceman to
> > observe that sick state.
> > 
> 
> Ok, so it's intended to be a valid health state independent of scrub.
> That seems reasonable in principle and can always be used to indicate
> offline repair is necessary too.

Yes.

> > In the future, I will also use the per-AG sick states to steer
> > allocations away from known problematic AGs to try to avoid
> > unexpected shutdown in the middle of a transaction.
> > 
> 
> Hmm.. I'm a little curious about how much we should steer away from
> traditional behavior on kernels that might not support scrub. I suppose
> I could see arguments for going either way, but this is getting a bit
> ahead of this patch anyways. ;)

Yeah.  I /do/ have prototype patches buried in my dev tree but they are
too ugly not to let all the magic smoke out.  What really happens is
that when we hit a corruption error, we mark the AG as offline.  Then
the sysadmin can run xfs_scrub to fix it (which would set the AG back
online) or I guess we could have a spaceman -x command to force it back
online.

I always build in /some/ kind of manual override somewhere... :)

--D

> Brian
> 
> > --D
> > 
> > > 
> > > Brian
> > > 
> > > > +
> > > >  #endif	/* __XFS_HEALTH_H__ */
> > > > diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> > > > index 988cde7744e6..c401512a4350 100644
> > > > --- a/fs/xfs/libxfs/xfs_ialloc.c
> > > > +++ b/fs/xfs/libxfs/xfs_ialloc.c
> > > > @@ -27,6 +27,7 @@
> > > >  #include "xfs_trace.h"
> > > >  #include "xfs_log.h"
> > > >  #include "xfs_rmap.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  /*
> > > >   * Lookup a record by ino in the btree given by cur.
> > > > @@ -2635,6 +2636,8 @@ xfs_read_agi(
> > > >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> > > >  			XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
> > > >  			XFS_FSS_TO_BB(mp, 1), 0, bpp, &xfs_agi_buf_ops);
> > > > +	if (xfs_metadata_is_sick(error))
> > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > > >  	if (error)
> > > >  		return error;
> > > >  	if (tp)
> > > > diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
> > > > index d7d702ee4d1a..25c87834e42a 100644
> > > > --- a/fs/xfs/libxfs/xfs_refcount.c
> > > > +++ b/fs/xfs/libxfs/xfs_refcount.c
> > > > @@ -22,6 +22,7 @@
> > > >  #include "xfs_bit.h"
> > > >  #include "xfs_refcount.h"
> > > >  #include "xfs_rmap.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  /* Allowable refcount adjustment amounts. */
> > > >  enum xfs_refc_adjust_op {
> > > > @@ -1177,8 +1178,10 @@ xfs_refcount_finish_one(
> > > >  				XFS_ALLOC_FLAG_FREEING, &agbp);
> > > >  		if (error)
> > > >  			return error;
> > > > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > > > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > > > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> > > >  			return -EFSCORRUPTED;
> > > > +		}
> > > >  
> > > >  		rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, agno);
> > > >  		if (!rcur) {
> > > > diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> > > > index ff9412f113c4..a54a3c129cce 100644
> > > > --- a/fs/xfs/libxfs/xfs_rmap.c
> > > > +++ b/fs/xfs/libxfs/xfs_rmap.c
> > > > @@ -21,6 +21,7 @@
> > > >  #include "xfs_errortag.h"
> > > >  #include "xfs_error.h"
> > > >  #include "xfs_inode.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  /*
> > > >   * Lookup the first record less than or equal to [bno, len, owner, offset]
> > > > @@ -2400,8 +2401,10 @@ xfs_rmap_finish_one(
> > > >  		error = xfs_free_extent_fix_freelist(tp, agno, &agbp);
> > > >  		if (error)
> > > >  			return error;
> > > > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > > > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > > > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> > > >  			return -EFSCORRUPTED;
> > > > +		}
> > > >  
> > > >  		rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, agno);
> > > >  		if (!rcur) {
> > > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > > > index 0ac69751fe85..4a923545465d 100644
> > > > --- a/fs/xfs/libxfs/xfs_sb.c
> > > > +++ b/fs/xfs/libxfs/xfs_sb.c
> > > > @@ -1169,6 +1169,8 @@ xfs_sb_read_secondary(
> > > >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> > > >  			XFS_AG_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
> > > >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
> > > > +	if (xfs_metadata_is_sick(error))
> > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_SB);
> > > >  	if (error)
> > > >  		return error;
> > > >  	xfs_buf_set_ref(bp, XFS_SSB_REF);
> > > > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > > > index 860dc70c99e7..36c32b108b39 100644
> > > > --- a/fs/xfs/xfs_health.c
> > > > +++ b/fs/xfs/xfs_health.c
> > > > @@ -200,6 +200,23 @@ xfs_rt_measure_sickness(
> > > >  	spin_unlock(&mp->m_sb_lock);
> > > >  }
> > > >  
> > > > +/* Mark unhealthy per-ag metadata given a raw AG number. */
> > > > +void
> > > > +xfs_agno_mark_sick(
> > > > +	struct xfs_mount	*mp,
> > > > +	xfs_agnumber_t		agno,
> > > > +	unsigned int		mask)
> > > > +{
> > > > +	struct xfs_perag	*pag = xfs_perag_get(mp, agno);
> > > > +
> > > > +	/* per-ag structure not set up yet? */
> > > > +	if (!pag)
> > > > +		return;
> > > > +
> > > > +	xfs_ag_mark_sick(pag, mask);
> > > > +	xfs_perag_put(pag);
> > > > +}
> > > > +
> > > >  /* Mark unhealthy per-ag metadata. */
> > > >  void
> > > >  xfs_ag_mark_sick(
> > > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > > index 401da197f012..a2812cea748d 100644
> > > > --- a/fs/xfs/xfs_inode.c
> > > > +++ b/fs/xfs/xfs_inode.c
> > > > @@ -35,6 +35,7 @@
> > > >  #include "xfs_log.h"
> > > >  #include "xfs_bmap_btree.h"
> > > >  #include "xfs_reflink.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  kmem_zone_t *xfs_inode_zone;
> > > >  
> > > > @@ -787,6 +788,8 @@ xfs_ialloc(
> > > >  	 */
> > > >  	if ((pip && ino == pip->i_ino) || !xfs_verify_dir_ino(mp, ino)) {
> > > >  		xfs_alert(mp, "Allocated a known in-use inode 0x%llx!", ino);
> > > > +		xfs_agno_mark_sick(mp, XFS_INO_TO_AGNO(mp, ino),
> > > > +				XFS_SICK_AG_INOBT);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > @@ -2137,6 +2140,7 @@ xfs_iunlink_update_bucket(
> > > >  	 */
> > > >  	if (old_value == new_agino) {
> > > >  		xfs_buf_corruption_error(agibp);
> > > > +		xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGI);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > @@ -2203,6 +2207,7 @@ xfs_iunlink_update_inode(
> > > >  	if (!xfs_verify_agino_or_null(mp, agno, old_value)) {
> > > >  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
> > > >  				sizeof(*dip), __this_address);
> > > > +		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> > > >  		error = -EFSCORRUPTED;
> > > >  		goto out;
> > > >  	}
> > > > @@ -2217,6 +2222,7 @@ xfs_iunlink_update_inode(
> > > >  		if (next_agino != NULLAGINO) {
> > > >  			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
> > > >  					dip, sizeof(*dip), __this_address);
> > > > +			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> > > >  			error = -EFSCORRUPTED;
> > > >  		}
> > > >  		goto out;
> > > > @@ -2271,6 +2277,7 @@ xfs_iunlink(
> > > >  	if (next_agino == agino ||
> > > >  	    !xfs_verify_agino_or_null(mp, agno, next_agino)) {
> > > >  		xfs_buf_corruption_error(agibp);
> > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > @@ -2408,6 +2415,7 @@ xfs_iunlink_map_prev(
> > > >  			XFS_CORRUPTION_ERROR(__func__,
> > > >  					XFS_ERRLEVEL_LOW, mp,
> > > >  					*dipp, sizeof(**dipp));
> > > > +			xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
> > > >  			error = -EFSCORRUPTED;
> > > >  			return error;
> > > >  		}
> > > > @@ -2454,6 +2462,7 @@ xfs_iunlink_remove(
> > > >  	if (!xfs_verify_agino(mp, agno, head_agino)) {
> > > >  		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
> > > >  				agi, sizeof(*agi));
> > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > 
> > > 
> > 
> 


* Re: [PATCH 5/9] xfs: report dir/attr block corruption errors to the health system
  2019-11-21 13:26       ` Brian Foster
@ 2019-11-22  1:03         ` Darrick J. Wong
  2019-11-22 12:28           ` Brian Foster
  0 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-22  1:03 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Nov 21, 2019 at 08:26:27AM -0500, Brian Foster wrote:
> On Wed, Nov 20, 2019 at 08:55:08AM -0800, Darrick J. Wong wrote:
> > On Wed, Nov 20, 2019 at 11:11:47AM -0500, Brian Foster wrote:
> > > On Thu, Nov 14, 2019 at 10:19:46AM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Whenever we encounter corrupt directory or extended attribute blocks, we
> > > > should report that to the health monitoring system for later reporting.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  fs/xfs/libxfs/xfs_attr_leaf.c   |    5 ++++-
> > > >  fs/xfs/libxfs/xfs_attr_remote.c |   27 ++++++++++++++++-----------
> > > >  fs/xfs/libxfs/xfs_da_btree.c    |   29 ++++++++++++++++++++++++++---
> > > >  fs/xfs/libxfs/xfs_dir2.c        |    5 ++++-
> > > >  fs/xfs/libxfs/xfs_dir2_data.c   |    2 ++
> > > >  fs/xfs/libxfs/xfs_dir2_leaf.c   |    3 +++
> > > >  fs/xfs/libxfs/xfs_dir2_node.c   |    7 +++++++
> > > >  fs/xfs/libxfs/xfs_health.h      |    3 +++
> > > >  fs/xfs/xfs_attr_inactive.c      |    4 ++++
> > > >  fs/xfs/xfs_attr_list.c          |   16 +++++++++++++---
> > > >  fs/xfs/xfs_dir2_readdir.c       |    6 +++++-
> > > >  fs/xfs/xfs_health.c             |   39 +++++++++++++++++++++++++++++++++++++++
> > > >  12 files changed, 126 insertions(+), 20 deletions(-)
> > > > 
> > > > 
> > > ...
> > > > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
> > > > index e424b004e3cb..a17622dadf00 100644
> > > > --- a/fs/xfs/libxfs/xfs_da_btree.c
> > > > +++ b/fs/xfs/libxfs/xfs_da_btree.c
> > > ...
> > > > @@ -1589,6 +1593,7 @@ xfs_da3_node_lookup_int(
> > > >  
> > > >  		if (magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC) {
> > > >  			xfs_buf_corruption_error(blk->bp);
> > > > +			xfs_da_mark_sick(args);
> > > >  			return -EFSCORRUPTED;
> > > >  		}
> > > >  
> > > > @@ -1604,6 +1609,7 @@ xfs_da3_node_lookup_int(
> > > >  		/* Tree taller than we can handle; bail out! */
> > > >  		if (nodehdr.level >= XFS_DA_NODE_MAXDEPTH) {
> > > >  			xfs_buf_corruption_error(blk->bp);
> > > > +			xfs_da_mark_sick(args);
> > > >  			return -EFSCORRUPTED;
> > > >  		}
> > > >  
> > > > @@ -1612,6 +1618,7 @@ xfs_da3_node_lookup_int(
> > > >  			expected_level = nodehdr.level - 1;
> > > >  		else if (expected_level != nodehdr.level) {
> > > >  			xfs_buf_corruption_error(blk->bp);
> > > > +			xfs_da_mark_sick(args);
> > > >  			return -EFSCORRUPTED;
> > > >  		} else
> > > >  			expected_level--;
> > > > @@ -1663,12 +1670,16 @@ xfs_da3_node_lookup_int(
> > > >  		}
> > > >  
> > > >  		/* We can't point back to the root. */
> > > > -		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk))
> > > > +		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk)) {
> > > > +			xfs_da_mark_sick(args);
> > > >  			return -EFSCORRUPTED;
> > > > +		}
> > > >  	}
> > > >  
> > > > -	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0))
> > > > +	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0)) {
> > > > +		xfs_da_mark_sick(args);
> > > >  		return -EFSCORRUPTED;
> > > > +	}
> > > >  
> > > >  	/*
> > > >  	 * A leaf block that ends in the hashval that we are interested in
> > > > @@ -1686,6 +1697,7 @@ xfs_da3_node_lookup_int(
> > > >  			args->blkno = blk->blkno;
> > > >  		} else {
> > > >  			ASSERT(0);
> > > > +			xfs_da_mark_sick(args);
> > > >  			return -EFSCORRUPTED;
> > > >  		}
> > > 
> > > I'm just kind of skimming through the rest for general feedback at this
> > > point given previous comments, but it might be nice to start using exit
> > > labels at some of these places where we're enlarging and duplicating the
> > > error path for particular errors.
> > 
> > Yeah.  This current iteration is pretty wordy since I used coccinelle to
> > find all the EFSCORRUPTED clauses and inject the appropriate _mark_sick
> > call.
> > 
> > > It's not so much about the code in
> > > these patches, but rather to hopefully ease maintaining these state bits
> > > properly in new code where devs/reviewers might not know much about
> > > scrub state or have it in mind. Short of having some kind of generic
> > > helper to handle corruption state, ISTM that the combination of using
> > > verifiers where possible and common exit labels anywhere else we
> > > generate -EFSCORRUPTED at multiple places within some function could
> > > shrink these patches a bit..
> > 
> > <nod> Eric suggested on IRC that maybe the _mark_sick functions should
> > return EFSCORRUPTED so that we could at least collapse that to:
> > 
> > if (XFS_IS_CORRUPT(...)) {
> > 	error = xfs_da_mark_sick(...);
> > 	goto barf;
> > }
> > 
> > However, doing it the wordy way I've done it has the neat effects (IMHO)
> > that you can find all the places where xfs decides some metadata is
> > corrupt by grepping for EFSCORRUPTED, and confirm that each place it
> > does that also has a corresponding _mark_sick call.
> > 
> 
> Yeah, that was actually my thought process in suggesting pushing the
> mark_sick() calls down into verifiers as well.

<nod> It does strike me as a little odd that the verifiers are the /one/
place where EFSCORRUPTED isn't preceded or followed by a _mark_sick.

> It seems a little more clear (and open to future cleanups) with a
> strict pattern of setting sickness in the locations that generate
> corruption errors. Of course that likely means some special macro or
> something like you propose below, but I didn't want to quite go there
> until we could put the state updates in the right places.

Yeah....

> > I guess you could create a dorky shouty wrapper to maintain that greppy
> > property:
> > 
> > #define XFS_DA_EFSCORRUPTED(...) \
> > 	(xfs_da_mark_sick(...), -EFSCORRUPTED)
> > 
> > But... that might be stylistically undesirable.  OTOH I guess it
> > wouldn't be so bad either to do:
> > 
> > 	if (XFS_IS_CORRUPT(...)) {
> > 		error = -EFSCORRUPTED;
> > 		goto bad;
> > 	}
> > 
> > 	if (XFS_IS_CORRUPT(...)) {
> > 		error = -EFSCORRUPTED;
> > 		goto bad;
> > 	}
> > 
> > 	return 0;
> > bad:
> > 	if (error == -EFSCORRUPTED)
> > 		xfs_da_mark_sick(...);
> > 	return error;
> > 
> > Or using the shouty macro above:
> > 
> > 	if (XFS_IS_CORRUPT(...)) {
> > 		error = XFS_DA_EFSCORRUPTED(...);
> > 		goto bad;
> > 	}
> > 
> > 	if (XFS_IS_CORRUPT(...)) {
> > 		error = XFS_DA_EFSCORRUPTED(...);
> > 		goto bad;
> > 	}
> > 
> > bad:
> > 	return error;
> > 
> > I'll think about that.  It doesn't sound so bad when coding it up in
> > this email.
> > 
> 
> I suppose a macro is nice in that it enforces sickness is updated
> wherever -EFSCORRUPTED occurs, or at least can easily be verified by
> grepping. I find the separate macros pattern a little confusing, FWIW,
> simply because at a glance it looks like a garbled bunch of logic to me.
> I.e. I see 'if (IS_CORRUPT()) SOMETHING_CORRUPTED(); ...' and wonder wtf
> that is doing, for one. It's also not immediately obvious when we should
> use one or not the other, etc. This is getting into bikeshedding
> territory though and I don't have much of a better suggestion atm...

...one /could/ have specific IS_CORRUPT macros mapping to different
types of things.  Though I think this could easily get messy:

#define XFS_DIR_IS_CORRUPT(dp, perror, expr) \
	(unlikely(expr) ? xfs_corruption_report(#expr, ...), \
			  *(perror) = -EFSCORRUPTED, \
			  xfs_dirattr_mark_sick(dp, XFS_DATA_FORK), true : false)

I don't want to load up these macros with too much stuff, but I guess at
least that reduces the directory code to:

	if (XFS_DIR_IS_CORRUPT(dp, &error, blah == badvalue))
		goto out;
	...
	if (XFS_DIR_IS_CORRUPT(dp, &error, ugh == NULL))
		return error;
out:
	return error;

Though now we're getting pretty far from the original intent to kill off
wonky macros.  At least these are less weird, so maybe this won't set
off a round of macro bikeshed rage?

--D

> 
> Brian
> 
> > --D
> > 
> > > 
> > > Brian
> > > 
> > > >  		if (((retval == -ENOENT) || (retval == -ENOATTR)) &&
> > > > @@ -2250,8 +2262,10 @@ xfs_da3_swap_lastblock(
> > > >  	error = xfs_bmap_last_before(tp, dp, &lastoff, w);
> > > >  	if (error)
> > > >  		return error;
> > > > -	if (XFS_IS_CORRUPT(mp, lastoff == 0))
> > > > +	if (XFS_IS_CORRUPT(mp, lastoff == 0)) {
> > > > +		xfs_da_mark_sick(args);
> > > >  		return -EFSCORRUPTED;
> > > > +	}
> > > >  	/*
> > > >  	 * Read the last block in the btree space.
> > > >  	 */
> > > > @@ -2300,6 +2314,7 @@ xfs_da3_swap_lastblock(
> > > >  		if (XFS_IS_CORRUPT(mp,
> > > >  				   be32_to_cpu(sib_info->forw) != last_blkno ||
> > > >  				   sib_info->magic != dead_info->magic)) {
> > > > +			xfs_da_mark_sick(args);
> > > >  			error = -EFSCORRUPTED;
> > > >  			goto done;
> > > >  		}
> > > > @@ -2320,6 +2335,7 @@ xfs_da3_swap_lastblock(
> > > >  		if (XFS_IS_CORRUPT(mp,
> > > >  				   be32_to_cpu(sib_info->back) != last_blkno ||
> > > >  				   sib_info->magic != dead_info->magic)) {
> > > > +			xfs_da_mark_sick(args);
> > > >  			error = -EFSCORRUPTED;
> > > >  			goto done;
> > > >  		}
> > > > @@ -2342,6 +2358,7 @@ xfs_da3_swap_lastblock(
> > > >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> > > >  		if (XFS_IS_CORRUPT(mp,
> > > >  				   level >= 0 && level != par_hdr.level + 1)) {
> > > > +			xfs_da_mark_sick(args);
> > > >  			error = -EFSCORRUPTED;
> > > >  			goto done;
> > > >  		}
> > > > @@ -2353,6 +2370,7 @@ xfs_da3_swap_lastblock(
> > > >  		     entno++)
> > > >  			continue;
> > > >  		if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) {
> > > > +			xfs_da_mark_sick(args);
> > > >  			error = -EFSCORRUPTED;
> > > >  			goto done;
> > > >  		}
> > > > @@ -2378,6 +2396,7 @@ xfs_da3_swap_lastblock(
> > > >  		xfs_trans_brelse(tp, par_buf);
> > > >  		par_buf = NULL;
> > > >  		if (XFS_IS_CORRUPT(mp, par_blkno == 0)) {
> > > > +			xfs_da_mark_sick(args);
> > > >  			error = -EFSCORRUPTED;
> > > >  			goto done;
> > > >  		}
> > > > @@ -2387,6 +2406,7 @@ xfs_da3_swap_lastblock(
> > > >  		par_node = par_buf->b_addr;
> > > >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> > > >  		if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) {
> > > > +			xfs_da_mark_sick(args);
> > > >  			error = -EFSCORRUPTED;
> > > >  			goto done;
> > > >  		}
> > > > @@ -2601,6 +2621,7 @@ xfs_dabuf_map(
> > > >  					irecs[i].br_state);
> > > >  			}
> > > >  		}
> > > > +		xfs_dirattr_mark_sick(dp, whichfork);
> > > >  		error = -EFSCORRUPTED;
> > > >  		goto out;
> > > >  	}
> > > > @@ -2693,6 +2714,8 @@ xfs_da_read_buf(
> > > >  	error = xfs_trans_read_buf_map(dp->i_mount, trans,
> > > >  					dp->i_mount->m_ddev_targp,
> > > >  					mapp, nmap, 0, &bp, ops);
> > > > +	if (xfs_metadata_is_sick(error))
> > > > +		xfs_dirattr_mark_sick(dp, whichfork);
> > > >  	if (error)
> > > >  		goto out_free;
> > > >  
> > > > diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> > > > index 0aa87cbde49e..e1aa411a1b8b 100644
> > > > --- a/fs/xfs/libxfs/xfs_dir2.c
> > > > +++ b/fs/xfs/libxfs/xfs_dir2.c
> > > > @@ -18,6 +18,7 @@
> > > >  #include "xfs_errortag.h"
> > > >  #include "xfs_error.h"
> > > >  #include "xfs_trace.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  struct xfs_name xfs_name_dotdot = { (unsigned char *)"..", 2, XFS_DIR3_FT_DIR };
> > > >  
> > > > @@ -608,8 +609,10 @@ xfs_dir2_isblock(
> > > >  	rval = XFS_FSB_TO_B(args->dp->i_mount, last) == args->geo->blksize;
> > > >  	if (XFS_IS_CORRUPT(args->dp->i_mount,
> > > >  			   rval != 0 &&
> > > > -			   args->dp->i_d.di_size != args->geo->blksize))
> > > > +			   args->dp->i_d.di_size != args->geo->blksize)) {
> > > > +		xfs_da_mark_sick(args);
> > > >  		return -EFSCORRUPTED;
> > > > +	}
> > > >  	*vp = rval;
> > > >  	return 0;
> > > >  }
> > > > diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c
> > > > index a6eb71a62b53..80cc9c7ea4e5 100644
> > > > --- a/fs/xfs/libxfs/xfs_dir2_data.c
> > > > +++ b/fs/xfs/libxfs/xfs_dir2_data.c
> > > > @@ -18,6 +18,7 @@
> > > >  #include "xfs_trans.h"
> > > >  #include "xfs_buf_item.h"
> > > >  #include "xfs_log.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  static xfs_failaddr_t xfs_dir2_data_freefind_verify(
> > > >  		struct xfs_dir2_data_hdr *hdr, struct xfs_dir2_data_free *bf,
> > > > @@ -1170,6 +1171,7 @@ xfs_dir2_data_use_free(
> > > >  corrupt:
> > > >  	xfs_corruption_error(__func__, XFS_ERRLEVEL_LOW, args->dp->i_mount,
> > > >  			hdr, sizeof(*hdr), __FILE__, __LINE__, fa);
> > > > +	xfs_da_mark_sick(args);
> > > >  	return -EFSCORRUPTED;
> > > >  }
> > > >  
> > > > diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > > index 73edd96ce0ac..32d17420fff3 100644
> > > > --- a/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > > +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > > @@ -19,6 +19,7 @@
> > > >  #include "xfs_trace.h"
> > > >  #include "xfs_trans.h"
> > > >  #include "xfs_buf_item.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  /*
> > > >   * Local function declarations.
> > > > @@ -1386,8 +1387,10 @@ xfs_dir2_leaf_removename(
> > > >  	bestsp = xfs_dir2_leaf_bests_p(ltp);
> > > >  	if (be16_to_cpu(bestsp[db]) != oldbest) {
> > > >  		xfs_buf_corruption_error(lbp);
> > > > +		xfs_da_mark_sick(args);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > > +
> > > >  	/*
> > > >  	 * Mark the former data entry unused.
> > > >  	 */
> > > > diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > index 3a8b0625a08b..e0f3ab254a1a 100644
> > > > --- a/fs/xfs/libxfs/xfs_dir2_node.c
> > > > +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > @@ -20,6 +20,7 @@
> > > >  #include "xfs_trans.h"
> > > >  #include "xfs_buf_item.h"
> > > >  #include "xfs_log.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  /*
> > > >   * Function declarations.
> > > > @@ -228,6 +229,7 @@ __xfs_dir3_free_read(
> > > >  	if (fa) {
> > > >  		xfs_verifier_error(*bpp, -EFSCORRUPTED, fa);
> > > >  		xfs_trans_brelse(tp, *bpp);
> > > > +		xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > @@ -440,6 +442,7 @@ xfs_dir2_leaf_to_node(
> > > >  	if (be32_to_cpu(ltp->bestcount) >
> > > >  				(uint)dp->i_d.di_size / args->geo->blksize) {
> > > >  		xfs_buf_corruption_error(lbp);
> > > > +		xfs_da_mark_sick(args);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > @@ -514,6 +517,7 @@ xfs_dir2_leafn_add(
> > > >  	 */
> > > >  	if (index < 0) {
> > > >  		xfs_buf_corruption_error(bp);
> > > > +		xfs_da_mark_sick(args);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > @@ -733,6 +737,7 @@ xfs_dir2_leafn_lookup_for_addname(
> > > >  					   cpu_to_be16(NULLDATAOFF))) {
> > > >  				if (curfdb != newfdb)
> > > >  					xfs_trans_brelse(tp, curbp);
> > > > +				xfs_da_mark_sick(args);
> > > >  				return -EFSCORRUPTED;
> > > >  			}
> > > >  			curfdb = newfdb;
> > > > @@ -801,6 +806,7 @@ xfs_dir2_leafn_lookup_for_entry(
> > > >  	xfs_dir3_leaf_check(dp, bp);
> > > >  	if (leafhdr.count <= 0) {
> > > >  		xfs_buf_corruption_error(bp);
> > > > +		xfs_da_mark_sick(args);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > @@ -1737,6 +1743,7 @@ xfs_dir2_node_add_datablk(
> > > >  			} else {
> > > >  				xfs_alert(mp, " ... fblk is NULL");
> > > >  			}
> > > > +			xfs_da_mark_sick(args);
> > > >  			return -EFSCORRUPTED;
> > > >  		}
> > > >  
> > > > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > > > index 2049419e9555..d9404cd3d09b 100644
> > > > --- a/fs/xfs/libxfs/xfs_health.h
> > > > +++ b/fs/xfs/libxfs/xfs_health.h
> > > > @@ -38,6 +38,7 @@ struct xfs_perag;
> > > >  struct xfs_inode;
> > > >  struct xfs_fsop_geom;
> > > >  struct xfs_btree_cur;
> > > > +struct xfs_da_args;
> > > >  
> > > >  /* Observable health issues for metadata spanning the entire filesystem. */
> > > >  #define XFS_SICK_FS_COUNTERS	(1 << 0)  /* summary counters */
> > > > @@ -141,6 +142,8 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
> > > >  void xfs_health_unmount(struct xfs_mount *mp);
> > > >  void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
> > > >  void xfs_btree_mark_sick(struct xfs_btree_cur *cur);
> > > > +void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork);
> > > > +void xfs_da_mark_sick(struct xfs_da_args *args);
> > > >  
> > > >  /* Now some helpers. */
> > > >  
> > > > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> > > > index a78c501f6fb1..429a97494ffa 100644
> > > > --- a/fs/xfs/xfs_attr_inactive.c
> > > > +++ b/fs/xfs/xfs_attr_inactive.c
> > > > @@ -23,6 +23,7 @@
> > > >  #include "xfs_quota.h"
> > > >  #include "xfs_dir2.h"
> > > >  #include "xfs_error.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  /*
> > > >   * Look at all the extents for this logical region,
> > > > @@ -209,6 +210,7 @@ xfs_attr3_node_inactive(
> > > >  	if (level > XFS_DA_NODE_MAXDEPTH) {
> > > >  		xfs_trans_brelse(*trans, bp);	/* no locks for later trans */
> > > >  		xfs_buf_corruption_error(bp);
> > > > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > >  		return -EFSCORRUPTED;
> > > >  	}
> > > >  
> > > > @@ -256,6 +258,7 @@ xfs_attr3_node_inactive(
> > > >  			error = xfs_attr3_leaf_inactive(trans, dp, child_bp);
> > > >  			break;
> > > >  		default:
> > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > >  			xfs_buf_corruption_error(child_bp);
> > > >  			xfs_trans_brelse(*trans, child_bp);
> > > >  			error = -EFSCORRUPTED;
> > > > @@ -342,6 +345,7 @@ xfs_attr3_root_inactive(
> > > >  		error = xfs_attr3_leaf_inactive(trans, dp, bp);
> > > >  		break;
> > > >  	default:
> > > > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > >  		error = -EFSCORRUPTED;
> > > >  		xfs_buf_corruption_error(bp);
> > > >  		xfs_trans_brelse(*trans, bp);
> > > > diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> > > > index 7a099df88a0c..1a2a3d4ce422 100644
> > > > --- a/fs/xfs/xfs_attr_list.c
> > > > +++ b/fs/xfs/xfs_attr_list.c
> > > > @@ -21,6 +21,7 @@
> > > >  #include "xfs_error.h"
> > > >  #include "xfs_trace.h"
> > > >  #include "xfs_dir2.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  STATIC int
> > > >  xfs_attr_shortform_compare(const void *a, const void *b)
> > > > @@ -88,8 +89,10 @@ xfs_attr_shortform_list(
> > > >  		for (i = 0, sfe = &sf->list[0]; i < sf->hdr.count; i++) {
> > > >  			if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > >  					   !xfs_attr_namecheck(sfe->nameval,
> > > > -							       sfe->namelen)))
> > > > +							       sfe->namelen))) {
> > > > +				xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > >  				return -EFSCORRUPTED;
> > > > +			}
> > > >  			context->put_listent(context,
> > > >  					     sfe->flags,
> > > >  					     sfe->nameval,
> > > > @@ -131,6 +134,7 @@ xfs_attr_shortform_list(
> > > >  					     context->dp->i_mount, sfe,
> > > >  					     sizeof(*sfe));
> > > >  			kmem_free(sbuf);
> > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > >  			return -EFSCORRUPTED;
> > > >  		}
> > > >  
> > > > @@ -181,6 +185,7 @@ xfs_attr_shortform_list(
> > > >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > >  				   !xfs_attr_namecheck(sbp->name,
> > > >  						       sbp->namelen))) {
> > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > >  			error = -EFSCORRUPTED;
> > > >  			goto out;
> > > >  		}
> > > > @@ -268,8 +273,10 @@ xfs_attr_node_list_lookup(
> > > >  			return 0;
> > > >  
> > > >  		/* We can't point back to the root. */
> > > > -		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0))
> > > > +		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0)) {
> > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > >  			return -EFSCORRUPTED;
> > > > +		}
> > > >  	}
> > > >  
> > > >  	if (expected_level != 0)
> > > > @@ -281,6 +288,7 @@ xfs_attr_node_list_lookup(
> > > >  out_corruptbuf:
> > > >  	xfs_buf_corruption_error(bp);
> > > >  	xfs_trans_brelse(tp, bp);
> > > > +	xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > >  	return -EFSCORRUPTED;
> > > >  }
> > > >  
> > > > @@ -471,8 +479,10 @@ xfs_attr3_leaf_list_int(
> > > >  		}
> > > >  
> > > >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > > -				   !xfs_attr_namecheck(name, namelen)))
> > > > +				   !xfs_attr_namecheck(name, namelen))) {
> > > > +			xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
> > > >  			return -EFSCORRUPTED;
> > > > +		}
> > > >  		context->put_listent(context, entry->flags,
> > > >  					      name, namelen, valuelen);
> > > >  		if (context->seen_enough)
> > > > diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> > > > index 95bc9ef8f5f9..715ded503334 100644
> > > > --- a/fs/xfs/xfs_dir2_readdir.c
> > > > +++ b/fs/xfs/xfs_dir2_readdir.c
> > > > @@ -18,6 +18,7 @@
> > > >  #include "xfs_bmap.h"
> > > >  #include "xfs_trans.h"
> > > >  #include "xfs_error.h"
> > > > +#include "xfs_health.h"
> > > >  
> > > >  /*
> > > >   * Directory file type support functions
> > > > @@ -119,8 +120,10 @@ xfs_dir2_sf_getdents(
> > > >  		ctx->pos = off & 0x7fffffff;
> > > >  		if (XFS_IS_CORRUPT(dp->i_mount,
> > > >  				   !xfs_dir2_namecheck(sfep->name,
> > > > -						       sfep->namelen)))
> > > > +						       sfep->namelen))) {
> > > > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > > >  			return -EFSCORRUPTED;
> > > > +		}
> > > >  		if (!dir_emit(ctx, (char *)sfep->name, sfep->namelen, ino,
> > > >  			    xfs_dir3_get_dtype(mp, filetype)))
> > > >  			return 0;
> > > > @@ -461,6 +464,7 @@ xfs_dir2_leaf_getdents(
> > > >  		if (XFS_IS_CORRUPT(dp->i_mount,
> > > >  				   !xfs_dir2_namecheck(dep->name,
> > > >  						       dep->namelen))) {
> > > > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > > >  			error = -EFSCORRUPTED;
> > > >  			break;
> > > >  		}
> > > > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > > > index 1f09027c55ad..c1b6e8fb72ec 100644
> > > > --- a/fs/xfs/xfs_health.c
> > > > +++ b/fs/xfs/xfs_health.c
> > > > @@ -15,6 +15,8 @@
> > > >  #include "xfs_trace.h"
> > > >  #include "xfs_health.h"
> > > >  #include "xfs_btree.h"
> > > > +#include "xfs_da_format.h"
> > > > +#include "xfs_da_btree.h"
> > > >  
> > > >  /*
> > > >   * Warn about metadata corruption that we detected but haven't fixed, and
> > > > @@ -517,3 +519,40 @@ xfs_btree_mark_sick(
> > > >  
> > > >  	xfs_agno_mark_sick(cur->bc_mp, cur->bc_private.a.agno, mask);
> > > >  }
> > > > +
> > > > +/*
> > > > + * Record observations of dir/attr btree corruption with the health tracking
> > > > + * system.
> > > > + */
> > > > +void
> > > > +xfs_dirattr_mark_sick(
> > > > +	struct xfs_inode	*ip,
> > > > +	int			whichfork)
> > > > +{
> > > > +	unsigned int		mask;
> > > > +
> > > > +	switch (whichfork) {
> > > > +	case XFS_DATA_FORK:
> > > > +		mask = XFS_SICK_INO_DIR;
> > > > +		break;
> > > > +	case XFS_ATTR_FORK:
> > > > +		mask = XFS_SICK_INO_XATTR;
> > > > +		break;
> > > > +	default:
> > > > +		ASSERT(0);
> > > > +		return;
> > > > +	}
> > > > +
> > > > +	xfs_inode_mark_sick(ip, mask);
> > > > +}
> > > > +
> > > > +/*
> > > > + * Record observations of dir/attr btree corruption with the health tracking
> > > > + * system.
> > > > + */
> > > > +void
> > > > +xfs_da_mark_sick(
> > > > +	struct xfs_da_args	*args)
> > > > +{
> > > > +	xfs_dirattr_mark_sick(args->dp, args->whichfork);
> > > > +}
> > > > 
> > > 
> > 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system
  2019-11-22  0:53         ` Darrick J. Wong
@ 2019-11-22 11:57           ` Brian Foster
  2019-11-22 18:10             ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Brian Foster @ 2019-11-22 11:57 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Nov 21, 2019 at 04:53:13PM -0800, Darrick J. Wong wrote:
> On Thu, Nov 21, 2019 at 08:26:03AM -0500, Brian Foster wrote:
> > On Wed, Nov 20, 2019 at 08:43:23AM -0800, Darrick J. Wong wrote:
> > > On Wed, Nov 20, 2019 at 09:20:47AM -0500, Brian Foster wrote:
> > > > On Thu, Nov 14, 2019 at 10:19:26AM -0800, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > 
> > > > > Whenever we encounter a corrupt AG header, we should report that to the
> > > > > health monitoring system for later reporting.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > ---
> > > > >  fs/xfs/libxfs/xfs_alloc.c    |    6 ++++++
> > > > >  fs/xfs/libxfs/xfs_health.h   |    6 ++++++
> > > > >  fs/xfs/libxfs/xfs_ialloc.c   |    3 +++
> > > > >  fs/xfs/libxfs/xfs_refcount.c |    5 ++++-
> > > > >  fs/xfs/libxfs/xfs_rmap.c     |    5 ++++-
> > > > >  fs/xfs/libxfs/xfs_sb.c       |    2 ++
> > > > >  fs/xfs/xfs_health.c          |   17 +++++++++++++++++
> > > > >  fs/xfs/xfs_inode.c           |    9 +++++++++
> > > > >  8 files changed, 51 insertions(+), 2 deletions(-)
> > > > > 
> > > > > 
> > > > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > > > index c284e10af491..e75e3ae6c912 100644
> > > > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > > > @@ -26,6 +26,7 @@
> > > > >  #include "xfs_log.h"
> > > > >  #include "xfs_ag_resv.h"
> > > > >  #include "xfs_bmap.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  extern kmem_zone_t	*xfs_bmap_free_item_zone;
> > > > >  
> > > > > @@ -699,6 +700,8 @@ xfs_alloc_read_agfl(
> > > > >  			mp, tp, mp->m_ddev_targp,
> > > > >  			XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
> > > > >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_agfl_buf_ops);
> > > > > +	if (xfs_metadata_is_sick(error))
> > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGFL);
> > > > 
> > > > Any reason we couldn't do some of these in verifiers? I'm assuming we'd
> > > > still need calls in various external corruption checks, but at least we
> > > > wouldn't add a requirement to check all future buffer reads, etc.
> > > 
> > > I thought about that.  It would be wonderful if C had a syntactically
> > > slick method to package a function + execution scope and pass that
> > > through other functions to be called later. :)
> > > 
> > > For the per-AG stuff it wouldn't be hard to make the verifier functions
> > > derive the AG number and call xfs_agno_mark_sick directly in the
> > > verifier.  For per-inode metadata, we'd have to find a way to pass the
> > > struct xfs_inode pointer to the verifier, which means that we'd have to
> > > add that to struct xfs_buf.
> > > 
> > > xfs_buf is ~384 bytes so maybe adding another pointer for read context
> > > wouldn't be terrible?  That would add a fair amount of ugly special
> > > casing in the btree code to decide if we have an inode to pass through,
> > > though it would solve the problem of the bmbt verifier not being able to
> > > check the owner field in the btree block header.
> > > 
> > > OTOH that's 8 bytes of overhead that we can never get rid of even though
> > > we only really need it the first time the buffer gets read in from disk.
> > > 
> > > Thoughts?
> > > 
> > 
> > That doesn't seem too unreasonable, but I guess I'd have to think about
> > it some more. Maybe it's worth defining a private pointer in the buffer
> > that callers can use to pass specific context to verifiers for health
> > processing. I suppose such a field could also be conditionally defined
> > on scrub enabled kernels (at least initially), so the overhead would be
> > opt-in.
> 
> Looking further into this, what if we could do something like the
> following:
> 
> struct xfs_buf_verify {
> 	const struct xfs_buf_ops	*ops;
> 	struct xfs_inode		*ip;
> 	unsigned int			sick_flags;
> 	/* whatever else */
> };
> 
> ...then we change the _read_buf and _trans_read_buf functions to take as
> the final argument a (struct xfs_buf_verify *).  In the xfs_buf_reverify
> cases, we can pass this context straight through to the ->verify_read
> function.
> 
> To handle the !DONE case where the buffer read completion can happen
> asynchronously, we change the b_ops field definition to:
> 
> 	union {
> 		struct xfs_buf_ops	*b_ops;
> 		struct xfs_buf_verify	*b_vctx;
> 	};
> 
> Next we define a new XBF_HAVE_VERIFY_CTX flag that means b_vctx is
> active and not ops.  xfs_buf_read_map can set the flag and b_vctx for
> any synchronous (!XBF_ASYNC) read because we know the caller will be
> asleep waiting for b_iowait and therefore cannot kill the verifier
> context structure.  Once we get to xfs_buf_ioend we can set b_ops, drop
> the XBF_H_V_C flag, and call ->verify_read.
> 
> Now we actually /can/ pass the inode pointer into the verifier, along
> with pretty much anything else we can think of.
> 
> Does that sound reasonable?  Or totally heinous? :)
> 

That sounds reasonable to me and potentially a nice way to mitigate
additional overhead. I suppose we'd also need a means to abstract the
various contextual data fed into the type-specific verifiers (i.e., does
the verifier care about inode health state? perag? both?). Would you
plan to do that with higher level wrappers and/or perhaps use similar
union/flag magic in the xfs_buf_verify context to indicate which state
an instance happens to provide? It might be worth a quick and dirty RFC
to answer these questions with a couple examples and get any API
feedback before running through the full set of verifiers..
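
As a strawman for the "union/flag magic" variant, something like the
following is roughly what I have in mind.  This is only a sketch under
my own assumptions: the enum, the mock_* types, and the helper are
hypothetical and do not exist anywhere in the tree.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real inode and perag structures. */
struct mock_inode { unsigned int sick; };
struct mock_perag { unsigned int sick; };

/* Which kind of health context this instance carries. */
enum mock_vctx_type {
	MOCK_VCTX_NONE,		/* verifier gets no health context */
	MOCK_VCTX_INODE,	/* ctx.ip is valid */
	MOCK_VCTX_PERAG,	/* ctx.pag is valid */
};

struct mock_buf_verify {
	enum mock_vctx_type	type;
	union {
		struct mock_inode	*ip;
		struct mock_perag	*pag;
	} ctx;
	unsigned int		sick_flags;	/* bits to set on failure */
};

/*
 * Called by a verifier when it rejects a buffer.  Returns 1 if some
 * health state was updated, 0 if there was no usable context.
 */
static int mock_verify_mark_sick(const struct mock_buf_verify *vc)
{
	if (!vc)
		return 0;

	switch (vc->type) {
	case MOCK_VCTX_INODE:
		vc->ctx.ip->sick |= vc->sick_flags;
		return 1;
	case MOCK_VCTX_PERAG:
		vc->ctx.pag->sick |= vc->sick_flags;
		return 1;
	default:
		return 0;
	}
}
```

A type tag keeps the context at one pointer's worth of payload and lets
a verifier that cares about only one kind of state ignore the rest, at
the cost of a switch in the failure path.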

Brian

> > Anyways, I think for this series it might be reasonable to push things
> > down into verifiers opportunistically where we can do so without any
> > core mechanism changes. We can follow up with changes to do the rest if
> > we can come up with something elegant.
> 
> Ok.  I think I will try to implement such a beast for 5.6 and then put
> this series after it.
> 
> > > > >  	if (error)
> > > > >  		return error;
> > > > >  	xfs_buf_set_ref(bp, XFS_AGFL_REF);
> > > > > @@ -722,6 +725,7 @@ xfs_alloc_update_counters(
> > > > >  	if (unlikely(be32_to_cpu(agf->agf_freeblks) >
> > > > >  		     be32_to_cpu(agf->agf_length))) {
> > > > >  		xfs_buf_corruption_error(agbp);
> > > > > +		xfs_ag_mark_sick(pag, XFS_SICK_AG_AGF);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > @@ -2952,6 +2956,8 @@ xfs_read_agf(
> > > > >  			mp, tp, mp->m_ddev_targp,
> > > > >  			XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
> > > > >  			XFS_FSS_TO_BB(mp, 1), flags, bpp, &xfs_agf_buf_ops);
> > > > > +	if (xfs_metadata_is_sick(error))
> > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGF);
> > > > >  	if (error)
> > > > >  		return error;
> > > > >  	if (!*bpp)
> > > > > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > > > > index 3657a9cb8490..ce8954a10c66 100644
> > > > > --- a/fs/xfs/libxfs/xfs_health.h
> > > > > +++ b/fs/xfs/libxfs/xfs_health.h
> > > > > @@ -123,6 +123,8 @@ void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
> > > > >  void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
> > > > >  		unsigned int *checked);
> > > > >  
> > > > > +void xfs_agno_mark_sick(struct xfs_mount *mp, xfs_agnumber_t agno,
> > > > > +		unsigned int mask);
> > > > >  void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
> > > > >  void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
> > > > >  void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
> > > > > @@ -203,4 +205,8 @@ void xfs_fsop_geom_health(struct xfs_mount *mp, struct xfs_fsop_geom *geo);
> > > > >  void xfs_ag_geom_health(struct xfs_perag *pag, struct xfs_ag_geometry *ageo);
> > > > >  void xfs_bulkstat_health(struct xfs_inode *ip, struct xfs_bulkstat *bs);
> > > > >  
> > > > > +#define xfs_metadata_is_sick(error) \
> > > > > +	(unlikely((error) == -EFSCORRUPTED || (error) == -EIO || \
> > > > > +		  (error) == -EFSBADCRC))
> > > > 
> > > > Why is -EIO considered sick? My understanding is that once something is
> > > > marked sick, scrub is the only way to clear that state. -EIO can be
> > > > transient, so afaict that means we could mark a persistent in-core state
> > > > based on a transient/resolved issue.
> > > 
> > > I think it sounds reasonable that if the fs hits a metadata IO error
> > > then the administrator should scrub that data structure to make sure
> > > it's ok, and if so, clear the sick state.
> > > 
> > 
> > I'm not totally convinced... I thought we had configurations where I/O
> > errors can be reasonably expected and recovered from. For example,
> > consider the thin provisioning + infinite metadata writeback error retry
> > mechanism. IIRC, the whole purpose of that was to facilitate the use
> > case where the thin pool runs out of space, but the admin wants some
> > window of time to expand and keep the filesystem alive.
> 
> Aha, I just realized that it's not clear from the macro definition that
> I was only intending it to be called from the read path.
> 
> Though I guess there's always the possibility that the PFY trips over
> the PCIE cable in the datacenter and XFS hits an EIO, but the disk will
> be fine a moment later when he shoves it back in.  The disk media is
> fine, and by that point either we returned read error to userspace or
> the transaction got cancelled and it's too late to do anything anyway.
> 
> I'll drop the EIO check for now and we'll see if I get around to
> revisiting it.
> 
> > I don't necessarily think it's a bad thing to suggest a scrub any time
> > errors have occurred, but for something like the above where an
> > environment may have been thoroughly tested and verified through that
> > particular error->expand sequence, it seems that flagging bits as sick
> > might be unnecessarily ominous.
> 
> <shrug> Yeah, (sick && !checked) is a weird passive-aggressive state
> like that.
> 
> > > Though I realized just now that if scrub isn't enabled then it's an
> > > unfixable dead end so the EIO check should be gated on
> > > CONFIG_XFS_ONLINE_SCRUB=y.
> > > 
> > 
> > Yeah, that was my initial concern..
> > 
> > > > Along similar lines, what's the expected behavior in the event of any of
> > > > these errors for a kernel that might not support
> > > > CONFIG_XFS_ONLINE_[SCRUB|REPAIR]? Just set the states that are never
> > > > used for anything? If so, that seems Ok I suppose.. but it's a little
> > > > awkward if we'd see the tracepoints and such associated with the state
> > > > changes.
> > > 
> > > Even if scrub is disabled, the kernel will still set the sick state, and
> > > later the administrator can query the filesystem with xfs_spaceman to
> > > observe that sick state.
> > > 
> > 
> > Ok, so it's intended to be a valid health state independent of scrub.
> > That seems reasonable in principle and can always be used to indicate
> > offline repair is necessary too.
> 
> Yes.
> 
> > > In the future, I will also use the per-AG sick states to steer
> > > allocations away from known problematic AGs to try to avoid
> > > unexpected shutdown in the middle of a transaction.
> > > 
> > 
> > Hmm.. I'm a little curious about how much we should steer away from
> > traditional behavior on kernels that might not support scrub. I suppose
> > I could see arguments for going either way, but this is getting a bit
> > ahead of this patch anyways. ;)
> 
> Yeah.  I /do/ have prototype patches buried in my dev tree but they are
> too ugly not to let all the magic smoke out.  What really happens is
> that when we hit a corruption error, we mark the AG as offline.  Then
> the sysadmin can run xfs_scrub to fix it (which would set the AG back
> online) or I guess we could have a spaceman -x command to force it back
> online.
> 
> I always build in /some/ kind of manual override somewhere... :)
> 
> --D
> 
> > Brian
> > 
> > > --D
> > > 
> > > > 
> > > > Brian
> > > > 
> > > > > +
> > > > >  #endif	/* __XFS_HEALTH_H__ */
> > > > > diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> > > > > index 988cde7744e6..c401512a4350 100644
> > > > > --- a/fs/xfs/libxfs/xfs_ialloc.c
> > > > > +++ b/fs/xfs/libxfs/xfs_ialloc.c
> > > > > @@ -27,6 +27,7 @@
> > > > >  #include "xfs_trace.h"
> > > > >  #include "xfs_log.h"
> > > > >  #include "xfs_rmap.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  /*
> > > > >   * Lookup a record by ino in the btree given by cur.
> > > > > @@ -2635,6 +2636,8 @@ xfs_read_agi(
> > > > >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> > > > >  			XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
> > > > >  			XFS_FSS_TO_BB(mp, 1), 0, bpp, &xfs_agi_buf_ops);
> > > > > +	if (xfs_metadata_is_sick(error))
> > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > > > >  	if (error)
> > > > >  		return error;
> > > > >  	if (tp)
> > > > > diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
> > > > > index d7d702ee4d1a..25c87834e42a 100644
> > > > > --- a/fs/xfs/libxfs/xfs_refcount.c
> > > > > +++ b/fs/xfs/libxfs/xfs_refcount.c
> > > > > @@ -22,6 +22,7 @@
> > > > >  #include "xfs_bit.h"
> > > > >  #include "xfs_refcount.h"
> > > > >  #include "xfs_rmap.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  /* Allowable refcount adjustment amounts. */
> > > > >  enum xfs_refc_adjust_op {
> > > > > @@ -1177,8 +1178,10 @@ xfs_refcount_finish_one(
> > > > >  				XFS_ALLOC_FLAG_FREEING, &agbp);
> > > > >  		if (error)
> > > > >  			return error;
> > > > > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > > > > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > > > > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> > > > >  			return -EFSCORRUPTED;
> > > > > +		}
> > > > >  
> > > > >  		rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, agno);
> > > > >  		if (!rcur) {
> > > > > diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> > > > > index ff9412f113c4..a54a3c129cce 100644
> > > > > --- a/fs/xfs/libxfs/xfs_rmap.c
> > > > > +++ b/fs/xfs/libxfs/xfs_rmap.c
> > > > > @@ -21,6 +21,7 @@
> > > > >  #include "xfs_errortag.h"
> > > > >  #include "xfs_error.h"
> > > > >  #include "xfs_inode.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  /*
> > > > >   * Lookup the first record less than or equal to [bno, len, owner, offset]
> > > > > @@ -2400,8 +2401,10 @@ xfs_rmap_finish_one(
> > > > >  		error = xfs_free_extent_fix_freelist(tp, agno, &agbp);
> > > > >  		if (error)
> > > > >  			return error;
> > > > > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > > > > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > > > > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> > > > >  			return -EFSCORRUPTED;
> > > > > +		}
> > > > >  
> > > > >  		rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, agno);
> > > > >  		if (!rcur) {
> > > > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > > > > index 0ac69751fe85..4a923545465d 100644
> > > > > --- a/fs/xfs/libxfs/xfs_sb.c
> > > > > +++ b/fs/xfs/libxfs/xfs_sb.c
> > > > > @@ -1169,6 +1169,8 @@ xfs_sb_read_secondary(
> > > > >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> > > > >  			XFS_AG_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
> > > > >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
> > > > > +	if (xfs_metadata_is_sick(error))
> > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_SB);
> > > > >  	if (error)
> > > > >  		return error;
> > > > >  	xfs_buf_set_ref(bp, XFS_SSB_REF);
> > > > > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > > > > index 860dc70c99e7..36c32b108b39 100644
> > > > > --- a/fs/xfs/xfs_health.c
> > > > > +++ b/fs/xfs/xfs_health.c
> > > > > @@ -200,6 +200,23 @@ xfs_rt_measure_sickness(
> > > > >  	spin_unlock(&mp->m_sb_lock);
> > > > >  }
> > > > >  
> > > > > +/* Mark unhealthy per-ag metadata given a raw AG number. */
> > > > > +void
> > > > > +xfs_agno_mark_sick(
> > > > > +	struct xfs_mount	*mp,
> > > > > +	xfs_agnumber_t		agno,
> > > > > +	unsigned int		mask)
> > > > > +{
> > > > > +	struct xfs_perag	*pag = xfs_perag_get(mp, agno);
> > > > > +
> > > > > +	/* per-ag structure not set up yet? */
> > > > > +	if (!pag)
> > > > > +		return;
> > > > > +
> > > > > +	xfs_ag_mark_sick(pag, mask);
> > > > > +	xfs_perag_put(pag);
> > > > > +}
> > > > > +
> > > > >  /* Mark unhealthy per-ag metadata. */
> > > > >  void
> > > > >  xfs_ag_mark_sick(
> > > > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > > > index 401da197f012..a2812cea748d 100644
> > > > > --- a/fs/xfs/xfs_inode.c
> > > > > +++ b/fs/xfs/xfs_inode.c
> > > > > @@ -35,6 +35,7 @@
> > > > >  #include "xfs_log.h"
> > > > >  #include "xfs_bmap_btree.h"
> > > > >  #include "xfs_reflink.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  kmem_zone_t *xfs_inode_zone;
> > > > >  
> > > > > @@ -787,6 +788,8 @@ xfs_ialloc(
> > > > >  	 */
> > > > >  	if ((pip && ino == pip->i_ino) || !xfs_verify_dir_ino(mp, ino)) {
> > > > >  		xfs_alert(mp, "Allocated a known in-use inode 0x%llx!", ino);
> > > > > +		xfs_agno_mark_sick(mp, XFS_INO_TO_AGNO(mp, ino),
> > > > > +				XFS_SICK_AG_INOBT);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > @@ -2137,6 +2140,7 @@ xfs_iunlink_update_bucket(
> > > > >  	 */
> > > > >  	if (old_value == new_agino) {
> > > > >  		xfs_buf_corruption_error(agibp);
> > > > > +		xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGI);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > @@ -2203,6 +2207,7 @@ xfs_iunlink_update_inode(
> > > > >  	if (!xfs_verify_agino_or_null(mp, agno, old_value)) {
> > > > >  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
> > > > >  				sizeof(*dip), __this_address);
> > > > > +		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> > > > >  		error = -EFSCORRUPTED;
> > > > >  		goto out;
> > > > >  	}
> > > > > @@ -2217,6 +2222,7 @@ xfs_iunlink_update_inode(
> > > > >  		if (next_agino != NULLAGINO) {
> > > > >  			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
> > > > >  					dip, sizeof(*dip), __this_address);
> > > > > +			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  		}
> > > > >  		goto out;
> > > > > @@ -2271,6 +2277,7 @@ xfs_iunlink(
> > > > >  	if (next_agino == agino ||
> > > > >  	    !xfs_verify_agino_or_null(mp, agno, next_agino)) {
> > > > >  		xfs_buf_corruption_error(agibp);
> > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > @@ -2408,6 +2415,7 @@ xfs_iunlink_map_prev(
> > > > >  			XFS_CORRUPTION_ERROR(__func__,
> > > > >  					XFS_ERRLEVEL_LOW, mp,
> > > > >  					*dipp, sizeof(**dipp));
> > > > > +			xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  			return error;
> > > > >  		}
> > > > > @@ -2454,6 +2462,7 @@ xfs_iunlink_remove(
> > > > >  	if (!xfs_verify_agino(mp, agno, head_agino)) {
> > > > >  		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
> > > > >  				agi, sizeof(*agi));
> > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/9] xfs: report dir/attr block corruption errors to the health system
  2019-11-22  1:03         ` Darrick J. Wong
@ 2019-11-22 12:28           ` Brian Foster
  2019-11-22 18:35             ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Brian Foster @ 2019-11-22 12:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Nov 21, 2019 at 05:03:32PM -0800, Darrick J. Wong wrote:
> On Thu, Nov 21, 2019 at 08:26:27AM -0500, Brian Foster wrote:
> > On Wed, Nov 20, 2019 at 08:55:08AM -0800, Darrick J. Wong wrote:
> > > On Wed, Nov 20, 2019 at 11:11:47AM -0500, Brian Foster wrote:
> > > > On Thu, Nov 14, 2019 at 10:19:46AM -0800, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > 
> > > > > Whenever we encounter corrupt directory or extended attribute blocks, we
> > > > > should report that to the health monitoring system for later reporting.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > ---
> > > > >  fs/xfs/libxfs/xfs_attr_leaf.c   |    5 ++++-
> > > > >  fs/xfs/libxfs/xfs_attr_remote.c |   27 ++++++++++++++++-----------
> > > > >  fs/xfs/libxfs/xfs_da_btree.c    |   29 ++++++++++++++++++++++++++---
> > > > >  fs/xfs/libxfs/xfs_dir2.c        |    5 ++++-
> > > > >  fs/xfs/libxfs/xfs_dir2_data.c   |    2 ++
> > > > >  fs/xfs/libxfs/xfs_dir2_leaf.c   |    3 +++
> > > > >  fs/xfs/libxfs/xfs_dir2_node.c   |    7 +++++++
> > > > >  fs/xfs/libxfs/xfs_health.h      |    3 +++
> > > > >  fs/xfs/xfs_attr_inactive.c      |    4 ++++
> > > > >  fs/xfs/xfs_attr_list.c          |   16 +++++++++++++---
> > > > >  fs/xfs/xfs_dir2_readdir.c       |    6 +++++-
> > > > >  fs/xfs/xfs_health.c             |   39 +++++++++++++++++++++++++++++++++++++++
> > > > >  12 files changed, 126 insertions(+), 20 deletions(-)
> > > > > 
> > > > > 
> > > > ...
> > > > > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
> > > > > index e424b004e3cb..a17622dadf00 100644
> > > > > --- a/fs/xfs/libxfs/xfs_da_btree.c
> > > > > +++ b/fs/xfs/libxfs/xfs_da_btree.c
> > > > ...
> > > > > @@ -1589,6 +1593,7 @@ xfs_da3_node_lookup_int(
> > > > >  
> > > > >  		if (magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC) {
> > > > >  			xfs_buf_corruption_error(blk->bp);
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			return -EFSCORRUPTED;
> > > > >  		}
> > > > >  
> > > > > @@ -1604,6 +1609,7 @@ xfs_da3_node_lookup_int(
> > > > >  		/* Tree taller than we can handle; bail out! */
> > > > >  		if (nodehdr.level >= XFS_DA_NODE_MAXDEPTH) {
> > > > >  			xfs_buf_corruption_error(blk->bp);
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			return -EFSCORRUPTED;
> > > > >  		}
> > > > >  
> > > > > @@ -1612,6 +1618,7 @@ xfs_da3_node_lookup_int(
> > > > >  			expected_level = nodehdr.level - 1;
> > > > >  		else if (expected_level != nodehdr.level) {
> > > > >  			xfs_buf_corruption_error(blk->bp);
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			return -EFSCORRUPTED;
> > > > >  		} else
> > > > >  			expected_level--;
> > > > > @@ -1663,12 +1670,16 @@ xfs_da3_node_lookup_int(
> > > > >  		}
> > > > >  
> > > > >  		/* We can't point back to the root. */
> > > > > -		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk))
> > > > > +		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk)) {
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			return -EFSCORRUPTED;
> > > > > +		}
> > > > >  	}
> > > > >  
> > > > > -	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0))
> > > > > +	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0)) {
> > > > > +		xfs_da_mark_sick(args);
> > > > >  		return -EFSCORRUPTED;
> > > > > +	}
> > > > >  
> > > > >  	/*
> > > > >  	 * A leaf block that ends in the hashval that we are interested in
> > > > > @@ -1686,6 +1697,7 @@ xfs_da3_node_lookup_int(
> > > > >  			args->blkno = blk->blkno;
> > > > >  		} else {
> > > > >  			ASSERT(0);
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			return -EFSCORRUPTED;
> > > > >  		}
> > > > 
> > > > I'm just kind of skimming through the rest for general feedback at this
> > > > point given previous comments, but it might be nice to start using exit
> > > > labels at some of these places where we're enlarging and duplicating the
> > > > error path for particular errors.
> > > 
> > > Yeah.  This current iteration is pretty wordy since I used coccinelle to
> > > find all the EFSCORRUPTED clauses and inject the appropriate _mark_sick
> > > call.
> > > 
> > > > It's not so much about the code in
> > > > these patches, but rather to hopefully ease maintaining these state bits
> > > > properly in new code where devs/reviewers might not know much about
> > > > scrub state or have it in mind. Short of having some kind of generic
> > > > helper to handle corruption state, ISTM that the combination of using
> > > > verifiers where possible and common exit labels anywhere else we
> > > > generate -EFSCORRUPTED at multiple places within some function could
> > > > shrink these patches a bit..
> > > 
> > > <nod> Eric suggested on IRC that maybe the _mark_sick functions should
> > > return EFSCORRUPTED so that we could at least collapse that to:
> > > 
> > > if (XFS_IS_CORRUPT(...)) {
> > > 	error = xfs_da_mark_sick(...);
> > > 	goto barf;
> > > }
> > > 
> > > > However, doing it the wordy way I've done it has the neat effect (IMHO)
> > > that you can find all the places where xfs decides some metadata is
> > > corrupt by grepping for EFSCORRUPTED, and confirm that each place it
> > > does that also has a corresponding _mark_sick call.
> > > 
> > 
> > Yeah, that was actually my thought process in suggesting pushing the
> > mark_sick() calls down into verifiers as well.
> 
> <nod> It does strike me as a little odd that the verifiers are the /one/
> place where EFSCORRUPTED isn't preceded or followed by a _mark_sick.
> 
> > It seems a little more clear (and open to future cleanups) with a
> > strict pattern of setting sickness in the locations that generate
> > corruption errors. Of course that likely means some special macro or
> > something like you propose below, but I didn't want to quite go there
> > until we could put the state updates in the right places.
> 
> Yeah....
> 
> > > I guess you could create a dorky shouty wrapper to maintain that greppy
> > > property:
> > > 
> > > #define XFS_DA_EFSCORRUPTED(...) \
> > > 	(xfs_da_mark_sick(...), -EFSCORRUPTED)
> > > 
> > > But... that might be stylistically undesirable.  OTOH I guess it
> > > wouldn't be so bad either to do:
> > > 
> > > 	if (XFS_IS_CORRUPT(...)) {
> > > 		error = -EFSCORRUPTED;
> > > 		goto bad;
> > > 	}
> > > 
> > > 	if (XFS_IS_CORRUPT(...)) {
> > > 		error = -EFSCORRUPTED;
> > > 		goto bad;
> > > 	}
> > > 
> > > 	return 0;
> > > bad:
> > > 	if (error == -EFSCORRUPTED)
> > > 		xfs_da_mark_sick(...);
> > > 	return error;
> > > 
> > > Or using the shouty macro above:
> > > 
> > > 	if (XFS_IS_CORRUPT(...)) {
> > > 		error = XFS_DA_EFSCORRUPTED(...);
> > > 		goto bad;
> > > 	}
> > > 
> > > 	if (XFS_IS_CORRUPT(...)) {
> > > 		error = XFS_DA_EFSCORRUPTED(...);
> > > 		goto bad;
> > > 	}
> > > 
> > > bad:
> > > 	return error;
> > > 
> > > I'll think about that.  It doesn't sound so bad when coding it up in
> > > this email.
> > > 
> > 
> > I suppose a macro is nice in that it enforces sickness is updated
> > wherever -EFSCORRUPTED occurs, or at least can easily be verified by
> > grepping. I find the separate macros pattern a little confusing, FWIW,
> > simply because at a glance it looks like a garbled bunch of logic to me.
> > I.e. I see 'if (IS_CORRUPT()) SOMETHING_CORRUPTED(); ...' and wonder wtf
> > that is doing, for one. It's also not immediately obvious when we should
> > use one or not the other, etc. This is getting into bikeshedding
> > territory though and I don't have much of a better suggestion atm...
> 
> ...one /could/ have specific IS_CORRUPT macros mapping to different
> types of things.  Though I think this could easily get messy:
> 

Yep.

> #define XFS_DIR_IS_CORRUPT(dp, perror, expr) \
> 	(unlikely(expr) ? xfs_corruption_report(#expr, ...), \
> 			  *(perror) = -EFSCORRUPTED, \
> 			  xfs_dirattr_mark_sick(dp, XFS_DATA_FORK), true : false)
> 
> I don't want to load up these macros with too much stuff, but I guess at
> least that reduces the directory code to:
> 
> 	if (XFS_DIR_IS_CORRUPT(dp, &error, blah == badvalue))
> 		goto out;
> 	...
> 	if (XFS_DIR_IS_CORRUPT(dp, &error, ugh == NULL))
> 		return error;
> out:
> 	return error;
> 
> Though now we're getting pretty far from the original intent to kill off
> wonky macros.  At least these are less weird, so maybe this won't set
> off a round of macro bikeshed rage?
> 

I dunno.. I'm trying to find an opinion beyond a waffley sense of "is it
worth changing?" on the whole macro thing. While I agree that the
original macros are ugly, they never really confused me or affected
readability so I didn't care too much whether they stay or go TBH.

In general, I think having usable interfaces for the developer and
readable functional code is more important than how ugly/bloated the
macro might be. That's why I really don't like the previous example that
combines multiple "simple" macros and turns that into some reusable
pattern. The resulting user code is not really readable IMO.

The DIR_IS_CORRUPT() example above reminds me a little more of the
original macros in that it is easy to use and makes the user code
concise. Indeed, it somewhat overloads the macro, but that seems
advantageous to me if the intent of this series is to add more
boilerplate associated with how we handle corruption errors generically.
In that regard, I find the DIR_IS_CORRUPT() approach preferable to
alternatives discussed so far (though I'd probably name it XFS_DA_*()
for consistency with the underlying health state type). Just my .02
though.. ;)
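
To make that concrete, here is a rough userspace sketch of what I mean by
an overloaded XFS_DA_*() style check. Everything in it is a stand-in: the
MOCK_* names, the mock_inode/mock_da_args structs, the flag values and the
error number are all made up for illustration and are not the kernel
definitions. The only point is that one macro reports, marks the right
fork sick, and sets the error, so the calling code stays short:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are the real kernel values. */
#define MOCK_EFSCORRUPTED	117
#define MOCK_DATA_FORK		0
#define MOCK_ATTR_FORK		1
#define MOCK_SICK_DIR		(1U << 0)
#define MOCK_SICK_XATTR		(1U << 1)

struct mock_inode {
	unsigned int		sick;	/* accumulated sickness flags */
};

struct mock_da_args {
	struct mock_inode	*dp;
	int			whichfork;
};

/* Pick the health flag from the fork, like the dir/attr helper does. */
static void
mock_da_mark_sick(
	struct mock_da_args	*args)
{
	assert(args && args->dp);
	args->dp->sick |= (args->whichfork == MOCK_ATTR_FORK) ?
			MOCK_SICK_XATTR : MOCK_SICK_DIR;
}

static void
mock_corruption_report(
	const char		*expr,
	const char		*file,
	int			line)
{
	fprintf(stderr, "corruption detected: %s (%s:%d)\n", expr, file, line);
}

/*
 * If expr is true: report it, mark the health state, store the error in
 * *perror and evaluate to true.  Otherwise evaluate to false with no side
 * effects.
 */
#define MOCK_DA_IS_CORRUPT(args, perror, expr) \
	((expr) ? (mock_corruption_report(#expr, __FILE__, __LINE__), \
		   mock_da_mark_sick(args), \
		   *(perror) = -MOCK_EFSCORRUPTED, true) : false)

/* Example caller: two checks, one exit label, no repeated boilerplate. */
static int
mock_lookup(
	struct mock_da_args	*args,
	int			level,
	int			blkno)
{
	int			error = 0;

	if (MOCK_DA_IS_CORRUPT(args, &error, level >= 5))
		goto out;
	if (MOCK_DA_IS_CORRUPT(args, &error, blkno == 0))
		goto out;
out:
	return error;
}
```

With something like that, a caller that passes a sane level and block
number gets 0 back and the inode stays healthy, while a bad block number
returns the corruption error and leaves the matching sick flag set.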

Brian

> --D
> 
> > 
> > Brian
> > 
> > > --D
> > > 
> > > > 
> > > > Brian
> > > > 
> > > > >  		if (((retval == -ENOENT) || (retval == -ENOATTR)) &&
> > > > > @@ -2250,8 +2262,10 @@ xfs_da3_swap_lastblock(
> > > > >  	error = xfs_bmap_last_before(tp, dp, &lastoff, w);
> > > > >  	if (error)
> > > > >  		return error;
> > > > > -	if (XFS_IS_CORRUPT(mp, lastoff == 0))
> > > > > +	if (XFS_IS_CORRUPT(mp, lastoff == 0)) {
> > > > > +		xfs_da_mark_sick(args);
> > > > >  		return -EFSCORRUPTED;
> > > > > +	}
> > > > >  	/*
> > > > >  	 * Read the last block in the btree space.
> > > > >  	 */
> > > > > @@ -2300,6 +2314,7 @@ xfs_da3_swap_lastblock(
> > > > >  		if (XFS_IS_CORRUPT(mp,
> > > > >  				   be32_to_cpu(sib_info->forw) != last_blkno ||
> > > > >  				   sib_info->magic != dead_info->magic)) {
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  			goto done;
> > > > >  		}
> > > > > @@ -2320,6 +2335,7 @@ xfs_da3_swap_lastblock(
> > > > >  		if (XFS_IS_CORRUPT(mp,
> > > > >  				   be32_to_cpu(sib_info->back) != last_blkno ||
> > > > >  				   sib_info->magic != dead_info->magic)) {
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  			goto done;
> > > > >  		}
> > > > > @@ -2342,6 +2358,7 @@ xfs_da3_swap_lastblock(
> > > > >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> > > > >  		if (XFS_IS_CORRUPT(mp,
> > > > >  				   level >= 0 && level != par_hdr.level + 1)) {
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  			goto done;
> > > > >  		}
> > > > > @@ -2353,6 +2370,7 @@ xfs_da3_swap_lastblock(
> > > > >  		     entno++)
> > > > >  			continue;
> > > > >  		if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) {
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  			goto done;
> > > > >  		}
> > > > > @@ -2378,6 +2396,7 @@ xfs_da3_swap_lastblock(
> > > > >  		xfs_trans_brelse(tp, par_buf);
> > > > >  		par_buf = NULL;
> > > > >  		if (XFS_IS_CORRUPT(mp, par_blkno == 0)) {
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  			goto done;
> > > > >  		}
> > > > > @@ -2387,6 +2406,7 @@ xfs_da3_swap_lastblock(
> > > > >  		par_node = par_buf->b_addr;
> > > > >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> > > > >  		if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) {
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  			goto done;
> > > > >  		}
> > > > > @@ -2601,6 +2621,7 @@ xfs_dabuf_map(
> > > > >  					irecs[i].br_state);
> > > > >  			}
> > > > >  		}
> > > > > +		xfs_dirattr_mark_sick(dp, whichfork);
> > > > >  		error = -EFSCORRUPTED;
> > > > >  		goto out;
> > > > >  	}
> > > > > @@ -2693,6 +2714,8 @@ xfs_da_read_buf(
> > > > >  	error = xfs_trans_read_buf_map(dp->i_mount, trans,
> > > > >  					dp->i_mount->m_ddev_targp,
> > > > >  					mapp, nmap, 0, &bp, ops);
> > > > > +	if (xfs_metadata_is_sick(error))
> > > > > +		xfs_dirattr_mark_sick(dp, whichfork);
> > > > >  	if (error)
> > > > >  		goto out_free;
> > > > >  
> > > > > diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> > > > > index 0aa87cbde49e..e1aa411a1b8b 100644
> > > > > --- a/fs/xfs/libxfs/xfs_dir2.c
> > > > > +++ b/fs/xfs/libxfs/xfs_dir2.c
> > > > > @@ -18,6 +18,7 @@
> > > > >  #include "xfs_errortag.h"
> > > > >  #include "xfs_error.h"
> > > > >  #include "xfs_trace.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  struct xfs_name xfs_name_dotdot = { (unsigned char *)"..", 2, XFS_DIR3_FT_DIR };
> > > > >  
> > > > > @@ -608,8 +609,10 @@ xfs_dir2_isblock(
> > > > >  	rval = XFS_FSB_TO_B(args->dp->i_mount, last) == args->geo->blksize;
> > > > >  	if (XFS_IS_CORRUPT(args->dp->i_mount,
> > > > >  			   rval != 0 &&
> > > > > -			   args->dp->i_d.di_size != args->geo->blksize))
> > > > > +			   args->dp->i_d.di_size != args->geo->blksize)) {
> > > > > +		xfs_da_mark_sick(args);
> > > > >  		return -EFSCORRUPTED;
> > > > > +	}
> > > > >  	*vp = rval;
> > > > >  	return 0;
> > > > >  }
> > > > > diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c
> > > > > index a6eb71a62b53..80cc9c7ea4e5 100644
> > > > > --- a/fs/xfs/libxfs/xfs_dir2_data.c
> > > > > +++ b/fs/xfs/libxfs/xfs_dir2_data.c
> > > > > @@ -18,6 +18,7 @@
> > > > >  #include "xfs_trans.h"
> > > > >  #include "xfs_buf_item.h"
> > > > >  #include "xfs_log.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  static xfs_failaddr_t xfs_dir2_data_freefind_verify(
> > > > >  		struct xfs_dir2_data_hdr *hdr, struct xfs_dir2_data_free *bf,
> > > > > @@ -1170,6 +1171,7 @@ xfs_dir2_data_use_free(
> > > > >  corrupt:
> > > > >  	xfs_corruption_error(__func__, XFS_ERRLEVEL_LOW, args->dp->i_mount,
> > > > >  			hdr, sizeof(*hdr), __FILE__, __LINE__, fa);
> > > > > +	xfs_da_mark_sick(args);
> > > > >  	return -EFSCORRUPTED;
> > > > >  }
> > > > >  
> > > > > diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > > > index 73edd96ce0ac..32d17420fff3 100644
> > > > > --- a/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > > > +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > > > @@ -19,6 +19,7 @@
> > > > >  #include "xfs_trace.h"
> > > > >  #include "xfs_trans.h"
> > > > >  #include "xfs_buf_item.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  /*
> > > > >   * Local function declarations.
> > > > > @@ -1386,8 +1387,10 @@ xfs_dir2_leaf_removename(
> > > > >  	bestsp = xfs_dir2_leaf_bests_p(ltp);
> > > > >  	if (be16_to_cpu(bestsp[db]) != oldbest) {
> > > > >  		xfs_buf_corruption_error(lbp);
> > > > > +		xfs_da_mark_sick(args);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > > +
> > > > >  	/*
> > > > >  	 * Mark the former data entry unused.
> > > > >  	 */
> > > > > diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > > index 3a8b0625a08b..e0f3ab254a1a 100644
> > > > > --- a/fs/xfs/libxfs/xfs_dir2_node.c
> > > > > +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > > @@ -20,6 +20,7 @@
> > > > >  #include "xfs_trans.h"
> > > > >  #include "xfs_buf_item.h"
> > > > >  #include "xfs_log.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  /*
> > > > >   * Function declarations.
> > > > > @@ -228,6 +229,7 @@ __xfs_dir3_free_read(
> > > > >  	if (fa) {
> > > > >  		xfs_verifier_error(*bpp, -EFSCORRUPTED, fa);
> > > > >  		xfs_trans_brelse(tp, *bpp);
> > > > > +		xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > @@ -440,6 +442,7 @@ xfs_dir2_leaf_to_node(
> > > > >  	if (be32_to_cpu(ltp->bestcount) >
> > > > >  				(uint)dp->i_d.di_size / args->geo->blksize) {
> > > > >  		xfs_buf_corruption_error(lbp);
> > > > > +		xfs_da_mark_sick(args);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > @@ -514,6 +517,7 @@ xfs_dir2_leafn_add(
> > > > >  	 */
> > > > >  	if (index < 0) {
> > > > >  		xfs_buf_corruption_error(bp);
> > > > > +		xfs_da_mark_sick(args);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > @@ -733,6 +737,7 @@ xfs_dir2_leafn_lookup_for_addname(
> > > > >  					   cpu_to_be16(NULLDATAOFF))) {
> > > > >  				if (curfdb != newfdb)
> > > > >  					xfs_trans_brelse(tp, curbp);
> > > > > +				xfs_da_mark_sick(args);
> > > > >  				return -EFSCORRUPTED;
> > > > >  			}
> > > > >  			curfdb = newfdb;
> > > > > @@ -801,6 +806,7 @@ xfs_dir2_leafn_lookup_for_entry(
> > > > >  	xfs_dir3_leaf_check(dp, bp);
> > > > >  	if (leafhdr.count <= 0) {
> > > > >  		xfs_buf_corruption_error(bp);
> > > > > +		xfs_da_mark_sick(args);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > @@ -1737,6 +1743,7 @@ xfs_dir2_node_add_datablk(
> > > > >  			} else {
> > > > >  				xfs_alert(mp, " ... fblk is NULL");
> > > > >  			}
> > > > > +			xfs_da_mark_sick(args);
> > > > >  			return -EFSCORRUPTED;
> > > > >  		}
> > > > >  
> > > > > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > > > > index 2049419e9555..d9404cd3d09b 100644
> > > > > --- a/fs/xfs/libxfs/xfs_health.h
> > > > > +++ b/fs/xfs/libxfs/xfs_health.h
> > > > > @@ -38,6 +38,7 @@ struct xfs_perag;
> > > > >  struct xfs_inode;
> > > > >  struct xfs_fsop_geom;
> > > > >  struct xfs_btree_cur;
> > > > > +struct xfs_da_args;
> > > > >  
> > > > >  /* Observable health issues for metadata spanning the entire filesystem. */
> > > > >  #define XFS_SICK_FS_COUNTERS	(1 << 0)  /* summary counters */
> > > > > @@ -141,6 +142,8 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
> > > > >  void xfs_health_unmount(struct xfs_mount *mp);
> > > > >  void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
> > > > >  void xfs_btree_mark_sick(struct xfs_btree_cur *cur);
> > > > > +void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork);
> > > > > +void xfs_da_mark_sick(struct xfs_da_args *args);
> > > > >  
> > > > >  /* Now some helpers. */
> > > > >  
> > > > > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> > > > > index a78c501f6fb1..429a97494ffa 100644
> > > > > --- a/fs/xfs/xfs_attr_inactive.c
> > > > > +++ b/fs/xfs/xfs_attr_inactive.c
> > > > > @@ -23,6 +23,7 @@
> > > > >  #include "xfs_quota.h"
> > > > >  #include "xfs_dir2.h"
> > > > >  #include "xfs_error.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  /*
> > > > >   * Look at all the extents for this logical region,
> > > > > @@ -209,6 +210,7 @@ xfs_attr3_node_inactive(
> > > > >  	if (level > XFS_DA_NODE_MAXDEPTH) {
> > > > >  		xfs_trans_brelse(*trans, bp);	/* no locks for later trans */
> > > > >  		xfs_buf_corruption_error(bp);
> > > > > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > >  		return -EFSCORRUPTED;
> > > > >  	}
> > > > >  
> > > > > @@ -256,6 +258,7 @@ xfs_attr3_node_inactive(
> > > > >  			error = xfs_attr3_leaf_inactive(trans, dp, child_bp);
> > > > >  			break;
> > > > >  		default:
> > > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > >  			xfs_buf_corruption_error(child_bp);
> > > > >  			xfs_trans_brelse(*trans, child_bp);
> > > > >  			error = -EFSCORRUPTED;
> > > > > @@ -342,6 +345,7 @@ xfs_attr3_root_inactive(
> > > > >  		error = xfs_attr3_leaf_inactive(trans, dp, bp);
> > > > >  		break;
> > > > >  	default:
> > > > > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > >  		error = -EFSCORRUPTED;
> > > > >  		xfs_buf_corruption_error(bp);
> > > > >  		xfs_trans_brelse(*trans, bp);
> > > > > diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> > > > > index 7a099df88a0c..1a2a3d4ce422 100644
> > > > > --- a/fs/xfs/xfs_attr_list.c
> > > > > +++ b/fs/xfs/xfs_attr_list.c
> > > > > @@ -21,6 +21,7 @@
> > > > >  #include "xfs_error.h"
> > > > >  #include "xfs_trace.h"
> > > > >  #include "xfs_dir2.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  STATIC int
> > > > >  xfs_attr_shortform_compare(const void *a, const void *b)
> > > > > @@ -88,8 +89,10 @@ xfs_attr_shortform_list(
> > > > >  		for (i = 0, sfe = &sf->list[0]; i < sf->hdr.count; i++) {
> > > > >  			if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > > >  					   !xfs_attr_namecheck(sfe->nameval,
> > > > > -							       sfe->namelen)))
> > > > > +							       sfe->namelen))) {
> > > > > +				xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > >  				return -EFSCORRUPTED;
> > > > > +			}
> > > > >  			context->put_listent(context,
> > > > >  					     sfe->flags,
> > > > >  					     sfe->nameval,
> > > > > @@ -131,6 +134,7 @@ xfs_attr_shortform_list(
> > > > >  					     context->dp->i_mount, sfe,
> > > > >  					     sizeof(*sfe));
> > > > >  			kmem_free(sbuf);
> > > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > >  			return -EFSCORRUPTED;
> > > > >  		}
> > > > >  
> > > > > @@ -181,6 +185,7 @@ xfs_attr_shortform_list(
> > > > >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > > >  				   !xfs_attr_namecheck(sbp->name,
> > > > >  						       sbp->namelen))) {
> > > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  			goto out;
> > > > >  		}
> > > > > @@ -268,8 +273,10 @@ xfs_attr_node_list_lookup(
> > > > >  			return 0;
> > > > >  
> > > > >  		/* We can't point back to the root. */
> > > > > -		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0))
> > > > > +		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0)) {
> > > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > >  			return -EFSCORRUPTED;
> > > > > +		}
> > > > >  	}
> > > > >  
> > > > >  	if (expected_level != 0)
> > > > > @@ -281,6 +288,7 @@ xfs_attr_node_list_lookup(
> > > > >  out_corruptbuf:
> > > > >  	xfs_buf_corruption_error(bp);
> > > > >  	xfs_trans_brelse(tp, bp);
> > > > > +	xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > >  	return -EFSCORRUPTED;
> > > > >  }
> > > > >  
> > > > > @@ -471,8 +479,10 @@ xfs_attr3_leaf_list_int(
> > > > >  		}
> > > > >  
> > > > >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > > > -				   !xfs_attr_namecheck(name, namelen)))
> > > > > +				   !xfs_attr_namecheck(name, namelen))) {
> > > > > +			xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
> > > > >  			return -EFSCORRUPTED;
> > > > > +		}
> > > > >  		context->put_listent(context, entry->flags,
> > > > >  					      name, namelen, valuelen);
> > > > >  		if (context->seen_enough)
> > > > > diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> > > > > index 95bc9ef8f5f9..715ded503334 100644
> > > > > --- a/fs/xfs/xfs_dir2_readdir.c
> > > > > +++ b/fs/xfs/xfs_dir2_readdir.c
> > > > > @@ -18,6 +18,7 @@
> > > > >  #include "xfs_bmap.h"
> > > > >  #include "xfs_trans.h"
> > > > >  #include "xfs_error.h"
> > > > > +#include "xfs_health.h"
> > > > >  
> > > > >  /*
> > > > >   * Directory file type support functions
> > > > > @@ -119,8 +120,10 @@ xfs_dir2_sf_getdents(
> > > > >  		ctx->pos = off & 0x7fffffff;
> > > > >  		if (XFS_IS_CORRUPT(dp->i_mount,
> > > > >  				   !xfs_dir2_namecheck(sfep->name,
> > > > > -						       sfep->namelen)))
> > > > > +						       sfep->namelen))) {
> > > > > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > > > >  			return -EFSCORRUPTED;
> > > > > +		}
> > > > >  		if (!dir_emit(ctx, (char *)sfep->name, sfep->namelen, ino,
> > > > >  			    xfs_dir3_get_dtype(mp, filetype)))
> > > > >  			return 0;
> > > > > @@ -461,6 +464,7 @@ xfs_dir2_leaf_getdents(
> > > > >  		if (XFS_IS_CORRUPT(dp->i_mount,
> > > > >  				   !xfs_dir2_namecheck(dep->name,
> > > > >  						       dep->namelen))) {
> > > > > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > > > >  			error = -EFSCORRUPTED;
> > > > >  			break;
> > > > >  		}
> > > > > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > > > > index 1f09027c55ad..c1b6e8fb72ec 100644
> > > > > --- a/fs/xfs/xfs_health.c
> > > > > +++ b/fs/xfs/xfs_health.c
> > > > > @@ -15,6 +15,8 @@
> > > > >  #include "xfs_trace.h"
> > > > >  #include "xfs_health.h"
> > > > >  #include "xfs_btree.h"
> > > > > +#include "xfs_da_format.h"
> > > > > +#include "xfs_da_btree.h"
> > > > >  
> > > > >  /*
> > > > >   * Warn about metadata corruption that we detected but haven't fixed, and
> > > > > @@ -517,3 +519,40 @@ xfs_btree_mark_sick(
> > > > >  
> > > > >  	xfs_agno_mark_sick(cur->bc_mp, cur->bc_private.a.agno, mask);
> > > > >  }
> > > > > +
> > > > > +/*
> > > > > + * Record observations of dir/attr btree corruption with the health tracking
> > > > > + * system.
> > > > > + */
> > > > > +void
> > > > > +xfs_dirattr_mark_sick(
> > > > > +	struct xfs_inode	*ip,
> > > > > +	int			whichfork)
> > > > > +{
> > > > > +	unsigned int		mask;
> > > > > +
> > > > > +	switch (whichfork) {
> > > > > +	case XFS_DATA_FORK:
> > > > > +		mask = XFS_SICK_INO_DIR;
> > > > > +		break;
> > > > > +	case XFS_ATTR_FORK:
> > > > > +		mask = XFS_SICK_INO_XATTR;
> > > > > +		break;
> > > > > +	default:
> > > > > +		ASSERT(0);
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > > +	xfs_inode_mark_sick(ip, mask);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Record observations of dir/attr btree corruption with the health tracking
> > > > > + * system.
> > > > > + */
> > > > > +void
> > > > > +xfs_da_mark_sick(
> > > > > +	struct xfs_da_args	*args)
> > > > > +{
> > > > > +	xfs_dirattr_mark_sick(args->dp, args->whichfork);
> > > > > +}
> > > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system
  2019-11-22 11:57           ` Brian Foster
@ 2019-11-22 18:10             ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-22 18:10 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Fri, Nov 22, 2019 at 06:57:00AM -0500, Brian Foster wrote:
> On Thu, Nov 21, 2019 at 04:53:13PM -0800, Darrick J. Wong wrote:
> > On Thu, Nov 21, 2019 at 08:26:03AM -0500, Brian Foster wrote:
> > > On Wed, Nov 20, 2019 at 08:43:23AM -0800, Darrick J. Wong wrote:
> > > > On Wed, Nov 20, 2019 at 09:20:47AM -0500, Brian Foster wrote:
> > > > > On Thu, Nov 14, 2019 at 10:19:26AM -0800, Darrick J. Wong wrote:
> > > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > 
> > > > > > Whenever we encounter a corrupt AG header, we should report that to the
> > > > > > health monitoring system for later reporting.
> > > > > > 
> > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > ---
> > > > > >  fs/xfs/libxfs/xfs_alloc.c    |    6 ++++++
> > > > > >  fs/xfs/libxfs/xfs_health.h   |    6 ++++++
> > > > > >  fs/xfs/libxfs/xfs_ialloc.c   |    3 +++
> > > > > >  fs/xfs/libxfs/xfs_refcount.c |    5 ++++-
> > > > > >  fs/xfs/libxfs/xfs_rmap.c     |    5 ++++-
> > > > > >  fs/xfs/libxfs/xfs_sb.c       |    2 ++
> > > > > >  fs/xfs/xfs_health.c          |   17 +++++++++++++++++
> > > > > >  fs/xfs/xfs_inode.c           |    9 +++++++++
> > > > > >  8 files changed, 51 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > 
> > > > > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > > > > index c284e10af491..e75e3ae6c912 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > > > > @@ -26,6 +26,7 @@
> > > > > >  #include "xfs_log.h"
> > > > > >  #include "xfs_ag_resv.h"
> > > > > >  #include "xfs_bmap.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  extern kmem_zone_t	*xfs_bmap_free_item_zone;
> > > > > >  
> > > > > > @@ -699,6 +700,8 @@ xfs_alloc_read_agfl(
> > > > > >  			mp, tp, mp->m_ddev_targp,
> > > > > >  			XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
> > > > > >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_agfl_buf_ops);
> > > > > > +	if (xfs_metadata_is_sick(error))
> > > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGFL);
> > > > > 
> > > > > Any reason we couldn't do some of these in verifiers? I'm assuming we'd
> > > > > still need calls in various external corruption checks, but at least we
> > > > > wouldn't add a requirement to check all future buffer reads, etc.
> > > > 
> > > > I thought about that.  It would be wonderful if C had a syntactically
> > > > slick method to package a function + execution scope and pass that
> > > > through other functions to be called later. :)
> > > > 
> > > > For the per-AG stuff it wouldn't be hard to make the verifier functions
> > > > derive the AG number and call xfs_agno_mark_sick directly in the
> > > > verifier.  For per-inode metadata, we'd have to find a way to pass the
> > > > struct xfs_inode pointer to the verifier, which means that we'd have to
> > > > add that to struct xfs_buf.
> > > > 
> > > > xfs_buf is ~384 bytes so maybe adding another pointer for read context
> > > > wouldn't be terrible?  That would add a fair amount of ugly special
> > > > casing in the btree code to decide if we have an inode to pass through,
> > > > though it would solve the problem of the bmbt verifier not being able to
> > > > check the owner field in the btree block header.
> > > > 
> > > > OTOH that's 8 bytes of overhead that we can never get rid of even though
> > > > we only really need it the first time the buffer gets read in from disk.
> > > > 
> > > > Thoughts?
> > > > 
> > > 
> > > That doesn't seem too unreasonable, but I guess I'd have to think about
> > > it some more. Maybe it's worth defining a private pointer in the buffer
> > > that callers can use to pass specific context to verifiers for health
> > > processing. I suppose such a field could also be conditionally defined
> > > on scrub enabled kernels (at least initially), so the overhead would be
> > > opt-in.
> > 
> > Looking further into this, what if we did something like the
> > following:
> > 
> > struct xfs_buf_verify {
> > 	const struct xfs_buf_ops	*ops;
> > 	struct xfs_inode		*ip;
> > 	unsigned int			sick_flags;
> > 	/* whatever else */
> > };
> > 
> > ...then we change the _read_buf and _trans_read_buf functions to take as
> > the final argument a (struct xfs_buf_verify *).  In the xfs_buf_reverify
> > cases, we can pass this context straight through to the ->verify_read
> > function.
> > 
> > To handle the !DONE case where the buffer read completion can happen
> > asynchronously, we change the b_ops field definition to:
> > 
> > 	union {
> > 		struct xfs_buf_ops	*b_ops;
> > 		struct xfs_buf_verify	*b_vctx;
> > 	};
> > 
> > Next we define a new XBF_HAVE_VERIFY_CTX flag that means b_vctx is
> > active and not ops.  xfs_buf_read_map can set the flag and b_vctx for
> > any synchronous (!XBF_ASYNC) read because we know the caller will be
> > asleep waiting for b_iowait and therefore cannot kill the verifier
> > context structure.  Once we get to xfs_buf_ioend we can set b_ops, drop
> > the XBF_H_V_C flag, and call ->verify_read.
> > 
> > Now we actually /can/ pass the inode pointer into the verifier, along
> > with pretty much anything else we can think of.
> > 
> > Does that sound reasonable?  Or totally heinous? :)
> > 
> 
> That sounds reasonable to me and potentially a nice way to mitigate
> additional overhead. I suppose we'd also need a means to abstract the
> various contextual data fed into the type-specific verifiers (i.e., does
> the verifier care about inode health state? perag? both?).

For an initial RFC I think I could start with adding some trivial
helpers to initialize the xfs_buf_verify structure, so that it's a
little more obvious what the intended usage patterns are.
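
For illustration only, a minimal userspace sketch of what such helpers
might look like.  The names xfs_buf_verify_init_ag,
xfs_buf_verify_init_inode and example_bmbt_owner_ok are hypothetical and
do not come from any posted patch, and the struct definitions are
stand-ins for the real kernel types:

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel types; assumptions for this sketch only. */
struct xfs_buf_ops {
	const char			*name;
};

struct xfs_inode {
	unsigned long long		i_ino;
};

/* Verifier context as proposed earlier in the thread. */
struct xfs_buf_verify {
	const struct xfs_buf_ops	*ops;
	struct xfs_inode		*ip;	/* NULL for per-AG metadata */
	unsigned int			sick_flags;
};

/* Hypothetical helper: set up a context for per-AG metadata. */
static inline void
xfs_buf_verify_init_ag(
	struct xfs_buf_verify		*vctx,
	const struct xfs_buf_ops	*ops,
	unsigned int			sick_flags)
{
	vctx->ops = ops;
	vctx->ip = NULL;
	vctx->sick_flags = sick_flags;
}

/* Hypothetical helper: set up a context for per-inode metadata. */
static inline void
xfs_buf_verify_init_inode(
	struct xfs_buf_verify		*vctx,
	const struct xfs_buf_ops	*ops,
	struct xfs_inode		*ip,
	unsigned int			sick_flags)
{
	vctx->ops = ops;
	vctx->ip = ip;
	vctx->sick_flags = sick_flags;
}

/*
 * Hypothetical owner check: with an inode in the context, a verifier
 * could compare the owner recorded in the block header against it.
 * Without an inode there is nothing to compare, so the check passes.
 */
static inline bool
example_bmbt_owner_ok(
	const struct xfs_buf_verify	*vctx,
	unsigned long long		owner)
{
	if (!vctx->ip)
		return true;
	return owner == vctx->ip->i_ino;
}
```

Separate init helpers would at least make it obvious at the call site
whether a given read is supplying inode context or only per-AG context.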

> Would you
> plan to do that with higher level wrappers and/or perhaps use similar
> union/flag magic in the xfs_buf_verify context to indicate which state
> an instance happens to provide? It might be worth a quick and dirty RFC
> to answer these questions with a couple examples and get any API
> feedback before running through the full set of verifiers..

Yeah.  I think I'll do a quick RFC to implement the bmbt owner check or
something.

--D

> Brian
> 
> > > Anyways, I think for this series it might be reasonable to push things
> > > down into verifiers opportunistically where we can do so without any
> > > core mechanism changes. We can follow up with changes to do the rest if
> > > we can come up with something elegant.
> > 
> > Ok.  I think I will try to implement such a beast for 5.6 and then put
> > this series after it.
> > 
> > > > > >  	if (error)
> > > > > >  		return error;
> > > > > >  	xfs_buf_set_ref(bp, XFS_AGFL_REF);
> > > > > > @@ -722,6 +725,7 @@ xfs_alloc_update_counters(
> > > > > >  	if (unlikely(be32_to_cpu(agf->agf_freeblks) >
> > > > > >  		     be32_to_cpu(agf->agf_length))) {
> > > > > >  		xfs_buf_corruption_error(agbp);
> > > > > > +		xfs_ag_mark_sick(pag, XFS_SICK_AG_AGF);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > @@ -2952,6 +2956,8 @@ xfs_read_agf(
> > > > > >  			mp, tp, mp->m_ddev_targp,
> > > > > >  			XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
> > > > > >  			XFS_FSS_TO_BB(mp, 1), flags, bpp, &xfs_agf_buf_ops);
> > > > > > +	if (xfs_metadata_is_sick(error))
> > > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGF);
> > > > > >  	if (error)
> > > > > >  		return error;
> > > > > >  	if (!*bpp)
> > > > > > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > > > > > index 3657a9cb8490..ce8954a10c66 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_health.h
> > > > > > +++ b/fs/xfs/libxfs/xfs_health.h
> > > > > > @@ -123,6 +123,8 @@ void xfs_rt_mark_healthy(struct xfs_mount *mp, unsigned int mask);
> > > > > >  void xfs_rt_measure_sickness(struct xfs_mount *mp, unsigned int *sick,
> > > > > >  		unsigned int *checked);
> > > > > >  
> > > > > > +void xfs_agno_mark_sick(struct xfs_mount *mp, xfs_agnumber_t agno,
> > > > > > +		unsigned int mask);
> > > > > >  void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask);
> > > > > >  void xfs_ag_mark_checked(struct xfs_perag *pag, unsigned int mask);
> > > > > >  void xfs_ag_mark_healthy(struct xfs_perag *pag, unsigned int mask);
> > > > > > @@ -203,4 +205,8 @@ void xfs_fsop_geom_health(struct xfs_mount *mp, struct xfs_fsop_geom *geo);
> > > > > >  void xfs_ag_geom_health(struct xfs_perag *pag, struct xfs_ag_geometry *ageo);
> > > > > >  void xfs_bulkstat_health(struct xfs_inode *ip, struct xfs_bulkstat *bs);
> > > > > >  
> > > > > > +#define xfs_metadata_is_sick(error) \
> > > > > > +	(unlikely((error) == -EFSCORRUPTED || (error) == -EIO || \
> > > > > > +		  (error) == -EFSBADCRC))
> > > > > 
> > > > > Why is -EIO considered sick? My understanding is that once something is
> > > > > marked sick, scrub is the only way to clear that state. -EIO can be
> > > > > transient, so afaict that means we could mark a persistent in-core state
> > > > > based on a transient/resolved issue.
> > > > 
> > > > I think it sounds reasonable that if the fs hits a metadata IO error
> > > > then the administrator should scrub that data structure to make sure
> > > > it's ok, and if so, clear the sick state.
> > > > 
> > > 
> > > I'm not totally convinced... I thought we had configurations where I/O
> > > errors can be reasonably expected and recovered from. For example,
> > > consider the thin provisioning + infinite metadata writeback error retry
> > > mechanism. IIRC, the whole purpose of that was to facilitate the use
> > > case where the thin pool runs out of space, but the admin wants some
> > > window of time to expand and keep the filesystem alive.
> > 
> > Aha, I just realized that it's not clear from the macro definition that
> > I was only intending it to be called from the read path.
> > 
> > Though I guess there's always the possibility that the PFY trips over
> > the PCIE cable in the datacenter and XFS hits an EIO, but the disk will
> > be fine a moment later when he shoves it back in.  The disk media is
> > fine, and by that point either we returned read error to userspace or
> > the transaction got cancelled and it's too late to do anything anyway.
> > 
> > I'll drop the EIO check for now and we'll see if I get around to
> > revisiting it.
> > 
> > > I don't necessarily think it's a bad thing to suggest a scrub any time
> > > errors have occurred, but for something like the above where an
> > > environment may have been thoroughly tested and verified through that
> > > particular error->expand sequence, it seems that flagging bits as sick
> > > might be unnecessarily ominous.
> > 
> > <shrug> Yeah, (sick && !checked) is a weird passive-aggressive state
> > like that.
> > 
> > > > Though I realized just now that if scrub isn't enabled then it's an
> > > > unfixable dead end so the EIO check should be gated on
> > > > CONFIG_XFS_ONLINE_SCRUB=y.
> > > > 
> > > 
> > > Yeah, that was my initial concern..
> > > 
> > > > > Along similar lines, what's the expected behavior in the event of any of
> > > > > these errors for a kernel that might not support
> > > > > CONFIG_XFS_ONLINE_[SCRUB|REPAIR]? Just set the states that are never
> > > > > used for anything? If so, that seems Ok I suppose.. but it's a little
> > > > > awkward if we'd see the tracepoints and such associated with the state
> > > > > changes.
> > > > 
> > > > Even if scrub is disabled, the kernel will still set the sick state, and
> > > > later the administrator can query the filesystem with xfs_spaceman to
> > > > observe that sick state.
> > > > 
> > > 
> > > Ok, so it's intended to be a valid health state independent of scrub.
> > > That seems reasonable in principle and can always be used to indicate
> > > offline repair is necessary too.
> > 
> > Yes.
> > 
> > > > In the future, I will also use the per-AG sick states to steer
> > > > allocations away from known problematic AGs to try to avoid
> > > > unexpected shutdown in the middle of a transaction.
> > > > 
> > > 
> > > Hmm.. I'm a little curious about how much we should steer away from
> > > traditional behavior on kernels that might not support scrub. I suppose
> > > I could see arguments for going either way, but this is getting a bit
> > > ahead of this patch anyways. ;)
> > 
> > Yeah.  I /do/ have prototype patches buried in my dev tree but they are
> > too ugly not to let all the magic smoke out.  What really happens is
> > that when we hit a corruption error, we mark the AG as offline.  Then
> > the sysadmin can run xfs_scrub to fix it (which would set the AG back
> > online) or I guess we could have a spaceman -x command to force it back
> > online.
> > 
> > I always build in /some/ kind of manual override somewhere... :)
> > 
> > --D
> > 
> > > Brian
> > > 
> > > > --D
> > > > 
> > > > > 
> > > > > Brian
> > > > > 
> > > > > > +
> > > > > >  #endif	/* __XFS_HEALTH_H__ */
> > > > > > diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> > > > > > index 988cde7744e6..c401512a4350 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_ialloc.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_ialloc.c
> > > > > > @@ -27,6 +27,7 @@
> > > > > >  #include "xfs_trace.h"
> > > > > >  #include "xfs_log.h"
> > > > > >  #include "xfs_rmap.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  /*
> > > > > >   * Lookup a record by ino in the btree given by cur.
> > > > > > @@ -2635,6 +2636,8 @@ xfs_read_agi(
> > > > > >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> > > > > >  			XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
> > > > > >  			XFS_FSS_TO_BB(mp, 1), 0, bpp, &xfs_agi_buf_ops);
> > > > > > +	if (xfs_metadata_is_sick(error))
> > > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > > > > >  	if (error)
> > > > > >  		return error;
> > > > > >  	if (tp)
> > > > > > diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
> > > > > > index d7d702ee4d1a..25c87834e42a 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_refcount.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_refcount.c
> > > > > > @@ -22,6 +22,7 @@
> > > > > >  #include "xfs_bit.h"
> > > > > >  #include "xfs_refcount.h"
> > > > > >  #include "xfs_rmap.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  /* Allowable refcount adjustment amounts. */
> > > > > >  enum xfs_refc_adjust_op {
> > > > > > @@ -1177,8 +1178,10 @@ xfs_refcount_finish_one(
> > > > > >  				XFS_ALLOC_FLAG_FREEING, &agbp);
> > > > > >  		if (error)
> > > > > >  			return error;
> > > > > > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > > > > > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > > > > > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> > > > > >  			return -EFSCORRUPTED;
> > > > > > +		}
> > > > > >  
> > > > > >  		rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, agno);
> > > > > >  		if (!rcur) {
> > > > > > diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> > > > > > index ff9412f113c4..a54a3c129cce 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_rmap.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_rmap.c
> > > > > > @@ -21,6 +21,7 @@
> > > > > >  #include "xfs_errortag.h"
> > > > > >  #include "xfs_error.h"
> > > > > >  #include "xfs_inode.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  /*
> > > > > >   * Lookup the first record less than or equal to [bno, len, owner, offset]
> > > > > > @@ -2400,8 +2401,10 @@ xfs_rmap_finish_one(
> > > > > >  		error = xfs_free_extent_fix_freelist(tp, agno, &agbp);
> > > > > >  		if (error)
> > > > > >  			return error;
> > > > > > -		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp))
> > > > > > +		if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) {
> > > > > > +			xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGF);
> > > > > >  			return -EFSCORRUPTED;
> > > > > > +		}
> > > > > >  
> > > > > >  		rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, agno);
> > > > > >  		if (!rcur) {
> > > > > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > > > > > index 0ac69751fe85..4a923545465d 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_sb.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_sb.c
> > > > > > @@ -1169,6 +1169,8 @@ xfs_sb_read_secondary(
> > > > > >  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> > > > > >  			XFS_AG_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
> > > > > >  			XFS_FSS_TO_BB(mp, 1), 0, &bp, &xfs_sb_buf_ops);
> > > > > > +	if (xfs_metadata_is_sick(error))
> > > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_SB);
> > > > > >  	if (error)
> > > > > >  		return error;
> > > > > >  	xfs_buf_set_ref(bp, XFS_SSB_REF);
> > > > > > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > > > > > index 860dc70c99e7..36c32b108b39 100644
> > > > > > --- a/fs/xfs/xfs_health.c
> > > > > > +++ b/fs/xfs/xfs_health.c
> > > > > > @@ -200,6 +200,23 @@ xfs_rt_measure_sickness(
> > > > > >  	spin_unlock(&mp->m_sb_lock);
> > > > > >  }
> > > > > >  
> > > > > > +/* Mark unhealthy per-ag metadata given a raw AG number. */
> > > > > > +void
> > > > > > +xfs_agno_mark_sick(
> > > > > > +	struct xfs_mount	*mp,
> > > > > > +	xfs_agnumber_t		agno,
> > > > > > +	unsigned int		mask)
> > > > > > +{
> > > > > > +	struct xfs_perag	*pag = xfs_perag_get(mp, agno);
> > > > > > +
> > > > > > +	/* per-ag structure not set up yet? */
> > > > > > +	if (!pag)
> > > > > > +		return;
> > > > > > +
> > > > > > +	xfs_ag_mark_sick(pag, mask);
> > > > > > +	xfs_perag_put(pag);
> > > > > > +}
> > > > > > +
> > > > > >  /* Mark unhealthy per-ag metadata. */
> > > > > >  void
> > > > > >  xfs_ag_mark_sick(
> > > > > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > > > > index 401da197f012..a2812cea748d 100644
> > > > > > --- a/fs/xfs/xfs_inode.c
> > > > > > +++ b/fs/xfs/xfs_inode.c
> > > > > > @@ -35,6 +35,7 @@
> > > > > >  #include "xfs_log.h"
> > > > > >  #include "xfs_bmap_btree.h"
> > > > > >  #include "xfs_reflink.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  kmem_zone_t *xfs_inode_zone;
> > > > > >  
> > > > > > @@ -787,6 +788,8 @@ xfs_ialloc(
> > > > > >  	 */
> > > > > >  	if ((pip && ino == pip->i_ino) || !xfs_verify_dir_ino(mp, ino)) {
> > > > > >  		xfs_alert(mp, "Allocated a known in-use inode 0x%llx!", ino);
> > > > > > +		xfs_agno_mark_sick(mp, XFS_INO_TO_AGNO(mp, ino),
> > > > > > +				XFS_SICK_AG_INOBT);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > @@ -2137,6 +2140,7 @@ xfs_iunlink_update_bucket(
> > > > > >  	 */
> > > > > >  	if (old_value == new_agino) {
> > > > > >  		xfs_buf_corruption_error(agibp);
> > > > > > +		xfs_agno_mark_sick(tp->t_mountp, agno, XFS_SICK_AG_AGI);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > @@ -2203,6 +2207,7 @@ xfs_iunlink_update_inode(
> > > > > >  	if (!xfs_verify_agino_or_null(mp, agno, old_value)) {
> > > > > >  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
> > > > > >  				sizeof(*dip), __this_address);
> > > > > > +		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> > > > > >  		error = -EFSCORRUPTED;
> > > > > >  		goto out;
> > > > > >  	}
> > > > > > @@ -2217,6 +2222,7 @@ xfs_iunlink_update_inode(
> > > > > >  		if (next_agino != NULLAGINO) {
> > > > > >  			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
> > > > > >  					dip, sizeof(*dip), __this_address);
> > > > > > +			xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  		}
> > > > > >  		goto out;
> > > > > > @@ -2271,6 +2277,7 @@ xfs_iunlink(
> > > > > >  	if (next_agino == agino ||
> > > > > >  	    !xfs_verify_agino_or_null(mp, agno, next_agino)) {
> > > > > >  		xfs_buf_corruption_error(agibp);
> > > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > @@ -2408,6 +2415,7 @@ xfs_iunlink_map_prev(
> > > > > >  			XFS_CORRUPTION_ERROR(__func__,
> > > > > >  					XFS_ERRLEVEL_LOW, mp,
> > > > > >  					*dipp, sizeof(**dipp));
> > > > > > +			xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  			return error;
> > > > > >  		}
> > > > > > @@ -2454,6 +2462,7 @@ xfs_iunlink_remove(
> > > > > >  	if (!xfs_verify_agino(mp, agno, head_agino)) {
> > > > > >  		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
> > > > > >  				agi, sizeof(*agi));
> > > > > > +		xfs_agno_mark_sick(mp, agno, XFS_SICK_AG_AGI);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/9] xfs: report dir/attr block corruption errors to the health system
  2019-11-22 12:28           ` Brian Foster
@ 2019-11-22 18:35             ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2019-11-22 18:35 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Fri, Nov 22, 2019 at 07:28:49AM -0500, Brian Foster wrote:
> On Thu, Nov 21, 2019 at 05:03:32PM -0800, Darrick J. Wong wrote:
> > On Thu, Nov 21, 2019 at 08:26:27AM -0500, Brian Foster wrote:
> > > On Wed, Nov 20, 2019 at 08:55:08AM -0800, Darrick J. Wong wrote:
> > > > On Wed, Nov 20, 2019 at 11:11:47AM -0500, Brian Foster wrote:
> > > > > On Thu, Nov 14, 2019 at 10:19:46AM -0800, Darrick J. Wong wrote:
> > > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > 
> > > > > > Whenever we encounter corrupt directory or extended attribute blocks, we
> > > > > > should report that to the health monitoring system for later reporting.
> > > > > > 
> > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > ---
> > > > > >  fs/xfs/libxfs/xfs_attr_leaf.c   |    5 ++++-
> > > > > >  fs/xfs/libxfs/xfs_attr_remote.c |   27 ++++++++++++++++-----------
> > > > > >  fs/xfs/libxfs/xfs_da_btree.c    |   29 ++++++++++++++++++++++++++---
> > > > > >  fs/xfs/libxfs/xfs_dir2.c        |    5 ++++-
> > > > > >  fs/xfs/libxfs/xfs_dir2_data.c   |    2 ++
> > > > > >  fs/xfs/libxfs/xfs_dir2_leaf.c   |    3 +++
> > > > > >  fs/xfs/libxfs/xfs_dir2_node.c   |    7 +++++++
> > > > > >  fs/xfs/libxfs/xfs_health.h      |    3 +++
> > > > > >  fs/xfs/xfs_attr_inactive.c      |    4 ++++
> > > > > >  fs/xfs/xfs_attr_list.c          |   16 +++++++++++++---
> > > > > >  fs/xfs/xfs_dir2_readdir.c       |    6 +++++-
> > > > > >  fs/xfs/xfs_health.c             |   39 +++++++++++++++++++++++++++++++++++++++
> > > > > >  12 files changed, 126 insertions(+), 20 deletions(-)
> > > > > > 
> > > > > > 
> > > > > ...
> > > > > > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
> > > > > > index e424b004e3cb..a17622dadf00 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_da_btree.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_da_btree.c
> > > > > ...
> > > > > > @@ -1589,6 +1593,7 @@ xfs_da3_node_lookup_int(
> > > > > >  
> > > > > >  		if (magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC) {
> > > > > >  			xfs_buf_corruption_error(blk->bp);
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			return -EFSCORRUPTED;
> > > > > >  		}
> > > > > >  
> > > > > > @@ -1604,6 +1609,7 @@ xfs_da3_node_lookup_int(
> > > > > >  		/* Tree taller than we can handle; bail out! */
> > > > > >  		if (nodehdr.level >= XFS_DA_NODE_MAXDEPTH) {
> > > > > >  			xfs_buf_corruption_error(blk->bp);
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			return -EFSCORRUPTED;
> > > > > >  		}
> > > > > >  
> > > > > > @@ -1612,6 +1618,7 @@ xfs_da3_node_lookup_int(
> > > > > >  			expected_level = nodehdr.level - 1;
> > > > > >  		else if (expected_level != nodehdr.level) {
> > > > > >  			xfs_buf_corruption_error(blk->bp);
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			return -EFSCORRUPTED;
> > > > > >  		} else
> > > > > >  			expected_level--;
> > > > > > @@ -1663,12 +1670,16 @@ xfs_da3_node_lookup_int(
> > > > > >  		}
> > > > > >  
> > > > > >  		/* We can't point back to the root. */
> > > > > > -		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk))
> > > > > > +		if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk)) {
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			return -EFSCORRUPTED;
> > > > > > +		}
> > > > > >  	}
> > > > > >  
> > > > > > -	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0))
> > > > > > +	if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0)) {
> > > > > > +		xfs_da_mark_sick(args);
> > > > > >  		return -EFSCORRUPTED;
> > > > > > +	}
> > > > > >  
> > > > > >  	/*
> > > > > >  	 * A leaf block that ends in the hashval that we are interested in
> > > > > > @@ -1686,6 +1697,7 @@ xfs_da3_node_lookup_int(
> > > > > >  			args->blkno = blk->blkno;
> > > > > >  		} else {
> > > > > >  			ASSERT(0);
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			return -EFSCORRUPTED;
> > > > > >  		}
> > > > > 
> > > > > I'm just kind of skimming through the rest for general feedback at this
> > > > > point given previous comments, but it might be nice to start using exit
> > > > > labels at some of these places where we're enlarging and duplicating the
> > > > > error path for particular errors.
> > > > 
> > > > Yeah.  This current iteration is pretty wordy since I used coccinelle to
> > > > find all the EFSCORRUPTED clauses and inject the appropriate _mark_sick
> > > > call.
> > > > 
> > > > > It's not so much about the code in
> > > > > these patches, but rather to hopefully ease maintaining these state bits
> > > > > properly in new code where devs/reviewers might not know much about
> > > > > scrub state or have it in mind. Short of having some kind of generic
> > > > > helper to handle corruption state, ISTM that the combination of using
> > > > > verifiers where possible and common exit labels anywhere else we
> > > > > generate -EFSCORRUPTED at multiple places within some function could
> > > > > shrink these patches a bit..
> > > > 
> > > > <nod> Eric suggested on IRC that maybe the _mark_sick functions should
> > > > return EFSCORRUPTED so that we could at least collapse that to:
> > > > 
> > > > if (XFS_IS_CORRUPT(...)) {
> > > > 	error = xfs_da_mark_sick(...);
> > > > 	goto barf;
> > > > }
> > > > 
> > > > However, doing it the wordy way I've done it has the neat effects (IMHO)
> > > > that you can find all the places where xfs decides some metadata is
> > > > corrupt by grepping for EFSCORRUPTED, and confirm that each place it
> > > > does that also has a corresponding _mark_sick call.
> > > > 
> > > 
> > > Yeah, that was actually my thought process in suggesting pushing the
> > > mark_sick() calls down into verifiers as well.
> > 
> > <nod> It does strike me as a little odd that the verifiers are the /one/
> > place where EFSCORRUPTED isn't preceded or followed by a _mark_sick.
> > 
> > > It seems a little more clear (and open to future cleanups) with a
> > > strict pattern of setting sickness in the locations that generate
> > > corruption errors. Of course that likely means some special macro or
> > > something like you propose below, but I didn't want to quite go there
> > > until we could put the state updates in the right places.
> > 
> > Yeah....
> > 
> > > > I guess you could create a dorky shouty wrapper to maintain that greppy
> > > > property:
> > > > 
> > > > #define XFS_DA_EFSCORRUPTED(...) \
> > > > 	(xfs_da_mark_sick(...), -EFSCORRUPTED)
> > > > 
> > > > But... that might be stylistically undesirable.  OTOH I guess it
> > > > wouldn't be so bad either to do:
> > > > 
> > > > 	if (XFS_IS_CORRUPT(...)) {
> > > > 		error = -EFSCORRUPTED;
> > > > 		goto bad;
> > > > 	}
> > > > 
> > > > 	if (XFS_IS_CORRUPT(...)) {
> > > > 		error = -EFSCORRUPTED;
> > > > 		goto bad;
> > > > 	}
> > > > 
> > > > 	return 0;
> > > > bad:
> > > > 	if (error == -EFSCORRUPTED)
> > > > 		xfs_da_mark_sick(...);
> > > > 	return error;
> > > > 
> > > > Or using the shouty macro above:
> > > > 
> > > > 	if (XFS_IS_CORRUPT(...)) {
> > > > 		error = XFS_DA_EFSCORRUPTED(...);
> > > > 		goto bad;
> > > > 	}
> > > > 
> > > > 	if (XFS_IS_CORRUPT(...)) {
> > > > 		error = XFS_DA_EFSCORRUPTED(...);
> > > > 		goto bad;
> > > > 	}
> > > > 
> > > > bad:
> > > > 	return error;
> > > > 
> > > > I'll think about that.  It doesn't sound so bad when coding it up in
> > > > this email.
> > > > 
> > > 
> > > I suppose a macro is nice in that it enforces sickness is updated
> > > wherever -EFSCORRUPTED occurs, or at least can easily be verified by
> > > grepping. I find the separate macros pattern a little confusing, FWIW,
> > > simply because at a glance it looks like a garbled bunch of logic to me.
> > > I.e. I see 'if (IS_CORRUPT()) SOMETHING_CORRUPTED(); ...' and wonder wtf
> > > that is doing, for one. It's also not immediately obvious when we should
> > > use one or not the other, etc. This is getting into bikeshedding
> > > territory though and I don't have much of a better suggestion atm...
> > 
> > ...one /could/ have specific IS_CORRUPT macros mapping to different
> > types of things.  Though I think this could easily get messy:
> > 
> 
> Yep.
> 
> > #define XFS_DIR_IS_CORRUPT(dp, perror, expr) \
> > 	(unlikely(expr) ? xfs_corruption_report(#expr, ...), \
> > 			  *(perror) = -EFSCORRUPTED, \
> > 			  xfs_da_mark_sick(dp, XFS_DATA_FORK), true : false)
> > 
> > I don't want to load up these macros with too much stuff, but I guess at
> > least that reduces the directory code to:
> > 
> > 	if (XFS_DIR_IS_CORRUPT(dp, &error, blah == badvalue))
> > 		goto out;
> > 	...
> > 	if (XFS_DIR_IS_CORRUPT(dp, &error, ugh == NULL))
> > 		return error;
> > out:
> > 	return error;
> > 
> > Though now we're getting pretty far from the original intent to kill off
> > wonky macros.  At least these are less weird, so maybe this won't set
> > off a round of macro bikeshed rage?
> > 
> 
> I dunno.. I'm trying to find an opinion beyond a waffley sense of "is it
> worth changing?" on the whole macro thing. While I agree that the
> original macros are ugly, they never really confused me or affected
> readability so I didn't care too much whether they stay or go TBH.

Same here.

> In general, I think having usable interfaces for the developer and
> readable functional code is more important than how ugly/bloated the
> macro might be. That's why I really don't like the previous example that
> combines multiple "simple" macros and turns that into some reusable
> pattern. The resulting user code is not really readable IMO.

Yeah, now that I've gone through several rounds of reworking things,
inflating those error handling clauses isn't much of an improvement.
I promise I'm not being paid by the kLOC.

> The DIR_IS_CORRUPT() example above reminds me a little more of the
> original macros in that it is easy to use and makes the user code
> concise. Indeed, it somewhat overloads the macro, but that seems
> advantageous to me if the intent of this series is to add more
> boilerplate associated with how we handle corruption errors generically.
> In that regard, I find the DIR_IS_CORRUPT() approach preferable to
> alternatives discussed so far (though I'd probably name it XFS_DA_*()
> for consistency with the underlying health state type). Just my .02
> though.. ;)

Hm, yeah.  I think I'll rework the rest of this series to do that...
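
As a rough sketch of the shape being discussed (userspace only, not the
eventual kernel code), an XFS_DA_IS_CORRUPT() macro could bundle the
report, the error assignment and the health update.  Everything below is
an assumption made for illustration: the stand-in struct xfs_da_args,
the example_corruption_report helper, the local EFSCORRUPTED value and
the example_check_level caller are all hypothetical:

```c
#include <stdbool.h>
#include <stdio.h>

/* Local stand-in value for the sketch; the kernel defines its own. */
#define EFSCORRUPTED	117

/* Stand-in for the real args structure; only tracks a sick bit here. */
struct xfs_da_args {
	unsigned int	sick;
};

/* Stand-in for the health update. */
static void
xfs_da_mark_sick(
	struct xfs_da_args	*args)
{
	args->sick = 1;
}

/* Hypothetical reporting helper; the kernel would log via its own API. */
static void
example_corruption_report(
	const char		*expr,
	const char		*file,
	int			line)
{
	fprintf(stderr, "corruption detected: %s (%s:%d)\n", expr, file, line);
}

/*
 * If expr is true: report it, store -EFSCORRUPTED through perror, mark
 * the structure sick, and evaluate to true.  Otherwise evaluate to
 * false with no side effects.
 */
#define XFS_DA_IS_CORRUPT(args, perror, expr) \
	((expr) ? (example_corruption_report(#expr, __FILE__, __LINE__), \
		   *(perror) = -EFSCORRUPTED, \
		   xfs_da_mark_sick(args), true) : false)

/* Hypothetical caller showing how short the error clause becomes. */
static int
example_check_level(
	struct xfs_da_args	*args,
	int			level,
	int			expected)
{
	int			error = 0;

	if (XFS_DA_IS_CORRUPT(args, &error, level != expected))
		goto out;
	/* ... normal processing would continue here ... */
out:
	return error;
}
```

With this shape each corruption check stays a single if statement, and
grepping for the macro name finds every site that both returns
-EFSCORRUPTED and updates the health state.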

--D

> 
> Brian
> 
> > --D
> > 
> > > 
> > > Brian
> > > 
> > > > --D
> > > > 
> > > > > 
> > > > > Brian
> > > > > 
> > > > > >  		if (((retval == -ENOENT) || (retval == -ENOATTR)) &&
> > > > > > @@ -2250,8 +2262,10 @@ xfs_da3_swap_lastblock(
> > > > > >  	error = xfs_bmap_last_before(tp, dp, &lastoff, w);
> > > > > >  	if (error)
> > > > > >  		return error;
> > > > > > -	if (XFS_IS_CORRUPT(mp, lastoff == 0))
> > > > > > +	if (XFS_IS_CORRUPT(mp, lastoff == 0)) {
> > > > > > +		xfs_da_mark_sick(args);
> > > > > >  		return -EFSCORRUPTED;
> > > > > > +	}
> > > > > >  	/*
> > > > > >  	 * Read the last block in the btree space.
> > > > > >  	 */
> > > > > > @@ -2300,6 +2314,7 @@ xfs_da3_swap_lastblock(
> > > > > >  		if (XFS_IS_CORRUPT(mp,
> > > > > >  				   be32_to_cpu(sib_info->forw) != last_blkno ||
> > > > > >  				   sib_info->magic != dead_info->magic)) {
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  			goto done;
> > > > > >  		}
> > > > > > @@ -2320,6 +2335,7 @@ xfs_da3_swap_lastblock(
> > > > > >  		if (XFS_IS_CORRUPT(mp,
> > > > > >  				   be32_to_cpu(sib_info->back) != last_blkno ||
> > > > > >  				   sib_info->magic != dead_info->magic)) {
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  			goto done;
> > > > > >  		}
> > > > > > @@ -2342,6 +2358,7 @@ xfs_da3_swap_lastblock(
> > > > > >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> > > > > >  		if (XFS_IS_CORRUPT(mp,
> > > > > >  				   level >= 0 && level != par_hdr.level + 1)) {
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  			goto done;
> > > > > >  		}
> > > > > > @@ -2353,6 +2370,7 @@ xfs_da3_swap_lastblock(
> > > > > >  		     entno++)
> > > > > >  			continue;
> > > > > >  		if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) {
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  			goto done;
> > > > > >  		}
> > > > > > @@ -2378,6 +2396,7 @@ xfs_da3_swap_lastblock(
> > > > > >  		xfs_trans_brelse(tp, par_buf);
> > > > > >  		par_buf = NULL;
> > > > > >  		if (XFS_IS_CORRUPT(mp, par_blkno == 0)) {
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  			goto done;
> > > > > >  		}
> > > > > > @@ -2387,6 +2406,7 @@ xfs_da3_swap_lastblock(
> > > > > >  		par_node = par_buf->b_addr;
> > > > > >  		xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
> > > > > >  		if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) {
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  			goto done;
> > > > > >  		}
> > > > > > @@ -2601,6 +2621,7 @@ xfs_dabuf_map(
> > > > > >  					irecs[i].br_state);
> > > > > >  			}
> > > > > >  		}
> > > > > > +		xfs_dirattr_mark_sick(dp, whichfork);
> > > > > >  		error = -EFSCORRUPTED;
> > > > > >  		goto out;
> > > > > >  	}
> > > > > > @@ -2693,6 +2714,8 @@ xfs_da_read_buf(
> > > > > >  	error = xfs_trans_read_buf_map(dp->i_mount, trans,
> > > > > >  					dp->i_mount->m_ddev_targp,
> > > > > >  					mapp, nmap, 0, &bp, ops);
> > > > > > +	if (xfs_metadata_is_sick(error))
> > > > > > +		xfs_dirattr_mark_sick(dp, whichfork);
> > > > > >  	if (error)
> > > > > >  		goto out_free;
> > > > > >  
> > > > > > diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> > > > > > index 0aa87cbde49e..e1aa411a1b8b 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_dir2.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_dir2.c
> > > > > > @@ -18,6 +18,7 @@
> > > > > >  #include "xfs_errortag.h"
> > > > > >  #include "xfs_error.h"
> > > > > >  #include "xfs_trace.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  struct xfs_name xfs_name_dotdot = { (unsigned char *)"..", 2, XFS_DIR3_FT_DIR };
> > > > > >  
> > > > > > @@ -608,8 +609,10 @@ xfs_dir2_isblock(
> > > > > >  	rval = XFS_FSB_TO_B(args->dp->i_mount, last) == args->geo->blksize;
> > > > > >  	if (XFS_IS_CORRUPT(args->dp->i_mount,
> > > > > >  			   rval != 0 &&
> > > > > > -			   args->dp->i_d.di_size != args->geo->blksize))
> > > > > > +			   args->dp->i_d.di_size != args->geo->blksize)) {
> > > > > > +		xfs_da_mark_sick(args);
> > > > > >  		return -EFSCORRUPTED;
> > > > > > +	}
> > > > > >  	*vp = rval;
> > > > > >  	return 0;
> > > > > >  }
> > > > > > diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c
> > > > > > index a6eb71a62b53..80cc9c7ea4e5 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_dir2_data.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_dir2_data.c
> > > > > > @@ -18,6 +18,7 @@
> > > > > >  #include "xfs_trans.h"
> > > > > >  #include "xfs_buf_item.h"
> > > > > >  #include "xfs_log.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  static xfs_failaddr_t xfs_dir2_data_freefind_verify(
> > > > > >  		struct xfs_dir2_data_hdr *hdr, struct xfs_dir2_data_free *bf,
> > > > > > @@ -1170,6 +1171,7 @@ xfs_dir2_data_use_free(
> > > > > >  corrupt:
> > > > > >  	xfs_corruption_error(__func__, XFS_ERRLEVEL_LOW, args->dp->i_mount,
> > > > > >  			hdr, sizeof(*hdr), __FILE__, __LINE__, fa);
> > > > > > +	xfs_da_mark_sick(args);
> > > > > >  	return -EFSCORRUPTED;
> > > > > >  }
> > > > > >  
> > > > > > diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > > > > index 73edd96ce0ac..32d17420fff3 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
> > > > > > @@ -19,6 +19,7 @@
> > > > > >  #include "xfs_trace.h"
> > > > > >  #include "xfs_trans.h"
> > > > > >  #include "xfs_buf_item.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  /*
> > > > > >   * Local function declarations.
> > > > > > @@ -1386,8 +1387,10 @@ xfs_dir2_leaf_removename(
> > > > > >  	bestsp = xfs_dir2_leaf_bests_p(ltp);
> > > > > >  	if (be16_to_cpu(bestsp[db]) != oldbest) {
> > > > > >  		xfs_buf_corruption_error(lbp);
> > > > > > +		xfs_da_mark_sick(args);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > > +
> > > > > >  	/*
> > > > > >  	 * Mark the former data entry unused.
> > > > > >  	 */
> > > > > > diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > > > index 3a8b0625a08b..e0f3ab254a1a 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_dir2_node.c
> > > > > > +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > > > @@ -20,6 +20,7 @@
> > > > > >  #include "xfs_trans.h"
> > > > > >  #include "xfs_buf_item.h"
> > > > > >  #include "xfs_log.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  /*
> > > > > >   * Function declarations.
> > > > > > @@ -228,6 +229,7 @@ __xfs_dir3_free_read(
> > > > > >  	if (fa) {
> > > > > >  		xfs_verifier_error(*bpp, -EFSCORRUPTED, fa);
> > > > > >  		xfs_trans_brelse(tp, *bpp);
> > > > > > +		xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > @@ -440,6 +442,7 @@ xfs_dir2_leaf_to_node(
> > > > > >  	if (be32_to_cpu(ltp->bestcount) >
> > > > > >  				(uint)dp->i_d.di_size / args->geo->blksize) {
> > > > > >  		xfs_buf_corruption_error(lbp);
> > > > > > +		xfs_da_mark_sick(args);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > @@ -514,6 +517,7 @@ xfs_dir2_leafn_add(
> > > > > >  	 */
> > > > > >  	if (index < 0) {
> > > > > >  		xfs_buf_corruption_error(bp);
> > > > > > +		xfs_da_mark_sick(args);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > @@ -733,6 +737,7 @@ xfs_dir2_leafn_lookup_for_addname(
> > > > > >  					   cpu_to_be16(NULLDATAOFF))) {
> > > > > >  				if (curfdb != newfdb)
> > > > > >  					xfs_trans_brelse(tp, curbp);
> > > > > > +				xfs_da_mark_sick(args);
> > > > > >  				return -EFSCORRUPTED;
> > > > > >  			}
> > > > > >  			curfdb = newfdb;
> > > > > > @@ -801,6 +806,7 @@ xfs_dir2_leafn_lookup_for_entry(
> > > > > >  	xfs_dir3_leaf_check(dp, bp);
> > > > > >  	if (leafhdr.count <= 0) {
> > > > > >  		xfs_buf_corruption_error(bp);
> > > > > > +		xfs_da_mark_sick(args);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > @@ -1737,6 +1743,7 @@ xfs_dir2_node_add_datablk(
> > > > > >  			} else {
> > > > > >  				xfs_alert(mp, " ... fblk is NULL");
> > > > > >  			}
> > > > > > +			xfs_da_mark_sick(args);
> > > > > >  			return -EFSCORRUPTED;
> > > > > >  		}
> > > > > >  
> > > > > > diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
> > > > > > index 2049419e9555..d9404cd3d09b 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_health.h
> > > > > > +++ b/fs/xfs/libxfs/xfs_health.h
> > > > > > @@ -38,6 +38,7 @@ struct xfs_perag;
> > > > > >  struct xfs_inode;
> > > > > >  struct xfs_fsop_geom;
> > > > > >  struct xfs_btree_cur;
> > > > > > +struct xfs_da_args;
> > > > > >  
> > > > > >  /* Observable health issues for metadata spanning the entire filesystem. */
> > > > > >  #define XFS_SICK_FS_COUNTERS	(1 << 0)  /* summary counters */
> > > > > > @@ -141,6 +142,8 @@ void xfs_inode_measure_sickness(struct xfs_inode *ip, unsigned int *sick,
> > > > > >  void xfs_health_unmount(struct xfs_mount *mp);
> > > > > >  void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork);
> > > > > >  void xfs_btree_mark_sick(struct xfs_btree_cur *cur);
> > > > > > +void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork);
> > > > > > +void xfs_da_mark_sick(struct xfs_da_args *args);
> > > > > >  
> > > > > >  /* Now some helpers. */
> > > > > >  
> > > > > > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> > > > > > index a78c501f6fb1..429a97494ffa 100644
> > > > > > --- a/fs/xfs/xfs_attr_inactive.c
> > > > > > +++ b/fs/xfs/xfs_attr_inactive.c
> > > > > > @@ -23,6 +23,7 @@
> > > > > >  #include "xfs_quota.h"
> > > > > >  #include "xfs_dir2.h"
> > > > > >  #include "xfs_error.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  /*
> > > > > >   * Look at all the extents for this logical region,
> > > > > > @@ -209,6 +210,7 @@ xfs_attr3_node_inactive(
> > > > > >  	if (level > XFS_DA_NODE_MAXDEPTH) {
> > > > > >  		xfs_trans_brelse(*trans, bp);	/* no locks for later trans */
> > > > > >  		xfs_buf_corruption_error(bp);
> > > > > > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > > >  		return -EFSCORRUPTED;
> > > > > >  	}
> > > > > >  
> > > > > > @@ -256,6 +258,7 @@ xfs_attr3_node_inactive(
> > > > > >  			error = xfs_attr3_leaf_inactive(trans, dp, child_bp);
> > > > > >  			break;
> > > > > >  		default:
> > > > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > > >  			xfs_buf_corruption_error(child_bp);
> > > > > >  			xfs_trans_brelse(*trans, child_bp);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > > @@ -342,6 +345,7 @@ xfs_attr3_root_inactive(
> > > > > >  		error = xfs_attr3_leaf_inactive(trans, dp, bp);
> > > > > >  		break;
> > > > > >  	default:
> > > > > > +		xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > > >  		error = -EFSCORRUPTED;
> > > > > >  		xfs_buf_corruption_error(bp);
> > > > > >  		xfs_trans_brelse(*trans, bp);
> > > > > > diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> > > > > > index 7a099df88a0c..1a2a3d4ce422 100644
> > > > > > --- a/fs/xfs/xfs_attr_list.c
> > > > > > +++ b/fs/xfs/xfs_attr_list.c
> > > > > > @@ -21,6 +21,7 @@
> > > > > >  #include "xfs_error.h"
> > > > > >  #include "xfs_trace.h"
> > > > > >  #include "xfs_dir2.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  STATIC int
> > > > > >  xfs_attr_shortform_compare(const void *a, const void *b)
> > > > > > @@ -88,8 +89,10 @@ xfs_attr_shortform_list(
> > > > > >  		for (i = 0, sfe = &sf->list[0]; i < sf->hdr.count; i++) {
> > > > > >  			if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > > > >  					   !xfs_attr_namecheck(sfe->nameval,
> > > > > > -							       sfe->namelen)))
> > > > > > +							       sfe->namelen))) {
> > > > > > +				xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > > >  				return -EFSCORRUPTED;
> > > > > > +			}
> > > > > >  			context->put_listent(context,
> > > > > >  					     sfe->flags,
> > > > > >  					     sfe->nameval,
> > > > > > @@ -131,6 +134,7 @@ xfs_attr_shortform_list(
> > > > > >  					     context->dp->i_mount, sfe,
> > > > > >  					     sizeof(*sfe));
> > > > > >  			kmem_free(sbuf);
> > > > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > > >  			return -EFSCORRUPTED;
> > > > > >  		}
> > > > > >  
> > > > > > @@ -181,6 +185,7 @@ xfs_attr_shortform_list(
> > > > > >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > > > >  				   !xfs_attr_namecheck(sbp->name,
> > > > > >  						       sbp->namelen))) {
> > > > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  			goto out;
> > > > > >  		}
> > > > > > @@ -268,8 +273,10 @@ xfs_attr_node_list_lookup(
> > > > > >  			return 0;
> > > > > >  
> > > > > >  		/* We can't point back to the root. */
> > > > > > -		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0))
> > > > > > +		if (XFS_IS_CORRUPT(mp, cursor->blkno == 0)) {
> > > > > > +			xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > > >  			return -EFSCORRUPTED;
> > > > > > +		}
> > > > > >  	}
> > > > > >  
> > > > > >  	if (expected_level != 0)
> > > > > > @@ -281,6 +288,7 @@ xfs_attr_node_list_lookup(
> > > > > >  out_corruptbuf:
> > > > > >  	xfs_buf_corruption_error(bp);
> > > > > >  	xfs_trans_brelse(tp, bp);
> > > > > > +	xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
> > > > > >  	return -EFSCORRUPTED;
> > > > > >  }
> > > > > >  
> > > > > > @@ -471,8 +479,10 @@ xfs_attr3_leaf_list_int(
> > > > > >  		}
> > > > > >  
> > > > > >  		if (XFS_IS_CORRUPT(context->dp->i_mount,
> > > > > > -				   !xfs_attr_namecheck(name, namelen)))
> > > > > > +				   !xfs_attr_namecheck(name, namelen))) {
> > > > > > +			xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
> > > > > >  			return -EFSCORRUPTED;
> > > > > > +		}
> > > > > >  		context->put_listent(context, entry->flags,
> > > > > >  					      name, namelen, valuelen);
> > > > > >  		if (context->seen_enough)
> > > > > > diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> > > > > > index 95bc9ef8f5f9..715ded503334 100644
> > > > > > --- a/fs/xfs/xfs_dir2_readdir.c
> > > > > > +++ b/fs/xfs/xfs_dir2_readdir.c
> > > > > > @@ -18,6 +18,7 @@
> > > > > >  #include "xfs_bmap.h"
> > > > > >  #include "xfs_trans.h"
> > > > > >  #include "xfs_error.h"
> > > > > > +#include "xfs_health.h"
> > > > > >  
> > > > > >  /*
> > > > > >   * Directory file type support functions
> > > > > > @@ -119,8 +120,10 @@ xfs_dir2_sf_getdents(
> > > > > >  		ctx->pos = off & 0x7fffffff;
> > > > > >  		if (XFS_IS_CORRUPT(dp->i_mount,
> > > > > >  				   !xfs_dir2_namecheck(sfep->name,
> > > > > > -						       sfep->namelen)))
> > > > > > +						       sfep->namelen))) {
> > > > > > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > > > > >  			return -EFSCORRUPTED;
> > > > > > +		}
> > > > > >  		if (!dir_emit(ctx, (char *)sfep->name, sfep->namelen, ino,
> > > > > >  			    xfs_dir3_get_dtype(mp, filetype)))
> > > > > >  			return 0;
> > > > > > @@ -461,6 +464,7 @@ xfs_dir2_leaf_getdents(
> > > > > >  		if (XFS_IS_CORRUPT(dp->i_mount,
> > > > > >  				   !xfs_dir2_namecheck(dep->name,
> > > > > >  						       dep->namelen))) {
> > > > > > +			xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
> > > > > >  			error = -EFSCORRUPTED;
> > > > > >  			break;
> > > > > >  		}
> > > > > > diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
> > > > > > index 1f09027c55ad..c1b6e8fb72ec 100644
> > > > > > --- a/fs/xfs/xfs_health.c
> > > > > > +++ b/fs/xfs/xfs_health.c
> > > > > > @@ -15,6 +15,8 @@
> > > > > >  #include "xfs_trace.h"
> > > > > >  #include "xfs_health.h"
> > > > > >  #include "xfs_btree.h"
> > > > > > +#include "xfs_da_format.h"
> > > > > > +#include "xfs_da_btree.h"
> > > > > >  
> > > > > >  /*
> > > > > >   * Warn about metadata corruption that we detected but haven't fixed, and
> > > > > > @@ -517,3 +519,40 @@ xfs_btree_mark_sick(
> > > > > >  
> > > > > >  	xfs_agno_mark_sick(cur->bc_mp, cur->bc_private.a.agno, mask);
> > > > > >  }
> > > > > > +
> > > > > > +/*
> > > > > > + * Record observations of dir/attr btree corruption with the health tracking
> > > > > > + * system.
> > > > > > + */
> > > > > > +void
> > > > > > +xfs_dirattr_mark_sick(
> > > > > > +	struct xfs_inode	*ip,
> > > > > > +	int			whichfork)
> > > > > > +{
> > > > > > +	unsigned int		mask;
> > > > > > +
> > > > > > +	switch (whichfork) {
> > > > > > +	case XFS_DATA_FORK:
> > > > > > +		mask = XFS_SICK_INO_DIR;
> > > > > > +		break;
> > > > > > +	case XFS_ATTR_FORK:
> > > > > > +		mask = XFS_SICK_INO_XATTR;
> > > > > > +		break;
> > > > > > +	default:
> > > > > > +		ASSERT(0);
> > > > > > +		return;
> > > > > > +	}
> > > > > > +
> > > > > > +	xfs_inode_mark_sick(ip, mask);
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * Record observations of dir/attr btree corruption with the health tracking
> > > > > > + * system.
> > > > > > + */
> > > > > > +void
> > > > > > +xfs_da_mark_sick(
> > > > > > +	struct xfs_da_args	*args)
> > > > > > +{
> > > > > > +	xfs_dirattr_mark_sick(args->dp, args->whichfork);
> > > > > > +}
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
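
For readers skimming the quoted hunks: the two helpers added to
xfs_health.c map the fork argument to an inode health bit, and
xfs_da_mark_sick() only unpacks the da_args. Below is a minimal userspace
sketch of that dispatch, not the kernel code itself. Every model_* and
MODEL_* name is a hypothetical stand-in, the bit values are placeholders,
and the real xfs_inode_mark_sick() is assumed to do more than OR a mask
into a field.

```c
#include <assert.h>

/* Hypothetical stand-ins for XFS_DATA_FORK / XFS_ATTR_FORK. */
enum model_fork { MODEL_DATA_FORK, MODEL_ATTR_FORK };

/* Placeholder bit values; the real masks live in xfs_health.h. */
#define MODEL_SICK_INO_DIR	(1u << 0)
#define MODEL_SICK_INO_XATTR	(1u << 1)

/* Toy inode that only carries the accumulated sick bits. */
struct model_inode {
	unsigned int		sick;
};

/* Toy da_args carrying just the two fields the helper reads. */
struct model_da_args {
	struct model_inode	*dp;
	int			whichfork;
};

/* Assumed behaviour: remember that this metadata was seen to be bad. */
static void
model_inode_mark_sick(struct model_inode *ip, unsigned int mask)
{
	ip->sick |= mask;
}

/* Pick the health bit from the fork, as the quoted patch does. */
static void
model_dirattr_mark_sick(struct model_inode *ip, int whichfork)
{
	unsigned int		mask;

	switch (whichfork) {
	case MODEL_DATA_FORK:
		mask = MODEL_SICK_INO_DIR;
		break;
	case MODEL_ATTR_FORK:
		mask = MODEL_SICK_INO_XATTR;
		break;
	default:
		/* No dir/attr structure lives in any other fork. */
		assert(0);
		return;
	}

	model_inode_mark_sick(ip, mask);
}

/* Convenience wrapper for callers that already hold a da_args. */
static void
model_da_mark_sick(struct model_da_args *args)
{
	model_dirattr_mark_sick(args->dp, args->whichfork);
}
```

Call sites that hold a da_args use the wrapper. Call sites that only have
the inode, such as the readdir and attr list paths in the quoted diff, pass
the fork explicitly.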


end of thread, other threads:[~2019-11-22 18:35 UTC | newest]

Thread overview: 26+ messages
2019-11-14 18:19 [PATCH v4 0/9] xfs: report corruption to the health trackers Darrick J. Wong
2019-11-14 18:19 ` [PATCH 1/9] xfs: separate the marking of sick and checked metadata Darrick J. Wong
2019-11-20 14:20   ` Brian Foster
2019-11-20 16:12     ` Darrick J. Wong
2019-11-14 18:19 ` [PATCH 2/9] xfs: report ag header corruption errors to the health tracking system Darrick J. Wong
2019-11-20 14:20   ` Brian Foster
2019-11-20 16:43     ` Darrick J. Wong
2019-11-21 13:26       ` Brian Foster
2019-11-22  0:53         ` Darrick J. Wong
2019-11-22 11:57           ` Brian Foster
2019-11-22 18:10             ` Darrick J. Wong
2019-11-14 18:19 ` [PATCH 3/9] xfs: report block map " Darrick J. Wong
2019-11-20 14:21   ` Brian Foster
2019-11-20 16:57     ` Darrick J. Wong
2019-11-14 18:19 ` [PATCH 4/9] xfs: report btree block corruption errors to the health system Darrick J. Wong
2019-11-14 18:19 ` [PATCH 5/9] xfs: report dir/attr " Darrick J. Wong
2019-11-20 16:11   ` Brian Foster
2019-11-20 16:55     ` Darrick J. Wong
2019-11-21 13:26       ` Brian Foster
2019-11-22  1:03         ` Darrick J. Wong
2019-11-22 12:28           ` Brian Foster
2019-11-22 18:35             ` Darrick J. Wong
2019-11-14 18:19 ` [PATCH 6/9] xfs: report symlink " Darrick J. Wong
2019-11-14 18:19 ` [PATCH 7/9] xfs: report inode " Darrick J. Wong
2019-11-14 18:20 ` [PATCH 8/9] xfs: report quota block " Darrick J. Wong
2019-11-14 18:20 ` [PATCH 9/9] xfs: report realtime metadata " Darrick J. Wong
