linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] xfs-5.0: inode btree scrub fixes
@ 2018-11-06  4:08 Darrick J. Wong
  2018-11-06  4:08 ` [PATCH 1/8] xfs: count inode blocks correctly in inobt scrub Darrick J. Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-06  4:08 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

Hi all,

This series fixes some problems with the inode btree scrub code, namely
that the existing code does not handle the case where a single inode
cluster is mapped by multiple inobt records.  The series also cleans up
the rather confusing variable names.

Patch 1 teaches the inode record block counting code to handle the case
where there's more than one inobt record per inode cluster.  We do this
by counting inodes and converting to blocks only at the end.

Patch 2 starts moving the inode geometry stuff to the xchk_iallocbt
context so that we only calculate those things once.

Patches 3-4 move the inobt record alignment checks to a separate
function and enhance the function to check that when we have more than
one inobt record per cluster we actually check that *all* of the
necessary records are present and in the correct order.

Patches 5-7 reorganize the inobt free data checks to deal with the
"multiple inobt records per icluster" situation.  In restructuring the
code to do so, we also rename variables and functions to be less
confusing about what they're there for.

Patch 8 fixes the 'is the inode free?' check to calculate dinode buffer
offsets correctly in the "multiple inobt records per icluster"
situation.

If you're going to start using this mess, you probably ought to just
pull from my git trees.  The kernel patches[1] should apply against
4.20-rc1.

This is an extraordinary way to eat your data.  Enjoy!
Comments and questions are, as always, welcome.

--D

[1] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=djwong-devel

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/8] xfs: count inode blocks correctly in inobt scrub
  2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
@ 2018-11-06  4:08 ` Darrick J. Wong
  2018-11-06 21:20   ` Dave Chinner
  2018-11-06  4:08 ` [PATCH 2/8] xfs: calculate inode cluster geometry once during " Darrick J. Wong
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-06  4:08 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

A big block filesystem might require more than one inobt record to cover
all the inodes in the block.  In these cases it is not correct to round
the irec count up to the nearest block because this causes us to
overestimate the number of inode blocks we expect to find.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/ialloc.c |   23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)


diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index 224dba937492..76ae8338a42e 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -44,6 +44,11 @@ xchk_setup_ag_iallocbt(
 
 /* Inode btree scrubber. */
 
+struct xchk_iallocbt {
+	/* Number of inodes we see while scanning inobt. */
+	unsigned long long	inodes;
+};
+
 /*
  * If we're checking the finobt, cross-reference with the inobt.
  * Otherwise we're checking the inobt; if there is an finobt, make sure
@@ -272,7 +277,7 @@ xchk_iallocbt_rec(
 	union xfs_btree_rec		*rec)
 {
 	struct xfs_mount		*mp = bs->cur->bc_mp;
-	xfs_filblks_t			*inode_blocks = bs->private;
+	struct xchk_iallocbt		*iabt = bs->private;
 	struct xfs_inobt_rec_incore	irec;
 	uint64_t			holes;
 	xfs_agnumber_t			agno = bs->cur->bc_private.a.agno;
@@ -310,8 +315,7 @@ xchk_iallocbt_rec(
 	    (agbno & (xfs_icluster_size_fsb(mp) - 1)))
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 
-	*inode_blocks += XFS_B_TO_FSB(mp,
-			irec.ir_count * mp->m_sb.sb_inodesize);
+	iabt->inodes += irec.ir_count;
 
 	/* Handle non-sparse inodes */
 	if (!xfs_inobt_issparse(irec.ir_holemask)) {
@@ -405,10 +409,11 @@ STATIC void
 xchk_iallocbt_xref_rmap_inodes(
 	struct xfs_scrub	*sc,
 	int			which,
-	xfs_filblks_t		inode_blocks)
+	unsigned long long	inodes)
 {
 	struct xfs_owner_info	oinfo;
 	xfs_filblks_t		blocks;
+	xfs_filblks_t		inode_blocks;
 	int			error;
 
 	if (!sc->sa.rmap_cur || xchk_skip_xref(sc->sm))
@@ -420,6 +425,7 @@ xchk_iallocbt_xref_rmap_inodes(
 			&blocks);
 	if (!xchk_should_check_xref(sc, &error, &sc->sa.rmap_cur))
 		return;
+	inode_blocks = XFS_B_TO_FSB(sc->mp, inodes * sc->mp->m_sb.sb_inodesize);
 	if (blocks != inode_blocks)
 		xchk_btree_xref_set_corrupt(sc, sc->sa.rmap_cur, 0);
 }
@@ -432,13 +438,14 @@ xchk_iallocbt(
 {
 	struct xfs_btree_cur	*cur;
 	struct xfs_owner_info	oinfo;
-	xfs_filblks_t		inode_blocks = 0;
+	struct xchk_iallocbt	iabt = {
+		.inodes		= 0,
+	};
 	int			error;
 
 	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
 	cur = which == XFS_BTNUM_INO ? sc->sa.ino_cur : sc->sa.fino_cur;
-	error = xchk_btree(sc, cur, xchk_iallocbt_rec, &oinfo,
-			&inode_blocks);
+	error = xchk_btree(sc, cur, xchk_iallocbt_rec, &oinfo, &iabt);
 	if (error)
 		return error;
 
@@ -452,7 +459,7 @@ xchk_iallocbt(
 	 * to inode chunks with free inodes.
 	 */
 	if (which == XFS_BTNUM_INO)
-		xchk_iallocbt_xref_rmap_inodes(sc, which, inode_blocks);
+		xchk_iallocbt_xref_rmap_inodes(sc, which, iabt.inodes);
 
 	return error;
 }

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/8] xfs: calculate inode cluster geometry once during inobt scrub
  2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
  2018-11-06  4:08 ` [PATCH 1/8] xfs: count inode blocks correctly in inobt scrub Darrick J. Wong
@ 2018-11-06  4:08 ` Darrick J. Wong
  2018-11-06 21:32   ` Dave Chinner
  2018-11-06  4:08 ` [PATCH 3/8] xfs: check the ir_startino alignment directly Darrick J. Wong
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-06  4:08 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Hoist all the inode cluster geometry information into struct
xchk_iallocbt instead of recomputing them over and over.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/ialloc.c |   42 ++++++++++++++++++++++++++++--------------
 1 file changed, 28 insertions(+), 14 deletions(-)


diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index 76ae8338a42e..3c12a0fe3b38 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -45,8 +45,18 @@ xchk_setup_ag_iallocbt(
 /* Inode btree scrubber. */
 
 struct xchk_iallocbt {
+	/* owner info for inode blocks */
+	struct xfs_owner_info	oinfo;
+
 	/* Number of inodes we see while scanning inobt. */
 	unsigned long long	inodes;
+
+	/* Blocks and inodes per inode cluster. */
+	unsigned int		blocks_per_cluster;
+	unsigned int		inodes_per_cluster;
+
+	/* Block alignment of inode clusters. */
+	unsigned int		cluster_align;
 };
 
 /*
@@ -189,32 +199,30 @@ xchk_iallocbt_check_cluster_freemask(
 STATIC int
 xchk_iallocbt_check_freemask(
 	struct xchk_btree		*bs,
+	struct xchk_iallocbt		*iabt,
 	struct xfs_inobt_rec_incore	*irec)
 {
-	struct xfs_owner_info		oinfo;
 	struct xfs_imap			imap;
 	struct xfs_mount		*mp = bs->cur->bc_mp;
 	struct xfs_dinode		*dip;
 	struct xfs_buf			*bp;
 	xfs_ino_t			fsino;
-	xfs_agino_t			nr_inodes;
+	unsigned int			nr_inodes;
 	xfs_agino_t			agino;
 	xfs_agino_t			chunkino;
 	xfs_agino_t			clusterino;
 	xfs_agblock_t			agbno;
-	int				blks_per_cluster;
 	uint16_t			holemask;
 	uint16_t			ir_holemask;
 	int				error = 0;
 
 	/* Make sure the freemask matches the inode records. */
-	blks_per_cluster = xfs_icluster_size_fsb(mp);
-	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
-	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INODES);
+	nr_inodes = min_t(unsigned int, iabt->inodes_per_cluster,
+			XFS_INODES_PER_CHUNK);
 
 	for (agino = irec->ir_startino;
 	     agino < irec->ir_startino + XFS_INODES_PER_CHUNK;
-	     agino += blks_per_cluster * mp->m_sb.sb_inopblock) {
+	     agino += iabt->blocks_per_cluster * mp->m_sb.sb_inopblock) {
 		fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
 		chunkino = agino - irec->ir_startino;
 		agbno = XFS_AGINO_TO_AGBNO(mp, agino);
@@ -235,17 +243,17 @@ xchk_iallocbt_check_freemask(
 		/* If any part of this is a hole, skip it. */
 		if (ir_holemask) {
 			xchk_xref_is_not_owned_by(bs->sc, agbno,
-					blks_per_cluster, &oinfo);
+					iabt->blocks_per_cluster, &iabt->oinfo);
 			continue;
 		}
 
-		xchk_xref_is_owned_by(bs->sc, agbno, blks_per_cluster,
-				&oinfo);
+		xchk_xref_is_owned_by(bs->sc, agbno, iabt->blocks_per_cluster,
+				&iabt->oinfo);
 
 		/* Grab the inode cluster buffer. */
 		imap.im_blkno = XFS_AGB_TO_DADDR(mp, bs->cur->bc_private.a.agno,
 				agbno);
-		imap.im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
+		imap.im_len = XFS_FSB_TO_BB(mp, iabt->blocks_per_cluster);
 		imap.im_boffset = 0;
 
 		error = xfs_imap_to_bp(mp, bs->cur->bc_tp, &imap,
@@ -311,8 +319,8 @@ xchk_iallocbt_rec(
 
 	/* Make sure this record is aligned to cluster and inoalignmnt size. */
 	agbno = XFS_AGINO_TO_AGBNO(mp, irec.ir_startino);
-	if ((agbno & (xfs_ialloc_cluster_alignment(mp) - 1)) ||
-	    (agbno & (xfs_icluster_size_fsb(mp) - 1)))
+	if ((agbno & (iabt->cluster_align - 1)) ||
+	    (agbno & (iabt->blocks_per_cluster - 1)))
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 
 	iabt->inodes += irec.ir_count;
@@ -353,7 +361,7 @@ xchk_iallocbt_rec(
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 
 check_freemask:
-	error = xchk_iallocbt_check_freemask(bs, &irec);
+	error = xchk_iallocbt_check_freemask(bs, iabt, &irec);
 	if (error)
 		goto out;
 
@@ -440,9 +448,15 @@ xchk_iallocbt(
 	struct xfs_owner_info	oinfo;
 	struct xchk_iallocbt	iabt = {
 		.inodes		= 0,
+		.cluster_align	= xfs_ialloc_cluster_alignment(sc->mp),
+		.blocks_per_cluster = xfs_icluster_size_fsb(sc->mp),
 	};
 	int			error;
 
+	xfs_rmap_ag_owner(&iabt.oinfo, XFS_RMAP_OWN_INODES);
+	iabt.inodes_per_cluster = XFS_OFFBNO_TO_AGINO(sc->mp,
+			iabt.blocks_per_cluster, 0);
+
 	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
 	cur = which == XFS_BTNUM_INO ? sc->sa.ino_cur : sc->sa.fino_cur;
 	error = xchk_btree(sc, cur, xchk_iallocbt_rec, &oinfo, &iabt);

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/8] xfs: check the ir_startino alignment directly
  2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
  2018-11-06  4:08 ` [PATCH 1/8] xfs: count inode blocks correctly in inobt scrub Darrick J. Wong
  2018-11-06  4:08 ` [PATCH 2/8] xfs: calculate inode cluster geometry once during " Darrick J. Wong
@ 2018-11-06  4:08 ` Darrick J. Wong
  2018-11-06 21:34   ` Dave Chinner
  2018-11-06  4:08 ` [PATCH 4/8] xfs: check inobt record alignment on big block filesystems Darrick J. Wong
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-06  4:08 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

In xchk_iallocbt_rec, check the alignment of ir_startino by converting
the inode cluster block alignment into units of inodes instead of the
other way around (converting ir_startino to blocks).  This prevents us
from tripping over off-by-one errors in ir_startino which are obscured
by the inode -> block conversion.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/ialloc.c |   32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)


diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index 3c12a0fe3b38..f498dfca3312 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -278,6 +278,29 @@ xchk_iallocbt_check_freemask(
 	return error;
 }
 
+/* Make sure this record is aligned to cluster and inoalignmnt size. */
+STATIC void
+xchk_iallocbt_rec_alignment(
+	struct xchk_btree		*bs,
+	struct xfs_inobt_rec_incore	*irec)
+{
+	struct xfs_mount		*mp = bs->cur->bc_mp;
+	struct xchk_iallocbt		*iabt = bs->private;
+	xfs_agino_t			imask;
+
+	imask = XFS_OFFBNO_TO_AGINO(mp, iabt->cluster_align, 0) - 1;
+	if (irec->ir_startino & imask) {
+		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+		return;
+	}
+
+	imask = XFS_OFFBNO_TO_AGINO(mp, iabt->blocks_per_cluster, 0) - 1;
+	if (irec->ir_startino & imask) {
+		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+		return;
+	}
+}
+
 /* Scrub an inobt/finobt record. */
 STATIC int
 xchk_iallocbt_rec(
@@ -290,7 +313,6 @@ xchk_iallocbt_rec(
 	uint64_t			holes;
 	xfs_agnumber_t			agno = bs->cur->bc_private.a.agno;
 	xfs_agino_t			agino;
-	xfs_agblock_t			agbno;
 	xfs_extlen_t			len;
 	int				holecount;
 	int				i;
@@ -317,11 +339,9 @@ xchk_iallocbt_rec(
 		goto out;
 	}
 
-	/* Make sure this record is aligned to cluster and inoalignmnt size. */
-	agbno = XFS_AGINO_TO_AGBNO(mp, irec.ir_startino);
-	if ((agbno & (iabt->cluster_align - 1)) ||
-	    (agbno & (iabt->blocks_per_cluster - 1)))
-		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+	xchk_iallocbt_rec_alignment(bs, &irec);
+	if (bs->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+		goto out;
 
 	iabt->inodes += irec.ir_count;
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/8] xfs: check inobt record alignment on big block filesystems
  2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
                   ` (2 preceding siblings ...)
  2018-11-06  4:08 ` [PATCH 3/8] xfs: check the ir_startino alignment directly Darrick J. Wong
@ 2018-11-06  4:08 ` Darrick J. Wong
  2018-11-06 21:36   ` Dave Chinner
  2018-11-06  4:08 ` [PATCH 5/8] xfs: hoist inode cluster checks out of loop Darrick J. Wong
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-06  4:08 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

On a big block filesystem, there may be multiple inobt records covering
a single inode cluster.  These records obviously won't be aligned to
cluster alignment rules, and they must cover the entire cluster.  Teach
scrub to check for these things.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/ialloc.c |   40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)


diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index f498dfca3312..4fadea892440 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -57,6 +57,12 @@ struct xchk_iallocbt {
 
 	/* Block alignment of inode clusters. */
 	unsigned int		cluster_align;
+
+	/* Expected next startino, for big block filesystems. */
+	xfs_agino_t		next_startino;
+
+	/* Expected end of the current inode cluster. */
+	xfs_agino_t		next_cluster_ino;
 };
 
 /*
@@ -288,6 +294,27 @@ xchk_iallocbt_rec_alignment(
 	struct xchk_iallocbt		*iabt = bs->private;
 	xfs_agino_t			imask;
 
+	if (iabt->next_startino != NULLAGINO) {
+		/*
+		 * We're midway through a cluster of inodes that is mapped by
+		 * multiple inobt records.  Did we get the record for the next
+		 * irec in the sequence?
+		 */
+		if (irec->ir_startino != iabt->next_startino) {
+			xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+			return;
+		}
+
+		iabt->next_startino += XFS_INODES_PER_CHUNK;
+
+		/* Are we done with the cluster? */
+		if (iabt->next_startino >= iabt->next_cluster_ino) {
+			iabt->next_startino = NULLAGINO;
+			iabt->next_cluster_ino = NULLAGINO;
+		}
+		return;
+	}
+
 	imask = XFS_OFFBNO_TO_AGINO(mp, iabt->cluster_align, 0) - 1;
 	if (irec->ir_startino & imask) {
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
@@ -299,6 +326,17 @@ xchk_iallocbt_rec_alignment(
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 		return;
 	}
+
+	if (iabt->inodes_per_cluster <= XFS_INODES_PER_CHUNK)
+		return;
+
+	/*
+	 * If this is the start of an inode cluster that can be mapped by
+	 * multiple inobt records, the next inobt record must follow exactly
+	 * after this one.
+	 */
+	iabt->next_startino = irec->ir_startino + XFS_INODES_PER_CHUNK;
+	iabt->next_cluster_ino = irec->ir_startino + iabt->inodes_per_cluster;
 }
 
 /* Scrub an inobt/finobt record. */
@@ -470,6 +508,8 @@ xchk_iallocbt(
 		.inodes		= 0,
 		.cluster_align	= xfs_ialloc_cluster_alignment(sc->mp),
 		.blocks_per_cluster = xfs_icluster_size_fsb(sc->mp),
+		.next_startino	= NULLAGINO,
+		.next_cluster_ino = NULLAGINO,
 	};
 	int			error;
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 5/8] xfs: hoist inode cluster checks out of loop
  2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
                   ` (3 preceding siblings ...)
  2018-11-06  4:08 ` [PATCH 4/8] xfs: check inobt record alignment on big block filesystems Darrick J. Wong
@ 2018-11-06  4:08 ` Darrick J. Wong
  2018-11-06 21:39   ` Dave Chinner
  2018-11-06  4:08 ` [PATCH 6/8] xfs: clean up the inode cluster checking in the inobt scrub Darrick J. Wong
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-06  4:08 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Hoist the inode cluster checks out of the inobt record check loop into
a separate function in preparation for refactoring of that loop.  No
functional changes here; that's in the next patch.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/ialloc.c |  120 +++++++++++++++++++++++++++----------------------
 1 file changed, 66 insertions(+), 54 deletions(-)


diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index 4fadea892440..12158ed8ad29 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -201,12 +201,13 @@ xchk_iallocbt_check_cluster_freemask(
 	return 0;
 }
 
-/* Make sure the free mask is consistent with what the inodes think. */
+/* Check an inode cluster. */
 STATIC int
-xchk_iallocbt_check_freemask(
+xchk_iallocbt_check_cluster(
 	struct xchk_btree		*bs,
 	struct xchk_iallocbt		*iabt,
-	struct xfs_inobt_rec_incore	*irec)
+	struct xfs_inobt_rec_incore	*irec,
+	xfs_agino_t			agino)
 {
 	struct xfs_imap			imap;
 	struct xfs_mount		*mp = bs->cur->bc_mp;
@@ -214,7 +215,6 @@ xchk_iallocbt_check_freemask(
 	struct xfs_buf			*bp;
 	xfs_ino_t			fsino;
 	unsigned int			nr_inodes;
-	xfs_agino_t			agino;
 	xfs_agino_t			chunkino;
 	xfs_agino_t			clusterino;
 	xfs_agblock_t			agbno;
@@ -226,59 +226,71 @@ xchk_iallocbt_check_freemask(
 	nr_inodes = min_t(unsigned int, iabt->inodes_per_cluster,
 			XFS_INODES_PER_CHUNK);
 
-	for (agino = irec->ir_startino;
-	     agino < irec->ir_startino + XFS_INODES_PER_CHUNK;
-	     agino += iabt->blocks_per_cluster * mp->m_sb.sb_inopblock) {
-		fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
-		chunkino = agino - irec->ir_startino;
-		agbno = XFS_AGINO_TO_AGBNO(mp, agino);
-
-		/* Compute the holemask mask for this cluster. */
-		for (clusterino = 0, holemask = 0; clusterino < nr_inodes;
-		     clusterino += XFS_INODES_PER_HOLEMASK_BIT)
-			holemask |= XFS_INOBT_MASK((chunkino + clusterino) /
-					XFS_INODES_PER_HOLEMASK_BIT);
-
-		/* The whole cluster must be a hole or not a hole. */
-		ir_holemask = (irec->ir_holemask & holemask);
-		if (ir_holemask != holemask && ir_holemask != 0) {
-			xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
-			continue;
-		}
+	fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
+	chunkino = agino - irec->ir_startino;
+	agbno = XFS_AGINO_TO_AGBNO(mp, agino);
 
-		/* If any part of this is a hole, skip it. */
-		if (ir_holemask) {
-			xchk_xref_is_not_owned_by(bs->sc, agbno,
-					iabt->blocks_per_cluster, &iabt->oinfo);
-			continue;
-		}
+	/* Compute the holemask mask for this cluster. */
+	for (clusterino = 0, holemask = 0; clusterino < nr_inodes;
+	     clusterino += XFS_INODES_PER_HOLEMASK_BIT)
+		holemask |= XFS_INOBT_MASK((chunkino + clusterino) /
+				XFS_INODES_PER_HOLEMASK_BIT);
 
-		xchk_xref_is_owned_by(bs->sc, agbno, iabt->blocks_per_cluster,
-				&iabt->oinfo);
-
-		/* Grab the inode cluster buffer. */
-		imap.im_blkno = XFS_AGB_TO_DADDR(mp, bs->cur->bc_private.a.agno,
-				agbno);
-		imap.im_len = XFS_FSB_TO_BB(mp, iabt->blocks_per_cluster);
-		imap.im_boffset = 0;
-
-		error = xfs_imap_to_bp(mp, bs->cur->bc_tp, &imap,
-				&dip, &bp, 0, 0);
-		if (!xchk_btree_xref_process_error(bs->sc, bs->cur, 0,
-				&error))
-			continue;
-
-		/* Which inodes are free? */
-		for (clusterino = 0; clusterino < nr_inodes; clusterino++) {
-			error = xchk_iallocbt_check_cluster_freemask(bs,
-					fsino, chunkino, clusterino, irec, bp);
-			if (error) {
-				xfs_trans_brelse(bs->cur->bc_tp, bp);
-				return error;
-			}
-		}
+	/* The whole cluster must be a hole or not a hole. */
+	ir_holemask = (irec->ir_holemask & holemask);
+	if (ir_holemask != holemask && ir_holemask != 0) {
+		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+		return 0;
+	}
+
+	/* If any part of this is a hole, skip it. */
+	if (ir_holemask) {
+		xchk_xref_is_not_owned_by(bs->sc, agbno,
+				iabt->blocks_per_cluster, &iabt->oinfo);
+		return 0;
+	}
+
+	xchk_xref_is_owned_by(bs->sc, agbno, iabt->blocks_per_cluster,
+			&iabt->oinfo);
+
+	/* Grab the inode cluster buffer. */
+	imap.im_blkno = XFS_AGB_TO_DADDR(mp, bs->cur->bc_private.a.agno, agbno);
+	imap.im_len = XFS_FSB_TO_BB(mp, iabt->blocks_per_cluster);
+	imap.im_boffset = 0;
 
-		xfs_trans_brelse(bs->cur->bc_tp, bp);
+	error = xfs_imap_to_bp(mp, bs->cur->bc_tp, &imap, &dip, &bp, 0, 0);
+	if (!xchk_btree_xref_process_error(bs->sc, bs->cur, 0, &error))
+		return 0;
+
+	/* Which inodes are free? */
+	for (clusterino = 0; clusterino < nr_inodes; clusterino++) {
+		error = xchk_iallocbt_check_cluster_freemask(bs, fsino,
+				chunkino, clusterino, irec, bp);
+		if (error)
+			break;
+	}
+
+	xfs_trans_brelse(bs->cur->bc_tp, bp);
+	return error;
+}
+
+/* Make sure the free mask is consistent with what the inodes think. */
+STATIC int
+xchk_iallocbt_check_freemask(
+	struct xchk_btree		*bs,
+	struct xchk_iallocbt		*iabt,
+	struct xfs_inobt_rec_incore	*irec)
+{
+	struct xfs_mount		*mp = bs->cur->bc_mp;
+	xfs_agino_t			agino;
+	int				error = 0;
+
+	for (agino = irec->ir_startino;
+	     agino < irec->ir_startino + XFS_INODES_PER_CHUNK;
+	     agino += iabt->blocks_per_cluster * mp->m_sb.sb_inopblock) {
+		error = xchk_iallocbt_check_cluster(bs, iabt, irec, agino);
+		if (error)
+			break;
 	}
 
 	return error;

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 6/8] xfs: clean up the inode cluster checking in the inobt scrub
  2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
                   ` (4 preceding siblings ...)
  2018-11-06  4:08 ` [PATCH 5/8] xfs: hoist inode cluster checks out of loop Darrick J. Wong
@ 2018-11-06  4:08 ` Darrick J. Wong
  2018-11-06 22:11   ` Dave Chinner
  2018-11-06  4:08 ` [PATCH 7/8] xfs: rename xchk_iallocbt_check_freemask Darrick J. Wong
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-06  4:08 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

The code to check inobt records against inode clusters is a mess of
poorly named variables and unnecessary parameters.  Clean the
unnecessary inode number parameters out of _check_cluster_freemask in
favor of computing them inside the function instead of making the caller
do it.  In xchk_iallocbt_check_cluster, rename the variables to make it
more obvious just what chunk_ino and cluster_ino represent.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/ialloc.c |  130 ++++++++++++++++++++++++++++++-------------------
 fs/xfs/scrub/trace.h  |   45 +++++++++++++++++
 2 files changed, 125 insertions(+), 50 deletions(-)


diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index 12158ed8ad29..645b248faa80 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -147,41 +147,63 @@ xchk_iallocbt_freecount(
 	return hweight64(freemask);
 }
 
-/* Check a particular inode with ir_free. */
+/*
+ * Given the number of an inode within an inode cluster, check that the inode's
+ * allocation status matches ir_free in the inobt record.
+ *
+ * @chunk_ioff is the inode offset of the cluster within the @irec.
+ * @irec is the inobt record.
+ * @bp is the cluster buffer.
+ * @loop_ioff is the inode offset within the inode cluster.
+ */
 STATIC int
-xchk_iallocbt_check_cluster_freemask(
+xchk_iallocbt_check_cluster_ifree(
 	struct xchk_btree		*bs,
-	xfs_ino_t			fsino,
-	xfs_agino_t			chunkino,
-	xfs_agino_t			clusterino,
 	struct xfs_inobt_rec_incore	*irec,
-	struct xfs_buf			*bp)
+	unsigned int			chunk_ioff,
+	struct xfs_buf			*bp,
+	unsigned int			loop_ioff)
 {
-	struct xfs_dinode		*dip;
 	struct xfs_mount		*mp = bs->cur->bc_mp;
-	bool				inode_is_free = false;
+	struct xfs_dinode		*dip;
+	xfs_ino_t			fsino;
+	xfs_agino_t			agino;
+	unsigned int			offset;
+	bool				irec_free;
+	bool				ino_inuse;
 	bool				freemask_ok;
-	bool				inuse;
-	int				error = 0;
+	int				error;
 
 	if (xchk_should_terminate(bs->sc, &error))
 		return error;
 
-	dip = xfs_buf_offset(bp, clusterino * mp->m_sb.sb_inodesize);
+	/*
+	 * Given an inobt record, an offset of a cluster within the record,
+	 * and an offset of an inode within a cluster, compute which fs inode
+	 * we're talking about and the offset of the inode record within the
+	 * inode buffer.
+	 */
+	agino = irec->ir_startino + chunk_ioff + loop_ioff;
+	fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
+	offset = loop_ioff * mp->m_sb.sb_inodesize;
+	if (offset >= BBTOB(bp->b_length)) {
+		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+		goto out;
+	}
+	dip = xfs_buf_offset(bp, offset);
+	irec_free = (irec->ir_free & XFS_INOBT_MASK(chunk_ioff + loop_ioff));
+
 	if (be16_to_cpu(dip->di_magic) != XFS_DINODE_MAGIC ||
-	    (dip->di_version >= 3 &&
-	     be64_to_cpu(dip->di_ino) != fsino + clusterino)) {
+	    (dip->di_version >= 3 && be64_to_cpu(dip->di_ino) != fsino)) {
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 		goto out;
 	}
 
-	if (irec->ir_free & XFS_INOBT_MASK(chunkino + clusterino))
-		inode_is_free = true;
-	error = xfs_icache_inode_is_allocated(mp, bs->cur->bc_tp,
-			fsino + clusterino, &inuse);
+	error = xfs_icache_inode_is_allocated(mp, bs->cur->bc_tp, fsino,
+			&ino_inuse);
 	if (error == -ENODATA) {
 		/* Not cached, just read the disk buffer */
-		freemask_ok = inode_is_free ^ !!(dip->di_mode);
+		freemask_ok = irec_free ^ !!(dip->di_mode);
 		if (!bs->sc->try_harder && !freemask_ok)
 			return -EDEADLOCK;
 	} else if (error < 0) {
@@ -193,7 +215,7 @@ xchk_iallocbt_check_cluster_freemask(
 		goto out;
 	} else {
 		/* Inode is all there. */
-		freemask_ok = inode_is_free ^ inuse;
+		freemask_ok = irec_free ^ ino_inuse;
 	}
 	if (!freemask_ok)
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
@@ -201,44 +223,57 @@ xchk_iallocbt_check_cluster_freemask(
 	return 0;
 }
 
-/* Check an inode cluster. */
+/*
+ * Check that the holemask and freemask of a hypothetical inode cluster match
+ * what's actually on disk.  If sparse inodes are enabled, the cluster does
+ * not actually have to map to inodes if the corresponding holemask bit is set.
+ *
+ * @chunk_ioff is the inode offset of the cluster within the @irec.
+ */
 STATIC int
 xchk_iallocbt_check_cluster(
 	struct xchk_btree		*bs,
 	struct xchk_iallocbt		*iabt,
 	struct xfs_inobt_rec_incore	*irec,
-	xfs_agino_t			agino)
+	unsigned int			chunk_ioff)
 {
 	struct xfs_imap			imap;
 	struct xfs_mount		*mp = bs->cur->bc_mp;
 	struct xfs_dinode		*dip;
 	struct xfs_buf			*bp;
-	xfs_ino_t			fsino;
 	unsigned int			nr_inodes;
-	xfs_agino_t			chunkino;
-	xfs_agino_t			clusterino;
+	xfs_agnumber_t			agno = bs->cur->bc_private.a.agno;
 	xfs_agblock_t			agbno;
-	uint16_t			holemask;
+	unsigned int			loop_ioff;
+	uint16_t			cluster_mask = 0;
 	uint16_t			ir_holemask;
 	int				error = 0;
 
-	/* Make sure the freemask matches the inode records. */
 	nr_inodes = min_t(unsigned int, iabt->inodes_per_cluster,
 			XFS_INODES_PER_CHUNK);
 
-	fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
-	chunkino = agino - irec->ir_startino;
-	agbno = XFS_AGINO_TO_AGBNO(mp, agino);
+	/* Map this inode cluster */
+	agbno = XFS_AGINO_TO_AGBNO(mp, irec->ir_startino + chunk_ioff);
 
-	/* Compute the holemask mask for this cluster. */
-	for (clusterino = 0, holemask = 0; clusterino < nr_inodes;
-	     clusterino += XFS_INODES_PER_HOLEMASK_BIT)
-		holemask |= XFS_INOBT_MASK((chunkino + clusterino) /
+	/* Compute a bitmask for this cluster that can be used for holemask. */
+	for (loop_ioff = 0;
+	     loop_ioff < nr_inodes;
+	     loop_ioff += XFS_INODES_PER_HOLEMASK_BIT)
+		cluster_mask |= XFS_INOBT_MASK((chunk_ioff + loop_ioff) /
 				XFS_INODES_PER_HOLEMASK_BIT);
 
+	ir_holemask = (irec->ir_holemask & cluster_mask);
+	imap.im_blkno = XFS_AGB_TO_DADDR(mp, agno, agbno);
+	imap.im_len = XFS_FSB_TO_BB(mp, iabt->blocks_per_cluster);
+	imap.im_boffset = 0;
+
+	trace_xchk_iallocbt_check_cluster(mp, agno, irec->ir_startino,
+			imap.im_blkno, imap.im_len, chunk_ioff, nr_inodes,
+			cluster_mask, ir_holemask,
+			XFS_INO_TO_OFFSET(mp, irec->ir_startino + chunk_ioff));
+
 	/* The whole cluster must be a hole or not a hole. */
-	ir_holemask = (irec->ir_holemask & holemask);
-	if (ir_holemask != holemask && ir_holemask != 0) {
+	if (ir_holemask != cluster_mask && ir_holemask != 0) {
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 		return 0;
 	}
@@ -254,18 +289,14 @@ xchk_iallocbt_check_cluster(
 			&iabt->oinfo);
 
 	/* Grab the inode cluster buffer. */
-	imap.im_blkno = XFS_AGB_TO_DADDR(mp, bs->cur->bc_private.a.agno, agbno);
-	imap.im_len = XFS_FSB_TO_BB(mp, iabt->blocks_per_cluster);
-	imap.im_boffset = 0;
-
 	error = xfs_imap_to_bp(mp, bs->cur->bc_tp, &imap, &dip, &bp, 0, 0);
 	if (!xchk_btree_xref_process_error(bs->sc, bs->cur, 0, &error))
-		return 0;
+		return error;
 
-	/* Which inodes are free? */
-	for (clusterino = 0; clusterino < nr_inodes; clusterino++) {
-		error = xchk_iallocbt_check_cluster_freemask(bs, fsino,
-				chunkino, clusterino, irec, bp);
+	/* Check free status of each inode within this cluster. */
+	for (loop_ioff = 0; loop_ioff < nr_inodes; loop_ioff++) {
+		error = xchk_iallocbt_check_cluster_ifree(bs, irec, chunk_ioff,
+				bp, loop_ioff);
 		if (error)
 			break;
 	}
@@ -281,14 +312,13 @@ xchk_iallocbt_check_freemask(
 	struct xchk_iallocbt		*iabt,
 	struct xfs_inobt_rec_incore	*irec)
 {
-	struct xfs_mount		*mp = bs->cur->bc_mp;
-	xfs_agino_t			agino;
+	unsigned int			chunk_ioff;
 	int				error = 0;
 
-	for (agino = irec->ir_startino;
-	     agino < irec->ir_startino + XFS_INODES_PER_CHUNK;
-	     agino += iabt->blocks_per_cluster * mp->m_sb.sb_inopblock) {
-		error = xchk_iallocbt_check_cluster(bs, iabt, irec, agino);
+	for (chunk_ioff = 0;
+	     chunk_ioff < XFS_INODES_PER_CHUNK;
+	     chunk_ioff += iabt->inodes_per_cluster) {
+		error = xchk_iallocbt_check_cluster(bs, iabt, irec, chunk_ioff);
 		if (error)
 			break;
 	}
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 4e20f0e48232..f1c144c6dcc5 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -480,6 +480,51 @@ TRACE_EVENT(xchk_xref_error,
 		  __entry->ret_ip)
 );
 
+TRACE_EVENT(xchk_iallocbt_check_cluster,
+	TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno,
+		 xfs_agino_t startino, xfs_daddr_t map_daddr,
+		 unsigned short map_len, unsigned int chunk_ino,
+		 unsigned int nr_inodes, uint16_t cluster_mask,
+		 uint16_t holemask, unsigned int cluster_ino),
+	TP_ARGS(mp, agno, startino, map_daddr, map_len, chunk_ino, nr_inodes,
+		cluster_mask, holemask, cluster_ino),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_agnumber_t, agno)
+		__field(xfs_agino_t, startino)
+		__field(xfs_daddr_t, map_daddr)
+		__field(unsigned short, map_len)
+		__field(unsigned int, chunk_ino)
+		__field(unsigned int, nr_inodes)
+		__field(unsigned int, cluster_ino)
+		__field(uint16_t, cluster_mask)
+		__field(uint16_t, holemask)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->agno = agno;
+		__entry->startino = startino;
+		__entry->map_daddr = map_daddr;
+		__entry->map_len = map_len;
+		__entry->chunk_ino = chunk_ino;
+		__entry->nr_inodes = nr_inodes;
+		__entry->cluster_mask = cluster_mask;
+		__entry->holemask = holemask;
+		__entry->cluster_ino = cluster_ino;
+	),
+	TP_printk("dev %d:%d agno %d startino %u daddr 0x%llx len %d chunkino %u nr_inodes %u cluster_mask 0x%x holemask 0x%x cluster_ino %u",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->agno,
+		  __entry->startino,
+		  __entry->map_daddr,
+		  __entry->map_len,
+		  __entry->chunk_ino,
+		  __entry->nr_inodes,
+		  __entry->cluster_mask,
+		  __entry->holemask,
+		  __entry->cluster_ino)
+)
+
 /* repair tracepoints */
 #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR)
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 7/8] xfs: rename xchk_iallocbt_check_freemask
  2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
                   ` (5 preceding siblings ...)
  2018-11-06  4:08 ` [PATCH 6/8] xfs: clean up the inode cluster checking in the inobt scrub Darrick J. Wong
@ 2018-11-06  4:08 ` Darrick J. Wong
  2018-11-06 22:13   ` Dave Chinner
  2018-11-06  4:08 ` [PATCH 8/8] xfs: scrub big block inode btrees correctly Darrick J. Wong
  2018-11-09 15:07 ` [PATCH 0/8] xfs-5.0: inode btree scrub fixes Christoph Hellwig
  8 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-06  4:08 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Change the name of xchk_iallocbt_check_freemask so that it represents
what the function actually does.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/ialloc.c |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)


diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index 645b248faa80..10c3aaefc2a3 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -305,9 +305,13 @@ xchk_iallocbt_check_cluster(
 	return error;
 }
 
-/* Make sure the free mask is consistent with what the inodes think. */
+/*
+ * For all the inode clusters that could map to this inobt record, make sure
+ * that the holemask makes sense and that the allocation status of each inode
+ * matches the freemask.
+ */
 STATIC int
-xchk_iallocbt_check_freemask(
+xchk_iallocbt_check_clusters(
 	struct xchk_btree		*bs,
 	struct xchk_iallocbt		*iabt,
 	struct xfs_inobt_rec_incore	*irec)
@@ -315,6 +319,13 @@ xchk_iallocbt_check_freemask(
 	unsigned int			chunk_ioff;
 	int				error = 0;
 
+	/*
+	 * For the common case where this inobt record maps to multiple inode
+	 * clusters this will call _check_cluster for each cluster.
+	 *
+	 * For the case that multiple inobt records map to a single cluster,
+	 * this will call _check_cluster once.
+	 */
 	for (chunk_ioff = 0;
 	     chunk_ioff < XFS_INODES_PER_CHUNK;
 	     chunk_ioff += iabt->inodes_per_cluster) {
@@ -461,7 +472,7 @@ xchk_iallocbt_rec(
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 
 check_freemask:
-	error = xchk_iallocbt_check_freemask(bs, iabt, &irec);
+	error = xchk_iallocbt_check_clusters(bs, iabt, &irec);
 	if (error)
 		goto out;
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 8/8] xfs: scrub big block inode btrees correctly
  2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
                   ` (6 preceding siblings ...)
  2018-11-06  4:08 ` [PATCH 7/8] xfs: rename xchk_iallocbt_check_freemask Darrick J. Wong
@ 2018-11-06  4:08 ` Darrick J. Wong
  2018-11-06 22:26   ` Dave Chinner
  2018-11-09 15:07 ` [PATCH 0/8] xfs-5.0: inode btree scrub fixes Christoph Hellwig
  8 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-06  4:08 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Teach scrub how to handle the case that there are one or more inobt
records covering a given inode cluster.  This fixes the operation on big
block filesystems (e.g. 64k blocks, 512 byte inodes).

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/ialloc.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index 10c3aaefc2a3..5a5f8ae31803 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -169,6 +169,7 @@ xchk_iallocbt_check_cluster_ifree(
 	xfs_ino_t			fsino;
 	xfs_agino_t			agino;
 	unsigned int			offset;
+	unsigned int			cluster_ioff;
 	bool				irec_free;
 	bool				ino_inuse;
 	bool				freemask_ok;
@@ -181,11 +182,13 @@ xchk_iallocbt_check_cluster_ifree(
 	 * Given an inobt record, an offset of a cluster within the record,
 	 * and an offset of an inode within a cluster, compute which fs inode
 	 * we're talking about and the offset of the inode record within the
-	 * inode buffer.
+	 * inode buffer, being careful about inobt records that don't align
+	 * with the start of the inode buffer when block sizes are large.
 	 */
 	agino = irec->ir_startino + chunk_ioff + loop_ioff;
 	fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
-	offset = loop_ioff * mp->m_sb.sb_inodesize;
+	cluster_ioff = XFS_INO_TO_OFFSET(mp, irec->ir_startino + chunk_ioff);
+	offset = (cluster_ioff + loop_ioff) * mp->m_sb.sb_inodesize;
 	if (offset >= BBTOB(bp->b_length)) {
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 		goto out;

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/8] xfs: count inode blocks correctly in inobt scrub
  2018-11-06  4:08 ` [PATCH 1/8] xfs: count inode blocks correctly in inobt scrub Darrick J. Wong
@ 2018-11-06 21:20   ` Dave Chinner
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Chinner @ 2018-11-06 21:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 05, 2018 at 08:08:12PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> A big block filesystem might require more than one inobt record to cover
> all the inodes in the block.  In these cases it is not correct to round
> the irec count up to the nearest block because this causes us to
> overestimate the number of inode blocks we expect to find.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/8] xfs: calculate inode cluster geometry once during inobt scrub
  2018-11-06  4:08 ` [PATCH 2/8] xfs: calculate inode cluster geometry once during " Darrick J. Wong
@ 2018-11-06 21:32   ` Dave Chinner
  2018-11-07  3:40     ` Darrick J. Wong
  0 siblings, 1 reply; 21+ messages in thread
From: Dave Chinner @ 2018-11-06 21:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 05, 2018 at 08:08:18PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Hoist all the inode cluster geometry information into struct
> xchk_iallocbt instead of recomputing them over and over.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/scrub/ialloc.c |   42 ++++++++++++++++++++++++++++--------------
>  1 file changed, 28 insertions(+), 14 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
> index 76ae8338a42e..3c12a0fe3b38 100644
> --- a/fs/xfs/scrub/ialloc.c
> +++ b/fs/xfs/scrub/ialloc.c
> @@ -45,8 +45,18 @@ xchk_setup_ag_iallocbt(
>  /* Inode btree scrubber. */
>  
>  struct xchk_iallocbt {
> +	/* owner info for inode blocks */
> +	struct xfs_owner_info	oinfo;
> +
>  	/* Number of inodes we see while scanning inobt. */
>  	unsigned long long	inodes;
> +
> +	/* Blocks and inodes per inode cluster. */
> +	unsigned int		blocks_per_cluster;
> +	unsigned int		inodes_per_cluster;
> +
> +	/* Block alignment of inode clusters. */
> +	unsigned int		cluster_align;
>  };
>  
>  /*
> @@ -189,32 +199,30 @@ xchk_iallocbt_check_cluster_freemask(
>  STATIC int
>  xchk_iallocbt_check_freemask(
>  	struct xchk_btree		*bs,
> +	struct xchk_iallocbt		*iabt,
>  	struct xfs_inobt_rec_incore	*irec)
>  {
> -	struct xfs_owner_info		oinfo;
>  	struct xfs_imap			imap;
>  	struct xfs_mount		*mp = bs->cur->bc_mp;
>  	struct xfs_dinode		*dip;
>  	struct xfs_buf			*bp;
>  	xfs_ino_t			fsino;
> -	xfs_agino_t			nr_inodes;
> +	unsigned int			nr_inodes;
>  	xfs_agino_t			agino;
>  	xfs_agino_t			chunkino;
>  	xfs_agino_t			clusterino;
>  	xfs_agblock_t			agbno;
> -	int				blks_per_cluster;
>  	uint16_t			holemask;
>  	uint16_t			ir_holemask;
>  	int				error = 0;
>  
>  	/* Make sure the freemask matches the inode records. */
> -	blks_per_cluster = xfs_icluster_size_fsb(mp);
> -	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
> -	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INODES);
> +	nr_inodes = min_t(unsigned int, iabt->inodes_per_cluster,
> +			XFS_INODES_PER_CHUNK);

That's a bug fix not just a mechanical change, right? Can you either
call it out in the commit message, or put it in a separate patch?

Apart from that, the patch is good.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

> @@ -440,9 +448,15 @@ xchk_iallocbt(
>  	struct xfs_owner_info	oinfo;
>  	struct xchk_iallocbt	iabt = {
>  		.inodes		= 0,
> +		.cluster_align	= xfs_ialloc_cluster_alignment(sc->mp),
> +		.blocks_per_cluster = xfs_icluster_size_fsb(sc->mp),
>  	};
>  	int			error;
>  
> +	xfs_rmap_ag_owner(&iabt.oinfo, XFS_RMAP_OWN_INODES);
> +	iabt.inodes_per_cluster = XFS_OFFBNO_TO_AGINO(sc->mp,
> +			iabt.blocks_per_cluster, 0);

As an aside, this Seems like an obtuse way to calculate inodes per
cluster. It is just:

	inodes_per_cluster = blocks_per_cluster << mp->m_sb.sb_inopblog;

I guess that's exactly what the macro degenerates to, but there's
lots of code with that exact open coded calculation for inodes per
cluster.

/me wonders if we should just calculate these two values at mount
time and stuff them in the struct xfs_mount...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/8] xfs: check the ir_startino alignment directly
  2018-11-06  4:08 ` [PATCH 3/8] xfs: check the ir_startino alignment directly Darrick J. Wong
@ 2018-11-06 21:34   ` Dave Chinner
  2018-11-07  3:50     ` Darrick J. Wong
  0 siblings, 1 reply; 21+ messages in thread
From: Dave Chinner @ 2018-11-06 21:34 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 05, 2018 at 08:08:25PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> In xchk_iallocbt_rec, check the alignment of ir_startino by converting
> the inode cluster block alignment into units of inodes instead of the
> other way around (converting ir_startino to blocks).  This prevents us
> from tripping over off-by-one errors in ir_startino which are obscured
> by the inode -> block conversion.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/scrub/ialloc.c |   32 ++++++++++++++++++++++++++------
>  1 file changed, 26 insertions(+), 6 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
> index 3c12a0fe3b38..f498dfca3312 100644
> --- a/fs/xfs/scrub/ialloc.c
> +++ b/fs/xfs/scrub/ialloc.c
> @@ -278,6 +278,29 @@ xchk_iallocbt_check_freemask(
>  	return error;
>  }
>  
> +/* Make sure this record is aligned to cluster and inoalignmnt size. */
> +STATIC void
> +xchk_iallocbt_rec_alignment(
> +	struct xchk_btree		*bs,
> +	struct xfs_inobt_rec_incore	*irec)
> +{
> +	struct xfs_mount		*mp = bs->cur->bc_mp;
> +	struct xchk_iallocbt		*iabt = bs->private;
> +	xfs_agino_t			imask;
> +
> +	imask = XFS_OFFBNO_TO_AGINO(mp, iabt->cluster_align, 0) - 1;

Again, this seems obtuse.

	imask = (iabt->cluster_align << mp->m_sb.sb_inopblog) - 1;

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] xfs: check inobt record alignment on big block filesystems
  2018-11-06  4:08 ` [PATCH 4/8] xfs: check inobt record alignment on big block filesystems Darrick J. Wong
@ 2018-11-06 21:36   ` Dave Chinner
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Chinner @ 2018-11-06 21:36 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 05, 2018 at 08:08:31PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> On a big block filesystem, there may be multiple inobt records covering
> a single inode cluster.  These records obviously won't be aligned to
> cluster alignment rules, and they must cover the entire cluster.  Teach
> scrub to check for these things.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks fine.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/8] xfs: hoist inode cluster checks out of loop
  2018-11-06  4:08 ` [PATCH 5/8] xfs: hoist inode cluster checks out of loop Darrick J. Wong
@ 2018-11-06 21:39   ` Dave Chinner
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Chinner @ 2018-11-06 21:39 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 05, 2018 at 08:08:40PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Hoist the inode cluster checks out of the inobt record check loop into
> a separate function in preparation for refactoring of that loop.  No
> functional changes here; that's in the next patch.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

looks ok,

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/8] xfs: clean up the inode cluster checking in the inobt scrub
  2018-11-06  4:08 ` [PATCH 6/8] xfs: clean up the inode cluster checking in the inobt scrub Darrick J. Wong
@ 2018-11-06 22:11   ` Dave Chinner
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Chinner @ 2018-11-06 22:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 05, 2018 at 08:08:46PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> The code to check inobt records against inode clusters is a mess of
> poorly named variables and unnecessary parameters.  Clean the
> unnecessary inode number parameters out of _check_cluster_freemask in
> favor of computing them inside the function instead of making the caller
> do it.  In xchk_iallocbt_check_cluster, rename the variables to make it
> more obvious just what chunk_ino and cluster_ino represent.

Comment about new trace point? (used to debug change?)

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/scrub/ialloc.c |  130 ++++++++++++++++++++++++++++++-------------------
>  fs/xfs/scrub/trace.h  |   45 +++++++++++++++++
>  2 files changed, 125 insertions(+), 50 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
> index 12158ed8ad29..645b248faa80 100644
> --- a/fs/xfs/scrub/ialloc.c
> +++ b/fs/xfs/scrub/ialloc.c
> @@ -147,41 +147,63 @@ xchk_iallocbt_freecount(
>  	return hweight64(freemask);
>  }
>  
> -/* Check a particular inode with ir_free. */
> +/*
> + * Given the number of an inode within an inode cluster, check that the inode's
> + * allocation status matches ir_free in the inobt record.
> + *
> + * @chunk_ioff is the inode offset of the cluster within the @irec.
> + * @irec is the inobt record.
> + * @bp is the cluster buffer.
> + * @loop_ioff is the inode offset within the inode cluster.
> + */
>  STATIC int
> -xchk_iallocbt_check_cluster_freemask(
> +xchk_iallocbt_check_cluster_ifree(
>  	struct xchk_btree		*bs,
> -	xfs_ino_t			fsino,
> -	xfs_agino_t			chunkino,
> -	xfs_agino_t			clusterino,
>  	struct xfs_inobt_rec_incore	*irec,
> -	struct xfs_buf			*bp)
> +	unsigned int			chunk_ioff,
> +	struct xfs_buf			*bp,
> +	unsigned int			loop_ioff)

"loop_ioff" really isn't very descriptive. I've got no idea what it
means in the context of this function. More below.

>  {
> -	struct xfs_dinode		*dip;
>  	struct xfs_mount		*mp = bs->cur->bc_mp;
> -	bool				inode_is_free = false;
> +	struct xfs_dinode		*dip;
> +	xfs_ino_t			fsino;
> +	xfs_agino_t			agino;
> +	unsigned int			offset;
> +	bool				irec_free;
> +	bool				ino_inuse;
>  	bool				freemask_ok;
> -	bool				inuse;
> -	int				error = 0;
> +	int				error;
>  
>  	if (xchk_should_terminate(bs->sc, &error))
>  		return error;
>  
> -	dip = xfs_buf_offset(bp, clusterino * mp->m_sb.sb_inodesize);
> +	/*
> +	 * Given an inobt record, an offset of a cluster within the record,
> +	 * and an offset of an inode within a cluster, compute which fs inode
> +	 * we're talking about and the offset of the inode record within the
> +	 * inode buffer.
> +	 */
> +	agino = irec->ir_startino + chunk_ioff + loop_ioff;
> +	fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
> +	offset = loop_ioff * mp->m_sb.sb_inodesize;
> +	if (offset >= BBTOB(bp->b_length)) {
> +		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
> +		goto out;
> +	}

New check...

> +	dip = xfs_buf_offset(bp, offset);
> +	irec_free = (irec->ir_free & XFS_INOBT_MASK(chunk_ioff + loop_ioff));

And after 15 minutes of trying to work out if this is safe, I can
say that the new code isn't much clearer than the old code.

chunk_ioff is actually the first inode in the current cluster,
and loop_ioff is the current of the inode in the cluster.

They are indexes, not offsets. i.e. chunk_ioff is cluster_base,
and loop_ioff is cluster_index. (or something similar).


> +
>  	if (be16_to_cpu(dip->di_magic) != XFS_DINODE_MAGIC ||
> -	    (dip->di_version >= 3 &&
> -	     be64_to_cpu(dip->di_ino) != fsino + clusterino)) {
> +	    (dip->di_version >= 3 && be64_to_cpu(dip->di_ino) != fsino)) {
>  		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
>  		goto out;
>  	}
>  
> -	if (irec->ir_free & XFS_INOBT_MASK(chunkino + clusterino))
> -		inode_is_free = true;
> -	error = xfs_icache_inode_is_allocated(mp, bs->cur->bc_tp,
> -			fsino + clusterino, &inuse);
> +	error = xfs_icache_inode_is_allocated(mp, bs->cur->bc_tp, fsino,
> +			&ino_inuse);
>  	if (error == -ENODATA) {
>  		/* Not cached, just read the disk buffer */
> -		freemask_ok = inode_is_free ^ !!(dip->di_mode);
> +		freemask_ok = irec_free ^ !!(dip->di_mode);
>  		if (!bs->sc->try_harder && !freemask_ok)
>  			return -EDEADLOCK;
>  	} else if (error < 0) {
> @@ -193,7 +215,7 @@ xchk_iallocbt_check_cluster_freemask(
>  		goto out;
>  	} else {
>  		/* Inode is all there. */
> -		freemask_ok = inode_is_free ^ inuse;
> +		freemask_ok = irec_free ^ ino_inuse;
>  	}
>  	if (!freemask_ok)
>  		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
> @@ -201,44 +223,57 @@ xchk_iallocbt_check_cluster_freemask(
>  	return 0;
>  }
>  
> -/* Check an inode cluster. */
> +/*
> + * Check that the holemask and freemask of a hypothetical inode cluster match
> + * what's actually on disk.  If sparse inodes are enabled, the cluster does
> + * not actually have to map to inodes if the corresponding holemask bit is set.
> + *
> + * @chunk_ioff is the inode offset of the cluster within the @irec.
> + */
>  STATIC int
>  xchk_iallocbt_check_cluster(
>  	struct xchk_btree		*bs,
>  	struct xchk_iallocbt		*iabt,
>  	struct xfs_inobt_rec_incore	*irec,
> -	xfs_agino_t			agino)
> +	unsigned int			chunk_ioff)

cluster_base

>  {
>  	struct xfs_imap			imap;
>  	struct xfs_mount		*mp = bs->cur->bc_mp;
>  	struct xfs_dinode		*dip;
>  	struct xfs_buf			*bp;
> -	xfs_ino_t			fsino;
>  	unsigned int			nr_inodes;
> -	xfs_agino_t			chunkino;
> -	xfs_agino_t			clusterino;
> +	xfs_agnumber_t			agno = bs->cur->bc_private.a.agno;
>  	xfs_agblock_t			agbno;
> -	uint16_t			holemask;
> +	unsigned int			loop_ioff;

cluster_index

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 7/8] xfs: rename xchk_iallocbt_check_freemask
  2018-11-06  4:08 ` [PATCH 7/8] xfs: rename xchk_iallocbt_check_freemask Darrick J. Wong
@ 2018-11-06 22:13   ` Dave Chinner
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Chinner @ 2018-11-06 22:13 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 05, 2018 at 08:08:53PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Change the name of xchk_iallocbt_check_freemask so that it represents
> what the function actually does.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/scrub/ialloc.c |   17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
> index 645b248faa80..10c3aaefc2a3 100644
> --- a/fs/xfs/scrub/ialloc.c
> +++ b/fs/xfs/scrub/ialloc.c
> @@ -305,9 +305,13 @@ xchk_iallocbt_check_cluster(
>  	return error;
>  }
>  
> -/* Make sure the free mask is consistent with what the inodes think. */
> +/*
> + * For all the inode clusters that could map to this inobt record, make sure
> + * that the holemask makes sense and that the allocation status of each inode
> + * matches the freemask.
> + */
>  STATIC int
> -xchk_iallocbt_check_freemask(
> +xchk_iallocbt_check_clusters(
>  	struct xchk_btree		*bs,
>  	struct xchk_iallocbt		*iabt,
>  	struct xfs_inobt_rec_incore	*irec)
> @@ -315,6 +319,13 @@ xchk_iallocbt_check_freemask(
>  	unsigned int			chunk_ioff;
>  	int				error = 0;
>  
> +	/*
> +	 * For the common case where this inobt record maps to multiple inode
> +	 * clusters this will call _check_cluster for each cluster.
> +	 *
> +	 * For the case that multiple inobt records map to a single cluster,
> +	 * this will call _check_cluster once.
> +	 */
>  	for (chunk_ioff = 0;
>  	     chunk_ioff < XFS_INODES_PER_CHUNK;
>  	     chunk_ioff += iabt->inodes_per_cluster) {

This comment really belongs in the previous patch. And, with that, I
think this can be merged totally into the previous patch.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 8/8] xfs: scrub big block inode btrees correctly
  2018-11-06  4:08 ` [PATCH 8/8] xfs: scrub big block inode btrees correctly Darrick J. Wong
@ 2018-11-06 22:26   ` Dave Chinner
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Chinner @ 2018-11-06 22:26 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 05, 2018 at 08:08:59PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Teach scrub how to handle the case that there are one or more inobt
> records covering a given inode cluster.  This fixes the operation on big
> block filesystems (e.g. 64k blocks, 512 byte inodes).
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/scrub/ialloc.c |    7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
> index 10c3aaefc2a3..5a5f8ae31803 100644
> --- a/fs/xfs/scrub/ialloc.c
> +++ b/fs/xfs/scrub/ialloc.c
> @@ -169,6 +169,7 @@ xchk_iallocbt_check_cluster_ifree(
>  	xfs_ino_t			fsino;
>  	xfs_agino_t			agino;
>  	unsigned int			offset;
> +	unsigned int			cluster_ioff;
>  	bool				irec_free;
>  	bool				ino_inuse;
>  	bool				freemask_ok;
> @@ -181,11 +182,13 @@ xchk_iallocbt_check_cluster_ifree(
>  	 * Given an inobt record, an offset of a cluster within the record,
>  	 * and an offset of an inode within a cluster, compute which fs inode
>  	 * we're talking about and the offset of the inode record within the
> -	 * inode buffer.
> +	 * inode buffer, being careful about inobt records that don't align
> +	 * with the start of the inode buffer when block sizes are large.
>  	 */
>  	agino = irec->ir_startino + chunk_ioff + loop_ioff;
>  	fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
> -	offset = loop_ioff * mp->m_sb.sb_inodesize;
> +	cluster_ioff = XFS_INO_TO_OFFSET(mp, irec->ir_startino + chunk_ioff);
> +	offset = (cluster_ioff + loop_ioff) * mp->m_sb.sb_inodesize;

urk.

That's even more confusing because now we have a cluster offset into
the chunk, a chunk offset into the cluster, and an inode offset
into the cluster.

And the cluster inode offset into the chunk is call "chunk offset", the
chunk offset into the cluster is called "cluster offset" and the
inode offset is called "loop offset". And none of them are offsets -
we then calculate and "offset" from all these other "not quite
offsets"....

I'm definitely not going to remember this when I look at the code
again in a week. More help needed, I think.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/8] xfs: calculate inode cluster geometry once during inobt scrub
  2018-11-06 21:32   ` Dave Chinner
@ 2018-11-07  3:40     ` Darrick J. Wong
  0 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-07  3:40 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Wed, Nov 07, 2018 at 08:32:18AM +1100, Dave Chinner wrote:
> On Mon, Nov 05, 2018 at 08:08:18PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Hoist all the inode cluster geometry information into struct
> > xchk_iallocbt instead of recomputing them over and over.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/scrub/ialloc.c |   42 ++++++++++++++++++++++++++++--------------
> >  1 file changed, 28 insertions(+), 14 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
> > index 76ae8338a42e..3c12a0fe3b38 100644
> > --- a/fs/xfs/scrub/ialloc.c
> > +++ b/fs/xfs/scrub/ialloc.c
> > @@ -45,8 +45,18 @@ xchk_setup_ag_iallocbt(
> >  /* Inode btree scrubber. */
> >  
> >  struct xchk_iallocbt {
> > +	/* owner info for inode blocks */
> > +	struct xfs_owner_info	oinfo;
> > +
> >  	/* Number of inodes we see while scanning inobt. */
> >  	unsigned long long	inodes;
> > +
> > +	/* Blocks and inodes per inode cluster. */
> > +	unsigned int		blocks_per_cluster;
> > +	unsigned int		inodes_per_cluster;
> > +
> > +	/* Block alignment of inode clusters. */
> > +	unsigned int		cluster_align;
> >  };
> >  
> >  /*
> > @@ -189,32 +199,30 @@ xchk_iallocbt_check_cluster_freemask(
> >  STATIC int
> >  xchk_iallocbt_check_freemask(
> >  	struct xchk_btree		*bs,
> > +	struct xchk_iallocbt		*iabt,
> >  	struct xfs_inobt_rec_incore	*irec)
> >  {
> > -	struct xfs_owner_info		oinfo;
> >  	struct xfs_imap			imap;
> >  	struct xfs_mount		*mp = bs->cur->bc_mp;
> >  	struct xfs_dinode		*dip;
> >  	struct xfs_buf			*bp;
> >  	xfs_ino_t			fsino;
> > -	xfs_agino_t			nr_inodes;
> > +	unsigned int			nr_inodes;
> >  	xfs_agino_t			agino;
> >  	xfs_agino_t			chunkino;
> >  	xfs_agino_t			clusterino;
> >  	xfs_agblock_t			agbno;
> > -	int				blks_per_cluster;
> >  	uint16_t			holemask;
> >  	uint16_t			ir_holemask;
> >  	int				error = 0;
> >  
> >  	/* Make sure the freemask matches the inode records. */
> > -	blks_per_cluster = xfs_icluster_size_fsb(mp);
> > -	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
> > -	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INODES);
> > +	nr_inodes = min_t(unsigned int, iabt->inodes_per_cluster,
> > +			XFS_INODES_PER_CHUNK);
> 
> That's a bug fix not just a mechanical change, right? Can you either
> call it out in the commit message, or put it in a separate patch?

I'm pulling this into a separate patch.

> Apart from that, the patch is good.
> 
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
> 
> > @@ -440,9 +448,15 @@ xchk_iallocbt(
> >  	struct xfs_owner_info	oinfo;
> >  	struct xchk_iallocbt	iabt = {
> >  		.inodes		= 0,
> > +		.cluster_align	= xfs_ialloc_cluster_alignment(sc->mp),
> > +		.blocks_per_cluster = xfs_icluster_size_fsb(sc->mp),
> >  	};
> >  	int			error;
> >  
> > +	xfs_rmap_ag_owner(&iabt.oinfo, XFS_RMAP_OWN_INODES);
> > +	iabt.inodes_per_cluster = XFS_OFFBNO_TO_AGINO(sc->mp,
> > +			iabt.blocks_per_cluster, 0);
> 
> As an aside, this Seems like an obtuse way to calculate inodes per
> cluster. It is just:
> 
> 	inodes_per_cluster = blocks_per_cluster << mp->m_sb.sb_inopblog;
> 
> I guess that's exactly what the macro degenerates to, but there's
> lots of code with that exact open coded calculation for inodes per
> cluster.
> 
> /me wonders if we should just calculate these two values at mount
> time and stuff them in the struct xfs_mount...

FWIW I went with:

#define	XFS_FSB_TO_INO(mp,b)	(((b) << (mp)->m_sb.sb_inopblog))

and converted the existing XFS_OFFBNO_TO_AGINO(..., 0) callers to use
it.  I'll work on moving all that inode geometry stuff to a separate
structure, but that'll come after this series.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/8] xfs: check the ir_startino alignment directly
  2018-11-06 21:34   ` Dave Chinner
@ 2018-11-07  3:50     ` Darrick J. Wong
  0 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-07  3:50 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Wed, Nov 07, 2018 at 08:34:47AM +1100, Dave Chinner wrote:
> On Mon, Nov 05, 2018 at 08:08:25PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > In xchk_iallocbt_rec, check the alignment of ir_startino by converting
> > the inode cluster block alignment into units of inodes instead of the
> > other way around (converting ir_startino to blocks).  This prevents us
> > from tripping over off-by-one errors in ir_startino which are obscured
> > by the inode -> block conversion.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/scrub/ialloc.c |   32 ++++++++++++++++++++++++++------
> >  1 file changed, 26 insertions(+), 6 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
> > index 3c12a0fe3b38..f498dfca3312 100644
> > --- a/fs/xfs/scrub/ialloc.c
> > +++ b/fs/xfs/scrub/ialloc.c
> > @@ -278,6 +278,29 @@ xchk_iallocbt_check_freemask(
> >  	return error;
> >  }
> >  
> > +/* Make sure this record is aligned to cluster and inoalignmnt size. */
> > +STATIC void
> > +xchk_iallocbt_rec_alignment(
> > +	struct xchk_btree		*bs,
> > +	struct xfs_inobt_rec_incore	*irec)
> > +{
> > +	struct xfs_mount		*mp = bs->cur->bc_mp;
> > +	struct xchk_iallocbt		*iabt = bs->private;
> > +	xfs_agino_t			imask;
> > +
> > +	imask = XFS_OFFBNO_TO_AGINO(mp, iabt->cluster_align, 0) - 1;
> 
> Again, this seems obtuse.
> 
> 	imask = (iabt->cluster_align << mp->m_sb.sb_inopblog) - 1;

Or I'll just add another field to xchk_iallocbt for "inode cluster
alignment in inodes" and use that here and elsewhere.

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] xfs-5.0: inode btree scrub fixes
  2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
                   ` (7 preceding siblings ...)
  2018-11-06  4:08 ` [PATCH 8/8] xfs: scrub big block inode btrees correctly Darrick J. Wong
@ 2018-11-09 15:07 ` Christoph Hellwig
  2018-11-09 16:39   ` Darrick J. Wong
  8 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2018-11-09 15:07 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs


What is the xfs-5.0 in the subject line?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] xfs-5.0: inode btree scrub fixes
  2018-11-09 15:07 ` [PATCH 0/8] xfs-5.0: inode btree scrub fixes Christoph Hellwig
@ 2018-11-09 16:39   ` Darrick J. Wong
  0 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2018-11-09 16:39 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, Nov 09, 2018 at 07:07:06AM -0800, Christoph Hellwig wrote:
> 
> What is the xfs-5.0 in the subject line?

The 5.0 was intended to convey "patches for the next merge window" (as
opposed to "fixes for 4.20").  Maybe I'll just say "xfs-next" from now
on as that clearly didn't come across. :)

--D

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-11-10  2:21 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-06  4:08 [PATCH 0/8] xfs-5.0: inode btree scrub fixes Darrick J. Wong
2018-11-06  4:08 ` [PATCH 1/8] xfs: count inode blocks correctly in inobt scrub Darrick J. Wong
2018-11-06 21:20   ` Dave Chinner
2018-11-06  4:08 ` [PATCH 2/8] xfs: calculate inode cluster geometry once during " Darrick J. Wong
2018-11-06 21:32   ` Dave Chinner
2018-11-07  3:40     ` Darrick J. Wong
2018-11-06  4:08 ` [PATCH 3/8] xfs: check the ir_startino alignment directly Darrick J. Wong
2018-11-06 21:34   ` Dave Chinner
2018-11-07  3:50     ` Darrick J. Wong
2018-11-06  4:08 ` [PATCH 4/8] xfs: check inobt record alignment on big block filesystems Darrick J. Wong
2018-11-06 21:36   ` Dave Chinner
2018-11-06  4:08 ` [PATCH 5/8] xfs: hoist inode cluster checks out of loop Darrick J. Wong
2018-11-06 21:39   ` Dave Chinner
2018-11-06  4:08 ` [PATCH 6/8] xfs: clean up the inode cluster checking in the inobt scrub Darrick J. Wong
2018-11-06 22:11   ` Dave Chinner
2018-11-06  4:08 ` [PATCH 7/8] xfs: rename xchk_iallocbt_check_freemask Darrick J. Wong
2018-11-06 22:13   ` Dave Chinner
2018-11-06  4:08 ` [PATCH 8/8] xfs: scrub big block inode btrees correctly Darrick J. Wong
2018-11-06 22:26   ` Dave Chinner
2018-11-09 15:07 ` [PATCH 0/8] xfs-5.0: inode btree scrub fixes Christoph Hellwig
2018-11-09 16:39   ` Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).