All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/6] xfs_repair: do not trash valid root dirs
@ 2019-12-04 17:04 Darrick J. Wong
  2019-12-04 17:04 ` [PATCH 1/6] xfs: don't commit sunit/swidth updates to disk if that would cause repair failures Darrick J. Wong
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-04 17:04 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, alex

Hi all,

Alex Lyakas observed that xfs_repair can accidentally trash the root
directory on a filesystem.  Specifically, if one formats a V4 filesystem
without sparse inodes but with sunit/swidth set and then mounts that
filesystem with a different sunit/swidth, the kernel will update the
values in the superblock.  This causes xfs_repair's sb_rootino
estimation to differ from the actual root directory, and it discards the
actual root directory even though there's nothing wrong with it.

Therefore, hoist the logic that computes the root inode location into
libxfs so that the kernel will avoid the sb update if the proposed
sunit/swidth changes would alter the sb_rootino estimation; and then
teach xfs_repair to retain the root directory even if the estimation
doesn't add up, as long as sb_rootino points to a directory whose '..'
entry points to itself.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/6] xfs: don't commit sunit/swidth updates to disk if that would cause repair failures
  2019-12-04 17:04 [PATCH v2 0/6] xfs_repair: do not trash valid root dirs Darrick J. Wong
@ 2019-12-04 17:04 ` Darrick J. Wong
  2019-12-04 17:04 ` [PATCH 2/6] mkfs: check root inode location Darrick J. Wong
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-04 17:04 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, alex

From: Darrick J. Wong <darrick.wong@oracle.com>

Alex Lyakas reported[1] that mounting an xfs filesystem with new sunit
and swidth values could cause xfs_repair to fail loudly.  The problem
here is that repair calculates the where mkfs should have allocated the
root inode, based on the superblock geometry.  The allocation decisions
depend on sunit, which means that we really can't go updating sunit if
it would lead to a subsequent repair failure on an otherwise correct
filesystem.

Port the computation code from xfs_repair and teach mount to avoid the
ondisk update if it would cause problems for repair.  We allow the mount
to proceed (and new allocations will reflect this new geometry) because
we've never screened this kind of thing before.

[1] https://lore.kernel.org/linux-xfs/20191125130744.GA44777@bfoster/T/#m00f9594b511e076e2fcdd489d78bc30216d72a7d

Reported-by: Alex Lyakas <alex@zadara.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 libxfs/xfs_ialloc.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++++++
 libxfs/xfs_ialloc.h |    1 +
 2 files changed, 82 insertions(+)


diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index f039b287..1d56d764 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -2848,3 +2848,84 @@ xfs_ialloc_setup_geometry(
 	else
 		igeo->ialloc_align = 0;
 }
+
+/*
+ * Compute the location of the root directory inode that is laid out by mkfs.
+ * The @sunit parameter will be copied from the superblock if it is negative.
+ */
+xfs_ino_t
+xfs_ialloc_calc_rootino(
+	struct xfs_mount	*mp,
+	int			sunit)
+{
+	struct xfs_ino_geometry	*igeo = M_IGEO(mp);
+	xfs_agino_t		first_agino;
+	xfs_agblock_t		first_bno;
+
+	if (sunit < 0)
+		sunit = mp->m_sb.sb_unit;
+
+	/*
+	 * Pre-calculate the geometry of ag 0. We know what it looks like
+	 * because we know what mkfs does: 2 allocation btree roots (by block
+	 * and by size), the inode allocation btree root, the free inode
+	 * allocation btree root (if enabled) and some number of blocks to
+	 * prefill the agfl.
+	 *
+	 * Because the current shape of the btrees may differ from the current
+	 * shape, we open code the mkfs freelist block count here. mkfs creates
+	 * single level trees, so the calculation is pretty straight forward for
+	 * the trees that use the AGFL.
+	 */
+
+	/* free space by block btree root comes after the ag headers */
+	first_bno = howmany(4 * mp->m_sb.sb_sectsize, mp->m_sb.sb_blocksize);
+
+	/* free space by length btree root */
+	first_bno += 1;
+
+	/* inode btree root */
+	first_bno += 1;
+
+	/* agfl */
+	first_bno += (2 * min_t(xfs_agblock_t, 2, mp->m_ag_maxlevels)) + 1;
+
+	if (xfs_sb_version_hasfinobt(&mp->m_sb))
+		first_bno++;
+
+	if (xfs_sb_version_hasrmapbt(&mp->m_sb)) {
+		first_bno++;
+		/* agfl blocks */
+		first_bno += min_t(xfs_agblock_t, 2, mp->m_rmap_maxlevels);
+	}
+
+	if (xfs_sb_version_hasreflink(&mp->m_sb))
+		first_bno++;
+
+	/*
+	 * If the log is allocated in the first allocation group we need to
+	 * add the number of blocks used by the log to the above calculation.
+	 *
+	 * This can happens with filesystems that only have a single
+	 * allocation group, or very odd geometries created by old mkfs
+	 * versions on very small filesystems.
+	 */
+	if (mp->m_sb.sb_logstart &&
+	    XFS_FSB_TO_AGNO(mp, mp->m_sb.sb_logstart) == 0)
+		 first_bno += mp->m_sb.sb_logblocks;
+
+	/*
+	 * ditto the location of the first inode chunks in the fs ('/')
+	 */
+	if (xfs_sb_version_hasdalign(&mp->m_sb) && igeo->ialloc_align > 0) {
+		first_agino = XFS_AGB_TO_AGINO(mp, roundup(first_bno, sunit));
+	} else if (xfs_sb_version_hasalign(&mp->m_sb) &&
+		   mp->m_sb.sb_inoalignmt > 1)  {
+		first_agino = XFS_AGB_TO_AGINO(mp,
+				roundup(first_bno, mp->m_sb.sb_inoalignmt));
+	} else  {
+		first_agino = XFS_AGB_TO_AGINO(mp, first_bno);
+	}
+
+	return XFS_AGINO_TO_INO(mp, 0, first_agino);
+}
diff --git a/libxfs/xfs_ialloc.h b/libxfs/xfs_ialloc.h
index 323592d5..72b3468b 100644
--- a/libxfs/xfs_ialloc.h
+++ b/libxfs/xfs_ialloc.h
@@ -152,5 +152,6 @@ int xfs_inobt_insert_rec(struct xfs_btree_cur *cur, uint16_t holemask,
 
 int xfs_ialloc_cluster_alignment(struct xfs_mount *mp);
 void xfs_ialloc_setup_geometry(struct xfs_mount *mp);
+xfs_ino_t xfs_ialloc_calc_rootino(struct xfs_mount *mp, int sunit);
 
 #endif	/* __XFS_IALLOC_H__ */


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/6] mkfs: check root inode location
  2019-12-04 17:04 [PATCH v2 0/6] xfs_repair: do not trash valid root dirs Darrick J. Wong
  2019-12-04 17:04 ` [PATCH 1/6] xfs: don't commit sunit/swidth updates to disk if that would cause repair failures Darrick J. Wong
@ 2019-12-04 17:04 ` Darrick J. Wong
  2019-12-05 14:36   ` Brian Foster
  2019-12-04 17:04 ` [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers Darrick J. Wong
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-04 17:04 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, alex

From: Darrick J. Wong <darrick.wong@oracle.com>

Make sure the root inode gets created where repair thinks it should be
created.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 libxfs/libxfs_api_defs.h |    1 +
 mkfs/xfs_mkfs.c          |   39 +++++++++++++++++++++++++++++++++------
 2 files changed, 34 insertions(+), 6 deletions(-)


diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index 645c9b1b..f8e7d82d 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -156,5 +156,6 @@
 
 #define xfs_ag_init_headers		libxfs_ag_init_headers
 #define xfs_buf_delwri_submit		libxfs_buf_delwri_submit
+#define xfs_ialloc_calc_rootino		libxfs_ialloc_calc_rootino
 
 #endif /* __LIBXFS_API_DEFS_H__ */
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 18338a61..71be033d 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -3521,6 +3521,38 @@ rewrite_secondary_superblocks(
 	libxfs_writebuf(buf, LIBXFS_EXIT_ON_FAILURE);
 }
 
+static void
+check_root_ino(
+	struct xfs_mount	*mp)
+{
+	xfs_ino_t		ino;
+
+	if (XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino) != 0) {
+		fprintf(stderr,
+			_("%s: root inode created in AG %u, not AG 0\n"),
+			progname, XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino));
+		exit(1);
+	}
+
+	/*
+	 * The superblock points to the root directory inode, but xfs_repair
+	 * expects to find the root inode in a very specific location computed
+	 * from the filesystem geometry for an extra level of verification.
+	 *
+	 * Fail the format immediately if those assumptions ever break, because
+	 * repair will toss the root directory.
+	 */
+	ino = libxfs_ialloc_calc_rootino(mp, -1);
+	if (mp->m_sb.sb_rootino != ino) {
+		fprintf(stderr,
+	_("%s: root inode (%llu) not allocated in expected location (%llu)\n"),
+			progname,
+			(unsigned long long)mp->m_sb.sb_rootino,
+			(unsigned long long)ino);
+		exit(1);
+	}
+}
+
 int
 main(
 	int			argc,
@@ -3807,12 +3839,7 @@ main(
 	/*
 	 * Protect ourselves against possible stupidity
 	 */
-	if (XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino) != 0) {
-		fprintf(stderr,
-			_("%s: root inode created in AG %u, not AG 0\n"),
-			progname, XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino));
-		exit(1);
-	}
+	check_root_ino(mp);
 
 	/*
 	 * Re-write multiple secondary superblocks with rootinode field set


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers
  2019-12-04 17:04 [PATCH v2 0/6] xfs_repair: do not trash valid root dirs Darrick J. Wong
  2019-12-04 17:04 ` [PATCH 1/6] xfs: don't commit sunit/swidth updates to disk if that would cause repair failures Darrick J. Wong
  2019-12-04 17:04 ` [PATCH 2/6] mkfs: check root inode location Darrick J. Wong
@ 2019-12-04 17:04 ` Darrick J. Wong
  2019-12-05 14:37   ` Brian Foster
  2019-12-12 20:38   ` Eric Sandeen
  2019-12-04 17:04 ` [PATCH 4/6] xfs_repair: refactor fixed inode location checks Darrick J. Wong
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-04 17:04 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, alex

From: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair has a very old check that evidently excuses the AG 0 inode
btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g.
AG headers).  mkfs never formats filesystems that way and it looks like
an error, so purge the check.  After this, we always complain if inodes
overlap with AG headers because that should never happen.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/globals.c    |    1 -
 repair/globals.h    |    1 -
 repair/scan.c       |   19 -------------------
 repair/xfs_repair.c |    7 -------
 4 files changed, 28 deletions(-)


diff --git a/repair/globals.c b/repair/globals.c
index dcd79ea4..8a60e706 100644
--- a/repair/globals.c
+++ b/repair/globals.c
@@ -73,7 +73,6 @@ int	lost_gquotino;
 int	lost_pquotino;
 
 xfs_agino_t	first_prealloc_ino;
-xfs_agino_t	last_prealloc_ino;
 xfs_agblock_t	bnobt_root;
 xfs_agblock_t	bcntbt_root;
 xfs_agblock_t	inobt_root;
diff --git a/repair/globals.h b/repair/globals.h
index 008bdd90..2ed5c894 100644
--- a/repair/globals.h
+++ b/repair/globals.h
@@ -114,7 +114,6 @@ extern int		lost_gquotino;
 extern int		lost_pquotino;
 
 extern xfs_agino_t	first_prealloc_ino;
-extern xfs_agino_t	last_prealloc_ino;
 extern xfs_agblock_t	bnobt_root;
 extern xfs_agblock_t	bcntbt_root;
 extern xfs_agblock_t	inobt_root;
diff --git a/repair/scan.c b/repair/scan.c
index c383f3aa..05707dd2 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -1645,13 +1645,6 @@ scan_single_ino_chunk(
 				break;
 			case XR_E_INUSE_FS:
 			case XR_E_INUSE_FS1:
-				if (agno == 0 &&
-				    ino + j >= first_prealloc_ino &&
-				    ino + j < last_prealloc_ino) {
-					set_bmap(agno, agbno, XR_E_INO);
-					break;
-				}
-				/* fall through */
 			default:
 				/* XXX - maybe should mark block a duplicate */
 				do_warn(
@@ -1782,18 +1775,6 @@ _("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\
 				break;
 			case XR_E_INUSE_FS:
 			case XR_E_INUSE_FS1:
-				if (agno == 0 &&
-				    ino + j >= first_prealloc_ino &&
-				    ino + j < last_prealloc_ino) {
-					do_warn(
-_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"),
-						agno, agbno, mp->m_sb.sb_inopblock);
-
-					set_bmap(agno, agbno, XR_E_INO);
-					suspect++;
-					break;
-				}
-				/* fall through */
 			default:
 				do_warn(
 _("inode chunk claims used block, finobt block - agno %d, bno %d, inopb %d\n"),
diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index 9295673d..3e9059f3 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -460,13 +460,6 @@ calc_mkfs(xfs_mount_t *mp)
 		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno);
 	}
 
-	ASSERT(M_IGEO(mp)->ialloc_blks > 0);
-
-	if (M_IGEO(mp)->ialloc_blks > 1)
-		last_prealloc_ino = first_prealloc_ino + XFS_INODES_PER_CHUNK;
-	else
-		last_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno + 1);
-
 	/*
 	 * now the first 3 inodes in the system
 	 */


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/6] xfs_repair: refactor fixed inode location checks
  2019-12-04 17:04 [PATCH v2 0/6] xfs_repair: do not trash valid root dirs Darrick J. Wong
                   ` (2 preceding siblings ...)
  2019-12-04 17:04 ` [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers Darrick J. Wong
@ 2019-12-04 17:04 ` Darrick J. Wong
  2019-12-05 14:37   ` Brian Foster
  2019-12-04 17:04 ` [PATCH 5/6] xfs_repair: use libxfs function to calculate root inode location Darrick J. Wong
  2019-12-04 17:05 ` [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it Darrick J. Wong
  5 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-04 17:04 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, alex

From: Darrick J. Wong <darrick.wong@oracle.com>

Refactor the checking and resetting of fixed-location inodes (root,
rbmino, rsumino) into a helper function.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/xfs_repair.c |  106 ++++++++++++++++++---------------------------------
 1 file changed, 37 insertions(+), 69 deletions(-)


diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index 3e9059f3..94673750 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -395,6 +395,37 @@ do_log(char const *msg, ...)
 	va_end(args);
 }
 
+/* Make sure a fixed-location inode is where it should be. */
+static void
+ensure_fixed_ino(
+	xfs_ino_t	*ino,
+	xfs_ino_t	expected_ino,
+	const char	*tag)
+{
+	if (*ino == expected_ino)
+		return;
+
+	do_warn(
+_("sb %s inode value %" PRIu64 " %sinconsistent with calculated value %"PRIu64"\n"),
+		tag, *ino, *ino == NULLFSINO ? "(NULLFSINO) " : "",
+		expected_ino);
+
+	if (!no_modify)
+		do_warn(
+_("resetting superblock %s inode pointer to %"PRIu64"\n"),
+			tag, expected_ino);
+	else
+		do_warn(
+_("would reset superblock %s inode pointer to %"PRIu64"\n"),
+			tag, expected_ino);
+
+	/*
+	 * Just set the value -- safe since the superblock doesn't get flushed
+	 * out if no_modify is set.
+	 */
+	*ino = expected_ino;
+}
+
 static void
 calc_mkfs(xfs_mount_t *mp)
 {
@@ -463,75 +494,12 @@ calc_mkfs(xfs_mount_t *mp)
 	/*
 	 * now the first 3 inodes in the system
 	 */
-	if (mp->m_sb.sb_rootino != first_prealloc_ino)  {
-		do_warn(
-_("sb root inode value %" PRIu64 " %sinconsistent with calculated value %u\n"),
-			mp->m_sb.sb_rootino,
-			(mp->m_sb.sb_rootino == NULLFSINO ? "(NULLFSINO) ":""),
-			first_prealloc_ino);
-
-		if (!no_modify)
-			do_warn(
-		_("resetting superblock root inode pointer to %u\n"),
-				first_prealloc_ino);
-		else
-			do_warn(
-		_("would reset superblock root inode pointer to %u\n"),
-				first_prealloc_ino);
-
-		/*
-		 * just set the value -- safe since the superblock
-		 * doesn't get flushed out if no_modify is set
-		 */
-		mp->m_sb.sb_rootino = first_prealloc_ino;
-	}
-
-	if (mp->m_sb.sb_rbmino != first_prealloc_ino + 1)  {
-		do_warn(
-_("sb realtime bitmap inode %" PRIu64 " %sinconsistent with calculated value %u\n"),
-			mp->m_sb.sb_rbmino,
-			(mp->m_sb.sb_rbmino == NULLFSINO ? "(NULLFSINO) ":""),
-			first_prealloc_ino + 1);
-
-		if (!no_modify)
-			do_warn(
-		_("resetting superblock realtime bitmap ino pointer to %u\n"),
-				first_prealloc_ino + 1);
-		else
-			do_warn(
-		_("would reset superblock realtime bitmap ino pointer to %u\n"),
-				first_prealloc_ino + 1);
-
-		/*
-		 * just set the value -- safe since the superblock
-		 * doesn't get flushed out if no_modify is set
-		 */
-		mp->m_sb.sb_rbmino = first_prealloc_ino + 1;
-	}
-
-	if (mp->m_sb.sb_rsumino != first_prealloc_ino + 2)  {
-		do_warn(
-_("sb realtime summary inode %" PRIu64 " %sinconsistent with calculated value %u\n"),
-			mp->m_sb.sb_rsumino,
-			(mp->m_sb.sb_rsumino == NULLFSINO ? "(NULLFSINO) ":""),
-			first_prealloc_ino + 2);
-
-		if (!no_modify)
-			do_warn(
-		_("resetting superblock realtime summary ino pointer to %u\n"),
-				first_prealloc_ino + 2);
-		else
-			do_warn(
-		_("would reset superblock realtime summary ino pointer to %u\n"),
-				first_prealloc_ino + 2);
-
-		/*
-		 * just set the value -- safe since the superblock
-		 * doesn't get flushed out if no_modify is set
-		 */
-		mp->m_sb.sb_rsumino = first_prealloc_ino + 2;
-	}
-
+	ensure_fixed_ino(&mp->m_sb.sb_rootino, first_prealloc_ino,
+			_("root"));
+	ensure_fixed_ino(&mp->m_sb.sb_rbmino, first_prealloc_ino + 1,
+			_("realtime bitmap"));
+	ensure_fixed_ino(&mp->m_sb.sb_rsumino, first_prealloc_ino + 2,
+			_("realtime summary"));
 }
 
 /*


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 5/6] xfs_repair: use libxfs function to calculate root inode location
  2019-12-04 17:04 [PATCH v2 0/6] xfs_repair: do not trash valid root dirs Darrick J. Wong
                   ` (3 preceding siblings ...)
  2019-12-04 17:04 ` [PATCH 4/6] xfs_repair: refactor fixed inode location checks Darrick J. Wong
@ 2019-12-04 17:04 ` Darrick J. Wong
  2019-12-05 14:37   ` Brian Foster
  2019-12-04 17:05 ` [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it Darrick J. Wong
  5 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-04 17:04 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, alex

From: Darrick J. Wong <darrick.wong@oracle.com>

Use libxfs_ialloc_calc_rootino to compute the location of the root
inode, and improve the function comments while we're at it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/globals.c    |    5 ---
 repair/globals.h    |    5 ---
 repair/xfs_repair.c |   78 +++++++--------------------------------------------
 3 files changed, 10 insertions(+), 78 deletions(-)


diff --git a/repair/globals.c b/repair/globals.c
index 8a60e706..299bacd1 100644
--- a/repair/globals.c
+++ b/repair/globals.c
@@ -72,11 +72,6 @@ int	lost_uquotino;
 int	lost_gquotino;
 int	lost_pquotino;
 
-xfs_agino_t	first_prealloc_ino;
-xfs_agblock_t	bnobt_root;
-xfs_agblock_t	bcntbt_root;
-xfs_agblock_t	inobt_root;
-
 /* configuration vars -- fs geometry dependent */
 
 int		inodes_per_block;
diff --git a/repair/globals.h b/repair/globals.h
index 2ed5c894..953e3dfb 100644
--- a/repair/globals.h
+++ b/repair/globals.h
@@ -113,11 +113,6 @@ extern int		lost_uquotino;
 extern int		lost_gquotino;
 extern int		lost_pquotino;
 
-extern xfs_agino_t	first_prealloc_ino;
-extern xfs_agblock_t	bnobt_root;
-extern xfs_agblock_t	bcntbt_root;
-extern xfs_agblock_t	inobt_root;
-
 /* configuration vars -- fs geometry dependent */
 
 extern int		inodes_per_block;
diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index 94673750..abd568c9 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -426,79 +426,21 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"),
 	*ino = expected_ino;
 }
 
+/*
+ * Make sure that the first 3 inodes in the filesystem are the root directory,
+ * the realtime bitmap, and the realtime summary, in that order.
+ */
 static void
-calc_mkfs(xfs_mount_t *mp)
+calc_mkfs(
+	struct xfs_mount	*mp)
 {
-	xfs_agblock_t	fino_bno;
-	int		do_inoalign;
-
-	do_inoalign = M_IGEO(mp)->ialloc_align;
-
-	/*
-	 * Pre-calculate the geometry of ag 0. We know what it looks like
-	 * because we know what mkfs does: 2 allocation btree roots (by block
-	 * and by size), the inode allocation btree root, the free inode
-	 * allocation btree root (if enabled) and some number of blocks to
-	 * prefill the agfl.
-	 *
-	 * Because the current shape of the btrees may differ from the current
-	 * shape, we open code the mkfs freelist block count here. mkfs creates
-	 * single level trees, so the calculation is pertty straight forward for
-	 * the trees that use the AGFL.
-	 */
-	bnobt_root = howmany(4 * mp->m_sb.sb_sectsize, mp->m_sb.sb_blocksize);
-	bcntbt_root = bnobt_root + 1;
-	inobt_root = bnobt_root + 2;
-	fino_bno = inobt_root + (2 * min(2, mp->m_ag_maxlevels)) + 1;
-	if (xfs_sb_version_hasfinobt(&mp->m_sb))
-		fino_bno++;
-	if (xfs_sb_version_hasrmapbt(&mp->m_sb)) {
-		fino_bno += min(2, mp->m_rmap_maxlevels); /* agfl blocks */
-		fino_bno++;
-	}
-	if (xfs_sb_version_hasreflink(&mp->m_sb))
-		fino_bno++;
-
-	/*
-	 * If the log is allocated in the first allocation group we need to
-	 * add the number of blocks used by the log to the above calculation.
-	 *
-	 * This can happens with filesystems that only have a single
-	 * allocation group, or very odd geometries created by old mkfs
-	 * versions on very small filesystems.
-	 */
-	if (mp->m_sb.sb_logstart &&
-	    XFS_FSB_TO_AGNO(mp, mp->m_sb.sb_logstart) == 0) {
-
-		/*
-		 * XXX(hch): verify that sb_logstart makes sense?
-		 */
-		 fino_bno += mp->m_sb.sb_logblocks;
-	}
-
-	/*
-	 * ditto the location of the first inode chunks in the fs ('/')
-	 */
-	if (xfs_sb_version_hasdalign(&mp->m_sb) && do_inoalign)  {
-		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, roundup(fino_bno,
-					mp->m_sb.sb_unit));
-	} else if (xfs_sb_version_hasalign(&mp->m_sb) &&
-					mp->m_sb.sb_inoalignmt > 1)  {
-		first_prealloc_ino = XFS_AGB_TO_AGINO(mp,
-					roundup(fino_bno,
-						mp->m_sb.sb_inoalignmt));
-	} else  {
-		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno);
-	}
+	xfs_ino_t		rootino = libxfs_ialloc_calc_rootino(mp, -1);
 
-	/*
-	 * now the first 3 inodes in the system
-	 */
-	ensure_fixed_ino(&mp->m_sb.sb_rootino, first_prealloc_ino,
+	ensure_fixed_ino(&mp->m_sb.sb_rootino, rootino,
 			_("root"));
-	ensure_fixed_ino(&mp->m_sb.sb_rbmino, first_prealloc_ino + 1,
+	ensure_fixed_ino(&mp->m_sb.sb_rbmino, rootino + 1,
 			_("realtime bitmap"));
-	ensure_fixed_ino(&mp->m_sb.sb_rsumino, first_prealloc_ino + 2,
+	ensure_fixed_ino(&mp->m_sb.sb_rsumino, rootino + 2,
 			_("realtime summary"));
 }
 


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it
  2019-12-04 17:04 [PATCH v2 0/6] xfs_repair: do not trash valid root dirs Darrick J. Wong
                   ` (4 preceding siblings ...)
  2019-12-04 17:04 ` [PATCH 5/6] xfs_repair: use libxfs function to calculate root inode location Darrick J. Wong
@ 2019-12-04 17:05 ` Darrick J. Wong
  2019-12-05 14:38   ` Brian Foster
  5 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-04 17:05 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs, alex

From: Darrick J. Wong <darrick.wong@oracle.com>

If sb_rootino doesn't point to where we think mkfs should have allocated
the root directory, check to see if the alleged root directory actually
looks like a root directory.  If so, we'll let it live because someone
could have changed sunit since formatting time, and that changes the
root directory inode estimate.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/xfs_repair.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)


diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index abd568c9..b0407f4b 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -426,6 +426,37 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"),
 	*ino = expected_ino;
 }
 
+/* Does the root directory inode look like a plausible root directory? */
+static bool
+has_plausible_rootdir(
+	struct xfs_mount	*mp)
+{
+	struct xfs_inode	*ip;
+	xfs_ino_t		ino;
+	int			error;
+	bool			ret = false;
+
+	error = -libxfs_iget(mp, NULL, mp->m_sb.sb_rootino, 0, &ip,
+			&xfs_default_ifork_ops);
+	if (error)
+		goto out;
+	if (!S_ISDIR(VFS_I(ip)->i_mode))
+		goto out_rele;
+
+	error = -libxfs_dir_lookup(NULL, ip, &xfs_name_dotdot, &ino, NULL);
+	if (error)
+		goto out_rele;
+
+	/* The root directory '..' entry points to the directory. */
+	if (ino == mp->m_sb.sb_rootino)
+		ret = true;
+
+out_rele:
+	libxfs_irele(ip);
+out:
+	return ret;
+}
+
 /*
  * Make sure that the first 3 inodes in the filesystem are the root directory,
  * the realtime bitmap, and the realtime summary, in that order.
@@ -436,6 +467,20 @@ calc_mkfs(
 {
 	xfs_ino_t		rootino = libxfs_ialloc_calc_rootino(mp, -1);
 
+	/*
+	 * If the root inode isn't where we think it is, check its plausibility
+	 * as a root directory.  It's possible that somebody changed sunit
+	 * since the filesystem was created, which can change the value of the
+	 * above computation.  Don't blow up the root directory if this is the
+	 * case.
+	 */
+	if (mp->m_sb.sb_rootino != rootino && has_plausible_rootdir(mp)) {
+		do_warn(
+_("sb root inode value %" PRIu64 " inconsistent with alignment (expected %"PRIu64")\n"),
+			mp->m_sb.sb_rootino, rootino);
+		rootino = mp->m_sb.sb_rootino;
+	}
+
 	ensure_fixed_ino(&mp->m_sb.sb_rootino, rootino,
 			_("root"));
 	ensure_fixed_ino(&mp->m_sb.sb_rbmino, rootino + 1,


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/6] mkfs: check root inode location
  2019-12-04 17:04 ` [PATCH 2/6] mkfs: check root inode location Darrick J. Wong
@ 2019-12-05 14:36   ` Brian Foster
  0 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2019-12-05 14:36 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs, alex

On Wed, Dec 04, 2019 at 09:04:35AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Make sure the root inode gets created where repair thinks it should be
> created.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  libxfs/libxfs_api_defs.h |    1 +
>  mkfs/xfs_mkfs.c          |   39 +++++++++++++++++++++++++++++++++------
>  2 files changed, 34 insertions(+), 6 deletions(-)
> 
> 
> diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
> index 645c9b1b..f8e7d82d 100644
> --- a/libxfs/libxfs_api_defs.h
> +++ b/libxfs/libxfs_api_defs.h
> @@ -156,5 +156,6 @@
>  
>  #define xfs_ag_init_headers		libxfs_ag_init_headers
>  #define xfs_buf_delwri_submit		libxfs_buf_delwri_submit
> +#define xfs_ialloc_calc_rootino		libxfs_ialloc_calc_rootino
>  
>  #endif /* __LIBXFS_API_DEFS_H__ */
> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> index 18338a61..71be033d 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -3521,6 +3521,38 @@ rewrite_secondary_superblocks(
>  	libxfs_writebuf(buf, LIBXFS_EXIT_ON_FAILURE);
>  }
>  
> +static void
> +check_root_ino(
> +	struct xfs_mount	*mp)
> +{
> +	xfs_ino_t		ino;
> +
> +	if (XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino) != 0) {
> +		fprintf(stderr,
> +			_("%s: root inode created in AG %u, not AG 0\n"),
> +			progname, XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino));
> +		exit(1);
> +	}
> +
> +	/*
> +	 * The superblock points to the root directory inode, but xfs_repair
> +	 * expects to find the root inode in a very specific location computed
> +	 * from the filesystem geometry for an extra level of verification.
> +	 *
> +	 * Fail the format immediately if those assumptions ever break, because
> +	 * repair will toss the root directory.
> +	 */
> +	ino = libxfs_ialloc_calc_rootino(mp, -1);
> +	if (mp->m_sb.sb_rootino != ino) {
> +		fprintf(stderr,
> +	_("%s: root inode (%llu) not allocated in expected location (%llu)\n"),
> +			progname,
> +			(unsigned long long)mp->m_sb.sb_rootino,
> +			(unsigned long long)ino);
> +		exit(1);
> +	}
> +}
> +
>  int
>  main(
>  	int			argc,
> @@ -3807,12 +3839,7 @@ main(
>  	/*
>  	 * Protect ourselves against possible stupidity
>  	 */
> -	if (XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino) != 0) {
> -		fprintf(stderr,
> -			_("%s: root inode created in AG %u, not AG 0\n"),
> -			progname, XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino));
> -		exit(1);
> -	}
> +	check_root_ino(mp);
>  
>  	/*
>  	 * Re-write multiple secondary superblocks with rootinode field set
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers
  2019-12-04 17:04 ` [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers Darrick J. Wong
@ 2019-12-05 14:37   ` Brian Foster
  2019-12-05 16:28     ` Darrick J. Wong
  2019-12-12 20:38   ` Eric Sandeen
  1 sibling, 1 reply; 21+ messages in thread
From: Brian Foster @ 2019-12-05 14:37 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs, alex

On Wed, Dec 04, 2019 at 09:04:43AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> xfs_repair has a very old check that evidently excuses the AG 0 inode
> btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g.
> AG headers).  mkfs never formats filesystems that way and it looks like
> an error, so purge the check.  After this, we always complain if inodes
> overlap with AG headers because that should never happen.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Strange.. This seems reasonable to me, but any idea on how this might
have been used in the past? The only thing I can see so far is that
perhaps if the superblock (blocksize/sectorsize) is corrupted, the
in-core state trees could be badly initialized such that the inode falls
into the "in use" state. Of course if that were the case the fs probably
has bigger problems..

Brian

>  repair/globals.c    |    1 -
>  repair/globals.h    |    1 -
>  repair/scan.c       |   19 -------------------
>  repair/xfs_repair.c |    7 -------
>  4 files changed, 28 deletions(-)
> 
> 
> diff --git a/repair/globals.c b/repair/globals.c
> index dcd79ea4..8a60e706 100644
> --- a/repair/globals.c
> +++ b/repair/globals.c
> @@ -73,7 +73,6 @@ int	lost_gquotino;
>  int	lost_pquotino;
>  
>  xfs_agino_t	first_prealloc_ino;
> -xfs_agino_t	last_prealloc_ino;
>  xfs_agblock_t	bnobt_root;
>  xfs_agblock_t	bcntbt_root;
>  xfs_agblock_t	inobt_root;
> diff --git a/repair/globals.h b/repair/globals.h
> index 008bdd90..2ed5c894 100644
> --- a/repair/globals.h
> +++ b/repair/globals.h
> @@ -114,7 +114,6 @@ extern int		lost_gquotino;
>  extern int		lost_pquotino;
>  
>  extern xfs_agino_t	first_prealloc_ino;
> -extern xfs_agino_t	last_prealloc_ino;
>  extern xfs_agblock_t	bnobt_root;
>  extern xfs_agblock_t	bcntbt_root;
>  extern xfs_agblock_t	inobt_root;
> diff --git a/repair/scan.c b/repair/scan.c
> index c383f3aa..05707dd2 100644
> --- a/repair/scan.c
> +++ b/repair/scan.c
> @@ -1645,13 +1645,6 @@ scan_single_ino_chunk(
>  				break;
>  			case XR_E_INUSE_FS:
>  			case XR_E_INUSE_FS1:
> -				if (agno == 0 &&
> -				    ino + j >= first_prealloc_ino &&
> -				    ino + j < last_prealloc_ino) {
> -					set_bmap(agno, agbno, XR_E_INO);
> -					break;
> -				}
> -				/* fall through */
>  			default:
>  				/* XXX - maybe should mark block a duplicate */
>  				do_warn(
> @@ -1782,18 +1775,6 @@ _("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\
>  				break;
>  			case XR_E_INUSE_FS:
>  			case XR_E_INUSE_FS1:
> -				if (agno == 0 &&
> -				    ino + j >= first_prealloc_ino &&
> -				    ino + j < last_prealloc_ino) {
> -					do_warn(
> -_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"),
> -						agno, agbno, mp->m_sb.sb_inopblock);
> -
> -					set_bmap(agno, agbno, XR_E_INO);
> -					suspect++;
> -					break;
> -				}
> -				/* fall through */
>  			default:
>  				do_warn(
>  _("inode chunk claims used block, finobt block - agno %d, bno %d, inopb %d\n"),
> diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> index 9295673d..3e9059f3 100644
> --- a/repair/xfs_repair.c
> +++ b/repair/xfs_repair.c
> @@ -460,13 +460,6 @@ calc_mkfs(xfs_mount_t *mp)
>  		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno);
>  	}
>  
> -	ASSERT(M_IGEO(mp)->ialloc_blks > 0);
> -
> -	if (M_IGEO(mp)->ialloc_blks > 1)
> -		last_prealloc_ino = first_prealloc_ino + XFS_INODES_PER_CHUNK;
> -	else
> -		last_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno + 1);
> -
>  	/*
>  	 * now the first 3 inodes in the system
>  	 */
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/6] xfs_repair: refactor fixed inode location checks
  2019-12-04 17:04 ` [PATCH 4/6] xfs_repair: refactor fixed inode location checks Darrick J. Wong
@ 2019-12-05 14:37   ` Brian Foster
  0 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2019-12-05 14:37 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs, alex

On Wed, Dec 04, 2019 at 09:04:50AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Refactor the checking and resetting of fixed-location inodes (root,
> rbmino, rsumino) into a helper function.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  repair/xfs_repair.c |  106 ++++++++++++++++++---------------------------------
>  1 file changed, 37 insertions(+), 69 deletions(-)
> 
> 
> diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> index 3e9059f3..94673750 100644
> --- a/repair/xfs_repair.c
> +++ b/repair/xfs_repair.c
> @@ -395,6 +395,37 @@ do_log(char const *msg, ...)
>  	va_end(args);
>  }
>  
> +/* Make sure a fixed-location inode is where it should be. */
> +static void
> +ensure_fixed_ino(
> +	xfs_ino_t	*ino,
> +	xfs_ino_t	expected_ino,
> +	const char	*tag)
> +{
> +	if (*ino == expected_ino)
> +		return;
> +
> +	do_warn(
> +_("sb %s inode value %" PRIu64 " %sinconsistent with calculated value %"PRIu64"\n"),
> +		tag, *ino, *ino == NULLFSINO ? "(NULLFSINO) " : "",
> +		expected_ino);
> +
> +	if (!no_modify)
> +		do_warn(
> +_("resetting superblock %s inode pointer to %"PRIu64"\n"),
> +			tag, expected_ino);
> +	else
> +		do_warn(
> +_("would reset superblock %s inode pointer to %"PRIu64"\n"),
> +			tag, expected_ino);
> +
> +	/*
> +	 * Just set the value -- safe since the superblock doesn't get flushed
> +	 * out if no_modify is set.
> +	 */
> +	*ino = expected_ino;
> +}
> +
>  static void
>  calc_mkfs(xfs_mount_t *mp)
>  {
> @@ -463,75 +494,12 @@ calc_mkfs(xfs_mount_t *mp)
>  	/*
>  	 * now the first 3 inodes in the system
>  	 */
> -	if (mp->m_sb.sb_rootino != first_prealloc_ino)  {
> -		do_warn(
> -_("sb root inode value %" PRIu64 " %sinconsistent with calculated value %u\n"),
> -			mp->m_sb.sb_rootino,
> -			(mp->m_sb.sb_rootino == NULLFSINO ? "(NULLFSINO) ":""),
> -			first_prealloc_ino);
> -
> -		if (!no_modify)
> -			do_warn(
> -		_("resetting superblock root inode pointer to %u\n"),
> -				first_prealloc_ino);
> -		else
> -			do_warn(
> -		_("would reset superblock root inode pointer to %u\n"),
> -				first_prealloc_ino);
> -
> -		/*
> -		 * just set the value -- safe since the superblock
> -		 * doesn't get flushed out if no_modify is set
> -		 */
> -		mp->m_sb.sb_rootino = first_prealloc_ino;
> -	}
> -
> -	if (mp->m_sb.sb_rbmino != first_prealloc_ino + 1)  {
> -		do_warn(
> -_("sb realtime bitmap inode %" PRIu64 " %sinconsistent with calculated value %u\n"),
> -			mp->m_sb.sb_rbmino,
> -			(mp->m_sb.sb_rbmino == NULLFSINO ? "(NULLFSINO) ":""),
> -			first_prealloc_ino + 1);
> -
> -		if (!no_modify)
> -			do_warn(
> -		_("resetting superblock realtime bitmap ino pointer to %u\n"),
> -				first_prealloc_ino + 1);
> -		else
> -			do_warn(
> -		_("would reset superblock realtime bitmap ino pointer to %u\n"),
> -				first_prealloc_ino + 1);
> -
> -		/*
> -		 * just set the value -- safe since the superblock
> -		 * doesn't get flushed out if no_modify is set
> -		 */
> -		mp->m_sb.sb_rbmino = first_prealloc_ino + 1;
> -	}
> -
> -	if (mp->m_sb.sb_rsumino != first_prealloc_ino + 2)  {
> -		do_warn(
> -_("sb realtime summary inode %" PRIu64 " %sinconsistent with calculated value %u\n"),
> -			mp->m_sb.sb_rsumino,
> -			(mp->m_sb.sb_rsumino == NULLFSINO ? "(NULLFSINO) ":""),
> -			first_prealloc_ino + 2);
> -
> -		if (!no_modify)
> -			do_warn(
> -		_("resetting superblock realtime summary ino pointer to %u\n"),
> -				first_prealloc_ino + 2);
> -		else
> -			do_warn(
> -		_("would reset superblock realtime summary ino pointer to %u\n"),
> -				first_prealloc_ino + 2);
> -
> -		/*
> -		 * just set the value -- safe since the superblock
> -		 * doesn't get flushed out if no_modify is set
> -		 */
> -		mp->m_sb.sb_rsumino = first_prealloc_ino + 2;
> -	}
> -
> +	ensure_fixed_ino(&mp->m_sb.sb_rootino, first_prealloc_ino,
> +			_("root"));
> +	ensure_fixed_ino(&mp->m_sb.sb_rbmino, first_prealloc_ino + 1,
> +			_("realtime bitmap"));
> +	ensure_fixed_ino(&mp->m_sb.sb_rsumino, first_prealloc_ino + 2,
> +			_("realtime summary"));
>  }
>  
>  /*
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/6] xfs_repair: use libxfs function to calculate root inode location
  2019-12-04 17:04 ` [PATCH 5/6] xfs_repair: use libxfs function to calculate root inode location Darrick J. Wong
@ 2019-12-05 14:37   ` Brian Foster
  0 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2019-12-05 14:37 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs, alex

On Wed, Dec 04, 2019 at 09:04:56AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Use libxfs_ialloc_calc_rootino to compute the location of the root
> inode, and improve the function comments while we're at it.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  repair/globals.c    |    5 ---
>  repair/globals.h    |    5 ---
>  repair/xfs_repair.c |   78 +++++++--------------------------------------------
>  3 files changed, 10 insertions(+), 78 deletions(-)
> 
> 
> diff --git a/repair/globals.c b/repair/globals.c
> index 8a60e706..299bacd1 100644
> --- a/repair/globals.c
> +++ b/repair/globals.c
> @@ -72,11 +72,6 @@ int	lost_uquotino;
>  int	lost_gquotino;
>  int	lost_pquotino;
>  
> -xfs_agino_t	first_prealloc_ino;
> -xfs_agblock_t	bnobt_root;
> -xfs_agblock_t	bcntbt_root;
> -xfs_agblock_t	inobt_root;
> -
>  /* configuration vars -- fs geometry dependent */
>  
>  int		inodes_per_block;
> diff --git a/repair/globals.h b/repair/globals.h
> index 2ed5c894..953e3dfb 100644
> --- a/repair/globals.h
> +++ b/repair/globals.h
> @@ -113,11 +113,6 @@ extern int		lost_uquotino;
>  extern int		lost_gquotino;
>  extern int		lost_pquotino;
>  
> -extern xfs_agino_t	first_prealloc_ino;
> -extern xfs_agblock_t	bnobt_root;
> -extern xfs_agblock_t	bcntbt_root;
> -extern xfs_agblock_t	inobt_root;
> -
>  /* configuration vars -- fs geometry dependent */
>  
>  extern int		inodes_per_block;
> diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> index 94673750..abd568c9 100644
> --- a/repair/xfs_repair.c
> +++ b/repair/xfs_repair.c
> @@ -426,79 +426,21 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"),
>  	*ino = expected_ino;
>  }
>  
> +/*
> + * Make sure that the first 3 inodes in the filesystem are the root directory,
> + * the realtime bitmap, and the realtime summary, in that order.
> + */
>  static void
> -calc_mkfs(xfs_mount_t *mp)
> +calc_mkfs(
> +	struct xfs_mount	*mp)
>  {
> -	xfs_agblock_t	fino_bno;
> -	int		do_inoalign;
> -
> -	do_inoalign = M_IGEO(mp)->ialloc_align;
> -
> -	/*
> -	 * Pre-calculate the geometry of ag 0. We know what it looks like
> -	 * because we know what mkfs does: 2 allocation btree roots (by block
> -	 * and by size), the inode allocation btree root, the free inode
> -	 * allocation btree root (if enabled) and some number of blocks to
> -	 * prefill the agfl.
> -	 *
> -	 * Because the current shape of the btrees may differ from the current
> -	 * shape, we open code the mkfs freelist block count here. mkfs creates
> -	 * single level trees, so the calculation is pertty straight forward for
> -	 * the trees that use the AGFL.
> -	 */
> -	bnobt_root = howmany(4 * mp->m_sb.sb_sectsize, mp->m_sb.sb_blocksize);
> -	bcntbt_root = bnobt_root + 1;
> -	inobt_root = bnobt_root + 2;
> -	fino_bno = inobt_root + (2 * min(2, mp->m_ag_maxlevels)) + 1;
> -	if (xfs_sb_version_hasfinobt(&mp->m_sb))
> -		fino_bno++;
> -	if (xfs_sb_version_hasrmapbt(&mp->m_sb)) {
> -		fino_bno += min(2, mp->m_rmap_maxlevels); /* agfl blocks */
> -		fino_bno++;
> -	}
> -	if (xfs_sb_version_hasreflink(&mp->m_sb))
> -		fino_bno++;
> -
> -	/*
> -	 * If the log is allocated in the first allocation group we need to
> -	 * add the number of blocks used by the log to the above calculation.
> -	 *
> -	 * This can happens with filesystems that only have a single
> -	 * allocation group, or very odd geometries created by old mkfs
> -	 * versions on very small filesystems.
> -	 */
> -	if (mp->m_sb.sb_logstart &&
> -	    XFS_FSB_TO_AGNO(mp, mp->m_sb.sb_logstart) == 0) {
> -
> -		/*
> -		 * XXX(hch): verify that sb_logstart makes sense?
> -		 */
> -		 fino_bno += mp->m_sb.sb_logblocks;
> -	}
> -
> -	/*
> -	 * ditto the location of the first inode chunks in the fs ('/')
> -	 */
> -	if (xfs_sb_version_hasdalign(&mp->m_sb) && do_inoalign)  {
> -		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, roundup(fino_bno,
> -					mp->m_sb.sb_unit));
> -	} else if (xfs_sb_version_hasalign(&mp->m_sb) &&
> -					mp->m_sb.sb_inoalignmt > 1)  {
> -		first_prealloc_ino = XFS_AGB_TO_AGINO(mp,
> -					roundup(fino_bno,
> -						mp->m_sb.sb_inoalignmt));
> -	} else  {
> -		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno);
> -	}
> +	xfs_ino_t		rootino = libxfs_ialloc_calc_rootino(mp, -1);
>  
> -	/*
> -	 * now the first 3 inodes in the system
> -	 */
> -	ensure_fixed_ino(&mp->m_sb.sb_rootino, first_prealloc_ino,
> +	ensure_fixed_ino(&mp->m_sb.sb_rootino, rootino,
>  			_("root"));
> -	ensure_fixed_ino(&mp->m_sb.sb_rbmino, first_prealloc_ino + 1,
> +	ensure_fixed_ino(&mp->m_sb.sb_rbmino, rootino + 1,
>  			_("realtime bitmap"));
> -	ensure_fixed_ino(&mp->m_sb.sb_rsumino, first_prealloc_ino + 2,
> +	ensure_fixed_ino(&mp->m_sb.sb_rsumino, rootino + 2,
>  			_("realtime summary"));
>  }
>  
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it
  2019-12-04 17:05 ` [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it Darrick J. Wong
@ 2019-12-05 14:38   ` Brian Foster
  2019-12-12 22:46     ` [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it\ Darrick J. Wong
  0 siblings, 1 reply; 21+ messages in thread
From: Brian Foster @ 2019-12-05 14:38 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs, alex

On Wed, Dec 04, 2019 at 09:05:02AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If sb_rootino doesn't point to where we think mkfs should have allocated
> the root directory, check to see if the alleged root directory actually
> looks like a root directory.  If so, we'll let it live because someone
> could have changed sunit since formatting time, and that changes the
> root directory inode estimate.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  repair/xfs_repair.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 45 insertions(+)
> 
> 
> diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> index abd568c9..b0407f4b 100644
> --- a/repair/xfs_repair.c
> +++ b/repair/xfs_repair.c
> @@ -426,6 +426,37 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"),
>  	*ino = expected_ino;
>  }
>  
> +/* Does the root directory inode look like a plausible root directory? */
> +static bool
> +has_plausible_rootdir(
> +	struct xfs_mount	*mp)
> +{
> +	struct xfs_inode	*ip;
> +	xfs_ino_t		ino;
> +	int			error;
> +	bool			ret = false;
> +
> +	error = -libxfs_iget(mp, NULL, mp->m_sb.sb_rootino, 0, &ip,
> +			&xfs_default_ifork_ops);
> +	if (error)
> +		goto out;
> +	if (!S_ISDIR(VFS_I(ip)->i_mode))
> +		goto out_rele;
> +
> +	error = -libxfs_dir_lookup(NULL, ip, &xfs_name_dotdot, &ino, NULL);
> +	if (error)
> +		goto out_rele;
> +
> +	/* The root directory '..' entry points to the directory. */
> +	if (ino == mp->m_sb.sb_rootino)
> +		ret = true;
> +
> +out_rele:
> +	libxfs_irele(ip);
> +out:
> +	return ret;
> +}
> +
>  /*
>   * Make sure that the first 3 inodes in the filesystem are the root directory,
>   * the realtime bitmap, and the realtime summary, in that order.
> @@ -436,6 +467,20 @@ calc_mkfs(
>  {
>  	xfs_ino_t		rootino = libxfs_ialloc_calc_rootino(mp, -1);
>  
> +	/*
> +	 * If the root inode isn't where we think it is, check its plausibility
> +	 * as a root directory.  It's possible that somebody changed sunit
> +	 * since the filesystem was created, which can change the value of the
> +	 * above computation.  Don't blow up the root directory if this is the
> +	 * case.
> +	 */
> +	if (mp->m_sb.sb_rootino != rootino && has_plausible_rootdir(mp)) {
> +		do_warn(
> +_("sb root inode value %" PRIu64 " inconsistent with alignment (expected %"PRIu64")\n"),
> +			mp->m_sb.sb_rootino, rootino);
> +		rootino = mp->m_sb.sb_rootino;
> +	}
> +

A slightly unfortunate side effect of this is that there's seemingly no
straightforward way for a user to "clear" this state/warning. We've
solved the major problem by allowing repair to handle this condition,
but AFAICT this warning will persist unless the stripe unit is changed
back to its original value.

IOW, what if this problem exists simply because a user made a mistake
and wants to undo it? It's probably easy enough for us to say "use
whatever you did at mkfs time," but what if that's unknown or was set
automatically? I feel like that is the type of thing that in practice
could result in unnecessary bugs or error reports unless the tool can
make a better suggestion to the end user. For example, could we check
the geometry on secondary supers (if they exist) against the current
rootino and use that as a secondary form of verification and/or suggest
the user reset to that geometry (if desired)? OTOH, I guess we'd have to
consider what happens if the filesystem was grown in that scenario too..
:/

(Actually on a quick test, it looks like growfs updates every super,
even preexisting..).

Brian

>  	ensure_fixed_ino(&mp->m_sb.sb_rootino, rootino,
>  			_("root"));
>  	ensure_fixed_ino(&mp->m_sb.sb_rbmino, rootino + 1,
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers
  2019-12-05 14:37   ` Brian Foster
@ 2019-12-05 16:28     ` Darrick J. Wong
  2019-12-06 16:00       ` Brian Foster
  2019-12-12 19:11       ` Eric Sandeen
  0 siblings, 2 replies; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-05 16:28 UTC (permalink / raw)
  To: Brian Foster; +Cc: sandeen, linux-xfs, alex

On Thu, Dec 05, 2019 at 09:37:27AM -0500, Brian Foster wrote:
> On Wed, Dec 04, 2019 at 09:04:43AM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > xfs_repair has a very old check that evidently excuses the AG 0 inode
> > btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g.
> > AG headers).  mkfs never formats filesystems that way and it looks like
> > an error, so purge the check.  After this, we always complain if inodes
> > overlap with AG headers because that should never happen.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> 
> Strange.. This seems reasonable to me, but any idea on how this might
> have been used in the past?

I don't have a clue -- this code has been there since the start of the
xfsprogs git repo and I don't have the pre-git history.  Dave said
"hysterical raisins".

> The only thing I can see so far is that
> perhaps if the superblock (blocksize/sectorsize) is corrupted, the
> in-core state trees could be badly initialized such that the inode falls
> into the "in use" state. Of course if that were the case the fs probably
> has bigger problems..

Yeah.  These days if all those things collide (or look like they
collide) then chances are the filesystem is already toast.

--D

> Brian
> 
> >  repair/globals.c    |    1 -
> >  repair/globals.h    |    1 -
> >  repair/scan.c       |   19 -------------------
> >  repair/xfs_repair.c |    7 -------
> >  4 files changed, 28 deletions(-)
> > 
> > 
> > diff --git a/repair/globals.c b/repair/globals.c
> > index dcd79ea4..8a60e706 100644
> > --- a/repair/globals.c
> > +++ b/repair/globals.c
> > @@ -73,7 +73,6 @@ int	lost_gquotino;
> >  int	lost_pquotino;
> >  
> >  xfs_agino_t	first_prealloc_ino;
> > -xfs_agino_t	last_prealloc_ino;
> >  xfs_agblock_t	bnobt_root;
> >  xfs_agblock_t	bcntbt_root;
> >  xfs_agblock_t	inobt_root;
> > diff --git a/repair/globals.h b/repair/globals.h
> > index 008bdd90..2ed5c894 100644
> > --- a/repair/globals.h
> > +++ b/repair/globals.h
> > @@ -114,7 +114,6 @@ extern int		lost_gquotino;
> >  extern int		lost_pquotino;
> >  
> >  extern xfs_agino_t	first_prealloc_ino;
> > -extern xfs_agino_t	last_prealloc_ino;
> >  extern xfs_agblock_t	bnobt_root;
> >  extern xfs_agblock_t	bcntbt_root;
> >  extern xfs_agblock_t	inobt_root;
> > diff --git a/repair/scan.c b/repair/scan.c
> > index c383f3aa..05707dd2 100644
> > --- a/repair/scan.c
> > +++ b/repair/scan.c
> > @@ -1645,13 +1645,6 @@ scan_single_ino_chunk(
> >  				break;
> >  			case XR_E_INUSE_FS:
> >  			case XR_E_INUSE_FS1:
> > -				if (agno == 0 &&
> > -				    ino + j >= first_prealloc_ino &&
> > -				    ino + j < last_prealloc_ino) {
> > -					set_bmap(agno, agbno, XR_E_INO);
> > -					break;
> > -				}
> > -				/* fall through */
> >  			default:
> >  				/* XXX - maybe should mark block a duplicate */
> >  				do_warn(
> > @@ -1782,18 +1775,6 @@ _("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\
> >  				break;
> >  			case XR_E_INUSE_FS:
> >  			case XR_E_INUSE_FS1:
> > -				if (agno == 0 &&
> > -				    ino + j >= first_prealloc_ino &&
> > -				    ino + j < last_prealloc_ino) {
> > -					do_warn(
> > -_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"),
> > -						agno, agbno, mp->m_sb.sb_inopblock);
> > -
> > -					set_bmap(agno, agbno, XR_E_INO);
> > -					suspect++;
> > -					break;
> > -				}
> > -				/* fall through */
> >  			default:
> >  				do_warn(
> >  _("inode chunk claims used block, finobt block - agno %d, bno %d, inopb %d\n"),
> > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> > index 9295673d..3e9059f3 100644
> > --- a/repair/xfs_repair.c
> > +++ b/repair/xfs_repair.c
> > @@ -460,13 +460,6 @@ calc_mkfs(xfs_mount_t *mp)
> >  		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno);
> >  	}
> >  
> > -	ASSERT(M_IGEO(mp)->ialloc_blks > 0);
> > -
> > -	if (M_IGEO(mp)->ialloc_blks > 1)
> > -		last_prealloc_ino = first_prealloc_ino + XFS_INODES_PER_CHUNK;
> > -	else
> > -		last_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno + 1);
> > -
> >  	/*
> >  	 * now the first 3 inodes in the system
> >  	 */
> > 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers
  2019-12-05 16:28     ` Darrick J. Wong
@ 2019-12-06 16:00       ` Brian Foster
  2019-12-12 19:11       ` Eric Sandeen
  1 sibling, 0 replies; 21+ messages in thread
From: Brian Foster @ 2019-12-06 16:00 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs, alex

On Thu, Dec 05, 2019 at 08:28:18AM -0800, Darrick J. Wong wrote:
> On Thu, Dec 05, 2019 at 09:37:27AM -0500, Brian Foster wrote:
> > On Wed, Dec 04, 2019 at 09:04:43AM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > xfs_repair has a very old check that evidently excuses the AG 0 inode
> > > btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g.
> > > AG headers).  mkfs never formats filesystems that way and it looks like
> > > an error, so purge the check.  After this, we always complain if inodes
> > > overlap with AG headers because that should never happen.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > 
> > Strange.. This seems reasonable to me, but any idea on how this might
> > have been used in the past?
> 
> I don't have a clue -- this code has been there since the start of the
> xfsprogs git repo and I don't have the pre-git history.  Dave said
> "hysterical raisins".
> 

Heh, Ok.

> > The only thing I can see so far is that
> > perhaps if the superblock (blocksize/sectorsize) is corrupted, the
> > in-core state trees could be badly initialized such that the inode falls
> > into the "in use" state. Of course if that were the case the fs probably
> > has bigger problems..
> 
> Yeah.  These days if all those things collide (or look like they
> collide) then chances are the filesystem is already toast.
> 

I guess I'm curious if/how this could change behavior in some way. It
kind of looks like this could be some kind of override to try and
preserve/prioritize the root inode if something else happens to be
corrupted and conflict. E.g., what happens if a stray rmapbt record
(incorrectly) categorizes this range as something other than inodes
before the inode scan gets to it? Would this change recovery behavior
from something that treats that as a broken rmapbt to something broader,
or is the outcome generally the same?

It looks to me it _could_ change behavior, but that's also considering a
very targeted corruption vs. something more likely to manifest in the
wild. This code clearly predates rmapbt, so that's obviously not the
original intent. I do also find it odd the hysterical code doesn't warn
if this condition occurs..

Brian

> --D
> 
> > Brian
> > 
> > >  repair/globals.c    |    1 -
> > >  repair/globals.h    |    1 -
> > >  repair/scan.c       |   19 -------------------
> > >  repair/xfs_repair.c |    7 -------
> > >  4 files changed, 28 deletions(-)
> > > 
> > > 
> > > diff --git a/repair/globals.c b/repair/globals.c
> > > index dcd79ea4..8a60e706 100644
> > > --- a/repair/globals.c
> > > +++ b/repair/globals.c
> > > @@ -73,7 +73,6 @@ int	lost_gquotino;
> > >  int	lost_pquotino;
> > >  
> > >  xfs_agino_t	first_prealloc_ino;
> > > -xfs_agino_t	last_prealloc_ino;
> > >  xfs_agblock_t	bnobt_root;
> > >  xfs_agblock_t	bcntbt_root;
> > >  xfs_agblock_t	inobt_root;
> > > diff --git a/repair/globals.h b/repair/globals.h
> > > index 008bdd90..2ed5c894 100644
> > > --- a/repair/globals.h
> > > +++ b/repair/globals.h
> > > @@ -114,7 +114,6 @@ extern int		lost_gquotino;
> > >  extern int		lost_pquotino;
> > >  
> > >  extern xfs_agino_t	first_prealloc_ino;
> > > -extern xfs_agino_t	last_prealloc_ino;
> > >  extern xfs_agblock_t	bnobt_root;
> > >  extern xfs_agblock_t	bcntbt_root;
> > >  extern xfs_agblock_t	inobt_root;
> > > diff --git a/repair/scan.c b/repair/scan.c
> > > index c383f3aa..05707dd2 100644
> > > --- a/repair/scan.c
> > > +++ b/repair/scan.c
> > > @@ -1645,13 +1645,6 @@ scan_single_ino_chunk(
> > >  				break;
> > >  			case XR_E_INUSE_FS:
> > >  			case XR_E_INUSE_FS1:
> > > -				if (agno == 0 &&
> > > -				    ino + j >= first_prealloc_ino &&
> > > -				    ino + j < last_prealloc_ino) {
> > > -					set_bmap(agno, agbno, XR_E_INO);
> > > -					break;
> > > -				}
> > > -				/* fall through */
> > >  			default:
> > >  				/* XXX - maybe should mark block a duplicate */
> > >  				do_warn(
> > > @@ -1782,18 +1775,6 @@ _("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\
> > >  				break;
> > >  			case XR_E_INUSE_FS:
> > >  			case XR_E_INUSE_FS1:
> > > -				if (agno == 0 &&
> > > -				    ino + j >= first_prealloc_ino &&
> > > -				    ino + j < last_prealloc_ino) {
> > > -					do_warn(
> > > -_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"),
> > > -						agno, agbno, mp->m_sb.sb_inopblock);
> > > -
> > > -					set_bmap(agno, agbno, XR_E_INO);
> > > -					suspect++;
> > > -					break;
> > > -				}
> > > -				/* fall through */
> > >  			default:
> > >  				do_warn(
> > >  _("inode chunk claims used block, finobt block - agno %d, bno %d, inopb %d\n"),
> > > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> > > index 9295673d..3e9059f3 100644
> > > --- a/repair/xfs_repair.c
> > > +++ b/repair/xfs_repair.c
> > > @@ -460,13 +460,6 @@ calc_mkfs(xfs_mount_t *mp)
> > >  		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno);
> > >  	}
> > >  
> > > -	ASSERT(M_IGEO(mp)->ialloc_blks > 0);
> > > -
> > > -	if (M_IGEO(mp)->ialloc_blks > 1)
> > > -		last_prealloc_ino = first_prealloc_ino + XFS_INODES_PER_CHUNK;
> > > -	else
> > > -		last_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno + 1);
> > > -
> > >  	/*
> > >  	 * now the first 3 inodes in the system
> > >  	 */
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers
  2019-12-05 16:28     ` Darrick J. Wong
  2019-12-06 16:00       ` Brian Foster
@ 2019-12-12 19:11       ` Eric Sandeen
  1 sibling, 0 replies; 21+ messages in thread
From: Eric Sandeen @ 2019-12-12 19:11 UTC (permalink / raw)
  To: Darrick J. Wong, Brian Foster; +Cc: linux-xfs, alex

On 12/5/19 10:28 AM, Darrick J. Wong wrote:
> On Thu, Dec 05, 2019 at 09:37:27AM -0500, Brian Foster wrote:
>> On Wed, Dec 04, 2019 at 09:04:43AM -0800, Darrick J. Wong wrote:
>>> From: Darrick J. Wong <darrick.wong@oracle.com>
>>>
>>> xfs_repair has a very old check that evidently excuses the AG 0 inode
>>> btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g.
>>> AG headers).  mkfs never formats filesystems that way and it looks like
>>> an error, so purge the check.  After this, we always complain if inodes
>>> overlap with AG headers because that should never happen.
>>>
>>> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
>>> ---
>>
>> Strange.. This seems reasonable to me, but any idea on how this might
>> have been used in the past?
> 
> I don't have a clue -- this code has been there since the start of the
> xfsprogs git repo and I don't have the pre-git history.  Dave said
> "hysterical raisins".

Going back to 2001, I still don't know.  It was in the original import.

-Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers
  2019-12-04 17:04 ` [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers Darrick J. Wong
  2019-12-05 14:37   ` Brian Foster
@ 2019-12-12 20:38   ` Eric Sandeen
  2019-12-12 22:10     ` Darrick J. Wong
  1 sibling, 1 reply; 21+ messages in thread
From: Eric Sandeen @ 2019-12-12 20:38 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, alex

On 12/4/19 11:04 AM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> xfs_repair has a very old check that evidently excuses the AG 0 inode
> btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g.
> AG headers).  mkfs never formats filesystems that way and it looks like
> an error, so purge the check.  After this, we always complain if inodes
> overlap with AG headers because that should never happen.

So the only thing I can think here is that if you make a 64k block filesystem
with a 512-byte inode size, you can actually get your first user-created
inode in the range between first_prealloc and last_prealloc.  So that's
maybe a clue.

But then we'd need something to mark all the blocks in that range as
XR_E_INUSE_FS to justify the existence of the test you're removing here,
and I can't make up any story or find anything in old code
that indicates that was ever done.

Still, that's my best guess, that maybe the system preallocated inodes
used to be marked as XR_E_INUSE_FS or something.  But then it's weird
to code "if it's marked XR_E_INUSE_FS and it's in the preallocated
inode range then re-mark it as XR_E_INO"

Maybe the prealloc inode range got calculated later or something...

</handwave>

Anyway, on to the nitpicking!

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  repair/globals.c    |    1 -
>  repair/globals.h    |    1 -
>  repair/scan.c       |   19 -------------------
>  repair/xfs_repair.c |    7 -------
>  4 files changed, 28 deletions(-)
> 
> 
> diff --git a/repair/globals.c b/repair/globals.c
> index dcd79ea4..8a60e706 100644
> --- a/repair/globals.c
> +++ b/repair/globals.c
> @@ -73,7 +73,6 @@ int	lost_gquotino;
>  int	lost_pquotino;
>  
>  xfs_agino_t	first_prealloc_ino;
> -xfs_agino_t	last_prealloc_ino;
>  xfs_agblock_t	bnobt_root;
>  xfs_agblock_t	bcntbt_root;
>  xfs_agblock_t	inobt_root;
> diff --git a/repair/globals.h b/repair/globals.h
> index 008bdd90..2ed5c894 100644
> --- a/repair/globals.h
> +++ b/repair/globals.h
> @@ -114,7 +114,6 @@ extern int		lost_gquotino;
>  extern int		lost_pquotino;
>  
>  extern xfs_agino_t	first_prealloc_ino;
> -extern xfs_agino_t	last_prealloc_ino;
>  extern xfs_agblock_t	bnobt_root;
>  extern xfs_agblock_t	bcntbt_root;
>  extern xfs_agblock_t	inobt_root;
> diff --git a/repair/scan.c b/repair/scan.c
> index c383f3aa..05707dd2 100644
> --- a/repair/scan.c
> +++ b/repair/scan.c
> @@ -1645,13 +1645,6 @@ scan_single_ino_chunk(
>  				break;
>  			case XR_E_INUSE_FS:
>  			case XR_E_INUSE_FS1:
> -				if (agno == 0 &&
> -				    ino + j >= first_prealloc_ino &&
> -				    ino + j < last_prealloc_ino) {
> -					set_bmap(agno, agbno, XR_E_INO);
> -					break;
> -				}
> -				/* fall through */
>  			default:
>  				/* XXX - maybe should mark block a duplicate */
>  				do_warn(
> @@ -1782,18 +1775,6 @@ _("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\
>  				break;
>  			case XR_E_INUSE_FS:
>  			case XR_E_INUSE_FS1:

I guess there's no real reason to list a couple cases that all fall through
to default:, I'd just remove them as well since they aren't any more special
than the other unmentioned cases.

> -				if (agno == 0 &&
> -				    ino + j >= first_prealloc_ino &&
> -				    ino + j < last_prealloc_ino) {
> -					do_warn(
> -_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"),
> -						agno, agbno, mp->m_sb.sb_inopblock);
> -
> -					set_bmap(agno, agbno, XR_E_INO);
> -					suspect++;
> -					break;
> -				}
> -				/* fall through */
>  			default:
>  				do_warn(
>  _("inode chunk claims used block, finobt block - agno %d, bno %d, inopb %d\n"),
> diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> index 9295673d..3e9059f3 100644
> --- a/repair/xfs_repair.c
> +++ b/repair/xfs_repair.c
> @@ -460,13 +460,6 @@ calc_mkfs(xfs_mount_t *mp)
>  		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno);
>  	}
>  
> -	ASSERT(M_IGEO(mp)->ialloc_blks > 0);
> -
> -	if (M_IGEO(mp)->ialloc_blks > 1)
> -		last_prealloc_ino = first_prealloc_ino + XFS_INODES_PER_CHUNK;
> -	else
> -		last_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno + 1);
> -
>  	/*
>  	 * now the first 3 inodes in the system
>  	 */
> 

Otherwise I think I'm ok with this, I can't convince myself that there's any
reason to keep it.

I can drop the extra cases if you agree.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers
  2019-12-12 20:38   ` Eric Sandeen
@ 2019-12-12 22:10     ` Darrick J. Wong
  0 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-12 22:10 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-xfs, alex

On Thu, Dec 12, 2019 at 02:38:12PM -0600, Eric Sandeen wrote:
> On 12/4/19 11:04 AM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > xfs_repair has a very old check that evidently excuses the AG 0 inode
> > btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g.
> > AG headers).  mkfs never formats filesystems that way and it looks like
> > an error, so purge the check.  After this, we always complain if inodes
> > overlap with AG headers because that should never happen.
> 
> So the only thing I can think here is that if you make a 64k block filesystem
> with a 512-byte inode size, you can actually get your first user-created
> inode in the range between first_prealloc and last_prealloc.  So that's
> maybe a clue.
> 
> But then we'd need something to mark all the blocks in that range as
> XR_E_INUSE_FS to justify the existence of the test you're removing here,
> and I can't make up any story or find anything in old code
> that indicates that was ever done.
> 
> Still, that's my best guess, that maybe the system preallocated inodes
> used to be marked as XR_E_INUSE_FS or something.  But then it's weird
> to code "if it's marked XR_E_INUSE_FS and it's in the preallocated
> inode range then re-mark it as XR_E_INO"
> 
> Maybe the prealloc inode range got calculated later or something...
> 
> </handwave>
> 
> Anyway, on to the nitpicking!
> 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  repair/globals.c    |    1 -
> >  repair/globals.h    |    1 -
> >  repair/scan.c       |   19 -------------------
> >  repair/xfs_repair.c |    7 -------
> >  4 files changed, 28 deletions(-)
> > 
> > 
> > diff --git a/repair/globals.c b/repair/globals.c
> > index dcd79ea4..8a60e706 100644
> > --- a/repair/globals.c
> > +++ b/repair/globals.c
> > @@ -73,7 +73,6 @@ int	lost_gquotino;
> >  int	lost_pquotino;
> >  
> >  xfs_agino_t	first_prealloc_ino;
> > -xfs_agino_t	last_prealloc_ino;
> >  xfs_agblock_t	bnobt_root;
> >  xfs_agblock_t	bcntbt_root;
> >  xfs_agblock_t	inobt_root;
> > diff --git a/repair/globals.h b/repair/globals.h
> > index 008bdd90..2ed5c894 100644
> > --- a/repair/globals.h
> > +++ b/repair/globals.h
> > @@ -114,7 +114,6 @@ extern int		lost_gquotino;
> >  extern int		lost_pquotino;
> >  
> >  extern xfs_agino_t	first_prealloc_ino;
> > -extern xfs_agino_t	last_prealloc_ino;
> >  extern xfs_agblock_t	bnobt_root;
> >  extern xfs_agblock_t	bcntbt_root;
> >  extern xfs_agblock_t	inobt_root;
> > diff --git a/repair/scan.c b/repair/scan.c
> > index c383f3aa..05707dd2 100644
> > --- a/repair/scan.c
> > +++ b/repair/scan.c
> > @@ -1645,13 +1645,6 @@ scan_single_ino_chunk(
> >  				break;
> >  			case XR_E_INUSE_FS:
> >  			case XR_E_INUSE_FS1:
> > -				if (agno == 0 &&
> > -				    ino + j >= first_prealloc_ino &&
> > -				    ino + j < last_prealloc_ino) {
> > -					set_bmap(agno, agbno, XR_E_INO);
> > -					break;
> > -				}
> > -				/* fall through */
> >  			default:
> >  				/* XXX - maybe should mark block a duplicate */
> >  				do_warn(
> > @@ -1782,18 +1775,6 @@ _("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\
> >  				break;
> >  			case XR_E_INUSE_FS:
> >  			case XR_E_INUSE_FS1:
> 
> I guess there's no real reason to list a couple cases that all fall through
> to default:, I'd just remove them as well since they aren't any more special
> than the other unmentioned cases.
> 
> > -				if (agno == 0 &&
> > -				    ino + j >= first_prealloc_ino &&
> > -				    ino + j < last_prealloc_ino) {
> > -					do_warn(
> > -_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"),
> > -						agno, agbno, mp->m_sb.sb_inopblock);
> > -
> > -					set_bmap(agno, agbno, XR_E_INO);
> > -					suspect++;
> > -					break;
> > -				}
> > -				/* fall through */
> >  			default:
> >  				do_warn(
> >  _("inode chunk claims used block, finobt block - agno %d, bno %d, inopb %d\n"),
> > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> > index 9295673d..3e9059f3 100644
> > --- a/repair/xfs_repair.c
> > +++ b/repair/xfs_repair.c
> > @@ -460,13 +460,6 @@ calc_mkfs(xfs_mount_t *mp)
> >  		first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno);
> >  	}
> >  
> > -	ASSERT(M_IGEO(mp)->ialloc_blks > 0);
> > -
> > -	if (M_IGEO(mp)->ialloc_blks > 1)
> > -		last_prealloc_ino = first_prealloc_ino + XFS_INODES_PER_CHUNK;
> > -	else
> > -		last_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno + 1);
> > -
> >  	/*
> >  	 * now the first 3 inodes in the system
> >  	 */
> > 
> 
> Otherwise I think I'm ok with this, I can't convince myself that there's any
> reason to keep it.
> 
> I can drop the extra cases if you agree.

Yeah, seems fine to me.

--D

> Reviewed-by: Eric Sandeen <sandeen@redhat.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it\
  2019-12-05 14:38   ` Brian Foster
@ 2019-12-12 22:46     ` Darrick J. Wong
  2019-12-13 11:19       ` Brian Foster
  0 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-12 22:46 UTC (permalink / raw)
  To: Brian Foster; +Cc: sandeen, linux-xfs, alex

On Thu, Dec 05, 2019 at 09:38:58AM -0500, Brian Foster wrote:
> On Wed, Dec 04, 2019 at 09:05:02AM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > If sb_rootino doesn't point to where we think mkfs should have allocated
> > the root directory, check to see if the alleged root directory actually
> > looks like a root directory.  If so, we'll let it live because someone
> > could have changed sunit since formatting time, and that changes the
> > root directory inode estimate.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  repair/xfs_repair.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 45 insertions(+)
> > 
> > 
> > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> > index abd568c9..b0407f4b 100644
> > --- a/repair/xfs_repair.c
> > +++ b/repair/xfs_repair.c
> > @@ -426,6 +426,37 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"),
> >  	*ino = expected_ino;
> >  }
> >  
> > +/* Does the root directory inode look like a plausible root directory? */
> > +static bool
> > +has_plausible_rootdir(
> > +	struct xfs_mount	*mp)
> > +{
> > +	struct xfs_inode	*ip;
> > +	xfs_ino_t		ino;
> > +	int			error;
> > +	bool			ret = false;
> > +
> > +	error = -libxfs_iget(mp, NULL, mp->m_sb.sb_rootino, 0, &ip,
> > +			&xfs_default_ifork_ops);
> > +	if (error)
> > +		goto out;
> > +	if (!S_ISDIR(VFS_I(ip)->i_mode))
> > +		goto out_rele;
> > +
> > +	error = -libxfs_dir_lookup(NULL, ip, &xfs_name_dotdot, &ino, NULL);
> > +	if (error)
> > +		goto out_rele;
> > +
> > +	/* The root directory '..' entry points to the directory. */
> > +	if (ino == mp->m_sb.sb_rootino)
> > +		ret = true;
> > +
> > +out_rele:
> > +	libxfs_irele(ip);
> > +out:
> > +	return ret;
> > +}
> > +
> >  /*
> >   * Make sure that the first 3 inodes in the filesystem are the root directory,
> >   * the realtime bitmap, and the realtime summary, in that order.
> > @@ -436,6 +467,20 @@ calc_mkfs(
> >  {
> >  	xfs_ino_t		rootino = libxfs_ialloc_calc_rootino(mp, -1);
> >  
> > +	/*
> > +	 * If the root inode isn't where we think it is, check its plausibility
> > +	 * as a root directory.  It's possible that somebody changed sunit
> > +	 * since the filesystem was created, which can change the value of the
> > +	 * above computation.  Don't blow up the root directory if this is the
> > +	 * case.
> > +	 */
> > +	if (mp->m_sb.sb_rootino != rootino && has_plausible_rootdir(mp)) {
> > +		do_warn(
> > +_("sb root inode value %" PRIu64 " inconsistent with alignment (expected %"PRIu64")\n"),
> > +			mp->m_sb.sb_rootino, rootino);
> > +		rootino = mp->m_sb.sb_rootino;
> > +	}
> > +
> 
> A slightly unfortunate side effect of this is that there's seemingly no
> straightforward way for a user to "clear" this state/warning. We've
> solved the major problem by allowing repair to handle this condition,
> but AFAICT this warning will persist unless the stripe unit is changed
> back to its original value.

Heh, I apparently never replied to this. :(

> IOW, what if this problem exists simply because a user made a mistake
> and wants to undo it? It's probably easy enough for us to say "use
> whatever you did at mkfs time," but what if that's unknown or was set
> automatically? I feel like that is the type of thing that in practice
> could result in unnecessary bugs or error reports unless the tool can
> make a better suggestion to the end user. For example, could we check
> the geometry on secondary supers (if they exist) against the current
> rootino and use that as a secondary form of verification and/or suggest
> the user reset to that geometry (if desired)?

That sounds reasonable.

> OTOH, I guess we'd have to consider what happens if the filesystem was
> grown in that scenario too..  :/

I think it would be fine, so long as we're careful with the if-then
chain.  Specifically:

a. If we dislike the rootino that we compute with the ondisk sunit value,
and...

b. The thing sb_rootino points to actually does look like the root
directory, and...

c. One of the secondary supers has an sunit value that gives us a
rootino calculation that matches the sb_rootino that we just checked
out...

...then we'll propose correcting the primary sb_unit to the value we
found in (c).

> 
> (Actually on a quick test, it looks like growfs updates every super,
> even preexisting..).

I'll throw that onto the V3 series.

--D

> 
> Brian
> 
> >  	ensure_fixed_ino(&mp->m_sb.sb_rootino, rootino,
> >  			_("root"));
> >  	ensure_fixed_ino(&mp->m_sb.sb_rbmino, rootino + 1,
> > 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it\
  2019-12-12 22:46     ` [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it\ Darrick J. Wong
@ 2019-12-13 11:19       ` Brian Foster
  2019-12-16 16:34         ` Darrick J. Wong
  0 siblings, 1 reply; 21+ messages in thread
From: Brian Foster @ 2019-12-13 11:19 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs, alex

On Thu, Dec 12, 2019 at 02:46:18PM -0800, Darrick J. Wong wrote:
> On Thu, Dec 05, 2019 at 09:38:58AM -0500, Brian Foster wrote:
> > On Wed, Dec 04, 2019 at 09:05:02AM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > If sb_rootino doesn't point to where we think mkfs should have allocated
> > > the root directory, check to see if the alleged root directory actually
> > > looks like a root directory.  If so, we'll let it live because someone
> > > could have changed sunit since formatting time, and that changes the
> > > root directory inode estimate.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  repair/xfs_repair.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 45 insertions(+)
> > > 
> > > 
> > > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> > > index abd568c9..b0407f4b 100644
> > > --- a/repair/xfs_repair.c
> > > +++ b/repair/xfs_repair.c
> > > @@ -426,6 +426,37 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"),
> > >  	*ino = expected_ino;
> > >  }
> > >  
> > > +/* Does the root directory inode look like a plausible root directory? */
> > > +static bool
> > > +has_plausible_rootdir(
> > > +	struct xfs_mount	*mp)
> > > +{
> > > +	struct xfs_inode	*ip;
> > > +	xfs_ino_t		ino;
> > > +	int			error;
> > > +	bool			ret = false;
> > > +
> > > +	error = -libxfs_iget(mp, NULL, mp->m_sb.sb_rootino, 0, &ip,
> > > +			&xfs_default_ifork_ops);
> > > +	if (error)
> > > +		goto out;
> > > +	if (!S_ISDIR(VFS_I(ip)->i_mode))
> > > +		goto out_rele;
> > > +
> > > +	error = -libxfs_dir_lookup(NULL, ip, &xfs_name_dotdot, &ino, NULL);
> > > +	if (error)
> > > +		goto out_rele;
> > > +
> > > +	/* The root directory '..' entry points to the directory. */
> > > +	if (ino == mp->m_sb.sb_rootino)
> > > +		ret = true;
> > > +
> > > +out_rele:
> > > +	libxfs_irele(ip);
> > > +out:
> > > +	return ret;
> > > +}
> > > +
> > >  /*
> > >   * Make sure that the first 3 inodes in the filesystem are the root directory,
> > >   * the realtime bitmap, and the realtime summary, in that order.
> > > @@ -436,6 +467,20 @@ calc_mkfs(
> > >  {
> > >  	xfs_ino_t		rootino = libxfs_ialloc_calc_rootino(mp, -1);
> > >  
> > > +	/*
> > > +	 * If the root inode isn't where we think it is, check its plausibility
> > > +	 * as a root directory.  It's possible that somebody changed sunit
> > > +	 * since the filesystem was created, which can change the value of the
> > > +	 * above computation.  Don't blow up the root directory if this is the
> > > +	 * case.
> > > +	 */
> > > +	if (mp->m_sb.sb_rootino != rootino && has_plausible_rootdir(mp)) {
> > > +		do_warn(
> > > +_("sb root inode value %" PRIu64 " inconsistent with alignment (expected %"PRIu64")\n"),
> > > +			mp->m_sb.sb_rootino, rootino);
> > > +		rootino = mp->m_sb.sb_rootino;
> > > +	}
> > > +
> > 
> > A slightly unfortunate side effect of this is that there's seemingly no
> > straightforward way for a user to "clear" this state/warning. We've
> > solved the major problem by allowing repair to handle this condition,
> > but AFAICT this warning will persist unless the stripe unit is changed
> > back to its original value.
> 
> Heh, I apparently never replied to this. :(
> 
> > IOW, what if this problem exists simply because a user made a mistake
> > and wants to undo it? It's probably easy enough for us to say "use
> > whatever you did at mkfs time," but what if that's unknown or was set
> > automatically? I feel like that is the type of thing that in practice
> > could result in unnecessary bugs or error reports unless the tool can
> > make a better suggestion to the end user. For example, could we check
> > the geometry on secondary supers (if they exist) against the current
> > rootino and use that as a secondary form of verification and/or suggest
> > the user reset to that geometry (if desired)?
> 
> That sounds reasonable.
> 
> > OTOH, I guess we'd have to consider what happens if the filesystem was
> > grown in that scenario too..  :/
> 
> I think it would be fine, so long as we're careful with the if-then
> chain.  Specifically:
> 
> a. If we dislike the rootino that we compute with the ondisk sunit value,
> and...
> 
> b. The thing sb_rootino points to actually does look like the root
> directory, and...
> 
> c. One of the secondary supers has an sunit value that gives us a
> rootino calculation that matches the sb_rootino that we just checked
> out...
> 
> ...then we'll propose correcting the primary sb_unit to the value we
> found in (c).
> 

Yeah, that makes sense. My broader concern was addressing the situation
where we aren't lucky enough to glean original alignment from the fs.
Perhaps we could 1.) update the warning message to unconditionally
recommend an alignment and 2.) if nothing is gleaned from secondary
supers (and all your above conditions apply), calculate and recommend
the max alignment that accommodates the root inode chunk..? It might not
be the original value, but at least guides the user to a solution to
quiet the warning..

Brian

> > 
> > (Actually on a quick test, it looks like growfs updates every super,
> > even preexisting..).
> 
> I'll throw that onto the V3 series.
> 
> --D
> 
> > 
> > Brian
> > 
> > >  	ensure_fixed_ino(&mp->m_sb.sb_rootino, rootino,
> > >  			_("root"));
> > >  	ensure_fixed_ino(&mp->m_sb.sb_rbmino, rootino + 1,
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it\
  2019-12-13 11:19       ` Brian Foster
@ 2019-12-16 16:34         ` Darrick J. Wong
  2019-12-17 11:32           ` Brian Foster
  0 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2019-12-16 16:34 UTC (permalink / raw)
  To: Brian Foster; +Cc: sandeen, linux-xfs, alex

On Fri, Dec 13, 2019 at 06:19:08AM -0500, Brian Foster wrote:
> On Thu, Dec 12, 2019 at 02:46:18PM -0800, Darrick J. Wong wrote:
> > On Thu, Dec 05, 2019 at 09:38:58AM -0500, Brian Foster wrote:
> > > On Wed, Dec 04, 2019 at 09:05:02AM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > If sb_rootino doesn't point to where we think mkfs should have allocated
> > > > the root directory, check to see if the alleged root directory actually
> > > > looks like a root directory.  If so, we'll let it live because someone
> > > > could have changed sunit since formatting time, and that changes the
> > > > root directory inode estimate.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  repair/xfs_repair.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
> > > >  1 file changed, 45 insertions(+)
> > > > 
> > > > 
> > > > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> > > > index abd568c9..b0407f4b 100644
> > > > --- a/repair/xfs_repair.c
> > > > +++ b/repair/xfs_repair.c
> > > > @@ -426,6 +426,37 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"),
> > > >  	*ino = expected_ino;
> > > >  }
> > > >  
> > > > +/* Does the root directory inode look like a plausible root directory? */
> > > > +static bool
> > > > +has_plausible_rootdir(
> > > > +	struct xfs_mount	*mp)
> > > > +{
> > > > +	struct xfs_inode	*ip;
> > > > +	xfs_ino_t		ino;
> > > > +	int			error;
> > > > +	bool			ret = false;
> > > > +
> > > > +	error = -libxfs_iget(mp, NULL, mp->m_sb.sb_rootino, 0, &ip,
> > > > +			&xfs_default_ifork_ops);
> > > > +	if (error)
> > > > +		goto out;
> > > > +	if (!S_ISDIR(VFS_I(ip)->i_mode))
> > > > +		goto out_rele;
> > > > +
> > > > +	error = -libxfs_dir_lookup(NULL, ip, &xfs_name_dotdot, &ino, NULL);
> > > > +	if (error)
> > > > +		goto out_rele;
> > > > +
> > > > +	/* The root directory '..' entry points to the directory. */
> > > > +	if (ino == mp->m_sb.sb_rootino)
> > > > +		ret = true;
> > > > +
> > > > +out_rele:
> > > > +	libxfs_irele(ip);
> > > > +out:
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  /*
> > > >   * Make sure that the first 3 inodes in the filesystem are the root directory,
> > > >   * the realtime bitmap, and the realtime summary, in that order.
> > > > @@ -436,6 +467,20 @@ calc_mkfs(
> > > >  {
> > > >  	xfs_ino_t		rootino = libxfs_ialloc_calc_rootino(mp, -1);
> > > >  
> > > > +	/*
> > > > +	 * If the root inode isn't where we think it is, check its plausibility
> > > > +	 * as a root directory.  It's possible that somebody changed sunit
> > > > +	 * since the filesystem was created, which can change the value of the
> > > > +	 * above computation.  Don't blow up the root directory if this is the
> > > > +	 * case.
> > > > +	 */
> > > > +	if (mp->m_sb.sb_rootino != rootino && has_plausible_rootdir(mp)) {
> > > > +		do_warn(
> > > > +_("sb root inode value %" PRIu64 " inconsistent with alignment (expected %"PRIu64")\n"),
> > > > +			mp->m_sb.sb_rootino, rootino);
> > > > +		rootino = mp->m_sb.sb_rootino;
> > > > +	}
> > > > +
> > > 
> > > A slightly unfortunate side effect of this is that there's seemingly no
> > > straightforward way for a user to "clear" this state/warning. We've
> > > solved the major problem by allowing repair to handle this condition,
> > > but AFAICT this warning will persist unless the stripe unit is changed
> > > back to its original value.
> > 
> > Heh, I apparently never replied to this. :(
> > 
> > > IOW, what if this problem exists simply because a user made a mistake
> > > and wants to undo it? It's probably easy enough for us to say "use
> > > whatever you did at mkfs time," but what if that's unknown or was set
> > > automatically? I feel like that is the type of thing that in practice
> > > could result in unnecessary bugs or error reports unless the tool can
> > > make a better suggestion to the end user. For example, could we check
> > > the geometry on secondary supers (if they exist) against the current
> > > rootino and use that as a secondary form of verification and/or suggest
> > > the user reset to that geometry (if desired)?
> > 
> > That sounds reasonable.
> > 
> > > OTOH, I guess we'd have to consider what happens if the filesystem was
> > > grown in that scenario too..  :/
> > 
> > I think it would be fine, so long as we're careful with the if-then
> > chain.  Specifically:
> > 
> > a. If we dislike the rootino that we compute with the ondisk sunit value,
> > and...
> > 
> > b. The thing sb_rootino points to actually does look like the root
> > directory, and...
> > 
> > c. One of the secondary supers has an sunit value that gives us a
> > rootino calculation that matches the sb_rootino that we just checked
> > out...
> > 
> > ...then we'll propose correcting the primary sb_unit to the value we
> > found in (c).
> > 
> 
> Yeah, that makes sense. My broader concern was addressing the situation
> where we aren't lucky enough to glean original alignment from the fs.
> Perhaps we could 1.) update the warning message to unconditionally
> recommend an alignment and 2.) if nothing is gleaned from secondary
> supers (and all your above conditions apply), calculate and recommend
> the max alignment that accommodates the root inode chunk..? It might not
> be the original value, but at least guides the user to a solution to
> quiet the warning..

Hmm, I suppose if the secondary sb scan didn't produce any usable values
then we could just try increasing powers of two until the computed
rootino value >= sb_rootino in the hopes of finding one.

I'm not sure how I feel about repair guessing values until it finds one
that shuts off the warning light, though.  Is doing so foolishness, or
is it AI? :)

--D

> Brian
> 
> > > 
> > > (Actually on a quick test, it looks like growfs updates every super,
> > > even preexisting..).
> > 
> > I'll throw that onto the V3 series.
> > 
> > --D
> > 
> > > 
> > > Brian
> > > 
> > > >  	ensure_fixed_ino(&mp->m_sb.sb_rootino, rootino,
> > > >  			_("root"));
> > > >  	ensure_fixed_ino(&mp->m_sb.sb_rbmino, rootino + 1,
> > > > 
> > > 
> > 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it\
  2019-12-16 16:34         ` Darrick J. Wong
@ 2019-12-17 11:32           ` Brian Foster
  0 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2019-12-17 11:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs, alex

On Mon, Dec 16, 2019 at 08:34:57AM -0800, Darrick J. Wong wrote:
> On Fri, Dec 13, 2019 at 06:19:08AM -0500, Brian Foster wrote:
> > On Thu, Dec 12, 2019 at 02:46:18PM -0800, Darrick J. Wong wrote:
> > > On Thu, Dec 05, 2019 at 09:38:58AM -0500, Brian Foster wrote:
> > > > On Wed, Dec 04, 2019 at 09:05:02AM -0800, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > 
> > > > > If sb_rootino doesn't point to where we think mkfs should have allocated
> > > > > the root directory, check to see if the alleged root directory actually
> > > > > looks like a root directory.  If so, we'll let it live because someone
> > > > > could have changed sunit since formatting time, and that changes the
> > > > > root directory inode estimate.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > ---
> > > > >  repair/xfs_repair.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
> > > > >  1 file changed, 45 insertions(+)
> > > > > 
> > > > > 
> > > > > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> > > > > index abd568c9..b0407f4b 100644
> > > > > --- a/repair/xfs_repair.c
> > > > > +++ b/repair/xfs_repair.c
> > > > > @@ -426,6 +426,37 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"),
> > > > >  	*ino = expected_ino;
> > > > >  }
> > > > >  
> > > > > +/* Does the root directory inode look like a plausible root directory? */
> > > > > +static bool
> > > > > +has_plausible_rootdir(
> > > > > +	struct xfs_mount	*mp)
> > > > > +{
> > > > > +	struct xfs_inode	*ip;
> > > > > +	xfs_ino_t		ino;
> > > > > +	int			error;
> > > > > +	bool			ret = false;
> > > > > +
> > > > > +	error = -libxfs_iget(mp, NULL, mp->m_sb.sb_rootino, 0, &ip,
> > > > > +			&xfs_default_ifork_ops);
> > > > > +	if (error)
> > > > > +		goto out;
> > > > > +	if (!S_ISDIR(VFS_I(ip)->i_mode))
> > > > > +		goto out_rele;
> > > > > +
> > > > > +	error = -libxfs_dir_lookup(NULL, ip, &xfs_name_dotdot, &ino, NULL);
> > > > > +	if (error)
> > > > > +		goto out_rele;
> > > > > +
> > > > > +	/* The root directory '..' entry points to the directory. */
> > > > > +	if (ino == mp->m_sb.sb_rootino)
> > > > > +		ret = true;
> > > > > +
> > > > > +out_rele:
> > > > > +	libxfs_irele(ip);
> > > > > +out:
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > >  /*
> > > > >   * Make sure that the first 3 inodes in the filesystem are the root directory,
> > > > >   * the realtime bitmap, and the realtime summary, in that order.
> > > > > @@ -436,6 +467,20 @@ calc_mkfs(
> > > > >  {
> > > > >  	xfs_ino_t		rootino = libxfs_ialloc_calc_rootino(mp, -1);
> > > > >  
> > > > > +	/*
> > > > > +	 * If the root inode isn't where we think it is, check its plausibility
> > > > > +	 * as a root directory.  It's possible that somebody changed sunit
> > > > > +	 * since the filesystem was created, which can change the value of the
> > > > > +	 * above computation.  Don't blow up the root directory if this is the
> > > > > +	 * case.
> > > > > +	 */
> > > > > +	if (mp->m_sb.sb_rootino != rootino && has_plausible_rootdir(mp)) {
> > > > > +		do_warn(
> > > > > +_("sb root inode value %" PRIu64 " inconsistent with alignment (expected %"PRIu64")\n"),
> > > > > +			mp->m_sb.sb_rootino, rootino);
> > > > > +		rootino = mp->m_sb.sb_rootino;
> > > > > +	}
> > > > > +
> > > > 
> > > > A slightly unfortunate side effect of this is that there's seemingly no
> > > > straightforward way for a user to "clear" this state/warning. We've
> > > > solved the major problem by allowing repair to handle this condition,
> > > > but AFAICT this warning will persist unless the stripe unit is changed
> > > > back to its original value.
> > > 
> > > Heh, I apparently never replied to this. :(
> > > 
> > > > IOW, what if this problem exists simply because a user made a mistake
> > > > and wants to undo it? It's probably easy enough for us to say "use
> > > > whatever you did at mkfs time," but what if that's unknown or was set
> > > > automatically? I feel like that is the type of thing that in practice
> > > > could result in unnecessary bugs or error reports unless the tool can
> > > > make a better suggestion to the end user. For example, could we check
> > > > the geometry on secondary supers (if they exist) against the current
> > > > rootino and use that as a secondary form of verification and/or suggest
> > > > the user reset to that geometry (if desired)?
> > > 
> > > That sounds reasonable.
> > > 
> > > > OTOH, I guess we'd have to consider what happens if the filesystem was
> > > > grown in that scenario too..  :/
> > > 
> > > I think it would be fine, so long as we're careful with the if-then
> > > chain.  Specifically:
> > > 
> > > a. If we dislike the rootino that we compute with the ondisk sunit value,
> > > and...
> > > 
> > > b. The thing sb_rootino points to actually does look like the root
> > > directory, and...
> > > 
> > > c. One of the secondary supers has an sunit value that gives us a
> > > rootino calculation that matches the sb_rootino that we just checked
> > > out...
> > > 
> > > ...then we'll propose correcting the primary sb_unit to the value we
> > > found in (c).
> > > 
> > 
> > Yeah, that makes sense. My broader concern was addressing the situation
> > where we aren't lucky enough to glean original alignment from the fs.
> > Perhaps we could 1.) update the warning message to unconditionally
> > recommend an alignment and 2.) if nothing is gleaned from secondary
> > supers (and all your above conditions apply), calculate and recommend
> > the max alignment that accommodates the root inode chunk..? It might not
> > be the original value, but at least guides the user to a solution to
> > quiet the warning..
> 
> Hmm, I suppose if the secondary sb scan didn't produce any usable values
> then we could just try increasing powers of two until the computed
> rootino value >= sb_rootino in the hopes of finding one.
> 
> I'm not sure how I feel about repair guessing values until it finds one
> that shuts off the warning light, though.  Is doing so foolishness, or
> is it AI? :)
> 

Heh. I'm not sure what the right answer is on that. I guess if we dumped
out the minimum valid alignment and then made it clear that we couldn't
detect historical alignment from the fs (so it's a guess/recommendation
to clear the warning), that might be the best we can do.

Brian

> --D
> 
> > Brian
> > 
> > > > 
> > > > (Actually on a quick test, it looks like growfs updates every super,
> > > > even preexisting..).
> > > 
> > > I'll throw that onto the V3 series.
> > > 
> > > --D
> > > 
> > > > 
> > > > Brian
> > > > 
> > > > >  	ensure_fixed_ino(&mp->m_sb.sb_rootino, rootino,
> > > > >  			_("root"));
> > > > >  	ensure_fixed_ino(&mp->m_sb.sb_rbmino, rootino + 1,
> > > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2019-12-17 11:32 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-04 17:04 [PATCH v2 0/6] xfs_repair: do not trash valid root dirs Darrick J. Wong
2019-12-04 17:04 ` [PATCH 1/6] xfs: don't commit sunit/swidth updates to disk if that would cause repair failures Darrick J. Wong
2019-12-04 17:04 ` [PATCH 2/6] mkfs: check root inode location Darrick J. Wong
2019-12-05 14:36   ` Brian Foster
2019-12-04 17:04 ` [PATCH 3/6] xfs_repair: enforce that inode btree chunks can't point to AG headers Darrick J. Wong
2019-12-05 14:37   ` Brian Foster
2019-12-05 16:28     ` Darrick J. Wong
2019-12-06 16:00       ` Brian Foster
2019-12-12 19:11       ` Eric Sandeen
2019-12-12 20:38   ` Eric Sandeen
2019-12-12 22:10     ` Darrick J. Wong
2019-12-04 17:04 ` [PATCH 4/6] xfs_repair: refactor fixed inode location checks Darrick J. Wong
2019-12-05 14:37   ` Brian Foster
2019-12-04 17:04 ` [PATCH 5/6] xfs_repair: use libxfs function to calculate root inode location Darrick J. Wong
2019-12-05 14:37   ` Brian Foster
2019-12-04 17:05 ` [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it Darrick J. Wong
2019-12-05 14:38   ` Brian Foster
2019-12-12 22:46     ` [PATCH 6/6] xfs_repair: check plausibility of root dir pointer before trashing it\ Darrick J. Wong
2019-12-13 11:19       ` Brian Foster
2019-12-16 16:34         ` Darrick J. Wong
2019-12-17 11:32           ` Brian Foster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.