Hi all, Alex Lyakas observed that xfs_repair can accidentally trash the root directory on a filesystem. Specifically, if one formats a V4 filesystem without sparse inodes but with sunit/swidth set and then mounts that filesystem with a different sunit/swidth, the kernel will update the values in the superblock. This causes xfs_repair's sb_rootino estimation to differ from the actual root directory, and it discards the actual root directory even though there's nothing wrong with it. Therefore, hoist the logic that computes the root inode location into libxfs so that the kernel will avoid the sb update if the proposed sunit/swidth changes would alter the sb_rootino estimation; and then teach xfs_repair to retain the root directory even if the estimation doesn't add up, as long as sb_rootino points to a directory whose '..' entry points to itself. v4 rebases on the latest for-next and adds in a few things that Eric and I discussed during the review of v3. If you're going to start using this mess, you probably ought to just pull from my git trees, which are linked below. This is an extraordinary way to destroy everything. Enjoy! Comments and questions are, as always, welcome. --D xfsprogs git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=repair-find-rootdir fstests git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=repair-find-rootdir
From: Darrick J. Wong <darrick.wong@oracle.com> Repair uses the verify_inum function to validate inode numbers that it finds in the superblock and in directories. libxfs now has validator functions to cover that kind of thing, so remove verify_inum(). As a side bonus, this means that we will flag directories that point to the quota/realtime metadata inodes. This fixes a regression found by fuzzing u3.sfdir3.hdr.parent.i4 to lastbit (aka making a directory's .. point to the user quota inode) in xfs/384. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- libxfs/libxfs_api_defs.h | 1 + repair/dino_chunks.c | 2 +- repair/dinode.c | 29 ----------------------------- repair/dinode.h | 4 ---- repair/dir2.c | 7 +++---- repair/phase4.c | 12 ++++++------ repair/phase6.c | 8 ++++---- 7 files changed, 15 insertions(+), 48 deletions(-) diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h index 6e09685b..9daf2635 100644 --- a/libxfs/libxfs_api_defs.h +++ b/libxfs/libxfs_api_defs.h @@ -176,6 +176,7 @@ #define xfs_trans_roll libxfs_trans_roll #define xfs_verify_cksum libxfs_verify_cksum +#define xfs_verify_dir_ino libxfs_verify_dir_ino #define xfs_verify_ino libxfs_verify_ino #define xfs_verify_rtbno libxfs_verify_rtbno #define xfs_zero_extent libxfs_zero_extent diff --git a/repair/dino_chunks.c b/repair/dino_chunks.c index 00b67468..dbf3d37a 100644 --- a/repair/dino_chunks.c +++ b/repair/dino_chunks.c @@ -65,7 +65,7 @@ check_aginode_block(xfs_mount_t *mp, * inode chunk. returns number of new inodes if things are good * and 0 if bad. start is the start of the discovered inode chunk. * routine assumes that ino is a legal inode number - * (verified by verify_inum()). If the inode chunk turns out + * (verified by libxfs_verify_ino()). If the inode chunk turns out * to be good, this routine will put the inode chunk into * the good inode chunk tree if required. * diff --git a/repair/dinode.c b/repair/dinode.c index 8af2cb25..0d9c96be 100644 --- a/repair/dinode.c +++ b/repair/dinode.c @@ -171,35 +171,6 @@ verify_ag_bno(xfs_sb_t *sbp, return 1; } -/* - * returns 0 if inode number is valid, 1 if bogus - */ -int -verify_inum(xfs_mount_t *mp, - xfs_ino_t ino) -{ - xfs_agnumber_t agno; - xfs_agino_t agino; - xfs_agblock_t agbno; - xfs_sb_t *sbp = &mp->m_sb;; - - /* range check ag #, ag block. range-checking offset is pointless */ - - agno = XFS_INO_TO_AGNO(mp, ino); - agino = XFS_INO_TO_AGINO(mp, ino); - agbno = XFS_AGINO_TO_AGBNO(mp, agino); - if (agbno == 0) - return 1; - - if (ino == 0 || ino == NULLFSINO) - return(1); - - if (ino != XFS_AGINO_TO_INO(mp, agno, agino)) - return(1); - - return verify_ag_bno(sbp, agno, agbno); -} - /* * have a separate routine to ensure that we don't accidentally * lose illegally set bits in the agino by turning it into an FSINO diff --git a/repair/dinode.h b/repair/dinode.h index aa177465..98238357 100644 --- a/repair/dinode.h +++ b/repair/dinode.h @@ -77,10 +77,6 @@ verify_uncertain_dinode(xfs_mount_t *mp, xfs_agnumber_t agno, xfs_agino_t ino); -int -verify_inum(xfs_mount_t *mp, - xfs_ino_t ino); - int verify_aginum(xfs_mount_t *mp, xfs_agnumber_t agno, diff --git a/repair/dir2.c b/repair/dir2.c index e43a9732..723aee1f 100644 --- a/repair/dir2.c +++ b/repair/dir2.c @@ -215,7 +215,7 @@ process_sf_dir2( if (lino == ino) { junkit = 1; junkreason = _("current"); - } else if (verify_inum(mp, lino)) { + } else if (!libxfs_verify_dir_ino(mp, lino)) { junkit = 1; junkreason = _("invalid"); } else if (lino == mp->m_sb.sb_rbmino) { @@ -486,8 +486,7 @@ _("corrected entry offsets in directory %" PRIu64 "\n"), * If the validation fails for the root inode we fix it in * the next else case. */ - if (verify_inum(mp, *parent) && ino != mp->m_sb.sb_rootino) { - + if (!libxfs_verify_dir_ino(mp, *parent) && ino != mp->m_sb.sb_rootino) { do_warn( _("bogus .. inode number (%" PRIu64 ") in directory inode %" PRIu64 ", "), *parent, ino); @@ -674,7 +673,7 @@ process_dir2_data( * (or did it ourselves) during phase 3. */ clearino = 0; - } else if (verify_inum(mp, ent_ino)) { + } else if (!libxfs_verify_dir_ino(mp, ent_ino)) { /* * Bad inode number. Clear the inode number and the * entry will get removed later. We don't trash the diff --git a/repair/phase4.c b/repair/phase4.c index e1ba778f..8197db06 100644 --- a/repair/phase4.c +++ b/repair/phase4.c @@ -36,7 +36,7 @@ quotino_check(xfs_mount_t *mp) ino_tree_node_t *irec; if (mp->m_sb.sb_uquotino != NULLFSINO && mp->m_sb.sb_uquotino != 0) { - if (verify_inum(mp, mp->m_sb.sb_uquotino)) + if (!libxfs_verify_ino(mp, mp->m_sb.sb_uquotino)) irec = NULL; else irec = find_inode_rec(mp, @@ -52,7 +52,7 @@ quotino_check(xfs_mount_t *mp) } if (mp->m_sb.sb_gquotino != NULLFSINO && mp->m_sb.sb_gquotino != 0) { - if (verify_inum(mp, mp->m_sb.sb_gquotino)) + if (!libxfs_verify_ino(mp, mp->m_sb.sb_gquotino)) irec = NULL; else irec = find_inode_rec(mp, @@ -68,7 +68,7 @@ quotino_check(xfs_mount_t *mp) } if (mp->m_sb.sb_pquotino != NULLFSINO && mp->m_sb.sb_pquotino != 0) { - if (verify_inum(mp, mp->m_sb.sb_pquotino)) + if (!libxfs_verify_ino(mp, mp->m_sb.sb_pquotino)) irec = NULL; else irec = find_inode_rec(mp, @@ -112,9 +112,9 @@ quota_sb_check(xfs_mount_t *mp) (mp->m_sb.sb_pquotino == NULLFSINO || mp->m_sb.sb_pquotino == 0)) { lost_quotas = 1; fs_quotas = 0; - } else if (!verify_inum(mp, mp->m_sb.sb_uquotino) && - !verify_inum(mp, mp->m_sb.sb_gquotino) && - !verify_inum(mp, mp->m_sb.sb_pquotino)) { + } else if (libxfs_verify_ino(mp, mp->m_sb.sb_uquotino) && + libxfs_verify_ino(mp, mp->m_sb.sb_gquotino) && + libxfs_verify_ino(mp, mp->m_sb.sb_pquotino)) { fs_quotas = 1; } } diff --git a/repair/phase6.c b/repair/phase6.c index 0874b649..70135694 100644 --- a/repair/phase6.c +++ b/repair/phase6.c @@ -1814,7 +1814,7 @@ longform_dir2_entry_check_data( } continue; } - ASSERT(no_modify || !verify_inum(mp, inum)); + ASSERT(no_modify || libxfs_verify_dir_ino(mp, inum)); /* * special case the . entry. we know there's only one * '.' and only '.' points to itself because bogus entries @@ -1845,7 +1845,7 @@ longform_dir2_entry_check_data( /* * skip entries with bogus inumbers if we're in no modify mode */ - if (no_modify && verify_inum(mp, inum)) + if (no_modify && !libxfs_verify_dir_ino(mp, inum)) continue; /* validate ftype field if supported */ @@ -2634,14 +2634,14 @@ shortform_dir2_entry_check(xfs_mount_t *mp, fname[sfep->namelen] = '\0'; ASSERT(no_modify || (lino != NULLFSINO && lino != 0)); - ASSERT(no_modify || !verify_inum(mp, lino)); + ASSERT(no_modify || libxfs_verify_dir_ino(mp, lino)); /* * Also skip entries with bogus inode numbers if we're * in no modify mode. */ - if (no_modify && verify_inum(mp, lino)) { + if (no_modify && !libxfs_verify_dir_ino(mp, lino)) { next_sfep = libxfs_dir2_sf_nextentry(mp, sfp, sfep); continue; }
From: Darrick J. Wong <darrick.wong@oracle.com> Make sure the root inode gets created where repair thinks it should be created. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> --- libxfs/libxfs_api_defs.h | 1 + mkfs/xfs_mkfs.c | 39 +++++++++++++++++++++++++++++++++------ 2 files changed, 34 insertions(+), 6 deletions(-) diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h index 9daf2635..7c629c62 100644 --- a/libxfs/libxfs_api_defs.h +++ b/libxfs/libxfs_api_defs.h @@ -98,6 +98,7 @@ #define xfs_fs_geometry libxfs_fs_geometry #define xfs_highbit32 libxfs_highbit32 #define xfs_highbit64 libxfs_highbit64 +#define xfs_ialloc_calc_rootino libxfs_ialloc_calc_rootino #define xfs_idata_realloc libxfs_idata_realloc #define xfs_idestroy_fork libxfs_idestroy_fork #define xfs_iext_lookup_extent libxfs_iext_lookup_extent diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c index 606f79da..cc71fd39 100644 --- a/mkfs/xfs_mkfs.c +++ b/mkfs/xfs_mkfs.c @@ -3549,6 +3549,38 @@ rewrite_secondary_superblocks( libxfs_writebuf(buf, LIBXFS_EXIT_ON_FAILURE); } +static void +check_root_ino( + struct xfs_mount *mp) +{ + xfs_ino_t ino; + + if (XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino) != 0) { + fprintf(stderr, + _("%s: root inode created in AG %u, not AG 0\n"), + progname, XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino)); + exit(1); + } + + /* + * The superblock points to the root directory inode, but xfs_repair + * expects to find the root inode in a very specific location computed + * from the filesystem geometry for an extra level of verification. + * + * Fail the format immediately if those assumptions ever break, because + * repair will toss the root directory. + */ + ino = libxfs_ialloc_calc_rootino(mp, mp->m_sb.sb_unit); + if (mp->m_sb.sb_rootino != ino) { + fprintf(stderr, + _("%s: root inode (%llu) not allocated in expected location (%llu)\n"), + progname, + (unsigned long long)mp->m_sb.sb_rootino, + (unsigned long long)ino); + exit(1); + } +} + int main( int argc, @@ -3835,12 +3867,7 @@ main( /* * Protect ourselves against possible stupidity */ - if (XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino) != 0) { - fprintf(stderr, - _("%s: root inode created in AG %u, not AG 0\n"), - progname, XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino)); - exit(1); - } + check_root_ino(mp); /* * Re-write multiple secondary superblocks with rootinode field set
From: Darrick J. Wong <darrick.wong@oracle.com> xfs_repair has a very old check that evidently excuses the AG 0 inode btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g. AG headers). mkfs never formats filesystems that way and it looks like an error, so purge the check. After this, we always complain if inodes overlap with AG headers because that should never happen. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- repair/globals.c | 1 - repair/globals.h | 1 - repair/scan.c | 19 ------------------- repair/xfs_repair.c | 7 ------- 4 files changed, 28 deletions(-) diff --git a/repair/globals.c b/repair/globals.c index dcd79ea4..8a60e706 100644 --- a/repair/globals.c +++ b/repair/globals.c @@ -73,7 +73,6 @@ int lost_gquotino; int lost_pquotino; xfs_agino_t first_prealloc_ino; -xfs_agino_t last_prealloc_ino; xfs_agblock_t bnobt_root; xfs_agblock_t bcntbt_root; xfs_agblock_t inobt_root; diff --git a/repair/globals.h b/repair/globals.h index 008bdd90..2ed5c894 100644 --- a/repair/globals.h +++ b/repair/globals.h @@ -114,7 +114,6 @@ extern int lost_gquotino; extern int lost_pquotino; extern xfs_agino_t first_prealloc_ino; -extern xfs_agino_t last_prealloc_ino; extern xfs_agblock_t bnobt_root; extern xfs_agblock_t bcntbt_root; extern xfs_agblock_t inobt_root; diff --git a/repair/scan.c b/repair/scan.c index c383f3aa..05707dd2 100644 --- a/repair/scan.c +++ b/repair/scan.c @@ -1645,13 +1645,6 @@ scan_single_ino_chunk( break; case XR_E_INUSE_FS: case XR_E_INUSE_FS1: - if (agno == 0 && - ino + j >= first_prealloc_ino && - ino + j < last_prealloc_ino) { - set_bmap(agno, agbno, XR_E_INO); - break; - } - /* fall through */ default: /* XXX - maybe should mark block a duplicate */ do_warn( @@ -1782,18 +1775,6 @@ _("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\ break; case XR_E_INUSE_FS: case XR_E_INUSE_FS1: - if (agno == 0 && - ino + j >= first_prealloc_ino && - ino + j < last_prealloc_ino) { - do_warn( -_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"), - agno, agbno, mp->m_sb.sb_inopblock); - - set_bmap(agno, agbno, XR_E_INO); - suspect++; - break; - } - /* fall through */ default: do_warn( _("inode chunk claims used block, finobt block - agno %d, bno %d, inopb %d\n"), diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c index 9295673d..3e9059f3 100644 --- a/repair/xfs_repair.c +++ b/repair/xfs_repair.c @@ -460,13 +460,6 @@ calc_mkfs(xfs_mount_t *mp) first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno); } - ASSERT(M_IGEO(mp)->ialloc_blks > 0); - - if (M_IGEO(mp)->ialloc_blks > 1) - last_prealloc_ino = first_prealloc_ino + XFS_INODES_PER_CHUNK; - else - last_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno + 1); - /* * now the first 3 inodes in the system */
From: Darrick J. Wong <darrick.wong@oracle.com> Refactor the checking and resetting of fixed-location inodes (root, rbmino, rsumino) into a helper function. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> --- repair/xfs_repair.c | 106 ++++++++++++++++++--------------------------------- 1 file changed, 37 insertions(+), 69 deletions(-) diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c index 3e9059f3..f8005f8a 100644 --- a/repair/xfs_repair.c +++ b/repair/xfs_repair.c @@ -395,6 +395,37 @@ do_log(char const *msg, ...) va_end(args); } +/* Make sure a fixed-location inode is where it should be. */ +static void +validate_sb_ino( + xfs_ino_t *ino, + xfs_ino_t expected_ino, + const char *tag) +{ + if (*ino == expected_ino) + return; + + do_warn( +_("sb %s inode value %" PRIu64 " %sinconsistent with calculated value %"PRIu64"\n"), + tag, *ino, *ino == NULLFSINO ? "(NULLFSINO) " : "", + expected_ino); + + if (!no_modify) + do_warn( +_("resetting superblock %s inode pointer to %"PRIu64"\n"), + tag, expected_ino); + else + do_warn( +_("would reset superblock %s inode pointer to %"PRIu64"\n"), + tag, expected_ino); + + /* + * Just set the value -- safe since the superblock doesn't get flushed + * out if no_modify is set. + */ + *ino = expected_ino; +} + static void calc_mkfs(xfs_mount_t *mp) { @@ -463,75 +494,12 @@ calc_mkfs(xfs_mount_t *mp) /* * now the first 3 inodes in the system */ - if (mp->m_sb.sb_rootino != first_prealloc_ino) { - do_warn( -_("sb root inode value %" PRIu64 " %sinconsistent with calculated value %u\n"), - mp->m_sb.sb_rootino, - (mp->m_sb.sb_rootino == NULLFSINO ? "(NULLFSINO) ":""), - first_prealloc_ino); - - if (!no_modify) - do_warn( - _("resetting superblock root inode pointer to %u\n"), - first_prealloc_ino); - else - do_warn( - _("would reset superblock root inode pointer to %u\n"), - first_prealloc_ino); - - /* - * just set the value -- safe since the superblock - * doesn't get flushed out if no_modify is set - */ - mp->m_sb.sb_rootino = first_prealloc_ino; - } - - if (mp->m_sb.sb_rbmino != first_prealloc_ino + 1) { - do_warn( -_("sb realtime bitmap inode %" PRIu64 " %sinconsistent with calculated value %u\n"), - mp->m_sb.sb_rbmino, - (mp->m_sb.sb_rbmino == NULLFSINO ? "(NULLFSINO) ":""), - first_prealloc_ino + 1); - - if (!no_modify) - do_warn( - _("resetting superblock realtime bitmap ino pointer to %u\n"), - first_prealloc_ino + 1); - else - do_warn( - _("would reset superblock realtime bitmap ino pointer to %u\n"), - first_prealloc_ino + 1); - - /* - * just set the value -- safe since the superblock - * doesn't get flushed out if no_modify is set - */ - mp->m_sb.sb_rbmino = first_prealloc_ino + 1; - } - - if (mp->m_sb.sb_rsumino != first_prealloc_ino + 2) { - do_warn( -_("sb realtime summary inode %" PRIu64 " %sinconsistent with calculated value %u\n"), - mp->m_sb.sb_rsumino, - (mp->m_sb.sb_rsumino == NULLFSINO ? "(NULLFSINO) ":""), - first_prealloc_ino + 2); - - if (!no_modify) - do_warn( - _("resetting superblock realtime summary ino pointer to %u\n"), - first_prealloc_ino + 2); - else - do_warn( - _("would reset superblock realtime summary ino pointer to %u\n"), - first_prealloc_ino + 2); - - /* - * just set the value -- safe since the superblock - * doesn't get flushed out if no_modify is set - */ - mp->m_sb.sb_rsumino = first_prealloc_ino + 2; - } - + validate_sb_ino(&mp->m_sb.sb_rootino, first_prealloc_ino, + _("root")); + validate_sb_ino(&mp->m_sb.sb_rbmino, first_prealloc_ino + 1, + _("realtime bitmap")); + validate_sb_ino(&mp->m_sb.sb_rsumino, first_prealloc_ino + 2, + _("realtime summary")); } /*
From: Darrick J. Wong <darrick.wong@oracle.com> Use libxfs_ialloc_calc_rootino to compute the location of the root inode, and improve the function comments while we're at it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> --- repair/globals.c | 5 --- repair/globals.h | 5 --- repair/xfs_repair.c | 78 +++++++-------------------------------------------- 3 files changed, 11 insertions(+), 77 deletions(-) diff --git a/repair/globals.c b/repair/globals.c index 8a60e706..299bacd1 100644 --- a/repair/globals.c +++ b/repair/globals.c @@ -72,11 +72,6 @@ int lost_uquotino; int lost_gquotino; int lost_pquotino; -xfs_agino_t first_prealloc_ino; -xfs_agblock_t bnobt_root; -xfs_agblock_t bcntbt_root; -xfs_agblock_t inobt_root; - /* configuration vars -- fs geometry dependent */ int inodes_per_block; diff --git a/repair/globals.h b/repair/globals.h index 2ed5c894..953e3dfb 100644 --- a/repair/globals.h +++ b/repair/globals.h @@ -113,11 +113,6 @@ extern int lost_uquotino; extern int lost_gquotino; extern int lost_pquotino; -extern xfs_agino_t first_prealloc_ino; -extern xfs_agblock_t bnobt_root; -extern xfs_agblock_t bcntbt_root; -extern xfs_agblock_t inobt_root; - /* configuration vars -- fs geometry dependent */ extern int inodes_per_block; diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c index f8005f8a..111306fe 100644 --- a/repair/xfs_repair.c +++ b/repair/xfs_repair.c @@ -426,79 +426,23 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"), *ino = expected_ino; } +/* + * Make sure that the first 3 inodes in the filesystem are the root directory, + * the realtime bitmap, and the realtime summary, in that order. + */ static void -calc_mkfs(xfs_mount_t *mp) +calc_mkfs( + struct xfs_mount *mp) { - xfs_agblock_t fino_bno; - int do_inoalign; - - do_inoalign = M_IGEO(mp)->ialloc_align; - - /* - * Pre-calculate the geometry of ag 0. We know what it looks like - * because we know what mkfs does: 2 allocation btree roots (by block - * and by size), the inode allocation btree root, the free inode - * allocation btree root (if enabled) and some number of blocks to - * prefill the agfl. - * - * Because the current shape of the btrees may differ from the current - * shape, we open code the mkfs freelist block count here. mkfs creates - * single level trees, so the calculation is pertty straight forward for - * the trees that use the AGFL. - */ - bnobt_root = howmany(4 * mp->m_sb.sb_sectsize, mp->m_sb.sb_blocksize); - bcntbt_root = bnobt_root + 1; - inobt_root = bnobt_root + 2; - fino_bno = inobt_root + (2 * min(2, mp->m_ag_maxlevels)) + 1; - if (xfs_sb_version_hasfinobt(&mp->m_sb)) - fino_bno++; - if (xfs_sb_version_hasrmapbt(&mp->m_sb)) { - fino_bno += min(2, mp->m_rmap_maxlevels); /* agfl blocks */ - fino_bno++; - } - if (xfs_sb_version_hasreflink(&mp->m_sb)) - fino_bno++; + xfs_ino_t rootino; - /* - * If the log is allocated in the first allocation group we need to - * add the number of blocks used by the log to the above calculation. - * - * This can happens with filesystems that only have a single - * allocation group, or very odd geometries created by old mkfs - * versions on very small filesystems. - */ - if (mp->m_sb.sb_logstart && - XFS_FSB_TO_AGNO(mp, mp->m_sb.sb_logstart) == 0) { + rootino = libxfs_ialloc_calc_rootino(mp, mp->m_sb.sb_unit); - /* - * XXX(hch): verify that sb_logstart makes sense? - */ - fino_bno += mp->m_sb.sb_logblocks; - } - - /* - * ditto the location of the first inode chunks in the fs ('/') - */ - if (xfs_sb_version_hasdalign(&mp->m_sb) && do_inoalign) { - first_prealloc_ino = XFS_AGB_TO_AGINO(mp, roundup(fino_bno, - mp->m_sb.sb_unit)); - } else if (xfs_sb_version_hasalign(&mp->m_sb) && - mp->m_sb.sb_inoalignmt > 1) { - first_prealloc_ino = XFS_AGB_TO_AGINO(mp, - roundup(fino_bno, - mp->m_sb.sb_inoalignmt)); - } else { - first_prealloc_ino = XFS_AGB_TO_AGINO(mp, fino_bno); - } - - /* - * now the first 3 inodes in the system - */ - validate_sb_ino(&mp->m_sb.sb_rootino, first_prealloc_ino, + validate_sb_ino(&mp->m_sb.sb_rootino, rootino, _("root")); - validate_sb_ino(&mp->m_sb.sb_rbmino, first_prealloc_ino + 1, + validate_sb_ino(&mp->m_sb.sb_rbmino, rootino + 1, _("realtime bitmap")); - validate_sb_ino(&mp->m_sb.sb_rsumino, first_prealloc_ino + 2, + validate_sb_ino(&mp->m_sb.sb_rsumino, rootino + 2, _("realtime summary")); }
From: Darrick J. Wong <darrick.wong@oracle.com> If sb_rootino doesn't point to where we think mkfs should have allocated the root directory, check to see if the alleged root directory actually looks like a root directory. If so, we'll let it live because someone could have changed sunit since formatting time, and that changes the root directory inode estimate. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- repair/xfs_repair.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c index 111306fe..b34d41d4 100644 --- a/repair/xfs_repair.c +++ b/repair/xfs_repair.c @@ -426,6 +426,37 @@ _("would reset superblock %s inode pointer to %"PRIu64"\n"), *ino = expected_ino; } +/* Does the root directory inode look like a plausible root directory? */ +static bool +has_plausible_rootdir( + struct xfs_mount *mp) +{ + struct xfs_inode *ip; + xfs_ino_t ino; + int error; + bool ret = false; + + error = -libxfs_iget(mp, NULL, mp->m_sb.sb_rootino, 0, &ip, + &xfs_default_ifork_ops); + if (error) + goto out; + if (!S_ISDIR(VFS_I(ip)->i_mode)) + goto out_rele; + + error = -libxfs_dir_lookup(NULL, ip, &xfs_name_dotdot, &ino, NULL); + if (error) + goto out_rele; + + /* The root directory '..' entry points to the directory. */ + if (ino == mp->m_sb.sb_rootino) + ret = true; + +out_rele: + libxfs_irele(ip); +out: + return ret; +} + /* * Make sure that the first 3 inodes in the filesystem are the root directory, * the realtime bitmap, and the realtime summary, in that order. @@ -438,6 +469,20 @@ calc_mkfs( rootino = libxfs_ialloc_calc_rootino(mp, mp->m_sb.sb_unit); + /* + * If the root inode isn't where we think it is, check its plausibility + * as a root directory. It's possible that somebody changed sunit + * since the filesystem was created, which can change the value of the + * above computation. Don't blow up the root directory if this is the + * case. + */ + if (mp->m_sb.sb_rootino != rootino && has_plausible_rootdir(mp)) { + do_warn( +_("sb root inode value %" PRIu64 " valid but in unaligned location (expected %"PRIu64") possibly due to sunit change\n"), + mp->m_sb.sb_rootino, rootino); + rootino = mp->m_sb.sb_rootino; + } + validate_sb_ino(&mp->m_sb.sb_rootino, rootino, _("root")); validate_sb_ino(&mp->m_sb.sb_rbmino, rootino + 1,
From: Darrick J. Wong <darrick.wong@oracle.com> If the primary superblock's sb_unit leads to a rootino calculation that doesn't match sb_rootino /but/ we can find a secondary superblock whose sb_unit does match, fix the primary. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- libxfs/libxfs_api_defs.h | 1 + repair/xfs_repair.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 80 insertions(+) diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h index 7c629c62..1a438e58 100644 --- a/libxfs/libxfs_api_defs.h +++ b/libxfs/libxfs_api_defs.h @@ -143,6 +143,7 @@ #define xfs_rtfree_extent libxfs_rtfree_extent #define xfs_sb_from_disk libxfs_sb_from_disk #define xfs_sb_quota_from_disk libxfs_sb_quota_from_disk +#define xfs_sb_read_secondary libxfs_sb_read_secondary #define xfs_sb_to_disk libxfs_sb_to_disk #define xfs_symlink_blocks libxfs_symlink_blocks #define xfs_symlink_hdr_ok libxfs_symlink_hdr_ok diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c index b34d41d4..eb1ce546 100644 --- a/repair/xfs_repair.c +++ b/repair/xfs_repair.c @@ -457,6 +457,84 @@ has_plausible_rootdir( return ret; } +/* + * If any of the secondary SBs contain a *correct* value for sunit, write that + * back to the primary superblock. + */ +static void +guess_correct_sunit( + struct xfs_mount *mp) +{ + struct xfs_sb sb; + struct xfs_buf *bp; + xfs_ino_t calc_rootino = NULLFSINO; + xfs_agnumber_t agno; + unsigned int new_sunit; + unsigned int sunit_guess; + int error; + + /* Try reading secondary supers to see if we find a good sb_unit. */ + for (agno = 1; agno < mp->m_sb.sb_agcount; agno++) { + error = -libxfs_sb_read_secondary(mp, NULL, agno, &bp); + if (error) + continue; + libxfs_sb_from_disk(&sb, XFS_BUF_TO_SBP(bp)); + libxfs_putbuf(bp); + + calc_rootino = libxfs_ialloc_calc_rootino(mp, sb.sb_unit); + if (calc_rootino == mp->m_sb.sb_rootino) + break; + } + + /* If we found a reasonable value, log where we found it. */ + if (calc_rootino == mp->m_sb.sb_rootino) { + do_warn(_("AG %u superblock contains plausible sb_unit value\n"), + agno); + new_sunit = sb.sb_unit; + goto fix; + } + + /* Try successive powers of two. */ + for (sunit_guess = 1; + sunit_guess <= XFS_AG_MAX_BLOCKS(mp->m_sb.sb_blocklog); + sunit_guess *= 2) { + calc_rootino = libxfs_ialloc_calc_rootino(mp, sunit_guess); + if (calc_rootino == mp->m_sb.sb_rootino) + break; + } + + /* If we found a reasonable value, log where we found it. */ + if (calc_rootino == mp->m_sb.sb_rootino) { + do_warn(_("Found an sb_unit value that looks plausible\n")); + new_sunit = sunit_guess; + goto fix; + } + + do_warn(_("Could not estimate a plausible sb_unit value\n")); + return; + +fix: + if (!no_modify) + do_warn(_("Resetting sb_unit to %u\n"), new_sunit); + else + do_warn(_("Would reset sb_unit to %u\n"), new_sunit); + + /* + * Just set the value -- safe since the superblock doesn't get flushed + * out if no_modify is set. + */ + mp->m_sb.sb_unit = new_sunit; + + /* Make sure that swidth is still a multiple of sunit. */ + if (mp->m_sb.sb_width % mp->m_sb.sb_unit == 0) + return; + + if (!no_modify) + do_warn(_("Resetting sb_width to %u\n"), new_sunit); + else + do_warn(_("Would reset sb_width to %u\n"), new_sunit); +} + /* * Make sure that the first 3 inodes in the filesystem are the root directory, * the realtime bitmap, and the realtime summary, in that order. @@ -480,6 +558,7 @@ calc_mkfs( do_warn( _("sb root inode value %" PRIu64 " valid but in unaligned location (expected %"PRIu64") possibly due to sunit change\n"), mp->m_sb.sb_rootino, rootino); + guess_correct_sunit(mp); rootino = mp->m_sb.sb_rootino; }
On Tue, Feb 04, 2020 at 04:46:44PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > Repair uses the verify_inum function to validate inode numbers that it > finds in the superblock and in directories. libxfs now has validator > functions to cover that kind of thing, so remove verify_inum(). As a > side bonus, this means that we will flag directories that point to the > quota/realtime metadata inodes. > > This fixes a regression found by fuzzing u3.sfdir3.hdr.parent.i4 to > lastbit (aka making a directory's .. point to the user quota inode) in > xfs/384. Whoops, this was supposed to be in the previous series, not this one. --D > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > libxfs/libxfs_api_defs.h | 1 + > repair/dino_chunks.c | 2 +- > repair/dinode.c | 29 ----------------------------- > repair/dinode.h | 4 ---- > repair/dir2.c | 7 +++---- > repair/phase4.c | 12 ++++++------ > repair/phase6.c | 8 ++++---- > 7 files changed, 15 insertions(+), 48 deletions(-) > > > diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h > index 6e09685b..9daf2635 100644 > --- a/libxfs/libxfs_api_defs.h > +++ b/libxfs/libxfs_api_defs.h > @@ -176,6 +176,7 @@ > #define xfs_trans_roll libxfs_trans_roll > > #define xfs_verify_cksum libxfs_verify_cksum > +#define xfs_verify_dir_ino libxfs_verify_dir_ino > #define xfs_verify_ino libxfs_verify_ino > #define xfs_verify_rtbno libxfs_verify_rtbno > #define xfs_zero_extent libxfs_zero_extent > diff --git a/repair/dino_chunks.c b/repair/dino_chunks.c > index 00b67468..dbf3d37a 100644 > --- a/repair/dino_chunks.c > +++ b/repair/dino_chunks.c > @@ -65,7 +65,7 @@ check_aginode_block(xfs_mount_t *mp, > * inode chunk. returns number of new inodes if things are good > * and 0 if bad. start is the start of the discovered inode chunk. > * routine assumes that ino is a legal inode number > - * (verified by verify_inum()). If the inode chunk turns out > + * (verified by libxfs_verify_ino()). If the inode chunk turns out > * to be good, this routine will put the inode chunk into > * the good inode chunk tree if required. > * > diff --git a/repair/dinode.c b/repair/dinode.c > index 8af2cb25..0d9c96be 100644 > --- a/repair/dinode.c > +++ b/repair/dinode.c > @@ -171,35 +171,6 @@ verify_ag_bno(xfs_sb_t *sbp, > return 1; > } > > -/* > - * returns 0 if inode number is valid, 1 if bogus > - */ > -int > -verify_inum(xfs_mount_t *mp, > - xfs_ino_t ino) > -{ > - xfs_agnumber_t agno; > - xfs_agino_t agino; > - xfs_agblock_t agbno; > - xfs_sb_t *sbp = &mp->m_sb;; > - > - /* range check ag #, ag block. range-checking offset is pointless */ > - > - agno = XFS_INO_TO_AGNO(mp, ino); > - agino = XFS_INO_TO_AGINO(mp, ino); > - agbno = XFS_AGINO_TO_AGBNO(mp, agino); > - if (agbno == 0) > - return 1; > - > - if (ino == 0 || ino == NULLFSINO) > - return(1); > - > - if (ino != XFS_AGINO_TO_INO(mp, agno, agino)) > - return(1); > - > - return verify_ag_bno(sbp, agno, agbno); > -} > - > /* > * have a separate routine to ensure that we don't accidentally > * lose illegally set bits in the agino by turning it into an FSINO > diff --git a/repair/dinode.h b/repair/dinode.h > index aa177465..98238357 100644 > --- a/repair/dinode.h > +++ b/repair/dinode.h > @@ -77,10 +77,6 @@ verify_uncertain_dinode(xfs_mount_t *mp, > xfs_agnumber_t agno, > xfs_agino_t ino); > > -int > -verify_inum(xfs_mount_t *mp, > - xfs_ino_t ino); > - > int > verify_aginum(xfs_mount_t *mp, > xfs_agnumber_t agno, > diff --git a/repair/dir2.c b/repair/dir2.c > index e43a9732..723aee1f 100644 > --- a/repair/dir2.c > +++ b/repair/dir2.c > @@ -215,7 +215,7 @@ process_sf_dir2( > if (lino == ino) { > junkit = 1; > junkreason = _("current"); > - } else if (verify_inum(mp, lino)) { > + } else if (!libxfs_verify_dir_ino(mp, lino)) { > junkit = 1; > junkreason = _("invalid"); > } else if (lino == mp->m_sb.sb_rbmino) { > @@ -486,8 +486,7 @@ _("corrected entry offsets in directory %" PRIu64 "\n"), > * If the validation fails for the root inode we fix it in > * the next else case. > */ > - if (verify_inum(mp, *parent) && ino != mp->m_sb.sb_rootino) { > - > + if (!libxfs_verify_dir_ino(mp, *parent) && ino != mp->m_sb.sb_rootino) { > do_warn( > _("bogus .. inode number (%" PRIu64 ") in directory inode %" PRIu64 ", "), > *parent, ino); > @@ -674,7 +673,7 @@ process_dir2_data( > * (or did it ourselves) during phase 3. > */ > clearino = 0; > - } else if (verify_inum(mp, ent_ino)) { > + } else if (!libxfs_verify_dir_ino(mp, ent_ino)) { > /* > * Bad inode number. Clear the inode number and the > * entry will get removed later. We don't trash the > diff --git a/repair/phase4.c b/repair/phase4.c > index e1ba778f..8197db06 100644 > --- a/repair/phase4.c > +++ b/repair/phase4.c > @@ -36,7 +36,7 @@ quotino_check(xfs_mount_t *mp) > ino_tree_node_t *irec; > > if (mp->m_sb.sb_uquotino != NULLFSINO && mp->m_sb.sb_uquotino != 0) { > - if (verify_inum(mp, mp->m_sb.sb_uquotino)) > + if (!libxfs_verify_ino(mp, mp->m_sb.sb_uquotino)) > irec = NULL; > else > irec = find_inode_rec(mp, > @@ -52,7 +52,7 @@ quotino_check(xfs_mount_t *mp) > } > > if (mp->m_sb.sb_gquotino != NULLFSINO && mp->m_sb.sb_gquotino != 0) { > - if (verify_inum(mp, mp->m_sb.sb_gquotino)) > + if (!libxfs_verify_ino(mp, mp->m_sb.sb_gquotino)) > irec = NULL; > else > irec = find_inode_rec(mp, > @@ -68,7 +68,7 @@ quotino_check(xfs_mount_t *mp) > } > > if (mp->m_sb.sb_pquotino != NULLFSINO && mp->m_sb.sb_pquotino != 0) { > - if (verify_inum(mp, mp->m_sb.sb_pquotino)) > + if (!libxfs_verify_ino(mp, mp->m_sb.sb_pquotino)) > irec = NULL; > else > irec = find_inode_rec(mp, > @@ -112,9 +112,9 @@ quota_sb_check(xfs_mount_t *mp) > (mp->m_sb.sb_pquotino == NULLFSINO || mp->m_sb.sb_pquotino == 0)) { > lost_quotas = 1; > fs_quotas = 0; > - } else if (!verify_inum(mp, mp->m_sb.sb_uquotino) && > - !verify_inum(mp, mp->m_sb.sb_gquotino) && > - !verify_inum(mp, mp->m_sb.sb_pquotino)) { > + } else if (libxfs_verify_ino(mp, mp->m_sb.sb_uquotino) && > + libxfs_verify_ino(mp, mp->m_sb.sb_gquotino) && > + libxfs_verify_ino(mp, mp->m_sb.sb_pquotino)) { > fs_quotas = 1; > } > } > diff --git a/repair/phase6.c b/repair/phase6.c > index 0874b649..70135694 100644 > --- a/repair/phase6.c > +++ b/repair/phase6.c > @@ -1814,7 +1814,7 @@ longform_dir2_entry_check_data( > } > continue; > } > - ASSERT(no_modify || !verify_inum(mp, inum)); > + ASSERT(no_modify || libxfs_verify_dir_ino(mp, inum)); > /* > * special case the . entry. we know there's only one > * '.' and only '.' points to itself because bogus entries > @@ -1845,7 +1845,7 @@ longform_dir2_entry_check_data( > /* > * skip entries with bogus inumbers if we're in no modify mode > */ > - if (no_modify && verify_inum(mp, inum)) > + if (no_modify && !libxfs_verify_dir_ino(mp, inum)) > continue; > > /* validate ftype field if supported */ > @@ -2634,14 +2634,14 @@ shortform_dir2_entry_check(xfs_mount_t *mp, > fname[sfep->namelen] = '\0'; > > ASSERT(no_modify || (lino != NULLFSINO && lino != 0)); > - ASSERT(no_modify || !verify_inum(mp, lino)); > + ASSERT(no_modify || libxfs_verify_dir_ino(mp, lino)); > > /* > * Also skip entries with bogus inode numbers if we're > * in no modify mode. > */ > > - if (no_modify && verify_inum(mp, lino)) { > + if (no_modify && !libxfs_verify_dir_ino(mp, lino)) { > next_sfep = libxfs_dir2_sf_nextentry(mp, sfp, sfep); > continue; > } >
On Tue, Feb 04, 2020 at 04:46:44PM -0800, Darrick J. Wong wrote:
> This fixes a regression found by fuzzing u3.sfdir3.hdr.parent.i4 to
> lastbit (aka making a directory's .. point to the user quota inode) in
> xfs/384.
Is that a bug or a regression? If the latter, what commit caused the
regression?
Otherwise looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Feb 04, 2020 at 04:46:50PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Make sure the root inode gets created where repair thinks it should be
> created.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Feb 04, 2020 at 04:46:56PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> xfs_repair has a very old check that evidently excuses the AG 0 inode
> btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g.
> AG headers). mkfs never formats filesystems that way and it looks like
> an error, so purge the check. After this, we always complain if inodes
> overlap with AG headers because that should never happen.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Feb 04, 2020 at 04:47:02PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Refactor the checking and resetting of fixed-location inodes (root,
> rbmino, rsumino) into a helper function.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Feb 04, 2020 at 04:47:09PM -0800, Darrick J. Wong wrote:
> + rootino = libxfs_ialloc_calc_rootino(mp, mp->m_sb.sb_unit);
>
> + validate_sb_ino(&mp->m_sb.sb_rootino, rootino,
> _("root"));
> + validate_sb_ino(&mp->m_sb.sb_rbmino, rootino + 1,
> _("realtime bitmap"));
> + validate_sb_ino(&mp->m_sb.sb_rsumino, rootino + 2,
I think the first two calls don't need to be split over two lines.
Otherwise looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Feb 04, 2020 at 04:47:15PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> If sb_rootino doesn't point to where we think mkfs should have allocated
> the root directory, check to see if the alleged root directory actually
> looks like a root directory. If so, we'll let it live because someone
> could have changed sunit since formatting time, and that changes the
> root directory inode estimate.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Feb 04, 2020 at 04:47:21PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> If the primary superblock's sb_unit leads to a rootino calculation that
> doesn't match sb_rootino /but/ we can find a secondary superblock whose
> sb_unit does match, fix the primary.
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Mon, Feb 17, 2020 at 05:50:01AM -0800, Christoph Hellwig wrote: > On Tue, Feb 04, 2020 at 04:46:44PM -0800, Darrick J. Wong wrote: > > This fixes a regression found by fuzzing u3.sfdir3.hdr.parent.i4 to > > lastbit (aka making a directory's .. point to the user quota inode) in > > xfs/384. > > Is that a bug or a regression? If the latter, what commit caused the > regression? Eh, it's a bug found by a fuzzer fstest, so I guess this should be reworded somehwat: "This fixes a bug found by fuzzing..." --D > Otherwise looks good: > > Reviewed-by: Christoph Hellwig <hch@lst.de>
On 2/4/20 4:46 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Repair uses the verify_inum function to validate inode numbers that it
> finds in the superblock and in directories. libxfs now has validator
> functions to cover that kind of thing, so remove verify_inum(). As a
> side bonus, this means that we will flag directories that point to the
> quota/realtime metadata inodes.
>
> This fixes a regression found by fuzzing u3.sfdir3.hdr.parent.i4 to
> lastbit (aka making a directory's .. point to the user quota inode) in
> xfs/384.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
(I wanted to be sure 0 and NULLFSINO were still properly rejected; they are.)
I'll do my best to remember to edit the changelog, and
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
On 2/4/20 4:46 PM, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > xfs_repair has a very old check that evidently excuses the AG 0 inode > btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g. > AG headers). mkfs never formats filesystems that way and it looks like > an error, so purge the check. After this, we always complain if inodes > overlap with AG headers because that should never happen. > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> I know it's hard to keep track, but it'd be nice if > - ASSERT(M_IGEO(mp)->ialloc_blks > 0); this line had been kept per the feedback on the last patchset... This also lost my feedback the first time, re: @@ -1782,18 +1775,6 @@ _("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\ break; case XR_E_INUSE_FS: case XR_E_INUSE_FS1: "I guess there's no real reason to list a couple cases that all fall through to default:, I'd just remove them as well since they aren't any more special than the other unmentioned cases." - if (agno == 0 && - ino + j >= first_prealloc_ino && - ino + j < last_prealloc_ino) { - do_warn( -_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"), - agno, agbno, mp->m_sb.sb_inopblock); - - set_bmap(agno, agbno, XR_E_INO); - suspect++; - break; - } - /* fall through */ default: I guess I should stop saying "I'll do that on the way in" if 2 more versions are going to hit the list, maybe it takes the feedback off your radar. -Eric
On 2/4/20 4:46 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> xfs_repair has a very old check that evidently excuses the AG 0 inode
> btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g.
> AG headers). mkfs never formats filesystems that way and it looks like
> an error, so purge the check. After this, we always complain if inodes
> overlap with AG headers because that should never happen.
On a previous version, you and Brian had a fairly long conversation about
the warning this presents, and how it doesn't tell the user what to do
about it, and how the warning will persist, and may generate bug reports
or questions.
It sounded like you had a plan to address that, which does not seem to be
present in this patch? So I'm not sure Brian's concerns have been resolved
yet.
-Eric
On Wed, Feb 26, 2020 at 09:09:42AM -0800, Eric Sandeen wrote: > On 2/4/20 4:46 PM, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > xfs_repair has a very old check that evidently excuses the AG 0 inode > > btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g. > > AG headers). mkfs never formats filesystems that way and it looks like > > an error, so purge the check. After this, we always complain if inodes > > overlap with AG headers because that should never happen. > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > I know it's hard to keep track, but it'd be nice if > > > - ASSERT(M_IGEO(mp)->ialloc_blks > 0); > > this line had been kept per the feedback on the last patchset... I've added that back. For real this time. > This also lost my feedback the first time, re: > > @@ -1782,18 +1775,6 @@ _("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\ > break; > case XR_E_INUSE_FS: > case XR_E_INUSE_FS1: > > "I guess there's no real reason to list a couple cases that all fall through > to default:, I'd just remove them as well since they aren't any more special > than the other unmentioned cases." > > - if (agno == 0 && > - ino + j >= first_prealloc_ino && > - ino + j < last_prealloc_ino) { > - do_warn( > -_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"), > - agno, agbno, mp->m_sb.sb_inopblock); > - > - set_bmap(agno, agbno, XR_E_INO); > - suspect++; > - break; > - } > - /* fall through */ > default: > > I guess I should stop saying "I'll do that on the way in" if 2 more > versions are going to hit the list, maybe it takes the feedback off > your radar. I (almost) always make the changes to my local tree even if you say you'll do it on the way in, because that makes it easier to compare the for-next tree vs. my about-to-be-rebased dev tree. Unfortunately, I do occasionally slip up and forget to make the changes, even if I've sent email assenting to the changes, because there's not anything linking "I will make this change" in the email thread to actually scribbling in the git tree. Add to that the fact that email clients don't maintain spatial locality between v3->v4->v5 of a patchset and that just makes it more difficult to stay on top of reviews as a developer, because I can't even self-check without having to scroll through hundreds of emails. So yeah, I guess I'll go review my reviews... sorry for the crap. --D > -Eric
On Wed, Feb 26, 2020 at 09:19:53AM -0800, Eric Sandeen wrote: > On 2/4/20 4:46 PM, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > xfs_repair has a very old check that evidently excuses the AG 0 inode > > btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g. > > AG headers). mkfs never formats filesystems that way and it looks like > > an error, so purge the check. After this, we always complain if inodes > > overlap with AG headers because that should never happen. > > On a previous version, you and Brian had a fairly long conversation about > the warning this presents, and how it doesn't tell the user what to do > about it, and how the warning will persist, and may generate bug reports > or questions. > > It sounded like you had a plan to address that, which does not seem to be > present in this patch? So I'm not sure Brian's concerns have been resolved > yet. I'm confused about "the warning this presents" -- are you talking about this patch specifically, where we couldn't figure out the weird masking behavior that dated back to 2001 and the hysterical raisins? Or are you referring to Brian's criticism of earlier versions of this series that would whine about our root inode computation not leading to the root inode without actually telling the user what to do about it? If it's the second, then I the answer is that I added another patch ("xfs_repair: try to correct sb_unit value from secondaries") to try to recover a working sunit value from the backup superblocks, or try some power of two guesses to see if we find one that matches, and then reset the value to something that will make the computation work again. --D > > -Eric
On 2/26/20 9:24 AM, Darrick J. Wong wrote: > On Wed, Feb 26, 2020 at 09:09:42AM -0800, Eric Sandeen wrote: ... >> I guess I should stop saying "I'll do that on the way in" if 2 more >> versions are going to hit the list, maybe it takes the feedback off >> your radar. > > I (almost) always make the changes to my local tree even if you say > you'll do it on the way in, because that makes it easier to compare the > for-next tree vs. my about-to-be-rebased dev tree. > > Unfortunately, I do occasionally slip up and forget to make the changes, > even if I've sent email assenting to the changes, because there's not > anything linking "I will make this change" in the email thread to > actually scribbling in the git tree. > > Add to that the fact that email clients don't maintain spatial locality > between v3->v4->v5 of a patchset and that just makes it more difficult > to stay on top of reviews as a developer, because I can't even > self-check without having to scroll through hundreds of emails. > > So yeah, I guess I'll go review my reviews... sorry for the crap. No problem. We're all in this together. ;) Thanks, -Eric > --D
On 2/26/20 9:32 AM, Darrick J. Wong wrote: > On Wed, Feb 26, 2020 at 09:19:53AM -0800, Eric Sandeen wrote: >> On 2/4/20 4:46 PM, Darrick J. Wong wrote: >>> From: Darrick J. Wong <darrick.wong@oracle.com> >>> >>> xfs_repair has a very old check that evidently excuses the AG 0 inode >>> btrees pointing to blocks that are already marked XR_E_INUSE_FS* (e.g. >>> AG headers). mkfs never formats filesystems that way and it looks like >>> an error, so purge the check. After this, we always complain if inodes >>> overlap with AG headers because that should never happen. >> >> On a previous version, you and Brian had a fairly long conversation about >> the warning this presents, and how it doesn't tell the user what to do >> about it, and how the warning will persist, and may generate bug reports >> or questions. >> >> It sounded like you had a plan to address that, which does not seem to be >> present in this patch? So I'm not sure Brian's concerns have been resolved >> yet. > > I'm confused about "the warning this presents" -- are you talking about > this patch specifically, where we couldn't figure out the weird masking > behavior that dated back to 2001 and the hysterical raisins? nah I made my peace with that ;) > Or are you referring to Brian's criticism of earlier versions of this > series that would whine about our root inode computation not leading to > the root inode without actually telling the user what to do about it? Good grief I sent this reply on the wrong patch, the discussion was on "check plausibility of root dir pointer before trashing it" me-- > If it's the second, then I the answer is that I added another patch > ("xfs_repair: try to correct sb_unit value from secondaries") to try to > recover a working sunit value from the backup superblocks, or try some > power of two guesses to see if we find one that matches, and then reset > the value to something that will make the computation work again. Ah ok, got it. Let me go ask a question in reply to that. Sorry for the confusion. Thanks, -Eric
On 2/4/20 4:47 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> If the primary superblock's sb_unit leads to a rootino calculation that
> doesn't match sb_rootino /but/ we can find a secondary superblock whose
> sb_unit does match, fix the primary.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
With the previous patch issuing a warning
+_("sb root inode value %" PRIu64 " valid but in unaligned location (expected %"PRIu64") possibly due to sunit change\n"),
What does this do in the case where the user intentionally changed the
sunit, which is (I think) the situation launched this work in the first
place?
Will that warning persist in the case of an intentional sunit change?
Thanks,
-Eric
On Wed, Feb 26, 2020 at 09:42:01AM -0800, Eric Sandeen wrote: > On 2/4/20 4:47 PM, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > If the primary superblock's sb_unit leads to a rootino calculation that > > doesn't match sb_rootino /but/ we can find a secondary superblock whose > > sb_unit does match, fix the primary. > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > --- > > With the previous patch issuing a warning > > +_("sb root inode value %" PRIu64 " valid but in unaligned location (expected %"PRIu64") possibly due to sunit change\n"), > > What does this do in the case where the user intentionally changed the > sunit, which is (I think) the situation launched this work in the first > place? [echoing what we just discussed on irc] repair will print the above warning, and then it'll try to find either a backup sb with a sunit value that makes the computation work, or guess at sunit values until it finds one that makes the computation work. If it finds a reasonable sunit value it'll change it back to that. If the user mounts an old kernel with the same bad sunit value, the kernel will set it back to the bad value and we wash, rinse, and repeat. If the user mounts a new kernel with the same bad sunit value, it'll decline to change the sunit value. If the user runs the bad-sunit fs against an old xfs_repair it will descend into madness nuking the root directory, which is why we're trying to nudge ourselves away from the bad value. > Will that warning persist in the case of an intentional sunit change? Yes. --D > Thanks, > -Eric >