From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from userp2120.oracle.com ([156.151.31.85]:37062 "EHLO
	userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1729331AbeG3UVK (ORCPT );
	Mon, 30 Jul 2018 16:21:10 -0400
Date: Mon, 30 Jul 2018 11:44:45 -0700
From: "Darrick J. Wong"
Subject: Re: [PATCH 04/14] xfs: repair the AGI
Message-ID: <20180730184445.GA30972@magnolia>
References: <153292966714.24509.15809693393247424274.stgit@magnolia>
	<153292969532.24509.17576845400762793279.stgit@magnolia>
	<20180730182051.GE35107@bfoster>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180730182051.GE35107@bfoster>
Sender: linux-xfs-owner@vger.kernel.org
List-ID:
List-Id: xfs
To: Brian Foster
Cc: linux-xfs@vger.kernel.org, david@fromorbit.com, allison.henderson@oracle.com

On Mon, Jul 30, 2018 at 02:20:51PM -0400, Brian Foster wrote:
> On Sun, Jul 29, 2018 at 10:48:15PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong
> >
> > Rebuild the AGI header items with some help from the rmapbt.
> >
> > Signed-off-by: Darrick J. Wong
> > ---
>
> A couple nits and future thoughts..
>
> >  fs/xfs/scrub/agheader_repair.c | 220 ++++++++++++++++++++++++++++++++++++++++
> >  fs/xfs/scrub/repair.h | 2
> >  fs/xfs/scrub/scrub.c | 2
> >  3 files changed, 223 insertions(+), 1 deletion(-)
> >
> >
> > diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c
> > index bfef066c87c3..921e7d42a2ef 100644
> > --- a/fs/xfs/scrub/agheader_repair.c
> > +++ b/fs/xfs/scrub/agheader_repair.c
> > @@ -700,3 +700,223 @@ xrep_agfl(
> >  	xfs_bitmap_destroy(&agfl_extents);
> >  	return error;
> >  }
> ...
> > +STATIC int
> > +xrep_agi_find_btrees(
> > +	struct xfs_scrub *sc,
> > +	struct xrep_find_ag_btree *fab)
> > +{
> > +	struct xfs_buf *agf_bp;
> > +	struct xfs_mount *mp = sc->mp;
> > +	int error;
> > +
> > +	/* Read the AGF. */
> > +	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
> > +	if (error)
> > +		return error;
> > +	if (!agf_bp)
> > +		return -ENOMEM;
> > +
> > +	/* Find the btree roots. */
> > +	error = xrep_find_ag_btree_roots(sc, agf_bp, fab, NULL);
> > +	if (error)
> > +		return error;
> > +
> > +	/* We must find the inobt root. */
> > +	if (fab[XREP_AGI_INOBT].root == NULLAGBLOCK ||
> > +	    fab[XREP_AGI_INOBT].height > XFS_BTREE_MAXLEVELS)
> > +		return -EFSCORRUPTED;
> > +
> > +	/* We must find the finobt root if that feature is enabled. */
> > +	if (xfs_sb_version_hasfinobt(&mp->m_sb) &&
> > +	    (fab[XREP_AGI_FINOBT].root == NULLAGBLOCK ||
> > +	     fab[XREP_AGI_FINOBT].height > XFS_BTREE_MAXLEVELS))
> > +		return -EFSCORRUPTED;
>
> Skimming around some of the existing code to find the btree roots, I
> notice that the .root field is going to be != NULLAGBLOCK so long as we
> find at least one appropriately typed block in the rmapbt. I know you
> mentioned that we depend on a correct rmapbt atm, but I'm wondering if
> there's room for slightly more robust error checks to at least prevent
> us from doing something damaging. For example, perhaps we could unset
> .root when we've found a second block at the same level as the current
> "root," and/or check some of the generic characteristics of a root btree
> block (no left/right siblings) once we're done..?

Good suggestion, I'll add it to xrep_findroot_block.

> > +
> > +	return 0;
> > +}
> > +
> ...
> > +/* Trigger reinitialization of the in-core data. */
> > +STATIC int
> > +xrep_agi_commit_new(
> > +	struct xfs_scrub *sc,
> > +	struct xfs_buf *agi_bp,
> > +	const struct xfs_agi *old_agi)
>
> old_agi is unused here.
>
> > +{
> > +	struct xfs_perag *pag;
> > +	struct xfs_agi *agi = XFS_BUF_TO_AGI(agi_bp);
> > +
> > +	/* Trigger inode count recalculation */
> > +	xfs_force_summary_recalc(sc->mp);
> > +
> > +	/* Write this to disk. */
> > +	xfs_trans_buf_set_type(sc->tp, agi_bp, XFS_BLFT_AGI_BUF);
> > +	xfs_trans_log_buf(sc->tp, agi_bp, 0, BBTOB(agi_bp->b_length) - 1);
> > +
> > +	/* Now reinitialize the in-core counters if necessary. */
> > +	pag = sc->sa.pag;
> > +	sc->sa.pag->pagi_init = 1;
>
> Same s/sc->sa.pag/pag/ nit here as before.

Both fixed.

> > +	pag->pagi_count = be32_to_cpu(agi->agi_count);
> > +	pag->pagi_freecount = be32_to_cpu(agi->agi_freecount);
> > +
> > +	return 0;
> > +}
> > +
> > +/* Repair the AGI. */
> > +int
> > +xrep_agi(
> > +	struct xfs_scrub *sc)
> > +{
> > +	struct xrep_find_ag_btree fab[XREP_AGI_MAX] = {
> > +		[XREP_AGI_INOBT] = {
> > +			.rmap_owner = XFS_RMAP_OWN_INOBT,
> > +			.buf_ops = &xfs_inobt_buf_ops,
> > +			.magic = XFS_IBT_CRC_MAGIC,
> > +		},
> > +		[XREP_AGI_FINOBT] = {
> > +			.rmap_owner = XFS_RMAP_OWN_INOBT,
> > +			.buf_ops = &xfs_inobt_buf_ops,
> > +			.magic = XFS_FIBT_CRC_MAGIC,
> > +		},
> > +		[XREP_AGI_END] = {
> > +			.buf_ops = NULL
> > +		},
> > +	};
> > +	struct xfs_agi old_agi;
>
> It's not immediately clear to me how much of a danger this is here, if
> at all, but FWIW xfs_agi is one of our larger structures at 336 bytes
> (mostly due to agi_unlinked). I'm not terribly concerned if this isn't
> currently exploding, but it might be worth thinking about another
> technique to preserve original behavior without the stack usage. Perhaps
> we could use an uncached buffer to preserve the original data and
> implement an xfs_buf_copy() helper to facilitate, for example.

It's not a huge deal since (vmap) kernel stacks are 16K(!) these days on
64-bit machines, but if it ever becomes a problem we can simply allocate
some memory in sc->buf in the setup routine.

--D

> Brian
>
> > +	struct xfs_mount *mp = sc->mp;
> > +	struct xfs_buf *agi_bp;
> > +	struct xfs_agi *agi;
> > +	int error;
> > +
> > +	/* We require the rmapbt to rebuild anything. */
> > +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> > +		return -EOPNOTSUPP;
> > +
> > +	xchk_perag_get(sc->mp, &sc->sa);
> > +	/*
> > +	 * Make sure we have the AGI buffer, as scrub might have decided it
> > +	 * was corrupt after xfs_ialloc_read_agi failed with -EFSCORRUPTED.
> > +	 */
> > +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> > +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGI_DADDR(mp)),
> > +			XFS_FSS_TO_BB(mp, 1), 0, &agi_bp, NULL);
> > +	if (error)
> > +		return error;
> > +	agi_bp->b_ops = &xfs_agi_buf_ops;
> > +	agi = XFS_BUF_TO_AGI(agi_bp);
> > +
> > +	/* Find the AGI btree roots. */
> > +	error = xrep_agi_find_btrees(sc, fab);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Start rewriting the header and implant the btrees we found. */
> > +	xrep_agi_init_header(sc, agi_bp, &old_agi);
> > +	xrep_agi_set_roots(sc, agi, fab);
> > +	error = xrep_agi_calc_from_btrees(sc, agi_bp);
> > +	if (error)
> > +		goto out_revert;
> > +
> > +	/* Reinitialize in-core state. */
> > +	return xrep_agi_commit_new(sc, agi_bp, &old_agi);
> > +
> > +out_revert:
> > +	/* Mark the incore AGI state stale and revert the AGI. */
> > +	sc->sa.pag->pagi_init = 0;
> > +	memcpy(agi, &old_agi, sizeof(old_agi));
> > +	return error;
> > +}
> > diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
> > index 1d283360b5ab..9de321eee4ab 100644
> > --- a/fs/xfs/scrub/repair.h
> > +++ b/fs/xfs/scrub/repair.h
> > @@ -60,6 +60,7 @@ int xrep_probe(struct xfs_scrub *sc);
> >  int xrep_superblock(struct xfs_scrub *sc);
> >  int xrep_agf(struct xfs_scrub *sc);
> >  int xrep_agfl(struct xfs_scrub *sc);
> > +int xrep_agi(struct xfs_scrub *sc);
> >
> >  #else
> >
> > @@ -85,6 +86,7 @@ xrep_calc_ag_resblks(
> >  #define xrep_superblock xrep_notsupported
> >  #define xrep_agf xrep_notsupported
> >  #define xrep_agfl xrep_notsupported
> > +#define xrep_agi xrep_notsupported
> >
> >  #endif /* CONFIG_XFS_ONLINE_REPAIR */
> >
> > diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> > index 2670f4cf62f4..4bfae1e61d30 100644
> > --- a/fs/xfs/scrub/scrub.c
> > +++ b/fs/xfs/scrub/scrub.c
> > @@ -226,7 +226,7 @@ static const struct xchk_meta_ops meta_scrub_ops[] = {
> >  		.type = ST_PERAG,
> >  		.setup = xchk_setup_fs,
> >  		.scrub = xchk_agi,
> > -		.repair = xrep_notsupported,
> > +		.repair = xrep_agi,
> >  	},
> >  	[XFS_SCRUB_TYPE_BNOBT] = { /* bnobt */
> >  		.type = ST_PERAG,
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
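
For illustration only, the extra root checks discussed in this thread (reject
the candidate if a second block turns up at the same level as the current
"root", and require that a root has no left/right siblings) might look
roughly like the userspace sketch below. Everything in it is an assumption
made for the example: struct blk_info, NULL_BLOCK and pick_root() are
hypothetical stand-ins, not kernel interfaces, and this is not the code that
went into xrep_findroot_block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NULL_BLOCK	UINT32_MAX	/* hypothetical stand-in for NULLAGBLOCK */

/* Hypothetical summary of one candidate btree block found via the rmapbt. */
struct blk_info {
	uint32_t	bno;		/* block number within the AG */
	uint32_t	level;		/* btree level; leaves are level 0 */
	uint32_t	leftsib;	/* NULL_BLOCK if no left sibling */
	uint32_t	rightsib;	/* NULL_BLOCK if no right sibling */
};

/*
 * Return the block number of the root, or NULL_BLOCK if there is no
 * unambiguous root: no blocks at all, more than one block at the highest
 * level seen, or a highest-level block that claims to have siblings.
 */
static uint32_t
pick_root(
	const struct blk_info	*blks,
	size_t			nr)
{
	uint32_t		root = NULL_BLOCK;
	uint32_t		top = 0;	/* highest level seen so far */
	size_t			at_top = 0;	/* blocks seen at that level */
	int			has_sibs = 0;
	size_t			i;

	assert(blks != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (at_top == 0 || blks[i].level > top) {
			/* New highest level; this block is the candidate. */
			top = blks[i].level;
			root = blks[i].bno;
			at_top = 1;
			has_sibs = blks[i].leftsib != NULL_BLOCK ||
				   blks[i].rightsib != NULL_BLOCK;
		} else if (blks[i].level == top) {
			/* A second block at the candidate's level. */
			at_top++;
		}
	}

	if (at_top != 1 || has_sibs)
		return NULL_BLOCK;
	return root;
}
```

With a check of this shape, the "root == NULLAGBLOCK" tests in
xrep_agi_find_btrees would also catch the case where the rmapbt yields
several same-level candidates, rather than only the case where it yields
none.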