From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from ipmail03.adl2.internode.on.net ([150.101.137.141]:53187 "EHLO ipmail03.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751769AbeE2DfA (ORCPT ); Mon, 28 May 2018 23:35:00 -0400
Date: Tue, 29 May 2018 13:28:10 +1000
From: Dave Chinner
Subject: Re: [PATCH v2 06/22] xfs: add a repair helper to reset superblock counters
Message-ID: <20180529032810.GM10363@dastard>
References: <152642361893.1556.9335169821674946249.stgit@magnolia> <152642365674.1556.6776151224606075985.stgit@magnolia> <20180518035623.GD23858@magnolia>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180518035623.GD23858@magnolia>
Sender: linux-xfs-owner@vger.kernel.org
List-ID:
List-Id: xfs
To: "Darrick J. Wong"
Cc: linux-xfs@vger.kernel.org

On Thu, May 17, 2018 at 08:56:23PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong
>
> Add a helper function to reset the superblock inode and block counters.
> The AG rebuilding functions will need these to adjust the counts if they
> need to change as a part of recovering from corruption.
>
> Signed-off-by: Darrick J. Wong
> Reviewed-by: Allison Henderson
> ---
> v2: improve documentation
> ---
>  fs/xfs/scrub/repair.c | 89 +++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/scrub/repair.h | 7 ++++
>  fs/xfs/scrub/scrub.c | 2 +
>  fs/xfs/scrub/scrub.h | 1 +
>  4 files changed, 99 insertions(+)
>
> diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
> index 877488ce4bc8..4b95a15c0bd0 100644
> --- a/fs/xfs/scrub/repair.c
> +++ b/fs/xfs/scrub/repair.c
> @@ -1026,3 +1026,92 @@ xfs_repair_find_ag_btree_roots(
>
>  	return error;
>  }
> +
> +/*
> + * Reset the superblock counters.
> + *
> + * If a repair function changes the inode or free block counters, it must set
> + * reset_counters to push this function to reset the global counters. Repair
> + * functions are responsible for resetting all other in-core state. This
> + * function runs outside of transaction context after the repair context has
> + * been torn down, so if there's further filesystem corruption we'll error out
> + * to userspace and give userspace a chance to call back to fix the further
> + * errors.
> + */
> +int
> +xfs_repair_reset_counters(
> +	struct xfs_mount *mp)
> +{
> +	struct xfs_buf *agi_bp;
> +	struct xfs_buf *agf_bp;
> +	struct xfs_agi *agi;
> +	struct xfs_agf *agf;
> +	xfs_agnumber_t agno;
> +	xfs_ino_t icount = 0;
> +	xfs_ino_t ifree = 0;
> +	xfs_filblks_t fdblocks = 0;
> +	int64_t delta_icount;
> +	int64_t delta_ifree;
> +	int64_t delta_fdblocks;
> +	int error;
> +
> +	trace_xfs_repair_reset_counters(mp);
> +
> +	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
> +		/* Count all the inodes... */
> +		error = xfs_ialloc_read_agi(mp, NULL, agno, &agi_bp);
> +		if (error)
> +			return error;
> +		agi = XFS_BUF_TO_AGI(agi_bp);
> +		icount += be32_to_cpu(agi->agi_count);
> +		ifree += be32_to_cpu(agi->agi_freecount);
> +		xfs_buf_relse(agi_bp);
> +
> +		/* Add up the free/freelist/bnobt/cntbt blocks... */
> +		error = xfs_alloc_read_agf(mp, NULL, agno, 0, &agf_bp);
> +		if (error)
> +			return error;
> +		if (!agf_bp)
> +			return -ENOMEM;
> +		agf = XFS_BUF_TO_AGF(agf_bp);
> +		fdblocks += be32_to_cpu(agf->agf_freeblks);
> +		fdblocks += be32_to_cpu(agf->agf_flcount);
> +		fdblocks += be32_to_cpu(agf->agf_btreeblks);
> +		xfs_buf_relse(agf_bp);
> +	}
> +
> +	/*
> +	 * Reinitialize the counters. The on-disk and in-core counters differ
> +	 * by the number of inodes/blocks reserved by the admin, the per-AG
> +	 * reservation, and any transactions in progress, so we have to
> +	 * account for that. First we take the sb lock and update its
> +	 * counters...
> +	 */
> +	spin_lock(&mp->m_sb_lock);
> +	delta_icount = (int64_t)mp->m_sb.sb_icount - icount;
> +	delta_ifree = (int64_t)mp->m_sb.sb_ifree - ifree;
> +	delta_fdblocks = (int64_t)mp->m_sb.sb_fdblocks - fdblocks;
> +	mp->m_sb.sb_icount = icount;
> +	mp->m_sb.sb_ifree = ifree;
> +	mp->m_sb.sb_fdblocks = fdblocks;
> +	spin_unlock(&mp->m_sb_lock);

This seems racy to me? i.e. the per-ag counters can change while we
are summing them, and once we've summed them then the sb counters
can change while we are waiting for the m_sb_lock. It looks to me
like the summed per-ag counters are not in any way coherent with the
superblock or the in-core per-CPU counters, so I'm struggling to
understand why this is safe.

We can do this sort of summation at mount time (in
xfs_initialize_perag_data()) because the filesystem is running
single threaded while the summation is taking place and so nothing
is changing during the summation. The filesystem is active in this
case, so I don't think we can do the same thing here.

Also, it brought a question to mind because I haven't clearly noted
it happening yet: when do the xfs_perag counters get corrected? And
if they are already correct, why not just iterate the perag
counters?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
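
To make the race described above concrete, here is a minimal userspace
sketch. It is not kernel code: struct toy_ag, struct toy_sb and all of
the toy_* functions are made up for illustration and only mimic the
shape of the summation in the quoted patch. The concurrent allocation
is simulated deterministically by injecting it between the summation
and the overwrite, under the assumption that an allocator decrements
both the per-AG count and the global count.

```c
#include <stdint.h>

#define TOY_AGCOUNT 4

/* Hypothetical stand-ins for the per-AG headers and the superblock. */
struct toy_ag { uint64_t freeblks; };
struct toy_sb { uint64_t fdblocks; };

/* Simulates another thread allocating one block from AG 'agno'. */
static void toy_alloc_block(struct toy_ag *ags, struct toy_sb *sb, int agno)
{
	ags[agno].freeblks--;
	sb->fdblocks--;
}

/*
 * Sum the per-AG counters, then overwrite the global counter with the
 * sum. If racing_agno >= 0, one allocation from that AG is injected
 * after the summation but before the overwrite, i.e. in the window
 * where the summing thread would be waiting for the lock.
 */
static void toy_reset_counters(struct toy_ag *ags, struct toy_sb *sb,
			       int racing_agno)
{
	uint64_t sum = 0;
	int agno;

	for (agno = 0; agno < TOY_AGCOUNT; agno++)
		sum += ags[agno].freeblks;

	if (racing_agno >= 0)
		toy_alloc_block(ags, sb, racing_agno);

	/* The stale sum silently undoes the racing decrement. */
	sb->fdblocks = sum;
}

/* Global counter minus the true per-AG total; 0 means coherent. */
static int64_t toy_drift(const struct toy_ag *ags, const struct toy_sb *sb)
{
	uint64_t sum = 0;
	int agno;

	for (agno = 0; agno < TOY_AGCOUNT; agno++)
		sum += ags[agno].freeblks;
	return (int64_t)sb->fdblocks - (int64_t)sum;
}

/* Runs one scenario: 4 AGs with 100 free blocks each, then a reset. */
int64_t toy_run(int racing_agno)
{
	struct toy_ag ags[TOY_AGCOUNT] = { {100}, {100}, {100}, {100} };
	struct toy_sb sb = { 400 };

	toy_reset_counters(ags, &sb, racing_agno);
	return toy_drift(ags, &sb);
}
```

Calling toy_run(-1) models the quiescent, mount-time case and leaves
the global counter equal to the per-AG total. Calling toy_run(2) models
an active filesystem: the global counter ends up one block higher than
the per-AG total, because the allocation that landed in the window is
lost when the stale sum is written back.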