All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 06/21] xfs: repair free space btrees
Date: Tue, 3 Jul 2018 19:15:04 -0700	[thread overview]
Message-ID: <20180704021504.GA32415@magnolia> (raw)
In-Reply-To: <20180627032123.GF19934@dastard>

On Wed, Jun 27, 2018 at 01:21:23PM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:24:07PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Rebuild the free space btrees from the gaps in the rmap btree.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> ......
> > +
> > +/* Collect an AGFL block for the not-to-release list. */
> > +static int
> > +xfs_repair_collect_agfl_block(
> > +	struct xfs_mount		*mp,
> > +	xfs_agblock_t			bno,
> > +	void				*priv)

Whoah, I never replied to this.  Oops. :(

> /me now gets confused by agfl code (xfs_repair_agfl_...) collecting btree
> blocks, and now the btree code (xfs_repair_collect_agfl... )
> collecting agfl blocks.
> 
> The naming/namespace collisions is not that nice. I think this needs
> to be xr_allocbt_collect_agfl_blocks().

> /me idly wonders about consistently renaming everything abt, bnbt, cnbt,
> fibt, ibt, rmbt and rcbt...

Hmm, I'll think about a mass rename. :)

xfs_repair_refcountbt_fiddle_faddle(...);

xrep_rcbt_fiddle_faddle(...);

xrpr_rcbt_fiddle_faddle(...);

xrprrcbt_fiddle_faddle(...);

Yeah, maybe that third one.

> 
> > +/*
> > + * Iterate all reverse mappings to find (1) the free extents, (2) the OWN_AG
> > + * extents, (3) the rmapbt blocks, and (4) the AGFL blocks.  The free space is
> > + * (1) + (2) - (3) - (4).  Figure out if we have enough free space to
> > + * reconstruct the free space btrees.  Caller must clean up the input lists
> > + * if something goes wrong.
> > + */
> > +STATIC int
> > +xfs_repair_allocbt_find_freespace(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*free_extents,
> > +	struct xfs_repair_extent_list	*old_allocbt_blocks)
> > +{
> > +	struct xfs_repair_alloc		ra;
> > +	struct xfs_repair_alloc_extent	*rae;
> > +	struct xfs_btree_cur		*cur;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	xfs_agblock_t			agend;
> > +	xfs_agblock_t			nr_blocks;
> > +	int				error;
> > +
> > +	ra.extlist = free_extents;
> > +	ra.btlist = old_allocbt_blocks;
> > +	xfs_repair_init_extent_list(&ra.nobtlist);
> > +	ra.next_bno = 0;
> > +	ra.nr_records = 0;
> > +	ra.nr_blocks = 0;
> > +	ra.sc = sc;
> > +
> > +	/*
> > +	 * Iterate all the reverse mappings to find gaps in the physical
> > +	 * mappings, all the OWN_AG blocks, and all the rmapbt extents.
> > +	 */
> > +	cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
> > +	error = xfs_rmap_query_all(cur, xfs_repair_alloc_extent_fn, &ra);
> > +	if (error)
> > +		goto err;
> > +	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > +	cur = NULL;
> > +
> > +	/* Insert a record for space between the last rmap and EOAG. */
> > +	agend = be32_to_cpu(XFS_BUF_TO_AGF(sc->sa.agf_bp)->agf_length);
> > +	if (ra.next_bno < agend) {
> > +		rae = kmem_alloc(sizeof(struct xfs_repair_alloc_extent),
> > +				KM_MAYFAIL);
> > +		if (!rae) {
> > +			error = -ENOMEM;
> > +			goto err;
> > +		}
> > +		INIT_LIST_HEAD(&rae->list);
> > +		rae->bno = ra.next_bno;
> > +		rae->len = agend - ra.next_bno;
> > +		list_add_tail(&rae->list, free_extents);
> > +		ra.nr_records++;
> > +	}
> > +
> > +	/* Collect all the AGFL blocks. */
> > +	error = xfs_agfl_walk(mp, XFS_BUF_TO_AGF(sc->sa.agf_bp),
> > +			sc->sa.agfl_bp, xfs_repair_collect_agfl_block, &ra);
> > +	if (error)
> > +		goto err;
> > +
> > +	/* Do we actually have enough space to do this? */
> > +	nr_blocks = 2 * xfs_allocbt_calc_size(mp, ra.nr_records);
> 
> 	/* Do we have enough space to rebuild both freespace trees? */
> 
> (explains the multiplication by 2)

Yep, will fix.

> > +	if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE) ||
> > +	    ra.nr_blocks < nr_blocks) {
> > +		error = -ENOSPC;
> > +		goto err;
> > +	}
> > +
> > +	/* Compute the old bnobt/cntbt blocks. */
> > +	error = xfs_repair_subtract_extents(sc, old_allocbt_blocks,
> > +			&ra.nobtlist);
> > +	if (error)
> > +		goto err;
> > +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> > +	return 0;
> > +
> > +err:
> > +	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> > +	if (cur)
> > +		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> > +	return error;
> 
> Error stacking here can be cleaned up - we don't need an extra stack
> as the cursor is NULL when finished with. Hence it could just be:
> 
> 	/* Compute the old bnobt/cntbt blocks. */
> 	error = xfs_repair_subtract_extents(sc, old_allocbt_blocks,
> 			&ra.nobtlist);
> err:
> 	xfs_repair_cancel_btree_extents(sc, &ra.nobtlist);
> 	if (cur)
> 		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);

TBH I've been tempted for years to refactor this thing to take error
directly rather than require this XFS_BTREE_{NO,}ERROR business.
There's only two choices, and we nearly always decide using error == 0.

> 	return error;
> }
> 
> 
> > +}
> > +
> > +/*
> > + * Reset the global free block counter and the per-AG counters to make it look
> > + * like this AG has no free space.
> > + */
> > +STATIC int
> > +xfs_repair_allocbt_reset_counters(
> > +	struct xfs_scrub_context	*sc,
> > +	int				*log_flags)
> > +{
> > +	struct xfs_perag		*pag = sc->sa.pag;
> > +	struct xfs_agf			*agf;
> > +	xfs_extlen_t			oldf;
> > +	xfs_agblock_t			rmap_blocks;
> > +	int				error;
> > +
> > +	/*
> > +	 * Since we're abandoning the old bnobt/cntbt, we have to
> > +	 * decrease fdblocks by the # of blocks in those trees.
> > +	 * btreeblks counts the non-root blocks of the free space
> > +	 * and rmap btrees.  Do this before resetting the AGF counters.
> 
> Comment can use 80 columns.
> 
> > +	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
> > +	rmap_blocks = be32_to_cpu(agf->agf_rmap_blocks) - 1;
> > +	oldf = pag->pagf_btreeblks + 2;
> > +	oldf -= rmap_blocks;
> 
> Convoluted. The comment really didn't help me understand what oldf
> is accounting.
> 
> Ah, rmap_blocks is actually the new btreeblks count. OK.
> 
> 	/*
> 	 * Since we're abandoning the old bnobt/cntbt, we have to decrease
> 	 * fdblocks by the # of blocks in those trees.  btreeblks counts the
> 	 * non-root blocks of the free space and rmap btrees.  Do this before
> 	 * resetting the AGF counters.
> 	 */
> 
> 	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
> 
> 	/* rmap_blocks accounts root block, btreeblks doesn't */
> 	new_btblks = be32_to_cpu(agf->agf_rmap_blocks) - 1;
> 
> 	/* btreeblks doesn't account bno/cnt root blocks */
> 	to_free = pag->pagf_btreeblks + 2;
> 
> 	/* and don't account for the blocks we aren't freeing */
> 	to_free -= new_btblks;

Ok, I'll do that.

> 
> > +	error = xfs_mod_fdblocks(sc->mp, -(int64_t)oldf, false);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Reset the per-AG info, both incore and ondisk. */
> > +	pag->pagf_btreeblks = rmap_blocks;
> > +	pag->pagf_freeblks = 0;
> > +	pag->pagf_longest = 0;
> > +
> > +	agf->agf_btreeblks = cpu_to_be32(pag->pagf_btreeblks);
> 
> I'd prefer that you use new_btblks here, too. Easier to see at a
> glance that the on-disk agf is being set to the new value....

Ok.

> 
> 
> > +	agf->agf_freeblks = 0;
> > +	agf->agf_longest = 0;
> > +	*log_flags |= XFS_AGF_BTREEBLKS | XFS_AGF_LONGEST | XFS_AGF_FREEBLKS;
> > +
> > +	return 0;
> > +}
> > +
> > +/* Initialize new bnobt/cntbt roots and implant them into the AGF. */
> > +STATIC int
> > +xfs_repair_allocbt_reset_btrees(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*free_extents,
> > +	int				*log_flags)
> > +{
> > +	struct xfs_owner_info		oinfo;
> > +	struct xfs_repair_alloc_extent	*cached = NULL;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_perag		*pag = sc->sa.pag;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	struct xfs_agf			*agf;
> > +	xfs_fsblock_t			bnofsb;
> > +	xfs_fsblock_t			cntfsb;
> > +	int				error;
> > +
> > +	/* Allocate new bnobt root. */
> > +	bnofsb = xfs_repair_allocbt_alloc_block(sc, free_extents, &cached);
> > +	if (bnofsb == NULLFSBLOCK)
> > +		return -ENOSPC;
> 
> Does this happen after the free extent list has been sorted by bno
> order? It really should, that way the new root is as close to the
> the AGF as possible, and the new btree blocks will also tend to
> cluster towards the lower AG offsets.

Will do.

> > +	/* Allocate new cntbt root. */
> > +	cntfsb = xfs_repair_allocbt_alloc_block(sc, free_extents, &cached);
> > +	if (cntfsb == NULLFSBLOCK)
> > +		return -ENOSPC;
> > +
> > +	agf = XFS_BUF_TO_AGF(sc->sa.agf_bp);
> > +	/* Initialize new bnobt root. */
> > +	error = xfs_repair_init_btblock(sc, bnofsb, &bp, XFS_BTNUM_BNO,
> > +			&xfs_allocbt_buf_ops);
> > +	if (error)
> > +		return error;
> > +	agf->agf_roots[XFS_BTNUM_BNOi] =
> > +			cpu_to_be32(XFS_FSB_TO_AGBNO(mp, bnofsb));
> > +	agf->agf_levels[XFS_BTNUM_BNOi] = cpu_to_be32(1);
> > +
> > +	/* Initialize new cntbt root. */
> > +	error = xfs_repair_init_btblock(sc, cntfsb, &bp, XFS_BTNUM_CNT,
> > +			&xfs_allocbt_buf_ops);
> > +	if (error)
> > +		return error;
> > +	agf->agf_roots[XFS_BTNUM_CNTi] =
> > +			cpu_to_be32(XFS_FSB_TO_AGBNO(mp, cntfsb));
> > +	agf->agf_levels[XFS_BTNUM_CNTi] = cpu_to_be32(1);
> > +
> > +	/* Add rmap records for the btree roots */
> > +	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_AG);
> > +	error = xfs_rmap_alloc(sc->tp, sc->sa.agf_bp, sc->sa.agno,
> > +			XFS_FSB_TO_AGBNO(mp, bnofsb), 1, &oinfo);
> > +	if (error)
> > +		return error;
> > +	error = xfs_rmap_alloc(sc->tp, sc->sa.agf_bp, sc->sa.agno,
> > +			XFS_FSB_TO_AGBNO(mp, cntfsb), 1, &oinfo);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Reset the incore state. */
> > +	pag->pagf_levels[XFS_BTNUM_BNOi] = 1;
> > +	pag->pagf_levels[XFS_BTNUM_CNTi] = 1;
> > +
> > +	*log_flags |=  XFS_AGF_ROOTS | XFS_AGF_LEVELS;
> > +	return 0;
> 
> Rather than duplicating all this init code twice, would factoring it
> make sense? The only difference between the alloc/init of the two
> btrees is the array index that info is stored in....

Yeah, it would.

> > +}
> > +
> > +/* Build new free space btrees and dispose of the old one. */
> > +STATIC int
> > +xfs_repair_allocbt_rebuild_trees(
> > +	struct xfs_scrub_context	*sc,
> > +	struct list_head		*free_extents,
> > +	struct xfs_repair_extent_list	*old_allocbt_blocks)
> > +{
> > +	struct xfs_owner_info		oinfo;
> > +	struct xfs_repair_alloc_extent	*rae;
> > +	struct xfs_repair_alloc_extent	*n;
> > +	struct xfs_repair_alloc_extent	*longest;
> > +	int				error;
> > +
> > +	xfs_rmap_skip_owner_update(&oinfo);
> > +
> > +	/*
> > +	 * Insert the longest free extent in case it's necessary to
> > +	 * refresh the AGFL with multiple blocks.  If there is no longest
> > +	 * extent, we had exactly the free space we needed; we're done.
> > +	 */
> > +	longest = xfs_repair_allocbt_get_longest(free_extents);
> > +	if (!longest)
> > +		goto done;
> > +	error = xfs_repair_allocbt_free_extent(sc,
> > +			XFS_AGB_TO_FSB(sc->mp, sc->sa.agno, longest->bno),
> > +			longest->len, &oinfo);
> > +	list_del(&longest->list);
> > +	kmem_free(longest);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Insert records into the new btrees. */
> > +	list_sort(NULL, free_extents, xfs_repair_allocbt_extent_cmp);
> 
> Hmmm. I guess list sorting doesn't occur before allocating new root
> blocks. Can this get moved?

Certainly.

> ....
> 
> > +bool
> > +xfs_extent_busy_list_empty(
> > +	struct xfs_perag	*pag);
> 
> One line form for header prototypes, please.

Ok.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2018-07-04  2:15 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-24 19:23 [PATCH v16 00/21] xfs-4.19: online repair support Darrick J. Wong
2018-06-24 19:23 ` [PATCH 01/21] xfs: don't assume a left rmap when allocating a new rmap Darrick J. Wong
2018-06-27  0:54   ` Dave Chinner
2018-06-28 21:11   ` Allison Henderson
2018-06-29 14:39     ` Darrick J. Wong
2018-06-24 19:23 ` [PATCH 02/21] xfs: add helper to decide if an inode has allocated cow blocks Darrick J. Wong
2018-06-27  1:02   ` Dave Chinner
2018-06-28 21:12   ` Allison Henderson
2018-06-24 19:23 ` [PATCH 03/21] xfs: refactor part of xfs_free_eofblocks Darrick J. Wong
2018-06-28 21:13   ` Allison Henderson
2018-06-24 19:23 ` [PATCH 04/21] xfs: repair the AGF and AGFL Darrick J. Wong
2018-06-27  2:19   ` Dave Chinner
2018-06-27 16:44     ` Allison Henderson
2018-06-27 23:37       ` Dave Chinner
2018-06-29 15:14         ` Darrick J. Wong
2018-06-28 17:25     ` Allison Henderson
2018-06-29 15:08       ` Darrick J. Wong
2018-06-28 21:14   ` Allison Henderson
2018-06-28 23:21     ` Dave Chinner
2018-06-29  1:35       ` Allison Henderson
2018-06-29 14:55         ` Darrick J. Wong
2018-06-24 19:24 ` [PATCH 05/21] xfs: repair the AGI Darrick J. Wong
2018-06-27  2:22   ` Dave Chinner
2018-06-28 21:15   ` Allison Henderson
2018-06-24 19:24 ` [PATCH 06/21] xfs: repair free space btrees Darrick J. Wong
2018-06-27  3:21   ` Dave Chinner
2018-07-04  2:15     ` Darrick J. Wong [this message]
2018-07-04  2:25       ` Dave Chinner
2018-06-30 17:36   ` Allison Henderson
2018-06-24 19:24 ` [PATCH 07/21] xfs: repair inode btrees Darrick J. Wong
2018-06-28  0:55   ` Dave Chinner
2018-07-04  2:22     ` Darrick J. Wong
2018-06-30 17:36   ` Allison Henderson
2018-06-30 18:30     ` Darrick J. Wong
2018-07-01  0:45       ` Allison Henderson
2018-06-24 19:24 ` [PATCH 08/21] xfs: defer iput on certain inodes while scrub / repair are running Darrick J. Wong
2018-06-28 23:37   ` Dave Chinner
2018-06-29 14:49     ` Darrick J. Wong
2018-06-24 19:24 ` [PATCH 09/21] xfs: finish our set of inode get/put tracepoints for scrub Darrick J. Wong
2018-06-24 19:24 ` [PATCH 10/21] xfs: introduce online scrub freeze Darrick J. Wong
2018-06-24 19:24 ` [PATCH 11/21] xfs: repair the rmapbt Darrick J. Wong
2018-07-03  5:32   ` Dave Chinner
2018-07-03 23:59     ` Darrick J. Wong
2018-07-04  8:44       ` Carlos Maiolino
2018-07-04 18:40         ` Darrick J. Wong
2018-07-04 23:21       ` Dave Chinner
2018-07-05  3:48         ` Darrick J. Wong
2018-07-05  7:03           ` Dave Chinner
2018-07-06  0:47             ` Darrick J. Wong
2018-07-06  1:08               ` Dave Chinner
2018-06-24 19:24 ` [PATCH 12/21] xfs: repair refcount btrees Darrick J. Wong
2018-07-03  5:50   ` Dave Chinner
2018-07-04  2:23     ` Darrick J. Wong
2018-06-24 19:24 ` [PATCH 13/21] xfs: repair inode records Darrick J. Wong
2018-07-03  6:17   ` Dave Chinner
2018-07-04  0:16     ` Darrick J. Wong
2018-07-04  1:03       ` Dave Chinner
2018-07-04  1:30         ` Darrick J. Wong
2018-06-24 19:24 ` [PATCH 14/21] xfs: zap broken inode forks Darrick J. Wong
2018-07-04  2:07   ` Dave Chinner
2018-07-04  3:26     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 15/21] xfs: repair inode block maps Darrick J. Wong
2018-07-04  3:00   ` Dave Chinner
2018-07-04  3:41     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 16/21] xfs: repair damaged symlinks Darrick J. Wong
2018-07-04  5:45   ` Dave Chinner
2018-07-04 18:45     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 17/21] xfs: repair extended attributes Darrick J. Wong
2018-07-06  1:03   ` Dave Chinner
2018-07-06  3:10     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 18/21] xfs: scrub should set preen if attr leaf has holes Darrick J. Wong
2018-06-29  2:52   ` Dave Chinner
2018-06-24 19:25 ` [PATCH 19/21] xfs: repair quotas Darrick J. Wong
2018-07-06  1:50   ` Dave Chinner
2018-07-06  3:16     ` Darrick J. Wong
2018-06-24 19:25 ` [PATCH 20/21] xfs: implement live quotacheck as part of quota repair Darrick J. Wong
2018-06-24 19:25 ` [PATCH 21/21] xfs: add online scrub/repair for superblock counters Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180704021504.GA32415@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.