* [PATCH 0/2] xfs_repair: more fixes
@ 2020-06-25 20:52 Darrick J. Wong
  2020-06-25 20:52 ` [PATCH 1/2] xfs_repair: complain about ag header crc errors Darrick J. Wong
  2020-06-25 20:52 ` [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist Darrick J. Wong
  0 siblings, 2 replies; 9+ messages in thread
From: Darrick J. Wong @ 2020-06-25 20:52 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

Hi all,

Two more fixes for repair: first, we actually complain if ag header crc
verification fails.  Second, we now try to populate enough of an AGFL
so that fixing the freelist should never require splitting the free
space btrees.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=repair-fixes
---
 repair/agbtree.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++-------
 repair/scan.c    |    6 ++++
 2 files changed, 74 insertions(+), 10 deletions(-)



* [PATCH 1/2] xfs_repair: complain about ag header crc errors
  2020-06-25 20:52 [PATCH 0/2] xfs_repair: more fixes Darrick J. Wong
@ 2020-06-25 20:52 ` Darrick J. Wong
  2020-06-29 12:20   ` Brian Foster
  2020-06-25 20:52 ` [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist Darrick J. Wong
  1 sibling, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2020-06-25 20:52 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Repair doesn't complain about crc errors in the AG headers, and it
should.  Otherwise, this gives the admin the wrong impression about the
state of the filesystem after a nomodify check.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/scan.c |    6 ++++++
 1 file changed, 6 insertions(+)


diff --git a/repair/scan.c b/repair/scan.c
index 505cfc53..42b299f7 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -2441,6 +2441,8 @@ scan_ag(
 		objname = _("root superblock");
 		goto out_free_sb;
 	}
+	if (sbbuf->b_error == -EFSBADCRC)
+		do_warn(_("superblock has bad CRC for ag %d\n"), agno);
 	libxfs_sb_from_disk(sb, sbbuf->b_addr);
 
 	error = salvage_buffer(mp->m_dev,
@@ -2450,6 +2452,8 @@ scan_ag(
 		objname = _("agf block");
 		goto out_free_sbbuf;
 	}
+	if (agfbuf->b_error == -EFSBADCRC)
+		do_warn(_("agf has bad CRC for ag %d\n"), agno);
 	agf = agfbuf->b_addr;
 
 	error = salvage_buffer(mp->m_dev,
@@ -2459,6 +2463,8 @@ scan_ag(
 		objname = _("agi block");
 		goto out_free_agfbuf;
 	}
+	if (agibuf->b_error == -EFSBADCRC)
+		do_warn(_("agi has bad CRC for ag %d\n"), agno);
 	agi = agibuf->b_addr;
 
 	/* fix up bad ag headers */



* [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist
  2020-06-25 20:52 [PATCH 0/2] xfs_repair: more fixes Darrick J. Wong
  2020-06-25 20:52 ` [PATCH 1/2] xfs_repair: complain about ag header crc errors Darrick J. Wong
@ 2020-06-25 20:52 ` Darrick J. Wong
  2020-06-29 12:22   ` Brian Foster
  1 sibling, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2020-06-25 20:52 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

In commit 9851fd79bfb1, we added a slight amount of slack to the free
space btrees being reconstructed so that the initial fix_freelist call
(which is run against a totally empty AGFL) would never have to split
either free space btree in order to populate the free list.

The new btree bulk loading code in xfs_repair can re-create this
situation because it can set the slack values to zero if the filesystem
is very full.  However, these days repair has the infrastructure needed
to ensure that overestimations of the btree block counts end up on the
AGFL or get freed back into the filesystem at the end of phase 5.

Fix this problem by reserving blocks to a separate AGFL block
reservation, and checking that between this new reservation and any
overages in the bnobt/cntbt fakeroots, we have enough blocks sitting
around to populate the AGFL with the minimum number of blocks it needs
to handle a split in the bno/cnt/rmap btrees.

Note that we reserve blocks for the new bnobt/cntbt/AGFL at the very end
of the reservation steps in phase 5, so the extra allocation should not
cause repair to fail if it can't find blocks for btrees.

Fixes: 9851fd79bfb1 ("repair: AGFL rebuild fails if btree split required")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/agbtree.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 68 insertions(+), 10 deletions(-)


diff --git a/repair/agbtree.c b/repair/agbtree.c
index 339b1489..7a4f316c 100644
--- a/repair/agbtree.c
+++ b/repair/agbtree.c
@@ -65,8 +65,8 @@ consume_freespace(
 }
 
 /* Reserve blocks for the new per-AG structures. */
-static void
-reserve_btblocks(
+static uint32_t
+reserve_agblocks(
 	struct xfs_mount	*mp,
 	xfs_agnumber_t		agno,
 	struct bt_rebuild	*btr,
@@ -86,8 +86,7 @@ reserve_btblocks(
 		 */
 		ext_ptr = findfirst_bcnt_extent(agno);
 		if (!ext_ptr)
-			do_error(
-_("error - not enough free space in filesystem\n"));
+			break;
 
 		/* Use up the extent we've got. */
 		len = min(ext_ptr->ex_blockcount, nr_blocks - blocks_allocated);
@@ -110,6 +109,23 @@ _("error - not enough free space in filesystem\n"));
 	fprintf(stderr, "blocks_allocated = %d\n",
 		blocks_allocated);
 #endif
+	return blocks_allocated;
+}
+
+static inline void
+reserve_btblocks(
+	struct xfs_mount	*mp,
+	xfs_agnumber_t		agno,
+	struct bt_rebuild	*btr,
+	uint32_t		nr_blocks)
+{
+	uint32_t		got;
+
+	got = reserve_agblocks(mp, agno, btr, nr_blocks);
+	if (got != nr_blocks)
+		do_error(
+	_("error - not enough free space in filesystem, AG %u\n"),
+				agno);
 }
 
 /* Feed one of the new btree blocks to the bulk loader. */
@@ -219,8 +235,13 @@ init_freespace_cursors(
 {
 	unsigned int		bno_blocks;
 	unsigned int		cnt_blocks;
+	unsigned int		min_agfl_len;
+	bool			fill_agfl = true;
 	int			error;
 
+	*extra_blocks = 0;
+	min_agfl_len = libxfs_alloc_min_freelist(sc->mp, NULL);
+
 	init_rebuild(sc, &XFS_RMAP_OINFO_AG, free_space, btr_bno);
 	init_rebuild(sc, &XFS_RMAP_OINFO_AG, free_space, btr_cnt);
 
@@ -244,6 +265,9 @@ init_freespace_cursors(
 	 */
 	do {
 		unsigned int	num_freeblocks;
+		unsigned int	overflow = 0;
+		unsigned int	got;
+		int64_t		wanted;
 
 		bno_blocks = btr_bno->bload.nr_blocks;
 		cnt_blocks = btr_cnt->bload.nr_blocks;
@@ -262,25 +286,59 @@ _("Unable to compute free space by block btree geometry, error %d.\n"), -error);
 			do_error(
 _("Unable to compute free space by length btree geometry, error %d.\n"), -error);
 
+		if (bno_blocks > btr_bno->bload.nr_blocks)
+			overflow += bno_blocks - btr_bno->bload.nr_blocks;
+		if (cnt_blocks > btr_cnt->bload.nr_blocks)
+			overflow += cnt_blocks - btr_cnt->bload.nr_blocks;
+		if (overflow >= min_agfl_len)
+			fill_agfl = false;
+
 		/* We don't need any more blocks, so we're done. */
 		if (bno_blocks >= btr_bno->bload.nr_blocks &&
-		    cnt_blocks >= btr_cnt->bload.nr_blocks)
+		    cnt_blocks >= btr_cnt->bload.nr_blocks &&
+		    !fill_agfl) {
+			*extra_blocks = overflow;
 			break;
+		}
 
 		/* Allocate however many more blocks we need this time. */
-		if (bno_blocks < btr_bno->bload.nr_blocks)
+		if (bno_blocks < btr_bno->bload.nr_blocks) {
 			reserve_btblocks(sc->mp, agno, btr_bno,
 					btr_bno->bload.nr_blocks - bno_blocks);
-		if (cnt_blocks < btr_cnt->bload.nr_blocks)
+			bno_blocks = btr_bno->bload.nr_blocks;
+		}
+		if (cnt_blocks < btr_cnt->bload.nr_blocks) {
 			reserve_btblocks(sc->mp, agno, btr_cnt,
 					btr_cnt->bload.nr_blocks - cnt_blocks);
+			cnt_blocks = btr_cnt->bload.nr_blocks;
+		}
+
+		/*
+		 * Now try to fill the bnobt/cntbt cursors with extra blocks to
+		 * populate the AGFL.  If we don't get all the blocks we want,
+		 * stop trying to fill the AGFL.
+		 */
+		wanted = (int64_t)btr_bno->bload.nr_blocks +
+				(min_agfl_len / 2) - bno_blocks;
+		if (wanted > 0 && fill_agfl) {
+			got = reserve_agblocks(sc->mp, agno, btr_bno, wanted);
+			if (wanted > got)
+				fill_agfl = false;
+			btr_bno->bload.nr_blocks += got;
+		}
+
+		wanted = (int64_t)btr_cnt->bload.nr_blocks +
+				(min_agfl_len / 2) - cnt_blocks;
+		if (wanted > 0 && fill_agfl) {
+			got = reserve_agblocks(sc->mp, agno, btr_cnt, wanted);
+			if (wanted > got)
+				fill_agfl = false;
+			btr_cnt->bload.nr_blocks += got;
+		}
 
 		/* Ok, now how many free space records do we have? */
 		*nr_extents = count_bno_extents_blocks(agno, &num_freeblocks);
 	} while (1);
-
-	*extra_blocks = (bno_blocks - btr_bno->bload.nr_blocks) +
-			(cnt_blocks - btr_cnt->bload.nr_blocks);
 }
 
 /* Rebuild the free space btrees. */



* Re: [PATCH 1/2] xfs_repair: complain about ag header crc errors
  2020-06-25 20:52 ` [PATCH 1/2] xfs_repair: complain about ag header crc errors Darrick J. Wong
@ 2020-06-29 12:20   ` Brian Foster
  2020-06-29 23:11     ` Darrick J. Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Brian Foster @ 2020-06-29 12:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs

On Thu, Jun 25, 2020 at 01:52:32PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Repair doesn't complain about crc errors in the AG headers, and it
> should.  Otherwise give the admin the wrong impression about the
> state of the filesystem after a nomodify check.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  repair/scan.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> 
> diff --git a/repair/scan.c b/repair/scan.c
> index 505cfc53..42b299f7 100644
> --- a/repair/scan.c
> +++ b/repair/scan.c
> @@ -2441,6 +2441,8 @@ scan_ag(
>  		objname = _("root superblock");
>  		goto out_free_sb;
>  	}
> +	if (sbbuf->b_error == -EFSBADCRC)
> +		do_warn(_("superblock has bad CRC for ag %d\n"), agno);

So salvage_buffer() reads the buf and passes along the verifier. If the
verifier fails, we ignore the error and return 0 because of
LIBXFS_READBUF_SALVAGE, but leave it set in bp->b_error so it should be
accessible here. Looks Ok:

Reviewed-by: Brian Foster <bfoster@redhat.com>
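
To make the behaviour described above concrete, here is a minimal sketch of
the "salvage read" pattern: an I/O failure is returned to the caller, while
a verifier failure is only remembered in the buffer so the caller can warn
about it.  Every name below (toy_buf, toy_salvage_read, TOY_EBADCRC, and so
on) is hypothetical and only models the idea; it is not the libxfs API.

```c
#include <stdbool.h>

#define TOY_EIO		5	/* stand-in for an I/O error code */
#define TOY_EBADCRC	74	/* stand-in for EFSBADCRC; value is arbitrary */

struct toy_buf {
	int	b_error;	/* last error seen, negative-errno style */
	bool	crc_ok;		/* pretend checksum state of the block */
	bool	readable;	/* pretend state of the device */
};

/* Salvage-mode read: I/O failures are fatal, verifier failures are not. */
static int
toy_salvage_read(struct toy_buf *bp)
{
	bp->b_error = 0;
	if (!bp->readable) {
		bp->b_error = -TOY_EIO;
		return -TOY_EIO;
	}
	if (!bp->crc_ok)
		bp->b_error = -TOY_EBADCRC;	/* remembered, not returned */
	return 0;
}

/*
 * Caller side: returns -1 if the header could not be read at all,
 * 1 if it was read but would trigger a bad-CRC warning, 0 otherwise.
 */
static int
toy_check_header(struct toy_buf *bp)
{
	if (toy_salvage_read(bp))
		return -1;
	return bp->b_error == -TOY_EBADCRC ? 1 : 0;
}
```

The point of the sketch is only that the read can succeed while b_error
still carries the verifier result, which is what the added checks rely on.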

>  	libxfs_sb_from_disk(sb, sbbuf->b_addr);
>  
>  	error = salvage_buffer(mp->m_dev,
> @@ -2450,6 +2452,8 @@ scan_ag(
>  		objname = _("agf block");
>  		goto out_free_sbbuf;
>  	}
> +	if (agfbuf->b_error == -EFSBADCRC)
> +		do_warn(_("agf has bad CRC for ag %d\n"), agno);
>  	agf = agfbuf->b_addr;
>  
>  	error = salvage_buffer(mp->m_dev,
> @@ -2459,6 +2463,8 @@ scan_ag(
>  		objname = _("agi block");
>  		goto out_free_agfbuf;
>  	}
> +	if (agibuf->b_error == -EFSBADCRC)
> +		do_warn(_("agi has bad CRC for ag %d\n"), agno);
>  	agi = agibuf->b_addr;
>  
>  	/* fix up bad ag headers */
> 



* Re: [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist
  2020-06-25 20:52 ` [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist Darrick J. Wong
@ 2020-06-29 12:22   ` Brian Foster
  2020-06-29 23:21     ` Darrick J. Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Brian Foster @ 2020-06-29 12:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs

On Thu, Jun 25, 2020 at 01:52:39PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> In commit 9851fd79bfb1, we added a slight amount of slack to the free
> space btrees being reconstructed so that the initial fix_freelist call
> (which is run against a totally empty AGFL) would never have to split
> either free space btree in order to populate the free list.
> 
> The new btree bulk loading code in xfs_repair can re-create this
> situation because it can set the slack values to zero if the filesystem
> is very full.  However, these days repair has the infrastructure needed
> to ensure that overestimations of the btree block counts end up on the
> AGFL or get freed back into the filesystem at the end of phase 5.
> 
> Fix this problem by reserving blocks to a separate AGFL block
> reservation, and checking that between this new reservation and any
> overages in the bnobt/cntbt fakeroots, we have enough blocks sitting
> around to populate the AGFL with the minimum number of blocks it needs
> to handle a split in the bno/cnt/rmap btrees.
> 
> Note that we reserve blocks for the new bnobt/cntbt/AGFL at the very end
> of the reservation steps in phase 5, so the extra allocation should not
> cause repair to fail if it can't find blocks for btrees.
> 
> Fixes: 9851fd79bfb1 ("repair: AGFL rebuild fails if btree split required")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  repair/agbtree.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 68 insertions(+), 10 deletions(-)
> 
> 
> diff --git a/repair/agbtree.c b/repair/agbtree.c
> index 339b1489..7a4f316c 100644
> --- a/repair/agbtree.c
> +++ b/repair/agbtree.c
...
> @@ -262,25 +286,59 @@ _("Unable to compute free space by block btree geometry, error %d.\n"), -error);
...
> +
> +		/*
> +		 * Now try to fill the bnobt/cntbt cursors with extra blocks to
> +		 * populate the AGFL.  If we don't get all the blocks we want,
> +		 * stop trying to fill the AGFL.
> +		 */
> +		wanted = (int64_t)btr_bno->bload.nr_blocks +
> +				(min_agfl_len / 2) - bno_blocks;
> +		if (wanted > 0 && fill_agfl) {
> +			got = reserve_agblocks(sc->mp, agno, btr_bno, wanted);
> +			if (wanted > got)
> +				fill_agfl = false;
> +			btr_bno->bload.nr_blocks += got;
> +		}
> +
> +		wanted = (int64_t)btr_cnt->bload.nr_blocks +
> +				(min_agfl_len / 2) - cnt_blocks;
> +		if (wanted > 0 && fill_agfl) {
> +			got = reserve_agblocks(sc->mp, agno, btr_cnt, wanted);
> +			if (wanted > got)
> +				fill_agfl = false;
> +			btr_cnt->bload.nr_blocks += got;
> +		}

It's a little hard to follow this with the nr_blocks sampling and
whatnot, but I think I get the idea. What's the reason for splitting the
AGFL res requirement evenly across the two cursors? These AGFL blocks
all fall into the same overflow pool, right? I was wondering why we
couldn't just attach the overflow to one, or check one for the full res
and then the other if more blocks are needed.
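
For reference, the per-cursor request in the quoted hunk can be restated as
a small standalone helper.  This is only a sketch of the arithmetic; the
function name and parameters are made up for illustration and do not exist
in xfsprogs.

```c
#include <stdint.h>

/*
 * needed:       blocks the btree geometry currently calls for
 * have:         blocks already reserved in this cursor
 * min_agfl_len: minimum AGFL length, split evenly across two cursors
 *
 * Returns how many extra blocks this cursor would ask for, or 0 if the
 * cursor already holds its share of the AGFL reservation.
 */
static int64_t
toy_agfl_wanted(uint32_t needed, uint32_t have, uint32_t min_agfl_len)
{
	/* Widen before subtracting so a large 'have' cannot wrap around. */
	int64_t wanted = (int64_t)needed + (min_agfl_len / 2) - have;

	return wanted > 0 ? wanted : 0;
}
```

With a minimum AGFL length of 6, a cursor that needs 10 blocks and holds 10
would ask for 3 more; one that already holds 13 or more would ask for none.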

In thinking about it a bit more, wouldn't the whole algorithm be
simpler if we reserved the min AGFL requirement first, optionally passed
'agfl_res' to reserve_btblocks() such that subsequent reservations can
steal from it (and then fail if it depletes), then stuff what's left in
one (or both, if there's a reason for that) of the cursors at the end?

Brian

>  
>  		/* Ok, now how many free space records do we have? */
>  		*nr_extents = count_bno_extents_blocks(agno, &num_freeblocks);
>  	} while (1);
> -
> -	*extra_blocks = (bno_blocks - btr_bno->bload.nr_blocks) +
> -			(cnt_blocks - btr_cnt->bload.nr_blocks);
>  }
>  
>  /* Rebuild the free space btrees. */
> 



* Re: [PATCH 1/2] xfs_repair: complain about ag header crc errors
  2020-06-29 12:20   ` Brian Foster
@ 2020-06-29 23:11     ` Darrick J. Wong
  0 siblings, 0 replies; 9+ messages in thread
From: Darrick J. Wong @ 2020-06-29 23:11 UTC (permalink / raw)
  To: Brian Foster; +Cc: sandeen, linux-xfs

On Mon, Jun 29, 2020 at 08:20:31AM -0400, Brian Foster wrote:
> On Thu, Jun 25, 2020 at 01:52:32PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Repair doesn't complain about crc errors in the AG headers, and it
> > should.  Otherwise give the admin the wrong impression about the

"Otherwise, this gives the admin the wrong impression..."

I'll fix this before the next repost, though if the maintainer chooses
to pull it in before then, please make this minor correction.

> > state of the filesystem after a nomodify check.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  repair/scan.c |    6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > 
> > diff --git a/repair/scan.c b/repair/scan.c
> > index 505cfc53..42b299f7 100644
> > --- a/repair/scan.c
> > +++ b/repair/scan.c
> > @@ -2441,6 +2441,8 @@ scan_ag(
> >  		objname = _("root superblock");
> >  		goto out_free_sb;
> >  	}
> > +	if (sbbuf->b_error == -EFSBADCRC)
> > +		do_warn(_("superblock has bad CRC for ag %d\n"), agno);
> 
> So salvage_buffer() reads the buf and passes along the verifier. If the
> verifier fails, we ignore the error and return 0 because of
> LIBXFS_READBUF_SALVAGE, but leave it set in bp->b_error so it should be
> accessible here. Looks Ok:

Yep.

> Reviewed-by: Brian Foster <bfoster@redhat.com>

Thanks for the review!

--D

> 
> >  	libxfs_sb_from_disk(sb, sbbuf->b_addr);
> >  
> >  	error = salvage_buffer(mp->m_dev,
> > @@ -2450,6 +2452,8 @@ scan_ag(
> >  		objname = _("agf block");
> >  		goto out_free_sbbuf;
> >  	}
> > +	if (agfbuf->b_error == -EFSBADCRC)
> > +		do_warn(_("agf has bad CRC for ag %d\n"), agno);
> >  	agf = agfbuf->b_addr;
> >  
> >  	error = salvage_buffer(mp->m_dev,
> > @@ -2459,6 +2463,8 @@ scan_ag(
> >  		objname = _("agi block");
> >  		goto out_free_agfbuf;
> >  	}
> > +	if (agibuf->b_error == -EFSBADCRC)
> > +		do_warn(_("agi has bad CRC for ag %d\n"), agno);
> >  	agi = agibuf->b_addr;
> >  
> >  	/* fix up bad ag headers */
> > 
> 


* Re: [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist
  2020-06-29 12:22   ` Brian Foster
@ 2020-06-29 23:21     ` Darrick J. Wong
  2020-06-30 10:52       ` Brian Foster
  0 siblings, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2020-06-29 23:21 UTC (permalink / raw)
  To: Brian Foster; +Cc: sandeen, linux-xfs

On Mon, Jun 29, 2020 at 08:22:28AM -0400, Brian Foster wrote:
> On Thu, Jun 25, 2020 at 01:52:39PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > In commit 9851fd79bfb1, we added a slight amount of slack to the free
> > space btrees being reconstructed so that the initial fix_freelist call
> > (which is run against a totally empty AGFL) would never have to split
> > either free space btree in order to populate the free list.
> > 
> > The new btree bulk loading code in xfs_repair can re-create this
> > situation because it can set the slack values to zero if the filesystem
> > is very full.  However, these days repair has the infrastructure needed
> > to ensure that overestimations of the btree block counts end up on the
> > AGFL or get freed back into the filesystem at the end of phase 5.
> > 
> > Fix this problem by reserving blocks to a separate AGFL block
> > reservation, and checking that between this new reservation and any
> > overages in the bnobt/cntbt fakeroots, we have enough blocks sitting
> > around to populate the AGFL with the minimum number of blocks it needs
> > to handle a split in the bno/cnt/rmap btrees.
> > 
> > Note that we reserve blocks for the new bnobt/cntbt/AGFL at the very end
> > of the reservation steps in phase 5, so the extra allocation should not
> > cause repair to fail if it can't find blocks for btrees.
> > 
> > Fixes: 9851fd79bfb1 ("repair: AGFL rebuild fails if btree split required")
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  repair/agbtree.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 68 insertions(+), 10 deletions(-)
> > 
> > 
> > diff --git a/repair/agbtree.c b/repair/agbtree.c
> > index 339b1489..7a4f316c 100644
> > --- a/repair/agbtree.c
> > +++ b/repair/agbtree.c
> ...
> > @@ -262,25 +286,59 @@ _("Unable to compute free space by block btree geometry, error %d.\n"), -error);
> ...
> > +
> > +		/*
> > +		 * Now try to fill the bnobt/cntbt cursors with extra blocks to
> > +		 * populate the AGFL.  If we don't get all the blocks we want,
> > +		 * stop trying to fill the AGFL.
> > +		 */
> > +		wanted = (int64_t)btr_bno->bload.nr_blocks +
> > +				(min_agfl_len / 2) - bno_blocks;
> > +		if (wanted > 0 && fill_agfl) {
> > +			got = reserve_agblocks(sc->mp, agno, btr_bno, wanted);
> > +			if (wanted > got)
> > +				fill_agfl = false;
> > +			btr_bno->bload.nr_blocks += got;
> > +		}
> > +
> > +		wanted = (int64_t)btr_cnt->bload.nr_blocks +
> > +				(min_agfl_len / 2) - cnt_blocks;
> > +		if (wanted > 0 && fill_agfl) {
> > +			got = reserve_agblocks(sc->mp, agno, btr_cnt, wanted);
> > +			if (wanted > got)
> > +				fill_agfl = false;
> > +			btr_cnt->bload.nr_blocks += got;
> > +		}
> 
> It's a little hard to follow this with the nr_blocks sampling and
> whatnot, but I think I get the idea. What's the reason for splitting the
> AGFL res requirement evenly across the two cursors? These AGFL blocks
> all fall into the same overflow pool, right? I was wondering why we
> couldn't just attach the overflow to one, or check one for the full res
> and then the other if more blocks are needed.

I chose to stuff the excess blocks into the bnobt and cntbt bulkload
cursors to avoid having to initialize a semi-phony "bulkload cursor" for
the agfl, and I decided to split them evenly between the two cursors so
that I wouldn't someday have to deal with a bug report about how one
cursor somehow ran out of blocks but the other one had plenty more.

> In thinking about it a bit more, wouldn't the whole algorithm be more
> simple if we reserved the min AGFL requirement first, optionally passed
> 'agfl_res' to reserve_btblocks() such that subsequent reservations can
> steal from it (and then fail if it depletes), then stuff what's left in
> one (or both, if there's a reason for that) of the cursors at the end?

Hmm.  I hadn't thought about that.  In general I wanted the AGFL
reservations to go last because I'd rather we set off with an underfull
AGFL than totally fail because we couldn't get blocks for the
bnobt/cntbt, but I suppose you're right that we could steal from it as
needed to prevent repair failure.

So, uh, I could rework this patch to create a phony agfl bulk load
cursor, fill it before the loop, steal blocks from it to fill the
bnobt/cntbt to satisfy failed allocations, and then dump any remainders
into the bnobt/cntbt cursors afterwards.  How does that sound?
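
Roughly, the reserve-first-then-steal idea could look like the toy model
below.  It tracks plain counters instead of real extents, and all names
(toy_pool, toy_reserve_agfl, toy_reserve_btblocks) are invented for the
sketch; none of this is existing repair code.

```c
#include <stdint.h>

struct toy_pool {
	uint32_t free_blocks;	/* free space still available in the AG */
	uint32_t agfl_res;	/* blocks set aside for the AGFL */
};

/* Take up to nr blocks from free space; return how many we got. */
static uint32_t
toy_take_free(struct toy_pool *p, uint32_t nr)
{
	uint32_t got = nr < p->free_blocks ? nr : p->free_blocks;

	p->free_blocks -= got;
	return got;
}

/* Before the loop: set aside the AGFL minimum (may come up short). */
static void
toy_reserve_agfl(struct toy_pool *p, uint32_t min_agfl_len)
{
	p->agfl_res = toy_take_free(p, min_agfl_len);
}

/*
 * Inside the loop: reserve btree blocks, stealing from the AGFL
 * reservation when free space runs out.  Returns 0 on success and -1
 * if even the AGFL reservation cannot cover the shortfall (the real
 * tool would presumably treat that as a fatal error).
 */
static int
toy_reserve_btblocks(struct toy_pool *p, uint32_t nr)
{
	uint32_t got = toy_take_free(p, nr);
	uint32_t shortfall = nr - got;

	if (shortfall > p->agfl_res)
		return -1;
	p->agfl_res -= shortfall;
	return 0;
}
```

Whatever is left in agfl_res after the loop would then be handed to the
bnobt/cntbt cursors, which is the part that needs the phony cursor or some
equivalent way of carrying real blocks around.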

--D

> Brian
> 
> >  
> >  		/* Ok, now how many free space records do we have? */
> >  		*nr_extents = count_bno_extents_blocks(agno, &num_freeblocks);
> >  	} while (1);
> > -
> > -	*extra_blocks = (bno_blocks - btr_bno->bload.nr_blocks) +
> > -			(cnt_blocks - btr_cnt->bload.nr_blocks);
> >  }
> >  
> >  /* Rebuild the free space btrees. */
> > 
> 


* Re: [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist
  2020-06-29 23:21     ` Darrick J. Wong
@ 2020-06-30 10:52       ` Brian Foster
  2020-07-01 16:19         ` Darrick J. Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Brian Foster @ 2020-06-30 10:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: sandeen, linux-xfs

On Mon, Jun 29, 2020 at 04:21:40PM -0700, Darrick J. Wong wrote:
> On Mon, Jun 29, 2020 at 08:22:28AM -0400, Brian Foster wrote:
> > On Thu, Jun 25, 2020 at 01:52:39PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > In commit 9851fd79bfb1, we added a slight amount of slack to the free
> > > space btrees being reconstructed so that the initial fix_freelist call
> > > (which is run against a totally empty AGFL) would never have to split
> > > either free space btree in order to populate the free list.
> > > 
> > > The new btree bulk loading code in xfs_repair can re-create this
> > > situation because it can set the slack values to zero if the filesystem
> > > is very full.  However, these days repair has the infrastructure needed
> > > to ensure that overestimations of the btree block counts end up on the
> > > AGFL or get freed back into the filesystem at the end of phase 5.
> > > 
> > > Fix this problem by reserving blocks to a separate AGFL block
> > > reservation, and checking that between this new reservation and any
> > > overages in the bnobt/cntbt fakeroots, we have enough blocks sitting
> > > around to populate the AGFL with the minimum number of blocks it needs
> > > to handle a split in the bno/cnt/rmap btrees.
> > > 
> > > Note that we reserve blocks for the new bnobt/cntbt/AGFL at the very end
> > > of the reservation steps in phase 5, so the extra allocation should not
> > > cause repair to fail if it can't find blocks for btrees.
> > > 
> > > Fixes: 9851fd79bfb1 ("repair: AGFL rebuild fails if btree split required")
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  repair/agbtree.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++-------
> > >  1 file changed, 68 insertions(+), 10 deletions(-)
> > > 
> > > 
> > > diff --git a/repair/agbtree.c b/repair/agbtree.c
> > > index 339b1489..7a4f316c 100644
> > > --- a/repair/agbtree.c
> > > +++ b/repair/agbtree.c
> > ...
> > > @@ -262,25 +286,59 @@ _("Unable to compute free space by block btree geometry, error %d.\n"), -error);
> > ...
> > > +
> > > +		/*
> > > +		 * Now try to fill the bnobt/cntbt cursors with extra blocks to
> > > +		 * populate the AGFL.  If we don't get all the blocks we want,
> > > +		 * stop trying to fill the AGFL.
> > > +		 */
> > > +		wanted = (int64_t)btr_bno->bload.nr_blocks +
> > > +				(min_agfl_len / 2) - bno_blocks;
> > > +		if (wanted > 0 && fill_agfl) {
> > > +			got = reserve_agblocks(sc->mp, agno, btr_bno, wanted);
> > > +			if (wanted > got)
> > > +				fill_agfl = false;
> > > +			btr_bno->bload.nr_blocks += got;
> > > +		}
> > > +
> > > +		wanted = (int64_t)btr_cnt->bload.nr_blocks +
> > > +				(min_agfl_len / 2) - cnt_blocks;
> > > +		if (wanted > 0 && fill_agfl) {
> > > +			got = reserve_agblocks(sc->mp, agno, btr_cnt, wanted);
> > > +			if (wanted > got)
> > > +				fill_agfl = false;
> > > +			btr_cnt->bload.nr_blocks += got;
> > > +		}
> > 
> > It's a little hard to follow this with the nr_blocks sampling and
> > whatnot, but I think I get the idea. What's the reason for splitting the
> > AGFL res requirement evenly across the two cursors? These AGFL blocks
> > all fall into the same overflow pool, right? I was wondering why we
> > couldn't just attach the overflow to one, or check one for the full res
> > and then the other if more blocks are needed.
> 
> I chose to stuff the excess blocks into the bnobt and cntbt bulkload
> cursors to avoid having to initialize a semi-phony "bulkload cursor" for
> the agfl, and I decided to split them evenly between the two cursors so
> that I wouldn't have someday to deal with a bug report about how one
> cursor somehow ran out of blocks but the other one had plenty more.
> 
> > In thinking about it a bit more, wouldn't the whole algorithm be more
> > simple if we reserved the min AGFL requirement first, optionally passed
> > 'agfl_res' to reserve_btblocks() such that subsequent reservations can
> > steal from it (and then fail if it depletes), then stuff what's left in
> > one (or both, if there's a reason for that) of the cursors at the end?
> 
> Hmm.  I hadn't thought about that.  In general I wanted the AGFL
> reservations to go last because I'd rather we set off with an underfull
> AGFL than totally fail because we couldn't get blocks for the
> bnobt/cntbt, but I suppose you're right that we could steal from it as
> needed to prevent repair failure.
> 
> So, uh, I could rework this patch to create a phony agfl bulk load
> cursor, fill it before the loop, steal blocks from it to fill the
> bnobt/cntbt to satisfy failed allocations, and then dump any remainders
> into the bnobt/cntbt cursors afterwards.  How does that sound?
> 

Ok.. the whole phony cursor thing sounds a bit unfortunate. I was
thinking we'd just have a reservation counter or some such, but in
reality we'd need that to pass down into the block reservation code to
acquire actual blocks for one, then we'd need new code to allocate said
blocks from the phony agfl cursor rather than the in-core block lists,
right? Perhaps it's not worth doing that if it doesn't reduce complexity
as much as shuffle it around or even add a bit more... :/

I wonder if a reasonable simplification/tradeoff might be to just
refactor the agfl logic in the current patch into a helper function that
1.) calculates the current overflow across both cursors and the current
total agfl "wanted" requirement based on that 2.) performs a single
reservation to try and accommodate on one of the cursors and 3.) adds a
bit more to the comment to explain that we're just overloading the bnobt
cursor (for example) for extra agfl res. Hm?
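
Something along these lines is what the three steps amount to.  It is a
sketch with made-up names (toy_cursor, toy_fill_agfl) and a plain counter
standing in for free space, not code against the actual bt_rebuild
structures.

```c
#include <stdint.h>

struct toy_cursor {
	uint32_t reserved;	/* blocks reserved in this cursor so far */
	uint32_t needed;	/* blocks the btree geometry calls for */
};

/* Blocks held beyond what the btree itself needs. */
static uint32_t
toy_overflow(const struct toy_cursor *c)
{
	return c->reserved > c->needed ? c->reserved - c->needed : 0;
}

/*
 * 1) Add up the overflow across both cursors and work out how many
 *    AGFL blocks are still wanted.
 * 2) Make a single reservation attempt for that amount.
 * 3) Attach whatever we got to the bnobt cursor, which is simply
 *    being overloaded to carry the extra AGFL reservation.
 *
 * Returns the number of AGFL blocks still missing afterwards.
 */
static uint32_t
toy_fill_agfl(struct toy_cursor *bno, const struct toy_cursor *cnt,
	      uint32_t min_agfl_len, uint32_t *free_blocks)
{
	uint32_t overflow = toy_overflow(bno) + toy_overflow(cnt);
	uint32_t wanted, got;

	if (overflow >= min_agfl_len)
		return 0;

	wanted = min_agfl_len - overflow;
	got = wanted < *free_blocks ? wanted : *free_blocks;
	*free_blocks -= got;
	bno->reserved += got;
	return wanted - got;
}
```

A nonzero return would correspond to giving up on filling the AGFL rather
than failing the repair.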

Brian

> --D
> 
> > Brian
> > 
> > >  
> > >  		/* Ok, now how many free space records do we have? */
> > >  		*nr_extents = count_bno_extents_blocks(agno, &num_freeblocks);
> > >  	} while (1);
> > > -
> > > -	*extra_blocks = (bno_blocks - btr_bno->bload.nr_blocks) +
> > > -			(cnt_blocks - btr_cnt->bload.nr_blocks);
> > >  }
> > >  
> > >  /* Rebuild the free space btrees. */
> > > 
> > 
> 



* Re: [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist
  2020-06-30 10:52       ` Brian Foster
@ 2020-07-01 16:19         ` Darrick J. Wong
  0 siblings, 0 replies; 9+ messages in thread
From: Darrick J. Wong @ 2020-07-01 16:19 UTC (permalink / raw)
  To: Brian Foster; +Cc: sandeen, linux-xfs

On Tue, Jun 30, 2020 at 06:52:39AM -0400, Brian Foster wrote:
> On Mon, Jun 29, 2020 at 04:21:40PM -0700, Darrick J. Wong wrote:
> > On Mon, Jun 29, 2020 at 08:22:28AM -0400, Brian Foster wrote:
> > > On Thu, Jun 25, 2020 at 01:52:39PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > In commit 9851fd79bfb1, we added a slight amount of slack to the free
> > > > space btrees being reconstructed so that the initial fix_freelist call
> > > > (which is run against a totally empty AGFL) would never have to split
> > > > either free space btree in order to populate the free list.
> > > > 
> > > > The new btree bulk loading code in xfs_repair can re-create this
> > > > situation because it can set the slack values to zero if the filesystem
> > > > is very full.  However, these days repair has the infrastructure needed
> > > > to ensure that overestimations of the btree block counts end up on the
> > > > AGFL or get freed back into the filesystem at the end of phase 5.
> > > > 
> > > > Fix this problem by reserving blocks to a separate AGFL block
> > > > reservation, and checking that between this new reservation and any
> > > > overages in the bnobt/cntbt fakeroots, we have enough blocks sitting
> > > > around to populate the AGFL with the minimum number of blocks it needs
> > > > to handle a split in the bno/cnt/rmap btrees.
> > > > 
> > > > Note that we reserve blocks for the new bnobt/cntbt/AGFL at the very end
> > > > of the reservation steps in phase 5, so the extra allocation should not
> > > > cause repair to fail if it can't find blocks for btrees.
> > > > 
> > > > Fixes: 9851fd79bfb1 ("repair: AGFL rebuild fails if btree split required")
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  repair/agbtree.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++-------
> > > >  1 file changed, 68 insertions(+), 10 deletions(-)
> > > > 
> > > > 
> > > > diff --git a/repair/agbtree.c b/repair/agbtree.c
> > > > index 339b1489..7a4f316c 100644
> > > > --- a/repair/agbtree.c
> > > > +++ b/repair/agbtree.c
> > > ...
> > > > @@ -262,25 +286,59 @@ _("Unable to compute free space by block btree geometry, error %d.\n"), -error);
> > > ...
> > > > +
> > > > +		/*
> > > > +		 * Now try to fill the bnobt/cntbt cursors with extra blocks to
> > > > +		 * populate the AGFL.  If we don't get all the blocks we want,
> > > > +		 * stop trying to fill the AGFL.
> > > > +		 */
> > > > +		wanted = (int64_t)btr_bno->bload.nr_blocks +
> > > > +				(min_agfl_len / 2) - bno_blocks;
> > > > +		if (wanted > 0 && fill_agfl) {
> > > > +			got = reserve_agblocks(sc->mp, agno, btr_bno, wanted);
> > > > +			if (wanted > got)
> > > > +				fill_agfl = false;
> > > > +			btr_bno->bload.nr_blocks += got;
> > > > +		}
> > > > +
> > > > +		wanted = (int64_t)btr_cnt->bload.nr_blocks +
> > > > +				(min_agfl_len / 2) - cnt_blocks;
> > > > +		if (wanted > 0 && fill_agfl) {
> > > > +			got = reserve_agblocks(sc->mp, agno, btr_cnt, wanted);
> > > > +			if (wanted > got)
> > > > +				fill_agfl = false;
> > > > +			btr_cnt->bload.nr_blocks += got;
> > > > +		}
> > > 
> > > It's a little hard to follow this with the nr_blocks sampling and
> > > whatnot, but I think I get the idea. What's the reason for splitting the
> > > AGFL res requirement evenly across the two cursors? These AGFL blocks
> > > all fall into the same overflow pool, right? I was wondering why we
> > > couldn't just attach the overflow to one, or check one for the full res
> > > and then the other if more blocks are needed.
> > 
> > I chose to stuff the excess blocks into the bnobt and cntbt bulkload
> > cursors to avoid having to initialize a semi-phony "bulkload cursor" for
> > the agfl, and I decided to split them evenly between the two cursors so
> > that I wouldn't someday have to deal with a bug report about how one
> > cursor somehow ran out of blocks but the other one had plenty more.
> > 
> > > In thinking about it a bit more, wouldn't the whole algorithm be
> > > simpler if we reserved the min AGFL requirement first, optionally passed
> > > 'agfl_res' to reserve_btblocks() such that subsequent reservations can
> > > steal from it (and then fail if it depletes), then stuff what's left in
> > > one (or both, if there's a reason for that) of the cursors at the end?
> > 
> > Hmm.  I hadn't thought about that.  In general I wanted the AGFL
> > reservations to go last because I'd rather we set off with an underfull
> > AGFL than totally fail because we couldn't get blocks for the
> > bnobt/cntbt, but I suppose you're right that we could steal from it as
> > needed to prevent repair failure.
> > 
> > So, uh, I could rework this patch to create a phony agfl bulk load
> > cursor, fill it before the loop, steal blocks from it to fill the
> > bnobt/cntbt to satisfy failed allocations, and then dump any remainders
> > into the bnobt/cntbt cursors afterwards.  How does that sound?
> > 
> 
> Ok.. the whole phony cursor thing sounds a bit unfortunate. I was
> thinking we'd just have a reservation counter or some such, but in
> reality we'd need that to pass down into the block reservation code to
> acquire actual blocks for one, then we'd need new code to allocate said
> blocks from the phony agfl cursor rather than the in-core block lists,
> right? Perhaps it's not worth doing that if it doesn't reduce complexity
> as much as shuffle it around or even add a bit more... :/
> 
> I wonder if a reasonable simplification/tradeoff might be to just
> refactor the agfl logic in the current patch into a helper function that
> 1.) calculates the current overflow across both cursors and the current
> total agfl "wanted" requirement based on that, 2.) performs a single
> reservation to try and accommodate it on one of the cursors, and 3.) adds
> a bit more to the comment to explain that we're just overloading the bnobt
> cursor (for example) for extra agfl res. Hm?

<shrug> The current patch more or less does this, albeit without the
explicit helper function in (1), and in (3) it splits the overload
between the two cursors instead of just the bnobt.  I'll see what that
looks like, since I came up with other cleanups for init_freespace_cursors.

--D

> Brian
> 
> > --D
> > 
> > > Brian
> > > 
> > > >  
> > > >  		/* Ok, now how many free space records do we have? */
> > > >  		*nr_extents = count_bno_extents_blocks(agno, &num_freeblocks);
> > > >  	} while (1);
> > > > -
> > > > -	*extra_blocks = (bno_blocks - btr_bno->bload.nr_blocks) +
> > > > -			(cnt_blocks - btr_cnt->bload.nr_blocks);
> > > >  }
> > > >  
> > > >  /* Rebuild the free space btrees. */
> > > > 
> > > 
> > 
> 


end of thread, other threads:[~2020-07-01 16:20 UTC | newest]

Thread overview: 9+ messages
2020-06-25 20:52 [PATCH 0/2] xfs_repair: more fixes Darrick J. Wong
2020-06-25 20:52 ` [PATCH 1/2] xfs_repair: complain about ag header crc errors Darrick J. Wong
2020-06-29 12:20   ` Brian Foster
2020-06-29 23:11     ` Darrick J. Wong
2020-06-25 20:52 ` [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist Darrick J. Wong
2020-06-29 12:22   ` Brian Foster
2020-06-29 23:21     ` Darrick J. Wong
2020-06-30 10:52       ` Brian Foster
2020-07-01 16:19         ` Darrick J. Wong
