From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: david@fromorbit.com, linux-xfs@vger.kernel.org,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 48/63] xfs: preallocate blocks for worst-case btree expansion
Date: Wed, 7 Dec 2016 06:53:24 -0500
Message-ID: <20161207115323.GA23106@bfoster.bfoster>
In-Reply-To: <20161206193229.GF8436@birch.djwong.org>

On Tue, Dec 06, 2016 at 11:32:29AM -0800, Darrick J. Wong wrote:
> On Wed, Oct 12, 2016 at 06:42:36PM -0400, Brian Foster wrote:
> > On Wed, Oct 12, 2016 at 01:52:57PM -0700, Darrick J. Wong wrote:
> > > On Wed, Oct 12, 2016 at 02:44:51PM -0400, Brian Foster wrote:
> > > > On Thu, Sep 29, 2016 at 08:10:52PM -0700, Darrick J. Wong wrote:
> > > > > To gracefully handle the situation where a CoW operation turns a
> > > > > single refcount extent into a lot of tiny ones and then runs out of
> > > > > space when a tree split has to happen, use the per-AG reserved block
> > > > > pool to pre-allocate all the space we'll ever need for a maximal
> > > > > btree.  For a 4K block size, this only costs an overhead of 0.3% of
> > > > > available disk space.
> > > > > 
> > > > > When reflink is enabled, we have an unfortunate problem with rmap --
> > > > > since we can share a block billions of times, this means that the
> > > > > reverse mapping btree can expand basically infinitely.  When an AG is
> > > > > so full that there are no free blocks with which to expand the rmapbt,
> > > > > the filesystem will shut down hard.
> > > > > 
> > > > > This is rather annoying to the user, so use the AG reservation code to
> > > > > reserve a "reasonable" amount of space for rmap.  We'll prevent
> > > > > reflinks and CoW operations if we think we're getting close to
> > > > > exhausting an AG's free space rather than shutting down, but this
> > > > > permanent reservation should be enough for "most" users.  Hopefully.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > [hch@lst.de: ensure that we invalidate the freed btree buffer]
> > > > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > > > ---
> > > > > v2: Simplify the return value from xfs_perag_pool_free_block to a bool
> > > > > so that we can easily call xfs_trans_binval for both the per-AG pool
> > > > > and the real freeing case.  Without this we fail to invalidate the
> > > > > btree buffer and will trip over the write verifier on a shrinking
> > > > > refcount btree.
> > > > > 
> > > > > v3: Convert to the new per-AG reservation code.
> > > > > 
> > > > > v4: Combine this patch with the one that adds the rmapbt reservation,
> > > > > since the rmapbt reservation is only needed for reflink filesystems.
> > > > > 
> > > > > v5: If we detect errors while counting the refcount or rmap btrees,
> > > > > shut down the filesystem to avoid the scenario where the fs shuts down
> > > > > mid-transaction due to btree corruption, repair refuses to run until
> > > > > the log is clean, and the log cannot be cleaned because replay hits
> > > > > btree corruption and shuts down.
> > > > > ---
> > > > >  fs/xfs/libxfs/xfs_ag_resv.c        |   11 ++++++
> > > > >  fs/xfs/libxfs/xfs_refcount_btree.c |   45 ++++++++++++++++++++++++-
> > > > >  fs/xfs/libxfs/xfs_refcount_btree.h |    3 ++
> > > > >  fs/xfs/libxfs/xfs_rmap_btree.c     |   60 ++++++++++++++++++++++++++++++++++
> > > > >  fs/xfs/libxfs/xfs_rmap_btree.h     |    7 ++++
> > > > >  fs/xfs/xfs_fsops.c                 |   64 ++++++++++++++++++++++++++++++++++++
> > > > >  fs/xfs/xfs_fsops.h                 |    3 ++
> > > > >  fs/xfs/xfs_mount.c                 |    8 +++++
> > > > >  fs/xfs/xfs_super.c                 |   12 +++++++
> > > > >  9 files changed, 210 insertions(+), 3 deletions(-)
> > > > > 
> > > > > 
> > > > > diff --git a/fs/xfs/libxfs/xfs_ag_resv.c b/fs/xfs/libxfs/xfs_ag_resv.c
> > > > > index e3ae0f2..adf770f 100644
> > > > > --- a/fs/xfs/libxfs/xfs_ag_resv.c
> > > > > +++ b/fs/xfs/libxfs/xfs_ag_resv.c
> > > > > @@ -38,6 +38,7 @@
> > > > >  #include "xfs_trans_space.h"
> > > > >  #include "xfs_rmap_btree.h"
> > > > >  #include "xfs_btree.h"
> > > > > +#include "xfs_refcount_btree.h"
> > > > >  
> > > > >  /*
> > > > >   * Per-AG Block Reservations
> > > > > @@ -228,6 +229,11 @@ xfs_ag_resv_init(
> > > > >  	if (pag->pag_meta_resv.ar_asked == 0) {
> > > > >  		ask = used = 0;
> > > > >  
> > > > > +		error = xfs_refcountbt_calc_reserves(pag->pag_mount,
> > > > > +				pag->pag_agno, &ask, &used);
> > > > > +		if (error)
> > > > > +			goto out;
> > > > > +
> > > > >  		error = __xfs_ag_resv_init(pag, XFS_AG_RESV_METADATA,
> > > > >  				ask, used);
> > > > 
> > > > Now that I get here, I see we have these per-ag reservation structures
> > > > and whatnot, but __xfs_ag_resv_init() (from a previous patch) calls
> > > > xfs_mod_fdblocks() for the reservation. AFAICT, that reserves from the
> > > > "global pool." Based on the commit log, isn't the intent here to reserve
> > > > blocks within each AG? What am I missing?
> > > 
> > > The AG reservation code "reserves" blocks in each AG by hiding them from
> > > the allocator.  They're all still there in the bnobt, but we underreport
> > > the length of the longest free extent and the free block count in that
> > > AG to make it look like there's less free space than there is.  Since
> > > those blocks are no longer generally available, we have to decrease the
> > > in-core free block count so we can't create delalloc reservations that
> > > the allocator won't (or can't) satisfy.
> > > 
> > 
> > Yep, I think I get the idea/purpose in principle. It sounds similar to
> > the global reserve pool, where we set aside a count of unallocated blocks
> > via accounting magic such that we have some available in cases such as
> > the need to allocate a block to free an extent in low free space
> > conditions.
> 
> Correct.
> 
> > In this case, it looks like we reserve blocks in the same manner (via
> > xfs_mod_fdblocks()) and record the reservation in a new per-ag
> > reservation structure. The part I'm missing is how we guarantee those
> > blocks are accessible in the particular AG (or am I entirely mistaken
> > about the requirement that the per-AG reservation must reside within
> > that specific AG?).
> 
> You're correct there too.
> 
> > An example might clarify where my confusion lies... suppose we have a
> > non-standard configuration with a 1TB ag size and just barely enough
> > total filesystem size for a second AG, e.g., we have two AGs where AG 0
> > is 1TB and AG 1 is 16MB. Suppose that the reservation requirement (for
> > the sake of example, at least) based on sb_agblocks is larger than the
> > entire size of AG 1. Yet, the xfs_mod_fdblocks() call for the AG 1 res
> > struct will apparently succeed because there are plenty of blocks in
> > mp->m_fdblocks. Unless I'm mistaken, shouldn't it be impossible to reserve
> > this many blocks out of AG 1?
> 
> You're right, that is a bug.  We /ought/ to be calculating the
> reservation ask based on agf_length, not sb_agblocks.  I'll also have to
> fix growfs to change the reservation if the length of the last AG
> changes.
> 

Yep, makes sense.
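
To put numbers on the example above, here is a small sketch in C. It assumes 4K blocks and models only the "1% of the AG" term of the rmapbt ask from the patch (the real ask also takes the max with the worst-case btree size). The helper names resv_ask_blocks() and excess_ask() are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the "1% of the AG" part of the reservation ask.
 * The worst-case btree size term from the patch is deliberately left out.
 */
static uint32_t resv_ask_blocks(uint32_t ag_len_blocks)
{
	return ag_len_blocks / 100;
}

/*
 * How many blocks too large the ask is when it is sized from the nominal
 * AG size (sb_agblocks) instead of the AG's actual length (agf_length).
 */
static uint32_t excess_ask(uint32_t sb_agblocks, uint32_t agf_length)
{
	/* The last AG may be shorter than the nominal size, never longer. */
	assert(agf_length <= sb_agblocks);

	return resv_ask_blocks(sb_agblocks) - resv_ask_blocks(agf_length);
}
```

With a 1TB AG 0 (268435456 blocks) and a 16MB AG 1 (4096 blocks), sizing from sb_agblocks asks AG 1 for 2684354 blocks, far more than it contains, while sizing from its actual length asks for 40.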

IMO it would also be nice to see some kind of assertion at reservation
time that the AG can actually honor the reservation, since IIUC that
should always be enforced to be true (whether that be DEBUG code or a
simple warning or whatever... just a thought).
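
A rough sketch of what such a check could look like, in C. Everything here is hypothetical: struct ag_state and both functions are illustrative names, not kernel code, and the sketch assumes that only the part of the ask not already consumed by the btree (ask minus used) has to fit in the AG's free space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one AG's accounting. */
struct ag_state {
	uint32_t free_blocks;	/* free blocks currently in this AG */
	uint32_t resv_asked;	/* total blocks the reservation wants */
	uint32_t resv_used;	/* blocks the btree already occupies */
};

/*
 * Assumption: the reservation only has to hide the blocks that the btree
 * has not consumed yet, so the AG can honor it if that remainder fits in
 * the AG's own free space.
 */
static bool ag_can_honor_resv(const struct ag_state *ag)
{
	uint32_t still_needed;

	if (ag->resv_used >= ag->resv_asked)
		return true;

	still_needed = ag->resv_asked - ag->resv_used;
	return still_needed <= ag->free_blocks;
}

/* DEBUG-style check to run at the time the reservation is made. */
static void ag_resv_check(const struct ag_state *ag)
{
	assert(ag_can_honor_resv(ag));
}
```

A warning instead of an assert would work the same way; the point is only that the comparison is made against the individual AG rather than the global free block count.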

> > Even in the case where AG 1 is large enough for the reservation, what
> > actually prevents a sequence of single block allocations from using all
> > of the space in the AG? 
> 
> AFAICT, the allocator picks an AG and tries to fix the freelist before
> allocating blocks.  As part of ensuring the AGFL, we call
> xfs_alloc_space_available to decide if there's enough space in the AG
> both to satisfy the allocation request and to fix the freelist.
> 
> _a_s_a starts by determining the number of blocks that have to stay
> reserved in that AG for the given allocation type.  Then it calls
> xfs_alloc_longest_free_extent to find the longest free extent in the AG.
> 
> _a_l_f_e finds the longest extent and subtracts whatever part of the AG
> reservation it can't satisfy out of the non-longest free extents.
> 
> Upon returning from _a_l_f_e, _a_s_a rejects the allocation if the
> longest extent cannot satisfy the required minimum allocation with the
> given alignment constraints.
> 
> Next it calculates the space that would remain after the allocation,
> which is:
> 
> (free space + agfl blocks) - (ag reservation) - (minimum agfl length) -
>      (total blocks requested)
> 

Ah, OK. I think I missed that this calculation was tweaked, presumably
because it wasn't changed in this patch (granted, this is an old
series). Thus I didn't see how the reservation was ultimately enforced
on a particular AG. Makes sense now, thanks for the explanation!
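
For reference, the quoted calculation and the rejection rule that follows it in the quoted text (less than zero, or less than args->minleft) can be sketched in C roughly as below. The struct and function names are hypothetical and the fields are a simplification of what the allocator actually tracks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs to the per-AG space check. */
struct alloc_check {
	int64_t free_blocks;	/* free blocks recorded for the AG */
	int64_t agfl_blocks;	/* blocks currently on the AGFL */
	int64_t ag_resv;	/* blocks held back by the AG reservation */
	int64_t min_agfl;	/* minimum AGFL length */
	int64_t total;		/* total blocks requested */
	int64_t minleft;	/* blocks that must remain afterwards */
};

/*
 * (free space + agfl blocks) - (ag reservation) - (minimum agfl length) -
 *     (total blocks requested)
 */
static int64_t blocks_remaining(const struct alloc_check *c)
{
	return (c->free_blocks + c->agfl_blocks) - c->ag_resv -
	       c->min_agfl - c->total;
}

/* Reject if the remainder is negative or smaller than minleft. */
static bool alloc_allowed(const struct alloc_check *c)
{
	int64_t remaining = blocks_remaining(c);

	assert(c->minleft >= 0);
	return remaining >= 0 && remaining >= c->minleft;
}
```

In this model, a run of single block allocations keeps shrinking free_blocks until the remainder goes negative, at which point further requests are refused even though free extents still exist in the AG.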

Brian

> If this quantity is less than zero (or less than args->minleft) then the
> allocation is also rejected.  I believe this should be sufficient to
> prevent a series of single block alloc requests from exhausting the AG
> since we're stopped from giving away reserved blocks that we're not
> entitled to, even if there are still records in the bnobt.
> 
> --D
> 
> > 
> > Brian
> > 
> > > Maybe a more concrete way to put that is: say we have 4 AGs with 4 agresv
> > > blocks each, and no other free space left anywhere.  The in-core fdblocks count
> > > should be 0 so that starting a write into a hole returns ENOSPC even if the
> > > write could be done without any btree shape changes.   Otherwise, writepages
> > > tries to allocate the delalloc reservation, fails to find any space because
> > > we've hidden it, and kaboom.
> > > 
> > > --D
> > > 
> > > > 
> > > > Brian
> > > > 
> > > > >  		if (error)
> > > > > @@ -238,6 +244,11 @@ xfs_ag_resv_init(
> > > > >  	if (pag->pag_agfl_resv.ar_asked == 0) {
> > > > >  		ask = used = 0;
> > > > >  
> > > > > +		error = xfs_rmapbt_calc_reserves(pag->pag_mount, pag->pag_agno,
> > > > > +				&ask, &used);
> > > > > +		if (error)
> > > > > +			goto out;
> > > > > +
> > > > >  		error = __xfs_ag_resv_init(pag, XFS_AG_RESV_AGFL, ask, used);
> > > > >  		if (error)
> > > > >  			goto out;
> > > > > diff --git a/fs/xfs/libxfs/xfs_refcount_btree.c b/fs/xfs/libxfs/xfs_refcount_btree.c
> > > > > index 6b5e82b9..453bb27 100644
> > > > > --- a/fs/xfs/libxfs/xfs_refcount_btree.c
> > > > > +++ b/fs/xfs/libxfs/xfs_refcount_btree.c
> > > > > @@ -79,6 +79,8 @@ xfs_refcountbt_alloc_block(
> > > > >  	struct xfs_alloc_arg	args;		/* block allocation args */
> > > > >  	int			error;		/* error return value */
> > > > >  
> > > > > +	XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY);
> > > > > +
> > > > >  	memset(&args, 0, sizeof(args));
> > > > >  	args.tp = cur->bc_tp;
> > > > >  	args.mp = cur->bc_mp;
> > > > > @@ -88,6 +90,7 @@ xfs_refcountbt_alloc_block(
> > > > >  	args.firstblock = args.fsbno;
> > > > >  	xfs_rmap_ag_owner(&args.oinfo, XFS_RMAP_OWN_REFC);
> > > > >  	args.minlen = args.maxlen = args.prod = 1;
> > > > > +	args.resv = XFS_AG_RESV_METADATA;
> > > > >  
> > > > >  	error = xfs_alloc_vextent(&args);
> > > > >  	if (error)
> > > > > @@ -125,16 +128,19 @@ xfs_refcountbt_free_block(
> > > > >  	struct xfs_agf		*agf = XFS_BUF_TO_AGF(agbp);
> > > > >  	xfs_fsblock_t		fsbno = XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(bp));
> > > > >  	struct xfs_owner_info	oinfo;
> > > > > +	int			error;
> > > > >  
> > > > >  	trace_xfs_refcountbt_free_block(cur->bc_mp, cur->bc_private.a.agno,
> > > > >  			XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno), 1);
> > > > >  	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_REFC);
> > > > >  	be32_add_cpu(&agf->agf_refcount_blocks, -1);
> > > > >  	xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_REFCOUNT_BLOCKS);
> > > > > -	xfs_bmap_add_free(mp, cur->bc_private.a.dfops, fsbno, 1,
> > > > > -			&oinfo);
> > > > > +	error = xfs_free_extent(cur->bc_tp, fsbno, 1, &oinfo,
> > > > > +			XFS_AG_RESV_METADATA);
> > > > > +	if (error)
> > > > > +		return error;
> > > > >  
> > > > > -	return 0;
> > > > > +	return error;
> > > > >  }
> > > > >  
> > > > >  STATIC int
> > > > > @@ -410,3 +416,36 @@ xfs_refcountbt_max_size(
> > > > >  
> > > > >  	return xfs_refcountbt_calc_size(mp, mp->m_sb.sb_agblocks);
> > > > >  }
> > > > > +
> > > > > +/*
> > > > > + * Figure out how many blocks to reserve and how many are used by this btree.
> > > > > + */
> > > > > +int
> > > > > +xfs_refcountbt_calc_reserves(
> > > > > +	struct xfs_mount	*mp,
> > > > > +	xfs_agnumber_t		agno,
> > > > > +	xfs_extlen_t		*ask,
> > > > > +	xfs_extlen_t		*used)
> > > > > +{
> > > > > +	struct xfs_buf		*agbp;
> > > > > +	struct xfs_agf		*agf;
> > > > > +	xfs_extlen_t		tree_len;
> > > > > +	int			error;
> > > > > +
> > > > > +	if (!xfs_sb_version_hasreflink(&mp->m_sb))
> > > > > +		return 0;
> > > > > +
> > > > > +	*ask += xfs_refcountbt_max_size(mp);
> > > > > +
> > > > > +	error = xfs_alloc_read_agf(mp, NULL, agno, 0, &agbp);
> > > > > +	if (error)
> > > > > +		return error;
> > > > > +
> > > > > +	agf = XFS_BUF_TO_AGF(agbp);
> > > > > +	tree_len = be32_to_cpu(agf->agf_refcount_blocks);
> > > > > +	xfs_buf_relse(agbp);
> > > > > +
> > > > > +	*used += tree_len;
> > > > > +
> > > > > +	return error;
> > > > > +}
> > > > > diff --git a/fs/xfs/libxfs/xfs_refcount_btree.h b/fs/xfs/libxfs/xfs_refcount_btree.h
> > > > > index 780b02f..3be7768 100644
> > > > > --- a/fs/xfs/libxfs/xfs_refcount_btree.h
> > > > > +++ b/fs/xfs/libxfs/xfs_refcount_btree.h
> > > > > @@ -68,4 +68,7 @@ extern xfs_extlen_t xfs_refcountbt_calc_size(struct xfs_mount *mp,
> > > > >  		unsigned long long len);
> > > > >  extern xfs_extlen_t xfs_refcountbt_max_size(struct xfs_mount *mp);
> > > > >  
> > > > > +extern int xfs_refcountbt_calc_reserves(struct xfs_mount *mp,
> > > > > +		xfs_agnumber_t agno, xfs_extlen_t *ask, xfs_extlen_t *used);
> > > > > +
> > > > >  #endif	/* __XFS_REFCOUNT_BTREE_H__ */
> > > > > diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c
> > > > > index 9c0585e..83e672f 100644
> > > > > --- a/fs/xfs/libxfs/xfs_rmap_btree.c
> > > > > +++ b/fs/xfs/libxfs/xfs_rmap_btree.c
> > > > > @@ -35,6 +35,7 @@
> > > > >  #include "xfs_cksum.h"
> > > > >  #include "xfs_error.h"
> > > > >  #include "xfs_extent_busy.h"
> > > > > +#include "xfs_ag_resv.h"
> > > > >  
> > > > >  /*
> > > > >   * Reverse map btree.
> > > > > @@ -533,3 +534,62 @@ xfs_rmapbt_compute_maxlevels(
> > > > >  		mp->m_rmap_maxlevels = xfs_btree_compute_maxlevels(mp,
> > > > >  				mp->m_rmap_mnr, mp->m_sb.sb_agblocks);
> > > > >  }
> > > > > +
> > > > > +/* Calculate the rmap btree size for some records. */
> > > > > +xfs_extlen_t
> > > > > +xfs_rmapbt_calc_size(
> > > > > +	struct xfs_mount	*mp,
> > > > > +	unsigned long long	len)
> > > > > +{
> > > > > +	return xfs_btree_calc_size(mp, mp->m_rmap_mnr, len);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Calculate the maximum rmap btree size.
> > > > > + */
> > > > > +xfs_extlen_t
> > > > > +xfs_rmapbt_max_size(
> > > > > +	struct xfs_mount	*mp)
> > > > > +{
> > > > > +	/* Bail out if we're uninitialized, which can happen in mkfs. */
> > > > > +	if (mp->m_rmap_mxr[0] == 0)
> > > > > +		return 0;
> > > > > +
> > > > > +	return xfs_rmapbt_calc_size(mp, mp->m_sb.sb_agblocks);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Figure out how many blocks to reserve and how many are used by this btree.
> > > > > + */
> > > > > +int
> > > > > +xfs_rmapbt_calc_reserves(
> > > > > +	struct xfs_mount	*mp,
> > > > > +	xfs_agnumber_t		agno,
> > > > > +	xfs_extlen_t		*ask,
> > > > > +	xfs_extlen_t		*used)
> > > > > +{
> > > > > +	struct xfs_buf		*agbp;
> > > > > +	struct xfs_agf		*agf;
> > > > > +	xfs_extlen_t		pool_len;
> > > > > +	xfs_extlen_t		tree_len;
> > > > > +	int			error;
> > > > > +
> > > > > +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> > > > > +		return 0;
> > > > > +
> > > > > +	/* Reserve 1% of the AG or enough for 1 block per record. */
> > > > > +	pool_len = max(mp->m_sb.sb_agblocks / 100, xfs_rmapbt_max_size(mp));
> > > > > +	*ask += pool_len;
> > > > > +
> > > > > +	error = xfs_alloc_read_agf(mp, NULL, agno, 0, &agbp);
> > > > > +	if (error)
> > > > > +		return error;
> > > > > +
> > > > > +	agf = XFS_BUF_TO_AGF(agbp);
> > > > > +	tree_len = be32_to_cpu(agf->agf_rmap_blocks);
> > > > > +	xfs_buf_relse(agbp);
> > > > > +
> > > > > +	*used += tree_len;
> > > > > +
> > > > > +	return error;
> > > > > +}
> > > > > diff --git a/fs/xfs/libxfs/xfs_rmap_btree.h b/fs/xfs/libxfs/xfs_rmap_btree.h
> > > > > index e73a553..2a9ac47 100644
> > > > > --- a/fs/xfs/libxfs/xfs_rmap_btree.h
> > > > > +++ b/fs/xfs/libxfs/xfs_rmap_btree.h
> > > > > @@ -58,4 +58,11 @@ struct xfs_btree_cur *xfs_rmapbt_init_cursor(struct xfs_mount *mp,
> > > > >  int xfs_rmapbt_maxrecs(struct xfs_mount *mp, int blocklen, int leaf);
> > > > >  extern void xfs_rmapbt_compute_maxlevels(struct xfs_mount *mp);
> > > > >  
> > > > > +extern xfs_extlen_t xfs_rmapbt_calc_size(struct xfs_mount *mp,
> > > > > +		unsigned long long len);
> > > > > +extern xfs_extlen_t xfs_rmapbt_max_size(struct xfs_mount *mp);
> > > > > +
> > > > > +extern int xfs_rmapbt_calc_reserves(struct xfs_mount *mp,
> > > > > +		xfs_agnumber_t agno, xfs_extlen_t *ask, xfs_extlen_t *used);
> > > > > +
> > > > >  #endif	/* __XFS_RMAP_BTREE_H__ */
> > > > > diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> > > > > index 3acbf4e0..93d12fa 100644
> > > > > --- a/fs/xfs/xfs_fsops.c
> > > > > +++ b/fs/xfs/xfs_fsops.c
> > > > > @@ -43,6 +43,7 @@
> > > > >  #include "xfs_log.h"
> > > > >  #include "xfs_filestream.h"
> > > > >  #include "xfs_rmap.h"
> > > > > +#include "xfs_ag_resv.h"
> > > > >  
> > > > >  /*
> > > > >   * File system operations
> > > > > @@ -630,6 +631,11 @@ xfs_growfs_data_private(
> > > > >  	xfs_set_low_space_thresholds(mp);
> > > > >  	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
> > > > >  
> > > > > +	/* Reserve AG metadata blocks. */
> > > > > +	error = xfs_fs_reserve_ag_blocks(mp);
> > > > > +	if (error && error != -ENOSPC)
> > > > > +		goto out;
> > > > > +
> > > > >  	/* update secondary superblocks. */
> > > > >  	for (agno = 1; agno < nagcount; agno++) {
> > > > >  		error = 0;
> > > > > @@ -680,6 +686,8 @@ xfs_growfs_data_private(
> > > > >  			continue;
> > > > >  		}
> > > > >  	}
> > > > > +
> > > > > + out:
> > > > >  	return saved_error ? saved_error : error;
> > > > >  
> > > > >   error0:
> > > > > @@ -989,3 +997,59 @@ xfs_do_force_shutdown(
> > > > >  	"Please umount the filesystem and rectify the problem(s)");
> > > > >  	}
> > > > >  }
> > > > > +
> > > > > +/*
> > > > > + * Reserve free space for per-AG metadata.
> > > > > + */
> > > > > +int
> > > > > +xfs_fs_reserve_ag_blocks(
> > > > > +	struct xfs_mount	*mp)
> > > > > +{
> > > > > +	xfs_agnumber_t		agno;
> > > > > +	struct xfs_perag	*pag;
> > > > > +	int			error = 0;
> > > > > +	int			err2;
> > > > > +
> > > > > +	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
> > > > > +		pag = xfs_perag_get(mp, agno);
> > > > > +		err2 = xfs_ag_resv_init(pag);
> > > > > +		xfs_perag_put(pag);
> > > > > +		if (err2 && !error)
> > > > > +			error = err2;
> > > > > +	}
> > > > > +
> > > > > +	if (error && error != -ENOSPC) {
> > > > > +		xfs_warn(mp,
> > > > > +	"Error %d reserving per-AG metadata reserve pool.", error);
> > > > > +		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> > > > > +	}
> > > > > +
> > > > > +	return error;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Free space reserved for per-AG metadata.
> > > > > + */
> > > > > +int
> > > > > +xfs_fs_unreserve_ag_blocks(
> > > > > +	struct xfs_mount	*mp)
> > > > > +{
> > > > > +	xfs_agnumber_t		agno;
> > > > > +	struct xfs_perag	*pag;
> > > > > +	int			error = 0;
> > > > > +	int			err2;
> > > > > +
> > > > > +	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
> > > > > +		pag = xfs_perag_get(mp, agno);
> > > > > +		err2 = xfs_ag_resv_free(pag);
> > > > > +		xfs_perag_put(pag);
> > > > > +		if (err2 && !error)
> > > > > +			error = err2;
> > > > > +	}
> > > > > +
> > > > > +	if (error)
> > > > > +		xfs_warn(mp,
> > > > > +	"Error %d freeing per-AG metadata reserve pool.", error);
> > > > > +
> > > > > +	return error;
> > > > > +}
> > > > > diff --git a/fs/xfs/xfs_fsops.h b/fs/xfs/xfs_fsops.h
> > > > > index f32713f..f349158 100644
> > > > > --- a/fs/xfs/xfs_fsops.h
> > > > > +++ b/fs/xfs/xfs_fsops.h
> > > > > @@ -26,4 +26,7 @@ extern int xfs_reserve_blocks(xfs_mount_t *mp, __uint64_t *inval,
> > > > >  				xfs_fsop_resblks_t *outval);
> > > > >  extern int xfs_fs_goingdown(xfs_mount_t *mp, __uint32_t inflags);
> > > > >  
> > > > > +extern int xfs_fs_reserve_ag_blocks(struct xfs_mount *mp);
> > > > > +extern int xfs_fs_unreserve_ag_blocks(struct xfs_mount *mp);
> > > > > +
> > > > >  #endif	/* __XFS_FSOPS_H__ */
> > > > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> > > > > index caecbd2..b5da81d 100644
> > > > > --- a/fs/xfs/xfs_mount.c
> > > > > +++ b/fs/xfs/xfs_mount.c
> > > > > @@ -986,10 +986,17 @@ xfs_mountfs(
> > > > >  			xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> > > > >  			goto out_quota;
> > > > >  		}
> > > > > +
> > > > > +		/* Reserve AG blocks for future btree expansion. */
> > > > > +		error = xfs_fs_reserve_ag_blocks(mp);
> > > > > +		if (error && error != -ENOSPC)
> > > > > +			goto out_agresv;
> > > > >  	}
> > > > >  
> > > > >  	return 0;
> > > > >  
> > > > > + out_agresv:
> > > > > +	xfs_fs_unreserve_ag_blocks(mp);
> > > > >   out_quota:
> > > > >  	xfs_qm_unmount_quotas(mp);
> > > > >   out_rtunmount:
> > > > > @@ -1034,6 +1041,7 @@ xfs_unmountfs(
> > > > >  
> > > > >  	cancel_delayed_work_sync(&mp->m_eofblocks_work);
> > > > >  
> > > > > +	xfs_fs_unreserve_ag_blocks(mp);
> > > > >  	xfs_qm_unmount_quotas(mp);
> > > > >  	xfs_rtunmount_inodes(mp);
> > > > >  	IRELE(mp->m_rootip);
> > > > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > > > > index e6aaa91..875ab9f 100644
> > > > > --- a/fs/xfs/xfs_super.c
> > > > > +++ b/fs/xfs/xfs_super.c
> > > > > @@ -1315,10 +1315,22 @@ xfs_fs_remount(
> > > > >  			xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> > > > >  			return error;
> > > > >  		}
> > > > > +
> > > > > +		/* Create the per-AG metadata reservation pool. */
> > > > > +		error = xfs_fs_reserve_ag_blocks(mp);
> > > > > +		if (error && error != -ENOSPC)
> > > > > +			return error;
> > > > >  	}
> > > > >  
> > > > >  	/* rw -> ro */
> > > > >  	if (!(mp->m_flags & XFS_MOUNT_RDONLY) && (*flags & MS_RDONLY)) {
> > > > > +		/* Free the per-AG metadata reservation pool. */
> > > > > +		error = xfs_fs_unreserve_ag_blocks(mp);
> > > > > +		if (error) {
> > > > > +			xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> > > > > +			return error;
> > > > > +		}
> > > > > +
> > > > >  		/*
> > > > >  		 * Before we sync the metadata, we need to free up the reserve
> > > > >  		 * block pool so that the used block count in the superblock on
> > > > > 
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html

2016-09-30  8:26   ` Christoph Hellwig
2016-09-30  3:12 ` [PATCH 59/63] xfs: simulate per-AG reservations being critically low Darrick J. Wong
2016-09-30  8:27   ` Christoph Hellwig
2016-09-30  3:12 ` [PATCH 60/63] xfs: recognize the reflink feature bit Darrick J. Wong
2016-09-30  8:27   ` Christoph Hellwig
2016-09-30  3:12 ` [PATCH 61/63] xfs: various swapext cleanups Darrick J. Wong
2016-09-30  8:28   ` Christoph Hellwig
2016-09-30  3:12 ` [PATCH 62/63] xfs: refactor swapext code Darrick J. Wong
2016-09-30  8:28   ` Christoph Hellwig
2016-09-30  3:12 ` [PATCH 63/63] xfs: implement swapext for rmap filesystems Darrick J. Wong
2016-09-30  9:00 ` [PATCH v10 00/63] xfs: add reflink and dedupe support Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2016-09-28  2:53 [PATCH v9 " Darrick J. Wong
2016-09-28  2:58 ` [PATCH 48/63] xfs: preallocate blocks for worst-case btree expansion Darrick J. Wong
