From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH v2 04/11] xfs: CoW fork operations should only update quota reservations
Date: Fri, 26 Jan 2018 10:40:29 -0800
Message-ID: <20180126184029.GX9068@magnolia>
In-Reply-To: <20180126130215.GA47923@bfoster.bfoster>

On Fri, Jan 26, 2018 at 08:02:16AM -0500, Brian Foster wrote:
> On Thu, Jan 25, 2018 at 10:20:03AM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 25, 2018 at 08:03:53AM -0500, Brian Foster wrote:
> > > On Wed, Jan 24, 2018 at 05:20:35PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Since the CoW fork only exists in memory, it is incorrect to update the
> > > > on-disk quota block counts when we modify the CoW fork.  Unlike the data
> > > > fork, even real extents in the CoW fork are only reservations (on-disk
> > > > they're owned by the refcountbt) so they must not be tracked in the on
> > > > disk quota info.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > > v2: make documentation more crisp and to the point
> > > > ---
> > > >  fs/xfs/libxfs/xfs_bmap.c |  118 ++++++++++++++++++++++++++++++++++++++++++----
> > > >  fs/xfs/xfs_quota.h       |   14 ++++-
> > > >  fs/xfs/xfs_reflink.c     |    8 ++-
> > > >  3 files changed, 122 insertions(+), 18 deletions(-)
> > > > 
> ...
> > > > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> > > > index 82abff6..e367351 100644
> > > > --- a/fs/xfs/xfs_reflink.c
> > > > +++ b/fs/xfs/xfs_reflink.c
> > > > @@ -599,10 +599,6 @@ xfs_reflink_cancel_cow_blocks(
> > > >  					del.br_startblock, del.br_blockcount,
> > > >  					NULL);
> > > >  
> > > > -			/* Update quota accounting */
> > > > -			xfs_trans_mod_dquot_byino(*tpp, ip, XFS_TRANS_DQ_BCOUNT,
> > > > -					-(long)del.br_blockcount);
> > > > -
> > > >  			/* Roll the transaction */
> > > >  			xfs_defer_ijoin(&dfops, ip);
> > > >  			error = xfs_defer_finish(tpp, &dfops);
> > > > @@ -795,6 +791,10 @@ xfs_reflink_end_cow(
> > > >  		if (error)
> > > >  			goto out_defer;
> > > >  
> > > > +		/* Charge this new data fork mapping to the on-disk quota. */
> > > > +		xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT,
> > > > +				(long)del.br_blockcount);
> > > > +
> > > 
> > > Should this technically be XFS_TRANS_DQ_DELBCOUNT? The blocks obviously
> > > aren't delalloc and this transaction doesn't make a quota reservation so
> > > I don't think it screws up accounting. But if the transaction did make a
> > > quota reservation, it seems like this would account the extent against
> > > the tx reservation where it instead should recognize that cow blocks
> > > have already been reserved (which is essentially what DELBCOUNT means,
> > > IIUC).
> > 
> > Hmmm, there's a subtlety here -- we're opencoding what DELBCOUNT does,
> > because the subsequent xfs_bmap_del_extent_cow unconditionally reduces
> > the in-core reservation after we've mapped in the extent as if it had
> > been accounted as a real extent all along.  But considering all the
> > blather about how cow fork blocks are treated as incore reservations, it
> > does look funny, doesn't it?
> > 
> 
> Ok.. I missed that the end/del cases were tied together, then reconfused
> myself over the accounting in the end_cow() path (re: our irc chat
> yesterday) when reassessing that bit. So to reset my brain, we have the
> following with this current patch:
> 
> - cow reserve does a delalloc and in-core dquot reservation
> - cow real alloc either skips dquot adjustment if wasdel, else reduces
>   the quota res acquired by the transaction by the size of the alloc[1].
>   Either way we leave around an in-core quota reservation as if the blocks
>   remained delalloc.
> - A cancel at this point simply kills the in-core dquot reservation
>   along with the cow fork blocks.
> - end_cow() unmaps the current data fork blocks and decrements
>   associated real quota usage (tx), remaps the cow blocks and increments
>   real quota usage (tx), then kills off the in-core dquot reservation.

Correct.
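
As a rough illustration of the sequence listed above, here is a toy
userspace model. It is a sketch under stated assumptions, not kernel code:
struct toy_dquot, struct toy_tx and every toy_* function are hypothetical
names made up for this example, and the "real alloc" step is left out
because in this model it leaves both counters unchanged.

```c
#include <assert.h>

/* Toy model only; these are hypothetical names, not the XFS structures. */
struct toy_dquot {
	long ondisk_bcount;	/* blocks charged in the on-disk dquot */
	long incore_res;	/* real usage plus outstanding reservations */
};

struct toy_tx {
	long bcount_delta;	/* stands in for XFS_TRANS_DQ_BCOUNT updates */
};

/* CoW reserve: take an in-core reservation, touch nothing on disk. */
void toy_cow_reserve(struct toy_dquot *dq, long nblocks)
{
	dq->incore_res += nblocks;
}

/* Cancel: kill the in-core reservation along with the CoW fork blocks. */
void toy_cow_cancel(struct toy_dquot *dq, long nblocks)
{
	dq->incore_res -= nblocks;
	assert(dq->incore_res >= dq->ondisk_bcount);
}

/* Commit: a bcount delta is applied to both counters. */
void toy_tx_commit(struct toy_dquot *dq, const struct toy_tx *tx)
{
	dq->ondisk_bcount += tx->bcount_delta;
	dq->incore_res += tx->bcount_delta;
}

/* end_cow as described for this patch version. */
void toy_end_cow(struct toy_dquot *dq, long old_blocks, long cow_blocks)
{
	struct toy_tx tx = { 0 };

	tx.bcount_delta -= old_blocks;	/* data fork blocks unmapped */
	tx.bcount_delta += cow_blocks;	/* CoW blocks remapped as real usage */
	dq->incore_res -= cow_blocks;	/* reservation dropped directly */
	toy_tx_commit(dq, &tx);
}
```

Starting from 80 blocks on disk and a 10 block CoW reservation, either
toy_end_cow() over 10 old blocks or toy_cow_cancel() leaves both counters
at 80 in this model.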

> [1] Would this even be necessary if we just acquired a delalloc like
> reservation in xfs_reflink_allocate_cow() rather than associate the
> reservation with the transaction in the first place (assuming we have
> enough information to cover error handling, extent manipulations and
> whatnot)?

Originally, cow did make delalloc reservations even for direct writes, but
Christoph thought that we could avoid the overhead of running through
the cow fork an extra time by mapping directly to the cow fork.

> When the tx commits, this essentially has the effect of applying the
> bcount delta to both the on-disk dquot and the in-core res. The former
> reflects the change in the file on-disk and the latter is rectified
> because the field accounts for the current real usage plus outstanding
> reservation. The original cowblocks res has been dropped directly, so
> the bcount delta reflects the change to the data fork.

<nod>

> If we instead use delbcount in end_cow(), we're telling the transaction
> to drop bcount by whatever old data fork blocks were removed and that
> we've converted N delalloc (cow fork, actually) blocks that already had
> in-core reservation. Therefore, transaction commit updates the on-disk
> dquot just the same (-dataforkblocks + delallocblocks), but delbcount
> blocks have already updated the in-core dquot res so the transaction has
> nothing else to do there (and so we must also not remove that
> reservation in del_cow()). This approach does seem like it requires a
> bit less mental gymnastics to follow because it more closely resembles
> delalloc quota accounting. ;)

Yes, that's less brain muddling; last night's patchpile incorporates
that.
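
For reference, the delbcount variant described in the quoted paragraph can
be sketched with a toy userspace model. This is an illustration under
assumptions, not the kernel implementation: struct toy_dquot, struct toy_tx
and the toy_* functions are hypothetical names, and the commit rule simply
follows the description above (both deltas reach the on-disk count, only
bcount reaches the in-core res).

```c
#include <assert.h>

/* Toy model only; these are hypothetical names, not the XFS structures. */
struct toy_dquot {
	long ondisk_bcount;	/* blocks charged in the on-disk dquot */
	long incore_res;	/* real usage plus outstanding reservations */
};

struct toy_tx {
	long bcount_delta;	/* stands in for XFS_TRANS_DQ_BCOUNT */
	long delbcount_delta;	/* stands in for XFS_TRANS_DQ_DELBCOUNT */
};

void toy_tx_commit(struct toy_dquot *dq, const struct toy_tx *tx)
{
	/* Both deltas change what is recorded on disk. */
	dq->ondisk_bcount += tx->bcount_delta + tx->delbcount_delta;
	/* delbcount blocks were reserved earlier, so only bcount moves the res. */
	dq->incore_res += tx->bcount_delta;
}

void toy_end_cow_delbcount(struct toy_dquot *dq, long old_blocks,
			   long cow_blocks)
{
	struct toy_tx tx = { 0, 0 };

	tx.bcount_delta -= old_blocks;		/* data fork blocks removed */
	tx.delbcount_delta += cow_blocks;	/* already-reserved blocks converted */
	/* No direct reservation drop here, so del_cow must not do one either. */
	toy_tx_commit(dq, &tx);
	assert(dq->incore_res >= dq->ondisk_bcount);
}
```

In this model the in-core res never dips below its final value during
end_cow, which is why the sequence is easier to follow.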

> Another thing that I'm not sure has been considered here is whether
> doing the bcount delta in the transaction and dropping the cowblocks res
> from the dquot directly leaves a race window where the quota can overrun
> a limit. E.g., since the transaction has to up the in-core res in the
> original example at commit time, is there anything that locks out
> further external reservation from the dquot between the time the in-core
> res is dropped and the transaction commits?

Yes, that's a theoretical race (as in I've never seen it happen) that
is fixed by using delbcount in end_cow.
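
To make the window concrete, here is a toy userspace sketch, not kernel
code. All names (struct toy_dquot, toy_try_reserve, toy_race_bcount,
toy_race_delbcount) and the hard_limit field are hypothetical, and it
assumes the remap replaces no old data fork blocks so that the net bcount
delta is positive.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; these are hypothetical names, not the XFS structures. */
struct toy_dquot {
	long ondisk_bcount;	/* blocks charged in the on-disk dquot */
	long incore_res;	/* real usage plus outstanding reservations */
	long hard_limit;	/* reservations may not push incore_res past this */
};

/* An unrelated reservation attempt that honours the limit. */
bool toy_try_reserve(struct toy_dquot *dq, long nblocks)
{
	if (dq->incore_res + nblocks > dq->hard_limit)
		return false;
	dq->incore_res += nblocks;
	return true;
}

/* bcount ordering: res dropped first, re-added only at commit. */
long toy_race_bcount(struct toy_dquot dq, long cow_blocks, long other_blocks)
{
	dq.incore_res -= cow_blocks;		/* reservation dropped directly */
	toy_try_reserve(&dq, other_blocks);	/* outsider slips into the window */
	dq.ondisk_bcount += cow_blocks;		/* commit: bcount delta on disk */
	dq.incore_res += cow_blocks;		/* commit: bcount delta in core */
	return dq.incore_res;
}

/* delbcount ordering: the reservation is held until commit. */
long toy_race_delbcount(struct toy_dquot dq, long cow_blocks, long other_blocks)
{
	toy_try_reserve(&dq, other_blocks);	/* sees the full reservation */
	dq.ondisk_bcount += cow_blocks;		/* commit: on-disk count only */
	return dq.incore_res;
}
```

With 80 blocks on disk, a 10 block cow reservation, a limit of 100 and an
outside request for 15 blocks, the bcount ordering ends at 105 in this
model, while the delbcount ordering refuses the outside request and stays
at 90.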

> > So perhaps the solution is to pass intent into xfs_bmap_del_extent_cow:
> > if we're calling it from _end_cow then we want to hang on to the
> > reservation so that delbcount can do its thing, but if we're calling
> > from _cancel_cow then we're dumping the extent and reservation.
> > 
> 
> Indeed. But since those are the only callers and we'd already update
> delbcount from end_cow(), could we not just lift the del_cow() decrement
> into the cancel_cow() function? FWIW, some extra comments around quota
> manipulation in the reflink functions would also be useful for future
> reference.

Hm, yes, could do that too.

TBH I had the moment of "doh, just call the quota unreserve in
cancel_cow directly instead of at the end of del_extent_cow" right after
I hit send. :(

--D

> Brian
> 
> > --D
> > 
> > > 
> > > Other than that the code seems Ok to me.
> > > 
> > > Brian
> > > 
> > > >  		/* Remove the mapping from the CoW fork. */
> > > >  		xfs_bmap_del_extent_cow(ip, &icur, &got, &del);
> > > >  
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
