All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 8/8] xfs: intent item whiteouts
Date: Wed, 27 Apr 2022 10:31:45 -0700	[thread overview]
Message-ID: <20220427173145.GK17059@magnolia> (raw)
In-Reply-To: <20220427054757.GO1098723@dread.disaster.area>

On Wed, Apr 27, 2022 at 03:47:57PM +1000, Dave Chinner wrote:
> On Tue, Apr 26, 2022 at 08:32:52PM -0700, Darrick J. Wong wrote:
> > On Wed, Apr 27, 2022 at 12:22:59PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > When we log modifications based on intents, we add both intent
> > > and intent done items to the modification being made. These get
> > > written to the log to ensure that the operation is re-run if the
> > > intent done is not found in the log.
> > > 
> > > However, for operations that complete wholly within a single
> > > checkpoint, the change in the checkpoint is atomic and will never
> > > need replay. In this case, we don't need to actually write the
> > > intent and intent done items to the journal because log recovery
> > > will never need to manually restart this modification.
> > > 
> > > Log recovery currently handles intent/intent done matching by
> > > inserting the intent into the AIL, then removing it when a matching
> > > intent done item is found. Hence for all the intent-based operations
> > > that complete within a checkpoint, we spend all that time parsing
> > > the intent/intent done items just to cancel them and do nothing with
> > > them.
> > > 
> > > Hence it follows that the only time we actually need intents in the
> > > log is when the modification crosses checkpoint boundaries in the
> > > log and so may only be partially complete in the journal. Hence if
> > > we commit and intent done item to the CIL and the intent item is in
> > > the same checkpoint, we don't actually have to write them to the
> > > journal because log recovery will always cancel the intents.
> > > 
> > > We've never really worried about the overhead of logging intents
> > > unnecessarily like this because the intents we log are generally
> > > very much smaller than the change being made. e.g. freeing an extent
> > > involves modifying at lease two freespace btree blocks and the AGF,
> > > so the EFI/EFD overhead is only a small increase in space and
> > > processing time compared to the overall cost of freeing an extent.
> > 
> > Question: If we whiteout enough intent items, does that make it possible
> > to cram more updates into a checkpoint?
> 
> Yes - we release the space the cancelled intent pair used from the
> ctx->space_used counter that tracks the size of the CIL checkpoint.

<nod> Good, that's what I was thinking.

> > Are changes required to the existing intent item code to support
> > whiteouts, or does the log code autodetect an *I/*D pair in the same
> > checkpoint and elide them automatically?
> 
> The log code automagically detects it. That's what the ->iop_intent
> op is for - when a done intent committed, it looks up it's intent
> pair via ->iop_intent and then checks if it is in the current
> checkpoint via xlog_item_in_current_chkpt() and if that returns true
> then we place a whiteout on the intent and release the space it
> consumes.
> 
> We don't cull the intent from the CIL until the context checkpoint
> commits - we could remove it immediately, but then when the CIL
> scalability code gets placed on top of this we can't remove log
> items from the per-cpu CIL in transaction commit context and so we
> have to use whiteouts to delay removal to the push context. So I
> just implemented it that way to start with....

<nod> Sounds reasonable to me.

> > (I might be fishing here for "Does this make generic/447 faster when it
> > deletes the million extents?")
> 
> I think it knocks it down a little - maybe 10%? One machine would
> appear to drop 126->116s, another it is 52->48s.
> 
> The intents are generally tiny compared to the rest of the changes
> being (re-)logged, so I'm not expecting miracles for the
> rmap/reflink code. The big wins come when intents contain large
> chunks of data...

I'll take a 10% win. :)

> > > diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
> > > index 59aa5f9bf769..670d074a71cc 100644
> > > --- a/fs/xfs/xfs_bmap_item.c
> > > +++ b/fs/xfs/xfs_bmap_item.c
> > > @@ -39,6 +39,7 @@ STATIC void
> > >  xfs_bui_item_free(
> > >  	struct xfs_bui_log_item	*buip)
> > >  {
> > > +	kmem_free(buip->bui_item.li_lv_shadow);
> > 
> > Why is it necessary for log items to free their own shadow buffer?
> 
> Twisty unpin passages...
> 
> Intents with whiteouts on them were leaking them when they
> were unpinned from the whiteout list in xlog_cil_push_work(). The
> log vectors no longer get attached to the CIL context and freed
> via xlog_cil_committed()->xlog_cil_free_logvec(), and so when they
> are unpinned by xlog_cil_push_work() the last reference is released
> and we have to free the log vector attached to the item as it is
> still attached.
> 
> The reason we can't do it directly from ->iop_unpin() is that we
> also call ->iop_unpin from xlog_cil_committed()->
> xfs_trans_committed_bulk(), and if we are aborting there we do not
> want to free the shadow buffer because it is still linked into the
> lv chain attached to the CIL ctx and will get freed once
> xfs_trans_committed_bulk() returns....

Huh.  So now that I'm more awake, I noticed that you didn't patch
xfs_extfree_item.c to free the shadow buffers because the
xfs_ef[id]_item_free functions already have code to free the shadow
buffer.  git blame says that was added in:

b1c5ebb21301 ("xfs: allocate log vector buffers outside CIL context lock")

This commit was added in 4.8-rc1, and just prior to merging the rmap
patches.

When do log intent items get shadow buffers, since they should only be
committed once?  Looking at that old commit, I think what's going on is
that we preallocate the shadow buffers for every log item at commit time
to avoid a memory allocation when we have the ctx lock, so now it's
necessary for all log intent items to free them?

Does that mean RUI/CUI/BUI log items could have been leaking shadow
buffers since the beginning, and we just haven't noticed because the CIL
has freed them for us?  Which means that the changes to xfs_*_item.c
could, in theory, be a separate patch that fixes a theoretical memory
leak?

(I'm not asking for you to separate the changes; I'm checking my
understanding of something that caused me to go "Eh???" on first
reading.)

> > > @@ -1393,7 +1463,11 @@ xlog_cil_commit(
> > >  	/* lock out background commit */
> > >  	down_read(&cil->xc_ctx_lock);
> > >  
> > > -	xlog_cil_insert_items(log, tp);
> > > +	if (tp->t_flags & XFS_TRANS_HAS_INTENT_DONE)
> > > +		released_space = xlog_cil_process_intents(cil, tp);
> > > +
> > > +	xlog_cil_insert_items(log, tp, released_space);
> > > +	tp->t_ticket->t_curr_res += released_space;
> > 
> > I'm a little tired, so why isn't this adjustment a part of
> > xlog_cil_insert_items?  A similar adjustment is made to
> > ctx->space_used to release the unused space back to the committing tx,
> > right?
> 
> Probably because it was a bug fix I added at some point and not
> original code....
> 
> I'm not fussed where it ends up - I can move it if you want.

Yes please, since xlog_cil_insert_items already adjusts
tp->t_ticket->t_curr_res and it would seem to make more sense to keep
all those adjustments together.

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

  reply	other threads:[~2022-04-27 17:32 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-27  2:22 [PATCH 0/8 v5] xfs: intent whiteouts Dave Chinner
2022-04-27  2:22 ` [PATCH 1/8] xfs: hide log iovec alignment constraints Dave Chinner
2022-04-27  3:14   ` Darrick J. Wong
2022-04-27  4:50     ` Dave Chinner
2022-04-27 16:45       ` Darrick J. Wong
2022-04-28 13:00   ` Christoph Hellwig
2022-04-27  2:22 ` [PATCH 2/8] xfs: don't commit the first deferred transaction without intents Dave Chinner
2022-04-27  3:03   ` Darrick J. Wong
2022-04-27  4:52     ` Dave Chinner
2022-04-28 13:02   ` Christoph Hellwig
2022-04-30 17:02   ` Alli
2022-04-27  2:22 ` [PATCH 3/8] xfs: add log item flags to indicate intents Dave Chinner
2022-04-27  3:04   ` Darrick J. Wong
2022-04-28 13:04     ` Christoph Hellwig
2022-04-27  2:22 ` [PATCH 4/8] xfs: tag transactions that contain intent done items Dave Chinner
2022-04-27  3:06   ` Darrick J. Wong
2022-04-28 13:05   ` Christoph Hellwig
2022-04-27  2:22 ` [PATCH 5/8] xfs: factor and move some code in xfs_log_cil.c Dave Chinner
2022-04-27  3:15   ` Darrick J. Wong
2022-04-27  4:56     ` Dave Chinner
2022-04-28 13:06   ` Christoph Hellwig
2022-04-29  1:56   ` Alli
2022-04-27  2:22 ` [PATCH 6/8] xfs: add log item method to return related intents Dave Chinner
2022-04-27  3:18   ` Darrick J. Wong
2022-04-28 13:10   ` Christoph Hellwig
2022-04-27  2:22 ` [PATCH 7/8] xfs: whiteouts release intents that are not in the AIL Dave Chinner
2022-04-27  3:19   ` Darrick J. Wong
2022-04-28 13:15   ` Christoph Hellwig
2022-04-27  2:22 ` [PATCH 8/8] xfs: intent item whiteouts Dave Chinner
2022-04-27  3:32   ` Darrick J. Wong
2022-04-27  5:47     ` Dave Chinner
2022-04-27 17:31       ` Darrick J. Wong [this message]
2022-04-27 22:05         ` Dave Chinner
2022-04-28 13:22   ` Christoph Hellwig
2022-04-28 21:38     ` Dave Chinner
  -- strict thread matches above, loose matches on Subject: below --
2022-03-14 22:06 [PATCH 0/8 v3] xfs: intent whiteouts Dave Chinner
2022-03-14 22:06 ` [PATCH 8/8] xfs: intent item whiteouts Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220427173145.GK17059@magnolia \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.