All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 7/7] xfs: push the grant head when the log head moves forward
Date: Mon, 9 Sep 2019 09:26:32 +1000	[thread overview]
Message-ID: <20190908232632.GD16973@dread.disaster.area> (raw)
In-Reply-To: <20190907151050.GA3967@bfoster>

On Sat, Sep 07, 2019 at 11:10:50AM -0400, Brian Foster wrote:
> On Fri, Sep 06, 2019 at 09:10:14AM -0400, Brian Foster wrote:
> > On Fri, Sep 06, 2019 at 10:02:05AM +1000, Dave Chinner wrote:
> > > > And the raciness concerns..? AFAICT this still opens a race window where
> > > > the AIL can idle on the push target before AIL insertion.
> > > 
> > > I don't know what race you see - if the AIL completes a push before
> > > we insert new objects at the head from the current commit, then it
> > > does not matter one bit because the items are being inserted at the
> > > log head, not the log tail where the pushing occurs at. If we are
> > > inserting objects into the AIL within the push target window, then
> > > there is something else very wrong going on, because when the log is
> > > out of space the push target should be nowhere near the LSN we are
> > > inserting inew objects into the AIL at. (i.e. they should be 3/4s of
> > > the log apart...)
> > > 
> > 
> > I'm not following your reasoning. It sounds to me that you're arguing it
> > doesn't matter that the AIL is not populated from the current commit
> > because the push target should be much farther behind the head. If
> > that's the case, why does this patch order the AIL push after a
> > ->l_last_sync_lsn update? That's the LSN of the most recent commit
> > record to hit the log and hence the new physical log head.
> > 
> > Side note: I think the LSN of the commit record iclog is different and
> > actually ahead of the LSN associated with AIL insertion. I don't
> > necessarily think that's a problem given how the log subsystem behaves
> > today, but it's another subtle/undocumented (and easily avoidable) quirk
> > that may not always be so benign.
> > 
> 
> Just to put a finer point on this (and since this seems to be the only
> way I can get you to consider nontrivial feedback to your patches):

If I can't make head or tail of the problem you are describing,
exactly how am I supposed to respond? If I'm unable to get my point
across, I'd much prefer to spend my time on patches than on going
around in circles. I'm not interested in winning arguments. I'm not
interested in spending lots of time discussing theoretical problems
with the current set of fixes that don't exist once the root cause
we've already identified is fixed. My time is much better spent
fixing that root cause...

Keep in mind that I also have a lot of different, complex  things
going on at once that all require total focus while I'm looking at
them, so it can take days for me to cycle through everything and get
back to past topics. Delay doesn't mean I haven't read your response
or taken it on board, it just means I don't have time to write a
-meaingful response- straight away.

> kworker/0:1H-220   [000] ...1  3869.403829: xlog_state_do_callback: 2691: l_last_sync_lsn 0x15000021f6
> kworker/0:1H-220   [000] ...1  3869.403864: xfs_ail_push: 639: ail_target 0x15000021f6

Which implies that the log has less than 25% of space free because
we've issued a push, and that the distance we push is bound by the
log head.

>       <...>-215246 [002] ...1  3875.568561: xfsaild: 403: empty (target 0x15000021f6)
>       <...>-215246 [002] ....  3875.568649: xfsaild: 589: idle

has an empty AIL. IOWs, you are creating the situation where the CIL
has not been allowed to run and hence has violated the >50% log size
limit on transactions. This goes away once we prevent the CIL from
doing this.

> kworker/0:1H-220   [000] ...1  3889.843872: xfs_trans_ail_update_bulk: 746: inserted lsn 0x1500001bf6

Ok, so what you see here is somewhat intentional, based on the fact
that the LSN used for items is different to the LSN used for the
commit record (start of commit vs end of commit).  We don't want to
push the currently commiting items instantly to disk as that defeats
the "delayed write" behaviour the AIL uses to allow efficient
relogging to occur.

The next commit will do a similar push during with the new
l_last_sync_lsn, which causes the target to point at the new
last_sync_lsn and so all the items in the AIL from the previous
commit that haven't been relogged (pinned) in the current commit
will get pushed. i.e. commit N will cause commit (N - 1) to get
pushed.

This will continue while we are in a situation where the current log
head location is limiting the push target and we are completely out
of log reservation space. Once we get to the point where the
physical head of the log is more than 25% of the log away from the
tail, the push target will stop being limited by the l_last_sync_lsn
and we'll go back to triggering push target updates via the tail of
the log moving forwards as we currently do.  IOWs, this "log head
pushing" behaviour is likely only necessary for the first 2-3 CIL
commits of a workload, then we fall back into the normal tail
pushing scenario.

> This is an instance of xfsaild going idle between the time this
> new AIL push sets the target based on the iclog about to be
> committed and AIL insertion of the associated log items,
> reproduced via a bit of timing instrumentation.  Don't be
> distracted by the timestamps or the fact that the LSNs do not
> match because the log items in the AIL end up indexed by the start
> lsn of the CIL checkpoint (whereas last_sync_lsn refers to the
> commit record). The point is simply that xfsaild has completed a
> push of a target that hasn't been inserted yet.

AFAICT, what you are showing requires delaying of the CIL push to the
point it violates a fundamental assumption about commit sizes, which
is why I largely think it's irrelevant.

> A couple additional notes.. I don't see further side effects in the
> variant I reproduced, I suspect because we have other wakeups that
> squash this transient state created by the race,

Right, if we do run out of log space, the log reservation tail
pushing mechanisms takes over and does the right thing.

> > > i.e. we changed the unlinked inode processing in a way that
> > > the kernel can now runs tens of thousands of unlink transactions
> > > without yeilding the CPU. That violated the "CIL push work will run
> > > within a few transactions of the background push occurring"
> > > mechanism the workqueue provided us with and that, fundamentally, is
> > > the underlying issue here. It's not a CIL vs empty AIL vs log
> > > reservation exhaustion race condition - that's just an observable
> > > symptom.
> > > 
> > 
> > Yes, but the point is that's not the only thing that can delay CIL push
> > work. Since the AIL is not populated until the commit record iclog is
> > written out, and background CIL pushing doesn't flush the commit record
> > for the associated checkpoint before it completes, and CIL pushing
> > itself is serialized, a stalled commit record iclog I/O is enough to
> > create "log full, empty AIL" conditions.

CIL pushing is not actually serialised. Ordered, yes, serialised,
no. ANd stalling an iclog with a commit record should not cause the
log to fill completely - the next CIL push when it overflows should
get it moving long before the log runs out of reservation space.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2019-09-08 23:26 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-04  4:24 [PATCH 0/7] xfs: log race fixes and cleanups Dave Chinner
2019-09-04  4:24 ` [PATCH 1/7] xfs: push the AIL in xlog_grant_head_wake Dave Chinner
2019-09-04  6:07   ` Christoph Hellwig
2019-09-04 21:46     ` Dave Chinner
2019-09-04  4:24 ` [PATCH 2/7] xfs: fix missed wakeup on l_flush_wait Dave Chinner
2019-09-04  6:07   ` Christoph Hellwig
2019-09-04 21:47     ` Dave Chinner
2019-09-04  4:24 ` [PATCH 3/7] xfs: factor debug code out of xlog_state_do_callback() Dave Chinner
2019-09-04  6:10   ` Christoph Hellwig
2019-09-04 21:14     ` Dave Chinner
2019-09-04  4:24 ` [PATCH 4/7] xfs: factor callbacks " Dave Chinner
2019-09-04  6:13   ` Christoph Hellwig
2019-09-04  6:32   ` Christoph Hellwig
2019-09-04 21:22     ` Dave Chinner
2019-09-04  4:24 ` [PATCH 5/7] xfs: factor iclog state processing " Dave Chinner
2019-09-04  6:42   ` Christoph Hellwig
2019-09-04 21:43     ` Dave Chinner
2019-09-04  4:24 ` [PATCH 6/7] xfs: push iclog state cleaning into xlog_state_clean_log Dave Chinner
2019-09-04  6:44   ` Christoph Hellwig
2019-09-04  4:24 ` [PATCH 7/7] xfs: push the grant head when the log head moves forward Dave Chinner
2019-09-04  6:45   ` Christoph Hellwig
2019-09-04 21:49     ` Dave Chinner
2019-09-04 19:34   ` Brian Foster
2019-09-04 22:50     ` Dave Chinner
2019-09-05 16:25       ` Brian Foster
2019-09-06  0:02         ` Dave Chinner
2019-09-06 13:10           ` Brian Foster
2019-09-07 15:10             ` Brian Foster
2019-09-08 23:26               ` Dave Chinner [this message]
2019-09-10  9:56                 ` Brian Foster
2019-09-10 23:38                   ` Dave Chinner
2019-09-12 13:46                     ` Brian Foster
2019-09-17  4:31                       ` Darrick J. Wong
2019-09-17 12:48                         ` Brian Foster
2019-09-24 17:16                           ` Darrick J. Wong
2019-09-26 13:19                             ` Brian Foster
2019-09-04  5:26 ` [PATCH 0/7] xfs: log race fixes and cleanups Christoph Hellwig
2019-09-04  5:56   ` Christoph Hellwig
2019-09-04 22:57     ` Dave Chinner
     [not found]       ` <20190905065133.GA21840@infradead.org>
2019-09-05  7:10         ` Dave Chinner
2019-09-05  7:28           ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190908232632.GD16973@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.