linux-xfs.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/7] xfs: log race fixes and cleanups
Date: Thu, 5 Sep 2019 17:28:56 +1000
Message-ID: <20190905072856.GE1119@dread.disaster.area>
In-Reply-To: <20190905071031.GD1119@dread.disaster.area>

On Thu, Sep 05, 2019 at 05:10:31PM +1000, Dave Chinner wrote:
> On Wed, Sep 04, 2019 at 11:51:33PM -0700, Christoph Hellwig wrote:
> > On Thu, Sep 05, 2019 at 08:57:16AM +1000, Dave Chinner wrote:
> > > > And unfortunately generic/530 still hangs for me with this series.
> > > 
> > > Where does it hang?
> > > 
> > > > This is an x86-64 VM with 4G of RAM and virtio-blk, default mkfs.xfs
> > > > options from current xfsprogs, 20G test and scratch fs.
> > > 
> > > That's pretty much what one of my test rigs is, except iscsi luns
> > > rather than virtio-blk. I haven't been able to reproduce the issues,
> > > so I'm kinda flying blind w.r.t. testing them here. Can you
> > > get a trace of what is happening (xfs_trans*, xfs_log*, xfs_ail*
> > > tracepoints) so I can have a deeper look?
> > 
> > console output below, traces attached.
> 
> Thanks, I'll have a look in a minute. I'm pretty sure I know what it
> will show - I got a trace from Chandan earlier this afternoon and
> the problem is log recovery doesn't yield the cpu until it runs out
> of transaction reservation space, so the push work doesn't run
> because workqueue default behaviour is strict "run work only on the
> CPU it is queued on"....

Yup, exactly the same trace. Right down to the lsns in the log and
the 307 iclog writes just after the log runs out of space. To quote
from #xfs earlier this afternoon:

[5/9/19 14:21] <dchinner> I see what is -likely- to be a cil checkpoint but without the closing commit record
[5/9/19 14:21] <chandan> which line number in the trace log are you noticing that?
[5/9/19 14:22] <dchinner> 307 sequential calls to xfs_log_assign_tail_lsn() from a kworker and then releasing a log reservation
[5/9/19 14:22] <dchinner> Assuming 32kB iclog size (default)
[5/9/19 14:23] <dchinner> that's 307 * 32 / 4 filesystem blocks, which is 2456 blocks
[5/9/19 14:24] <dchinner> that's 96% of the log in a single CIL commit
[5/9/19 14:24] <dchinner> this isn't a "why hasn't there been iclog completion" problem
[5/9/19 14:24] <dchinner> this is a "why didn't the CIL push occur when it passed 12% of the log...
[5/9/19 14:25] <dchinner> ?
[5/9/19 14:25] <dchinner> " problem
[5/9/19 14:26] <dchinner> oooohhhh
[5/9/19 14:27] <dchinner> this isn't a preemptible kernel, is it?
[5/9/19 14:27] <chandan> correct. Linux kernel on ppc64le isn't preemptible
[5/9/19 14:28] <dchinner> so a kernel thread running in a tight loop will delay a kworker thread scheduled on the same CPU until the running kthread yields the CPU
[5/9/19 14:28] <dchinner> but, because we've recovered all the inodes, etc, everything is hot in cache
[5/9/19 14:28] <dchinner> so the unlink workload runs without blocking, and so never yields the CPU until it runs out of transaction space.
[5/9/19 14:29] <dchinner> and only then does the background kworker get scheduled to run.
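
For anyone who wants to check the arithmetic in that chat log, here is a
minimal sketch. The 307 iclog writes, the 32kB iclog size, the 4kB
filesystem block implied by the "/ 4", and the 12% push threshold all come
from the chat above. The total log size of 2560 blocks is an assumption:
it is not stated anywhere in this thread, and was picked only because it
is consistent with the 96% figure. The function names are made up for
illustration.

```python
# Figures taken from the chat log above.
ICLOG_WRITES = 307         # sequential iclog writes seen in the trace
ICLOG_SIZE_KB = 32         # default iclog size
FS_BLOCK_KB = 4            # filesystem block size implied by the "/ 4"
CIL_PUSH_FRACTION = 0.12   # "12% of the log"

# ASSUMPTION: total log size in filesystem blocks. Not stated in the
# thread; chosen because it matches the quoted 96%.
LOG_BLOCKS = 2560


def checkpoint_blocks(writes, iclog_kb, block_kb):
    """Filesystem blocks covered by `writes` full iclog writes."""
    return writes * iclog_kb // block_kb


def log_fraction(blocks, log_blocks):
    """Fraction of the log consumed by `blocks` filesystem blocks."""
    return blocks / log_blocks


if __name__ == "__main__":
    blocks = checkpoint_blocks(ICLOG_WRITES, ICLOG_SIZE_KB, FS_BLOCK_KB)
    frac = log_fraction(blocks, LOG_BLOCKS)
    print(f"{blocks} blocks in one checkpoint")   # 2456 blocks
    print(f"{frac:.0%} of the assumed log")       # 96%
    print(f"{frac / CIL_PUSH_FRACTION:.1f}x the push threshold")
```

Under that assumed log size, the single checkpoint is roughly eight times
larger than the point at which the CIL push should have run, which is why
the question is about the push not happening rather than about iclog
completion.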

I'll send the updated patch set soon...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com