From: Chris Friesen <chris.friesen@windriver.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Austin Schuh <austin@peloton-tech.com>, <pavel@pavlinux.ru>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	<linux-ext4@vger.kernel.org>, <tytso@mit.edu>,
	<adilger.kernel@dilger.ca>,
	rt-users <linux-rt-users@vger.kernel.org>
Subject: Re: RT/ext4/jbd2 circular dependency
Date: Wed, 29 Oct 2014 14:17:53 -0600
Message-ID: <54514B71.7000809@windriver.com>
In-Reply-To: <alpine.DEB.2.11.1410292013510.5308@nanos>

On 10/29/2014 01:26 PM, Thomas Gleixner wrote:
> On Wed, 29 Oct 2014, Chris Friesen wrote:
>> On 10/29/2014 12:05 PM, Thomas Gleixner wrote:
>> It seems plausible that the reason why page writeback never completes is that
>> it's blocking trying to take inode->i_data_sem for reading, as seen in the
>> following stack trace (from a hung system):
>>
>> [<ffffffff8109cd0c>] rt_down_read+0x2c/0x40
>> [<ffffffff8120ac91>] ext4_map_blocks+0x41/0x270
>> [<ffffffff8120f0dc>] mpage_da_map_and_submit+0xac/0x4c0
>> [<ffffffff8120f9c9>] write_cache_pages_da+0x3f9/0x420
>> [<ffffffff8120fd30>] ext4_da_writepages+0x340/0x720
>> [<ffffffff8111a5f4>] do_writepages+0x24/0x40
>> [<ffffffff81191b71>] writeback_single_inode+0x181/0x4b0
>> [<ffffffff811922a2>] writeback_sb_inodes+0x1b2/0x290
>> [<ffffffff8119241e>] __writeback_inodes_wb+0x9e/0xd0
>> [<ffffffff811928e3>] wb_writeback+0x223/0x3f0
>> [<ffffffff81192b4f>] wb_check_old_data_flush+0x9f/0xb0
>> [<ffffffff8119403f>] wb_do_writeback+0x12f/0x250
>> [<ffffffff811941f4>] bdi_writeback_thread+0x94/0x320
>
> Well, the point is that the JBD write out is not completed. The above
> is just the consequence. So really looking at ext4 inode write backs
> and something stuck on BJ_Shadow or the inode sem is the wrong
> place. It's all just caused by the JBD writeout not being completed
> for whatever reason.

I'll willingly confess that I knew very little about filesystem code before
I started looking at this issue.  I was under the impression that the task
in the above stack trace ("flush-147:3", in this case) was performing the
writeout of the page that had been flagged for writeback by JBD.  Is that
not the case?  If not, could you point me in the right direction?

>> For what it's worth, I'm currently testing a backport of commit b34090e from
>> mainline (which in turn required backporting commits e5a120a and f5113ef).  It
>> switches from using the BJ_Shadow list to using the BH_Shadow flag on the
>> buffer head.  More interestingly, waiters now get woken up from
>> journal_end_buffer_io_sync() instead of from
>> jbd2_journal_commit_transaction().
>>
>> So far this seems to be helping a lot.  It's lasted about 15x as long under
>> stress as without the patches.
>
> I fear that this is just papering over the problem, but you have to
> talk to the jbd2 folks about that.

They're on the CC list, hopefully they'll chime in...

> I personally prefer a reasonable explanation for the current behaviour
> rather than a magic "solution" to the problem. But that's up to you.

Well sure...but if nothing else it helps to point to a possible cause. 
I'm currently looking at the locking implications of the patch to see if 
it completely closes the race window or just narrows it.
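
For anyone following along, here is a rough user-space analogy of the
wake-up change described above: the "shadow" state becomes a flag on the
buffer, and the I/O completion path clears it and wakes waiters itself.
This is only a sketch under stated assumptions.  The names (fake_bh,
fake_end_io, and so on) are made up for illustration, and a pthread
mutex/condvar stands in for the kernel's bit-wait machinery; none of this
is the actual jbd2 code.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a buffer head (not the kernel struct). */
struct fake_bh {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            shadow;  /* models the BH_Shadow flag */
};

void fake_bh_init(struct fake_bh *bh)
{
    pthread_mutex_init(&bh->lock, NULL);
    pthread_cond_init(&bh->cond, NULL);
    bh->shadow = false;
}

/* Commit path: mark the buffer as being written out to the journal. */
void fake_mark_shadow(struct fake_bh *bh)
{
    pthread_mutex_lock(&bh->lock);
    bh->shadow = true;
    pthread_mutex_unlock(&bh->lock);
}

/* I/O completion path: clear the flag and wake waiters right away,
 * rather than leaving the wake-up to a later stage of the commit. */
void fake_end_io(struct fake_bh *bh)
{
    pthread_mutex_lock(&bh->lock);
    bh->shadow = false;
    pthread_cond_broadcast(&bh->cond);
    pthread_mutex_unlock(&bh->lock);
}

/* Writer path: sleep until the buffer is no longer a shadow. */
void fake_wait_on_shadow(struct fake_bh *bh)
{
    pthread_mutex_lock(&bh->lock);
    while (bh->shadow)
        pthread_cond_wait(&bh->cond, &bh->lock);
    pthread_mutex_unlock(&bh->lock);
}
```

In this toy version the flag is checked and cleared under the same lock,
so a wake-up cannot be missed.  Whether the backported patch gives an
equivalent guarantee in the kernel is exactly what I still need to check.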

Chris

