linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andreas Gruenbacher <agruenba@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: cluster-devel <cluster-devel@redhat.com>,
	"Dave Chinner" <david@fromorbit.com>,
	"Ross Lagerwall" <ross.lagerwall@citrix.com>,
	"Mark Syms" <Mark.Syms@citrix.com>,
	"Edwin Török" <edvin.torok@citrix.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: gfs2 iomap dealock, IOMAP_F_UNBALANCED
Date: Fri, 29 Mar 2019 23:13:00 +0100	[thread overview]
Message-ID: <CAHc6FU49oBdo8mAq7hb1greR+B1C_Fpy5JU7RBHfRYACt1S4wA@mail.gmail.com> (raw)
In-Reply-To: <20190328165104.GA21552@lst.de>

On Thu, 28 Mar 2019 at 17:51, Christoph Hellwig <hch@lst.de> wrote:
> On Thu, Mar 21, 2019 at 02:13:04PM +0100, Andreas Gruenbacher wrote:
> > Hi Christoph,
> >
> > we need your help fixing a gfs2 deadlock involving iomap.  What's going
> > on is the following:
> >
> > * During iomap_file_buffered_write, gfs2_iomap_begin grabs the log flush
> >   lock and keeps it until gfs2_iomap_end.  It currently always does that
> >   even though there is no point other than for journaled data writes.
> >
> > * iomap_file_buffered_write then calls balance_dirty_pages_ratelimited.
> >   If that ends up calling gfs2_write_inode, gfs2 will try to grab the
> >   log flush lock again and deadlock.
>
> What is the exact call chain?

It's laid out here:
https://www.redhat.com/archives/cluster-devel/2019-March/msg00000.html

> balance_dirty_pages_ratelimited these days doesn't start I/O, but just
> wakes up the flusher threads.  Or do we have a issue where it is blocking
> on those threads?

Yes, the writer is holding sd_log_flush_lock at the point where it
ends up kicking the flusher thread and waiting for writeback to
happen. The flusher thread calls gfs2_write_inode, and that tries to
grab sd_log_flush_lock again.

> Also why do you need to flush the log for background writeback in
> ->write_inode?

If we stop doing that in the (wbc->sync_mode == WB_SYNC_NONE) case,
then inodes will remain dirty until the journal is flushed for some
other reason (or a write_inode with WB_SYNC_ALL). That doesn't seem
right. We could perhaps trigger a background journal flush in the
WB_SYNC_NONE case, but that would remove the back pressure on
balance_dirty_pages. Not sure this is a good idea, either.

> balance_dirty_pages_ratelimited is per definition not a data integrity
> writeback, so there shouldn't be a good reason to flush the log
> (which I assume the log flush log is for).
>
> If we look gfs2_write_inode, this seems to be the code:
>
>         bool flush_all = (wbc->sync_mode == WB_SYNC_ALL || gfs2_is_jdata(ip));
>
>         if (flush_all)
>                 gfs2_log_flush(GFS2_SB(inode), ip->i_gl,
>                                GFS2_LOG_HEAD_FLUSH_NORMAL |
>                                GFS2_LFC_WRITE_INODE);
>
> But what is the requirement to do this in writeback context?  Can't
> we move it out into another context instead?

Indeed, this isn't for data integrity in this case but because the
dirty limit is exceeded. What other context would you suggest to move
this to?

(The iomap flag I've proposed would save us from getting into this
situation in the first place.)

Thanks,
Andreas

  reply	other threads:[~2019-03-29 22:13 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-21 13:13 gfs2 iomap dealock, IOMAP_F_UNBALANCED Andreas Gruenbacher
2019-03-21 21:43 ` Dave Chinner
2019-03-21 23:01   ` Andreas Gruenbacher
2019-03-22  0:21   ` Andreas Gruenbacher
2019-03-27 16:49     ` Ross Lagerwall
2019-03-28 16:51 ` Christoph Hellwig
2019-03-29 22:13   ` Andreas Gruenbacher [this message]
2019-04-07  7:32     ` Christoph Hellwig
2019-04-08  8:53       ` Andreas Gruenbacher
2019-04-08 13:44         ` Jan Kara
2019-04-09 12:15           ` Christoph Hellwig
2019-04-09 12:27             ` Andreas Gruenbacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHc6FU49oBdo8mAq7hb1greR+B1C_Fpy5JU7RBHfRYACt1S4wA@mail.gmail.com \
    --to=agruenba@redhat.com \
    --cc=Mark.Syms@citrix.com \
    --cc=cluster-devel@redhat.com \
    --cc=david@fromorbit.com \
    --cc=edvin.torok@citrix.com \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=ross.lagerwall@citrix.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).