From: Jan Kara <jack@suse.cz>
To: Andreas Gruenbacher <agruenba@redhat.com>
Cc: "Christoph Hellwig" <hch@lst.de>,
cluster-devel <cluster-devel@redhat.com>,
"Dave Chinner" <david@fromorbit.com>,
"Ross Lagerwall" <ross.lagerwall@citrix.com>,
"Mark Syms" <Mark.Syms@citrix.com>,
"Edwin Török" <edvin.torok@citrix.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
"Jan Kara" <jack@suse.cz>,
linux-mm@kvack.org
Subject: Re: gfs2 iomap dealock, IOMAP_F_UNBALANCED
Date: Mon, 8 Apr 2019 15:44:05 +0200 [thread overview]
Message-ID: <20190408134405.GA15023@quack2.suse.cz> (raw)
In-Reply-To: <CAHc6FU7kgm4OyrY-KRb8H2w6LDrWDSJ2p=UgZeeJ8YrHynKU2w@mail.gmail.com>
On Mon 08-04-19 10:53:34, Andreas Gruenbacher wrote:
> On Sun, 7 Apr 2019 at 09:32, Christoph Hellwig <hch@lst.de> wrote:
> >
> > [adding Jan and linux-mm]
> >
> > On Fri, Mar 29, 2019 at 11:13:00PM +0100, Andreas Gruenbacher wrote:
> > > > But what is the requirement to do this in writeback context? Can't
> > > > we move it out into another context instead?
> > >
> > > Indeed, this isn't for data integrity in this case but because the
> > > dirty limit is exceeded. What other context would you suggest to move
> > > this to?
> > >
> > > (The iomap flag I've proposed would save us from getting into this
> > > situation in the first place.)
> >
> > Your patch does two things:
> >
> > - it only calls balance_dirty_pages_ratelimited once per write
> > operation instead of once per page. In the past btrfs did
> > hacks like that, but IIRC they caused VM balancing issues.
> > That is why everyone now calls balance_dirty_pages_ratelimited
> > one per page. If calling it at a coarse granularity would
> > be fine we should do it everywhere instead of just in gfs2
> > in journaled mode
> > - it artifically reduces the size of writes to a low value,
> > which I suspect is going to break real life application
>
> Not quite, balance_dirty_pages_ratelimited is called from iomap_end,
> so once per iomap mapping returned, not per write. (The first version
> of this patch got that wrong by accident, but not the second.)
>
> We can limit the size of the mappings returned just in that case. I'm
> aware that there is a risk of balancing problems, I just don't have
> any better ideas.
>
> This is a problem all filesystems with data-journaling will have with
> iomap, it's not that gfs2 is doing anything particularly stupid.
I agree that if ext4 would be using iomap, it would have similar issues.
> > So I really think we need to fix this properly. And if that means
> > that you can't make use of the iomap batching for gfs2 in journaled
> > mode that is still a better option.
>
> That would mean using the old-style, page-size allocations, and a
> completely separate write path in that case. That would be quite a
> nightmare.
>
> > But I really think you need
> > to look into the scope of your flush_log and figure out a good way
> > to reduce that as solve the root cause.
>
> We won't be able to do a log flush while another transaction is
> active, but that's what's needed to clean dirty pages. iomap doesn't
> allow us to put the block allocation into a separate transaction from
> the page writes; for that, the opposite to the page_done hook would
> probably be needed.
I agree that a ->page_prepare() hook would be probably the cleanest
solution for this.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2019-04-08 13:44 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-21 13:13 gfs2 iomap dealock, IOMAP_F_UNBALANCED Andreas Gruenbacher
2019-03-21 21:43 ` Dave Chinner
2019-03-21 23:01 ` Andreas Gruenbacher
2019-03-22 0:21 ` Andreas Gruenbacher
2019-03-27 16:49 ` Ross Lagerwall
2019-03-28 16:51 ` Christoph Hellwig
2019-03-29 22:13 ` Andreas Gruenbacher
2019-04-07 7:32 ` Christoph Hellwig
2019-04-08 8:53 ` Andreas Gruenbacher
2019-04-08 13:44 ` Jan Kara [this message]
2019-04-09 12:15 ` Christoph Hellwig
2019-04-09 12:27 ` Andreas Gruenbacher
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190408134405.GA15023@quack2.suse.cz \
--to=jack@suse.cz \
--cc=Mark.Syms@citrix.com \
--cc=agruenba@redhat.com \
--cc=cluster-devel@redhat.com \
--cc=david@fromorbit.com \
--cc=edvin.torok@citrix.com \
--cc=hch@lst.de \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ross.lagerwall@citrix.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).