All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] Bypass filesystems for reading cached pages
Date: Thu, 2 Jul 2020 18:30:26 +0100	[thread overview]
Message-ID: <20200702173026.GA25523@casper.infradead.org> (raw)
In-Reply-To: <CAHc6FU5jZfz3Kv-Aa6MWbELhTscSp5eEAXTWBoVysrQg6f1moA@mail.gmail.com>

On Thu, Jul 02, 2020 at 05:16:43PM +0200, Andreas Gruenbacher wrote:
> On Wed, Jun 24, 2020 at 2:35 PM Andreas Gruenbacher <agruenba@redhat.com> wrote:
> > On Mon, Jun 22, 2020 at 8:13 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > On Mon, Jun 22, 2020 at 04:35:05PM +0200, Andreas Gruenbacher wrote:
> > > > I'm fine with not moving that functionality into the VFS. The problem
> > > > I have in gfs2 is that taking glocks is really expensive. Part of that
> > > > overhead is accidental, but we definitely won't be able to fix it in
> > > > the short term. So something like the IOCB_CACHED flag that prevents
> > > > generic_file_read_iter from issuing readahead I/O would save the day
> > > > for us. Does that idea stand a chance?
> > >
> > > For the short-term fix, is switching to a trylock in gfs2_readahead()
> > > acceptable?
> >
> > Well, it's the only thing we can do for now, right?
> 
> It turns out that gfs2 can still deadlock with a trylock in
> gfs2_readahead, just differently: in this instance, gfs2_glock_nq will
> call inode_dio_wait. When there is pending direct I/O, we'll end up
> waiting for iomap_dio_complete, which will call
> invalidate_inode_pages2_range, which will try to lock the pages
> already locked for gfs2_readahead.

That seems like a bug in trylock.  If I trylock something I'm not
expecting it to wait; i'm expecting it to fail because it would have to wait.
ie something like this:

diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index c84887769b5a..97ca8f5ed22b 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -470,7 +470,10 @@ static int inode_go_lock(struct gfs2_holder *gh)
                        return error;
        }
 
-       if (gh->gh_state != LM_ST_DEFERRED)
+       if (gh->gh_flags & LM_FLAG_TRY) {
+               if (atomic_read(&ip->i_inode.i_dio_count))
+                       return -EWOULDBLOCK;
+       } else if (gh->gh_state != LM_ST_DEFERRED)
                inode_dio_wait(&ip->i_inode);
 
        if ((ip->i_diskflags & GFS2_DIF_TRUNC_IN_PROG) &&

... but probably not exactly that because I didn't try to figure out the
calling conventions or whether I should set some state in the gfs2_holder.

> This late in the 5.8 release cycle, I'd like to propose converting
> gfs2 back to use mpage_readpages. This requires reinstating
> mpage_readpages, but it's otherwise relatively trivial.
> We can then introduce an IOCB_CACHED or equivalent flag, fix the
> locking order in gfs2, convert gfs2 to mpage_readahead, and finally
> remove mage_readpages in 5.9.

I would rather not do that.

  reply	other threads:[~2020-07-02 17:30 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-19 15:50 [RFC] Bypass filesystems for reading cached pages Matthew Wilcox
2020-06-19 19:06 ` Chaitanya Kulkarni
2020-06-19 20:12   ` Matthew Wilcox
2020-06-19 21:25     ` Chaitanya Kulkarni
2020-06-20  6:19 ` Amir Goldstein
2020-06-20  6:19   ` Amir Goldstein
2020-06-20 19:15   ` Matthew Wilcox
2020-06-21  6:00     ` Amir Goldstein
2020-06-21  6:00       ` Amir Goldstein
2020-06-22  1:02     ` Dave Chinner
2020-06-22  0:32 ` Dave Chinner
2020-06-22 14:35   ` Andreas Gruenbacher
2020-06-22 14:35     ` Andreas Gruenbacher
2020-06-22 18:13     ` Matthew Wilcox
2020-06-24 12:35       ` Andreas Gruenbacher
2020-06-24 12:35         ` Andreas Gruenbacher
2020-07-02 15:16         ` Andreas Gruenbacher
2020-07-02 15:16           ` Andreas Gruenbacher
2020-07-02 17:30           ` Matthew Wilcox [this message]
2020-06-23  0:52     ` Dave Chinner
2020-06-23  7:41       ` Andreas Gruenbacher
2020-06-23  7:41         ` Andreas Gruenbacher
2020-06-22 19:18   ` Matthew Wilcox
2020-06-23  2:35     ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200702173026.GA25523@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=agruenba@redhat.com \
    --cc=david@fromorbit.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.