linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Vivek Goyal <vgoyal@redhat.com>, Qian Cai <cai@lca.pw>,
	Hugh Dickins <hughd@google.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Linux-MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Amir Goldstein <amir73il@gmail.com>
Subject: Re: Possible deadlock in fuse write path (Was: Re: [PATCH 0/4] Some more lock_page work..)
Date: Fri, 16 Oct 2020 13:27:58 +0100	[thread overview]
Message-ID: <20201016122758.GE20115@casper.infradead.org> (raw)
In-Reply-To: <CAJfpegtAstEo+nYgT81swYZWdziaZP_40QGAXcTORqYwgeWNUA@mail.gmail.com>

On Fri, Oct 16, 2020 at 12:02:21PM +0200, Miklos Szeredi wrote:
> This was added by commit ea9b9907b82a ("fuse: implement
> perform_write") in v2.6.26 and remains essentially unchanged, AFAICS.
> So this is an old bug indeed.
> 
> So what is the page lock protecting?   I think not truncation, because
> inode_lock should be sufficient protection.
> 
> What it does after sending a synchronous WRITE and before unlocking
> the pages is set the PG_uptodate flag, but only if the complete page
> was really written, which is what the uptodate flag really says:  this
> page is in sync with the underlying fs.

Not in sync with.  Uptodate means "every byte on this page is at least
as new as the bytes on storage".  Dirty means "at least one byte is
newer than the bytes on storage".

> So I think the page lock here is trying to protect against concurrent
> reads/faults on not uptodate pages.  I.e. until the WRITE request
> completes it is unknown whether the page was really written or not, so
> any reads must block until this state becomes known.  This logic falls
> down on already cached pages, since they start out uptodate and the
> write does not clear this flag.

That's not how the page cache should work -- if a write() has touched
every byte in a page, then the page should be marked as Uptodate, and it
can immediately satisfy read()s without even touching the backing store.

> So keeping the pages locked has dubious value: short writes don't seem
> to work correctly anyway.  Which means that we can probably just set
> the page uptodate right after filling it from the user buffer, and
> unlock the page immediately.
> 
> Am I missing something?

I haven't looked at fuse in detail -- are you handling partial page
writes?  That is, if someone writes to bytes 5-15 of a file, are you
first reading bytes 0-4 and 16-4095 from the backing store?  If so,
you can mark the page Uptodate as soon as you've copied bytes 5-15
from the user.

  reply	other threads:[~2020-10-16 12:28 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-13 19:59 [PATCH 0/4] Some more lock_page work Linus Torvalds
2020-10-13 20:03 ` Linus Torvalds
2020-10-14 13:05   ` Kirill A. Shutemov
2020-10-14 16:53     ` Linus Torvalds
2020-10-14 18:15       ` Matthew Wilcox
2020-10-15 10:41         ` Kirill A. Shutemov
2020-10-15  9:43       ` Kirill A. Shutemov
2020-10-15 16:44         ` Linus Torvalds
2020-10-14  5:50 ` Hugh Dickins
2020-10-15  1:48 ` Qian Cai
2020-10-15  2:44   ` Linus Torvalds
2020-10-15 15:16     ` Possible deadlock in fuse write path (Was: Re: [PATCH 0/4] Some more lock_page work..) Vivek Goyal
2020-10-15 19:55       ` Vivek Goyal
2020-10-15 21:21         ` Linus Torvalds
2020-10-16 10:02           ` Miklos Szeredi
2020-10-16 12:27             ` Matthew Wilcox [this message]
2020-10-20 20:42             ` Vivek Goyal
2020-10-21  7:40               ` Miklos Szeredi
2020-10-21 20:12                 ` Vivek Goyal
2020-10-28 20:29                   ` Miklos Szeredi
2021-02-09 10:01                     ` Miklos Szeredi
2021-02-09 19:09                       ` Vivek Goyal
2020-10-16 18:19           ` Vivek Goyal
2020-10-16 18:24             ` Linus Torvalds
2020-10-16 18:24               ` Linus Torvalds
2020-10-16 23:03             ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201016122758.GE20115@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=amir73il@gmail.com \
    --cc=cai@lca.pw \
    --cc=hughd@google.com \
    --cc=kirill@shutemov.name \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=miklos@szeredi.hu \
    --cc=torvalds@linux-foundation.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).