linux-fsdevel.vger.kernel.org archive mirror
From: David Howells <dhowells@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: dhowells@redhat.com, Matthew Wilcox <willy@infradead.org>,
	Jeff Layton <jlayton@redhat.com>,
	David Wysochanski <dwysocha@redhat.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	Trond Myklebust <trondmy@hammerspace.com>,
	Steve French <sfrench@samba.org>,
	Dominique Martinet <asmadeus@codewreck.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	ceph-devel@vger.kernel.org, linux-afs@lists.infradead.org,
	linux-cachefs@redhat.com, CIFS <linux-cifs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	"open list:NFS, SUNRPC, AND..." <linux-nfs@vger.kernel.org>,
	v9fs-developer@lists.sourceforge.net,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [GIT PULL] fscache: I/O API modernisation and netfs helper library
Date: Tue, 09 Feb 2021 21:55:26 +0000
Message-ID: <620724.1612907726@warthog.procyon.org.uk>
In-Reply-To: <CAHk-=wh+2gbF7XEjYc=HV9w_2uVzVf7vs60BPz0gFA=+pUm3ww@mail.gmail.com>

Linus Torvalds <torvalds@linux-foundation.org> wrote:

> > Yeah, I have trouble with the private2 vs fscache bit too.  I've been
> > trying to persuade David that he doesn't actually need an fscache
> > bit at all; he can just increment the page's refcount to prevent it
> > from being freed while he writes data to the cache.
> 
> Does the code not hold a refcount already?

AIUI, Willy wanted me to drop the refcount and rely on PG_locked alone during
I/O triggered by the new ->readahead() method, so when it comes to setting
PG_fscache after a successful read from the server, I don't hold any page refs
- the assumption being that the waits in releasepage and invalidatepage
suffice.  If that isn't sufficient, I can make it take page refs on the pages
to be written out - that should be easy enough to do.
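The alternative described above (taking a page ref for the duration of the write to the cache) can be sketched as a small userspace model.  This is only an illustration under stated assumptions: struct model_page and every model_* function are hypothetical stand-ins, not kernel structures or APIs, and the release check simply reports "would have to wait" where the kernel would actually sleep or refuse.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page; not the kernel layout. */
struct model_page {
	int  refcount;	/* models the page refcount */
	bool locked;	/* models PG_locked */
	bool fscache;	/* models PG_fscache (PG_private_2) */
	bool freed;	/* set once the last ref has been dropped */
};

static void model_get_page(struct model_page *p)
{
	assert(!p->freed);
	p->refcount++;
}

static void model_put_page(struct model_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = true;
}

/* After a successful read from the server: pin the page, then mark it
 * as being written to the cache. */
static void model_begin_cache_write(struct model_page *p)
{
	assert(p->locked);	/* the read path still holds the page lock */
	model_get_page(p);
	p->fscache = true;
}

/* Cache write completion: clear the mark first, drop the pin last, so
 * the page cannot be freed while the mark is still being manipulated. */
static void model_end_cache_write(struct model_page *p)
{
	assert(p->fscache);
	p->fscache = false;
	model_put_page(p);
}

/* Models a releasepage-style check: false means the caller would have
 * to wait for the cache write to finish. */
static bool model_try_release(const struct model_page *p)
{
	return !p->fscache;
}
```

In this model, a page whose only other ref is dropped while the cache write is in flight stays allocated until model_end_cache_write() runs, which is the property the extra ref is meant to provide.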

> Honestly, the fact that writeback doesn't take a refcount, and then
> has magic "if writeback is set, don't free" code in other parts of the
> VM layer has been a problem already, when the wakeup ended up
> "leaking" from a previous page to a new allocation.
> 
> I very much hope the fscache bit does not make similar mistakes,
> because the rest of the VM will _not_ have special "if fscache is set,
> then we won't do X" the way we do for writeback.

The VM can't do that because PG_private_2 isn't necessarily being used as
PG_fscache.  It does, however, treat PG_private_2 like PG_private when
triggering calls to releasepage and invalidatepage.

> So I think the fscache code needs to hold a refcount regardless, and
> that while the fscache bit is set the page has to have a reference.
> 
> So what are the current lifetime rules for the fscache bit?

It depends which 'current' you're referring to.

The old fscache I/O API (i.e. what's upstream) - in which PG_fscache is set on
a page to note that fscache knows about the page - does not keep a separate
ref on such pages.

The new fscache I/O API simplifies things.  With that, pages are only known
about for the duration of a write to the cache.  I've tried to mirror the
way PG_writeback works[*], including waiting for the bit in places like
invalidation, releasepage and page_mkwrite (though in the netfs, not the core
VM), since the bit may indicate DMA in progress.

Note that with the new I/O API, fscache and cachefiles know nothing about the
PG_fscache bit or netfs pages; they just deal with an iov_iter and a
completion function.  Dealing with PG_fscache is done by the netfs and the new
netfs helper lib.
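The split described above, where the cache side sees only an iterator and a completion function while the netfs side owns the page flag, might look roughly like the following userspace model.  All names here (struct model_iter, model_cache_write(), struct model_netfs_page and so on) are hypothetical and are not the fscache, cachefiles or netfs interfaces; the flat buffer stands in for an iov_iter, and the completion is called synchronously for simplicity where the real I/O would be asynchronous.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an iov_iter: a flat buffer plus a length. */
struct model_iter {
	const char *buf;
	size_t      len;
};

/* Completion callback: result is bytes written, or negative on error. */
typedef void (*model_done_t)(void *priv, long result);

/* Cache side: knows nothing about pages or page flags. */
static char model_cache_store[64];

static void model_cache_write(const struct model_iter *it,
			      model_done_t done, void *priv)
{
	if (it->len > sizeof(model_cache_store)) {
		done(priv, -1);
		return;
	}
	memcpy(model_cache_store, it->buf, it->len);
	done(priv, (long)it->len);
}

/* Netfs side: owns the page and its "being written to cache" mark. */
struct model_netfs_page {
	char data[16];
	bool fscache;		/* models PG_fscache */
	long last_result;
};

static void model_netfs_write_done(void *priv, long result)
{
	struct model_netfs_page *p = priv;

	p->last_result = result;
	p->fscache = false;	/* the netfs clears the mark, not the cache */
}

static void model_netfs_write_to_cache(struct model_netfs_page *p)
{
	struct model_iter it = { .buf = p->data, .len = sizeof(p->data) };

	p->fscache = true;	/* the netfs sets the mark before handing off */
	model_cache_write(&it, model_netfs_write_done, p);
}
```

The point of the layering in this sketch is that model_cache_write() could be handed any buffer at all; only the caller's completion function knows that a page flag needs clearing.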

[*] Though I see that 073861ed77b6b made a change to end_page_writeback() for
    an issue that probably affects unlock_page_fscache() too[**].

[**] This may mean that both PG_fscache and PG_writeback need to hold a ref on
     the page.

David


