* [PATCH v6 0/3] Convert NFS with fscache to the netfs API
@ 2022-09-04  9:05 Dave Wysochanski
  2022-09-04  9:05 ` [PATCH v6 1/3] NFS: Rename readpage_async_filler to nfs_pageio_add_page Dave Wysochanski
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Dave Wysochanski @ 2022-09-04  9:05 UTC (permalink / raw)
  To: Anna Schumaker, Trond Myklebust, David Howells
  Cc: linux-nfs, linux-cachefs, Benjamin Maynard, Daire Byrne

This patchset converts the NFS with fscache non-direct READ IO paths to
use the netfs API with a non-invasive approach.  The existing NFS pgio
layer does not need extensive changes; this is the best way I have found
so far to address Trond's concerns about modifying the IO path [1], as
well as only enabling netfs when fscache is configured and enabled [2].
I have not attempted performance comparisons to address Chuck
Lever's concern [3] because we are not converting the non-fscache
enabled NFS IO paths to netfs.

The main patch to be reviewed is patch #3 which converts nfs_read_folio
and nfs_readahead.

The patchset is based on 6.0-rc3 and has been pushed to github at:
https://github.com/DaveWysochanskiRH/kernel/commits/nfs-fscache-netfs
https://github.com/DaveWysochanskiRH/kernel/commit/152f43e3de2cce4e331593bd425c638a4a430a7c


Changes since v5
================
Patch1: Add Jeff Layton's reviewed-by (from v3 posting)
Patch2: Add Jeff Layton's reviewed-by (from v5 posting)
Patch3: Make netfs->transferred atomic64_t, drop spinlock (Jeff Layton)
Patch3: Various cleanups
- rename nfs_netfs_read_initiate to nfs_netfs_initiate_read
- rename nfs_fscache_read_folio to nfs_netfs_read_folio
- rename nfs_fscache_readahead to nfs_netfs_readahead
- rename nfs_netfs_read_done to nfs_netfs_readpage_done
- move unlock_page inside nfs_netfs_readpage_release
- use netfs_inode() helper in more places


Testing
=======
The patches are fairly stable, as evidenced by xfstests generic runs
against various servers, both with and without fscache enabled:
hammerspace(pNFS flexfiles): NFS4.1,NFS4.2
NetApp(pNFS filelayout): NFS3,NFS4.0,NFS4.1
RHEL8: NFS3,NFS4.1,NFS4.2

No major issues outstanding.  The known issues are as follows:

1. A unit test that sets rsize < readahead does not properly read from
fscache but instead re-reads data from the NFS server
* This will be fixed with another linux-cachefs [4] patch to resolve
"Stop read optimisation when folio removed from pagecache"
* Daire Byrne also verified the patch fixes his issue as well

2. "Cache volume key already in use" after xfstest runs
* xfstests (hammerspace with vers=4.2,fsc) shows the following on the
console after some tests:
"NFS: Cache volume key already in use (nfs,4.1,2,c50,cfe0100a,3,,,8000,100000,100000,bb8,ea60,7530,ea60,1)"
* This may be fixed with another patch [5] that is in progress

3. generic/127 triggers "Subreq overread" warning
* Intermittent, hard to reproduce 
* Seen with NFSv3 and RHEL8 server
[ 4196.864176] run fstests generic/127 at 2022-08-31 17:29:38
[ 5608.997945] ------------[ cut here ]------------
[ 5609.000476] Subreq overread: R1c85d[0] 73728 > 70073 - 0


Outstanding work
================
Note that the existing NFS fscache stats (the "fsc:" line in
/proc/self/mountstats) as well as the trace events still need to be
removed.  I've left these out of this patchset for now since removing
them is benign and can come later (the stats will all be 0, and the
trace events are no longer used).
The existing NFS fscache stat counts no longer apply since the new
API is not page based - they are not meaningful or possible to obtain,
and there are new fscache stats in /proc/fs/fscache/stats.  A similar
situation exists with the NFS trace events - netfs and fscache have
plenty of trace events so the NFS specific ones probably are not needed.


References
==========
[1] https://lore.kernel.org/linux-nfs/9cfd5bc3cfc6abc2d3316b0387222e708d67f595.camel@hammerspace.com/
[2] https://lore.kernel.org/linux-nfs/da9200f1bded9b8b078a7aef227fd6b92eb028fb.camel@hammerspace.com/
[3] https://marc.info/?l=linux-nfs&m=160597917525083&w=4
[4] https://www.mail-archive.com/linux-cachefs@redhat.com/msg03043.html
[5] https://marc.info/?l=linux-nfs&m=165962662200679&w=4

Dave Wysochanski (3):
  NFS: Rename readpage_async_filler to nfs_pageio_add_page
  NFS: Configure support for netfs when NFS fscache is configured
  NFS: Convert buffered read paths to use netfs when fscache is enabled

 fs/nfs/Kconfig           |   1 +
 fs/nfs/delegation.c      |   2 +-
 fs/nfs/dir.c             |   2 +-
 fs/nfs/fscache.c         | 251 +++++++++++++++++++++++----------------
 fs/nfs/fscache.h         | 101 ++++++++++------
 fs/nfs/inode.c           |   8 +-
 fs/nfs/internal.h        |  11 +-
 fs/nfs/pagelist.c        |  12 ++
 fs/nfs/pnfs.c            |  12 +-
 fs/nfs/read.c            | 111 +++++++++--------
 fs/nfs/write.c           |   2 +-
 include/linux/nfs_fs.h   |  34 ++++--
 include/linux/nfs_page.h |   3 +
 include/linux/nfs_xdr.h  |   3 +
 14 files changed, 335 insertions(+), 218 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v6 1/3] NFS: Rename readpage_async_filler to nfs_pageio_add_page
  2022-09-04  9:05 [PATCH v6 0/3] Convert NFS with fscache to the netfs API Dave Wysochanski
@ 2022-09-04  9:05 ` Dave Wysochanski
  2022-09-04  9:05 ` [PATCH v6 2/3] NFS: Configure support for netfs when NFS fscache is configured Dave Wysochanski
  2022-09-04  9:05 ` [PATCH v6 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled Dave Wysochanski
  2 siblings, 0 replies; 8+ messages in thread
From: Dave Wysochanski @ 2022-09-04  9:05 UTC (permalink / raw)
  To: Anna Schumaker, Trond Myklebust, David Howells
  Cc: linux-nfs, linux-cachefs, Benjamin Maynard, Daire Byrne

Rename readpage_async_filler to nfs_pageio_add_page to
better reflect what this function does (add a page to
the nfs_pageio_descriptor), and simplify arguments to
this function by removing struct nfs_readdesc.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/read.c | 60 +++++++++++++++++++++++++--------------------------
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index 8ae2c8d1219d..525e82ea9a9e 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -127,11 +127,6 @@ static void nfs_readpage_release(struct nfs_page *req, int error)
 	nfs_release_request(req);
 }
 
-struct nfs_readdesc {
-	struct nfs_pageio_descriptor pgio;
-	struct nfs_open_context *ctx;
-};
-
 static void nfs_page_group_set_uptodate(struct nfs_page *req)
 {
 	if (nfs_page_group_sync_on_bit(req, PG_UPTODATE))
@@ -153,7 +148,8 @@ static void nfs_read_completion(struct nfs_pgio_header *hdr)
 
 		if (test_bit(NFS_IOHDR_EOF, &hdr->flags)) {
 			/* note: regions of the page not covered by a
-			 * request are zeroed in readpage_async_filler */
+			 * request are zeroed in nfs_pageio_add_page
+			 */
 			if (bytes > hdr->good_bytes) {
 				/* nothing in this request was good, so zero
 				 * the full extent of the request */
@@ -281,8 +277,10 @@ static void nfs_readpage_result(struct rpc_task *task,
 		nfs_readpage_retry(task, hdr);
 }
 
-static int
-readpage_async_filler(struct nfs_readdesc *desc, struct page *page)
+int
+nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
+		    struct nfs_open_context *ctx,
+		    struct page *page)
 {
 	struct inode *inode = page_file_mapping(page)->host;
 	unsigned int rsize = NFS_SERVER(inode)->rsize;
@@ -302,15 +300,15 @@ readpage_async_filler(struct nfs_readdesc *desc, struct page *page)
 			goto out_unlock;
 	}
 
-	new = nfs_create_request(desc->ctx, page, 0, aligned_len);
+	new = nfs_create_request(ctx, page, 0, aligned_len);
 	if (IS_ERR(new))
 		goto out_error;
 
 	if (len < PAGE_SIZE)
 		zero_user_segment(page, len, PAGE_SIZE);
-	if (!nfs_pageio_add_request(&desc->pgio, new)) {
+	if (!nfs_pageio_add_request(pgio, new)) {
 		nfs_list_remove_request(new);
-		error = desc->pgio.pg_error;
+		error = pgio->pg_error;
 		nfs_readpage_release(new, error);
 		goto out;
 	}
@@ -332,7 +330,8 @@ readpage_async_filler(struct nfs_readdesc *desc, struct page *page)
 int nfs_read_folio(struct file *file, struct folio *folio)
 {
 	struct page *page = &folio->page;
-	struct nfs_readdesc desc;
+	struct nfs_pageio_descriptor pgio;
+	struct nfs_open_context *ctx;
 	struct inode *inode = page_file_mapping(page)->host;
 	int ret;
 
@@ -358,29 +357,29 @@ int nfs_read_folio(struct file *file, struct folio *folio)
 
 	if (file == NULL) {
 		ret = -EBADF;
-		desc.ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
-		if (desc.ctx == NULL)
+		ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
+		if (ctx == NULL)
 			goto out_unlock;
 	} else
-		desc.ctx = get_nfs_open_context(nfs_file_open_context(file));
+		ctx = get_nfs_open_context(nfs_file_open_context(file));
 
-	xchg(&desc.ctx->error, 0);
-	nfs_pageio_init_read(&desc.pgio, inode, false,
+	xchg(&ctx->error, 0);
+	nfs_pageio_init_read(&pgio, inode, false,
 			     &nfs_async_read_completion_ops);
 
-	ret = readpage_async_filler(&desc, page);
+	ret = nfs_pageio_add_page(&pgio, ctx, page);
 	if (ret)
 		goto out;
 
-	nfs_pageio_complete_read(&desc.pgio);
-	ret = desc.pgio.pg_error < 0 ? desc.pgio.pg_error : 0;
+	nfs_pageio_complete_read(&pgio);
+	ret = pgio.pg_error < 0 ? pgio.pg_error : 0;
 	if (!ret) {
 		ret = wait_on_page_locked_killable(page);
 		if (!PageUptodate(page) && !ret)
-			ret = xchg(&desc.ctx->error, 0);
+			ret = xchg(&ctx->error, 0);
 	}
 out:
-	put_nfs_open_context(desc.ctx);
+	put_nfs_open_context(ctx);
 	trace_nfs_aop_readpage_done(inode, page, ret);
 	return ret;
 out_unlock:
@@ -391,9 +390,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
 
 void nfs_readahead(struct readahead_control *ractl)
 {
+	struct nfs_pageio_descriptor pgio;
+	struct nfs_open_context *ctx;
 	unsigned int nr_pages = readahead_count(ractl);
 	struct file *file = ractl->file;
-	struct nfs_readdesc desc;
 	struct inode *inode = ractl->mapping->host;
 	struct page *page;
 	int ret;
@@ -407,25 +407,25 @@ void nfs_readahead(struct readahead_control *ractl)
 
 	if (file == NULL) {
 		ret = -EBADF;
-		desc.ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
-		if (desc.ctx == NULL)
+		ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
+		if (ctx == NULL)
 			goto out;
 	} else
-		desc.ctx = get_nfs_open_context(nfs_file_open_context(file));
+		ctx = get_nfs_open_context(nfs_file_open_context(file));
 
-	nfs_pageio_init_read(&desc.pgio, inode, false,
+	nfs_pageio_init_read(&pgio, inode, false,
 			     &nfs_async_read_completion_ops);
 
 	while ((page = readahead_page(ractl)) != NULL) {
-		ret = readpage_async_filler(&desc, page);
+		ret = nfs_pageio_add_page(&pgio, ctx, page);
 		put_page(page);
 		if (ret)
 			break;
 	}
 
-	nfs_pageio_complete_read(&desc.pgio);
+	nfs_pageio_complete_read(&pgio);
 
-	put_nfs_open_context(desc.ctx);
+	put_nfs_open_context(ctx);
 out:
 	trace_nfs_aop_readahead_done(inode, nr_pages, ret);
 }
-- 
2.31.1



* [PATCH v6 2/3] NFS: Configure support for netfs when NFS fscache is configured
  2022-09-04  9:05 [PATCH v6 0/3] Convert NFS with fscache to the netfs API Dave Wysochanski
  2022-09-04  9:05 ` [PATCH v6 1/3] NFS: Rename readpage_async_filler to nfs_pageio_add_page Dave Wysochanski
@ 2022-09-04  9:05 ` Dave Wysochanski
  2022-09-04  9:05 ` [PATCH v6 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled Dave Wysochanski
  2 siblings, 0 replies; 8+ messages in thread
From: Dave Wysochanski @ 2022-09-04  9:05 UTC (permalink / raw)
  To: Anna Schumaker, Trond Myklebust, David Howells
  Cc: linux-nfs, linux-cachefs, Benjamin Maynard, Daire Byrne

As first steps for support of the netfs library when NFS_FSCACHE is
configured, add NETFS_SUPPORT to Kconfig and add the required netfs_inode
into struct nfs_inode.

Using netfs requires that the VFS inode structure be stored inside
struct netfs_inode, along with the fscache_cookie.  Thus, create a new
helper, VFS_I(), which is defined differently depending on whether
NFS_FSCACHE is configured.
In addition, use the netfs_inode() and netfs_i_cookie() helpers,
and remove our own helper, nfs_i_fscache().

Later patches will convert NFS fscache to fully use netfs.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/Kconfig         |  1 +
 fs/nfs/delegation.c    |  2 +-
 fs/nfs/dir.c           |  2 +-
 fs/nfs/fscache.c       | 20 +++++++++-----------
 fs/nfs/fscache.h       | 15 ++++++---------
 fs/nfs/inode.c         |  6 +++---
 fs/nfs/internal.h      |  2 +-
 fs/nfs/pnfs.c          | 12 ++++++------
 fs/nfs/write.c         |  2 +-
 include/linux/nfs_fs.h | 34 +++++++++++++++++++++++-----------
 10 files changed, 52 insertions(+), 44 deletions(-)

diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig
index 14a72224b657..8fbb6caf3481 100644
--- a/fs/nfs/Kconfig
+++ b/fs/nfs/Kconfig
@@ -171,6 +171,7 @@ config ROOT_NFS
 config NFS_FSCACHE
 	bool "Provide NFS client caching support"
 	depends on NFS_FS=m && FSCACHE || NFS_FS=y && FSCACHE=y
+	select NETFS_SUPPORT
 	help
 	  Say Y here if you want NFS data to be cached locally on disc through
 	  the general filesystem cache manager
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 5c97cad741a7..b5c492d40367 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -306,7 +306,7 @@ nfs_start_delegation_return_locked(struct nfs_inode *nfsi)
 	}
 	spin_unlock(&delegation->lock);
 	if (ret)
-		nfs_clear_verifier_delegated(&nfsi->vfs_inode);
+		nfs_clear_verifier_delegated(VFS_I(nfsi));
 out:
 	return ret;
 }
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 5d6c2ddc7ea6..b63c624cea6d 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2799,7 +2799,7 @@ nfs_do_access_cache_scan(unsigned int nr_to_scan)
 
 		if (nr_to_scan-- == 0)
 			break;
-		inode = &nfsi->vfs_inode;
+		inode = VFS_I(nfsi);
 		spin_lock(&inode->i_lock);
 		if (list_empty(&nfsi->access_cache_entry_lru))
 			goto remove_lru_entry;
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index e861d7bae305..a6fc1c8b6644 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -163,13 +163,14 @@ void nfs_fscache_init_inode(struct inode *inode)
 	struct nfs_server *nfss = NFS_SERVER(inode);
 	struct nfs_inode *nfsi = NFS_I(inode);
 
-	nfsi->fscache = NULL;
+	netfs_inode(inode)->cache = NULL;
 	if (!(nfss->fscache && S_ISREG(inode->i_mode)))
 		return;
 
 	nfs_fscache_update_auxdata(&auxdata, inode);
 
-	nfsi->fscache = fscache_acquire_cookie(NFS_SB(inode->i_sb)->fscache,
+	netfs_inode(inode)->cache = fscache_acquire_cookie(
+					       nfss->fscache,
 					       0,
 					       nfsi->fh.data, /* index_key */
 					       nfsi->fh.size,
@@ -183,11 +184,8 @@ void nfs_fscache_init_inode(struct inode *inode)
  */
 void nfs_fscache_clear_inode(struct inode *inode)
 {
-	struct nfs_inode *nfsi = NFS_I(inode);
-	struct fscache_cookie *cookie = nfs_i_fscache(inode);
-
-	fscache_relinquish_cookie(cookie, false);
-	nfsi->fscache = NULL;
+	fscache_relinquish_cookie(netfs_i_cookie(&NFS_I(inode)->netfs), false);
+	netfs_inode(inode)->cache = NULL;
 }
 
 /*
@@ -212,7 +210,7 @@ void nfs_fscache_clear_inode(struct inode *inode)
 void nfs_fscache_open_file(struct inode *inode, struct file *filp)
 {
 	struct nfs_fscache_inode_auxdata auxdata;
-	struct fscache_cookie *cookie = nfs_i_fscache(inode);
+	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
 	bool open_for_write = inode_is_open_for_write(inode);
 
 	if (!fscache_cookie_valid(cookie))
@@ -230,7 +228,7 @@ EXPORT_SYMBOL_GPL(nfs_fscache_open_file);
 void nfs_fscache_release_file(struct inode *inode, struct file *filp)
 {
 	struct nfs_fscache_inode_auxdata auxdata;
-	struct fscache_cookie *cookie = nfs_i_fscache(inode);
+	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
 	loff_t i_size = i_size_read(inode);
 
 	nfs_fscache_update_auxdata(&auxdata, inode);
@@ -243,7 +241,7 @@ void nfs_fscache_release_file(struct inode *inode, struct file *filp)
 static int fscache_fallback_read_page(struct inode *inode, struct page *page)
 {
 	struct netfs_cache_resources cres;
-	struct fscache_cookie *cookie = nfs_i_fscache(inode);
+	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
 	struct iov_iter iter;
 	struct bio_vec bvec[1];
 	int ret;
@@ -271,7 +269,7 @@ static int fscache_fallback_write_page(struct inode *inode, struct page *page,
 				       bool no_space_allocated_yet)
 {
 	struct netfs_cache_resources cres;
-	struct fscache_cookie *cookie = nfs_i_fscache(inode);
+	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
 	struct iov_iter iter;
 	struct bio_vec bvec[1];
 	loff_t start = page_offset(page);
diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
index 2a37af880978..38614ed8f951 100644
--- a/fs/nfs/fscache.h
+++ b/fs/nfs/fscache.h
@@ -54,7 +54,7 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
 		if (current_is_kswapd() || !(gfp & __GFP_FS))
 			return false;
 		folio_wait_fscache(folio);
-		fscache_note_page_release(nfs_i_fscache(folio->mapping->host));
+		fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
 		nfs_inc_fscache_stats(folio->mapping->host,
 				      NFSIOS_FSCACHE_PAGES_UNCACHED);
 	}
@@ -66,7 +66,7 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
  */
 static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
 {
-	if (nfs_i_fscache(inode))
+	if (netfs_inode(inode)->cache)
 		return __nfs_fscache_read_page(inode, page);
 	return -ENOBUFS;
 }
@@ -78,7 +78,7 @@ static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
 static inline void nfs_fscache_write_page(struct inode *inode,
 					   struct page *page)
 {
-	if (nfs_i_fscache(inode))
+	if (netfs_inode(inode)->cache)
 		__nfs_fscache_write_page(inode, page);
 }
 
@@ -101,13 +101,10 @@ static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *
 static inline void nfs_fscache_invalidate(struct inode *inode, int flags)
 {
 	struct nfs_fscache_inode_auxdata auxdata;
-	struct nfs_inode *nfsi = NFS_I(inode);
+	struct fscache_cookie *cookie =  netfs_i_cookie(&NFS_I(inode)->netfs);
 
-	if (nfsi->fscache) {
-		nfs_fscache_update_auxdata(&auxdata, inode);
-		fscache_invalidate(nfsi->fscache, &auxdata,
-				   i_size_read(inode), flags);
-	}
+	nfs_fscache_update_auxdata(&auxdata, inode);
+	fscache_invalidate(cookie, &auxdata, i_size_read(inode), flags);
 }
 
 /*
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index bea7c005119c..aa2aec785ab5 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1411,7 +1411,7 @@ int nfs_revalidate_mapping(struct inode *inode, struct address_space *mapping)
 
 static bool nfs_file_has_writers(struct nfs_inode *nfsi)
 {
-	struct inode *inode = &nfsi->vfs_inode;
+	struct inode *inode = VFS_I(nfsi);
 
 	if (!S_ISREG(inode->i_mode))
 		return false;
@@ -2249,7 +2249,7 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
 #ifdef CONFIG_NFS_V4_2
 	nfsi->xattr_cache = NULL;
 #endif
-	return &nfsi->vfs_inode;
+	return VFS_I(nfsi);
 }
 EXPORT_SYMBOL_GPL(nfs_alloc_inode);
 
@@ -2273,7 +2273,7 @@ static void init_once(void *foo)
 {
 	struct nfs_inode *nfsi = (struct nfs_inode *) foo;
 
-	inode_init_once(&nfsi->vfs_inode);
+	inode_init_once(VFS_I(nfsi));
 	INIT_LIST_HEAD(&nfsi->open_files);
 	INIT_LIST_HEAD(&nfsi->access_cache_entry_lru);
 	INIT_LIST_HEAD(&nfsi->access_cache_inode_lru);
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 27c720d71b4e..273687082992 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -355,7 +355,7 @@ nfs4_label_copy(struct nfs4_label *dst, struct nfs4_label *src)
 
 static inline void nfs_zap_label_cache_locked(struct nfs_inode *nfsi)
 {
-	if (nfs_server_capable(&nfsi->vfs_inode, NFS_CAP_SECURITY_LABEL))
+	if (nfs_server_capable(VFS_I(nfsi), NFS_CAP_SECURITY_LABEL))
 		nfsi->cache_validity |= NFS_INO_INVALID_LABEL;
 }
 #else
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 2613b7e36eb9..035bf2eac2cf 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -763,19 +763,19 @@ static struct pnfs_layout_hdr *__pnfs_destroy_layout(struct nfs_inode *nfsi)
 	struct pnfs_layout_hdr *lo;
 	LIST_HEAD(tmp_list);
 
-	spin_lock(&nfsi->vfs_inode.i_lock);
+	spin_lock(&VFS_I(nfsi)->i_lock);
 	lo = nfsi->layout;
 	if (lo) {
 		pnfs_get_layout_hdr(lo);
 		pnfs_mark_layout_stateid_invalid(lo, &tmp_list);
 		pnfs_layout_clear_fail_bit(lo, NFS_LAYOUT_RO_FAILED);
 		pnfs_layout_clear_fail_bit(lo, NFS_LAYOUT_RW_FAILED);
-		spin_unlock(&nfsi->vfs_inode.i_lock);
+		spin_unlock(&VFS_I(nfsi)->i_lock);
 		pnfs_free_lseg_list(&tmp_list);
-		nfs_commit_inode(&nfsi->vfs_inode, 0);
+		nfs_commit_inode(VFS_I(nfsi), 0);
 		pnfs_put_layout_hdr(lo);
 	} else
-		spin_unlock(&nfsi->vfs_inode.i_lock);
+		spin_unlock(&VFS_I(nfsi)->i_lock);
 	return lo;
 }
 
@@ -790,9 +790,9 @@ static bool pnfs_layout_removed(struct nfs_inode *nfsi,
 {
 	bool ret;
 
-	spin_lock(&nfsi->vfs_inode.i_lock);
+	spin_lock(&VFS_I(nfsi)->i_lock);
 	ret = nfsi->layout != lo;
-	spin_unlock(&nfsi->vfs_inode.i_lock);
+	spin_unlock(&VFS_I(nfsi)->i_lock);
 	return ret;
 }
 
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 1843fa235d9b..d84121799a7a 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -828,7 +828,7 @@ nfs_page_search_commits_for_head_request_locked(struct nfs_inode *nfsi,
 {
 	struct nfs_page *freq, *t;
 	struct nfs_commit_info cinfo;
-	struct inode *inode = &nfsi->vfs_inode;
+	struct inode *inode = VFS_I(nfsi);
 
 	nfs_init_cinfo_from_inode(&cinfo, inode);
 
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 7931fa472561..a1c402e26abf 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -31,6 +31,10 @@
 #include <linux/sunrpc/auth.h>
 #include <linux/sunrpc/clnt.h>
 
+#ifdef CONFIG_NFS_FSCACHE
+#include <linux/netfs.h>
+#endif
+
 #include <linux/nfs.h>
 #include <linux/nfs2.h>
 #include <linux/nfs3.h>
@@ -204,9 +208,11 @@ struct nfs_inode {
 	__u64 write_io;
 	__u64 read_io;
 #ifdef CONFIG_NFS_FSCACHE
-	struct fscache_cookie	*fscache;
-#endif
+	struct netfs_inode	netfs; /* netfs context and VFS inode */
+#else
 	struct inode		vfs_inode;
+#endif
+
 
 #ifdef CONFIG_NFS_V4_2
 	struct nfs4_xattr_cache *xattr_cache;
@@ -281,10 +287,25 @@ struct nfs4_copy_state {
 #define NFS_INO_LAYOUTSTATS	(11)		/* layoutstats inflight */
 #define NFS_INO_ODIRECT		(12)		/* I/O setting is O_DIRECT */
 
+#ifdef CONFIG_NFS_FSCACHE
+static inline struct inode *VFS_I(struct nfs_inode *nfsi)
+{
+	return &nfsi->netfs.inode;
+}
+static inline struct nfs_inode *NFS_I(const struct inode *inode)
+{
+	return container_of(inode, struct nfs_inode, netfs.inode);
+}
+#else
+static inline struct inode *VFS_I(struct nfs_inode *nfsi)
+{
+	return &nfsi->vfs_inode;
+}
 static inline struct nfs_inode *NFS_I(const struct inode *inode)
 {
 	return container_of(inode, struct nfs_inode, vfs_inode);
 }
+#endif
 
 static inline struct nfs_server *NFS_SB(const struct super_block *s)
 {
@@ -328,15 +349,6 @@ static inline int NFS_STALE(const struct inode *inode)
 	return test_bit(NFS_INO_STALE, &NFS_I(inode)->flags);
 }
 
-static inline struct fscache_cookie *nfs_i_fscache(struct inode *inode)
-{
-#ifdef CONFIG_NFS_FSCACHE
-	return NFS_I(inode)->fscache;
-#else
-	return NULL;
-#endif
-}
-
 static inline __u64 NFS_FILEID(const struct inode *inode)
 {
 	return NFS_I(inode)->fileid;
-- 
2.31.1



* [PATCH v6 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled
  2022-09-04  9:05 [PATCH v6 0/3] Convert NFS with fscache to the netfs API Dave Wysochanski
  2022-09-04  9:05 ` [PATCH v6 1/3] NFS: Rename readpage_async_filler to nfs_pageio_add_page Dave Wysochanski
  2022-09-04  9:05 ` [PATCH v6 2/3] NFS: Configure support for netfs when NFS fscache is configured Dave Wysochanski
@ 2022-09-04  9:05 ` Dave Wysochanski
  2022-09-04 13:59   ` Jeff Layton
  2 siblings, 1 reply; 8+ messages in thread
From: Dave Wysochanski @ 2022-09-04  9:05 UTC (permalink / raw)
  To: Anna Schumaker, Trond Myklebust, David Howells
  Cc: linux-nfs, linux-cachefs, Benjamin Maynard, Daire Byrne

Convert the NFS buffered read code paths to corresponding netfs APIs,
but only when fscache is configured and enabled.

The netfs API defines struct netfs_request_ops which must be filled
in by the network filesystem.  For NFS, we only need to define 5 of
the functions, the main one being the issue_read() function.
The issue_read() function is called by the netfs layer when a read
cannot be fulfilled locally, and must be sent to the server (either
the cache is not active, or it is active but the data is not available).
Once the read from the server is complete, netfs requires a call to
netfs_subreq_terminated() which conveys either how many bytes were read
successfully, or an error.  Note that issue_read() is called with a
structure, netfs_io_subrequest, which defines the IO requested and
contains a start and a length (both in bytes); it assumes the
underlying netfs will return either an error for the whole region or
the number of bytes successfully read.

The NFS IO path is page based and the main APIs are the pgio APIs defined
in pagelist.c.  For the pgio APIs, there is no way for the caller to
know how many RPCs will be sent and how the pages will be broken up
into underlying RPCs, each of which will have their own return code.
Thus, NFS needs some way to accommodate the netfs API requirement on
the single response to the whole request, while also minimizing
disruptive changes to the NFS pgio layer.  The approach taken with this
patch is to allocate a small structure for each nfs_netfs_issue_read() call
to keep the final error value or the number of bytes successfully read.
The refcount on the structure is also used as a marker for the last
RPC completion; it is updated inside nfs_netfs_read_initiate() and
nfs_netfs_read_done() when an nfs_pgio_header contains a valid pointer
to the data.  Then finally, in nfs_read_completion(), call into
nfs_netfs_read_completion() to update the final error value and bytes
read, and check the refcount to determine whether this is the final
RPC completion.  If this is the last RPC, then in the final put on
the structure, call into netfs_subreq_terminated() with the final
error value or the number of bytes successfully transferred.

Suggested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
---
 fs/nfs/fscache.c         | 241 ++++++++++++++++++++++++---------------
 fs/nfs/fscache.h         |  94 +++++++++------
 fs/nfs/inode.c           |   2 +
 fs/nfs/internal.h        |   9 ++
 fs/nfs/pagelist.c        |  12 ++
 fs/nfs/read.c            |  51 ++++-----
 include/linux/nfs_page.h |   3 +
 include/linux/nfs_xdr.h  |   3 +
 8 files changed, 263 insertions(+), 152 deletions(-)

diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index a6fc1c8b6644..9b7df3d61c35 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -15,6 +15,9 @@
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/iversion.h>
+#include <linux/xarray.h>
+#include <linux/fscache.h>
+#include <linux/netfs.h>
 
 #include "internal.h"
 #include "iostat.h"
@@ -184,7 +187,7 @@ void nfs_fscache_init_inode(struct inode *inode)
  */
 void nfs_fscache_clear_inode(struct inode *inode)
 {
-	fscache_relinquish_cookie(netfs_i_cookie(&NFS_I(inode)->netfs), false);
+	fscache_relinquish_cookie(netfs_i_cookie(netfs_inode(inode)), false);
 	netfs_inode(inode)->cache = NULL;
 }
 
@@ -210,7 +213,7 @@ void nfs_fscache_clear_inode(struct inode *inode)
 void nfs_fscache_open_file(struct inode *inode, struct file *filp)
 {
 	struct nfs_fscache_inode_auxdata auxdata;
-	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
+	struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
 	bool open_for_write = inode_is_open_for_write(inode);
 
 	if (!fscache_cookie_valid(cookie))
@@ -228,119 +231,169 @@ EXPORT_SYMBOL_GPL(nfs_fscache_open_file);
 void nfs_fscache_release_file(struct inode *inode, struct file *filp)
 {
 	struct nfs_fscache_inode_auxdata auxdata;
-	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
+	struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
 	loff_t i_size = i_size_read(inode);
 
 	nfs_fscache_update_auxdata(&auxdata, inode);
 	fscache_unuse_cookie(cookie, &auxdata, &i_size);
 }
 
-/*
- * Fallback page reading interface.
- */
-static int fscache_fallback_read_page(struct inode *inode, struct page *page)
+int nfs_netfs_read_folio(struct file *file, struct folio *folio)
 {
-	struct netfs_cache_resources cres;
-	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
-	struct iov_iter iter;
-	struct bio_vec bvec[1];
-	int ret;
-
-	memset(&cres, 0, sizeof(cres));
-	bvec[0].bv_page		= page;
-	bvec[0].bv_offset	= 0;
-	bvec[0].bv_len		= PAGE_SIZE;
-	iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
-
-	ret = fscache_begin_read_operation(&cres, cookie);
-	if (ret < 0)
-		return ret;
-
-	ret = fscache_read(&cres, page_offset(page), &iter, NETFS_READ_HOLE_FAIL,
-			   NULL, NULL);
-	fscache_end_operation(&cres);
-	return ret;
+	if (!netfs_inode(folio_inode(folio))->cache)
+		return -ENOBUFS;
+
+	return netfs_read_folio(file, folio);
 }
 
-/*
- * Fallback page writing interface.
- */
-static int fscache_fallback_write_page(struct inode *inode, struct page *page,
-				       bool no_space_allocated_yet)
+int nfs_netfs_readahead(struct readahead_control *ractl)
 {
-	struct netfs_cache_resources cres;
-	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
-	struct iov_iter iter;
-	struct bio_vec bvec[1];
-	loff_t start = page_offset(page);
-	size_t len = PAGE_SIZE;
-	int ret;
-
-	memset(&cres, 0, sizeof(cres));
-	bvec[0].bv_page		= page;
-	bvec[0].bv_offset	= 0;
-	bvec[0].bv_len		= PAGE_SIZE;
-	iov_iter_bvec(&iter, WRITE, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
-
-	ret = fscache_begin_write_operation(&cres, cookie);
-	if (ret < 0)
-		return ret;
-
-	ret = cres.ops->prepare_write(&cres, &start, &len, i_size_read(inode),
-				      no_space_allocated_yet);
-	if (ret == 0)
-		ret = fscache_write(&cres, page_offset(page), &iter, NULL, NULL);
-	fscache_end_operation(&cres);
-	return ret;
+	struct inode *inode = ractl->mapping->host;
+
+	if (!netfs_inode(inode)->cache)
+		return -ENOBUFS;
+
+	netfs_readahead(ractl);
+	return 0;
 }
 
-/*
- * Retrieve a page from fscache
- */
-int __nfs_fscache_read_page(struct inode *inode, struct page *page)
+atomic_t nfs_netfs_debug_id;
+static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
 {
-	int ret;
+	rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
+	rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
 
-	trace_nfs_fscache_read_page(inode, page);
-	if (PageChecked(page)) {
-		ClearPageChecked(page);
-		ret = 1;
-		goto out;
-	}
+	return 0;
+}
 
-	ret = fscache_fallback_read_page(inode, page);
-	if (ret < 0) {
-		nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_FAIL);
-		SetPageChecked(page);
-		goto out;
-	}
+static void nfs_netfs_free_request(struct netfs_io_request *rreq)
+{
+	put_nfs_open_context(rreq->netfs_priv);
+}
 
-	/* Read completed synchronously */
-	nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_OK);
-	SetPageUptodate(page);
-	ret = 0;
-out:
-	trace_nfs_fscache_read_page_exit(inode, page, ret);
-	return ret;
+static inline int nfs_netfs_begin_cache_operation(struct netfs_io_request *rreq)
+{
+	return fscache_begin_read_operation(&rreq->cache_resources,
+					    netfs_i_cookie(netfs_inode(rreq->inode)));
 }
 
-/*
- * Store a newly fetched page in fscache.  We can be certain there's no page
- * stored in the cache as yet otherwise we would've read it from there.
- */
-void __nfs_fscache_write_page(struct inode *inode, struct page *page)
+static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
 {
-	int ret;
+	struct nfs_netfs_io_data *netfs;
+
+	netfs = kzalloc(sizeof(*netfs), GFP_KERNEL_ACCOUNT);
+	if (!netfs)
+		return NULL;
+	netfs->sreq = sreq;
+	refcount_set(&netfs->refcount, 1);
+	return netfs;
+}
 
-	trace_nfs_fscache_write_page(inode, page);
+static bool nfs_netfs_clamp_length(struct netfs_io_subrequest *sreq)
+{
+	size_t	rsize = NFS_SB(sreq->rreq->inode->i_sb)->rsize;
 
-	ret = fscache_fallback_write_page(inode, page, true);
+	sreq->len = min(sreq->len, rsize);
+	return true;
+}
 
-	if (ret != 0) {
-		nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_FAIL);
-		nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_UNCACHED);
-	} else {
-		nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_OK);
+static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
+{
+	struct nfs_pageio_descriptor pgio;
+	struct inode *inode = sreq->rreq->inode;
+	struct nfs_open_context *ctx = sreq->rreq->netfs_priv;
+	struct page *page;
+	int err;
+	pgoff_t start = (sreq->start + sreq->transferred) >> PAGE_SHIFT;
+	pgoff_t last = ((sreq->start + sreq->len -
+			 sreq->transferred - 1) >> PAGE_SHIFT);
+	XA_STATE(xas, &sreq->rreq->mapping->i_pages, start);
+
+	nfs_pageio_init_read(&pgio, inode, false,
+			     &nfs_async_read_completion_ops);
+
+	pgio.pg_netfs = nfs_netfs_alloc(sreq); /* used in completion */
+	if (!pgio.pg_netfs)
+		return netfs_subreq_terminated(sreq, -ENOMEM, false);
+
+	xas_lock(&xas);
+	xas_for_each(&xas, page, last) {
+		/* nfs_pageio_add_page() may schedule() due to pNFS layout and other RPCs  */
+		xas_pause(&xas);
+		xas_unlock(&xas);
+		err = nfs_pageio_add_page(&pgio, ctx, page);
+		if (err < 0)
+			return netfs_subreq_terminated(sreq, err, false);
+		xas_lock(&xas);
 	}
-	trace_nfs_fscache_write_page_exit(inode, page, ret);
+	xas_unlock(&xas);
+	nfs_pageio_complete_read(&pgio);
+	nfs_netfs_put(pgio.pg_netfs);
 }
+
+void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr)
+{
+	struct nfs_netfs_io_data        *netfs = hdr->netfs;
+
+	if (!netfs)
+		return;
+
+	nfs_netfs_get(netfs);
+}
+
+void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr)
+{
+	struct nfs_netfs_io_data        *netfs = hdr->netfs;
+
+	if (!netfs)
+		return;
+
+	if (hdr->res.op_status)
+		/*
+		 * Retryable errors such as BAD_STATEID will be re-issued,
+		 * so reduce refcount.
+		 */
+		nfs_netfs_put(netfs);
+}
+
+void nfs_netfs_readpage_release(struct nfs_page *req)
+{
+	struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
+
+	/*
+	 * If fscache is enabled, netfs will unlock pages.
+	 */
+	if (netfs_inode(inode)->cache)
+		return;
+
+	unlock_page(req->wb_page);
+}
+
+void nfs_netfs_read_completion(struct nfs_pgio_header *hdr)
+{
+	struct nfs_netfs_io_data        *netfs = hdr->netfs;
+	struct netfs_io_subrequest      *sreq;
+
+	if (!netfs)
+		return;
+
+	sreq = netfs->sreq;
+	if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
+		__set_bit(NETFS_SREQ_CLEAR_TAIL, &sreq->flags);
+
+	if (hdr->error)
+		netfs->error = hdr->error;
+	else
+		atomic64_add(hdr->res.count, &netfs->transferred);
+
+	nfs_netfs_put(netfs);
+	hdr->netfs = NULL;
+}
+
+const struct netfs_request_ops nfs_netfs_ops = {
+	.init_request		= nfs_netfs_init_request,
+	.free_request		= nfs_netfs_free_request,
+	.begin_cache_operation	= nfs_netfs_begin_cache_operation,
+	.issue_read		= nfs_netfs_issue_read,
+	.clamp_length		= nfs_netfs_clamp_length
+};
diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
index 38614ed8f951..fb782b917235 100644
--- a/fs/nfs/fscache.h
+++ b/fs/nfs/fscache.h
@@ -34,6 +34,49 @@ struct nfs_fscache_inode_auxdata {
 	u64	change_attr;
 };
 
+struct nfs_netfs_io_data {
+	/*
+	 * NFS may split a netfs_io_subrequest into multiple RPCs, each
+	 * with its own read completion.  In netfs, we can only call
+	 * netfs_subreq_terminated() once for each subrequest.  Use the
+	 * refcount here to double as a marker of the last RPC completion,
+	 * and only call netfs via netfs_subreq_terminated() once.
+	 */
+	refcount_t			refcount;
+	struct netfs_io_subrequest	*sreq;
+
+	/*
+	 * Final disposition of the netfs_io_subrequest, sent in
+	 * netfs_subreq_terminated()
+	 */
+	atomic64_t	transferred;
+	int		error;
+};
+
+static inline void nfs_netfs_get(struct nfs_netfs_io_data *netfs)
+{
+	refcount_inc(&netfs->refcount);
+}
+
+static inline void nfs_netfs_put(struct nfs_netfs_io_data *netfs)
+{
+	/* Only the last RPC completion should call netfs_subreq_terminated() */
+	if (refcount_dec_and_test(&netfs->refcount)) {
+		netfs_subreq_terminated(netfs->sreq,
+					netfs->error ?: atomic64_read(&netfs->transferred),
+					false);
+		kfree(netfs);
+	}
+}
+static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi)
+{
+	netfs_inode_init(&nfsi->netfs, &nfs_netfs_ops);
+}
+extern void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr);
+extern void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr);
+extern void nfs_netfs_read_completion(struct nfs_pgio_header *hdr);
+extern void nfs_netfs_readpage_release(struct nfs_page *req);
+
 /*
  * fscache.c
  */
@@ -44,9 +87,8 @@ extern void nfs_fscache_init_inode(struct inode *);
 extern void nfs_fscache_clear_inode(struct inode *);
 extern void nfs_fscache_open_file(struct inode *, struct file *);
 extern void nfs_fscache_release_file(struct inode *, struct file *);
-
-extern int __nfs_fscache_read_page(struct inode *, struct page *);
-extern void __nfs_fscache_write_page(struct inode *, struct page *);
+extern int nfs_netfs_readahead(struct readahead_control *ractl);
+extern int nfs_netfs_read_folio(struct file *file, struct folio *folio);
 
 static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
 {
@@ -54,34 +96,11 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
 		if (current_is_kswapd() || !(gfp & __GFP_FS))
 			return false;
 		folio_wait_fscache(folio);
-		fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
-		nfs_inc_fscache_stats(folio->mapping->host,
-				      NFSIOS_FSCACHE_PAGES_UNCACHED);
 	}
+	fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
 	return true;
 }
 
-/*
- * Retrieve a page from an inode data storage object.
- */
-static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
-{
-	if (netfs_inode(inode)->cache)
-		return __nfs_fscache_read_page(inode, page);
-	return -ENOBUFS;
-}
-
-/*
- * Store a page newly fetched from the server in an inode data storage object
- * in the cache.
- */
-static inline void nfs_fscache_write_page(struct inode *inode,
-					   struct page *page)
-{
-	if (netfs_inode(inode)->cache)
-		__nfs_fscache_write_page(inode, page);
-}
-
 static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
 					      struct inode *inode)
 {
@@ -118,6 +137,14 @@ static inline const char *nfs_server_fscache_state(struct nfs_server *server)
 }
 
 #else /* CONFIG_NFS_FSCACHE */
+static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi) {}
+static inline void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr) {}
+static inline void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr) {}
+static inline void nfs_netfs_read_completion(struct nfs_pgio_header *hdr) {}
+static inline void nfs_netfs_readpage_release(struct nfs_page *req)
+{
+	unlock_page(req->wb_page);
+}
 static inline void nfs_fscache_release_super_cookie(struct super_block *sb) {}
 
 static inline void nfs_fscache_init_inode(struct inode *inode) {}
@@ -125,16 +152,19 @@ static inline void nfs_fscache_clear_inode(struct inode *inode) {}
 static inline void nfs_fscache_open_file(struct inode *inode,
 					 struct file *filp) {}
 static inline void nfs_fscache_release_file(struct inode *inode, struct file *file) {}
-
-static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
+static inline int nfs_netfs_readahead(struct readahead_control *ractl)
 {
-	return true; /* may release folio */
+	return -ENOBUFS;
 }
-static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
+static inline int nfs_netfs_read_folio(struct file *file, struct folio *folio)
 {
 	return -ENOBUFS;
 }
-static inline void nfs_fscache_write_page(struct inode *inode, struct page *page) {}
+
+static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
+{
+	return true; /* may release folio */
+}
 static inline void nfs_fscache_invalidate(struct inode *inode, int flags) {}
 
 static inline const char *nfs_server_fscache_state(struct nfs_server *server)
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index aa2aec785ab5..b36a02b932e8 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -2249,6 +2249,8 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
 #ifdef CONFIG_NFS_V4_2
 	nfsi->xattr_cache = NULL;
 #endif
+	nfs_netfs_inode_init(nfsi);
+
 	return VFS_I(nfsi);
 }
 EXPORT_SYMBOL_GPL(nfs_alloc_inode);
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 273687082992..e5589036c1f8 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -453,6 +453,10 @@ extern void nfs_sb_deactive(struct super_block *sb);
 extern int nfs_client_for_each_server(struct nfs_client *clp,
 				      int (*fn)(struct nfs_server *, void *),
 				      void *data);
+#ifdef CONFIG_NFS_FSCACHE
+extern const struct netfs_request_ops nfs_netfs_ops;
+#endif
+
 /* io.c */
 extern void nfs_start_io_read(struct inode *inode);
 extern void nfs_end_io_read(struct inode *inode);
@@ -482,9 +486,14 @@ extern int nfs4_get_rootfh(struct nfs_server *server, struct nfs_fh *mntfh, bool
 
 struct nfs_pgio_completion_ops;
 /* read.c */
+extern const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
 extern void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
 			struct inode *inode, bool force_mds,
 			const struct nfs_pgio_completion_ops *compl_ops);
+extern int nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
+			       struct nfs_open_context *ctx,
+			       struct page *page);
+extern void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio);
 extern void nfs_read_prepare(struct rpc_task *task, void *calldata);
 extern void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio);
 
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 317cedfa52bf..e28754476d1b 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -25,6 +25,7 @@
 #include "internal.h"
 #include "pnfs.h"
 #include "nfstrace.h"
+#include "fscache.h"
 
 #define NFSDBG_FACILITY		NFSDBG_PAGECACHE
 
@@ -68,6 +69,10 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
 	hdr->good_bytes = mirror->pg_count;
 	hdr->io_completion = desc->pg_io_completion;
 	hdr->dreq = desc->pg_dreq;
+#ifdef CONFIG_NFS_FSCACHE
+	if (desc->pg_netfs)
+		hdr->netfs = desc->pg_netfs;
+#endif
 	hdr->release = release;
 	hdr->completion_ops = desc->pg_completion_ops;
 	if (hdr->completion_ops->init_hdr)
@@ -846,6 +851,9 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
 	desc->pg_lseg = NULL;
 	desc->pg_io_completion = NULL;
 	desc->pg_dreq = NULL;
+#ifdef CONFIG_NFS_FSCACHE
+	desc->pg_netfs = NULL;
+#endif
 	desc->pg_bsize = bsize;
 
 	desc->pg_mirror_count = 1;
@@ -940,6 +948,7 @@ int nfs_generic_pgio(struct nfs_pageio_descriptor *desc,
 	/* Set up the argument struct */
 	nfs_pgio_rpcsetup(hdr, mirror->pg_count, desc->pg_ioflags, &cinfo);
 	desc->pg_rpc_callops = &nfs_pgio_common_ops;
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(nfs_generic_pgio);
@@ -1360,6 +1369,9 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,
 
 	desc->pg_io_completion = hdr->io_completion;
 	desc->pg_dreq = hdr->dreq;
+#ifdef CONFIG_NFS_FSCACHE
+	desc->pg_netfs = hdr->netfs;
+#endif
 	list_splice_init(&hdr->pages, &pages);
 	while (!list_empty(&pages)) {
 		struct nfs_page *req = nfs_list_entry(pages.next);
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index 525e82ea9a9e..c74c5fcba87d 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -30,7 +30,7 @@
 
 #define NFSDBG_FACILITY		NFSDBG_PAGECACHE
 
-static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
+const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
 static const struct nfs_rw_ops nfs_rw_read_ops;
 
 static struct kmem_cache *nfs_rdata_cachep;
@@ -74,7 +74,7 @@ void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
 }
 EXPORT_SYMBOL_GPL(nfs_pageio_init_read);
 
-static void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
+void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
 {
 	struct nfs_pgio_mirror *pgm;
 	unsigned long npages;
@@ -110,20 +110,13 @@ EXPORT_SYMBOL_GPL(nfs_pageio_reset_read_mds);
 
 static void nfs_readpage_release(struct nfs_page *req, int error)
 {
-	struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
 	struct page *page = req->wb_page;
 
-	dprintk("NFS: read done (%s/%llu %d@%lld)\n", inode->i_sb->s_id,
-		(unsigned long long)NFS_FILEID(inode), req->wb_bytes,
-		(long long)req_offset(req));
-
 	if (nfs_error_is_fatal_on_server(error) && error != -ETIMEDOUT)
 		SetPageError(page);
-	if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE)) {
-		if (PageUptodate(page))
-			nfs_fscache_write_page(inode, page);
-		unlock_page(page);
-	}
+	if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE))
+		nfs_netfs_readpage_release(req);
+
 	nfs_release_request(req);
 }
 
@@ -177,6 +170,8 @@ static void nfs_read_completion(struct nfs_pgio_header *hdr)
 		nfs_list_remove_request(req);
 		nfs_readpage_release(req, error);
 	}
+	nfs_netfs_read_completion(hdr);
+
 out:
 	hdr->release(hdr);
 }
@@ -187,6 +182,7 @@ static void nfs_initiate_read(struct nfs_pgio_header *hdr,
 			      struct rpc_task_setup *task_setup_data, int how)
 {
 	rpc_ops->read_setup(hdr, msg);
+	nfs_netfs_initiate_read(hdr);
 	trace_nfs_initiate_read(hdr);
 }
 
@@ -202,7 +198,7 @@ nfs_async_read_error(struct list_head *head, int error)
 	}
 }
 
-static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
+const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
 	.error_cleanup = nfs_async_read_error,
 	.completion = nfs_read_completion,
 };
@@ -219,6 +215,7 @@ static int nfs_readpage_done(struct rpc_task *task,
 	if (status != 0)
 		return status;
 
+	nfs_netfs_readpage_done(hdr);
 	nfs_add_stats(inode, NFSIOS_SERVERREADBYTES, hdr->res.count);
 	trace_nfs_readpage_done(task, hdr);
 
@@ -294,12 +291,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
 
 	aligned_len = min_t(unsigned int, ALIGN(len, rsize), PAGE_SIZE);
 
-	if (!IS_SYNC(page->mapping->host)) {
-		error = nfs_fscache_read_page(page->mapping->host, page);
-		if (error == 0)
-			goto out_unlock;
-	}
-
 	new = nfs_create_request(ctx, page, 0, aligned_len);
 	if (IS_ERR(new))
 		goto out_error;
@@ -315,8 +306,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
 	return 0;
 out_error:
 	error = PTR_ERR(new);
-out_unlock:
-	unlock_page(page);
 out:
 	return error;
 }
@@ -355,6 +344,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
 	if (NFS_STALE(inode))
 		goto out_unlock;
 
+	ret = nfs_netfs_read_folio(file, folio);
+	if (!ret)
+		goto out;
+
 	if (file == NULL) {
 		ret = -EBADF;
 		ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
@@ -368,8 +361,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
 			     &nfs_async_read_completion_ops);
 
 	ret = nfs_pageio_add_page(&pgio, ctx, page);
-	if (ret)
-		goto out;
+	if (ret) {
+		put_nfs_open_context(ctx);
+		goto out_unlock;
+	}
 
 	nfs_pageio_complete_read(&pgio);
 	ret = pgio.pg_error < 0 ? pgio.pg_error : 0;
@@ -378,12 +373,12 @@ int nfs_read_folio(struct file *file, struct folio *folio)
 		if (!PageUptodate(page) && !ret)
 			ret = xchg(&ctx->error, 0);
 	}
-out:
 	put_nfs_open_context(ctx);
-	trace_nfs_aop_readpage_done(inode, page, ret);
-	return ret;
+	goto out;
+
 out_unlock:
 	unlock_page(page);
+out:
 	trace_nfs_aop_readpage_done(inode, page, ret);
 	return ret;
 }
@@ -405,6 +400,10 @@ void nfs_readahead(struct readahead_control *ractl)
 	if (NFS_STALE(inode))
 		goto out;
 
+	ret = nfs_netfs_readahead(ractl);
+	if (!ret)
+		goto out;
+
 	if (file == NULL) {
 		ret = -EBADF;
 		ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index ba7e2e4b0926..8eeb16d9bacd 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -101,6 +101,9 @@ struct nfs_pageio_descriptor {
 	struct pnfs_layout_segment *pg_lseg;
 	struct nfs_io_completion *pg_io_completion;
 	struct nfs_direct_req	*pg_dreq;
+#ifdef CONFIG_NFS_FSCACHE
+	void			*pg_netfs;
+#endif
 	unsigned int		pg_bsize;	/* default bsize for mirrors */
 
 	u32			pg_mirror_count;
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index e86cf6642d21..e196ef595908 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1619,6 +1619,9 @@ struct nfs_pgio_header {
 	const struct nfs_rw_ops	*rw_ops;
 	struct nfs_io_completion *io_completion;
 	struct nfs_direct_req	*dreq;
+#ifdef CONFIG_NFS_FSCACHE
+	void			*netfs;
+#endif
 
 	int			pnfs_error;
 	int			error;		/* merge with pnfs_error */
-- 
2.31.1



* Re: [PATCH v6 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled
  2022-09-04  9:05 ` [PATCH v6 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled Dave Wysochanski
@ 2022-09-04 13:59   ` Jeff Layton
  2022-09-04 19:51     ` David Wysochanski
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Layton @ 2022-09-04 13:59 UTC (permalink / raw)
  To: Dave Wysochanski, Anna Schumaker, Trond Myklebust, David Howells
  Cc: linux-nfs, linux-cachefs, Benjamin Maynard, Daire Byrne

On Sun, 2022-09-04 at 05:05 -0400, Dave Wysochanski wrote:
> Convert the NFS buffered read code paths to corresponding netfs APIs,
> but only when fscache is configured and enabled.
> 
> The netfs API defines struct netfs_request_ops which must be filled
> in by the network filesystem.  For NFS, we only need to define 5 of
> the functions, the main one being the issue_read() function.
> The issue_read() function is called by the netfs layer when a read
> cannot be fulfilled locally, and must be sent to the server (either
> the cache is not active, or it is active but the data is not available).
> Once the read from the server is complete, netfs requires a call to
> netfs_subreq_terminated() which conveys either how many bytes were read
> successfully, or an error.  Note that issue_read() is called with a
> structure, netfs_io_subrequest, which defines the IO requested, and
> contains a start and a length (both in bytes), and assumes the underlying
> netfs will return either an error for the whole region, or the number
> of bytes successfully read.
> 
> The NFS IO path is page-based and the main APIs are the pgio APIs defined
> in pagelist.c.  For the pgio APIs, there is no way for the caller to
> know how many RPCs will be sent and how the pages will be broken up
> into underlying RPCs, each of which will have its own return code.
> Thus, NFS needs some way to accommodate the netfs API requirement of a
> single response to the whole request, while also minimizing
> disruptive changes to the NFS pgio layer.  The approach taken with this
> patch is to allocate a small structure for each nfs_netfs_issue_read() call
> to keep the final error value or the number of bytes successfully read.
> The refcount on the structure also serves as a marker for the last
> RPC completion, updated inside nfs_netfs_initiate_read() and
> nfs_netfs_readpage_done(), when an nfs_pgio_header contains a valid pointer
> to the data.  Then finally in nfs_read_completion(), call into
> nfs_netfs_read_completion() to update the final error value and bytes
> read, and check the refcount to determine whether this is the final
> RPC completion.  If this is the last RPC, then in the final put on
> the structure, call into netfs_subreq_terminated() with the final
> error value or the number of bytes successfully transferred.
> 
> Suggested-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> ---
>  fs/nfs/fscache.c         | 241 ++++++++++++++++++++++++---------------
>  fs/nfs/fscache.h         |  94 +++++++++------
>  fs/nfs/inode.c           |   2 +
>  fs/nfs/internal.h        |   9 ++
>  fs/nfs/pagelist.c        |  12 ++
>  fs/nfs/read.c            |  51 ++++-----
>  include/linux/nfs_page.h |   3 +
>  include/linux/nfs_xdr.h  |   3 +
>  8 files changed, 263 insertions(+), 152 deletions(-)
> 
> diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> index a6fc1c8b6644..9b7df3d61c35 100644
> --- a/fs/nfs/fscache.c
> +++ b/fs/nfs/fscache.c
> @@ -15,6 +15,9 @@
>  #include <linux/seq_file.h>
>  #include <linux/slab.h>
>  #include <linux/iversion.h>
> +#include <linux/xarray.h>
> +#include <linux/fscache.h>
> +#include <linux/netfs.h>
>  
>  #include "internal.h"
>  #include "iostat.h"
> @@ -184,7 +187,7 @@ void nfs_fscache_init_inode(struct inode *inode)
>   */
>  void nfs_fscache_clear_inode(struct inode *inode)
>  {
> -	fscache_relinquish_cookie(netfs_i_cookie(&NFS_I(inode)->netfs), false);
> +	fscache_relinquish_cookie(netfs_i_cookie(netfs_inode(inode)), false);
>  	netfs_inode(inode)->cache = NULL;
>  }
>  
> @@ -210,7 +213,7 @@ void nfs_fscache_clear_inode(struct inode *inode)
>  void nfs_fscache_open_file(struct inode *inode, struct file *filp)
>  {
>  	struct nfs_fscache_inode_auxdata auxdata;
> -	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> +	struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
>  	bool open_for_write = inode_is_open_for_write(inode);
>  
>  	if (!fscache_cookie_valid(cookie))
> @@ -228,119 +231,169 @@ EXPORT_SYMBOL_GPL(nfs_fscache_open_file);
>  void nfs_fscache_release_file(struct inode *inode, struct file *filp)
>  {
>  	struct nfs_fscache_inode_auxdata auxdata;
> -	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> +	struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
>  	loff_t i_size = i_size_read(inode);
>  
>  	nfs_fscache_update_auxdata(&auxdata, inode);
>  	fscache_unuse_cookie(cookie, &auxdata, &i_size);
>  }
>  
> -/*
> - * Fallback page reading interface.
> - */
> -static int fscache_fallback_read_page(struct inode *inode, struct page *page)
> +int nfs_netfs_read_folio(struct file *file, struct folio *folio)
>  {
> -	struct netfs_cache_resources cres;
> -	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> -	struct iov_iter iter;
> -	struct bio_vec bvec[1];
> -	int ret;
> -
> -	memset(&cres, 0, sizeof(cres));
> -	bvec[0].bv_page		= page;
> -	bvec[0].bv_offset	= 0;
> -	bvec[0].bv_len		= PAGE_SIZE;
> -	iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> -
> -	ret = fscache_begin_read_operation(&cres, cookie);
> -	if (ret < 0)
> -		return ret;
> -
> -	ret = fscache_read(&cres, page_offset(page), &iter, NETFS_READ_HOLE_FAIL,
> -			   NULL, NULL);
> -	fscache_end_operation(&cres);
> -	return ret;
> +	if (!netfs_inode(folio_inode(folio))->cache)
> +		return -ENOBUFS;
> +
> +	return netfs_read_folio(file, folio);
>  }
>  
> -/*
> - * Fallback page writing interface.
> - */
> -static int fscache_fallback_write_page(struct inode *inode, struct page *page,
> -				       bool no_space_allocated_yet)
> +int nfs_netfs_readahead(struct readahead_control *ractl)
>  {
> -	struct netfs_cache_resources cres;
> -	struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> -	struct iov_iter iter;
> -	struct bio_vec bvec[1];
> -	loff_t start = page_offset(page);
> -	size_t len = PAGE_SIZE;
> -	int ret;
> -
> -	memset(&cres, 0, sizeof(cres));
> -	bvec[0].bv_page		= page;
> -	bvec[0].bv_offset	= 0;
> -	bvec[0].bv_len		= PAGE_SIZE;
> -	iov_iter_bvec(&iter, WRITE, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> -
> -	ret = fscache_begin_write_operation(&cres, cookie);
> -	if (ret < 0)
> -		return ret;
> -
> -	ret = cres.ops->prepare_write(&cres, &start, &len, i_size_read(inode),
> -				      no_space_allocated_yet);
> -	if (ret == 0)
> -		ret = fscache_write(&cres, page_offset(page), &iter, NULL, NULL);
> -	fscache_end_operation(&cres);
> -	return ret;
> +	struct inode *inode = ractl->mapping->host;
> +
> +	if (!netfs_inode(inode)->cache)
> +		return -ENOBUFS;
> +
> +	netfs_readahead(ractl);
> +	return 0;
>  }
>  
> -/*
> - * Retrieve a page from fscache
> - */
> -int __nfs_fscache_read_page(struct inode *inode, struct page *page)
> +atomic_t nfs_netfs_debug_id;
> +static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
>  {
> -	int ret;
> +	rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
> +	rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
>  
> -	trace_nfs_fscache_read_page(inode, page);
> -	if (PageChecked(page)) {
> -		ClearPageChecked(page);
> -		ret = 1;
> -		goto out;
> -	}
> +	return 0;
> +}
>  
> -	ret = fscache_fallback_read_page(inode, page);
> -	if (ret < 0) {
> -		nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_FAIL);
> -		SetPageChecked(page);
> -		goto out;
> -	}
> +static void nfs_netfs_free_request(struct netfs_io_request *rreq)
> +{
> +	put_nfs_open_context(rreq->netfs_priv);
> +}
>  
> -	/* Read completed synchronously */
> -	nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_OK);
> -	SetPageUptodate(page);
> -	ret = 0;
> -out:
> -	trace_nfs_fscache_read_page_exit(inode, page, ret);
> -	return ret;
> +static inline int nfs_netfs_begin_cache_operation(struct netfs_io_request *rreq)
> +{
> +	return fscache_begin_read_operation(&rreq->cache_resources,
> +					    netfs_i_cookie(netfs_inode(rreq->inode)));
>  }
>  
> -/*
> - * Store a newly fetched page in fscache.  We can be certain there's no page
> - * stored in the cache as yet otherwise we would've read it from there.
> - */
> -void __nfs_fscache_write_page(struct inode *inode, struct page *page)
> +static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
>  {
> -	int ret;
> +	struct nfs_netfs_io_data *netfs;
> +
> +	netfs = kzalloc(sizeof(*netfs), GFP_KERNEL_ACCOUNT);
> +	if (!netfs)
> +		return NULL;
> +	netfs->sreq = sreq;
> +	refcount_set(&netfs->refcount, 1);
> +	return netfs;
> +}
>  
> -	trace_nfs_fscache_write_page(inode, page);
> +static bool nfs_netfs_clamp_length(struct netfs_io_subrequest *sreq)
> +{
> +	size_t	rsize = NFS_SB(sreq->rreq->inode->i_sb)->rsize;
>  
> -	ret = fscache_fallback_write_page(inode, page, true);
> +	sreq->len = min(sreq->len, rsize);
> +	return true;
> +}
>  
> -	if (ret != 0) {
> -		nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_FAIL);
> -		nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_UNCACHED);
> -	} else {
> -		nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_OK);
> +static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
> +{
> +	struct nfs_pageio_descriptor pgio;
> +	struct inode *inode = sreq->rreq->inode;
> +	struct nfs_open_context *ctx = sreq->rreq->netfs_priv;
> +	struct page *page;
> +	int err;
> +	pgoff_t start = (sreq->start + sreq->transferred) >> PAGE_SHIFT;
> +	pgoff_t last = ((sreq->start + sreq->len -
> +			 sreq->transferred - 1) >> PAGE_SHIFT);
> +	XA_STATE(xas, &sreq->rreq->mapping->i_pages, start);
> +
> +	nfs_pageio_init_read(&pgio, inode, false,
> +			     &nfs_async_read_completion_ops);
> +
> +	pgio.pg_netfs = nfs_netfs_alloc(sreq); /* used in completion */
> +	if (!pgio.pg_netfs)
> +		return netfs_subreq_terminated(sreq, -ENOMEM, false);
> +
> +	xas_lock(&xas);
> +	xas_for_each(&xas, page, last) {
> +		/* nfs_pageio_add_page() may schedule() due to pNFS layout and other RPCs  */
> +		xas_pause(&xas);
> +		xas_unlock(&xas);
> +		err = nfs_pageio_add_page(&pgio, ctx, page);
> +		if (err < 0)
> +			return netfs_subreq_terminated(sreq, err, false);
> +		xas_lock(&xas);
>  	}
> -	trace_nfs_fscache_write_page_exit(inode, page, ret);
> +	xas_unlock(&xas);
> +	nfs_pageio_complete_read(&pgio);
> +	nfs_netfs_put(pgio.pg_netfs);
>  }
> +
> +void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr)
> +{
> +	struct nfs_netfs_io_data        *netfs = hdr->netfs;
> +
> +	if (!netfs)
> +		return;
> +
> +	nfs_netfs_get(netfs);
> +}
> +
> +void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr)
> +{
> +	struct nfs_netfs_io_data        *netfs = hdr->netfs;
> +
> +	if (!netfs)
> +		return;
> +
> +	if (hdr->res.op_status)
> +		/*
> +		 * Retryable errors such as BAD_STATEID will be re-issued,
> +		 * so reduce refcount.
> +		 */
> +		nfs_netfs_put(netfs);
> +}
> +
> +void nfs_netfs_readpage_release(struct nfs_page *req)
> +{
> +	struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
> +
> +	/*
> +	 * If fscache is enabled, netfs will unlock pages.
> +	 */
> +	if (netfs_inode(inode)->cache)
> +		return;
> +
> +	unlock_page(req->wb_page);
> +}
> +
> +void nfs_netfs_read_completion(struct nfs_pgio_header *hdr)
> +{
> +	struct nfs_netfs_io_data        *netfs = hdr->netfs;
> +	struct netfs_io_subrequest      *sreq;
> +
> +	if (!netfs)
> +		return;
> +
> +	sreq = netfs->sreq;
> +	if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
> +		__set_bit(NETFS_SREQ_CLEAR_TAIL, &sreq->flags);
> +
> +	if (hdr->error)
> +		netfs->error = hdr->error;
> +	else
> +		atomic64_add(hdr->res.count, &netfs->transferred);
> +
> +	nfs_netfs_put(netfs);
> +	hdr->netfs = NULL;
> +}
> +
> +const struct netfs_request_ops nfs_netfs_ops = {
> +	.init_request		= nfs_netfs_init_request,
> +	.free_request		= nfs_netfs_free_request,
> +	.begin_cache_operation	= nfs_netfs_begin_cache_operation,
> +	.issue_read		= nfs_netfs_issue_read,
> +	.clamp_length		= nfs_netfs_clamp_length
> +};
> diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
> index 38614ed8f951..fb782b917235 100644
> --- a/fs/nfs/fscache.h
> +++ b/fs/nfs/fscache.h
> @@ -34,6 +34,49 @@ struct nfs_fscache_inode_auxdata {
>  	u64	change_attr;
>  };
>  
> +struct nfs_netfs_io_data {
> +	/*
> +	 * NFS may split a netfs_io_subrequest into multiple RPCs, each
> +	 * with its own read completion.  In netfs, we can only call
> +	 * netfs_subreq_terminated() once for each subrequest.  Use the
> +	 * refcount here to double as a marker of the last RPC completion,
> +	 * and only call netfs via netfs_subreq_terminated() once.
> +	 */
> +	refcount_t			refcount;
> +	struct netfs_io_subrequest	*sreq;
> +
> +	/*
> +	 * Final disposition of the netfs_io_subrequest, sent in
> +	 * netfs_subreq_terminated()
> +	 */
> +	atomic64_t	transferred;
> +	int		error;
> +};
> +
> +static inline void nfs_netfs_get(struct nfs_netfs_io_data *netfs)
> +{
> +	refcount_inc(&netfs->refcount);
> +}
> +
> +static inline void nfs_netfs_put(struct nfs_netfs_io_data *netfs)
> +{
> +	/* Only the last RPC completion should call netfs_subreq_terminated() */
> +	if (refcount_dec_and_test(&netfs->refcount)) {
> +		netfs_subreq_terminated(netfs->sreq,
> +					netfs->error ?: atomic64_read(&netfs->transferred),
> +					false);
> +		kfree(netfs);
> +	}
> +}
> +static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi)
> +{
> +	netfs_inode_init(&nfsi->netfs, &nfs_netfs_ops);
> +}
> +extern void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr);
> +extern void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr);
> +extern void nfs_netfs_read_completion(struct nfs_pgio_header *hdr);
> +extern void nfs_netfs_readpage_release(struct nfs_page *req);
> +
>  /*
>   * fscache.c
>   */
> @@ -44,9 +87,8 @@ extern void nfs_fscache_init_inode(struct inode *);
>  extern void nfs_fscache_clear_inode(struct inode *);
>  extern void nfs_fscache_open_file(struct inode *, struct file *);
>  extern void nfs_fscache_release_file(struct inode *, struct file *);
> -
> -extern int __nfs_fscache_read_page(struct inode *, struct page *);
> -extern void __nfs_fscache_write_page(struct inode *, struct page *);
> +extern int nfs_netfs_readahead(struct readahead_control *ractl);
> +extern int nfs_netfs_read_folio(struct file *file, struct folio *folio);
>  
>  static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
>  {
> @@ -54,34 +96,11 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
>  		if (current_is_kswapd() || !(gfp & __GFP_FS))
>  			return false;
>  		folio_wait_fscache(folio);
> -		fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> -		nfs_inc_fscache_stats(folio->mapping->host,
> -				      NFSIOS_FSCACHE_PAGES_UNCACHED);
>  	}
> +	fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
>  	return true;
>  }
>  
> -/*
> - * Retrieve a page from an inode data storage object.
> - */
> -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> -{
> -	if (netfs_inode(inode)->cache)
> -		return __nfs_fscache_read_page(inode, page);
> -	return -ENOBUFS;
> -}
> -
> -/*
> - * Store a page newly fetched from the server in an inode data storage object
> - * in the cache.
> - */
> -static inline void nfs_fscache_write_page(struct inode *inode,
> -					   struct page *page)
> -{
> -	if (netfs_inode(inode)->cache)
> -		__nfs_fscache_write_page(inode, page);
> -}
> -
>  static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
>  					      struct inode *inode)
>  {
> @@ -118,6 +137,14 @@ static inline const char *nfs_server_fscache_state(struct nfs_server *server)
>  }
>  
>  #else /* CONFIG_NFS_FSCACHE */
> +static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi) {}
> +static inline void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr) {}
> +static inline void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr) {}
> +static inline void nfs_netfs_read_completion(struct nfs_pgio_header *hdr) {}
> +static inline void nfs_netfs_readpage_release(struct nfs_page *req)
> +{
> +	unlock_page(req->wb_page);
> +}
>  static inline void nfs_fscache_release_super_cookie(struct super_block *sb) {}
>  
>  static inline void nfs_fscache_init_inode(struct inode *inode) {}
> @@ -125,16 +152,19 @@ static inline void nfs_fscache_clear_inode(struct inode *inode) {}
>  static inline void nfs_fscache_open_file(struct inode *inode,
>  					 struct file *filp) {}
>  static inline void nfs_fscache_release_file(struct inode *inode, struct file *file) {}
> -
> -static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> +static inline int nfs_netfs_readahead(struct readahead_control *ractl)
>  {
> -	return true; /* may release folio */
> +	return -ENOBUFS;
>  }
> -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> +static inline int nfs_netfs_read_folio(struct file *file, struct folio *folio)
>  {
>  	return -ENOBUFS;
>  }
> -static inline void nfs_fscache_write_page(struct inode *inode, struct page *page) {}
> +
> +static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> +{
> +	return true; /* may release folio */
> +}
>  static inline void nfs_fscache_invalidate(struct inode *inode, int flags) {}
>  
>  static inline const char *nfs_server_fscache_state(struct nfs_server *server)
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index aa2aec785ab5..b36a02b932e8 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -2249,6 +2249,8 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
>  #ifdef CONFIG_NFS_V4_2
>  	nfsi->xattr_cache = NULL;
>  #endif
> +	nfs_netfs_inode_init(nfsi);
> +
>  	return VFS_I(nfsi);
>  }
>  EXPORT_SYMBOL_GPL(nfs_alloc_inode);
> diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> index 273687082992..e5589036c1f8 100644
> --- a/fs/nfs/internal.h
> +++ b/fs/nfs/internal.h
> @@ -453,6 +453,10 @@ extern void nfs_sb_deactive(struct super_block *sb);
>  extern int nfs_client_for_each_server(struct nfs_client *clp,
>  				      int (*fn)(struct nfs_server *, void *),
>  				      void *data);
> +#ifdef CONFIG_NFS_FSCACHE
> +extern const struct netfs_request_ops nfs_netfs_ops;
> +#endif
> +
>  /* io.c */
>  extern void nfs_start_io_read(struct inode *inode);
>  extern void nfs_end_io_read(struct inode *inode);
> @@ -482,9 +486,14 @@ extern int nfs4_get_rootfh(struct nfs_server *server, struct nfs_fh *mntfh, bool
>  
>  struct nfs_pgio_completion_ops;
>  /* read.c */
> +extern const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
>  extern void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
>  			struct inode *inode, bool force_mds,
>  			const struct nfs_pgio_completion_ops *compl_ops);
> +extern int nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> +			       struct nfs_open_context *ctx,
> +			       struct page *page);
> +extern void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio);
>  extern void nfs_read_prepare(struct rpc_task *task, void *calldata);
>  extern void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio);
>  
> diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> index 317cedfa52bf..e28754476d1b 100644
> --- a/fs/nfs/pagelist.c
> +++ b/fs/nfs/pagelist.c
> @@ -25,6 +25,7 @@
>  #include "internal.h"
>  #include "pnfs.h"
>  #include "nfstrace.h"
> +#include "fscache.h"
>  
>  #define NFSDBG_FACILITY		NFSDBG_PAGECACHE
>  
> @@ -68,6 +69,10 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
>  	hdr->good_bytes = mirror->pg_count;
>  	hdr->io_completion = desc->pg_io_completion;
>  	hdr->dreq = desc->pg_dreq;
> +#ifdef CONFIG_NFS_FSCACHE
> +	if (desc->pg_netfs)
> +		hdr->netfs = desc->pg_netfs;
> +#endif
>  	hdr->release = release;
>  	hdr->completion_ops = desc->pg_completion_ops;
>  	if (hdr->completion_ops->init_hdr)
> @@ -846,6 +851,9 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
>  	desc->pg_lseg = NULL;
>  	desc->pg_io_completion = NULL;
>  	desc->pg_dreq = NULL;
> +#ifdef CONFIG_NFS_FSCACHE
> +	desc->pg_netfs = NULL;
> +#endif
>  	desc->pg_bsize = bsize;
>  
>  	desc->pg_mirror_count = 1;
> @@ -940,6 +948,7 @@ int nfs_generic_pgio(struct nfs_pageio_descriptor *desc,
>  	/* Set up the argument struct */
>  	nfs_pgio_rpcsetup(hdr, mirror->pg_count, desc->pg_ioflags, &cinfo);
>  	desc->pg_rpc_callops = &nfs_pgio_common_ops;
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(nfs_generic_pgio);
> @@ -1360,6 +1369,9 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,
>  
>  	desc->pg_io_completion = hdr->io_completion;
>  	desc->pg_dreq = hdr->dreq;
> +#ifdef CONFIG_NFS_FSCACHE
> +	desc->pg_netfs = hdr->netfs;
> +#endif
>  	list_splice_init(&hdr->pages, &pages);
>  	while (!list_empty(&pages)) {
>  		struct nfs_page *req = nfs_list_entry(pages.next);
> diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> index 525e82ea9a9e..c74c5fcba87d 100644
> --- a/fs/nfs/read.c
> +++ b/fs/nfs/read.c
> @@ -30,7 +30,7 @@
>  
>  #define NFSDBG_FACILITY		NFSDBG_PAGECACHE
>  
> -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
>  static const struct nfs_rw_ops nfs_rw_read_ops;
>  
>  static struct kmem_cache *nfs_rdata_cachep;
> @@ -74,7 +74,7 @@ void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
>  }
>  EXPORT_SYMBOL_GPL(nfs_pageio_init_read);
>  
> -static void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
> +void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
>  {
>  	struct nfs_pgio_mirror *pgm;
>  	unsigned long npages;
> @@ -110,20 +110,13 @@ EXPORT_SYMBOL_GPL(nfs_pageio_reset_read_mds);
>  
>  static void nfs_readpage_release(struct nfs_page *req, int error)
>  {
> -	struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
>  	struct page *page = req->wb_page;
>  
> -	dprintk("NFS: read done (%s/%llu %d@%lld)\n", inode->i_sb->s_id,
> -		(unsigned long long)NFS_FILEID(inode), req->wb_bytes,
> -		(long long)req_offset(req));
> -
>  	if (nfs_error_is_fatal_on_server(error) && error != -ETIMEDOUT)
>  		SetPageError(page);
> -	if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE)) {
> -		if (PageUptodate(page))
> -			nfs_fscache_write_page(inode, page);
> -		unlock_page(page);
> -	}
> +	if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE))
> +		nfs_netfs_readpage_release(req);
> +
>  	nfs_release_request(req);
>  }
>  
> @@ -177,6 +170,8 @@ static void nfs_read_completion(struct nfs_pgio_header *hdr)
>  		nfs_list_remove_request(req);
>  		nfs_readpage_release(req, error);
>  	}
> +	nfs_netfs_read_completion(hdr);
> +
>  out:
>  	hdr->release(hdr);
>  }
> @@ -187,6 +182,7 @@ static void nfs_initiate_read(struct nfs_pgio_header *hdr,
>  			      struct rpc_task_setup *task_setup_data, int how)
>  {
>  	rpc_ops->read_setup(hdr, msg);
> +	nfs_netfs_initiate_read(hdr);
>  	trace_nfs_initiate_read(hdr);
>  }
>  
> @@ -202,7 +198,7 @@ nfs_async_read_error(struct list_head *head, int error)
>  	}
>  }
>  
> -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
> +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
>  	.error_cleanup = nfs_async_read_error,
>  	.completion = nfs_read_completion,
>  };
> @@ -219,6 +215,7 @@ static int nfs_readpage_done(struct rpc_task *task,
>  	if (status != 0)
>  		return status;
>  
> +	nfs_netfs_readpage_done(hdr);
>  	nfs_add_stats(inode, NFSIOS_SERVERREADBYTES, hdr->res.count);
>  	trace_nfs_readpage_done(task, hdr);
>  
> @@ -294,12 +291,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
>  
>  	aligned_len = min_t(unsigned int, ALIGN(len, rsize), PAGE_SIZE);
>  
> -	if (!IS_SYNC(page->mapping->host)) {
> -		error = nfs_fscache_read_page(page->mapping->host, page);
> -		if (error == 0)
> -			goto out_unlock;
> -	}
> -
>  	new = nfs_create_request(ctx, page, 0, aligned_len);
>  	if (IS_ERR(new))
>  		goto out_error;
> @@ -315,8 +306,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
>  	return 0;
>  out_error:
>  	error = PTR_ERR(new);
> -out_unlock:
> -	unlock_page(page);
>  out:
>  	return error;
>  }
> @@ -355,6 +344,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
>  	if (NFS_STALE(inode))
>  		goto out_unlock;
>  
> +	ret = nfs_netfs_read_folio(file, folio);
> +	if (!ret)
> +		goto out;
> +
>  	if (file == NULL) {
>  		ret = -EBADF;
>  		ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> @@ -368,8 +361,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
>  			     &nfs_async_read_completion_ops);
>  
>  	ret = nfs_pageio_add_page(&pgio, ctx, page);
> -	if (ret)
> -		goto out;
> +	if (ret) {
> +		put_nfs_open_context(ctx);
> +		goto out_unlock;
> +	}
>  
>  	nfs_pageio_complete_read(&pgio);
>  	ret = pgio.pg_error < 0 ? pgio.pg_error : 0;
> @@ -378,12 +373,12 @@ int nfs_read_folio(struct file *file, struct folio *folio)
>  		if (!PageUptodate(page) && !ret)
>  			ret = xchg(&ctx->error, 0);
>  	}
> -out:
>  	put_nfs_open_context(ctx);
> -	trace_nfs_aop_readpage_done(inode, page, ret);
> -	return ret;
> +	goto out;
> +
>  out_unlock:
>  	unlock_page(page);
> +out:
>  	trace_nfs_aop_readpage_done(inode, page, ret);
>  	return ret;
>  }
> @@ -405,6 +400,10 @@ void nfs_readahead(struct readahead_control *ractl)
>  	if (NFS_STALE(inode))
>  		goto out;
>  
> +	ret = nfs_netfs_readahead(ractl);
> +	if (!ret)
> +		goto out;
> +
>  	if (file == NULL) {
>  		ret = -EBADF;
>  		ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
> index ba7e2e4b0926..8eeb16d9bacd 100644
> --- a/include/linux/nfs_page.h
> +++ b/include/linux/nfs_page.h
> @@ -101,6 +101,9 @@ struct nfs_pageio_descriptor {
>  	struct pnfs_layout_segment *pg_lseg;
>  	struct nfs_io_completion *pg_io_completion;
>  	struct nfs_direct_req	*pg_dreq;
> +#ifdef CONFIG_NFS_FSCACHE
> +	void			*pg_netfs;
> +#endif


Would it be possible to union this new field with pg_dreq? I don't think
they're ever both used in the same desc. There are some places that
check for pg_dreq == NULL that would need to be converted to use a new
flag or something, but that would allow us to avoid growing this struct.

>  	unsigned int		pg_bsize;	/* default bsize for mirrors */
>  
>  	u32			pg_mirror_count;
> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> index e86cf6642d21..e196ef595908 100644
> --- a/include/linux/nfs_xdr.h
> +++ b/include/linux/nfs_xdr.h
> @@ -1619,6 +1619,9 @@ struct nfs_pgio_header {
>  	const struct nfs_rw_ops	*rw_ops;
>  	struct nfs_io_completion *io_completion;
>  	struct nfs_direct_req	*dreq;
> +#ifdef CONFIG_NFS_FSCACHE
> +	void			*netfs;
> +#endif
>  

Maybe also here too?

>  	int			pnfs_error;
>  	int			error;		/* merge with pnfs_error */

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v6 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled
  2022-09-04 13:59   ` Jeff Layton
@ 2022-09-04 19:51     ` David Wysochanski
  2022-09-06 10:53       ` Jeff Layton
  0 siblings, 1 reply; 8+ messages in thread
From: David Wysochanski @ 2022-09-04 19:51 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Anna Schumaker, Trond Myklebust, David Howells, linux-nfs,
	linux-cachefs, Benjamin Maynard, Daire Byrne

On Sun, Sep 4, 2022 at 9:59 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Sun, 2022-09-04 at 05:05 -0400, Dave Wysochanski wrote:
> > Convert the NFS buffered read code paths to corresponding netfs APIs,
> > but only when fscache is configured and enabled.
> >
> > The netfs API defines struct netfs_request_ops which must be filled
> > in by the network filesystem.  For NFS, we only need to define 5 of
> > the functions, the main one being the issue_read() function.
> > The issue_read() function is called by the netfs layer when a read
> > cannot be fulfilled locally, and must be sent to the server (either
> > the cache is not active, or it is active but the data is not available).
> > Once the read from the server is complete, netfs requires a call to
> > netfs_subreq_terminated() which conveys either how many bytes were read
> > successfully, or an error.  Note that issue_read() is called with a
> > structure, netfs_io_subrequest, which defines the IO requested, and
> > contains a start and a length (both in bytes), and assumes the underlying
> > netfs will return either an error on the whole region, or the number
> > of bytes successfully read.
> >
> > The NFS IO path is page based and the main APIs are the pgio APIs defined
> > in pagelist.c.  For the pgio APIs, there is no way for the caller to
> > know how many RPCs will be sent and how the pages will be broken up
> > into underlying RPCs, each of which will have its own return code.
> > Thus, NFS needs some way to accommodate the netfs API requirement on
> > the single response to the whole request, while also minimizing
> > disruptive changes to the NFS pgio layer.  The approach taken with this
> > patch is to allocate a small structure for each nfs_netfs_issue_read() call
> > to keep the final error value or the number of bytes successfully read.
> > The refcount on the structure is used also as a marker for the last
> > RPC completion, updated inside nfs_netfs_initiate_read(), and
> > nfs_netfs_readpage_done(), when an nfs_pgio_header contains a valid pointer
> > to the data.  Then finally in nfs_read_completion(), call into
> > nfs_netfs_read_completion() to update the final error value and bytes
> > read, and check the refcount to determine whether this is the final
> > RPC completion.  If this is the last RPC, then in the final put on
> > the structure, call into netfs_subreq_terminated() with the final
> > error value or the number of bytes successfully transferred.
> >
> > Suggested-by: Jeff Layton <jlayton@kernel.org>
> > Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> > ---
> >  fs/nfs/fscache.c         | 241 ++++++++++++++++++++++++---------------
> >  fs/nfs/fscache.h         |  94 +++++++++------
> >  fs/nfs/inode.c           |   2 +
> >  fs/nfs/internal.h        |   9 ++
> >  fs/nfs/pagelist.c        |  12 ++
> >  fs/nfs/read.c            |  51 ++++-----
> >  include/linux/nfs_page.h |   3 +
> >  include/linux/nfs_xdr.h  |   3 +
> >  8 files changed, 263 insertions(+), 152 deletions(-)
> >
> > diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> > index a6fc1c8b6644..9b7df3d61c35 100644
> > --- a/fs/nfs/fscache.c
> > +++ b/fs/nfs/fscache.c
> > @@ -15,6 +15,9 @@
> >  #include <linux/seq_file.h>
> >  #include <linux/slab.h>
> >  #include <linux/iversion.h>
> > +#include <linux/xarray.h>
> > +#include <linux/fscache.h>
> > +#include <linux/netfs.h>
> >
> >  #include "internal.h"
> >  #include "iostat.h"
> > @@ -184,7 +187,7 @@ void nfs_fscache_init_inode(struct inode *inode)
> >   */
> >  void nfs_fscache_clear_inode(struct inode *inode)
> >  {
> > -     fscache_relinquish_cookie(netfs_i_cookie(&NFS_I(inode)->netfs), false);
> > +     fscache_relinquish_cookie(netfs_i_cookie(netfs_inode(inode)), false);
> >       netfs_inode(inode)->cache = NULL;
> >  }
> >
> > @@ -210,7 +213,7 @@ void nfs_fscache_clear_inode(struct inode *inode)
> >  void nfs_fscache_open_file(struct inode *inode, struct file *filp)
> >  {
> >       struct nfs_fscache_inode_auxdata auxdata;
> > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > +     struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
> >       bool open_for_write = inode_is_open_for_write(inode);
> >
> >       if (!fscache_cookie_valid(cookie))
> > @@ -228,119 +231,169 @@ EXPORT_SYMBOL_GPL(nfs_fscache_open_file);
> >  void nfs_fscache_release_file(struct inode *inode, struct file *filp)
> >  {
> >       struct nfs_fscache_inode_auxdata auxdata;
> > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > +     struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
> >       loff_t i_size = i_size_read(inode);
> >
> >       nfs_fscache_update_auxdata(&auxdata, inode);
> >       fscache_unuse_cookie(cookie, &auxdata, &i_size);
> >  }
> >
> > -/*
> > - * Fallback page reading interface.
> > - */
> > -static int fscache_fallback_read_page(struct inode *inode, struct page *page)
> > +int nfs_netfs_read_folio(struct file *file, struct folio *folio)
> >  {
> > -     struct netfs_cache_resources cres;
> > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > -     struct iov_iter iter;
> > -     struct bio_vec bvec[1];
> > -     int ret;
> > -
> > -     memset(&cres, 0, sizeof(cres));
> > -     bvec[0].bv_page         = page;
> > -     bvec[0].bv_offset       = 0;
> > -     bvec[0].bv_len          = PAGE_SIZE;
> > -     iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> > -
> > -     ret = fscache_begin_read_operation(&cres, cookie);
> > -     if (ret < 0)
> > -             return ret;
> > -
> > -     ret = fscache_read(&cres, page_offset(page), &iter, NETFS_READ_HOLE_FAIL,
> > -                        NULL, NULL);
> > -     fscache_end_operation(&cres);
> > -     return ret;
> > +     if (!netfs_inode(folio_inode(folio))->cache)
> > +             return -ENOBUFS;
> > +
> > +     return netfs_read_folio(file, folio);
> >  }
> >
> > -/*
> > - * Fallback page writing interface.
> > - */
> > -static int fscache_fallback_write_page(struct inode *inode, struct page *page,
> > -                                    bool no_space_allocated_yet)
> > +int nfs_netfs_readahead(struct readahead_control *ractl)
> >  {
> > -     struct netfs_cache_resources cres;
> > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > -     struct iov_iter iter;
> > -     struct bio_vec bvec[1];
> > -     loff_t start = page_offset(page);
> > -     size_t len = PAGE_SIZE;
> > -     int ret;
> > -
> > -     memset(&cres, 0, sizeof(cres));
> > -     bvec[0].bv_page         = page;
> > -     bvec[0].bv_offset       = 0;
> > -     bvec[0].bv_len          = PAGE_SIZE;
> > -     iov_iter_bvec(&iter, WRITE, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> > -
> > -     ret = fscache_begin_write_operation(&cres, cookie);
> > -     if (ret < 0)
> > -             return ret;
> > -
> > -     ret = cres.ops->prepare_write(&cres, &start, &len, i_size_read(inode),
> > -                                   no_space_allocated_yet);
> > -     if (ret == 0)
> > -             ret = fscache_write(&cres, page_offset(page), &iter, NULL, NULL);
> > -     fscache_end_operation(&cres);
> > -     return ret;
> > +     struct inode *inode = ractl->mapping->host;
> > +
> > +     if (!netfs_inode(inode)->cache)
> > +             return -ENOBUFS;
> > +
> > +     netfs_readahead(ractl);
> > +     return 0;
> >  }
> >
> > -/*
> > - * Retrieve a page from fscache
> > - */
> > -int __nfs_fscache_read_page(struct inode *inode, struct page *page)
> > +atomic_t nfs_netfs_debug_id;
> > +static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
> >  {
> > -     int ret;
> > +     rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
> > +     rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
> >
> > -     trace_nfs_fscache_read_page(inode, page);
> > -     if (PageChecked(page)) {
> > -             ClearPageChecked(page);
> > -             ret = 1;
> > -             goto out;
> > -     }
> > +     return 0;
> > +}
> >
> > -     ret = fscache_fallback_read_page(inode, page);
> > -     if (ret < 0) {
> > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_FAIL);
> > -             SetPageChecked(page);
> > -             goto out;
> > -     }
> > +static void nfs_netfs_free_request(struct netfs_io_request *rreq)
> > +{
> > +     put_nfs_open_context(rreq->netfs_priv);
> > +}
> >
> > -     /* Read completed synchronously */
> > -     nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_OK);
> > -     SetPageUptodate(page);
> > -     ret = 0;
> > -out:
> > -     trace_nfs_fscache_read_page_exit(inode, page, ret);
> > -     return ret;
> > +static inline int nfs_netfs_begin_cache_operation(struct netfs_io_request *rreq)
> > +{
> > +     return fscache_begin_read_operation(&rreq->cache_resources,
> > +                                         netfs_i_cookie(netfs_inode(rreq->inode)));
> >  }
> >
> > -/*
> > - * Store a newly fetched page in fscache.  We can be certain there's no page
> > - * stored in the cache as yet otherwise we would've read it from there.
> > - */
> > -void __nfs_fscache_write_page(struct inode *inode, struct page *page)
> > +static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
> >  {
> > -     int ret;
> > +     struct nfs_netfs_io_data *netfs;
> > +
> > +     netfs = kzalloc(sizeof(*netfs), GFP_KERNEL_ACCOUNT);
> > +     if (!netfs)
> > +             return NULL;
> > +     netfs->sreq = sreq;
> > +     refcount_set(&netfs->refcount, 1);
> > +     return netfs;
> > +}
> >
> > -     trace_nfs_fscache_write_page(inode, page);
> > +static bool nfs_netfs_clamp_length(struct netfs_io_subrequest *sreq)
> > +{
> > +     size_t  rsize = NFS_SB(sreq->rreq->inode->i_sb)->rsize;
> >
> > -     ret = fscache_fallback_write_page(inode, page, true);
> > +     sreq->len = min(sreq->len, rsize);
> > +     return true;
> > +}
> >
> > -     if (ret != 0) {
> > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_FAIL);
> > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_UNCACHED);
> > -     } else {
> > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_OK);
> > +static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
> > +{
> > +     struct nfs_pageio_descriptor pgio;
> > +     struct inode *inode = sreq->rreq->inode;
> > +     struct nfs_open_context *ctx = sreq->rreq->netfs_priv;
> > +     struct page *page;
> > +     int err;
> > +     pgoff_t start = (sreq->start + sreq->transferred) >> PAGE_SHIFT;
> > +     pgoff_t last = ((sreq->start + sreq->len -
> > +                      sreq->transferred - 1) >> PAGE_SHIFT);
> > +     XA_STATE(xas, &sreq->rreq->mapping->i_pages, start);
> > +
> > +     nfs_pageio_init_read(&pgio, inode, false,
> > +                          &nfs_async_read_completion_ops);
> > +
> > +     pgio.pg_netfs = nfs_netfs_alloc(sreq); /* used in completion */
> > +     if (!pgio.pg_netfs)
> > +             return netfs_subreq_terminated(sreq, -ENOMEM, false);
> > +
> > +     xas_lock(&xas);
> > +     xas_for_each(&xas, page, last) {
> > +             /* nfs_pageio_add_page() may schedule() due to pNFS layout and other RPCs  */
> > +             xas_pause(&xas);
> > +             xas_unlock(&xas);
> > +             err = nfs_pageio_add_page(&pgio, ctx, page);
> > +             if (err < 0)
> > +                     return netfs_subreq_terminated(sreq, err, false);
> > +             xas_lock(&xas);
> >       }
> > -     trace_nfs_fscache_write_page_exit(inode, page, ret);
> > +     xas_unlock(&xas);
> > +     nfs_pageio_complete_read(&pgio);
> > +     nfs_netfs_put(pgio.pg_netfs);
> >  }
> > +
> > +void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr)
> > +{
> > +     struct nfs_netfs_io_data        *netfs = hdr->netfs;
> > +
> > +     if (!netfs)
> > +             return;
> > +
> > +     nfs_netfs_get(netfs);
> > +}
> > +
> > +void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr)
> > +{
> > +     struct nfs_netfs_io_data        *netfs = hdr->netfs;
> > +
> > +     if (!netfs)
> > +             return;
> > +
> > +     if (hdr->res.op_status)
> > +             /*
> > +              * Retryable errors such as BAD_STATEID will be re-issued,
> > +              * so reduce refcount.
> > +              */
> > +             nfs_netfs_put(netfs);
> > +}
> > +
> > +void nfs_netfs_readpage_release(struct nfs_page *req)
> > +{
> > +     struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
> > +
> > +     /*
> > +      * If fscache is enabled, netfs will unlock pages.
> > +      */
> > +     if (netfs_inode(inode)->cache)
> > +             return;
> > +
> > +     unlock_page(req->wb_page);
> > +}
> > +
> > +void nfs_netfs_read_completion(struct nfs_pgio_header *hdr)
> > +{
> > +     struct nfs_netfs_io_data        *netfs = hdr->netfs;
> > +     struct netfs_io_subrequest      *sreq;
> > +
> > +     if (!netfs)
> > +             return;
> > +
> > +     sreq = netfs->sreq;
> > +     if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
> > +             __set_bit(NETFS_SREQ_CLEAR_TAIL, &sreq->flags);
> > +
> > +     if (hdr->error)
> > +             netfs->error = hdr->error;
> > +     else
> > +             atomic64_add(hdr->res.count, &netfs->transferred);
> > +
> > +     nfs_netfs_put(netfs);
> > +     hdr->netfs = NULL;
> > +}
> > +
> > +const struct netfs_request_ops nfs_netfs_ops = {
> > +     .init_request           = nfs_netfs_init_request,
> > +     .free_request           = nfs_netfs_free_request,
> > +     .begin_cache_operation  = nfs_netfs_begin_cache_operation,
> > +     .issue_read             = nfs_netfs_issue_read,
> > +     .clamp_length           = nfs_netfs_clamp_length
> > +};
> > diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
> > index 38614ed8f951..fb782b917235 100644
> > --- a/fs/nfs/fscache.h
> > +++ b/fs/nfs/fscache.h
> > @@ -34,6 +34,49 @@ struct nfs_fscache_inode_auxdata {
> >       u64     change_attr;
> >  };
> >
> > +struct nfs_netfs_io_data {
> > +     /*
> > +      * NFS may split a netfs_io_subrequest into multiple RPCs, each
> > +      * with their own read completion.  In netfs, we can only call
> > +      * netfs_subreq_terminated() once for each subrequest.  Use the
> > +      * refcount here to double as a marker of the last RPC completion,
> > +      * and only call netfs via netfs_subreq_terminated() once.
> > +      */
> > +     refcount_t                      refcount;
> > +     struct netfs_io_subrequest      *sreq;
> > +
> > +     /*
> > +      * Final disposition of the netfs_io_subrequest, sent in
> > +      * netfs_subreq_terminated()
> > +      */
> > +     atomic64_t      transferred;
> > +     int             error;
> > +};
> > +
> > +static inline void nfs_netfs_get(struct nfs_netfs_io_data *netfs)
> > +{
> > +     refcount_inc(&netfs->refcount);
> > +}
> > +
> > +static inline void nfs_netfs_put(struct nfs_netfs_io_data *netfs)
> > +{
> > +     /* Only the last RPC completion should call netfs_subreq_terminated() */
> > +     if (refcount_dec_and_test(&netfs->refcount)) {
> > +             netfs_subreq_terminated(netfs->sreq,
> > +                                     netfs->error ?: atomic64_read(&netfs->transferred),
> > +                                     false);
> > +             kfree(netfs);
> > +     }
> > +}
> > +static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi)
> > +{
> > +     netfs_inode_init(&nfsi->netfs, &nfs_netfs_ops);
> > +}
> > +extern void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr);
> > +extern void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr);
> > +extern void nfs_netfs_read_completion(struct nfs_pgio_header *hdr);
> > +extern void nfs_netfs_readpage_release(struct nfs_page *req);
> > +
> >  /*
> >   * fscache.c
> >   */
> > @@ -44,9 +87,8 @@ extern void nfs_fscache_init_inode(struct inode *);
> >  extern void nfs_fscache_clear_inode(struct inode *);
> >  extern void nfs_fscache_open_file(struct inode *, struct file *);
> >  extern void nfs_fscache_release_file(struct inode *, struct file *);
> > -
> > -extern int __nfs_fscache_read_page(struct inode *, struct page *);
> > -extern void __nfs_fscache_write_page(struct inode *, struct page *);
> > +extern int nfs_netfs_readahead(struct readahead_control *ractl);
> > +extern int nfs_netfs_read_folio(struct file *file, struct folio *folio);
> >
> >  static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> >  {
> > @@ -54,34 +96,11 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> >               if (current_is_kswapd() || !(gfp & __GFP_FS))
> >                       return false;
> >               folio_wait_fscache(folio);
> > -             fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> > -             nfs_inc_fscache_stats(folio->mapping->host,
> > -                                   NFSIOS_FSCACHE_PAGES_UNCACHED);
> >       }
> > +     fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> >       return true;
> >  }
> >
> > -/*
> > - * Retrieve a page from an inode data storage object.
> > - */
> > -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> > -{
> > -     if (netfs_inode(inode)->cache)
> > -             return __nfs_fscache_read_page(inode, page);
> > -     return -ENOBUFS;
> > -}
> > -
> > -/*
> > - * Store a page newly fetched from the server in an inode data storage object
> > - * in the cache.
> > - */
> > -static inline void nfs_fscache_write_page(struct inode *inode,
> > -                                        struct page *page)
> > -{
> > -     if (netfs_inode(inode)->cache)
> > -             __nfs_fscache_write_page(inode, page);
> > -}
> > -
> >  static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
> >                                             struct inode *inode)
> >  {
> > @@ -118,6 +137,14 @@ static inline const char *nfs_server_fscache_state(struct nfs_server *server)
> >  }
> >
> >  #else /* CONFIG_NFS_FSCACHE */
> > +static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi) {}
> > +static inline void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr) {}
> > +static inline void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr) {}
> > +static inline void nfs_netfs_read_completion(struct nfs_pgio_header *hdr) {}
> > +static inline void nfs_netfs_readpage_release(struct nfs_page *req)
> > +{
> > +     unlock_page(req->wb_page);
> > +}
> >  static inline void nfs_fscache_release_super_cookie(struct super_block *sb) {}
> >
> >  static inline void nfs_fscache_init_inode(struct inode *inode) {}
> > @@ -125,16 +152,19 @@ static inline void nfs_fscache_clear_inode(struct inode *inode) {}
> >  static inline void nfs_fscache_open_file(struct inode *inode,
> >                                        struct file *filp) {}
> >  static inline void nfs_fscache_release_file(struct inode *inode, struct file *file) {}
> > -
> > -static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > +static inline int nfs_netfs_readahead(struct readahead_control *ractl)
> >  {
> > -     return true; /* may release folio */
> > +     return -ENOBUFS;
> >  }
> > -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> > +static inline int nfs_netfs_read_folio(struct file *file, struct folio *folio)
> >  {
> >       return -ENOBUFS;
> >  }
> > -static inline void nfs_fscache_write_page(struct inode *inode, struct page *page) {}
> > +
> > +static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > +{
> > +     return true; /* may release folio */
> > +}
> >  static inline void nfs_fscache_invalidate(struct inode *inode, int flags) {}
> >
> >  static inline const char *nfs_server_fscache_state(struct nfs_server *server)
> > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> > index aa2aec785ab5..b36a02b932e8 100644
> > --- a/fs/nfs/inode.c
> > +++ b/fs/nfs/inode.c
> > @@ -2249,6 +2249,8 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
> >  #ifdef CONFIG_NFS_V4_2
> >       nfsi->xattr_cache = NULL;
> >  #endif
> > +     nfs_netfs_inode_init(nfsi);
> > +
> >       return VFS_I(nfsi);
> >  }
> >  EXPORT_SYMBOL_GPL(nfs_alloc_inode);
> > diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> > index 273687082992..e5589036c1f8 100644
> > --- a/fs/nfs/internal.h
> > +++ b/fs/nfs/internal.h
> > @@ -453,6 +453,10 @@ extern void nfs_sb_deactive(struct super_block *sb);
> >  extern int nfs_client_for_each_server(struct nfs_client *clp,
> >                                     int (*fn)(struct nfs_server *, void *),
> >                                     void *data);
> > +#ifdef CONFIG_NFS_FSCACHE
> > +extern const struct netfs_request_ops nfs_netfs_ops;
> > +#endif
> > +
> >  /* io.c */
> >  extern void nfs_start_io_read(struct inode *inode);
> >  extern void nfs_end_io_read(struct inode *inode);
> > @@ -482,9 +486,14 @@ extern int nfs4_get_rootfh(struct nfs_server *server, struct nfs_fh *mntfh, bool
> >
> >  struct nfs_pgio_completion_ops;
> >  /* read.c */
> > +extern const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> >  extern void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
> >                       struct inode *inode, bool force_mds,
> >                       const struct nfs_pgio_completion_ops *compl_ops);
> > +extern int nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> > +                            struct nfs_open_context *ctx,
> > +                            struct page *page);
> > +extern void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio);
> >  extern void nfs_read_prepare(struct rpc_task *task, void *calldata);
> >  extern void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio);
> >
> > diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> > index 317cedfa52bf..e28754476d1b 100644
> > --- a/fs/nfs/pagelist.c
> > +++ b/fs/nfs/pagelist.c
> > @@ -25,6 +25,7 @@
> >  #include "internal.h"
> >  #include "pnfs.h"
> >  #include "nfstrace.h"
> > +#include "fscache.h"
> >
> >  #define NFSDBG_FACILITY              NFSDBG_PAGECACHE
> >
> > @@ -68,6 +69,10 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
> >       hdr->good_bytes = mirror->pg_count;
> >       hdr->io_completion = desc->pg_io_completion;
> >       hdr->dreq = desc->pg_dreq;
> > +#ifdef CONFIG_NFS_FSCACHE
> > +     if (desc->pg_netfs)
> > +             hdr->netfs = desc->pg_netfs;
> > +#endif
> >       hdr->release = release;
> >       hdr->completion_ops = desc->pg_completion_ops;
> >       if (hdr->completion_ops->init_hdr)
> > @@ -846,6 +851,9 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
> >       desc->pg_lseg = NULL;
> >       desc->pg_io_completion = NULL;
> >       desc->pg_dreq = NULL;
> > +#ifdef CONFIG_NFS_FSCACHE
> > +     desc->pg_netfs = NULL;
> > +#endif
> >       desc->pg_bsize = bsize;
> >
> >       desc->pg_mirror_count = 1;
> > @@ -940,6 +948,7 @@ int nfs_generic_pgio(struct nfs_pageio_descriptor *desc,
> >       /* Set up the argument struct */
> >       nfs_pgio_rpcsetup(hdr, mirror->pg_count, desc->pg_ioflags, &cinfo);
> >       desc->pg_rpc_callops = &nfs_pgio_common_ops;
> > +
> >       return 0;
> >  }
> >  EXPORT_SYMBOL_GPL(nfs_generic_pgio);
> > @@ -1360,6 +1369,9 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,
> >
> >       desc->pg_io_completion = hdr->io_completion;
> >       desc->pg_dreq = hdr->dreq;
> > +#ifdef CONFIG_NFS_FSCACHE
> > +     desc->pg_netfs = hdr->netfs;
> > +#endif
> >       list_splice_init(&hdr->pages, &pages);
> >       while (!list_empty(&pages)) {
> >               struct nfs_page *req = nfs_list_entry(pages.next);
> > diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> > index 525e82ea9a9e..c74c5fcba87d 100644
> > --- a/fs/nfs/read.c
> > +++ b/fs/nfs/read.c
> > @@ -30,7 +30,7 @@
> >
> >  #define NFSDBG_FACILITY              NFSDBG_PAGECACHE
> >
> > -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> > +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> >  static const struct nfs_rw_ops nfs_rw_read_ops;
> >
> >  static struct kmem_cache *nfs_rdata_cachep;
> > @@ -74,7 +74,7 @@ void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
> >  }
> >  EXPORT_SYMBOL_GPL(nfs_pageio_init_read);
> >
> > -static void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
> > +void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
> >  {
> >       struct nfs_pgio_mirror *pgm;
> >       unsigned long npages;
> > @@ -110,20 +110,13 @@ EXPORT_SYMBOL_GPL(nfs_pageio_reset_read_mds);
> >
> >  static void nfs_readpage_release(struct nfs_page *req, int error)
> >  {
> > -     struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
> >       struct page *page = req->wb_page;
> >
> > -     dprintk("NFS: read done (%s/%llu %d@%lld)\n", inode->i_sb->s_id,
> > -             (unsigned long long)NFS_FILEID(inode), req->wb_bytes,
> > -             (long long)req_offset(req));
> > -
> >       if (nfs_error_is_fatal_on_server(error) && error != -ETIMEDOUT)
> >               SetPageError(page);
> > -     if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE)) {
> > -             if (PageUptodate(page))
> > -                     nfs_fscache_write_page(inode, page);
> > -             unlock_page(page);
> > -     }
> > +     if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE))
> > +             nfs_netfs_readpage_release(req);
> > +
> >       nfs_release_request(req);
> >  }
> >
> > @@ -177,6 +170,8 @@ static void nfs_read_completion(struct nfs_pgio_header *hdr)
> >               nfs_list_remove_request(req);
> >               nfs_readpage_release(req, error);
> >       }
> > +     nfs_netfs_read_completion(hdr);
> > +
> >  out:
> >       hdr->release(hdr);
> >  }
> > @@ -187,6 +182,7 @@ static void nfs_initiate_read(struct nfs_pgio_header *hdr,
> >                             struct rpc_task_setup *task_setup_data, int how)
> >  {
> >       rpc_ops->read_setup(hdr, msg);
> > +     nfs_netfs_initiate_read(hdr);
> >       trace_nfs_initiate_read(hdr);
> >  }
> >
> > @@ -202,7 +198,7 @@ nfs_async_read_error(struct list_head *head, int error)
> >       }
> >  }
> >
> > -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
> > +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
> >       .error_cleanup = nfs_async_read_error,
> >       .completion = nfs_read_completion,
> >  };
> > @@ -219,6 +215,7 @@ static int nfs_readpage_done(struct rpc_task *task,
> >       if (status != 0)
> >               return status;
> >
> > +     nfs_netfs_readpage_done(hdr);
> >       nfs_add_stats(inode, NFSIOS_SERVERREADBYTES, hdr->res.count);
> >       trace_nfs_readpage_done(task, hdr);
> >
> > @@ -294,12 +291,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> >
> >       aligned_len = min_t(unsigned int, ALIGN(len, rsize), PAGE_SIZE);
> >
> > -     if (!IS_SYNC(page->mapping->host)) {
> > -             error = nfs_fscache_read_page(page->mapping->host, page);
> > -             if (error == 0)
> > -                     goto out_unlock;
> > -     }
> > -
> >       new = nfs_create_request(ctx, page, 0, aligned_len);
> >       if (IS_ERR(new))
> >               goto out_error;
> > @@ -315,8 +306,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> >       return 0;
> >  out_error:
> >       error = PTR_ERR(new);
> > -out_unlock:
> > -     unlock_page(page);
> >  out:
> >       return error;
> >  }
> > @@ -355,6 +344,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> >       if (NFS_STALE(inode))
> >               goto out_unlock;
> >
> > +     ret = nfs_netfs_read_folio(file, folio);
> > +     if (!ret)
> > +             goto out;
> > +
> >       if (file == NULL) {
> >               ret = -EBADF;
> >               ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> > @@ -368,8 +361,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> >                            &nfs_async_read_completion_ops);
> >
> >       ret = nfs_pageio_add_page(&pgio, ctx, page);
> > -     if (ret)
> > -             goto out;
> > +     if (ret) {
> > +             put_nfs_open_context(ctx);
> > +             goto out_unlock;
> > +     }
> >
> >       nfs_pageio_complete_read(&pgio);
> >       ret = pgio.pg_error < 0 ? pgio.pg_error : 0;
> > @@ -378,12 +373,12 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> >               if (!PageUptodate(page) && !ret)
> >                       ret = xchg(&ctx->error, 0);
> >       }
> > -out:
> >       put_nfs_open_context(ctx);
> > -     trace_nfs_aop_readpage_done(inode, page, ret);
> > -     return ret;
> > +     goto out;
> > +
> >  out_unlock:
> >       unlock_page(page);
> > +out:
> >       trace_nfs_aop_readpage_done(inode, page, ret);
> >       return ret;
> >  }
> > @@ -405,6 +400,10 @@ void nfs_readahead(struct readahead_control *ractl)
> >       if (NFS_STALE(inode))
> >               goto out;
> >
> > +     ret = nfs_netfs_readahead(ractl);
> > +     if (!ret)
> > +             goto out;
> > +
> >       if (file == NULL) {
> >               ret = -EBADF;
> >               ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> > diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
> > index ba7e2e4b0926..8eeb16d9bacd 100644
> > --- a/include/linux/nfs_page.h
> > +++ b/include/linux/nfs_page.h
> > @@ -101,6 +101,9 @@ struct nfs_pageio_descriptor {
> >       struct pnfs_layout_segment *pg_lseg;
> >       struct nfs_io_completion *pg_io_completion;
> >       struct nfs_direct_req   *pg_dreq;
> > +#ifdef CONFIG_NFS_FSCACHE
> > +     void                    *pg_netfs;
> > +#endif
>
>
> Would it be possible to union this new field with pg_dreq? I don't think
> they're ever both used in the same desc. There are some places that
> check for pg_dreq == NULL that would need to be converted to use a new
> flag or something, but that would allow us to avoid growing this struct.
>

Yeah it's a good point, though I'm not sure how easy it is.
I was also thinking about whether to drop the #ifdefs in the two structures
since the netfs NULL pointers are benign if unused.

Do you mean something like the below - bit fields to indicate which
union member is valid?  There is an "io_flags" field inside
nfs_pageio_descriptor, but it is used for FLUSH_* flags, so a new
flag tracking union member validity doesn't seem to fit there.
Not sure if this helps, but another way to look at it: after this
change, a nfs_pageio_descriptor can be in one of 3 states:
- netfs: right now only fscache-enabled buffered READs are handled
- direct IO
- everything else (neither direct nor netfs)

It may be possible to use a union, but it looks a bit tricky with things
like pNFS resends (see pnfs.c and the callers of nfs_pageio_init_read
and nfs_pageio_init_write).  We might need to update the init functions
for both nfs_pageio_descriptor and nfs_pgio_header, which looks a bit
messy.  It might be worth it, but I'm not sure - it may take some time
to work out safely.

--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -100,10 +100,10 @@ struct nfs_pageio_descriptor {
        const struct nfs_pgio_completion_ops *pg_completion_ops;
        struct pnfs_layout_segment *pg_lseg;
        struct nfs_io_completion *pg_io_completion;
-       struct nfs_direct_req   *pg_dreq;
-#ifdef CONFIG_NFS_FSCACHE
-       void                    *pg_netfs;
-#endif
+       union {
+               struct nfs_direct_req   *pg_dreq;  /* pg_is_direct */
+               void                    *pg_netfs; /* pg_is_netfs */
+       };
        unsigned int            pg_bsize;       /* default bsize for mirrors */

        u32                     pg_mirror_count;
@@ -113,6 +113,8 @@ struct nfs_pageio_descriptor {
        u32                     pg_mirror_idx;  /* current mirror */
        unsigned short          pg_maxretrans;
        unsigned char           pg_moreio : 1;
+       unsigned char           pg_is_direct: 1;
+       unsigned char           pg_is_netfs: 1;
 };





> >       unsigned int            pg_bsize;       /* default bsize for mirrors */
> >
> >       u32                     pg_mirror_count;
> > diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> > index e86cf6642d21..e196ef595908 100644
> > --- a/include/linux/nfs_xdr.h
> > +++ b/include/linux/nfs_xdr.h
> > @@ -1619,6 +1619,9 @@ struct nfs_pgio_header {
> >       const struct nfs_rw_ops *rw_ops;
> >       struct nfs_io_completion *io_completion;
> >       struct nfs_direct_req   *dreq;
> > +#ifdef CONFIG_NFS_FSCACHE
> > +     void                    *netfs;
> > +#endif
> >
>
> Maybe also here too?
>
> >       int                     pnfs_error;
> >       int                     error;          /* merge with pnfs_error */
>
> --
> Jeff Layton <jlayton@kernel.org>
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v6 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled
  2022-09-04 19:51     ` David Wysochanski
@ 2022-09-06 10:53       ` Jeff Layton
  2022-09-13 20:11         ` David Wysochanski
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Layton @ 2022-09-06 10:53 UTC (permalink / raw)
  To: David Wysochanski
  Cc: Anna Schumaker, Trond Myklebust, David Howells, linux-nfs,
	linux-cachefs, Benjamin Maynard, Daire Byrne

On Sun, 2022-09-04 at 15:51 -0400, David Wysochanski wrote:
> On Sun, Sep 4, 2022 at 9:59 AM Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > On Sun, 2022-09-04 at 05:05 -0400, Dave Wysochanski wrote:
> > > Convert the NFS buffered read code paths to corresponding netfs APIs,
> > > but only when fscache is configured and enabled.
> > > 
> > > The netfs API defines struct netfs_request_ops which must be filled
> > > in by the network filesystem.  For NFS, we only need to define 5 of
> > > the functions, the main one being the issue_read() function.
> > > The issue_read() function is called by the netfs layer when a read
> > > cannot be fulfilled locally, and must be sent to the server (either
> > > the cache is not active, or it is active but the data is not available).
> > > Once the read from the server is complete, netfs requires a call to
> > > netfs_subreq_terminated() which conveys either how many bytes were read
> > > successfully, or an error.  Note that issue_read() is called with a
> > > structure, netfs_io_subrequest, which defines the IO requested, and
> > > contains a start and a length (both in bytes), and assumes the underlying
> > > netfs will return either an error on the whole region, or the number
> > > of bytes successfully read.
> > > 
> > > The NFS IO path is page based and the main APIs are the pgio APIs defined
> > > in pagelist.c.  For the pgio APIs, there is no way for the caller to
> > > know how many RPCs will be sent and how the pages will be broken up
> > > into underlying RPCs, each of which will have their own return code.
> > > Thus, NFS needs some way to accommodate the netfs API requirement of
> > > a single response to the whole request, while also minimizing
> > > disruptive changes to the NFS pgio layer.  The approach taken with this
> > > patch is to allocate a small structure for each nfs_netfs_issue_read() call
> > > to keep the final error value or the number of bytes successfully read.
> > > The refcount on the structure is also used as a marker for the last
> > > RPC completion, updated inside nfs_netfs_initiate_read() and
> > > nfs_netfs_readpage_done(), when an nfs_pgio_header contains a valid pointer
> > > to the data.  Then finally in nfs_read_completion(), call into
> > > nfs_netfs_read_completion() to update the final error value and bytes
> > > read, and check the refcount to determine whether this is the final
> > > RPC completion.  If this is the last RPC, then in the final put on
> > > the structure, call into netfs_subreq_terminated() with the final
> > > error value or the number of bytes successfully transferred.
> > > 
> > > Suggested-by: Jeff Layton <jlayton@kernel.org>
> > > Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> > > ---
> > >  fs/nfs/fscache.c         | 241 ++++++++++++++++++++++++---------------
> > >  fs/nfs/fscache.h         |  94 +++++++++------
> > >  fs/nfs/inode.c           |   2 +
> > >  fs/nfs/internal.h        |   9 ++
> > >  fs/nfs/pagelist.c        |  12 ++
> > >  fs/nfs/read.c            |  51 ++++-----
> > >  include/linux/nfs_page.h |   3 +
> > >  include/linux/nfs_xdr.h  |   3 +
> > >  8 files changed, 263 insertions(+), 152 deletions(-)
> > > 
> > > diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> > > index a6fc1c8b6644..9b7df3d61c35 100644
> > > --- a/fs/nfs/fscache.c
> > > +++ b/fs/nfs/fscache.c
> > > @@ -15,6 +15,9 @@
> > >  #include <linux/seq_file.h>
> > >  #include <linux/slab.h>
> > >  #include <linux/iversion.h>
> > > +#include <linux/xarray.h>
> > > +#include <linux/fscache.h>
> > > +#include <linux/netfs.h>
> > > 
> > >  #include "internal.h"
> > >  #include "iostat.h"
> > > @@ -184,7 +187,7 @@ void nfs_fscache_init_inode(struct inode *inode)
> > >   */
> > >  void nfs_fscache_clear_inode(struct inode *inode)
> > >  {
> > > -     fscache_relinquish_cookie(netfs_i_cookie(&NFS_I(inode)->netfs), false);
> > > +     fscache_relinquish_cookie(netfs_i_cookie(netfs_inode(inode)), false);
> > >       netfs_inode(inode)->cache = NULL;
> > >  }
> > > 
> > > @@ -210,7 +213,7 @@ void nfs_fscache_clear_inode(struct inode *inode)
> > >  void nfs_fscache_open_file(struct inode *inode, struct file *filp)
> > >  {
> > >       struct nfs_fscache_inode_auxdata auxdata;
> > > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > > +     struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
> > >       bool open_for_write = inode_is_open_for_write(inode);
> > > 
> > >       if (!fscache_cookie_valid(cookie))
> > > @@ -228,119 +231,169 @@ EXPORT_SYMBOL_GPL(nfs_fscache_open_file);
> > >  void nfs_fscache_release_file(struct inode *inode, struct file *filp)
> > >  {
> > >       struct nfs_fscache_inode_auxdata auxdata;
> > > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > > +     struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
> > >       loff_t i_size = i_size_read(inode);
> > > 
> > >       nfs_fscache_update_auxdata(&auxdata, inode);
> > >       fscache_unuse_cookie(cookie, &auxdata, &i_size);
> > >  }
> > > 
> > > -/*
> > > - * Fallback page reading interface.
> > > - */
> > > -static int fscache_fallback_read_page(struct inode *inode, struct page *page)
> > > +int nfs_netfs_read_folio(struct file *file, struct folio *folio)
> > >  {
> > > -     struct netfs_cache_resources cres;
> > > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > > -     struct iov_iter iter;
> > > -     struct bio_vec bvec[1];
> > > -     int ret;
> > > -
> > > -     memset(&cres, 0, sizeof(cres));
> > > -     bvec[0].bv_page         = page;
> > > -     bvec[0].bv_offset       = 0;
> > > -     bvec[0].bv_len          = PAGE_SIZE;
> > > -     iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> > > -
> > > -     ret = fscache_begin_read_operation(&cres, cookie);
> > > -     if (ret < 0)
> > > -             return ret;
> > > -
> > > -     ret = fscache_read(&cres, page_offset(page), &iter, NETFS_READ_HOLE_FAIL,
> > > -                        NULL, NULL);
> > > -     fscache_end_operation(&cres);
> > > -     return ret;
> > > +     if (!netfs_inode(folio_inode(folio))->cache)
> > > +             return -ENOBUFS;
> > > +
> > > +     return netfs_read_folio(file, folio);
> > >  }
> > > 
> > > -/*
> > > - * Fallback page writing interface.
> > > - */
> > > -static int fscache_fallback_write_page(struct inode *inode, struct page *page,
> > > -                                    bool no_space_allocated_yet)
> > > +int nfs_netfs_readahead(struct readahead_control *ractl)
> > >  {
> > > -     struct netfs_cache_resources cres;
> > > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > > -     struct iov_iter iter;
> > > -     struct bio_vec bvec[1];
> > > -     loff_t start = page_offset(page);
> > > -     size_t len = PAGE_SIZE;
> > > -     int ret;
> > > -
> > > -     memset(&cres, 0, sizeof(cres));
> > > -     bvec[0].bv_page         = page;
> > > -     bvec[0].bv_offset       = 0;
> > > -     bvec[0].bv_len          = PAGE_SIZE;
> > > -     iov_iter_bvec(&iter, WRITE, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> > > -
> > > -     ret = fscache_begin_write_operation(&cres, cookie);
> > > -     if (ret < 0)
> > > -             return ret;
> > > -
> > > -     ret = cres.ops->prepare_write(&cres, &start, &len, i_size_read(inode),
> > > -                                   no_space_allocated_yet);
> > > -     if (ret == 0)
> > > -             ret = fscache_write(&cres, page_offset(page), &iter, NULL, NULL);
> > > -     fscache_end_operation(&cres);
> > > -     return ret;
> > > +     struct inode *inode = ractl->mapping->host;
> > > +
> > > +     if (!netfs_inode(inode)->cache)
> > > +             return -ENOBUFS;
> > > +
> > > +     netfs_readahead(ractl);
> > > +     return 0;
> > >  }
> > > 
> > > -/*
> > > - * Retrieve a page from fscache
> > > - */
> > > -int __nfs_fscache_read_page(struct inode *inode, struct page *page)
> > > +atomic_t nfs_netfs_debug_id;
> > > +static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
> > >  {
> > > -     int ret;
> > > +     rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
> > > +     rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
> > > 
> > > -     trace_nfs_fscache_read_page(inode, page);
> > > -     if (PageChecked(page)) {
> > > -             ClearPageChecked(page);
> > > -             ret = 1;
> > > -             goto out;
> > > -     }
> > > +     return 0;
> > > +}
> > > 
> > > -     ret = fscache_fallback_read_page(inode, page);
> > > -     if (ret < 0) {
> > > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_FAIL);
> > > -             SetPageChecked(page);
> > > -             goto out;
> > > -     }
> > > +static void nfs_netfs_free_request(struct netfs_io_request *rreq)
> > > +{
> > > +     put_nfs_open_context(rreq->netfs_priv);
> > > +}
> > > 
> > > -     /* Read completed synchronously */
> > > -     nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_OK);
> > > -     SetPageUptodate(page);
> > > -     ret = 0;
> > > -out:
> > > -     trace_nfs_fscache_read_page_exit(inode, page, ret);
> > > -     return ret;
> > > +static inline int nfs_netfs_begin_cache_operation(struct netfs_io_request *rreq)
> > > +{
> > > +     return fscache_begin_read_operation(&rreq->cache_resources,
> > > +                                         netfs_i_cookie(netfs_inode(rreq->inode)));
> > >  }
> > > 
> > > -/*
> > > - * Store a newly fetched page in fscache.  We can be certain there's no page
> > > - * stored in the cache as yet otherwise we would've read it from there.
> > > - */
> > > -void __nfs_fscache_write_page(struct inode *inode, struct page *page)
> > > +static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
> > >  {
> > > -     int ret;
> > > +     struct nfs_netfs_io_data *netfs;
> > > +
> > > +     netfs = kzalloc(sizeof(*netfs), GFP_KERNEL_ACCOUNT);
> > > +     if (!netfs)
> > > +             return NULL;
> > > +     netfs->sreq = sreq;
> > > +     refcount_set(&netfs->refcount, 1);
> > > +     return netfs;
> > > +}
> > > 
> > > -     trace_nfs_fscache_write_page(inode, page);
> > > +static bool nfs_netfs_clamp_length(struct netfs_io_subrequest *sreq)
> > > +{
> > > +     size_t  rsize = NFS_SB(sreq->rreq->inode->i_sb)->rsize;
> > > 
> > > -     ret = fscache_fallback_write_page(inode, page, true);
> > > +     sreq->len = min(sreq->len, rsize);
> > > +     return true;
> > > +}
> > > 
> > > -     if (ret != 0) {
> > > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_FAIL);
> > > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_UNCACHED);
> > > -     } else {
> > > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_OK);
> > > +static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
> > > +{
> > > +     struct nfs_pageio_descriptor pgio;
> > > +     struct inode *inode = sreq->rreq->inode;
> > > +     struct nfs_open_context *ctx = sreq->rreq->netfs_priv;
> > > +     struct page *page;
> > > +     int err;
> > > +     pgoff_t start = (sreq->start + sreq->transferred) >> PAGE_SHIFT;
> > > +     pgoff_t last = ((sreq->start + sreq->len -
> > > +                      sreq->transferred - 1) >> PAGE_SHIFT);
> > > +     XA_STATE(xas, &sreq->rreq->mapping->i_pages, start);
> > > +
> > > +     nfs_pageio_init_read(&pgio, inode, false,
> > > +                          &nfs_async_read_completion_ops);
> > > +
> > > +     pgio.pg_netfs = nfs_netfs_alloc(sreq); /* used in completion */
> > > +     if (!pgio.pg_netfs)
> > > +             return netfs_subreq_terminated(sreq, -ENOMEM, false);
> > > +
> > > +     xas_lock(&xas);
> > > +     xas_for_each(&xas, page, last) {
> > > +             /* nfs_pageio_add_page() may schedule() due to pNFS layout and other RPCs  */
> > > +             xas_pause(&xas);
> > > +             xas_unlock(&xas);
> > > +             err = nfs_pageio_add_page(&pgio, ctx, page);
> > > +             if (err < 0)
> > > +                     return netfs_subreq_terminated(sreq, err, false);
> > > +             xas_lock(&xas);
> > >       }
> > > -     trace_nfs_fscache_write_page_exit(inode, page, ret);
> > > +     xas_unlock(&xas);
> > > +     nfs_pageio_complete_read(&pgio);
> > > +     nfs_netfs_put(pgio.pg_netfs);
> > >  }
> > > +
> > > +void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr)
> > > +{
> > > +     struct nfs_netfs_io_data        *netfs = hdr->netfs;
> > > +
> > > +     if (!netfs)
> > > +             return;
> > > +
> > > +     nfs_netfs_get(netfs);
> > > +}
> > > +
> > > +void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr)
> > > +{
> > > +     struct nfs_netfs_io_data        *netfs = hdr->netfs;
> > > +
> > > +     if (!netfs)
> > > +             return;
> > > +
> > > +     if (hdr->res.op_status)
> > > +             /*
> > > +              * Retryable errors such as BAD_STATEID will be re-issued,
> > > +              * so reduce refcount.
> > > +              */
> > > +             nfs_netfs_put(netfs);
> > > +}
> > > +
> > > +void nfs_netfs_readpage_release(struct nfs_page *req)
> > > +{
> > > +     struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
> > > +
> > > +     /*
> > > +      * If fscache is enabled, netfs will unlock pages.
> > > +      */
> > > +     if (netfs_inode(inode)->cache)
> > > +             return;
> > > +
> > > +     unlock_page(req->wb_page);
> > > +}
> > > +
> > > +void nfs_netfs_read_completion(struct nfs_pgio_header *hdr)
> > > +{
> > > +     struct nfs_netfs_io_data        *netfs = hdr->netfs;
> > > +     struct netfs_io_subrequest      *sreq;
> > > +
> > > +     if (!netfs)
> > > +             return;
> > > +
> > > +     sreq = netfs->sreq;
> > > +     if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
> > > +             __set_bit(NETFS_SREQ_CLEAR_TAIL, &sreq->flags);
> > > +
> > > +     if (hdr->error)
> > > +             netfs->error = hdr->error;
> > > +     else
> > > +             atomic64_add(hdr->res.count, &netfs->transferred);
> > > +
> > > +     nfs_netfs_put(netfs);
> > > +     hdr->netfs = NULL;
> > > +}
> > > +
> > > +const struct netfs_request_ops nfs_netfs_ops = {
> > > +     .init_request           = nfs_netfs_init_request,
> > > +     .free_request           = nfs_netfs_free_request,
> > > +     .begin_cache_operation  = nfs_netfs_begin_cache_operation,
> > > +     .issue_read             = nfs_netfs_issue_read,
> > > +     .clamp_length           = nfs_netfs_clamp_length
> > > +};
> > > diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
> > > index 38614ed8f951..fb782b917235 100644
> > > --- a/fs/nfs/fscache.h
> > > +++ b/fs/nfs/fscache.h
> > > @@ -34,6 +34,49 @@ struct nfs_fscache_inode_auxdata {
> > >       u64     change_attr;
> > >  };
> > > 
> > > +struct nfs_netfs_io_data {
> > > +     /*
> > > +      * NFS may split a netfs_io_subrequest into multiple RPCs, each
> > > +      * with their own read completion.  In netfs, we can only call
> > > +      * netfs_subreq_terminated() once for each subrequest.  Use the
> > > +      * refcount here to double as a marker of the last RPC completion,
> > > +      * and only call netfs via netfs_subreq_terminated() once.
> > > +      */
> > > +     refcount_t                      refcount;
> > > +     struct netfs_io_subrequest      *sreq;
> > > +
> > > +     /*
> > > +      * Final disposition of the netfs_io_subrequest, sent in
> > > +      * netfs_subreq_terminated()
> > > +      */
> > > +     atomic64_t      transferred;
> > > +     int             error;
> > > +};
> > > +
> > > +static inline void nfs_netfs_get(struct nfs_netfs_io_data *netfs)
> > > +{
> > > +     refcount_inc(&netfs->refcount);
> > > +}
> > > +
> > > +static inline void nfs_netfs_put(struct nfs_netfs_io_data *netfs)
> > > +{
> > > +     /* Only the last RPC completion should call netfs_subreq_terminated() */
> > > +     if (refcount_dec_and_test(&netfs->refcount)) {
> > > +             netfs_subreq_terminated(netfs->sreq,
> > > +                                     netfs->error ?: atomic64_read(&netfs->transferred),
> > > +                                     false);
> > > +             kfree(netfs);
> > > +     }
> > > +}
> > > +static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi)
> > > +{
> > > +     netfs_inode_init(&nfsi->netfs, &nfs_netfs_ops);
> > > +}
> > > +extern void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr);
> > > +extern void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr);
> > > +extern void nfs_netfs_read_completion(struct nfs_pgio_header *hdr);
> > > +extern void nfs_netfs_readpage_release(struct nfs_page *req);
> > > +
> > >  /*
> > >   * fscache.c
> > >   */
> > > @@ -44,9 +87,8 @@ extern void nfs_fscache_init_inode(struct inode *);
> > >  extern void nfs_fscache_clear_inode(struct inode *);
> > >  extern void nfs_fscache_open_file(struct inode *, struct file *);
> > >  extern void nfs_fscache_release_file(struct inode *, struct file *);
> > > -
> > > -extern int __nfs_fscache_read_page(struct inode *, struct page *);
> > > -extern void __nfs_fscache_write_page(struct inode *, struct page *);
> > > +extern int nfs_netfs_readahead(struct readahead_control *ractl);
> > > +extern int nfs_netfs_read_folio(struct file *file, struct folio *folio);
> > > 
> > >  static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > >  {
> > > @@ -54,34 +96,11 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > >               if (current_is_kswapd() || !(gfp & __GFP_FS))
> > >                       return false;
> > >               folio_wait_fscache(folio);
> > > -             fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> > > -             nfs_inc_fscache_stats(folio->mapping->host,
> > > -                                   NFSIOS_FSCACHE_PAGES_UNCACHED);
> > >       }
> > > +     fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> > >       return true;
> > >  }
> > > 
> > > -/*
> > > - * Retrieve a page from an inode data storage object.
> > > - */
> > > -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> > > -{
> > > -     if (netfs_inode(inode)->cache)
> > > -             return __nfs_fscache_read_page(inode, page);
> > > -     return -ENOBUFS;
> > > -}
> > > -
> > > -/*
> > > - * Store a page newly fetched from the server in an inode data storage object
> > > - * in the cache.
> > > - */
> > > -static inline void nfs_fscache_write_page(struct inode *inode,
> > > -                                        struct page *page)
> > > -{
> > > -     if (netfs_inode(inode)->cache)
> > > -             __nfs_fscache_write_page(inode, page);
> > > -}
> > > -
> > >  static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
> > >                                             struct inode *inode)
> > >  {
> > > @@ -118,6 +137,14 @@ static inline const char *nfs_server_fscache_state(struct nfs_server *server)
> > >  }
> > > 
> > >  #else /* CONFIG_NFS_FSCACHE */
> > > +static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi) {}
> > > +static inline void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr) {}
> > > +static inline void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr) {}
> > > +static inline void nfs_netfs_read_completion(struct nfs_pgio_header *hdr) {}
> > > +static inline void nfs_netfs_readpage_release(struct nfs_page *req)
> > > +{
> > > +     unlock_page(req->wb_page);
> > > +}
> > >  static inline void nfs_fscache_release_super_cookie(struct super_block *sb) {}
> > > 
> > >  static inline void nfs_fscache_init_inode(struct inode *inode) {}
> > > @@ -125,16 +152,19 @@ static inline void nfs_fscache_clear_inode(struct inode *inode) {}
> > >  static inline void nfs_fscache_open_file(struct inode *inode,
> > >                                        struct file *filp) {}
> > >  static inline void nfs_fscache_release_file(struct inode *inode, struct file *file) {}
> > > -
> > > -static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > > +static inline int nfs_netfs_readahead(struct readahead_control *ractl)
> > >  {
> > > -     return true; /* may release folio */
> > > +     return -ENOBUFS;
> > >  }
> > > -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> > > +static inline int nfs_netfs_read_folio(struct file *file, struct folio *folio)
> > >  {
> > >       return -ENOBUFS;
> > >  }
> > > -static inline void nfs_fscache_write_page(struct inode *inode, struct page *page) {}
> > > +
> > > +static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > > +{
> > > +     return true; /* may release folio */
> > > +}
> > >  static inline void nfs_fscache_invalidate(struct inode *inode, int flags) {}
> > > 
> > >  static inline const char *nfs_server_fscache_state(struct nfs_server *server)
> > > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> > > index aa2aec785ab5..b36a02b932e8 100644
> > > --- a/fs/nfs/inode.c
> > > +++ b/fs/nfs/inode.c
> > > @@ -2249,6 +2249,8 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
> > >  #ifdef CONFIG_NFS_V4_2
> > >       nfsi->xattr_cache = NULL;
> > >  #endif
> > > +     nfs_netfs_inode_init(nfsi);
> > > +
> > >       return VFS_I(nfsi);
> > >  }
> > >  EXPORT_SYMBOL_GPL(nfs_alloc_inode);
> > > diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> > > index 273687082992..e5589036c1f8 100644
> > > --- a/fs/nfs/internal.h
> > > +++ b/fs/nfs/internal.h
> > > @@ -453,6 +453,10 @@ extern void nfs_sb_deactive(struct super_block *sb);
> > >  extern int nfs_client_for_each_server(struct nfs_client *clp,
> > >                                     int (*fn)(struct nfs_server *, void *),
> > >                                     void *data);
> > > +#ifdef CONFIG_NFS_FSCACHE
> > > +extern const struct netfs_request_ops nfs_netfs_ops;
> > > +#endif
> > > +
> > >  /* io.c */
> > >  extern void nfs_start_io_read(struct inode *inode);
> > >  extern void nfs_end_io_read(struct inode *inode);
> > > @@ -482,9 +486,14 @@ extern int nfs4_get_rootfh(struct nfs_server *server, struct nfs_fh *mntfh, bool
> > > 
> > >  struct nfs_pgio_completion_ops;
> > >  /* read.c */
> > > +extern const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> > >  extern void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
> > >                       struct inode *inode, bool force_mds,
> > >                       const struct nfs_pgio_completion_ops *compl_ops);
> > > +extern int nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> > > +                            struct nfs_open_context *ctx,
> > > +                            struct page *page);
> > > +extern void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio);
> > >  extern void nfs_read_prepare(struct rpc_task *task, void *calldata);
> > >  extern void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio);
> > > 
> > > diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> > > index 317cedfa52bf..e28754476d1b 100644
> > > --- a/fs/nfs/pagelist.c
> > > +++ b/fs/nfs/pagelist.c
> > > @@ -25,6 +25,7 @@
> > >  #include "internal.h"
> > >  #include "pnfs.h"
> > >  #include "nfstrace.h"
> > > +#include "fscache.h"
> > > 
> > >  #define NFSDBG_FACILITY              NFSDBG_PAGECACHE
> > > 
> > > @@ -68,6 +69,10 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
> > >       hdr->good_bytes = mirror->pg_count;
> > >       hdr->io_completion = desc->pg_io_completion;
> > >       hdr->dreq = desc->pg_dreq;
> > > +#ifdef CONFIG_NFS_FSCACHE
> > > +     if (desc->pg_netfs)
> > > +             hdr->netfs = desc->pg_netfs;
> > > +#endif
> > >       hdr->release = release;
> > >       hdr->completion_ops = desc->pg_completion_ops;
> > >       if (hdr->completion_ops->init_hdr)
> > > @@ -846,6 +851,9 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
> > >       desc->pg_lseg = NULL;
> > >       desc->pg_io_completion = NULL;
> > >       desc->pg_dreq = NULL;
> > > +#ifdef CONFIG_NFS_FSCACHE
> > > +     desc->pg_netfs = NULL;
> > > +#endif
> > >       desc->pg_bsize = bsize;
> > > 
> > >       desc->pg_mirror_count = 1;
> > > @@ -940,6 +948,7 @@ int nfs_generic_pgio(struct nfs_pageio_descriptor *desc,
> > >       /* Set up the argument struct */
> > >       nfs_pgio_rpcsetup(hdr, mirror->pg_count, desc->pg_ioflags, &cinfo);
> > >       desc->pg_rpc_callops = &nfs_pgio_common_ops;
> > > +
> > >       return 0;
> > >  }
> > >  EXPORT_SYMBOL_GPL(nfs_generic_pgio);
> > > @@ -1360,6 +1369,9 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,
> > > 
> > >       desc->pg_io_completion = hdr->io_completion;
> > >       desc->pg_dreq = hdr->dreq;
> > > +#ifdef CONFIG_NFS_FSCACHE
> > > +     desc->pg_netfs = hdr->netfs;
> > > +#endif
> > >       list_splice_init(&hdr->pages, &pages);
> > >       while (!list_empty(&pages)) {
> > >               struct nfs_page *req = nfs_list_entry(pages.next);
> > > diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> > > index 525e82ea9a9e..c74c5fcba87d 100644
> > > --- a/fs/nfs/read.c
> > > +++ b/fs/nfs/read.c
> > > @@ -30,7 +30,7 @@
> > > 
> > >  #define NFSDBG_FACILITY              NFSDBG_PAGECACHE
> > > 
> > > -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> > > +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> > >  static const struct nfs_rw_ops nfs_rw_read_ops;
> > > 
> > >  static struct kmem_cache *nfs_rdata_cachep;
> > > @@ -74,7 +74,7 @@ void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
> > >  }
> > >  EXPORT_SYMBOL_GPL(nfs_pageio_init_read);
> > > 
> > > -static void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
> > > +void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
> > >  {
> > >       struct nfs_pgio_mirror *pgm;
> > >       unsigned long npages;
> > > @@ -110,20 +110,13 @@ EXPORT_SYMBOL_GPL(nfs_pageio_reset_read_mds);
> > > 
> > >  static void nfs_readpage_release(struct nfs_page *req, int error)
> > >  {
> > > -     struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
> > >       struct page *page = req->wb_page;
> > > 
> > > -     dprintk("NFS: read done (%s/%llu %d@%lld)\n", inode->i_sb->s_id,
> > > -             (unsigned long long)NFS_FILEID(inode), req->wb_bytes,
> > > -             (long long)req_offset(req));
> > > -
> > >       if (nfs_error_is_fatal_on_server(error) && error != -ETIMEDOUT)
> > >               SetPageError(page);
> > > -     if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE)) {
> > > -             if (PageUptodate(page))
> > > -                     nfs_fscache_write_page(inode, page);
> > > -             unlock_page(page);
> > > -     }
> > > +     if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE))
> > > +             nfs_netfs_readpage_release(req);
> > > +
> > >       nfs_release_request(req);
> > >  }
> > > 
> > > @@ -177,6 +170,8 @@ static void nfs_read_completion(struct nfs_pgio_header *hdr)
> > >               nfs_list_remove_request(req);
> > >               nfs_readpage_release(req, error);
> > >       }
> > > +     nfs_netfs_read_completion(hdr);
> > > +
> > >  out:
> > >       hdr->release(hdr);
> > >  }
> > > @@ -187,6 +182,7 @@ static void nfs_initiate_read(struct nfs_pgio_header *hdr,
> > >                             struct rpc_task_setup *task_setup_data, int how)
> > >  {
> > >       rpc_ops->read_setup(hdr, msg);
> > > +     nfs_netfs_initiate_read(hdr);
> > >       trace_nfs_initiate_read(hdr);
> > >  }
> > > 
> > > @@ -202,7 +198,7 @@ nfs_async_read_error(struct list_head *head, int error)
> > >       }
> > >  }
> > > 
> > > -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
> > > +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
> > >       .error_cleanup = nfs_async_read_error,
> > >       .completion = nfs_read_completion,
> > >  };
> > > @@ -219,6 +215,7 @@ static int nfs_readpage_done(struct rpc_task *task,
> > >       if (status != 0)
> > >               return status;
> > > 
> > > +     nfs_netfs_readpage_done(hdr);
> > >       nfs_add_stats(inode, NFSIOS_SERVERREADBYTES, hdr->res.count);
> > >       trace_nfs_readpage_done(task, hdr);
> > > 
> > > @@ -294,12 +291,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> > > 
> > >       aligned_len = min_t(unsigned int, ALIGN(len, rsize), PAGE_SIZE);
> > > 
> > > -     if (!IS_SYNC(page->mapping->host)) {
> > > -             error = nfs_fscache_read_page(page->mapping->host, page);
> > > -             if (error == 0)
> > > -                     goto out_unlock;
> > > -     }
> > > -
> > >       new = nfs_create_request(ctx, page, 0, aligned_len);
> > >       if (IS_ERR(new))
> > >               goto out_error;
> > > @@ -315,8 +306,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> > >       return 0;
> > >  out_error:
> > >       error = PTR_ERR(new);
> > > -out_unlock:
> > > -     unlock_page(page);
> > >  out:
> > >       return error;
> > >  }
> > > @@ -355,6 +344,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> > >       if (NFS_STALE(inode))
> > >               goto out_unlock;
> > > 
> > > +     ret = nfs_netfs_read_folio(file, folio);
> > > +     if (!ret)
> > > +             goto out;
> > > +
> > >       if (file == NULL) {
> > >               ret = -EBADF;
> > >               ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> > > @@ -368,8 +361,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> > >                            &nfs_async_read_completion_ops);
> > > 
> > >       ret = nfs_pageio_add_page(&pgio, ctx, page);
> > > -     if (ret)
> > > -             goto out;
> > > +     if (ret) {
> > > +             put_nfs_open_context(ctx);
> > > +             goto out_unlock;
> > > +     }
> > > 
> > >       nfs_pageio_complete_read(&pgio);
> > >       ret = pgio.pg_error < 0 ? pgio.pg_error : 0;
> > > @@ -378,12 +373,12 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> > >               if (!PageUptodate(page) && !ret)
> > >                       ret = xchg(&ctx->error, 0);
> > >       }
> > > -out:
> > >       put_nfs_open_context(ctx);
> > > -     trace_nfs_aop_readpage_done(inode, page, ret);
> > > -     return ret;
> > > +     goto out;
> > > +
> > >  out_unlock:
> > >       unlock_page(page);
> > > +out:
> > >       trace_nfs_aop_readpage_done(inode, page, ret);
> > >       return ret;
> > >  }
> > > @@ -405,6 +400,10 @@ void nfs_readahead(struct readahead_control *ractl)
> > >       if (NFS_STALE(inode))
> > >               goto out;
> > > 
> > > +     ret = nfs_netfs_readahead(ractl);
> > > +     if (!ret)
> > > +             goto out;
> > > +
> > >       if (file == NULL) {
> > >               ret = -EBADF;
> > >               ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> > > diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
> > > index ba7e2e4b0926..8eeb16d9bacd 100644
> > > --- a/include/linux/nfs_page.h
> > > +++ b/include/linux/nfs_page.h
> > > @@ -101,6 +101,9 @@ struct nfs_pageio_descriptor {
> > >       struct pnfs_layout_segment *pg_lseg;
> > >       struct nfs_io_completion *pg_io_completion;
> > >       struct nfs_direct_req   *pg_dreq;
> > > +#ifdef CONFIG_NFS_FSCACHE
> > > +     void                    *pg_netfs;
> > > +#endif
> > 
> > 
> > Would it be possible to union this new field with pg_dreq? I don't think
> > they're ever both used in the same desc. There are some places that
> > check for pg_dreq == NULL that would need to be converted to use a new
> > flag or something, but that would allow us to avoid growing this struct.
> > 
> 
> Yeah it's a good point, though I'm not sure how easy it is.
> I was also thinking about whether to drop the #ifdefs in the two structures
> since the netfs NULL pointers are benign if unused.
> 
> Do you mean something like the below - bit fields to indicate whether
> the pointer is used or not?  There is an "io_flags" field inside
> nfs_pageio_descriptor.  However, it is used for FLUSH_* flags,
> so a new flag indicating union member validity doesn't seem to fit there.
> Not sure if this helps, but another way maybe to look at it is that
> after this change, we could have 3 possibilities for nfs_pageio_descriptors
> - netfs: right now only fscache enabled buffered READs are handled
> - direct IO
> - everything else: (neither direct nor netfs)
> 
> It may be possible to unionize, but it looks a bit tricky with things
> like pnfs resends (see pnfs.c and the callers of nfs_pageio_init_read
> and nfs_pageio_init_write).  We might need to update the init functions
> for nfs_pageio_descriptor and nfs_pgio_header, which looks a bit messy.
> It might be worth it, but I'm not sure - it may take some time to work
> it out safely.
> 

I didn't put a lot of thought into the suggestion. I just noticed that
the two fields aren't used at the same time and could potentially be
unioned.

My thinking was to use pg_ioflags, but you're right that they are
all named with FLUSH_* prefixes. They could be renamed though. The
bitfield flags are another option too.

I'm not sure I follow the difficulty with nfs_pageio_init_{read,write}.

In any case, I'll leave it up to Trond/Anna whether they want this done
before merging.
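
For illustration only (these helper names are hypothetical, not a
proposed kernel API), here is a standalone sketch of the union-plus-flags
idea: the pg_dreq / pg_netfs pointers share storage, and validity bits
replace the existing "pg_dreq == NULL" checks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real NFS type. */
struct nfs_direct_req { int placeholder; };

struct nfs_pageio_descriptor {
	union {
		struct nfs_direct_req	*pg_dreq;	/* valid iff pg_is_direct */
		void			*pg_netfs;	/* valid iff pg_is_netfs */
	};
	unsigned char	pg_is_direct : 1;
	unsigned char	pg_is_netfs : 1;
};

static void desc_set_direct(struct nfs_pageio_descriptor *desc,
			    struct nfs_direct_req *dreq)
{
	desc->pg_is_netfs = 0;
	desc->pg_is_direct = 1;
	desc->pg_dreq = dreq;
}

static void desc_set_netfs(struct nfs_pageio_descriptor *desc, void *netfs)
{
	desc->pg_is_direct = 0;
	desc->pg_is_netfs = 1;
	desc->pg_netfs = netfs;
}

/* Old style: "if (desc->pg_dreq == NULL)".  New style: check the flag. */
static int desc_is_direct_io(const struct nfs_pageio_descriptor *desc)
{
	return desc->pg_is_direct;
}
```

The point being that the storage is shared safely as long as every
reader consults the flag before dereferencing the unioned pointer.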

> --- a/include/linux/nfs_page.h
> +++ b/include/linux/nfs_page.h
> @@ -100,10 +100,10 @@ struct nfs_pageio_descriptor {
>         const struct nfs_pgio_completion_ops *pg_completion_ops;
>         struct pnfs_layout_segment *pg_lseg;
>         struct nfs_io_completion *pg_io_completion;
> -       struct nfs_direct_req   *pg_dreq;
> -#ifdef CONFIG_NFS_FSCACHE
> -       void                    *pg_netfs;
> -#endif
> +       union {
> +               struct nfs_direct_req   *pg_dreq;  /* pg_is_direct */
> +               void                    *pg_netfs; /* pg_is_netfs */
> +       };
>         unsigned int            pg_bsize;       /* default bsize for mirrors */
> 
>         u32                     pg_mirror_count;
> @@ -113,6 +113,8 @@ struct nfs_pageio_descriptor {
>         u32                     pg_mirror_idx;  /* current mirror */
>         unsigned short          pg_maxretrans;
>         unsigned char           pg_moreio : 1;
> +       unsigned char           pg_is_direct: 1;
> +       unsigned char           pg_is_netfs: 1;
>  };
> 
> 
> 
> 
> 
> > >       unsigned int            pg_bsize;       /* default bsize for mirrors */
> > > 
> > >       u32                     pg_mirror_count;
> > > diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> > > index e86cf6642d21..e196ef595908 100644
> > > --- a/include/linux/nfs_xdr.h
> > > +++ b/include/linux/nfs_xdr.h
> > > @@ -1619,6 +1619,9 @@ struct nfs_pgio_header {
> > >       const struct nfs_rw_ops *rw_ops;
> > >       struct nfs_io_completion *io_completion;
> > >       struct nfs_direct_req   *dreq;
> > > +#ifdef CONFIG_NFS_FSCACHE
> > > +     void                    *netfs;
> > > +#endif
> > > 
> > 
> > Maybe also here too?
> > 
> > >       int                     pnfs_error;
> > >       int                     error;          /* merge with pnfs_error */
> > 
> > --
> > Jeff Layton <jlayton@kernel.org>
> > 
> 

-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH v6 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled
  2022-09-06 10:53       ` Jeff Layton
@ 2022-09-13 20:11         ` David Wysochanski
  0 siblings, 0 replies; 8+ messages in thread
From: David Wysochanski @ 2022-09-13 20:11 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Anna Schumaker, Trond Myklebust, David Howells, linux-nfs,
	linux-cachefs, Benjamin Maynard, Daire Byrne

On Tue, Sep 6, 2022 at 6:53 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Sun, 2022-09-04 at 15:51 -0400, David Wysochanski wrote:
> > On Sun, Sep 4, 2022 at 9:59 AM Jeff Layton <jlayton@kernel.org> wrote:
> > >
> > > On Sun, 2022-09-04 at 05:05 -0400, Dave Wysochanski wrote:
> > > > Convert the NFS buffered read code paths to corresponding netfs APIs,
> > > > but only when fscache is configured and enabled.
> > > >
> > > > The netfs API defines struct netfs_request_ops which must be filled
> > > > in by the network filesystem.  For NFS, we only need to define 5 of
> > > > the functions, the main one being the issue_read() function.
> > > > The issue_read() function is called by the netfs layer when a read
> > > > cannot be fulfilled locally, and must be sent to the server (either
> > > > the cache is not active, or it is active but the data is not available).
> > > > Once the read from the server is complete, netfs requires a call to
> > > > netfs_subreq_terminated() which conveys either how many bytes were read
> > > > successfully, or an error.  Note that issue_read() is called with a
> > > > structure, netfs_io_subrequest, which defines the IO requested, and
> > > > contains a start and a length (both in bytes), and assumes the underlying
> > > > netfs will return either an error on the whole region, or the number
> > > > of bytes successfully read.
> > > >
> > > > The NFS IO path is page based and the main APIs are the pgio APIs defined
> > > > in pagelist.c.  For the pgio APIs, there is no way for the caller to
> > > > know how many RPCs will be sent and how the pages will be broken up
> > > > into underlying RPCs, each of which will have their own return code.
> > > > Thus, NFS needs some way to accommodate the netfs API requirement of a
> > > > single response to the whole request, while also minimizing
> > > > disruptive changes to the NFS pgio layer.  The approach taken with this
> > > > patch is to allocate a small structure for each nfs_netfs_issue_read() call
> > > > to keep the final error value or the number of bytes successfully read.
> > > > The refcount on the structure is used also as a marker for the last
> > > > RPC completion, updated inside nfs_netfs_initiate_read() and
> > > > nfs_netfs_readpage_done(), when an nfs_pgio_header contains a valid pointer
> > > > to the data.  Then finally in nfs_read_completion(), call into
> > > > nfs_netfs_read_completion() to update the final error value and bytes
> > > > read, and check the refcount to determine whether this is the final
> > > > RPC completion.  If this is the last RPC, then in the final put on
> > > > the structure, call into netfs_subreq_terminated() with the final
> > > > error value or the number of bytes successfully transferred.
> > > >
> > > > Suggested-by: Jeff Layton <jlayton@kernel.org>
> > > > Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> > > > ---
> > > >  fs/nfs/fscache.c         | 241 ++++++++++++++++++++++++---------------
> > > >  fs/nfs/fscache.h         |  94 +++++++++------
> > > >  fs/nfs/inode.c           |   2 +
> > > >  fs/nfs/internal.h        |   9 ++
> > > >  fs/nfs/pagelist.c        |  12 ++
> > > >  fs/nfs/read.c            |  51 ++++-----
> > > >  include/linux/nfs_page.h |   3 +
> > > >  include/linux/nfs_xdr.h  |   3 +
> > > >  8 files changed, 263 insertions(+), 152 deletions(-)
> > > >
> > > > diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> > > > index a6fc1c8b6644..9b7df3d61c35 100644
> > > > --- a/fs/nfs/fscache.c
> > > > +++ b/fs/nfs/fscache.c
> > > > @@ -15,6 +15,9 @@
> > > >  #include <linux/seq_file.h>
> > > >  #include <linux/slab.h>
> > > >  #include <linux/iversion.h>
> > > > +#include <linux/xarray.h>
> > > > +#include <linux/fscache.h>
> > > > +#include <linux/netfs.h>
> > > >
> > > >  #include "internal.h"
> > > >  #include "iostat.h"
> > > > @@ -184,7 +187,7 @@ void nfs_fscache_init_inode(struct inode *inode)
> > > >   */
> > > >  void nfs_fscache_clear_inode(struct inode *inode)
> > > >  {
> > > > -     fscache_relinquish_cookie(netfs_i_cookie(&NFS_I(inode)->netfs), false);
> > > > +     fscache_relinquish_cookie(netfs_i_cookie(netfs_inode(inode)), false);
> > > >       netfs_inode(inode)->cache = NULL;
> > > >  }
> > > >
> > > > @@ -210,7 +213,7 @@ void nfs_fscache_clear_inode(struct inode *inode)
> > > >  void nfs_fscache_open_file(struct inode *inode, struct file *filp)
> > > >  {
> > > >       struct nfs_fscache_inode_auxdata auxdata;
> > > > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > > > +     struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
> > > >       bool open_for_write = inode_is_open_for_write(inode);
> > > >
> > > >       if (!fscache_cookie_valid(cookie))
> > > > @@ -228,119 +231,169 @@ EXPORT_SYMBOL_GPL(nfs_fscache_open_file);
> > > >  void nfs_fscache_release_file(struct inode *inode, struct file *filp)
> > > >  {
> > > >       struct nfs_fscache_inode_auxdata auxdata;
> > > > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > > > +     struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode));
> > > >       loff_t i_size = i_size_read(inode);
> > > >
> > > >       nfs_fscache_update_auxdata(&auxdata, inode);
> > > >       fscache_unuse_cookie(cookie, &auxdata, &i_size);
> > > >  }
> > > >
> > > > -/*
> > > > - * Fallback page reading interface.
> > > > - */
> > > > -static int fscache_fallback_read_page(struct inode *inode, struct page *page)
> > > > +int nfs_netfs_read_folio(struct file *file, struct folio *folio)
> > > >  {
> > > > -     struct netfs_cache_resources cres;
> > > > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > > > -     struct iov_iter iter;
> > > > -     struct bio_vec bvec[1];
> > > > -     int ret;
> > > > -
> > > > -     memset(&cres, 0, sizeof(cres));
> > > > -     bvec[0].bv_page         = page;
> > > > -     bvec[0].bv_offset       = 0;
> > > > -     bvec[0].bv_len          = PAGE_SIZE;
> > > > -     iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> > > > -
> > > > -     ret = fscache_begin_read_operation(&cres, cookie);
> > > > -     if (ret < 0)
> > > > -             return ret;
> > > > -
> > > > -     ret = fscache_read(&cres, page_offset(page), &iter, NETFS_READ_HOLE_FAIL,
> > > > -                        NULL, NULL);
> > > > -     fscache_end_operation(&cres);
> > > > -     return ret;
> > > > +     if (!netfs_inode(folio_inode(folio))->cache)
> > > > +             return -ENOBUFS;
> > > > +
> > > > +     return netfs_read_folio(file, folio);
> > > >  }
> > > >
> > > > -/*
> > > > - * Fallback page writing interface.
> > > > - */
> > > > -static int fscache_fallback_write_page(struct inode *inode, struct page *page,
> > > > -                                    bool no_space_allocated_yet)
> > > > +int nfs_netfs_readahead(struct readahead_control *ractl)
> > > >  {
> > > > -     struct netfs_cache_resources cres;
> > > > -     struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > > > -     struct iov_iter iter;
> > > > -     struct bio_vec bvec[1];
> > > > -     loff_t start = page_offset(page);
> > > > -     size_t len = PAGE_SIZE;
> > > > -     int ret;
> > > > -
> > > > -     memset(&cres, 0, sizeof(cres));
> > > > -     bvec[0].bv_page         = page;
> > > > -     bvec[0].bv_offset       = 0;
> > > > -     bvec[0].bv_len          = PAGE_SIZE;
> > > > -     iov_iter_bvec(&iter, WRITE, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> > > > -
> > > > -     ret = fscache_begin_write_operation(&cres, cookie);
> > > > -     if (ret < 0)
> > > > -             return ret;
> > > > -
> > > > -     ret = cres.ops->prepare_write(&cres, &start, &len, i_size_read(inode),
> > > > -                                   no_space_allocated_yet);
> > > > -     if (ret == 0)
> > > > -             ret = fscache_write(&cres, page_offset(page), &iter, NULL, NULL);
> > > > -     fscache_end_operation(&cres);
> > > > -     return ret;
> > > > +     struct inode *inode = ractl->mapping->host;
> > > > +
> > > > +     if (!netfs_inode(inode)->cache)
> > > > +             return -ENOBUFS;
> > > > +
> > > > +     netfs_readahead(ractl);
> > > > +     return 0;
> > > >  }
> > > >
> > > > -/*
> > > > - * Retrieve a page from fscache
> > > > - */
> > > > -int __nfs_fscache_read_page(struct inode *inode, struct page *page)
> > > > +atomic_t nfs_netfs_debug_id;
> > > > +static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
> > > >  {
> > > > -     int ret;
> > > > +     rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
> > > > +     rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
> > > >
> > > > -     trace_nfs_fscache_read_page(inode, page);
> > > > -     if (PageChecked(page)) {
> > > > -             ClearPageChecked(page);
> > > > -             ret = 1;
> > > > -             goto out;
> > > > -     }
> > > > +     return 0;
> > > > +}
> > > >
> > > > -     ret = fscache_fallback_read_page(inode, page);
> > > > -     if (ret < 0) {
> > > > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_FAIL);
> > > > -             SetPageChecked(page);
> > > > -             goto out;
> > > > -     }
> > > > +static void nfs_netfs_free_request(struct netfs_io_request *rreq)
> > > > +{
> > > > +     put_nfs_open_context(rreq->netfs_priv);
> > > > +}
> > > >
> > > > -     /* Read completed synchronously */
> > > > -     nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_OK);
> > > > -     SetPageUptodate(page);
> > > > -     ret = 0;
> > > > -out:
> > > > -     trace_nfs_fscache_read_page_exit(inode, page, ret);
> > > > -     return ret;
> > > > +static inline int nfs_netfs_begin_cache_operation(struct netfs_io_request *rreq)
> > > > +{
> > > > +     return fscache_begin_read_operation(&rreq->cache_resources,
> > > > +                                         netfs_i_cookie(netfs_inode(rreq->inode)));
> > > >  }
> > > >
> > > > -/*
> > > > - * Store a newly fetched page in fscache.  We can be certain there's no page
> > > > - * stored in the cache as yet otherwise we would've read it from there.
> > > > - */
> > > > -void __nfs_fscache_write_page(struct inode *inode, struct page *page)
> > > > +static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
> > > >  {
> > > > -     int ret;
> > > > +     struct nfs_netfs_io_data *netfs;
> > > > +
> > > > +     netfs = kzalloc(sizeof(*netfs), GFP_KERNEL_ACCOUNT);
> > > > +     if (!netfs)
> > > > +             return NULL;
> > > > +     netfs->sreq = sreq;
> > > > +     refcount_set(&netfs->refcount, 1);
> > > > +     return netfs;
> > > > +}
> > > >
> > > > -     trace_nfs_fscache_write_page(inode, page);
> > > > +static bool nfs_netfs_clamp_length(struct netfs_io_subrequest *sreq)
> > > > +{
> > > > +     size_t  rsize = NFS_SB(sreq->rreq->inode->i_sb)->rsize;
> > > >
> > > > -     ret = fscache_fallback_write_page(inode, page, true);
> > > > +     sreq->len = min(sreq->len, rsize);
> > > > +     return true;
> > > > +}
> > > >
> > > > -     if (ret != 0) {
> > > > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_FAIL);
> > > > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_UNCACHED);
> > > > -     } else {
> > > > -             nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_OK);
> > > > +static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
> > > > +{
> > > > +     struct nfs_pageio_descriptor pgio;
> > > > +     struct inode *inode = sreq->rreq->inode;
> > > > +     struct nfs_open_context *ctx = sreq->rreq->netfs_priv;
> > > > +     struct page *page;
> > > > +     int err;
> > > > +     pgoff_t start = (sreq->start + sreq->transferred) >> PAGE_SHIFT;
> > > > +     pgoff_t last = ((sreq->start + sreq->len -
> > > > +                      sreq->transferred - 1) >> PAGE_SHIFT);
> > > > +     XA_STATE(xas, &sreq->rreq->mapping->i_pages, start);
> > > > +
> > > > +     nfs_pageio_init_read(&pgio, inode, false,
> > > > +                          &nfs_async_read_completion_ops);
> > > > +
> > > > +     pgio.pg_netfs = nfs_netfs_alloc(sreq); /* used in completion */
> > > > +     if (!pgio.pg_netfs)
> > > > +             return netfs_subreq_terminated(sreq, -ENOMEM, false);
> > > > +
> > > > +     xas_lock(&xas);
> > > > +     xas_for_each(&xas, page, last) {
> > > > +             /* nfs_pageio_add_page() may schedule() due to pNFS layout and other RPCs  */
> > > > +             xas_pause(&xas);
> > > > +             xas_unlock(&xas);
> > > > +             err = nfs_pageio_add_page(&pgio, ctx, page);
> > > > +             if (err < 0)
> > > > +                     return netfs_subreq_terminated(sreq, err, false);
> > > > +             xas_lock(&xas);
> > > >       }
> > > > -     trace_nfs_fscache_write_page_exit(inode, page, ret);
> > > > +     xas_unlock(&xas);
> > > > +     nfs_pageio_complete_read(&pgio);
> > > > +     nfs_netfs_put(pgio.pg_netfs);
> > > >  }
> > > > +
> > > > +void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr)
> > > > +{
> > > > +     struct nfs_netfs_io_data        *netfs = hdr->netfs;
> > > > +
> > > > +     if (!netfs)
> > > > +             return;
> > > > +
> > > > +     nfs_netfs_get(netfs);
> > > > +}
> > > > +
> > > > +void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr)
> > > > +{
> > > > +     struct nfs_netfs_io_data        *netfs = hdr->netfs;
> > > > +
> > > > +     if (!netfs)
> > > > +             return;
> > > > +
> > > > +     if (hdr->res.op_status)
> > > > +             /*
> > > > +              * Retryable errors such as BAD_STATEID will be re-issued,
> > > > +              * so reduce refcount.
> > > > +              */
> > > > +             nfs_netfs_put(netfs);
> > > > +}
> > > > +
> > > > +void nfs_netfs_readpage_release(struct nfs_page *req)
> > > > +{
> > > > +     struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
> > > > +
> > > > +     /*
> > > > +      * If fscache is enabled, netfs will unlock pages.
> > > > +      */
> > > > +     if (netfs_inode(inode)->cache)
> > > > +             return;
> > > > +
> > > > +     unlock_page(req->wb_page);
> > > > +}
> > > > +
> > > > +void nfs_netfs_read_completion(struct nfs_pgio_header *hdr)
> > > > +{
> > > > +     struct nfs_netfs_io_data        *netfs = hdr->netfs;
> > > > +     struct netfs_io_subrequest      *sreq;
> > > > +
> > > > +     if (!netfs)
> > > > +             return;
> > > > +
> > > > +     sreq = netfs->sreq;
> > > > +     if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
> > > > +             __set_bit(NETFS_SREQ_CLEAR_TAIL, &sreq->flags);
> > > > +
> > > > +     if (hdr->error)
> > > > +             netfs->error = hdr->error;
> > > > +     else
> > > > +             atomic64_add(hdr->res.count, &netfs->transferred);
> > > > +
> > > > +     nfs_netfs_put(netfs);
> > > > +     hdr->netfs = NULL;
> > > > +}
> > > > +
> > > > +const struct netfs_request_ops nfs_netfs_ops = {
> > > > +     .init_request           = nfs_netfs_init_request,
> > > > +     .free_request           = nfs_netfs_free_request,
> > > > +     .begin_cache_operation  = nfs_netfs_begin_cache_operation,
> > > > +     .issue_read             = nfs_netfs_issue_read,
> > > > +     .clamp_length           = nfs_netfs_clamp_length
> > > > +};
> > > > diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
> > > > index 38614ed8f951..fb782b917235 100644
> > > > --- a/fs/nfs/fscache.h
> > > > +++ b/fs/nfs/fscache.h
> > > > @@ -34,6 +34,49 @@ struct nfs_fscache_inode_auxdata {
> > > >       u64     change_attr;
> > > >  };
> > > >
> > > > +struct nfs_netfs_io_data {
> > > > +     /*
> > > > +      * NFS may split a netfs_io_subrequest into multiple RPCs, each
> > > > +      * with their own read completion.  In netfs, we can only call
> > > > +      * netfs_subreq_terminated() once for each subrequest.  Use the
> > > > +      * refcount here to double as a marker of the last RPC completion,
> > > > +      * and only call netfs via netfs_subreq_terminated() once.
> > > > +      */
> > > > +     refcount_t                      refcount;
> > > > +     struct netfs_io_subrequest      *sreq;
> > > > +
> > > > +     /*
> > > > +      * Final disposition of the netfs_io_subrequest, sent in
> > > > +      * netfs_subreq_terminated()
> > > > +      */
> > > > +     atomic64_t      transferred;
> > > > +     int             error;
> > > > +};
> > > > +
> > > > +static inline void nfs_netfs_get(struct nfs_netfs_io_data *netfs)
> > > > +{
> > > > +     refcount_inc(&netfs->refcount);
> > > > +}
> > > > +
> > > > +static inline void nfs_netfs_put(struct nfs_netfs_io_data *netfs)
> > > > +{
> > > > +     /* Only the last RPC completion should call netfs_subreq_terminated() */
> > > > +     if (refcount_dec_and_test(&netfs->refcount)) {
> > > > +             netfs_subreq_terminated(netfs->sreq,
> > > > +                                     netfs->error ?: atomic64_read(&netfs->transferred),
> > > > +                                     false);
> > > > +             kfree(netfs);
> > > > +     }
> > > > +}
> > > > +static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi)
> > > > +{
> > > > +     netfs_inode_init(&nfsi->netfs, &nfs_netfs_ops);
> > > > +}
> > > > +extern void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr);
> > > > +extern void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr);
> > > > +extern void nfs_netfs_read_completion(struct nfs_pgio_header *hdr);
> > > > +extern void nfs_netfs_readpage_release(struct nfs_page *req);
> > > > +
> > > >  /*
> > > >   * fscache.c
> > > >   */
> > > > @@ -44,9 +87,8 @@ extern void nfs_fscache_init_inode(struct inode *);
> > > >  extern void nfs_fscache_clear_inode(struct inode *);
> > > >  extern void nfs_fscache_open_file(struct inode *, struct file *);
> > > >  extern void nfs_fscache_release_file(struct inode *, struct file *);
> > > > -
> > > > -extern int __nfs_fscache_read_page(struct inode *, struct page *);
> > > > -extern void __nfs_fscache_write_page(struct inode *, struct page *);
> > > > +extern int nfs_netfs_readahead(struct readahead_control *ractl);
> > > > +extern int nfs_netfs_read_folio(struct file *file, struct folio *folio);
> > > >
> > > >  static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > > >  {
> > > > @@ -54,34 +96,11 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > > >               if (current_is_kswapd() || !(gfp & __GFP_FS))
> > > >                       return false;
> > > >               folio_wait_fscache(folio);
> > > > -             fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> > > > -             nfs_inc_fscache_stats(folio->mapping->host,
> > > > -                                   NFSIOS_FSCACHE_PAGES_UNCACHED);
> > > >       }
> > > > +     fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> > > >       return true;
> > > >  }
> > > >
> > > > -/*
> > > > - * Retrieve a page from an inode data storage object.
> > > > - */
> > > > -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> > > > -{
> > > > -     if (netfs_inode(inode)->cache)
> > > > -             return __nfs_fscache_read_page(inode, page);
> > > > -     return -ENOBUFS;
> > > > -}
> > > > -
> > > > -/*
> > > > - * Store a page newly fetched from the server in an inode data storage object
> > > > - * in the cache.
> > > > - */
> > > > -static inline void nfs_fscache_write_page(struct inode *inode,
> > > > -                                        struct page *page)
> > > > -{
> > > > -     if (netfs_inode(inode)->cache)
> > > > -             __nfs_fscache_write_page(inode, page);
> > > > -}
> > > > -
> > > >  static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
> > > >                                             struct inode *inode)
> > > >  {
> > > > @@ -118,6 +137,14 @@ static inline const char *nfs_server_fscache_state(struct nfs_server *server)
> > > >  }
> > > >
> > > >  #else /* CONFIG_NFS_FSCACHE */
> > > > +static inline void nfs_netfs_inode_init(struct nfs_inode *nfsi) {}
> > > > +static inline void nfs_netfs_initiate_read(struct nfs_pgio_header *hdr) {}
> > > > +static inline void nfs_netfs_readpage_done(struct nfs_pgio_header *hdr) {}
> > > > +static inline void nfs_netfs_read_completion(struct nfs_pgio_header *hdr) {}
> > > > +static inline void nfs_netfs_readpage_release(struct nfs_page *req)
> > > > +{
> > > > +     unlock_page(req->wb_page);
> > > > +}
> > > >  static inline void nfs_fscache_release_super_cookie(struct super_block *sb) {}
> > > >
> > > >  static inline void nfs_fscache_init_inode(struct inode *inode) {}
> > > > @@ -125,16 +152,19 @@ static inline void nfs_fscache_clear_inode(struct inode *inode) {}
> > > >  static inline void nfs_fscache_open_file(struct inode *inode,
> > > >                                        struct file *filp) {}
> > > >  static inline void nfs_fscache_release_file(struct inode *inode, struct file *file) {}
> > > > -
> > > > -static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > > > +static inline int nfs_netfs_readahead(struct readahead_control *ractl)
> > > >  {
> > > > -     return true; /* may release folio */
> > > > +     return -ENOBUFS;
> > > >  }
> > > > -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> > > > +static inline int nfs_netfs_read_folio(struct file *file, struct folio *folio)
> > > >  {
> > > >       return -ENOBUFS;
> > > >  }
> > > > -static inline void nfs_fscache_write_page(struct inode *inode, struct page *page) {}
> > > > +
> > > > +static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> > > > +{
> > > > +     return true; /* may release folio */
> > > > +}
> > > >  static inline void nfs_fscache_invalidate(struct inode *inode, int flags) {}
> > > >
> > > >  static inline const char *nfs_server_fscache_state(struct nfs_server *server)
> > > > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> > > > index aa2aec785ab5..b36a02b932e8 100644
> > > > --- a/fs/nfs/inode.c
> > > > +++ b/fs/nfs/inode.c
> > > > @@ -2249,6 +2249,8 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
> > > >  #ifdef CONFIG_NFS_V4_2
> > > >       nfsi->xattr_cache = NULL;
> > > >  #endif
> > > > +     nfs_netfs_inode_init(nfsi);
> > > > +
> > > >       return VFS_I(nfsi);
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(nfs_alloc_inode);
> > > > diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> > > > index 273687082992..e5589036c1f8 100644
> > > > --- a/fs/nfs/internal.h
> > > > +++ b/fs/nfs/internal.h
> > > > @@ -453,6 +453,10 @@ extern void nfs_sb_deactive(struct super_block *sb);
> > > >  extern int nfs_client_for_each_server(struct nfs_client *clp,
> > > >                                     int (*fn)(struct nfs_server *, void *),
> > > >                                     void *data);
> > > > +#ifdef CONFIG_NFS_FSCACHE
> > > > +extern const struct netfs_request_ops nfs_netfs_ops;
> > > > +#endif
> > > > +
> > > >  /* io.c */
> > > >  extern void nfs_start_io_read(struct inode *inode);
> > > >  extern void nfs_end_io_read(struct inode *inode);
> > > > @@ -482,9 +486,14 @@ extern int nfs4_get_rootfh(struct nfs_server *server, struct nfs_fh *mntfh, bool
> > > >
> > > >  struct nfs_pgio_completion_ops;
> > > >  /* read.c */
> > > > +extern const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> > > >  extern void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
> > > >                       struct inode *inode, bool force_mds,
> > > >                       const struct nfs_pgio_completion_ops *compl_ops);
> > > > +extern int nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> > > > +                            struct nfs_open_context *ctx,
> > > > +                            struct page *page);
> > > > +extern void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio);
> > > >  extern void nfs_read_prepare(struct rpc_task *task, void *calldata);
> > > >  extern void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio);
> > > >
> > > > diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> > > > index 317cedfa52bf..e28754476d1b 100644
> > > > --- a/fs/nfs/pagelist.c
> > > > +++ b/fs/nfs/pagelist.c
> > > > @@ -25,6 +25,7 @@
> > > >  #include "internal.h"
> > > >  #include "pnfs.h"
> > > >  #include "nfstrace.h"
> > > > +#include "fscache.h"
> > > >
> > > >  #define NFSDBG_FACILITY              NFSDBG_PAGECACHE
> > > >
> > > > @@ -68,6 +69,10 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
> > > >       hdr->good_bytes = mirror->pg_count;
> > > >       hdr->io_completion = desc->pg_io_completion;
> > > >       hdr->dreq = desc->pg_dreq;
> > > > +#ifdef CONFIG_NFS_FSCACHE
> > > > +     if (desc->pg_netfs)
> > > > +             hdr->netfs = desc->pg_netfs;
> > > > +#endif
> > > >       hdr->release = release;
> > > >       hdr->completion_ops = desc->pg_completion_ops;
> > > >       if (hdr->completion_ops->init_hdr)
> > > > @@ -846,6 +851,9 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
> > > >       desc->pg_lseg = NULL;
> > > >       desc->pg_io_completion = NULL;
> > > >       desc->pg_dreq = NULL;
> > > > +#ifdef CONFIG_NFS_FSCACHE
> > > > +     desc->pg_netfs = NULL;
> > > > +#endif
> > > >       desc->pg_bsize = bsize;
> > > >
> > > >       desc->pg_mirror_count = 1;
> > > > @@ -940,6 +948,7 @@ int nfs_generic_pgio(struct nfs_pageio_descriptor *desc,
> > > >       /* Set up the argument struct */
> > > >       nfs_pgio_rpcsetup(hdr, mirror->pg_count, desc->pg_ioflags, &cinfo);
> > > >       desc->pg_rpc_callops = &nfs_pgio_common_ops;
> > > > +
> > > >       return 0;
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(nfs_generic_pgio);
> > > > @@ -1360,6 +1369,9 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,
> > > >
> > > >       desc->pg_io_completion = hdr->io_completion;
> > > >       desc->pg_dreq = hdr->dreq;
> > > > +#ifdef CONFIG_NFS_FSCACHE
> > > > +     desc->pg_netfs = hdr->netfs;
> > > > +#endif
> > > >       list_splice_init(&hdr->pages, &pages);
> > > >       while (!list_empty(&pages)) {
> > > >               struct nfs_page *req = nfs_list_entry(pages.next);
> > > > diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> > > > index 525e82ea9a9e..c74c5fcba87d 100644
> > > > --- a/fs/nfs/read.c
> > > > +++ b/fs/nfs/read.c
> > > > @@ -30,7 +30,7 @@
> > > >
> > > >  #define NFSDBG_FACILITY              NFSDBG_PAGECACHE
> > > >
> > > > -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> > > > +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> > > >  static const struct nfs_rw_ops nfs_rw_read_ops;
> > > >
> > > >  static struct kmem_cache *nfs_rdata_cachep;
> > > > @@ -74,7 +74,7 @@ void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(nfs_pageio_init_read);
> > > >
> > > > -static void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
> > > > +void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
> > > >  {
> > > >       struct nfs_pgio_mirror *pgm;
> > > >       unsigned long npages;
> > > > @@ -110,20 +110,13 @@ EXPORT_SYMBOL_GPL(nfs_pageio_reset_read_mds);
> > > >
> > > >  static void nfs_readpage_release(struct nfs_page *req, int error)
> > > >  {
> > > > -     struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
> > > >       struct page *page = req->wb_page;
> > > >
> > > > -     dprintk("NFS: read done (%s/%llu %d@%lld)\n", inode->i_sb->s_id,
> > > > -             (unsigned long long)NFS_FILEID(inode), req->wb_bytes,
> > > > -             (long long)req_offset(req));
> > > > -
> > > >       if (nfs_error_is_fatal_on_server(error) && error != -ETIMEDOUT)
> > > >               SetPageError(page);
> > > > -     if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE)) {
> > > > -             if (PageUptodate(page))
> > > > -                     nfs_fscache_write_page(inode, page);
> > > > -             unlock_page(page);
> > > > -     }
> > > > +     if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE))
> > > > +             nfs_netfs_readpage_release(req);
> > > > +
> > > >       nfs_release_request(req);
> > > >  }
> > > >
> > > > @@ -177,6 +170,8 @@ static void nfs_read_completion(struct nfs_pgio_header *hdr)
> > > >               nfs_list_remove_request(req);
> > > >               nfs_readpage_release(req, error);
> > > >       }
> > > > +     nfs_netfs_read_completion(hdr);
> > > > +
> > > >  out:
> > > >       hdr->release(hdr);
> > > >  }
> > > > @@ -187,6 +182,7 @@ static void nfs_initiate_read(struct nfs_pgio_header *hdr,
> > > >                             struct rpc_task_setup *task_setup_data, int how)
> > > >  {
> > > >       rpc_ops->read_setup(hdr, msg);
> > > > +     nfs_netfs_initiate_read(hdr);
> > > >       trace_nfs_initiate_read(hdr);
> > > >  }
> > > >
> > > > @@ -202,7 +198,7 @@ nfs_async_read_error(struct list_head *head, int error)
> > > >       }
> > > >  }
> > > >
> > > > -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
> > > > +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
> > > >       .error_cleanup = nfs_async_read_error,
> > > >       .completion = nfs_read_completion,
> > > >  };
> > > > @@ -219,6 +215,7 @@ static int nfs_readpage_done(struct rpc_task *task,
> > > >       if (status != 0)
> > > >               return status;
> > > >
> > > > +     nfs_netfs_readpage_done(hdr);
> > > >       nfs_add_stats(inode, NFSIOS_SERVERREADBYTES, hdr->res.count);
> > > >       trace_nfs_readpage_done(task, hdr);
> > > >
> > > > @@ -294,12 +291,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> > > >
> > > >       aligned_len = min_t(unsigned int, ALIGN(len, rsize), PAGE_SIZE);
> > > >
> > > > -     if (!IS_SYNC(page->mapping->host)) {
> > > > -             error = nfs_fscache_read_page(page->mapping->host, page);
> > > > -             if (error == 0)
> > > > -                     goto out_unlock;
> > > > -     }
> > > > -
> > > >       new = nfs_create_request(ctx, page, 0, aligned_len);
> > > >       if (IS_ERR(new))
> > > >               goto out_error;
> > > > @@ -315,8 +306,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> > > >       return 0;
> > > >  out_error:
> > > >       error = PTR_ERR(new);
> > > > -out_unlock:
> > > > -     unlock_page(page);
> > > >  out:
> > > >       return error;
> > > >  }
> > > > @@ -355,6 +344,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> > > >       if (NFS_STALE(inode))
> > > >               goto out_unlock;
> > > >
> > > > +     ret = nfs_netfs_read_folio(file, folio);
> > > > +     if (!ret)
> > > > +             goto out;
> > > > +
> > > >       if (file == NULL) {
> > > >               ret = -EBADF;
> > > >               ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> > > > @@ -368,8 +361,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> > > >                            &nfs_async_read_completion_ops);
> > > >
> > > >       ret = nfs_pageio_add_page(&pgio, ctx, page);
> > > > -     if (ret)
> > > > -             goto out;
> > > > +     if (ret) {
> > > > +             put_nfs_open_context(ctx);
> > > > +             goto out_unlock;
> > > > +     }
> > > >
> > > >       nfs_pageio_complete_read(&pgio);
> > > >       ret = pgio.pg_error < 0 ? pgio.pg_error : 0;
> > > > @@ -378,12 +373,12 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> > > >               if (!PageUptodate(page) && !ret)
> > > >                       ret = xchg(&ctx->error, 0);
> > > >       }
> > > > -out:
> > > >       put_nfs_open_context(ctx);
> > > > -     trace_nfs_aop_readpage_done(inode, page, ret);
> > > > -     return ret;
> > > > +     goto out;
> > > > +
> > > >  out_unlock:
> > > >       unlock_page(page);
> > > > +out:
> > > >       trace_nfs_aop_readpage_done(inode, page, ret);
> > > >       return ret;
> > > >  }
> > > > @@ -405,6 +400,10 @@ void nfs_readahead(struct readahead_control *ractl)
> > > >       if (NFS_STALE(inode))
> > > >               goto out;
> > > >
> > > > +     ret = nfs_netfs_readahead(ractl);
> > > > +     if (!ret)
> > > > +             goto out;
> > > > +
> > > >       if (file == NULL) {
> > > >               ret = -EBADF;
> > > >               ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> > > > diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
> > > > index ba7e2e4b0926..8eeb16d9bacd 100644
> > > > --- a/include/linux/nfs_page.h
> > > > +++ b/include/linux/nfs_page.h
> > > > @@ -101,6 +101,9 @@ struct nfs_pageio_descriptor {
> > > >       struct pnfs_layout_segment *pg_lseg;
> > > >       struct nfs_io_completion *pg_io_completion;
> > > >       struct nfs_direct_req   *pg_dreq;
> > > > +#ifdef CONFIG_NFS_FSCACHE
> > > > +     void                    *pg_netfs;
> > > > +#endif
> > >
> > >
> > > Would it be possible to union this new field with pg_dreq? I don't think
> > > they're ever both used in the same desc. There are some places that
> > > check for pg_dreq == NULL that would need to be converted to use a new
> > > flag or something, but that would allow us to avoid growing this struct.
> > >
> >
> > Yeah it's a good point, though I'm not sure how easy it is.
> > I was also thinking about whether to drop the #ifdefs in the two structures
> > since the netfs NULL pointers are benign if unused.
> >
> > Do you mean something like the below - bit fields to indicate whether
> > the pointer is used or not?  There is an "io_flags" field inside
> > nfs_pageio_descriptor.  However, it is used for FLUSH_* flags,
> > so a new flag indicating union member validity doesn't seem to fit there.
> > Not sure if this helps, but maybe another way to look at it is that
> > after this change, we could have 3 possibilities for nfs_pageio_descriptors:
> > - netfs: right now only fscache enabled buffered READs are handled
> > - direct IO
> > - everything else: (neither direct nor netfs)
> >
> > It may be possible to unionize, but looks a bit tricky with things like
> > pnfs resends (see pnfs.c and callers of nfs_pageio_init_read and
> > nfs_pageio_init_write).  Might need to update the init functions for
> > nfs_pageio_descriptor and nfs_pgio_header and looks a bit messy.
> > Might be worth it but not sure - may take some time to work it out
> > safely.
> >
>
> I didn't put a lot of thought into the suggestion. I just noticed that
> the two fields aren't used at the same time and could potentially be
> unioned.
>
> My thinking was to use the pg_ioflags, but you're right that they are
> all named with FLUSH_* prefixes. They could be renamed though. The
> bitfield flags are another option too.
>
> I'm not sure I follow the difficulty with nfs_pageio_init_{read,write}.
>
Once a NULL dreq pointer no longer carries meaning, we run into
issues with pnfs_read_done_resend_to_mds(),
pnfs_read_resend_pnfs(), and pnfs_write_done_resend_to_mds(),
because a new nfs_pageio_descriptor is initialized from an
nfs_pgio_header.  So we would then need validity flags inside
struct nfs_pgio_header too.
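
To make the resend problem concrete, here is a minimal userspace sketch
(the structure and function names are hypothetical stand-ins, not the real
NFS definitions) of why the validity bits would have to be mirrored into
struct nfs_pgio_header once the two pointers share a union:

```c
#include <assert.h>
#include <stddef.h>

struct demo_dreq { int id; };           /* stand-in for struct nfs_direct_req */

struct demo_header {                    /* models struct nfs_pgio_header */
	union {
		struct demo_dreq *dreq;
		void *netfs;
	};
	unsigned char dreq_valid : 1;   /* would need to mirror the descriptor */
	unsigned char netfs_valid : 1;
};

struct demo_desc {                      /* models struct nfs_pageio_descriptor */
	union {
		struct demo_dreq *dreq;
		void *netfs;
	};
	unsigned char dreq_valid : 1;
	unsigned char netfs_valid : 1;
};

/*
 * A pNFS resend re-initializes a descriptor from a header.  With the
 * pointers unioned, a NULL check can no longer tell which member is
 * live, so the header must carry validity flags of its own.
 */
static void demo_resend_init(struct demo_desc *desc,
			     const struct demo_header *hdr)
{
	desc->dreq_valid = hdr->dreq_valid;
	desc->netfs_valid = hdr->netfs_valid;
	if (hdr->dreq_valid)
		desc->dreq = hdr->dreq;
	else if (hdr->netfs_valid)
		desc->netfs = hdr->netfs;
}
```

This is roughly the propagation that nfs_pageio_resend() and the pnfs
resend paths would have to perform.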


> In any case, I'll leave it up to Trond/Anna whether they want this done
> before merging.
>
I took a stab at this but did not address the issues with nfs_pgio_header.
I am not sure about this type of change because it gets into refactoring
the main NFS IO path, which I think Trond has objected to.  Personally
I don't think saving a single pointer is worth it, unless we get into other
refactoring of the IO path, and maybe add helpers for initializing
these structures.  Plus there are other pointers in nfs_pageio_descriptor
that may or may not get used, such as "pg_lseg", so adding pg_netfs
seems reasonable to me.
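
If the union approach were ever pursued, one way to keep initialization
from getting messy would be small setter helpers that keep each pointer
and its validity bit in sync. A rough userspace sketch (hypothetical
names, not part of the posted patch):

```c
#include <assert.h>
#include <stddef.h>

struct demo_dreq { int id; };           /* stand-in for struct nfs_direct_req */

/* Hypothetical descriptor mirroring the unionized layout proposed below. */
struct demo_pgio {
	union {
		struct demo_dreq *dreq;
		void *netfs;
	};
	unsigned char dreq_valid : 1;
	unsigned char netfs_valid : 1;
};

/*
 * Setter helpers keep the union member and its validity bit in sync,
 * so a call site can never set one without updating the other.
 */
static inline void demo_pgio_set_dreq(struct demo_pgio *p,
				      struct demo_dreq *dreq)
{
	p->dreq = dreq;
	p->dreq_valid = 1;
	p->netfs_valid = 0;
}

static inline void demo_pgio_set_netfs(struct demo_pgio *p, void *netfs)
{
	p->netfs = netfs;
	p->netfs_valid = 1;
	p->dreq_valid = 0;
}
```

With helpers like these, the direct IO and fscache paths would each make
one call instead of assigning the pointer and two flags by hand.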

Trond or Anna, any thoughts here?


---
 fs/nfs/direct.c          |  4 ++++
 fs/nfs/fscache.c         |  2 ++
 fs/nfs/pagelist.c        | 19 ++++++++-----------
 fs/nfs/pnfs.c            |  2 +-
 include/linux/nfs_page.h | 11 +++++++----
 5 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 1707f46b1335..92930280485d 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -324,6 +324,8 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
                  &nfs_direct_read_completion_ops);
     get_dreq(dreq);
     desc.pg_dreq = dreq;
+    desc.pg_dreq_valid = 1;
+    desc.pg_netfs_valid = 0;
     inode_dio_begin(inode);

     while (iov_iter_count(iter)) {
@@ -526,6 +528,8 @@ static void nfs_direct_write_reschedule(struct nfs_direct_req *dreq)
     nfs_pageio_init_write(&desc, dreq->inode, FLUSH_STABLE, false,
                   &nfs_direct_write_completion_ops);
     desc.pg_dreq = dreq;
+    desc.pg_dreq_valid = 1;
+    desc.pg_netfs_valid = 0;

     list_for_each_entry_safe(req, tmp, &reqs, wb_list) {
         /* Bump the transmission count */
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 9b7df3d61c35..9231815dd543 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -313,6 +313,8 @@ static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
                  &nfs_async_read_completion_ops);

     pgio.pg_netfs = nfs_netfs_alloc(sreq); /* used in completion */
+    pgio.pg_netfs_valid = 1;
+    pgio.pg_dreq_valid = 0;
     if (!pgio.pg_netfs)
         return netfs_subreq_terminated(sreq, -ENOMEM, false);

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index e28754476d1b..461fc0c395a7 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -68,11 +68,10 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
     hdr->io_start = req_offset(hdr->req);
     hdr->good_bytes = mirror->pg_count;
     hdr->io_completion = desc->pg_io_completion;
-    hdr->dreq = desc->pg_dreq;
-#ifdef CONFIG_NFS_FSCACHE
-    if (desc->pg_netfs)
+    if (desc->pg_dreq_valid)
+        hdr->dreq = desc->pg_dreq;
+    if (desc->pg_netfs_valid)
         hdr->netfs = desc->pg_netfs;
-#endif
     hdr->release = release;
     hdr->completion_ops = desc->pg_completion_ops;
     if (hdr->completion_ops->init_hdr)
@@ -851,9 +850,7 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
     desc->pg_lseg = NULL;
     desc->pg_io_completion = NULL;
     desc->pg_dreq = NULL;
-#ifdef CONFIG_NFS_FSCACHE
     desc->pg_netfs = NULL;
-#endif
     desc->pg_bsize = bsize;

     desc->pg_mirror_count = 1;
@@ -920,7 +917,7 @@ int nfs_generic_pgio(struct nfs_pageio_descriptor *desc,
         }
     }

-    nfs_init_cinfo(&cinfo, desc->pg_inode, desc->pg_dreq);
+    nfs_init_cinfo(&cinfo, desc->pg_inode, desc->pg_dreq_valid ? desc->pg_dreq : NULL);
     pages = hdr->page_array.pagevec;
     last_page = NULL;
     pageused = 0;
@@ -1368,10 +1365,10 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,
     LIST_HEAD(pages);

     desc->pg_io_completion = hdr->io_completion;
-    desc->pg_dreq = hdr->dreq;
-#ifdef CONFIG_NFS_FSCACHE
-    desc->pg_netfs = hdr->netfs;
-#endif
+    if (desc->pg_dreq_valid)
+        desc->pg_dreq = hdr->dreq;
+    if (desc->pg_netfs_valid)
+        desc->pg_netfs = hdr->netfs;
     list_splice_init(&hdr->pages, &pages);
     while (!list_empty(&pages)) {
         struct nfs_page *req = nfs_list_entry(pages.next);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 035bf2eac2cf..d48a97df6f6b 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2708,7 +2708,7 @@ pnfs_generic_pg_init_read(struct nfs_pageio_descriptor *pgio, struct nfs_page *r
     pnfs_generic_pg_check_layout(pgio);
     pnfs_generic_pg_check_range(pgio, req);
     if (pgio->pg_lseg == NULL) {
-        if (pgio->pg_dreq == NULL)
+        if (!pgio->pg_dreq_valid && pgio->pg_dreq == NULL)
             rd_size = i_size_read(pgio->pg_inode) - req_offset(req);
         else
             rd_size = nfs_dreq_bytes_left(pgio->pg_dreq);
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index 8eeb16d9bacd..82bb77ba57b7 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -100,10 +100,11 @@ struct nfs_pageio_descriptor {
     const struct nfs_pgio_completion_ops *pg_completion_ops;
     struct pnfs_layout_segment *pg_lseg;
     struct nfs_io_completion *pg_io_completion;
-    struct nfs_direct_req    *pg_dreq;
-#ifdef CONFIG_NFS_FSCACHE
-    void            *pg_netfs;
-#endif
+    union {
+        struct nfs_direct_req   *pg_dreq;  /* see pg_dreq_valid */
+        void                    *pg_netfs; /* see pg_netfs_valid */
+    };
+
     unsigned int        pg_bsize;    /* default bsize for mirrors */

     u32            pg_mirror_count;
@@ -113,6 +114,8 @@ struct nfs_pageio_descriptor {
     u32            pg_mirror_idx;    /* current mirror */
     unsigned short        pg_maxretrans;
     unsigned char        pg_moreio : 1;
+    unsigned char        pg_dreq_valid: 1;  /* pg_dreq is in use */
+    unsigned char        pg_netfs_valid: 1; /* pg_netfs is in use */
 };

 /* arbitrarily selected limit to number of mirrors */
-- 
2.35.3



Thread overview: 8+ messages
2022-09-04  9:05 [PATCH v6 0/3] Convert NFS with fscache to the netfs API Dave Wysochanski
2022-09-04  9:05 ` [PATCH v6 1/3] NFS: Rename readpage_async_filler to nfs_pageio_add_page Dave Wysochanski
2022-09-04  9:05 ` [PATCH v6 2/3] NFS: Configure support for netfs when NFS fscache is configured Dave Wysochanski
2022-09-04  9:05 ` [PATCH v6 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled Dave Wysochanski
2022-09-04 13:59   ` Jeff Layton
2022-09-04 19:51     ` David Wysochanski
2022-09-06 10:53       ` Jeff Layton
2022-09-13 20:11         ` David Wysochanski
