linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 00/10] NFS client readdir per-page validation
@ 2021-01-20 16:59 Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 01/10] NFS: save the directory's change attribute on pagecache pages Benjamin Coddington
                   ` (9 more replies)
  0 siblings, 10 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

Due to the constraint that the NFS readdir page cache must contain every
entry in cookie order from zero up to the entry of interest, the time or
operations required to complete a directory listing increase exponentially
with the size of the directory if the client is unable to keep the pagecache
stable.  The pagecache can be invalidated by a changing directory, or by
memory pressure on the client.  This can cause some trouble for the NFS
client reading large directories over slow connections.

We have a hueristic that allows eventual completion, but it only works as
long as there are no other readers simultaneously filling the pagecache.

I think we can resolve this problem by implementing per-page validation.  By
storing the directory's change version on the page, and checking for changes
to the directory on every READDIR, we can validate pages against each
reader's version of entry aligment.  Rather than attempting to assemble the
entire directory in a consistent manner in the pagecache, we can just
retrieve the section we're interested in emitting.

This set is a first pass at implementing this idea.  Please help me pound it
into acceptable shape or point out problems!  Thanks for any feedback.

Here's a small program that does a great job of demonstraing the client's
current readdir pagecache performance problem by dropping the directory's
pagecache at an interval while trying to emit every entry:

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <signal.h>

#define BUF_SIZE 1024
#define EVICT_INTERVAL 5

int evict_pagecache(int fd) {
                return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}

int main(int argc, char **argv)
{
        int dir_fd;
        pid_t pid;
        cpu_set_t *cpusetp = CPU_ALLOC(2);
        off_t off;

        char buf[BUF_SIZE];

        if (argc < 2) { printf("%s <dir>\n", argv[0]); return 1; }

        dir_fd = open(argv[1], O_RDONLY|O_DIRECTORY|O_CLOEXEC);
        if (dir_fd < 0) { printf("cannot open dir\n"); return 1; }

        pid = fork();
        if (pid == 0) {
                CPU_SET(1, cpusetp);
                sched_setaffinity(0, sizeof(cpu_set_t), cpusetp);
                do {
                        evict_pagecache(dir_fd);
                        off = lseek(dir_fd, 0, SEEK_CUR);
                        printf("currently at %llu\n", off);
                        usleep(EVICT_INTERVAL * 1000000);
                } while (1);
        } else {
                CPU_SET(0, cpusetp);
                sched_setaffinity(0, sizeof(cpu_set_t), cpusetp);
                while (syscall(SYS_getdents, dir_fd, buf, BUF_SIZE)) {}
                kill(pid, SIGINT);
        }

        close(dir_fd);
        return 0;
}


Benjamin Coddington (10):
  NFS: save the directory's change attribute on pagecache pages
  NFSv4: Send GETATTR with READDIR
  NFS: Add a struct to track readdir pagecache location
  NFS: Keep the readdir pagecache cursor updated
  NFS: readdir per-page cache validation
  NFS: stash the readdir pagecache cursor on the open directory context
  NFS: Support headless readdir pagecache pages
  NFS: Reset pagecache cursor on llseek
  NFS: Remove nfs_readdir_dont_search_cache()
  NFS: Revalidate the directory pagecache on every nfs_readdir()

 fs/nfs/dir.c              | 210 +++++++++++++++++++++++++++-----------
 fs/nfs/nfs42proc.c        |   2 +-
 fs/nfs/nfs4proc.c         |  27 +++--
 fs/nfs/nfs4xdr.c          |   6 ++
 include/linux/nfs_fs.h    |   8 +-
 include/linux/nfs_fs_sb.h |   5 +
 include/linux/nfs_xdr.h   |   2 +
 7 files changed, 188 insertions(+), 72 deletions(-)

-- 
2.25.4


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v1 01/10] NFS: save the directory's change attribute on pagecache pages
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 02/10] NFSv4: Send GETATTR with READDIR Benjamin Coddington
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

After a pagecache page has been filled with entries, set PagePrivate and
the directory's change attribute on the page.  This will help us perform
per-page invalidations in a later patch.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index ef827ae193d2..ade73ca42a52 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -53,6 +53,9 @@ static int nfs_closedir(struct inode *, struct file *);
 static int nfs_readdir(struct file *, struct dir_context *);
 static int nfs_fsync_dir(struct file *, loff_t, loff_t, int);
 static loff_t nfs_llseek_dir(struct file *, loff_t, int);
+static void nfs_readdir_invalidatepage(struct page *,
+			unsigned int, unsigned int);
+static int nfs_readdir_clear_page(struct page*, gfp_t);
 static void nfs_readdir_clear_array(struct page*);
 
 const struct file_operations nfs_dir_operations = {
@@ -65,6 +68,8 @@ const struct file_operations nfs_dir_operations = {
 };
 
 const struct address_space_operations nfs_dir_aops = {
+	.invalidatepage = nfs_readdir_invalidatepage,
+	.releasepage = nfs_readdir_clear_page,
 	.freepage = nfs_readdir_clear_array,
 };
 
@@ -181,6 +186,27 @@ static void nfs_readdir_page_init_array(struct page *page, u64 last_cookie)
 	array->last_cookie = last_cookie;
 	array->cookies_are_ordered = 1;
 	kunmap_atomic(array);
+	set_page_private(page, 0);
+}
+
+static int
+nfs_readdir_clear_page(struct page *page, gfp_t gfp_mask)
+{
+	detach_page_private(page);
+	return 1;
+}
+
+static void
+nfs_readdir_invalidatepage(struct page *page, unsigned int offset,
+							unsigned int length)
+{
+	nfs_readdir_clear_page(page, GFP_KERNEL);
+}
+
+static void
+nfs_readdir_set_page_verifier(struct page *page, unsigned long verf)
+{
+	attach_page_private(page, (void *)verf);
 }
 
 /*
@@ -744,6 +770,8 @@ static int nfs_readdir_page_filler(struct nfs_readdir_descriptor *desc,
 		if (status != -ENOSPC)
 			continue;
 
+		nfs_readdir_set_page_verifier(page, desc->dir_verifier);
+
 		if (page->mapping != mapping) {
 			if (!--narrays)
 				break;
@@ -770,10 +798,13 @@ static int nfs_readdir_page_filler(struct nfs_readdir_descriptor *desc,
 	case -EBADCOOKIE:
 		if (entry->eof) {
 			nfs_readdir_page_set_eof(page);
+			nfs_readdir_set_page_verifier(page, desc->dir_verifier);
 			status = 0;
 		}
 		break;
 	case -ENOSPC:
+		nfs_readdir_set_page_verifier(page, desc->dir_verifier);
+		fallthrough;
 	case -EAGAIN:
 		status = 0;
 		break;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 02/10] NFSv4: Send GETATTR with READDIR
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 01/10] NFS: save the directory's change attribute on pagecache pages Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 03/10] NFS: Add a struct to track readdir pagecache location Benjamin Coddington
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

For each batch of entries, track whether the directory has changed.  We can
use this information to better manage the cache when reading long
directories.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/nfs42proc.c        |  2 +-
 fs/nfs/nfs4proc.c         | 27 +++++++++++++++++++--------
 fs/nfs/nfs4xdr.c          |  6 ++++++
 include/linux/nfs_fs_sb.h |  5 +++++
 include/linux/nfs_xdr.h   |  2 ++
 5 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index f3fd935620fc..576296728889 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -1011,7 +1011,7 @@ static int _nfs42_proc_clone(struct rpc_message *msg, struct file *src_f,
 		.src_offset = src_offset,
 		.dst_offset = dst_offset,
 		.count = count,
-		.dst_bitmask = server->cache_consistency_bitmask,
+		.dst_bitmask = server->cache_consistency_bitmask_nl,
 	};
 	struct nfs42_clone_res res = {
 		.server	= server,
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 2f4679a62712..5a2ea98dc1a6 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3654,7 +3654,7 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
 	if (calldata->arg.fmode == 0 || calldata->arg.fmode == FMODE_READ) {
 		/* Close-to-open cache consistency revalidation */
 		if (!nfs4_have_delegation(inode, FMODE_READ)) {
-			calldata->arg.bitmask = NFS_SERVER(inode)->cache_consistency_bitmask;
+			calldata->arg.bitmask = NFS_SERVER(inode)->cache_consistency_bitmask_nl;
 			nfs4_bitmask_adjust(calldata->arg.bitmask, inode, NFS_SERVER(inode), NULL);
 		} else
 			calldata->arg.bitmask = NULL;
@@ -3882,7 +3882,10 @@ static int _nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *f
 		memcpy(server->cache_consistency_bitmask, res.attr_bitmask, sizeof(server->cache_consistency_bitmask));
 		server->cache_consistency_bitmask[0] &= FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE;
 		server->cache_consistency_bitmask[1] &= FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
-		server->cache_consistency_bitmask[2] = 0;
+		server->cache_consistency_bitmask[2] = res.attr_bitmask[2] & FATTR4_WORD2_SECURITY_LABEL;
+
+		memcpy(server->cache_consistency_bitmask_nl, server->cache_consistency_bitmask, sizeof(server->cache_consistency_bitmask));
+		server->cache_consistency_bitmask_nl[2] = 0;
 
 		/* Avoid a regression due to buggy server */
 		for (i = 0; i < ARRAY_SIZE(res.exclcreat_bitmask); i++)
@@ -4451,7 +4454,7 @@ static int _nfs4_proc_access(struct inode *inode, struct nfs_access_entry *entry
 		res.fattr = nfs_alloc_fattr();
 		if (res.fattr == NULL)
 			return -ENOMEM;
-		args.bitmask = server->cache_consistency_bitmask;
+		args.bitmask = server->cache_consistency_bitmask_nl;
 	}
 	status = nfs4_call_sync(server->client, server, &msg, &args.seq_args, &res.seq_res, 0);
 	if (!status) {
@@ -4980,14 +4983,19 @@ static int _nfs4_proc_readdir(struct nfs_readdir_arg *nr_arg,
 		.rpc_resp = &res,
 		.rpc_cred = nr_arg->cred,
 	};
-	int			status;
+	int			status = -ENOMEM;
 
 	dprintk("%s: dentry = %pd2, cookie = %llu\n", __func__,
 		nr_arg->dentry, (unsigned long long)nr_arg->cookie);
 	if (!(server->caps & NFS_CAP_SECURITY_LABEL))
-		args.bitmask = server->attr_bitmask_nl;
+		args.bitmask = server->cache_consistency_bitmask_nl;
 	else
-		args.bitmask = server->attr_bitmask;
+		args.bitmask = server->cache_consistency_bitmask;
+
+	res.dir_attr = nfs_alloc_fattr();
+	if (res.dir_attr == NULL)
+		goto out;
+	res.server = NFS_SERVER(dir);
 
 	nfs4_setup_readdir(nr_arg->cookie, nr_arg->verf, nr_arg->dentry, &args);
 	res.pgbase = args.pgbase;
@@ -5000,6 +5008,9 @@ static int _nfs4_proc_readdir(struct nfs_readdir_arg *nr_arg,
 
 	nfs_invalidate_atime(dir);
 
+	nfs_refresh_inode(dir, res.dir_attr);
+	nfs_free_fattr(res.dir_attr);
+out:
 	dprintk("%s: returns %d\n", __func__, status);
 	return status;
 }
@@ -5465,7 +5476,7 @@ static void nfs4_proc_write_setup(struct nfs_pgio_header *hdr,
 		hdr->args.bitmask = NULL;
 		hdr->res.fattr = NULL;
 	} else {
-		hdr->args.bitmask = server->cache_consistency_bitmask;
+		hdr->args.bitmask = server->cache_consistency_bitmask_nl;
 		nfs4_bitmask_adjust(hdr->args.bitmask, hdr->inode, server, NULL);
 	}
 
@@ -6505,7 +6516,7 @@ static int _nfs4_proc_delegreturn(struct inode *inode, const struct cred *cred,
 
 	data->args.fhandle = &data->fh;
 	data->args.stateid = &data->stateid;
-	data->args.bitmask = server->cache_consistency_bitmask;
+	data->args.bitmask = server->cache_consistency_bitmask_nl;
 	nfs4_bitmask_adjust(data->args.bitmask, inode, server, NULL);
 	nfs_copy_fh(&data->fh, NFS_FH(inode));
 	nfs4_stateid_copy(&data->stateid, stateid);
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index ac6b79ee9355..02a3df42d61d 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -459,10 +459,12 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
 #define NFS4_enc_readdir_sz	(compound_encode_hdr_maxsz + \
 				encode_sequence_maxsz + \
 				encode_putfh_maxsz + \
+				encode_getattr_maxsz + \
 				encode_readdir_maxsz)
 #define NFS4_dec_readdir_sz	(compound_decode_hdr_maxsz + \
 				decode_sequence_maxsz + \
 				decode_putfh_maxsz + \
+				decode_getattr_maxsz + \
 				decode_readdir_maxsz)
 #define NFS4_enc_write_sz	(compound_encode_hdr_maxsz + \
 				encode_sequence_maxsz + \
@@ -2519,6 +2521,7 @@ static void nfs4_xdr_enc_readdir(struct rpc_rqst *req, struct xdr_stream *xdr,
 	encode_compound_hdr(xdr, req, &hdr);
 	encode_sequence(xdr, &args->seq_args, &hdr);
 	encode_putfh(xdr, args->fh, &hdr);
+	encode_getfattr(xdr, args->bitmask, &hdr);
 	encode_readdir(xdr, args, req, &hdr);
 
 	rpc_prepare_reply_pages(req, args->pages, args->pgbase,
@@ -6703,6 +6706,9 @@ static int nfs4_xdr_dec_readdir(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
 	if (status)
 		goto out;
 	status = decode_putfh(xdr);
+	if (status)
+		goto out;
+	status = decode_getfattr(xdr, res->dir_attr, res->server);
 	if (status)
 		goto out;
 	status = decode_readdir(xdr, rqstp, res);
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 38e60ec742df..8b20ff42c22d 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -211,6 +211,11 @@ struct nfs_server {
 						   of change attribute, size, ctime
 						   and mtime attributes supported by
 						   the server */
+	u32			cache_consistency_bitmask_nl[3];
+						/* V4 bitmask representing the subset
+						   of change attribute, size, ctime
+						   and mtime attributes supported by
+						   the server excluding label support */
 	u32			acl_bitmask;	/* V4 bitmask representing the ACEs
 						   that are supported on this
 						   filesystem */
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 3327239fa2f9..de12053ed4d3 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1139,6 +1139,8 @@ struct nfs4_readdir_res {
 	struct nfs4_sequence_res	seq_res;
 	nfs4_verifier			verifier;
 	unsigned int			pgbase;
+	struct nfs_fattr		*dir_attr;
+	const struct nfs_server *server;
 };
 
 struct nfs4_readlink {
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 03/10] NFS: Add a struct to track readdir pagecache location
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 01/10] NFS: save the directory's change attribute on pagecache pages Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 02/10] NFSv4: Send GETATTR with READDIR Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 04/10] NFS: Keep the readdir pagecache cursor updated Benjamin Coddington
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

Directory entries in the NFS readdir pagecache are referenced by their
cookie value and offset.  By defining a structure to group these values,
we'll simplify changes to validate pagecache pages in patches that follow.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c           | 34 +++++++++++++++++-----------------
 include/linux/nfs_fs.h |  6 ++++++
 2 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index ade73ca42a52..6626aff9f54d 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -154,10 +154,9 @@ struct nfs_readdir_descriptor {
 	struct file	*file;
 	struct page	*page;
 	struct dir_context *ctx;
-	pgoff_t		page_index;
 	u64		dir_cookie;
-	u64		last_cookie;
 	u64		dup_cookie;
+	struct nfs_dir_page_cursor pgc;
 	loff_t		current_index;
 	loff_t		prev_index;
 
@@ -166,7 +165,6 @@ struct nfs_readdir_descriptor {
 	unsigned long	timestamp;
 	unsigned long	gencount;
 	unsigned long	attr_gencount;
-	unsigned int	cache_entry_index;
 	signed char duped;
 	bool plus;
 	bool eof;
@@ -426,7 +424,7 @@ static int nfs_readdir_search_for_pos(struct nfs_cache_array *array,
 
 	index = (unsigned int)diff;
 	desc->dir_cookie = array->array[index].cookie;
-	desc->cache_entry_index = index;
+	desc->pgc.entry_index = index;
 	return 0;
 out_eof:
 	desc->eof = true;
@@ -494,7 +492,7 @@ static int nfs_readdir_search_for_cookie(struct nfs_cache_array *array,
 			else
 				desc->ctx->pos = new_pos;
 			desc->prev_index = new_pos;
-			desc->cache_entry_index = i;
+			desc->pgc.entry_index = i;
 			return 0;
 		}
 	}
@@ -521,9 +519,9 @@ static int nfs_readdir_search_array(struct nfs_readdir_descriptor *desc)
 		status = nfs_readdir_search_for_cookie(array, desc);
 
 	if (status == -EAGAIN) {
-		desc->last_cookie = array->last_cookie;
+		desc->pgc.index_cookie = array->last_cookie;
 		desc->current_index += array->size;
-		desc->page_index++;
+		desc->pgc.page_index++;
 	}
 	kunmap_atomic(array);
 	return status;
@@ -927,8 +925,8 @@ static struct page *
 nfs_readdir_page_get_cached(struct nfs_readdir_descriptor *desc)
 {
 	return nfs_readdir_page_get_locked(desc->file->f_mapping,
-					   desc->page_index,
-					   desc->last_cookie);
+					   desc->pgc.page_index,
+					   desc->pgc.index_cookie);
 }
 
 /*
@@ -952,7 +950,7 @@ static int find_and_lock_cache_page(struct nfs_readdir_descriptor *desc)
 			nfs_readdir_page_unlock_and_put_cached(desc);
 			if (res == -EBADCOOKIE || res == -ENOTSYNC) {
 				invalidate_inode_pages2(desc->file->f_mapping);
-				desc->page_index = 0;
+				desc->pgc.page_index = 0;
 				return -EAGAIN;
 			}
 			return res;
@@ -961,7 +959,7 @@ static int find_and_lock_cache_page(struct nfs_readdir_descriptor *desc)
 	}
 	res = nfs_readdir_search_array(desc);
 	if (res == 0) {
-		nfsi->page_index = desc->page_index;
+		nfsi->page_index = desc->pgc.page_index;
 		return 0;
 	}
 	nfs_readdir_page_unlock_and_put_cached(desc);
@@ -991,10 +989,10 @@ static int readdir_search_pagecache(struct nfs_readdir_descriptor *desc)
 		return -EBADCOOKIE;
 
 	do {
-		if (desc->page_index == 0) {
+		if (desc->pgc.page_index == 0) {
 			desc->current_index = 0;
 			desc->prev_index = 0;
-			desc->last_cookie = 0;
+			desc->pgc.index_cookie = 0;
 		}
 		res = find_and_lock_cache_page(desc);
 	} while (res == -EAGAIN);
@@ -1012,7 +1010,7 @@ static void nfs_do_filldir(struct nfs_readdir_descriptor *desc)
 	unsigned int i = 0;
 
 	array = kmap(desc->page);
-	for (i = desc->cache_entry_index; i < array->size; i++) {
+	for (i = desc->pgc.entry_index; i < array->size; i++) {
 		struct nfs_cache_array_entry *ent;
 
 		ent = &array->array[i];
@@ -1070,8 +1068,10 @@ static int uncached_readdir(struct nfs_readdir_descriptor *desc)
 	if (!arrays[0])
 		goto out;
 
-	desc->page_index = 0;
-	desc->last_cookie = desc->dir_cookie;
+	/* These values aren't referenced until we return, move/remove? */
+	desc->pgc.page_index = 0;
+	desc->pgc.index_cookie = desc->dir_cookie;
+
 	desc->duped = 0;
 
 	status = nfs_readdir_xdr_to_array(desc, desc->verf, verf, arrays, sz);
@@ -1154,7 +1154,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 		if (res == -ETOOSMALL && desc->plus) {
 			clear_bit(NFS_INO_ADVISE_RDPLUS, &NFS_I(inode)->flags);
 			nfs_zap_caches(inode);
-			desc->page_index = 0;
+			desc->pgc.page_index = 0;
 			desc->plus = false;
 			desc->eof = false;
 			continue;
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 681ed98e4ba8..557e54b7d555 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -91,6 +91,12 @@ struct nfs_open_context {
 	struct rcu_head	rcu_head;
 };
 
+struct nfs_dir_page_cursor {
+	__u64 index_cookie;
+	pgoff_t page_index;
+	unsigned int entry_index;
+};
+
 struct nfs_open_dir_context {
 	struct list_head list;
 	unsigned long attr_gencount;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 04/10] NFS: Keep the readdir pagecache cursor updated
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
                   ` (2 preceding siblings ...)
  2021-01-20 16:59 ` [PATCH v1 03/10] NFS: Add a struct to track readdir pagecache location Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  2021-01-21 20:00   ` Trond Myklebust
  2021-01-20 16:59 ` [PATCH v1 05/10] NFS: readdir per-page cache validation Benjamin Coddington
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

Whenever we successfully locate our dir_cookie within the pagecache, or
finish emitting entries to userspace, update the pagecache cursor.  These
updates provide marker points to validate pagecache pages in a future
patch.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 6626aff9f54d..7f6c84c8a412 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -150,6 +150,10 @@ struct nfs_cache_array {
 	struct nfs_cache_array_entry array[];
 };
 
+static const int cache_entries_per_page =
+	(PAGE_SIZE - sizeof(struct nfs_cache_array)) /
+	sizeof(struct nfs_cache_array_entry);
+
 struct nfs_readdir_descriptor {
 	struct file	*file;
 	struct page	*page;
@@ -251,6 +255,21 @@ static bool nfs_readdir_array_is_full(struct nfs_cache_array *array)
 	return array->page_full;
 }
 
+static void nfs_readdir_set_cursor(struct nfs_readdir_descriptor *desc, int index)
+{
+	desc->pgc.entry_index = index;
+	desc->pgc.index_cookie = desc->dir_cookie;
+}
+
+static void nfs_readdir_cursor_next(struct nfs_dir_page_cursor *pgc, u64 cookie)
+{
+	pgc->index_cookie = cookie;
+	if (++pgc->entry_index == cache_entries_per_page) {
+		pgc->entry_index = 0;
+		pgc->page_index++;
+	}
+}
+
 /*
  * the caller is responsible for freeing qstr.name
  * when called by nfs_readdir_add_to_array, the strings will be freed in
@@ -424,7 +443,7 @@ static int nfs_readdir_search_for_pos(struct nfs_cache_array *array,
 
 	index = (unsigned int)diff;
 	desc->dir_cookie = array->array[index].cookie;
-	desc->pgc.entry_index = index;
+	nfs_readdir_set_cursor(desc, index);
 	return 0;
 out_eof:
 	desc->eof = true;
@@ -492,7 +511,7 @@ static int nfs_readdir_search_for_cookie(struct nfs_cache_array *array,
 			else
 				desc->ctx->pos = new_pos;
 			desc->prev_index = new_pos;
-			desc->pgc.entry_index = i;
+			nfs_readdir_set_cursor(desc, i);
 			return 0;
 		}
 	}
@@ -519,9 +538,9 @@ static int nfs_readdir_search_array(struct nfs_readdir_descriptor *desc)
 		status = nfs_readdir_search_for_cookie(array, desc);
 
 	if (status == -EAGAIN) {
-		desc->pgc.index_cookie = array->last_cookie;
+		desc->pgc.entry_index = array->size - 1;
+		nfs_readdir_cursor_next(&desc->pgc, array->last_cookie);
 		desc->current_index += array->size;
-		desc->pgc.page_index++;
 	}
 	kunmap_atomic(array);
 	return status;
@@ -1035,6 +1054,8 @@ static void nfs_do_filldir(struct nfs_readdir_descriptor *desc)
 		desc->eof = true;
 
 	kunmap(desc->page);
+	desc->pgc.entry_index = i-1;
+	nfs_readdir_cursor_next(&desc->pgc, desc->dir_cookie);
 	dfprintk(DIRCACHE, "NFS: nfs_do_filldir() filling ended @ cookie %llu\n",
 			(unsigned long long)desc->dir_cookie);
 }
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 05/10] NFS: readdir per-page cache validation
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
                   ` (3 preceding siblings ...)
  2021-01-20 16:59 ` [PATCH v1 04/10] NFS: Keep the readdir pagecache cursor updated Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  2021-01-20 20:08   ` kernel test robot
  2021-01-20 16:59 ` [PATCH v1 06/10] NFS: stash the readdir pagecache cursor on the open directory context Benjamin Coddington
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

The current implementation of the readdir page cache requires that all
pages contain entries ordered such that the cookie references lead to the
first entry as represented by cookie 0.  The invalidation of the cache
truncates either the entire cache or every page beyond a known good page.

A process that wants to emit directory entries near the end of a directory
must first fill in any entries missing in the cache near the beginning of
the directory in order that the entries decoded from READDIR XDR are
appropriately page-aligned for any readers thay may come later (and for
some error handling).

However, if we're careful to check the alignment of directory entries on
each page when the page is read, then it should be permissable to allow
"disconnected" filling of the pagecache.  Rather than requiring pagecache
data to always be positionally aligned, we can instead validate that each
page is properly aligned to the reading process' directory context. If it
doesn't match our alignment, we'll refresh the entries in the page so that
it does.

This patch implements a check for validity for each page as it is obtained
from the pagecache.  A page is valid if it was filled within the client's
current version of the directory and if the entries are aligned with the
current reader's directory context.

Invalid pages are re-filled by READDIR operations before being used to emit
entries for the current reader.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c | 71 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 57 insertions(+), 14 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 7f6c84c8a412..5fc22d97054a 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -188,13 +188,16 @@ static void nfs_readdir_page_init_array(struct page *page, u64 last_cookie)
 	array->last_cookie = last_cookie;
 	array->cookies_are_ordered = 1;
 	kunmap_atomic(array);
-	set_page_private(page, 0);
+	if (page->mapping)
+		set_page_private(page, nfs_save_change_attribute(page->mapping->host));
+	SetPageUptodate(page);
 }
 
 static int
 nfs_readdir_clear_page(struct page *page, gfp_t gfp_mask)
 {
-	detach_page_private(page);
+	unsigned long change_attr;
+	change_attr = (unsigned long)detach_page_private(page);
 	return 1;
 }
 
@@ -225,6 +228,15 @@ void nfs_readdir_clear_array(struct page *page)
 		kfree(array->array[i].name);
 	nfs_readdir_array_init(array);
 	kunmap_atomic(array);
+	ClearPageUptodate(page);
+}
+
+static void
+nfs_readdir_recycle_page(struct page *page, u64 last_cookie)
+{
+	nfs_readdir_clear_array(page);
+	nfs_readdir_invalidatepage(page, 0, 0);
+	nfs_readdir_page_init_array(page, last_cookie);
 }
 
 static struct page *
@@ -341,18 +353,47 @@ int nfs_readdir_add_to_array(struct nfs_entry *entry, struct page *page)
 	return ret;
 }
 
+static bool
+nfs_readdir_page_valid(struct page *page, unsigned int entry_index, u64 index_cookie)
+{
+	bool ret = false;
+	struct nfs_cache_array *array;
+
+	if (page_private(page) != nfs_save_change_attribute(page->mapping->host))
+		goto out;
+
+	ret = true;
+	array = kmap_atomic(page);
+
+	if (array->size == 0 && array->last_cookie == index_cookie)
+		goto out_unmap;
+
+	if (array->size > entry_index &&
+		array->array[entry_index].cookie == index_cookie)
+		goto out_unmap;
+
+	ret = false;
+out_unmap:
+	kunmap_atomic(array);
+out:
+	return ret;
+}
+
 static struct page *nfs_readdir_page_get_locked(struct address_space *mapping,
-						pgoff_t index, u64 last_cookie)
+						struct nfs_dir_page_cursor *pgc)
 {
 	struct page *page;
 
-	page = grab_cache_page(mapping, index);
-	if (page && !PageUptodate(page)) {
-		nfs_readdir_page_init_array(page, last_cookie);
-		if (invalidate_inode_pages2_range(mapping, index + 1, -1) < 0)
-			nfs_zap_mapping(mapping->host, mapping);
-		SetPageUptodate(page);
-	}
+	page = grab_cache_page(mapping, pgc->page_index);
+
+	if (!page)
+		return page;
+
+	if (!PageUptodate(page))
+		nfs_readdir_page_init_array(page, pgc->index_cookie);
+
+	if (!nfs_readdir_page_valid(page, pgc->entry_index, pgc->index_cookie))
+		nfs_readdir_recycle_page(page, pgc->index_cookie);
 
 	return page;
 }
@@ -398,8 +439,12 @@ static struct page *nfs_readdir_page_get_next(struct address_space *mapping,
 					      pgoff_t index, u64 cookie)
 {
 	struct page *page;
+	struct nfs_dir_page_cursor pgc = {
+		.page_index = index,
+		.index_cookie = cookie,
+	};
 
-	page = nfs_readdir_page_get_locked(mapping, index, cookie);
+	page = nfs_readdir_page_get_locked(mapping, &pgc);
 	if (page) {
 		if (nfs_readdir_page_last_cookie(page) == cookie)
 			return page;
@@ -943,9 +988,7 @@ nfs_readdir_page_unlock_and_put_cached(struct nfs_readdir_descriptor *desc)
 static struct page *
 nfs_readdir_page_get_cached(struct nfs_readdir_descriptor *desc)
 {
-	return nfs_readdir_page_get_locked(desc->file->f_mapping,
-					   desc->pgc.page_index,
-					   desc->pgc.index_cookie);
+	return nfs_readdir_page_get_locked(desc->file->f_mapping, &desc->pgc);
 }
 
 /*
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 06/10] NFS: stash the readdir pagecache cursor on the open directory context
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
                   ` (4 preceding siblings ...)
  2021-01-20 16:59 ` [PATCH v1 05/10] NFS: readdir per-page cache validation Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 07/10] NFS: Support headless readdir pagecache pages Benjamin Coddington
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

Now that we have per-page cache validation, we can allow filling of the
pagecache from arbitrary offsets.  Store the pagecache cursor on the open
directory context so it can be used to validate our entries the next time
we enter nfs_readdir() without having to fill the pagecache from the
beginning.

The open_directory_context's dir_cookie and index_cookie will store the
same value between calls to nfs_readdir(), so we can save a little room in
the struct by dropping dir_cookie.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c           | 15 ++++++++-------
 include/linux/nfs_fs.h |  2 +-
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 5fc22d97054a..4e21b21c75d0 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -81,7 +81,7 @@ static struct nfs_open_dir_context *alloc_nfs_open_dir_context(struct inode *dir
 	if (ctx != NULL) {
 		ctx->duped = 0;
 		ctx->attr_gencount = nfsi->attr_gencount;
-		ctx->dir_cookie = 0;
+		memset(&ctx->pgc, 0, sizeof(struct nfs_dir_page_cursor));
 		ctx->dup_cookie = 0;
 		spin_lock(&dir->i_lock);
 		if (list_empty(&nfsi->open_files) &&
@@ -365,7 +365,8 @@ nfs_readdir_page_valid(struct page *page, unsigned int entry_index, u64 index_co
 	ret = true;
 	array = kmap_atomic(page);
 
-	if (array->size == 0 && array->last_cookie == index_cookie)
+	if ((array->size == 0 || array->size == entry_index)
+		&& array->last_cookie == index_cookie)
 		goto out_unmap;
 
 	if (array->size > entry_index &&
@@ -1054,7 +1055,6 @@ static int readdir_search_pagecache(struct nfs_readdir_descriptor *desc)
 		if (desc->pgc.page_index == 0) {
 			desc->current_index = 0;
 			desc->prev_index = 0;
-			desc->pgc.index_cookie = 0;
 		}
 		res = find_and_lock_cache_page(desc);
 	} while (res == -EAGAIN);
@@ -1192,7 +1192,8 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	desc->plus = nfs_use_readdirplus(inode, ctx);
 
 	spin_lock(&file->f_lock);
-	desc->dir_cookie = dir_ctx->dir_cookie;
+	desc->dir_cookie = dir_ctx->pgc.index_cookie;
+	desc->pgc = dir_ctx->pgc;
 	desc->dup_cookie = dir_ctx->dup_cookie;
 	desc->duped = dir_ctx->duped;
 	desc->attr_gencount = dir_ctx->attr_gencount;
@@ -1231,7 +1232,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	} while (!desc->eof);
 
 	spin_lock(&file->f_lock);
-	dir_ctx->dir_cookie = desc->dir_cookie;
+	dir_ctx->pgc = desc->pgc;
 	dir_ctx->dup_cookie = desc->dup_cookie;
 	dir_ctx->duped = desc->duped;
 	dir_ctx->attr_gencount = desc->attr_gencount;
@@ -1273,9 +1274,9 @@ static loff_t nfs_llseek_dir(struct file *filp, loff_t offset, int whence)
 	if (offset != filp->f_pos) {
 		filp->f_pos = offset;
 		if (nfs_readdir_use_cookie(filp))
-			dir_ctx->dir_cookie = offset;
+			dir_ctx->pgc.index_cookie = offset;
 		else
-			dir_ctx->dir_cookie = 0;
+			dir_ctx->pgc.index_cookie = 0;
 		if (offset == 0)
 			memset(dir_ctx->verf, 0, sizeof(dir_ctx->verf));
 		dir_ctx->duped = 0;
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 557e54b7d555..cd35137a935b 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -101,7 +101,7 @@ struct nfs_open_dir_context {
 	struct list_head list;
 	unsigned long attr_gencount;
 	__be32	verf[NFS_DIR_VERIFIER_SIZE];
-	__u64 dir_cookie;
+	struct nfs_dir_page_cursor pgc;
 	__u64 dup_cookie;
 	signed char duped;
 };
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 07/10] NFS: Support headless readdir pagecache pages
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
                   ` (5 preceding siblings ...)
  2021-01-20 16:59 ` [PATCH v1 06/10] NFS: stash the readdir pagecache cursor on the open directory context Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  2021-01-21 17:24   ` Anna Schumaker
  2021-01-20 16:59 ` [PATCH v1 08/10] NFS: Reset pagecache cursor on llseek Benjamin Coddington
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

It is now possible that a reader will resume a directory listing after an
invalidation and fill the rest of the pages with the offset left over from
the last partially-filled page.  These pages will then be recycled and
refilled by the next reader since their alignment is incorrect.

Add an index to the nfs_cache_array that will indicate where the next entry
should be filled.  This allows partially-filled pages to have the best
alignment possible.  They are more likely to be useful to readers that
follow.

This optimization targets the case when there are multiple processes
listing the directory simultaneously.  Often the processes will collect and
block on the same page waiting for a READDIR call to fill the pagecache.
If the pagecache is invalidated, a partially-filled page will usually
result.  This partially-filled page can immediately be used by all
processes to emit entries rather than having to discard and refill it for
every process.

The addition of another integer to struct nfs_cache_array increases its
size to 24 bytes. We do not lose the original capacity of 127 entries per
page.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c | 45 ++++++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 4e21b21c75d0..d6101e45fd66 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -143,6 +143,7 @@ struct nfs_cache_array_entry {
 
 struct nfs_cache_array {
 	u64 last_cookie;
+	unsigned int index;
 	unsigned int size;
 	unsigned char page_full : 1,
 		      page_is_eof : 1,
@@ -179,13 +180,15 @@ static void nfs_readdir_array_init(struct nfs_cache_array *array)
 	memset(array, 0, sizeof(struct nfs_cache_array));
 }
 
-static void nfs_readdir_page_init_array(struct page *page, u64 last_cookie)
+static void
+nfs_readdir_page_init_array(struct page *page, struct nfs_dir_page_cursor *pgc)
 {
 	struct nfs_cache_array *array;
 
 	array = kmap_atomic(page);
 	nfs_readdir_array_init(array);
-	array->last_cookie = last_cookie;
+	array->last_cookie = pgc->index_cookie;
+	array->index = pgc->entry_index;
 	array->cookies_are_ordered = 1;
 	kunmap_atomic(array);
 	if (page->mapping)
@@ -224,7 +227,7 @@ void nfs_readdir_clear_array(struct page *page)
 	int i;
 
 	array = kmap_atomic(page);
-	for (i = 0; i < array->size; i++)
+	for (i = array->index - array->size; i < array->size; i++)
 		kfree(array->array[i].name);
 	nfs_readdir_array_init(array);
 	kunmap_atomic(array);
@@ -232,19 +235,20 @@ void nfs_readdir_clear_array(struct page *page)
 }
 
 static void
-nfs_readdir_recycle_page(struct page *page, u64 last_cookie)
+nfs_readdir_recycle_page(struct page *page, struct nfs_dir_page_cursor *pgc)
 {
 	nfs_readdir_clear_array(page);
 	nfs_readdir_invalidatepage(page, 0, 0);
-	nfs_readdir_page_init_array(page, last_cookie);
+	nfs_readdir_page_init_array(page, pgc);
 }
 
 static struct page *
 nfs_readdir_page_array_alloc(u64 last_cookie, gfp_t gfp_flags)
 {
 	struct page *page = alloc_page(gfp_flags);
+	struct nfs_dir_page_cursor pgc = { .index_cookie = last_cookie };
 	if (page)
-		nfs_readdir_page_init_array(page, last_cookie);
+		nfs_readdir_page_init_array(page, &pgc);
 	return page;
 }
 
@@ -309,7 +313,7 @@ static int nfs_readdir_array_can_expand(struct nfs_cache_array *array)
 
 	if (array->page_full)
 		return -ENOSPC;
-	cache_entry = &array->array[array->size + 1];
+	cache_entry = &array->array[array->index + 1];
 	if ((char *)cache_entry - (char *)array > PAGE_SIZE) {
 		array->page_full = 1;
 		return -ENOSPC;
@@ -336,7 +340,7 @@ int nfs_readdir_add_to_array(struct nfs_entry *entry, struct page *page)
 		goto out;
 	}
 
-	cache_entry = &array->array[array->size];
+	cache_entry = &array->array[array->index];
 	cache_entry->cookie = entry->prev_cookie;
 	cache_entry->ino = entry->ino;
 	cache_entry->d_type = entry->d_type;
@@ -345,6 +349,7 @@ int nfs_readdir_add_to_array(struct nfs_entry *entry, struct page *page)
 	array->last_cookie = entry->cookie;
 	if (array->last_cookie <= cache_entry->cookie)
 		array->cookies_are_ordered = 0;
+	array->index++;
 	array->size++;
 	if (entry->eof != 0)
 		nfs_readdir_array_set_eof(array);
@@ -365,13 +370,15 @@ nfs_readdir_page_valid(struct page *page, unsigned int entry_index, u64 index_co
 	ret = true;
 	array = kmap_atomic(page);
 
-	if ((array->size == 0 || array->size == entry_index)
-		&& array->last_cookie == index_cookie)
-		goto out_unmap;
+	if (entry_index >= array->index - array->size) {
+		if ((array->size == 0 || array->size == entry_index)
+			&& array->last_cookie == index_cookie)
+			goto out_unmap;
 
-	if (array->size > entry_index &&
-		array->array[entry_index].cookie == index_cookie)
-		goto out_unmap;
+		if (array->size > entry_index &&
+			array->array[entry_index].cookie == index_cookie)
+			goto out_unmap;
+	}
 
 	ret = false;
 out_unmap:
@@ -391,10 +398,10 @@ static struct page *nfs_readdir_page_get_locked(struct address_space *mapping,
 		return page;
 
 	if (!PageUptodate(page))
-		nfs_readdir_page_init_array(page, pgc->index_cookie);
+		nfs_readdir_page_init_array(page, pgc);
 
 	if (!nfs_readdir_page_valid(page, pgc->entry_index, pgc->index_cookie))
-		nfs_readdir_recycle_page(page, pgc->index_cookie);
+		nfs_readdir_recycle_page(page, pgc);
 
 	return page;
 }
@@ -513,7 +520,7 @@ static bool nfs_readdir_array_cookie_in_range(struct nfs_cache_array *array,
 	/* Optimisation for monotonically increasing cookies */
 	if (cookie >= array->last_cookie)
 		return false;
-	if (array->size && cookie < array->array[0].cookie)
+	if (array->size && cookie < array->array[array->index - array->size].cookie)
 		return false;
 	return true;
 }
@@ -528,7 +535,7 @@ static int nfs_readdir_search_for_cookie(struct nfs_cache_array *array,
 	if (!nfs_readdir_array_cookie_in_range(array, desc->dir_cookie))
 		goto check_eof;
 
-	for (i = 0; i < array->size; i++) {
+	for (i = array->index - array->size; i < array->index; i++) {
 		if (array->array[i].cookie == desc->dir_cookie) {
 			struct nfs_inode *nfsi = NFS_I(file_inode(desc->file));
 
@@ -1072,7 +1079,7 @@ static void nfs_do_filldir(struct nfs_readdir_descriptor *desc)
 	unsigned int i = 0;
 
 	array = kmap(desc->page);
-	for (i = desc->pgc.entry_index; i < array->size; i++) {
+	for (i = desc->pgc.entry_index; i < array->index; i++) {
 		struct nfs_cache_array_entry *ent;
 
 		ent = &array->array[i];
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 08/10] NFS: Reset pagecache cursor on llseek
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
                   ` (6 preceding siblings ...)
  2021-01-20 16:59 ` [PATCH v1 07/10] NFS: Support headless readdir pagecache pages Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 09/10] NFS: Remove nfs_readdir_dont_search_cache() Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 10/10] NFS: Revalidate the directory pagecache on every nfs_readdir() Benjamin Coddington
  9 siblings, 0 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

Unless we reset the cursor, a series of lseek() getdents() calls will
continue filling the pagecache indefinitely.  Instead, reset the cursor so
that we always start filling at the beginning of the pagecache.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index d6101e45fd66..7ca79d4b25ec 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1284,6 +1284,8 @@ static loff_t nfs_llseek_dir(struct file *filp, loff_t offset, int whence)
 			dir_ctx->pgc.index_cookie = offset;
 		else
 			dir_ctx->pgc.index_cookie = 0;
+		dir_ctx->pgc.page_index = 0;
+		dir_ctx->pgc.entry_index = 0;
 		if (offset == 0)
 			memset(dir_ctx->verf, 0, sizeof(dir_ctx->verf));
 		dir_ctx->duped = 0;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 09/10] NFS: Remove nfs_readdir_dont_search_cache()
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
                   ` (7 preceding siblings ...)
  2021-01-20 16:59 ` [PATCH v1 08/10] NFS: Reset pagecache cursor on llseek Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  2021-01-20 16:59 ` [PATCH v1 10/10] NFS: Revalidate the directory pagecache on every nfs_readdir() Benjamin Coddington
  9 siblings, 0 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

Since we can now fill the pagecache at arbitrary offsets, we no longer need
this optimization.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 7ca79d4b25ec..fc9e72341220 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1036,28 +1036,11 @@ static int find_and_lock_cache_page(struct nfs_readdir_descriptor *desc)
 	return res;
 }
 
-static bool nfs_readdir_dont_search_cache(struct nfs_readdir_descriptor *desc)
-{
-	struct address_space *mapping = desc->file->f_mapping;
-	struct inode *dir = file_inode(desc->file);
-	unsigned int dtsize = NFS_SERVER(dir)->dtsize;
-	loff_t size = i_size_read(dir);
-
-	/*
-	 * Default to uncached readdir if the page cache is empty, and
-	 * we're looking for a non-zero cookie in a large directory.
-	 */
-	return desc->dir_cookie != 0 && mapping->nrpages == 0 && size > dtsize;
-}
-
 /* Search for desc->dir_cookie from the beginning of the page cache */
 static int readdir_search_pagecache(struct nfs_readdir_descriptor *desc)
 {
 	int res;
 
-	if (nfs_readdir_dont_search_cache(desc))
-		return -EBADCOOKIE;
-
 	do {
 		if (desc->pgc.page_index == 0) {
 			desc->current_index = 0;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 10/10] NFS: Revalidate the directory pagecache on every nfs_readdir()
  2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
                   ` (8 preceding siblings ...)
  2021-01-20 16:59 ` [PATCH v1 09/10] NFS: Remove nfs_readdir_dont_search_cache() Benjamin Coddington
@ 2021-01-20 16:59 ` Benjamin Coddington
  9 siblings, 0 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-20 16:59 UTC (permalink / raw)
  To: linux-nfs

Since there's no longer a significant performance penalty for dropping the
pagecache, and because we want to ensure that the directory's change
attribute is accurate before validating pagecache pages, revalidate the
directory's mapping on every call to nfs_readdir().

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index fc9e72341220..842412454e3d 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1167,11 +1167,9 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	 * to either find the entry with the appropriate number or
 	 * revalidate the cookie.
 	 */
-	if (ctx->pos == 0 || nfs_attribute_cache_expired(inode)) {
-		res = nfs_revalidate_mapping(inode, file->f_mapping);
-		if (res < 0)
-			goto out;
-	}
+	res = nfs_revalidate_mapping(inode, file->f_mapping);
+	if (res < 0)
+		goto out;
 
 	res = -ENOMEM;
 	desc = kzalloc(sizeof(*desc), GFP_KERNEL);
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 05/10] NFS: readdir per-page cache validation
  2021-01-20 16:59 ` [PATCH v1 05/10] NFS: readdir per-page cache validation Benjamin Coddington
@ 2021-01-20 20:08   ` kernel test robot
  0 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2021-01-20 20:08 UTC (permalink / raw)
  To: Benjamin Coddington, linux-nfs; +Cc: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2060 bytes --]

Hi Benjamin,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nfs/linux-next]
[also build test WARNING on v5.11-rc4 next-20210120]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Benjamin-Coddington/NFS-client-readdir-per-page-validation/20210121-011011
base:   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
config: nds32-defconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/27dada15cc13542a100bb36e20b49c6315c6170c
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Benjamin-Coddington/NFS-client-readdir-per-page-validation/20210121-011011
        git checkout 27dada15cc13542a100bb36e20b49c6315c6170c
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   fs/nfs/dir.c: In function 'nfs_readdir_clear_page':
>> fs/nfs/dir.c:199:16: warning: variable 'change_attr' set but not used [-Wunused-but-set-variable]
     199 |  unsigned long change_attr;
         |                ^~~~~~~~~~~


vim +/change_attr +199 fs/nfs/dir.c

   195	
   196	static int
   197	nfs_readdir_clear_page(struct page *page, gfp_t gfp_mask)
   198	{
 > 199		unsigned long change_attr;
   200		change_attr = (unsigned long)detach_page_private(page);
   201		return 1;
   202	}
   203	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 10815 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 07/10] NFS: Support headless readdir pagecache pages
  2021-01-20 16:59 ` [PATCH v1 07/10] NFS: Support headless readdir pagecache pages Benjamin Coddington
@ 2021-01-21 17:24   ` Anna Schumaker
  2021-01-21 17:57     ` Benjamin Coddington
  0 siblings, 1 reply; 17+ messages in thread
From: Anna Schumaker @ 2021-01-21 17:24 UTC (permalink / raw)
  To: Benjamin Coddington; +Cc: Linux NFS Mailing List

Hi Ben,

On Wed, Jan 20, 2021 at 12:04 PM Benjamin Coddington
<bcodding@redhat.com> wrote:
>
> It is now possible that a reader will resume a directory listing after an
> invalidation and fill the rest of the pages with the offset left over from
> the last partially-filled page.  These pages will then be recycled and
> refilled by the next reader since their alignment is incorrect.
>
> Add an index to the nfs_cache_array that will indicate where the next entry
> should be filled.  This allows partially-filled pages to have the best
> alignment possible.  They are more likely to be useful to readers that
> follow.
>
> This optimization targets the case when there are multiple processes
> listing the directory simultaneously.  Often the processes will collect and
> block on the same page waiting for a READDIR call to fill the pagecache.
> If the pagecache is invalidated, a partially-filled page will usually
> result.  This partially-filled page can immediately be used by all
> processes to emit entries rather than having to discard and refill it for
> every process.
>
> The addition of another integer to struct nfs_cache_array increases its
> size to 24 bytes. We do not lose the original capacity of 127 entries per
> page.

This patch causes cthon basic test #6 to start failing with unexpected
dir entries across all NFS versions:

  ./test6: readdir
  basic tests failed
      ./test6: (/mnt/test/anna/Connectathon/cthon04) unexpected dir entry 'h'

Luckily, the next patch seems to resolve this issue. Could they maybe
be squashed together?

Thanks,
Anna


Anna

>
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
>  fs/nfs/dir.c | 45 ++++++++++++++++++++++++++-------------------
>  1 file changed, 26 insertions(+), 19 deletions(-)
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 4e21b21c75d0..d6101e45fd66 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -143,6 +143,7 @@ struct nfs_cache_array_entry {
>
>  struct nfs_cache_array {
>         u64 last_cookie;
> +       unsigned int index;
>         unsigned int size;
>         unsigned char page_full : 1,
>                       page_is_eof : 1,
> @@ -179,13 +180,15 @@ static void nfs_readdir_array_init(struct nfs_cache_array *array)
>         memset(array, 0, sizeof(struct nfs_cache_array));
>  }
>
> -static void nfs_readdir_page_init_array(struct page *page, u64 last_cookie)
> +static void
> +nfs_readdir_page_init_array(struct page *page, struct nfs_dir_page_cursor *pgc)
>  {
>         struct nfs_cache_array *array;
>
>         array = kmap_atomic(page);
>         nfs_readdir_array_init(array);
> -       array->last_cookie = last_cookie;
> +       array->last_cookie = pgc->index_cookie;
> +       array->index = pgc->entry_index;
>         array->cookies_are_ordered = 1;
>         kunmap_atomic(array);
>         if (page->mapping)
> @@ -224,7 +227,7 @@ void nfs_readdir_clear_array(struct page *page)
>         int i;
>
>         array = kmap_atomic(page);
> -       for (i = 0; i < array->size; i++)
> +       for (i = array->index - array->size; i < array->size; i++)
>                 kfree(array->array[i].name);
>         nfs_readdir_array_init(array);
>         kunmap_atomic(array);
> @@ -232,19 +235,20 @@ void nfs_readdir_clear_array(struct page *page)
>  }
>
>  static void
> -nfs_readdir_recycle_page(struct page *page, u64 last_cookie)
> +nfs_readdir_recycle_page(struct page *page, struct nfs_dir_page_cursor *pgc)
>  {
>         nfs_readdir_clear_array(page);
>         nfs_readdir_invalidatepage(page, 0, 0);
> -       nfs_readdir_page_init_array(page, last_cookie);
> +       nfs_readdir_page_init_array(page, pgc);
>  }
>
>  static struct page *
>  nfs_readdir_page_array_alloc(u64 last_cookie, gfp_t gfp_flags)
>  {
>         struct page *page = alloc_page(gfp_flags);
> +       struct nfs_dir_page_cursor pgc = { .index_cookie = last_cookie };
>         if (page)
> -               nfs_readdir_page_init_array(page, last_cookie);
> +               nfs_readdir_page_init_array(page, &pgc);
>         return page;
>  }
>
> @@ -309,7 +313,7 @@ static int nfs_readdir_array_can_expand(struct nfs_cache_array *array)
>
>         if (array->page_full)
>                 return -ENOSPC;
> -       cache_entry = &array->array[array->size + 1];
> +       cache_entry = &array->array[array->index + 1];
>         if ((char *)cache_entry - (char *)array > PAGE_SIZE) {
>                 array->page_full = 1;
>                 return -ENOSPC;
> @@ -336,7 +340,7 @@ int nfs_readdir_add_to_array(struct nfs_entry *entry, struct page *page)
>                 goto out;
>         }
>
> -       cache_entry = &array->array[array->size];
> +       cache_entry = &array->array[array->index];
>         cache_entry->cookie = entry->prev_cookie;
>         cache_entry->ino = entry->ino;
>         cache_entry->d_type = entry->d_type;
> @@ -345,6 +349,7 @@ int nfs_readdir_add_to_array(struct nfs_entry *entry, struct page *page)
>         array->last_cookie = entry->cookie;
>         if (array->last_cookie <= cache_entry->cookie)
>                 array->cookies_are_ordered = 0;
> +       array->index++;
>         array->size++;
>         if (entry->eof != 0)
>                 nfs_readdir_array_set_eof(array);
> @@ -365,13 +370,15 @@ nfs_readdir_page_valid(struct page *page, unsigned int entry_index, u64 index_co
>         ret = true;
>         array = kmap_atomic(page);
>
> -       if ((array->size == 0 || array->size == entry_index)
> -               && array->last_cookie == index_cookie)
> -               goto out_unmap;
> +       if (entry_index >= array->index - array->size) {
> +               if ((array->size == 0 || array->size == entry_index)
> +                       && array->last_cookie == index_cookie)
> +                       goto out_unmap;
>
> -       if (array->size > entry_index &&
> -               array->array[entry_index].cookie == index_cookie)
> -               goto out_unmap;
> +               if (array->size > entry_index &&
> +                       array->array[entry_index].cookie == index_cookie)
> +                       goto out_unmap;
> +       }
>
>         ret = false;
>  out_unmap:
> @@ -391,10 +398,10 @@ static struct page *nfs_readdir_page_get_locked(struct address_space *mapping,
>                 return page;
>
>         if (!PageUptodate(page))
> -               nfs_readdir_page_init_array(page, pgc->index_cookie);
> +               nfs_readdir_page_init_array(page, pgc);
>
>         if (!nfs_readdir_page_valid(page, pgc->entry_index, pgc->index_cookie))
> -               nfs_readdir_recycle_page(page, pgc->index_cookie);
> +               nfs_readdir_recycle_page(page, pgc);
>
>         return page;
>  }
> @@ -513,7 +520,7 @@ static bool nfs_readdir_array_cookie_in_range(struct nfs_cache_array *array,
>         /* Optimisation for monotonically increasing cookies */
>         if (cookie >= array->last_cookie)
>                 return false;
> -       if (array->size && cookie < array->array[0].cookie)
> +       if (array->size && cookie < array->array[array->index - array->size].cookie)
>                 return false;
>         return true;
>  }
> @@ -528,7 +535,7 @@ static int nfs_readdir_search_for_cookie(struct nfs_cache_array *array,
>         if (!nfs_readdir_array_cookie_in_range(array, desc->dir_cookie))
>                 goto check_eof;
>
> -       for (i = 0; i < array->size; i++) {
> +       for (i = array->index - array->size; i < array->index; i++) {
>                 if (array->array[i].cookie == desc->dir_cookie) {
>                         struct nfs_inode *nfsi = NFS_I(file_inode(desc->file));
>
> @@ -1072,7 +1079,7 @@ static void nfs_do_filldir(struct nfs_readdir_descriptor *desc)
>         unsigned int i = 0;
>
>         array = kmap(desc->page);
> -       for (i = desc->pgc.entry_index; i < array->size; i++) {
> +       for (i = desc->pgc.entry_index; i < array->index; i++) {
>                 struct nfs_cache_array_entry *ent;
>
>                 ent = &array->array[i];
> --
> 2.25.4
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 07/10] NFS: Support headless readdir pagecache pages
  2021-01-21 17:24   ` Anna Schumaker
@ 2021-01-21 17:57     ` Benjamin Coddington
  0 siblings, 0 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-21 17:57 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: Linux NFS Mailing List

On 21 Jan 2021, at 12:24, Anna Schumaker wrote:

> Hi Ben,
>
> On Wed, Jan 20, 2021 at 12:04 PM Benjamin Coddington
> <bcodding@redhat.com> wrote:
>>
>> It is now possible that a reader will resume a directory listing 
>> after an
>> invalidation and fill the rest of the pages with the offset left over 
>> from
>> the last partially-filled page.  These pages will then be recycled 
>> and
>> refilled by the next reader since their alignment is incorrect.
>>
>> Add an index to the nfs_cache_array that will indicate where the next 
>> entry
>> should be filled.  This allows partially-filled pages to have the 
>> best
>> alignment possible.  They are more likely to be useful to readers 
>> that
>> follow.
>>
>> This optimization targets the case when there are multiple processes
>> listing the directory simultaneously.  Often the processes will 
>> collect and
>> block on the same page waiting for a READDIR call to fill the 
>> pagecache.
>> If the pagecache is invalidated, a partially-filled page will usually
>> result.  This partially-filled page can immediately be used by all
>> processes to emit entries rather than having to discard and refill it 
>> for
>> every process.
>>
>> The addition of another integer to struct nfs_cache_array increases 
>> its
>> size to 24 bytes. We do not lose the original capacity of 127 entries 
>> per
>> page.
>
> This patch causes cthon basic test #6 to start failing with unexpected
> dir entries across all NFS versions:
>
>   ./test6: readdir
>   basic tests failed
>       ./test6: (/mnt/test/anna/Connectathon/cthon04) unexpected dir 
> entry 'h'

Ah, yes -- because we /must/ revalidate before using any cached pages.  
I
think this test unlinks files and that would be a breakage.

> Luckily, the next patch seems to resolve this issue. Could they maybe
> be squashed together?

Yes, they should be squashed - I'll do that in another pass.  I'd like 
to
get rid of uncached_readdir() altogether and then send another version.
Thanks for the test!

Ben


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 04/10] NFS: Keep the readdir pagecache cursor updated
  2021-01-20 16:59 ` [PATCH v1 04/10] NFS: Keep the readdir pagecache cursor updated Benjamin Coddington
@ 2021-01-21 20:00   ` Trond Myklebust
  2021-01-21 20:11     ` Trond Myklebust
  0 siblings, 1 reply; 17+ messages in thread
From: Trond Myklebust @ 2021-01-21 20:00 UTC (permalink / raw)
  To: linux-nfs, bcodding

On Wed, 2021-01-20 at 11:59 -0500, Benjamin Coddington wrote:
> Whenever we successfully locate our dir_cookie within the pagecache,
> or
> finish emitting entries to userspace, update the pagecache cursor. 
> These
> updates provide marker points to validate pagecache pages in a future
> patch.

How isn't this going to end up subject to the exact same problem that
Dave Wysochanski's patchset had?

> 
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
>  fs/nfs/dir.c | 29 +++++++++++++++++++++++++----
>  1 file changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 6626aff9f54d..7f6c84c8a412 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -150,6 +150,10 @@ struct nfs_cache_array {
>         struct nfs_cache_array_entry array[];
>  };
>  
> +static const int cache_entries_per_page =
> +       (PAGE_SIZE - sizeof(struct nfs_cache_array)) /
> +       sizeof(struct nfs_cache_array_entry);
> +
>  struct nfs_readdir_descriptor {
>         struct file     *file;
>         struct page     *page;
> @@ -251,6 +255,21 @@ static bool nfs_readdir_array_is_full(struct
> nfs_cache_array *array)
>         return array->page_full;
>  }
>  
> +static void nfs_readdir_set_cursor(struct nfs_readdir_descriptor
> *desc, int index)
> +{
> +       desc->pgc.entry_index = index;
> +       desc->pgc.index_cookie = desc->dir_cookie;
> +}
> +
> +static void nfs_readdir_cursor_next(struct nfs_dir_page_cursor *pgc,
> u64 cookie)
> +{
> +       pgc->index_cookie = cookie;
> +       if (++pgc->entry_index == cache_entries_per_page) {
> +               pgc->entry_index = 0;
> +               pgc->page_index++;
> +       }
> +}
> +
>  /*
>   * the caller is responsible for freeing qstr.name
>   * when called by nfs_readdir_add_to_array, the strings will be
> freed in
> @@ -424,7 +443,7 @@ static int nfs_readdir_search_for_pos(struct
> nfs_cache_array *array,
>  
>         index = (unsigned int)diff;
>         desc->dir_cookie = array->array[index].cookie;
> -       desc->pgc.entry_index = index;
> +       nfs_readdir_set_cursor(desc, index);
>         return 0;
>  out_eof:
>         desc->eof = true;
> @@ -492,7 +511,7 @@ static int nfs_readdir_search_for_cookie(struct
> nfs_cache_array *array,
>                         else
>                                 desc->ctx->pos = new_pos;
>                         desc->prev_index = new_pos;
> -                       desc->pgc.entry_index = i;
> +                       nfs_readdir_set_cursor(desc, i);
>                         return 0;
>                 }
>         }
> @@ -519,9 +538,9 @@ static int nfs_readdir_search_array(struct
> nfs_readdir_descriptor *desc)
>                 status = nfs_readdir_search_for_cookie(array, desc);
>  
>         if (status == -EAGAIN) {
> -               desc->pgc.index_cookie = array->last_cookie;
> +               desc->pgc.entry_index = array->size - 1;
> +               nfs_readdir_cursor_next(&desc->pgc, array-
> >last_cookie);
>                 desc->current_index += array->size;
> -               desc->pgc.page_index++;
>         }
>         kunmap_atomic(array);
>         return status;
> @@ -1035,6 +1054,8 @@ static void nfs_do_filldir(struct
> nfs_readdir_descriptor *desc)
>                 desc->eof = true;
>  
>         kunmap(desc->page);
> +       desc->pgc.entry_index = i-1;
> +       nfs_readdir_cursor_next(&desc->pgc, desc->dir_cookie);
>         dfprintk(DIRCACHE, "NFS: nfs_do_filldir() filling ended @
> cookie %llu\n",
>                         (unsigned long long)desc->dir_cookie);
>  }

-- 
Trond Myklebust
CTO, Hammerspace Inc
4984 El Camino Real, Suite 208
Los Altos, CA 94022
​
www.hammer.space


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 04/10] NFS: Keep the readdir pagecache cursor updated
  2021-01-21 20:00   ` Trond Myklebust
@ 2021-01-21 20:11     ` Trond Myklebust
  2021-01-22  0:21       ` Benjamin Coddington
  0 siblings, 1 reply; 17+ messages in thread
From: Trond Myklebust @ 2021-01-21 20:11 UTC (permalink / raw)
  To: linux-nfs, bcodding

On Thu, 2021-01-21 at 20:00 +0000, Trond Myklebust wrote:
> On Wed, 2021-01-20 at 11:59 -0500, Benjamin Coddington wrote:
> > Whenever we successfully locate our dir_cookie within the
> > pagecache,
> > or
> > finish emitting entries to userspace, update the pagecache cursor. 
> > These
> > updates provide marker points to validate pagecache pages in a
> > future
> > patch.
> 
> How isn't this going to end up subject to the exact same problem that
> Dave Wysochanski's patchset had?

IOW: how is this not also making invalid assumptions around page cache
layout stability across READDIR calls?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 04/10] NFS: Keep the readdir pagecache cursor updated
  2021-01-21 20:11     ` Trond Myklebust
@ 2021-01-22  0:21       ` Benjamin Coddington
  0 siblings, 0 replies; 17+ messages in thread
From: Benjamin Coddington @ 2021-01-22  0:21 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On 21 Jan 2021, at 15:11, Trond Myklebust wrote:

> On Thu, 2021-01-21 at 20:00 +0000, Trond Myklebust wrote:
>> On Wed, 2021-01-20 at 11:59 -0500, Benjamin Coddington wrote:
>>> Whenever we successfully locate our dir_cookie within the
>>> pagecache,
>>> or
>>> finish emitting entries to userspace, update the pagecache cursor. 
>>> These
>>> updates provide marker points to validate pagecache pages in a
>>> future
>>> patch.
>>
>> How isn't this going to end up subject to the exact same problem that
>> Dave Wysochanski's patchset had?
>
> IOW: how is this not also making invalid assumptions around page cache
> layout stability across READDIR calls?

IIRC, Dave's approach was to store the index along with dir_cookie in
order to skip having to re-fill the cache to resume a listing.  That
approach assumed that the index still referred to page data aligned
with the current reader's dir_cookie.  But in between calls to
nfs_readdir() it would be possible for another reader to fill the cache 
with
a different alignment due to directory changes.  In which case the index 
is not
going to point to the next entry, we'll either skip entries or repeat 
them.

With this approach, we don't assume this is the case.  For every page 
with
data, the alignment of the entries is verified to be in the same 
position
as that reader's last pass through.  If not, the page data is discarded 
and
refreshed with a new READDIR op - just like an uncached_readdir path, 
but we
still fill the pagecache.  Every READDIR also updates the change_attr, 
so
even if there are cached pages beyond our current location that match 
our
alignment, we detect the case where those pages are actually invalid due 
to
changes on the server, and we re-fill them.

Another way to think about this is that instead of trying to cache the 
complete
representation of the directory aligned to the first entry in the 
pagecache,
we're instead just caching the results of any READDIR at convenient 
offsets
in the pagecache for other readers that might follow.  The READDIR 
results
are only usable if they match the current version of the directory and 
their
alignment is correct.

If you were to lseek to nearly the end of a directory, the first call to
nfs_readdir() will end up with results of a READDIR hitting the cache 
and
filling page->index 0.  That's ok because any other reader coming along 
with
a different alignment will discard that data and refresh it. We are 
going to
create a performance penalty for two readers that want to regularly fill
entries at different alignments, but I think that case is probably 
fairly
rare.  I am making a guess, but I think the most common usage of readdir 
is
by readers that want to traverse the entire directory in order.

So for that case all the readers benefit from the work of other 
processes,
the cache can be filled in parallel, and readers at the beginning don't
prevent readers at the end from filling in entries.  We no longer have 
to
worry whether we have enough memory to list a directory, or play games 
with
timeouts.

Ben


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-01-22  0:23 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-20 16:59 [PATCH v1 00/10] NFS client readdir per-page validation Benjamin Coddington
2021-01-20 16:59 ` [PATCH v1 01/10] NFS: save the directory's change attribute on pagecache pages Benjamin Coddington
2021-01-20 16:59 ` [PATCH v1 02/10] NFSv4: Send GETATTR with READDIR Benjamin Coddington
2021-01-20 16:59 ` [PATCH v1 03/10] NFS: Add a struct to track readdir pagecache location Benjamin Coddington
2021-01-20 16:59 ` [PATCH v1 04/10] NFS: Keep the readdir pagecache cursor updated Benjamin Coddington
2021-01-21 20:00   ` Trond Myklebust
2021-01-21 20:11     ` Trond Myklebust
2021-01-22  0:21       ` Benjamin Coddington
2021-01-20 16:59 ` [PATCH v1 05/10] NFS: readdir per-page cache validation Benjamin Coddington
2021-01-20 20:08   ` kernel test robot
2021-01-20 16:59 ` [PATCH v1 06/10] NFS: stash the readdir pagecache cursor on the open directory context Benjamin Coddington
2021-01-20 16:59 ` [PATCH v1 07/10] NFS: Support headless readdir pagecache pages Benjamin Coddington
2021-01-21 17:24   ` Anna Schumaker
2021-01-21 17:57     ` Benjamin Coddington
2021-01-20 16:59 ` [PATCH v1 08/10] NFS: Reset pagecache cursor on llseek Benjamin Coddington
2021-01-20 16:59 ` [PATCH v1 09/10] NFS: Remove nfs_readdir_dont_search_cache() Benjamin Coddington
2021-01-20 16:59 ` [PATCH v1 10/10] NFS: Revalidate the directory pagecache on every nfs_readdir() Benjamin Coddington

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).