Linux-NFS Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/8] Reduce the refcount pressure of NFS on struct cred
@ 2020-02-10 19:13 Trond Myklebust
  2020-02-10 19:13 ` [PATCH 1/8] NFS: alloc_nfs_open_context() must use the file cred when available Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2020-02-10 19:13 UTC (permalink / raw)
  To: linux-nfs

The NFS and RPC layer currently always takes a reference to the
credential passed by the user, even in the case where the RPC call
is synchronous, or the cred is also being pinned by file pointers
and open contexts.
In addition, the access cache will take a reference for each cache entry
on each file that we've looked up.

This patch set attempts to reduce the amount of references that the NFS
layer holds, by optimising away a few cases where we're taking the
refcount unnecessarily. It also sets a more stringent limit on the
number of access cache entries that the NFS layer holds.

Trond Myklebust (8):
  NFS: alloc_nfs_open_context() must use the file cred when available
  SUNRPC: Add a flag to avoid reference counts on credentials
  SUNRPC: Don't take a reference to the cred on synchronous tasks
  NFS: Assume cred is pinned by open context in I/O requests
  NFSv4: Avoid referencing the cred unnecessarily during NFSv4 I/O
  NFSv4: Avoid unnecessary credential references in layoutget
  NFS: Avoid referencing the cred twice in async rename/unlink
  NFS: Limit the size of the access cache by default

 fs/nfs/dir.c                 |  2 +-
 fs/nfs/inode.c               | 10 +++++-----
 fs/nfs/nfs4proc.c            | 12 ++++++------
 fs/nfs/pagelist.c            |  2 +-
 fs/nfs/pnfs.c                |  3 +--
 fs/nfs/unlink.c              |  4 ++--
 fs/nfs/write.c               |  2 +-
 include/linux/sunrpc/sched.h |  1 +
 net/sunrpc/clnt.c            |  8 ++++++--
 net/sunrpc/sched.c           |  3 ++-
 10 files changed, 26 insertions(+), 21 deletions(-)

-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/8] NFS: alloc_nfs_open_context() must use the file cred when available
  2020-02-10 19:13 [PATCH 0/8] Reduce the refcount pressure of NFS on struct cred Trond Myklebust
@ 2020-02-10 19:13 ` Trond Myklebust
  2020-02-10 19:13   ` [PATCH 2/8] SUNRPC: Add a flag to avoid reference counts on credentials Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2020-02-10 19:13 UTC (permalink / raw)
  To: linux-nfs

If we're creating a nfs_open_context() for a specific file pointer,
we must use the cred assigned to that file.

Fixes: a52458b48af1 ("NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/inode.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 11bf15800ac9..a10fb87c6ac3 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -959,16 +959,16 @@ struct nfs_open_context *alloc_nfs_open_context(struct dentry *dentry,
 						struct file *filp)
 {
 	struct nfs_open_context *ctx;
-	const struct cred *cred = get_current_cred();
 
 	ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);
-	if (!ctx) {
-		put_cred(cred);
+	if (!ctx)
 		return ERR_PTR(-ENOMEM);
-	}
 	nfs_sb_active(dentry->d_sb);
 	ctx->dentry = dget(dentry);
-	ctx->cred = cred;
+	if (filp)
+		ctx->cred = get_cred(filp->f_cred);
+	else
+		ctx->cred = get_current_cred();
 	ctx->ll_cred = NULL;
 	ctx->state = NULL;
 	ctx->mode = f_mode;
-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2/8] SUNRPC: Add a flag to avoid reference counts on credentials
  2020-02-10 19:13 ` [PATCH 1/8] NFS: alloc_nfs_open_context() must use the file cred when available Trond Myklebust
@ 2020-02-10 19:13   ` Trond Myklebust
  2020-02-10 19:13     ` [PATCH 3/8] SUNRPC: Don't take a reference to the cred on synchronous tasks Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2020-02-10 19:13 UTC (permalink / raw)
  To: linux-nfs

Add a flag to signal to the RPC layer that the credential is already
pinned for the duration of the RPC call.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 include/linux/sunrpc/sched.h | 1 +
 net/sunrpc/clnt.c            | 5 +++--
 net/sunrpc/sched.c           | 3 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h
index a6ef35184ef1..df696efdd675 100644
--- a/include/linux/sunrpc/sched.h
+++ b/include/linux/sunrpc/sched.h
@@ -132,6 +132,7 @@ struct rpc_task_setup {
 #define RPC_TASK_TIMEOUT	0x1000		/* fail with ETIMEDOUT on timeout */
 #define RPC_TASK_NOCONNECT	0x2000		/* return ENOTCONN if not connected */
 #define RPC_TASK_NO_RETRANS_TIMEOUT	0x4000		/* wait forever for a reply */
+#define RPC_TASK_CRED_NOREF	0x8000		/* No refcount on the credential */
 
 #define RPC_IS_ASYNC(t)		((t)->tk_flags & RPC_TASK_ASYNC)
 #define RPC_IS_SWAPPER(t)	((t)->tk_flags & RPC_TASK_SWAPPER)
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 7324b21f923e..2345e563c2f4 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1099,8 +1099,9 @@ rpc_task_set_rpc_message(struct rpc_task *task, const struct rpc_message *msg)
 		task->tk_msg.rpc_proc = msg->rpc_proc;
 		task->tk_msg.rpc_argp = msg->rpc_argp;
 		task->tk_msg.rpc_resp = msg->rpc_resp;
-		if (msg->rpc_cred != NULL)
-			task->tk_msg.rpc_cred = get_cred(msg->rpc_cred);
+		task->tk_msg.rpc_cred = msg->rpc_cred;
+		if (!(task->tk_flags & RPC_TASK_CRED_NOREF))
+			get_cred(task->tk_msg.rpc_cred);
 	}
 }
 
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 55e900255b0c..6eff14119a88 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -1162,7 +1162,8 @@ static void rpc_release_resources_task(struct rpc_task *task)
 {
 	xprt_release(task);
 	if (task->tk_msg.rpc_cred) {
-		put_cred(task->tk_msg.rpc_cred);
+		if (!(task->tk_flags & RPC_TASK_CRED_NOREF))
+			put_cred(task->tk_msg.rpc_cred);
 		task->tk_msg.rpc_cred = NULL;
 	}
 	rpc_task_release_client(task);
-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 3/8] SUNRPC: Don't take a reference to the cred on synchronous tasks
  2020-02-10 19:13   ` [PATCH 2/8] SUNRPC: Add a flag to avoid reference counts on credentials Trond Myklebust
@ 2020-02-10 19:13     ` Trond Myklebust
  2020-02-10 19:13       ` [PATCH 4/8] NFS: Assume cred is pinned by open context in I/O requests Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2020-02-10 19:13 UTC (permalink / raw)
  To: linux-nfs

If the RPC call is synchronous, assume the cred is already pinned
by the caller.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 net/sunrpc/clnt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 2345e563c2f4..252b044cdcdf 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1127,6 +1127,9 @@ struct rpc_task *rpc_run_task(const struct rpc_task_setup *task_setup_data)
 
 	task = rpc_new_task(task_setup_data);
 
+	if (!RPC_IS_ASYNC(task))
+		task->tk_flags |= RPC_TASK_CRED_NOREF;
+
 	rpc_task_set_client(task, task_setup_data->rpc_client);
 	rpc_task_set_rpc_message(task, task_setup_data->rpc_message);
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 4/8] NFS: Assume cred is pinned by open context in I/O requests
  2020-02-10 19:13     ` [PATCH 3/8] SUNRPC: Don't take a reference to the cred on synchronous tasks Trond Myklebust
@ 2020-02-10 19:13       ` Trond Myklebust
  2020-02-10 19:13         ` [PATCH 5/8] NFSv4: Avoid referencing the cred unnecessarily during NFSv4 I/O Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2020-02-10 19:13 UTC (permalink / raw)
  To: linux-nfs

In read/write/commit, we should be able to assume that the cred is
pinned by the open context.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/pagelist.c | 2 +-
 fs/nfs/write.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 20b3717cd7ca..c9c3edefc5be 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -627,7 +627,7 @@ int nfs_initiate_pgio(struct rpc_clnt *clnt, struct nfs_pgio_header *hdr,
 		.callback_ops = call_ops,
 		.callback_data = hdr,
 		.workqueue = nfsiod_workqueue,
-		.flags = RPC_TASK_ASYNC | flags,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF | flags,
 	};
 	int ret = 0;
 
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index c478b772cc49..5544ee6cfda8 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1707,7 +1707,7 @@ int nfs_initiate_commit(struct rpc_clnt *clnt, struct nfs_commit_data *data,
 		.callback_ops = call_ops,
 		.callback_data = data,
 		.workqueue = nfsiod_workqueue,
-		.flags = RPC_TASK_ASYNC | flags,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF | flags,
 		.priority = priority,
 	};
 	/* Set up the initial task struct.  */
-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 5/8] NFSv4: Avoid referencing the cred unnecessarily during NFSv4 I/O
  2020-02-10 19:13       ` [PATCH 4/8] NFS: Assume cred is pinned by open context in I/O requests Trond Myklebust
@ 2020-02-10 19:13         ` Trond Myklebust
  2020-02-10 19:13           ` [PATCH 6/8] NFSv4: Avoid unnecessary credential references in layoutget Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2020-02-10 19:13 UTC (permalink / raw)
  To: linux-nfs

Avoid unnecessary references to the cred when we have already referenced
it through the open context or the open owner.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs4proc.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 00fe674c8a49..47464fb419dc 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2346,7 +2346,7 @@ static int _nfs4_proc_open_confirm(struct nfs4_opendata *data)
 		.callback_ops = &nfs4_open_confirm_ops,
 		.callback_data = data,
 		.workqueue = nfsiod_workqueue,
-		.flags = RPC_TASK_ASYNC,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF,
 	};
 	int status;
 
@@ -2511,7 +2511,7 @@ static int nfs4_run_open_task(struct nfs4_opendata *data,
 		.callback_ops = &nfs4_open_ops,
 		.callback_data = data,
 		.workqueue = nfsiod_workqueue,
-		.flags = RPC_TASK_ASYNC,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF,
 	};
 	int status;
 
@@ -3651,7 +3651,7 @@ int nfs4_do_close(struct nfs4_state *state, gfp_t gfp_mask, int wait)
 		.rpc_message = &msg,
 		.callback_ops = &nfs4_close_ops,
 		.workqueue = nfsiod_workqueue,
-		.flags = RPC_TASK_ASYNC,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF,
 	};
 	int status = -ENOMEM;
 
@@ -6350,7 +6350,7 @@ static int _nfs4_proc_delegreturn(struct inode *inode, const struct cred *cred,
 		.rpc_client = server->client,
 		.rpc_message = &msg,
 		.callback_ops = &nfs4_delegreturn_ops,
-		.flags = RPC_TASK_ASYNC | RPC_TASK_TIMEOUT,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF | RPC_TASK_TIMEOUT,
 	};
 	int status = 0;
 
@@ -6933,7 +6933,7 @@ static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *f
 		.rpc_message = &msg,
 		.callback_ops = &nfs4_lock_ops,
 		.workqueue = nfsiod_workqueue,
-		.flags = RPC_TASK_ASYNC,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF,
 	};
 	int ret;
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 6/8] NFSv4: Avoid unnecessary credential references in layoutget
  2020-02-10 19:13         ` [PATCH 5/8] NFSv4: Avoid referencing the cred unnecessarily during NFSv4 I/O Trond Myklebust
@ 2020-02-10 19:13           ` Trond Myklebust
  2020-02-10 19:13             ` [PATCH 7/8] NFS: Avoid referencing the cred twice in async rename/unlink Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2020-02-10 19:13 UTC (permalink / raw)
  To: linux-nfs

Layoutget is just using the credential attached to the open context.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs4proc.c | 2 +-
 fs/nfs/pnfs.c     | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 47464fb419dc..7f5802b6d404 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -9177,7 +9177,7 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, long *timeout)
 		.rpc_message = &msg,
 		.callback_ops = &nfs4_layoutget_call_ops,
 		.callback_data = lgp,
-		.flags = RPC_TASK_ASYNC,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF,
 	};
 	struct pnfs_layout_segment *lseg = NULL;
 	struct nfs4_exception exception = {
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index b21eb4882846..cb99ac954688 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1073,7 +1073,7 @@ pnfs_alloc_init_layoutget_args(struct inode *ino,
 	lgp->args.ctx = get_nfs_open_context(ctx);
 	nfs4_stateid_copy(&lgp->args.stateid, stateid);
 	lgp->gfp_flags = gfp_flags;
-	lgp->cred = get_cred(ctx->cred);
+	lgp->cred = ctx->cred;
 	return lgp;
 }
 
@@ -1084,7 +1084,6 @@ void pnfs_layoutget_free(struct nfs4_layoutget *lgp)
 	nfs4_free_pages(lgp->args.layout.pages, max_pages);
 	if (lgp->args.inode)
 		pnfs_put_layout_hdr(NFS_I(lgp->args.inode)->layout);
-	put_cred(lgp->cred);
 	put_nfs_open_context(lgp->args.ctx);
 	kfree(lgp);
 }
-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 7/8] NFS: Avoid referencing the cred twice in async rename/unlink
  2020-02-10 19:13           ` [PATCH 6/8] NFSv4: Avoid unnecessary credential references in layoutget Trond Myklebust
@ 2020-02-10 19:13             ` Trond Myklebust
  2020-02-10 19:13               ` [PATCH 8/8] NFS: Limit the size of the access cache by default Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2020-02-10 19:13 UTC (permalink / raw)
  To: linux-nfs

In both async rename and rename, we take a reference to the
cred in the call arguments.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/unlink.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index 0effeee28352..b27ebdccef70 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -98,7 +98,7 @@ static void nfs_do_call_unlink(struct inode *inode, struct nfs_unlinkdata *data)
 		.callback_ops = &nfs_unlink_ops,
 		.callback_data = data,
 		.workqueue = nfsiod_workqueue,
-		.flags = RPC_TASK_ASYNC,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF,
 	};
 	struct rpc_task *task;
 	struct inode *dir = d_inode(data->dentry->d_parent);
@@ -341,7 +341,7 @@ nfs_async_rename(struct inode *old_dir, struct inode *new_dir,
 		.callback_ops = &nfs_rename_ops,
 		.workqueue = nfsiod_workqueue,
 		.rpc_client = NFS_CLIENT(old_dir),
-		.flags = RPC_TASK_ASYNC,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF,
 	};
 
 	data = kzalloc(sizeof(*data), GFP_KERNEL);
-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 8/8] NFS: Limit the size of the access cache by default
  2020-02-10 19:13             ` [PATCH 7/8] NFS: Avoid referencing the cred twice in async rename/unlink Trond Myklebust
@ 2020-02-10 19:13               ` Trond Myklebust
  0 siblings, 0 replies; 9+ messages in thread
From: Trond Myklebust @ 2020-02-10 19:13 UTC (permalink / raw)
  To: linux-nfs

Currently, we have no real limit on the access cache size (we set it
to ULONG_MAX). That can lead to credentials getting pinned for a
very long time on lots of files if you have a system with a lot of
memory.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 09bcbdc67135..133bf23430e8 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2307,7 +2307,7 @@ static DEFINE_SPINLOCK(nfs_access_lru_lock);
 static LIST_HEAD(nfs_access_lru_list);
 static atomic_long_t nfs_access_nr_entries;
 
-static unsigned long nfs_access_max_cachesize = ULONG_MAX;
+static unsigned long nfs_access_max_cachesize = 4*1024*1024;
 module_param(nfs_access_max_cachesize, ulong, 0644);
 MODULE_PARM_DESC(nfs_access_max_cachesize, "NFS access maximum total cache length");
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, back to index

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-10 19:13 [PATCH 0/8] Reduce the refcount pressure of NFS on struct cred Trond Myklebust
2020-02-10 19:13 ` [PATCH 1/8] NFS: alloc_nfs_open_context() must use the file cred when available Trond Myklebust
2020-02-10 19:13   ` [PATCH 2/8] SUNRPC: Add a flag to avoid reference counts on credentials Trond Myklebust
2020-02-10 19:13     ` [PATCH 3/8] SUNRPC: Don't take a reference to the cred on synchronous tasks Trond Myklebust
2020-02-10 19:13       ` [PATCH 4/8] NFS: Assume cred is pinned by open context in I/O requests Trond Myklebust
2020-02-10 19:13         ` [PATCH 5/8] NFSv4: Avoid referencing the cred unnecessarily during NFSv4 I/O Trond Myklebust
2020-02-10 19:13           ` [PATCH 6/8] NFSv4: Avoid unnecessary credential references in layoutget Trond Myklebust
2020-02-10 19:13             ` [PATCH 7/8] NFS: Avoid referencing the cred twice in async rename/unlink Trond Myklebust
2020-02-10 19:13               ` [PATCH 8/8] NFS: Limit the size of the access cache by default Trond Myklebust

Linux-NFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nfs/0 linux-nfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nfs linux-nfs/ https://lore.kernel.org/linux-nfs \
		linux-nfs@vger.kernel.org
	public-inbox-index linux-nfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-nfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git