Re: [PATCH 0/2] NFS: limit use of ACCESS cache for negative responses

From: Trond Myklebust <trondmy@hammerspace.com>
To: "bcodding@redhat.com" <bcodding@redhat.com>
Cc: "anna@kernel.org" <anna@kernel.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"neilb@suse.de" <neilb@suse.de>
Subject: Re: [PATCH 0/2] NFS: limit use of ACCESS cache for negative responses
Date: Sat, 27 Aug 2022 00:52:23 +0000	[thread overview]
Message-ID: <361256cf42393e2af9691b40bd51594ce078f968.camel@hammerspace.com> (raw)
In-Reply-To: <FA952BF7-1638-4BD1-8DA9-683078CFDE8F@redhat.com>

On Fri, 2022-08-26 at 14:27 -0400, Benjamin Coddington wrote:
> On 26 Aug 2022, at 12:56, Trond Myklebust wrote:
> > User group membership is not a per-mount thing. It's a global
> > thing.
> 
> The cached access entry is a per-inode thing.

That's irrelevant. We already have code that deal with per-inode
changes. Those changes happen when the owner changes, the group owner
changes, the mode changes or the ACL changes.

The problem that you're asking to be solved is when the lookup key for
every single access entry across the entire universe changes because
someone edited a group for a given person, and the server picked up on
that change at some time after the NFS client picked up on that change.

> 
> > As I said, what I'm proposing does allow you to set up a cron job
> > that
> > flushes your cache on a regular basis. There is absolutely no extra
> > value whatsoever provided by moving that into the kernel on a per-
> > mount
> > basis.
> 
> Sure there is - that's where we configure major NFS client behaviors.
> 
> I understand where you're coming from, but it seems so bizarre that a
> previous
> behavior that multiple organizations built and depend upon has been
> removed
> to optimize performance, and now we will need to tell them that to
> restore
> it we must write cron jobs on all the clients.  I don't think there's
> been a
> dependency on cron to get NFS to work a certain way yet.
> 
> A mount option is much easier to deploy in these organizations that
> have
> autofs deployed, and documenting it in NFS(5) seems the right place.
> 
> If that's just not acceptable, at least let's just make a tuneable
> that
> expires entries rather than a trigger to flush everything.  Please
> consider
> the configuration sprawl on the NFS client, and let me know how to
> proceed.
> 

Can we please try to solve the real problem first? The real problem is
not that user permissions change every hour on the server.

POSIX normally only expects changes to happen to your group membership
when you log in. The problem here is that the NFS server is updating
its rules concerning your group membership at some random time after
your log in on the NFS client.

So how about if we just do our best to approximate the POSIX rules, and
promise to revalidate your cached file access permissions at least once
after you log in? Then we can let the NFS server do whatever the hell
it wants to do after that.
IOW: If the sysadmin changes the group membership for the user, then
the user can remedy the problem by logging out and then logging back in
again, just like they do for local filesystems.


8<----------------------------------------
From 72d5b8b6bb053ff3574c4dcbf7ecf3102e43b5b8 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Fri, 26 Aug 2022 19:44:44 -0400
Subject: [PATCH] NFS: Clear the file access cache upon login

POSIX typically only refreshes the user's supplementary group
information upon login. Since NFS servers may often refresh their
concept of the user supplementary group membership at their own cadence,
it is possible for the NFS client's access cache to become stale due to
the user's group membership changing on the server after the user has
already logged in on the client.
While it is reasonable to expect that such group membership changes are
rare, and that we do not want to optimise the cache to accommodate them,
it is also not unreasonable for the user to expect that if they log out
and log back in again, that the staleness would clear up.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/dir.c           | 22 ++++++++++++++++++++++
 include/linux/nfs_fs.h |  1 +
 2 files changed, 23 insertions(+)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 5d6c2ddc7ea6..5e09cb79544e 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2949,9 +2949,28 @@ static struct nfs_access_entry *nfs_access_search_rbtree(struct inode *inode, co
 	return NULL;
 }
 
+static u64 nfs_access_login_time(const struct task_struct *task,
+				 const struct cred *cred)
+{
+	const struct task_struct *parent;
+	u64 ret;
+
+	rcu_read_lock();
+	for (;;) {
+		parent = rcu_dereference(task->real_parent);
+		if (parent == task || cred_fscmp(parent->cred, cred) != 0)
+			break;
+		task = parent;
+	}
+	ret = task->start_time;
+	rcu_read_unlock();
+	return ret;
+}
+
 static int nfs_access_get_cached_locked(struct inode *inode, const struct cred *cred, u32 *mask, bool may_block)
 {
 	struct nfs_inode *nfsi = NFS_I(inode);
+	u64 login_time = nfs_access_login_time(current, cred);
 	struct nfs_access_entry *cache;
 	bool retry = true;
 	int err;
@@ -2979,6 +2998,8 @@ static int nfs_access_get_cached_locked(struct inode *inode, const struct cred *
 		spin_lock(&inode->i_lock);
 		retry = false;
 	}
+	if ((s64)(cache->timestamp - login_time) < 0)
+		goto out_zap;
 	*mask = cache->mask;
 	list_move_tail(&cache->lru, &nfsi->access_cache_entry_lru);
 	err = 0;
@@ -3058,6 +3079,7 @@ static void nfs_access_add_rbtree(struct inode *inode,
 		else
 			goto found;
 	}
+	set->timestamp = ktime_get_ns();
 	rb_link_node(&set->rb_node, parent, p);
 	rb_insert_color(&set->rb_node, root_node);
 	list_add_tail(&set->lru, &nfsi->access_cache_entry_lru);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 7931fa472561..d92fdfd2444c 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -59,6 +59,7 @@ struct nfs_access_entry {
 	kuid_t			fsuid;
 	kgid_t			fsgid;
 	struct group_info	*group_info;
+	u64			timestamp;
 	__u32			mask;
 	struct rcu_head		rcu_head;
 };
-- 
2.37.2


-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com