linux-unionfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations
@ 2024-03-15 16:52 Jeff Layton
  2024-03-15 16:52 ` [PATCH RFC 01/24] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
                   ` (23 more replies)
  0 siblings, 24 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:52 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

NFSv4.1 adds a new GET_DIR_DELEGATION operation, to allow clients
to request a delegation on a directory. If the client holds a directory
delegation, then it knows that nothing will change the dentries in it
until it has been recalled.

Last year, Rick Macklem gave a talk at the NFS Bakeathon on his
implementation of directory delegations for FreeBSD [1], and showed that
it can greatly improve LOOKUP-heavy workloads. There is also some
earlier work by CITI [2] that showed similar results. The SMB protocol
also has a similar sort of construct, and they have also seen great
performance improvements from it.

This patchset adds minimal support for directory delegations to the VFS
layer, and and plumbs support for GET_DIR_DELEGATION handling into nfsd.
It also has a set of patches for the NFS client. The VFS and server-side
bits all seem to work now and do the right thing. The client side is
still very much a WIP since it relies more on policy and heuristics.

What I've found is that without directory delegations in play, the
client usually revalidates dentries by issuing a GETATTR for the parent.
If it holds a delegation on the parent, it can avoid that GETATTR, which
is a nice benefit.

But...the catch is that the Linux NFS client defaults to caching
dentries for 30-60s before revalidating them. The attrcache timeout
quickly ends up as 60s on directories that aren't changing, so for a
delegation to be useful, you need to be revalidating dentries in the
same directory over a rather long span of time. Even then, at best it'll
be optimizing away one GETATTR per inode every 60s or so.

I've done a little ad-hoc testing with kernel builds, but they don't
show much benefit from this. Testing with them both enabled and
disabled, the kernel builds finished within 5 seconds of one another on
a 12.5 minute build, which is insignificant.

The GETATTR load seems to be reduced by about 7% with dir delegs
enabled, which is nice, but GETATTRs are usually pretty lightweight
anyway, and we do have the overhead of dir delegations to deal with.

There is a lot of room for improvement that could make this more
compelling:

1) The client's logic for fetching a delegation is probably too
   primitive. Maybe we need to request one only after every 3-5
   revalidations?

2) The client uses the same "REFERENCED" timeout for dir delegations
   as it does for clients. It could probably benefit from holding them
   longer by default. I'm not sure how best to do that.

3) The protocol allows for clients and server to use asynchronous
   notifications to avoid issuing delegation breaks when there are
   changes to a delegated directory. That's not implemented yet,
   but it could change the calculus here for workloads where multiple
   clients are accessing and changing child dentries.

4) A GET_DIR_DELEGATION+READDIR compound could be worthwhile. If we have
   all of the dentries in the directory, then we could (in principle)
   satisfy any negtive dentry lookup w/o going to the server. If they're
   all in the correct order on d_subdirs, we could satisfy a readdir
   via dcache_readdir without going to the server.

5) The server could probably avoid handing them out if the inode changed
   <60s in the past?

For the VFS and NFSD bits, I mainly want to get some early feedback.
Does this seem like a reasonable approach? Anyone see major problems
with handling directory delegations in the VFS codepaths?

The NFS client bits are quite a bit more rough.  Is there a better
heuristic for when to request a dir delegation? I could use a bit of
guidance here.

[1]: https://www.youtube.com/watch?v=DdFyH3BN5pI
[2]: https://linux-nfs.org/wiki/index.php/CITI_Experience_with_Directory_Delegations

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Jeff Layton (24):
      filelock: push the S_ISREG check down to ->setlease handlers
      filelock: add a lm_set_conflict lease_manager callback
      vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
      vfs: allow mkdir to wait for delegation break on parent
      vfs: allow rmdir to wait for delegation break on parent
      vfs: break parent dir delegations in open(..., O_CREAT) codepath
      vfs: make vfs_create break delegations on parent directory
      vfs: make vfs_mknod break delegations on parent directory
      filelock: lift the ban on directory leases in generic_setlease
      nfsd: allow filecache to hold S_IFDIR files
      nfsd: allow DELEGRETURN on directories
      nfsd: encoders and decoders for GET_DIR_DELEGATION
      nfsd: check for delegation conflicts vs. the same client
      nfsd: wire up GET_DIR_DELEGATION handling
      nfs: fix nfs_stateid_hash prototype when CONFIG_CRC32 isn't set
      nfs: remove unused NFS_CALL macro
      nfs: add cache_validity to the nfs_inode_event tracepoints
      nfs: add a tracepoint to nfs_inode_detach_delegation_locked
      nfs: new tracepoint in nfs_delegation_need_return
      nfs: new tracepoint in match_stateid operation
      nfs: add a GDD_GETATTR rpc operation
      nfs: skip dentry revalidation when parent dir has a delegation
      nfs: optionally request a delegation on GETATTR
      nfs: add a module parameter to disable directory delegations

 drivers/base/devtmpfs.c   |   6 +-
 fs/cachefiles/namei.c     |   2 +-
 fs/ecryptfs/inode.c       |   8 +--
 fs/init.c                 |   4 +-
 fs/locks.c                |  12 +++-
 fs/namei.c                | 101 ++++++++++++++++++++++++++++------
 fs/nfs/delegation.c       |   5 ++
 fs/nfs/dir.c              |  20 +++++++
 fs/nfs/internal.h         |   2 +-
 fs/nfs/nfs4file.c         |   2 +
 fs/nfs/nfs4proc.c         |  55 ++++++++++++++++++-
 fs/nfs/nfs4trace.h        | 104 +++++++++++++++++++++++++++++++++++
 fs/nfs/nfs4xdr.c          | 136 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfs/nfstrace.h         |   8 ++-
 fs/nfsd/filecache.c       |  37 +++++++++++--
 fs/nfsd/filecache.h       |   2 +
 fs/nfsd/nfs3proc.c        |   2 +-
 fs/nfsd/nfs4proc.c        |  49 +++++++++++++++++
 fs/nfsd/nfs4recover.c     |   6 +-
 fs/nfsd/nfs4state.c       | 112 +++++++++++++++++++++++++++++++++++++-
 fs/nfsd/nfs4xdr.c         |  72 +++++++++++++++++++++++-
 fs/nfsd/state.h           |   5 ++
 fs/nfsd/vfs.c             |  13 +++--
 fs/nfsd/vfs.h             |   2 +-
 fs/nfsd/xdr4.h            |   8 +++
 fs/open.c                 |   2 +-
 fs/overlayfs/overlayfs.h  |   8 +--
 fs/smb/client/cifsfs.c    |   3 +
 fs/smb/server/vfs.c       |   8 +--
 include/linux/filelock.h  |  10 ++++
 include/linux/fs.h        |  11 ++--
 include/linux/nfs4.h      |   6 ++
 include/linux/nfs_fs.h    |   1 +
 include/linux/nfs_fs_sb.h |   1 +
 include/linux/nfs_xdr.h   |   9 +--
 net/unix/af_unix.c        |   2 +-
 36 files changed, 754 insertions(+), 80 deletions(-)
---
base-commit: b80a250a20975e30ff8635dd75d6d20bf05405a1
change-id: 20240215-dir-deleg-e212210ba9d4

Best regards,
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH RFC 01/24] filelock: push the S_ISREG check down to ->setlease handlers
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
@ 2024-03-15 16:52 ` Jeff Layton
  2024-03-15 16:52 ` [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback Jeff Layton
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:52 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

When nfsd starts requesting directory delegations, setlease handlers may
see requests for leases on directories. Push the !S_ISREG check down
into the non-trivial setlease handlers, so we can selectively enable
them where they're supported.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/locks.c             | 5 +++--
 fs/nfs/nfs4file.c      | 2 ++
 fs/smb/client/cifsfs.c | 3 +++
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 90c8746874de..cb4b35d26162 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1925,6 +1925,9 @@ static int generic_delete_lease(struct file *filp, void *owner)
 int generic_setlease(struct file *filp, int arg, struct file_lease **flp,
 			void **priv)
 {
+	if (!S_ISREG(file_inode(filp)->i_mode))
+		return -EINVAL;
+
 	switch (arg) {
 	case F_UNLCK:
 		return generic_delete_lease(filp, *priv);
@@ -2014,8 +2017,6 @@ vfs_setlease(struct file *filp, int arg, struct file_lease **lease, void **priv)
 
 	if ((!vfsuid_eq_kuid(vfsuid, current_fsuid())) && !capable(CAP_LEASE))
 		return -EACCES;
-	if (!S_ISREG(inode->i_mode))
-		return -EINVAL;
 	error = security_file_lock(filp, arg);
 	if (error)
 		return error;
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 1cd9652f3c28..b7630a437ad2 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -442,6 +442,8 @@ void nfs42_ssc_unregister_ops(void)
 static int nfs4_setlease(struct file *file, int arg, struct file_lease **lease,
 			 void **priv)
 {
+	if (!S_ISREG(file_inode(file)->i_mode))
+		return -EINVAL;
 	return nfs4_proc_setlease(file, arg, lease, priv);
 }
 
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index fb368b191eef..81a96fbfa3ed 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -1094,6 +1094,9 @@ cifs_setlease(struct file *file, int arg, struct file_lease **lease, void **priv
 	struct inode *inode = file_inode(file);
 	struct cifsFileInfo *cfile = file->private_data;
 
+	if (!S_ISREG(inode->i_mode))
+		return -EINVAL;
+
 	/* Check if file is oplocked if this is request for new lease */
 	if (arg == F_UNLCK ||
 	    ((arg == F_RDLCK) && CIFS_CACHE_READ(CIFS_I(inode))) ||

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
  2024-03-15 16:52 ` [PATCH RFC 01/24] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
@ 2024-03-15 16:52 ` Jeff Layton
  2024-03-17 14:56   ` Chuck Lever
  2024-03-20 13:13   ` Christian Brauner
  2024-03-15 16:52 ` [PATCH RFC 03/24] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink} Jeff Layton
                   ` (21 subsequent siblings)
  23 siblings, 2 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:52 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

The NFSv4.1 protocol adds support for directory delegations, but it
specifies that if you already have a delegation and try to request a new
one on the same filehandle, the server must reply that the delegation is
unavailable.

Add a new lease_manager callback to allow the lease manager (nfsd in
this case) to impose extra checks when performing a setlease.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/locks.c               |  5 +++++
 include/linux/filelock.h | 10 ++++++++++
 2 files changed, 15 insertions(+)

diff --git a/fs/locks.c b/fs/locks.c
index cb4b35d26162..415cca8e9565 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1822,6 +1822,11 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr
 			continue;
 		}
 
+		/* Allow the lease manager to veto the setlease */
+		if (lease->fl_lmops->lm_set_conflict &&
+		    lease->fl_lmops->lm_set_conflict(lease, fl))
+			goto out;
+
 		/*
 		 * No exclusive leases if someone else has a lease on
 		 * this file:
diff --git a/include/linux/filelock.h b/include/linux/filelock.h
index daee999d05f3..c5fc768087df 100644
--- a/include/linux/filelock.h
+++ b/include/linux/filelock.h
@@ -49,6 +49,16 @@ struct lease_manager_operations {
 	int (*lm_change)(struct file_lease *, int, struct list_head *);
 	void (*lm_setup)(struct file_lease *, void **);
 	bool (*lm_breaker_owns_lease)(struct file_lease *);
+
+	/**
+	 * lm_set_conflict - extra conditions for setlease
+	 * @new: new file_lease being set
+	 * @old: old (extant) file_lease
+	 *
+	 * This allows the lease manager to add extra conditions when
+	 * setting a lease.
+	 */
+	bool (*lm_set_conflict)(struct file_lease *new, struct file_lease *old);
 };
 
 struct lock_manager {

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 03/24] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
  2024-03-15 16:52 ` [PATCH RFC 01/24] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
  2024-03-15 16:52 ` [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback Jeff Layton
@ 2024-03-15 16:52 ` Jeff Layton
  2024-03-16 23:57   ` Al Viro
  2024-03-15 16:52 ` [PATCH RFC 04/24] vfs: allow mkdir to wait for delegation break on parent Jeff Layton
                   ` (20 subsequent siblings)
  23 siblings, 1 reply; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:52 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

vfs_link, vfs_unlink, and vfs_rename all have existing delegation break
handling for the children in the rename. Add the necessary calls for
breaking delegations in the parent(s) as well.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/namei.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 9342fa6a38c2..56d3ebed7bac 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4332,6 +4332,9 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir,
 	else {
 		error = security_inode_unlink(dir, dentry);
 		if (!error) {
+			error = try_break_deleg(dir, delegated_inode);
+			if (error)
+				goto out;
 			error = try_break_deleg(target, delegated_inode);
 			if (error)
 				goto out;
@@ -4603,9 +4606,12 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
 	else if (max_links && inode->i_nlink >= max_links)
 		error = -EMLINK;
 	else {
-		error = try_break_deleg(inode, delegated_inode);
-		if (!error)
-			error = dir->i_op->link(old_dentry, dir, new_dentry);
+		error = try_break_deleg(dir, delegated_inode);
+		if (!error) {
+			error = try_break_deleg(inode, delegated_inode);
+			if (!error)
+				error = dir->i_op->link(old_dentry, dir, new_dentry);
+		}
 	}
 
 	if (!error && (inode->i_state & I_LINKABLE)) {
@@ -4870,6 +4876,14 @@ int vfs_rename(struct renamedata *rd)
 		    old_dir->i_nlink >= max_links)
 			goto out;
 	}
+	error = try_break_deleg(old_dir, delegated_inode);
+	if (error)
+		goto out;
+	if (new_dir != old_dir) {
+		error = try_break_deleg(new_dir, delegated_inode);
+		if (error)
+			goto out;
+	}
 	if (!is_dir) {
 		error = try_break_deleg(source, delegated_inode);
 		if (error)

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 04/24] vfs: allow mkdir to wait for delegation break on parent
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (2 preceding siblings ...)
  2024-03-15 16:52 ` [PATCH RFC 03/24] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink} Jeff Layton
@ 2024-03-15 16:52 ` Jeff Layton
  2024-03-15 16:52 ` [PATCH RFC 05/24] vfs: allow rmdir " Jeff Layton
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:52 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

Add a new delegated_inode parameter to vfs_mkdir. Most callers will set
that to NULL, but do_mkdirat can use that to wait for the delegation
break to complete and then retry.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 drivers/base/devtmpfs.c  |  2 +-
 fs/cachefiles/namei.c    |  2 +-
 fs/ecryptfs/inode.c      |  2 +-
 fs/init.c                |  2 +-
 fs/namei.c               | 17 +++++++++++++----
 fs/nfsd/nfs4recover.c    |  2 +-
 fs/nfsd/vfs.c            |  2 +-
 fs/overlayfs/overlayfs.h |  2 +-
 fs/smb/server/vfs.c      |  2 +-
 include/linux/fs.h       |  2 +-
 10 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index b848764ef018..8d1dbcad69f7 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -166,7 +166,7 @@ static int dev_mkdir(const char *name, umode_t mode)
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
-	err = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode);
+	err = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode, NULL);
 	if (!err)
 		/* mark as kernel-created inode */
 		d_inode(dentry)->i_private = &thread;
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 7ade836beb58..4e3e31320275 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -130,7 +130,7 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 			goto mkdir_error;
 		ret = cachefiles_inject_write_error();
 		if (ret == 0)
-			ret = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
+			ret = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700, NULL);
 		if (ret < 0) {
 			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
 						   cachefiles_trace_mkdir_error);
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 5ed1e4cf6c0b..d26b4484fa60 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -513,7 +513,7 @@ static int ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	rc = lock_parent(dentry, &lower_dentry, &lower_dir);
 	if (!rc)
 		rc = vfs_mkdir(&nop_mnt_idmap, lower_dir,
-			       lower_dentry, mode);
+			       lower_dentry, mode, NULL);
 	if (rc || d_really_is_negative(lower_dentry))
 		goto out;
 	rc = ecryptfs_interpose(lower_dentry, dentry, dir->i_sb);
diff --git a/fs/init.c b/fs/init.c
index e9387b6c4f30..325c9e4d9b20 100644
--- a/fs/init.c
+++ b/fs/init.c
@@ -232,7 +232,7 @@ int __init init_mkdir(const char *pathname, umode_t mode)
 	error = security_path_mkdir(&path, dentry, mode);
 	if (!error)
 		error = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
-				  dentry, mode);
+				  dentry, mode, NULL);
 	done_path_create(&path, dentry);
 	return error;
 }
diff --git a/fs/namei.c b/fs/namei.c
index 56d3ebed7bac..6a22517f9938 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4103,7 +4103,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
  * raw inode simply pass @nop_mnt_idmap.
  */
 int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
-	      struct dentry *dentry, umode_t mode)
+	      struct dentry *dentry, umode_t mode, struct inode **delegated_inode)
 {
 	int error;
 	unsigned max_links = dir->i_sb->s_max_links;
@@ -4123,6 +4123,10 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	if (max_links && dir->i_nlink >= max_links)
 		return -EMLINK;
 
+	error = try_break_deleg(dir, delegated_inode);
+	if (error)
+		return error;
+
 	error = dir->i_op->mkdir(idmap, dir, dentry, mode);
 	if (!error)
 		fsnotify_mkdir(dir, dentry);
@@ -4136,6 +4140,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
 	struct path path;
 	int error;
 	unsigned int lookup_flags = LOOKUP_DIRECTORY;
+	struct inode *delegated_inode = NULL;
 
 retry:
 	dentry = filename_create(dfd, name, &path, lookup_flags);
@@ -4145,11 +4150,15 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
 
 	error = security_path_mkdir(&path, dentry,
 			mode_strip_umask(path.dentry->d_inode, mode));
-	if (!error) {
+	if (!error)
 		error = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
-				  dentry, mode);
-	}
+				  dentry, mode, &delegated_inode);
 	done_path_create(&path, dentry);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry;
+	}
 	if (retry_estale(error, lookup_flags)) {
 		lookup_flags |= LOOKUP_REVAL;
 		goto retry;
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 2c060e0b1604..5bfced783a70 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -234,7 +234,7 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
 		 * as well be forgiving and just succeed silently.
 		 */
 		goto out_put;
-	status = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
+	status = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU, NULL);
 out_put:
 	dput(dentry);
 out_unlock:
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 6a4c506038e0..e42e58825590 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1496,7 +1496,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			nfsd_check_ignore_resizing(iap);
 		break;
 	case S_IFDIR:
-		host_err = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
+		host_err = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode, NULL);
 		if (!host_err && unlikely(d_unhashed(dchild))) {
 			struct dentry *d;
 			d = lookup_one_len(dchild->d_name.name,
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index ee949f3e7c77..baa371947f86 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -242,7 +242,7 @@ static inline int ovl_do_mkdir(struct ovl_fs *ofs,
 			       struct inode *dir, struct dentry *dentry,
 			       umode_t mode)
 {
-	int err = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
+	int err = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, NULL);
 	pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, err);
 	return err;
 }
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index c487e834331a..3760e0dda349 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -227,7 +227,7 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
 
 	idmap = mnt_idmap(path.mnt);
 	mode |= S_IFDIR;
-	err = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode);
+	err = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode, NULL);
 	if (!err && d_unhashed(dentry)) {
 		struct dentry *d;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8dfcddafdb3d..18eb7d628290 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1839,7 +1839,7 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap,
 int vfs_create(struct mnt_idmap *, struct inode *,
 	       struct dentry *, umode_t, bool);
 int vfs_mkdir(struct mnt_idmap *, struct inode *,
-	      struct dentry *, umode_t);
+	      struct dentry *, umode_t, struct inode **);
 int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
               umode_t, dev_t);
 int vfs_symlink(struct mnt_idmap *, struct inode *,

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 05/24] vfs: allow rmdir to wait for delegation break on parent
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (3 preceding siblings ...)
  2024-03-15 16:52 ` [PATCH RFC 04/24] vfs: allow mkdir to wait for delegation break on parent Jeff Layton
@ 2024-03-15 16:52 ` Jeff Layton
  2024-03-15 16:52 ` [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:52 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

Add a new delegated_inode parameter to vfs_rmdir. Most callers will set
that to NULL, but do_rmdir can use that to wait for the delegation
break to complete and then retry.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 drivers/base/devtmpfs.c  |  2 +-
 fs/ecryptfs/inode.c      |  2 +-
 fs/namei.c               | 16 ++++++++++++++--
 fs/nfsd/nfs4recover.c    |  4 ++--
 fs/nfsd/vfs.c            |  2 +-
 fs/overlayfs/overlayfs.h |  2 +-
 fs/smb/server/vfs.c      |  4 ++--
 include/linux/fs.h       |  3 ++-
 8 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 8d1dbcad69f7..c00126796f79 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -248,7 +248,7 @@ static int dev_rmdir(const char *name)
 	if (d_really_is_positive(dentry)) {
 		if (d_inode(dentry)->i_private == &thread)
 			err = vfs_rmdir(&nop_mnt_idmap, d_inode(parent.dentry),
-					dentry);
+					dentry, NULL);
 		else
 			err = -EPERM;
 	} else {
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index d26b4484fa60..3d0cddbf037c 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -541,7 +541,7 @@ static int ecryptfs_rmdir(struct inode *dir, struct dentry *dentry)
 		if (d_unhashed(lower_dentry))
 			rc = -EINVAL;
 		else
-			rc = vfs_rmdir(&nop_mnt_idmap, lower_dir, lower_dentry);
+			rc = vfs_rmdir(&nop_mnt_idmap, lower_dir, lower_dentry, NULL);
 	}
 	if (!rc) {
 		clear_nlink(d_inode(dentry));
diff --git a/fs/namei.c b/fs/namei.c
index 6a22517f9938..f00d8d708001 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4183,6 +4183,7 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
  * @idmap:	idmap of the mount the inode was found from
  * @dir:	inode of @dentry
  * @dentry:	pointer to dentry of the base directory
+ * @delegated_inode: return pointer for delegated inode
  *
  * Remove a directory.
  *
@@ -4193,7 +4194,7 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
  * raw inode simply pass @nop_mnt_idmap.
  */
 int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
-		     struct dentry *dentry)
+		     struct dentry *dentry, struct inode **delegated_inode)
 {
 	int error = may_delete(idmap, dir, dentry, 1);
 
@@ -4215,6 +4216,10 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		goto out;
 
+	error = try_break_deleg(dir, delegated_inode);
+	if (error)
+		goto out;
+
 	error = dir->i_op->rmdir(dir, dentry);
 	if (error)
 		goto out;
@@ -4241,6 +4246,7 @@ int do_rmdir(int dfd, struct filename *name)
 	struct qstr last;
 	int type;
 	unsigned int lookup_flags = 0;
+	struct inode *delegated_inode = NULL;
 retry:
 	error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
 	if (error)
@@ -4274,7 +4280,8 @@ int do_rmdir(int dfd, struct filename *name)
 	error = security_path_rmdir(&path, dentry);
 	if (error)
 		goto exit4;
-	error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry);
+	error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode,
+			  dentry, &delegated_inode);
 exit4:
 	dput(dentry);
 exit3:
@@ -4282,6 +4289,11 @@ int do_rmdir(int dfd, struct filename *name)
 	mnt_drop_write(path.mnt);
 exit2:
 	path_put(&path);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry;
+	}
 	if (retry_estale(error, lookup_flags)) {
 		lookup_flags |= LOOKUP_REVAL;
 		goto retry;
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 5bfced783a70..0e82f79471cf 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -354,7 +354,7 @@ nfsd4_unlink_clid_dir(char *name, int namlen, struct nfsd_net *nn)
 	status = -ENOENT;
 	if (d_really_is_negative(dentry))
 		goto out;
-	status = vfs_rmdir(&nop_mnt_idmap, d_inode(dir), dentry);
+	status = vfs_rmdir(&nop_mnt_idmap, d_inode(dir), dentry, NULL);
 out:
 	dput(dentry);
 out_unlock:
@@ -444,7 +444,7 @@ purge_old(struct dentry *parent, struct dentry *child, struct nfsd_net *nn)
 	if (nfs4_has_reclaimed_state(name, nn))
 		goto out_free;
 
-	status = vfs_rmdir(&nop_mnt_idmap, d_inode(parent), child);
+	status = vfs_rmdir(&nop_mnt_idmap, d_inode(parent), child, NULL);
 	if (status)
 		printk("failed to remove client recovery directory %pd\n",
 				child);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index e42e58825590..34cc2d1a4944 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -2003,7 +2003,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 				break;
 		}
 	} else {
-		host_err = vfs_rmdir(&nop_mnt_idmap, dirp, rdentry);
+		host_err = vfs_rmdir(&nop_mnt_idmap, dirp, rdentry, NULL);
 	}
 	fh_fill_post_attrs(fhp);
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index baa371947f86..5b1f56294c4d 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -203,7 +203,7 @@ static inline int ovl_do_notify_change(struct ovl_fs *ofs,
 static inline int ovl_do_rmdir(struct ovl_fs *ofs,
 			       struct inode *dir, struct dentry *dentry)
 {
-	int err = vfs_rmdir(ovl_upper_mnt_idmap(ofs), dir, dentry);
+	int err = vfs_rmdir(ovl_upper_mnt_idmap(ofs), dir, dentry, NULL);
 
 	pr_debug("rmdir(%pd2) = %i\n", dentry, err);
 	return err;
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 3760e0dda349..5b4e5876c2ac 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -611,7 +611,7 @@ int ksmbd_vfs_remove_file(struct ksmbd_work *work, const struct path *path)
 
 	idmap = mnt_idmap(path->mnt);
 	if (S_ISDIR(d_inode(path->dentry)->i_mode)) {
-		err = vfs_rmdir(idmap, d_inode(parent), path->dentry);
+		err = vfs_rmdir(idmap, d_inode(parent), path->dentry, NULL);
 		if (err && err != -ENOTEMPTY)
 			ksmbd_debug(VFS, "rmdir failed, err %d\n", err);
 	} else {
@@ -1084,7 +1084,7 @@ int ksmbd_vfs_unlink(struct file *filp)
 	dget(dentry);
 
 	if (S_ISDIR(d_inode(dentry)->i_mode))
-		err = vfs_rmdir(idmap, d_inode(dir), dentry);
+		err = vfs_rmdir(idmap, d_inode(dir), dentry, NULL);
 	else
 		err = vfs_unlink(idmap, d_inode(dir), dentry, NULL);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 18eb7d628290..e72c825476de 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1846,7 +1846,8 @@ int vfs_symlink(struct mnt_idmap *, struct inode *,
 		struct dentry *, const char *);
 int vfs_link(struct dentry *, struct mnt_idmap *, struct inode *,
 	     struct dentry *, struct inode **);
-int vfs_rmdir(struct mnt_idmap *, struct inode *, struct dentry *);
+int vfs_rmdir(struct mnt_idmap *, struct inode *, struct dentry *,
+	       struct inode **);
 int vfs_unlink(struct mnt_idmap *, struct inode *, struct dentry *,
 	       struct inode **);
 

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (4 preceding siblings ...)
  2024-03-15 16:52 ` [PATCH RFC 05/24] vfs: allow rmdir " Jeff Layton
@ 2024-03-15 16:52 ` Jeff Layton
  2024-03-17  0:19   ` Al Viro
  2024-03-18  8:25   ` Stefan Metzmacher
  2024-03-15 16:52 ` [PATCH RFC 07/24] vfs: make vfs_create break delegations on parent directory Jeff Layton
                   ` (17 subsequent siblings)
  23 siblings, 2 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:52 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

Add a delegated_inode parameter to lookup_open and have it break the
delegation. Then, open_last_lookups can wait for the delegation break
and retry the call to lookup_open once it's done.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/namei.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index f00d8d708001..88598a62ec64 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3404,7 +3404,7 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
  */
 static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 				  const struct open_flags *op,
-				  bool got_write)
+				  bool got_write, struct inode **delegated_inode)
 {
 	struct mnt_idmap *idmap;
 	struct dentry *dir = nd->path.dentry;
@@ -3490,6 +3490,11 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 
 	/* Negative dentry, just create the file */
 	if (!dentry->d_inode && (open_flag & O_CREAT)) {
+		/* but break the directory lease first! */
+		error = try_break_deleg(dir_inode, delegated_inode);
+		if (error)
+			goto out_dput;
+
 		file->f_mode |= FMODE_CREATED;
 		audit_inode_child(dir_inode, dentry, AUDIT_TYPE_CHILD_CREATE);
 		if (!dir_inode->i_op->create) {
@@ -3517,6 +3522,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 		   struct file *file, const struct open_flags *op)
 {
 	struct dentry *dir = nd->path.dentry;
+	struct inode *delegated_inode = NULL;
 	int open_flag = op->open_flag;
 	bool got_write = false;
 	struct dentry *dentry;
@@ -3553,7 +3559,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 		if (unlikely(nd->last.name[nd->last.len]))
 			return ERR_PTR(-EISDIR);
 	}
-
+retry:
 	if (open_flag & (O_CREAT | O_TRUNC | O_WRONLY | O_RDWR)) {
 		got_write = !mnt_want_write(nd->path.mnt);
 		/*
@@ -3566,7 +3572,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 		inode_lock(dir->d_inode);
 	else
 		inode_lock_shared(dir->d_inode);
-	dentry = lookup_open(nd, file, op, got_write);
+	dentry = lookup_open(nd, file, op, got_write, &delegated_inode);
 	if (!IS_ERR(dentry) && (file->f_mode & FMODE_CREATED))
 		fsnotify_create(dir->d_inode, dentry);
 	if (open_flag & O_CREAT)
@@ -3577,8 +3583,16 @@ static const char *open_last_lookups(struct nameidata *nd,
 	if (got_write)
 		mnt_drop_write(nd->path.mnt);
 
-	if (IS_ERR(dentry))
+	if (IS_ERR(dentry)) {
+		if (delegated_inode) {
+			int error = break_deleg_wait(&delegated_inode);
+
+			if (!error)
+				goto retry;
+			return ERR_PTR(error);
+		}
 		return ERR_CAST(dentry);
+	}
 
 	if (file->f_mode & (FMODE_OPENED | FMODE_CREATED)) {
 		dput(nd->path.dentry);

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 07/24] vfs: make vfs_create break delegations on parent directory
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (5 preceding siblings ...)
  2024-03-15 16:52 ` [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
@ 2024-03-15 16:52 ` Jeff Layton
  2024-03-15 16:52 ` [PATCH RFC 08/24] vfs: make vfs_mknod " Jeff Layton
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:52 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

Add a new delegated_inode parameter to vfs_create. Most callers will
set that to NULL, but do_mknodat can use that to synchronously wait
for the delegation break to complete.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ecryptfs/inode.c      |  2 +-
 fs/namei.c               | 15 +++++++++++++--
 fs/nfsd/nfs3proc.c       |  2 +-
 fs/nfsd/vfs.c            |  2 +-
 fs/open.c                |  2 +-
 fs/overlayfs/overlayfs.h |  2 +-
 fs/smb/server/vfs.c      |  2 +-
 include/linux/fs.h       |  2 +-
 8 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 3d0cddbf037c..a99b1e264c46 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -189,7 +189,7 @@ ecryptfs_do_create(struct inode *directory_inode,
 	rc = lock_parent(ecryptfs_dentry, &lower_dentry, &lower_dir);
 	if (!rc)
 		rc = vfs_create(&nop_mnt_idmap, lower_dir,
-				lower_dentry, mode, true);
+				lower_dentry, mode, true, NULL);
 	if (rc) {
 		printk(KERN_ERR "%s: Failure to create dentry in lower fs; "
 		       "rc = [%d]\n", __func__, rc);
diff --git a/fs/namei.c b/fs/namei.c
index 88598a62ec64..01e04cf155eb 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3174,6 +3174,7 @@ static inline umode_t vfs_prepare_mode(struct mnt_idmap *idmap,
  * @dentry:	pointer to dentry of the base directory
  * @mode:	mode of the new file
  * @want_excl:	whether the file must not yet exist
+ * @delegated_inode: return pointer for delegated_inode
  *
  * Create a new file.
  *
@@ -3184,7 +3185,8 @@ static inline umode_t vfs_prepare_mode(struct mnt_idmap *idmap,
  * raw inode simply pass @nop_mnt_idmap.
  */
 int vfs_create(struct mnt_idmap *idmap, struct inode *dir,
-	       struct dentry *dentry, umode_t mode, bool want_excl)
+	       struct dentry *dentry, umode_t mode, bool want_excl,
+	       struct inode **delegated_inode)
 {
 	int error;
 
@@ -3197,6 +3199,9 @@ int vfs_create(struct mnt_idmap *idmap, struct inode *dir,
 
 	mode = vfs_prepare_mode(idmap, dir, mode, S_IALLUGO, S_IFREG);
 	error = security_inode_create(dir, dentry, mode);
+	if (error)
+		return error;
+	error = try_break_deleg(dir, delegated_inode);
 	if (error)
 		return error;
 	error = dir->i_op->create(idmap, dir, dentry, mode, want_excl);
@@ -4047,6 +4052,7 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 	struct path path;
 	int error;
 	unsigned int lookup_flags = 0;
+	struct inode *delegated_inode = NULL;
 
 	error = may_mknod(mode);
 	if (error)
@@ -4066,7 +4072,7 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 	switch (mode & S_IFMT) {
 		case 0: case S_IFREG:
 			error = vfs_create(idmap, path.dentry->d_inode,
-					   dentry, mode, true);
+					   dentry, mode, true, &delegated_inode);
 			if (!error)
 				ima_post_path_mknod(idmap, dentry);
 			break;
@@ -4081,6 +4087,11 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 	}
 out2:
 	done_path_create(&path, dentry);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry;
+	}
 	if (retry_estale(error, lookup_flags)) {
 		lookup_flags |= LOOKUP_REVAL;
 		goto retry;
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index dfcc957e460d..e920a6291f2d 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -313,7 +313,7 @@ nfsd3_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	status = fh_fill_pre_attrs(fhp);
 	if (status != nfs_ok)
 		goto out;
-	host_err = vfs_create(&nop_mnt_idmap, inode, child, iap->ia_mode, true);
+	host_err = vfs_create(&nop_mnt_idmap, inode, child, iap->ia_mode, true, NULL);
 	if (host_err < 0) {
 		status = nfserrno(host_err);
 		goto out;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 34cc2d1a4944..47b8ab1d4b17 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1491,7 +1491,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	switch (type) {
 	case S_IFREG:
 		host_err = vfs_create(&nop_mnt_idmap, dirp, dchild,
-				      iap->ia_mode, true);
+				      iap->ia_mode, true, NULL);
 		if (!host_err)
 			nfsd_check_ignore_resizing(iap);
 		break;
diff --git a/fs/open.c b/fs/open.c
index 0a73afe04d34..0b50ea7e8aec 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1137,7 +1137,7 @@ struct file *dentry_create(const struct path *path, int flags, umode_t mode,
 
 	error = vfs_create(mnt_idmap(path->mnt),
 			   d_inode(path->dentry->d_parent),
-			   path->dentry, mode, true);
+			   path->dentry, mode, true, NULL);
 	if (!error)
 		error = vfs_open(path, f);
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 5b1f56294c4d..be2518e6da95 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -232,7 +232,7 @@ static inline int ovl_do_create(struct ovl_fs *ofs,
 				struct inode *dir, struct dentry *dentry,
 				umode_t mode)
 {
-	int err = vfs_create(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, true);
+	int err = vfs_create(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, true, NULL);
 
 	pr_debug("create(%pd2, 0%o) = %i\n", dentry, mode, err);
 	return err;
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 5b4e5876c2ac..b313eb5a1d28 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -187,7 +187,7 @@ int ksmbd_vfs_create(struct ksmbd_work *work, const char *name, umode_t mode)
 
 	mode |= S_IFREG;
 	err = vfs_create(mnt_idmap(path.mnt), d_inode(path.dentry),
-			 dentry, mode, true);
+			 dentry, mode, true, NULL);
 	if (!err) {
 		ksmbd_vfs_inherit_owner(work, d_inode(path.dentry),
 					d_inode(dentry));
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e72c825476de..8fb4101fea49 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1837,7 +1837,7 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap,
  * VFS helper functions..
  */
 int vfs_create(struct mnt_idmap *, struct inode *,
-	       struct dentry *, umode_t, bool);
+	       struct dentry *, umode_t, bool, struct inode **);
 int vfs_mkdir(struct mnt_idmap *, struct inode *,
 	      struct dentry *, umode_t, struct inode **);
 int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 08/24] vfs: make vfs_mknod break delegations on parent directory
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (6 preceding siblings ...)
  2024-03-15 16:52 ` [PATCH RFC 07/24] vfs: make vfs_create break delegations on parent directory Jeff Layton
@ 2024-03-15 16:52 ` Jeff Layton
  2024-03-20 13:42   ` Christian Brauner
  2024-03-15 16:53 ` [PATCH RFC 09/24] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
                   ` (15 subsequent siblings)
  23 siblings, 1 reply; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:52 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

Add a new delegated_inode parameter to vfs_mknod. Most callers will
set that to NULL, but do_mknodat can use that to synchronously wait
for the delegation break to complete.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 drivers/base/devtmpfs.c  |  2 +-
 fs/ecryptfs/inode.c      |  2 +-
 fs/init.c                |  2 +-
 fs/namei.c               | 11 ++++++++---
 fs/nfsd/vfs.c            |  2 +-
 fs/overlayfs/overlayfs.h |  2 +-
 include/linux/fs.h       |  4 ++--
 net/unix/af_unix.c       |  2 +-
 8 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index c00126796f79..8c0a872e3165 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -217,7 +217,7 @@ static int handle_create(const char *nodename, umode_t mode, kuid_t uid,
 		return PTR_ERR(dentry);
 
 	err = vfs_mknod(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode,
-			dev->devt);
+			dev->devt, NULL);
 	if (!err) {
 		struct iattr newattrs;
 
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index a99b1e264c46..c6442b8caa2f 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -566,7 +566,7 @@ ecryptfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	rc = lock_parent(dentry, &lower_dentry, &lower_dir);
 	if (!rc)
 		rc = vfs_mknod(&nop_mnt_idmap, lower_dir,
-			       lower_dentry, mode, dev);
+			       lower_dentry, mode, dev, NULL);
 	if (rc || d_really_is_negative(lower_dentry))
 		goto out;
 	rc = ecryptfs_interpose(lower_dentry, dentry, dir->i_sb);
diff --git a/fs/init.c b/fs/init.c
index 325c9e4d9b20..99c6b413adad 100644
--- a/fs/init.c
+++ b/fs/init.c
@@ -157,7 +157,7 @@ int __init init_mknod(const char *filename, umode_t mode, unsigned int dev)
 	error = security_path_mknod(&path, dentry, mode, dev);
 	if (!error)
 		error = vfs_mknod(mnt_idmap(path.mnt), path.dentry->d_inode,
-				  dentry, mode, new_decode_dev(dev));
+				  dentry, mode, new_decode_dev(dev), NULL);
 	done_path_create(&path, dentry);
 	return error;
 }
diff --git a/fs/namei.c b/fs/namei.c
index 01e04cf155eb..a185974c1a55 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3996,7 +3996,8 @@ EXPORT_SYMBOL(user_path_create);
  * raw inode simply pass @nop_mnt_idmap.
  */
 int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
-	      struct dentry *dentry, umode_t mode, dev_t dev)
+	      struct dentry *dentry, umode_t mode, dev_t dev,
+	      struct inode **delegated_inode)
 {
 	bool is_whiteout = S_ISCHR(mode) && dev == WHITEOUT_DEV;
 	int error = may_create(idmap, dir, dentry);
@@ -4020,6 +4021,10 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		return error;
 
+	error = try_break_deleg(dir, delegated_inode);
+	if (error)
+		return error;
+
 	error = dir->i_op->mknod(idmap, dir, dentry, mode, dev);
 	if (!error)
 		fsnotify_create(dir, dentry);
@@ -4078,11 +4083,11 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 			break;
 		case S_IFCHR: case S_IFBLK:
 			error = vfs_mknod(idmap, path.dentry->d_inode,
-					  dentry, mode, new_decode_dev(dev));
+					  dentry, mode, new_decode_dev(dev), &delegated_inode);
 			break;
 		case S_IFIFO: case S_IFSOCK:
 			error = vfs_mknod(idmap, path.dentry->d_inode,
-					  dentry, mode, 0);
+					  dentry, mode, 0, &delegated_inode);
 			break;
 	}
 out2:
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 47b8ab1d4b17..fe088e7c49c8 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1525,7 +1525,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	case S_IFIFO:
 	case S_IFSOCK:
 		host_err = vfs_mknod(&nop_mnt_idmap, dirp, dchild,
-				     iap->ia_mode, rdev);
+				     iap->ia_mode, rdev, NULL);
 		break;
 	default:
 		printk(KERN_WARNING "nfsd: bad file type %o in nfsd_create\n",
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index be2518e6da95..26cdef6c3579 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -251,7 +251,7 @@ static inline int ovl_do_mknod(struct ovl_fs *ofs,
 			       struct inode *dir, struct dentry *dentry,
 			       umode_t mode, dev_t dev)
 {
-	int err = vfs_mknod(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, dev);
+	int err = vfs_mknod(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, dev, NULL);
 
 	pr_debug("mknod(%pd2, 0%o, 0%o) = %i\n", dentry, mode, dev, err);
 	return err;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8fb4101fea49..4b396c9a7a84 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1841,7 +1841,7 @@ int vfs_create(struct mnt_idmap *, struct inode *,
 int vfs_mkdir(struct mnt_idmap *, struct inode *,
 	      struct dentry *, umode_t, struct inode **);
 int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
-              umode_t, dev_t);
+              umode_t, dev_t, struct inode **);
 int vfs_symlink(struct mnt_idmap *, struct inode *,
 		struct dentry *, const char *);
 int vfs_link(struct dentry *, struct mnt_idmap *, struct inode *,
@@ -1879,7 +1879,7 @@ static inline int vfs_whiteout(struct mnt_idmap *idmap,
 			       struct inode *dir, struct dentry *dentry)
 {
 	return vfs_mknod(idmap, dir, dentry, S_IFCHR | WHITEOUT_MODE,
-			 WHITEOUT_DEV);
+			 WHITEOUT_DEV, NULL);
 }
 
 struct file *kernel_tmpfile_open(struct mnt_idmap *idmap,
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 0748e7ea5210..34fbcc90c984 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1227,7 +1227,7 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr,
 	idmap = mnt_idmap(parent.mnt);
 	err = security_path_mknod(&parent, dentry, mode, 0);
 	if (!err)
-		err = vfs_mknod(idmap, d_inode(parent.dentry), dentry, mode, 0);
+		err = vfs_mknod(idmap, d_inode(parent.dentry), dentry, mode, 0, NULL);
 	if (err)
 		goto out_path;
 	err = mutex_lock_interruptible(&u->bindlock);

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 09/24] filelock: lift the ban on directory leases in generic_setlease
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (7 preceding siblings ...)
  2024-03-15 16:52 ` [PATCH RFC 08/24] vfs: make vfs_mknod " Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 10/24] nfsd: allow filecache to hold S_IFDIR files Jeff Layton
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

With the addition of the try_break_lease calls in directory changing
operations, allow generic_setlease to hand them out.

Note that this also makes directory leases available to userland via
fcntl(). I don't see a real reason to prevent userland from acquiring
one, but we could reinstate the prohibition if that's preferable.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/locks.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/locks.c b/fs/locks.c
index 415cca8e9565..ba6b6f9ea4c7 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1930,7 +1930,9 @@ static int generic_delete_lease(struct file *filp, void *owner)
 int generic_setlease(struct file *filp, int arg, struct file_lease **flp,
 			void **priv)
 {
-	if (!S_ISREG(file_inode(filp)->i_mode))
+	struct inode *inode = file_inode(filp);
+
+	if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
 		return -EINVAL;
 
 	switch (arg) {

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 10/24] nfsd: allow filecache to hold S_IFDIR files
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (8 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 09/24] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 11/24] nfsd: allow DELEGRETURN on directories Jeff Layton
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

The filecache infrastructure will only handle S_ISREG files at the
moment. Plumb a "type" variable into nfsd_file_do_acquire and have all
of the existing callers set it to S_ISREG. Add a new
nfsd_file_acquire_dir() wrapper that we can then call to request a
nfsd_file that holds a directory open.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c | 37 +++++++++++++++++++++++++++++++------
 fs/nfsd/filecache.h |  2 ++
 fs/nfsd/vfs.c       |  5 +++--
 fs/nfsd/vfs.h       |  2 +-
 4 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index ddd3e0d9cfa6..ba66d571d567 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -979,7 +979,7 @@ nfsd_file_is_cached(struct inode *inode)
 static __be32
 nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		     unsigned int may_flags, struct file *file,
-		     struct nfsd_file **pnf, bool want_gc)
+		     umode_t type, bool want_gc, struct nfsd_file **pnf)
 {
 	unsigned char need = may_flags & NFSD_FILE_MAY_MASK;
 	struct net *net = SVC_NET(rqstp);
@@ -991,7 +991,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	int ret;
 
 retry:
-	status = fh_verify(rqstp, fhp, S_IFREG,
+	status = fh_verify(rqstp, fhp, type,
 				may_flags|NFSD_MAY_OWNER_OVERRIDE);
 	if (status != nfs_ok)
 		return status;
@@ -1083,7 +1083,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			trace_nfsd_file_opened(nf, status);
 		} else {
 			ret = nfsd_open_verified(rqstp, fhp, may_flags,
-						 &nf->nf_file);
+						 type, &nf->nf_file);
 			if (ret == -EOPENSTALE && stale_retry) {
 				stale_retry = false;
 				nfsd_file_unhash(nf);
@@ -1139,7 +1139,7 @@ __be32
 nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		     unsigned int may_flags, struct nfsd_file **pnf)
 {
-	return nfsd_file_do_acquire(rqstp, fhp, may_flags, NULL, pnf, true);
+	return nfsd_file_do_acquire(rqstp, fhp, may_flags, NULL, S_IFREG, true, pnf);
 }
 
 /**
@@ -1163,7 +1163,7 @@ __be32
 nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		  unsigned int may_flags, struct nfsd_file **pnf)
 {
-	return nfsd_file_do_acquire(rqstp, fhp, may_flags, NULL, pnf, false);
+	return nfsd_file_do_acquire(rqstp, fhp, may_flags, NULL, S_IFREG, false, pnf);
 }
 
 /**
@@ -1189,7 +1189,32 @@ nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			 unsigned int may_flags, struct file *file,
 			 struct nfsd_file **pnf)
 {
-	return nfsd_file_do_acquire(rqstp, fhp, may_flags, file, pnf, false);
+	return nfsd_file_do_acquire(rqstp, fhp, may_flags, file, S_IFREG, false, pnf);
+}
+
+/**
+ * nfsd_file_acquire_dir - Get a struct nfsd_file with an open directory
+ * @rqstp: the RPC transaction being executed
+ * @fhp: the NFS filehandle of the file to be opened
+ * @pnf: OUT: new or found "struct nfsd_file" object
+ *
+ * The nfsd_file_object returned by this API is reference-counted
+ * but not garbage-collected. The object is unhashed after the
+ * final nfsd_file_put(). This opens directories only, and only
+ * in O_RDONLY mode.
+ *
+ * Return values:
+ *   %nfs_ok - @pnf points to an nfsd_file with its reference
+ *   count boosted.
+ *
+ * On error, an nfsstat value in network byte order is returned.
+ */
+__be32
+nfsd_file_acquire_dir(struct svc_rqst *rqstp, struct svc_fh *fhp,
+		  struct nfsd_file **pnf)
+{
+	return nfsd_file_do_acquire(rqstp, fhp, NFSD_MAY_READ|NFSD_MAY_64BIT_COOKIE,
+				    NULL, S_IFDIR, false, pnf);
 }
 
 /*
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index c61884def906..de29a1c9d949 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -65,5 +65,7 @@ __be32 nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 __be32 nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		  unsigned int may_flags, struct file *file,
 		  struct nfsd_file **nfp);
+__be32 nfsd_file_acquire_dir(struct svc_rqst *rqstp, struct svc_fh *fhp,
+		  struct nfsd_file **pnf);
 int nfsd_file_cache_stats_show(struct seq_file *m, void *v);
 #endif /* _FS_NFSD_FILECACHE_H */
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index fe088e7c49c8..a8313bba2a6f 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -950,6 +950,7 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
  * nfsd_open_verified - Open a regular file for the filecache
  * @rqstp: RPC request
  * @fhp: NFS filehandle of the file to open
+ * @type: S_IFMT inode type allowed (0 means any type is allowed)
  * @may_flags: internal permission flags
  * @filp: OUT: open "struct file *"
  *
@@ -957,9 +958,9 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
  */
 int
 nfsd_open_verified(struct svc_rqst *rqstp, struct svc_fh *fhp, int may_flags,
-		   struct file **filp)
+		   umode_t type, struct file **filp)
 {
-	return __nfsd_open(rqstp, fhp, S_IFREG, may_flags, filp);
+	return __nfsd_open(rqstp, fhp, type, may_flags, filp);
 }
 
 /*
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index c60fdb6200fd..c7f0349c179e 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -105,7 +105,7 @@ int 		nfsd_open_break_lease(struct inode *, int);
 __be32		nfsd_open(struct svc_rqst *, struct svc_fh *, umode_t,
 				int, struct file **);
 int		nfsd_open_verified(struct svc_rqst *rqstp, struct svc_fh *fhp,
-				   int may_flags, struct file **filp);
+				   int may_flags, umode_t type, struct file **filp);
 __be32		nfsd_splice_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 				struct file *file, loff_t offset,
 				unsigned long *count,

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 11/24] nfsd: allow DELEGRETURN on directories
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (9 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 10/24] nfsd: allow filecache to hold S_IFDIR files Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-17 15:09   ` Chuck Lever
  2024-03-15 16:53 ` [PATCH RFC 12/24] nfsd: encoders and decoders for GET_DIR_DELEGATION Jeff Layton
                   ` (12 subsequent siblings)
  23 siblings, 1 reply; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

fh_verify only allows you to filter on a single type of inode, so have
nfsd4_delegreturn not filter by type. Once fh_verify returns, do the
appropriate check of the type and return an error if it's not a regular
file or directory.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4state.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 17d09d72632b..c52e807f9672 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7425,12 +7425,24 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfs4_delegation *dp;
 	stateid_t *stateid = &dr->dr_stateid;
 	struct nfs4_stid *s;
+	umode_t mode;
 	__be32 status;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 
-	if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
+	if ((status = fh_verify(rqstp, &cstate->current_fh, 0, 0)))
 		return status;
 
+	mode = d_inode(cstate->current_fh.fh_dentry)->i_mode & S_IFMT;
+	switch(mode) {
+	case S_IFREG:
+	case S_IFDIR:
+		break;
+	case S_IFLNK:
+		return nfserr_symlink;
+	default:
+		return nfserr_inval;
+	}
+
 	status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG, 0, &s, nn);
 	if (status)
 		goto out;

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 12/24] nfsd: encoders and decoders for GET_DIR_DELEGATION
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (10 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 11/24] nfsd: allow DELEGRETURN on directories Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-17 15:34   ` Chuck Lever
  2024-03-15 16:53 ` [PATCH RFC 13/24] nfsd: check for delegation conflicts vs. the same client Jeff Layton
                   ` (11 subsequent siblings)
  23 siblings, 1 reply; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

This adds basic infrastructure for handing GET_DIR_DELEGATION calls from
clients, including the  decoders and encoders. For now, the server side
always just returns that the  delegation is GDDR_UNAVAIL (and that we
won't call back).

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4proc.c   | 30 ++++++++++++++++++++++
 fs/nfsd/nfs4xdr.c    | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 fs/nfsd/xdr4.h       |  8 ++++++
 include/linux/nfs4.h |  5 ++++
 4 files changed, 113 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 2927b1263f08..7973fe17bf3c 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2173,6 +2173,18 @@ nfsd4_layout_verify(struct svc_export *exp, unsigned int layout_type)
 	return nfsd4_layout_ops[layout_type];
 }
 
+static __be32
+nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
+			 struct nfsd4_compound_state *cstate,
+			 union nfsd4_op_u *u)
+{
+	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
+
+	/* FIXME: actually return a delegation */
+	gdd->nf_status = GDD4_UNAVAIL;
+	return nfs_ok;
+}
+
 static __be32
 nfsd4_getdeviceinfo(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *cstate, union nfsd4_op_u *u)
@@ -3082,6 +3094,18 @@ static u32 nfsd4_copy_notify_rsize(const struct svc_rqst *rqstp,
 		* sizeof(__be32);
 }
 
+static u32 nfsd4_get_dir_delegation_rsize(const struct svc_rqst *rqstp,
+					  const struct nfsd4_op *op)
+{
+	return (op_encode_hdr_size +
+		1 /* gddr_status */ +
+		op_encode_verifier_maxsz +
+		op_encode_stateid_maxsz +
+		2 /* gddr_notification */ +
+		2 /* gddr_child_attributes */ +
+		2 /* gddr_dir_attributes */);
+}
+
 #ifdef CONFIG_NFSD_PNFS
 static u32 nfsd4_getdeviceinfo_rsize(const struct svc_rqst *rqstp,
 				     const struct nfsd4_op *op)
@@ -3470,6 +3494,12 @@ static const struct nfsd4_operation nfsd4_ops[] = {
 		.op_get_currentstateid = nfsd4_get_freestateid,
 		.op_rsize_bop = nfsd4_only_status_rsize,
 	},
+	[OP_GET_DIR_DELEGATION] = {
+		.op_func = nfsd4_get_dir_delegation,
+		.op_flags = OP_MODIFIES_SOMETHING,
+		.op_name = "OP_GET_DIR_DELEGATION",
+		.op_rsize_bop = nfsd4_get_dir_delegation_rsize,
+	},
 #ifdef CONFIG_NFSD_PNFS
 	[OP_GETDEVICEINFO] = {
 		.op_func = nfsd4_getdeviceinfo,
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index fac938f563ad..3718bef74e9f 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1732,6 +1732,40 @@ nfsd4_decode_free_stateid(struct nfsd4_compoundargs *argp,
 	return nfsd4_decode_stateid4(argp, &free_stateid->fr_stateid);
 }
 
+static __be32
+nfsd4_decode_get_dir_delegation(struct nfsd4_compoundargs *argp,
+		union nfsd4_op_u *u)
+{
+	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
+	struct timespec64 ts;
+	u32 signal_deleg_avail;
+	u32 attrs[1];
+	__be32 status;
+
+	memset(gdd, 0, sizeof(*gdd));
+
+	/* No signal_avail support for now (and maybe never) */
+	if (xdr_stream_decode_bool(argp->xdr, &signal_deleg_avail) < 0)
+		return nfserr_bad_xdr;
+	status = nfsd4_decode_bitmap4(argp, gdd->notification_types,
+				      ARRAY_SIZE(gdd->notification_types));
+	if (status)
+		return status;
+
+	/* For now, we don't support child or dir attr change notification */
+	status = nfsd4_decode_nfstime4(argp, &ts);
+	if (status)
+		return status;
+	/* No dir attr notification support yet either */
+	status = nfsd4_decode_nfstime4(argp, &ts);
+	if (status)
+		return status;
+	status = nfsd4_decode_bitmap4(argp, attrs, ARRAY_SIZE(attrs));
+	if (status)
+		return status;
+	return nfsd4_decode_bitmap4(argp, attrs, ARRAY_SIZE(attrs));
+}
+
 #ifdef CONFIG_NFSD_PNFS
 static __be32
 nfsd4_decode_getdeviceinfo(struct nfsd4_compoundargs *argp,
@@ -2370,7 +2404,7 @@ static const nfsd4_dec nfsd4_dec_ops[] = {
 	[OP_CREATE_SESSION]	= nfsd4_decode_create_session,
 	[OP_DESTROY_SESSION]	= nfsd4_decode_destroy_session,
 	[OP_FREE_STATEID]	= nfsd4_decode_free_stateid,
-	[OP_GET_DIR_DELEGATION]	= nfsd4_decode_notsupp,
+	[OP_GET_DIR_DELEGATION]	= nfsd4_decode_get_dir_delegation,
 #ifdef CONFIG_NFSD_PNFS
 	[OP_GETDEVICEINFO]	= nfsd4_decode_getdeviceinfo,
 	[OP_GETDEVICELIST]	= nfsd4_decode_notsupp,
@@ -5002,6 +5036,40 @@ nfsd4_encode_device_addr4(struct xdr_stream *xdr,
 	return nfserr_toosmall;
 }
 
+static __be32
+nfsd4_encode_get_dir_delegation(struct nfsd4_compoundres *resp, __be32 nfserr,
+				union nfsd4_op_u *u)
+{
+	struct xdr_stream *xdr = resp->xdr;
+	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
+
+	/* Encode the GDDR_* status code */
+	if (xdr_stream_encode_u32(xdr, gdd->nf_status) != XDR_UNIT)
+		return nfserr_resource;
+
+	/* if it's GDD4_UNAVAIL then we're (almost) done */
+	if (gdd->nf_status == GDD4_UNAVAIL) {
+		/* We never call back */
+		return nfsd4_encode_bool(xdr, false);
+	}
+
+	/* GDD4_OK case */
+	nfserr = nfsd4_encode_verifier4(xdr, &gdd->cookieverf);
+	if (nfserr)
+		return nfserr;
+	nfserr = nfsd4_encode_stateid4(xdr, &gdd->stateid);
+	if (nfserr)
+		return nfserr;
+	/* No notifications (yet) */
+	nfserr = nfsd4_encode_bitmap4(xdr, 0, 0, 0);
+	if (nfserr)
+		return nfserr;
+	nfserr = nfsd4_encode_bitmap4(xdr, 0, 0, 0);
+	if (nfserr)
+		return nfserr;
+	return nfsd4_encode_bitmap4(xdr, 0, 0, 0);
+}
+
 static __be32
 nfsd4_encode_getdeviceinfo(struct nfsd4_compoundres *resp, __be32 nfserr,
 		union nfsd4_op_u *u)
@@ -5580,7 +5648,7 @@ static const nfsd4_enc nfsd4_enc_ops[] = {
 	[OP_CREATE_SESSION]	= nfsd4_encode_create_session,
 	[OP_DESTROY_SESSION]	= nfsd4_encode_noop,
 	[OP_FREE_STATEID]	= nfsd4_encode_noop,
-	[OP_GET_DIR_DELEGATION]	= nfsd4_encode_noop,
+	[OP_GET_DIR_DELEGATION]	= nfsd4_encode_get_dir_delegation,
 #ifdef CONFIG_NFSD_PNFS
 	[OP_GETDEVICEINFO]	= nfsd4_encode_getdeviceinfo,
 	[OP_GETDEVICELIST]	= nfsd4_encode_noop,
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 415516c1b27e..27de75f32dea 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -518,6 +518,13 @@ struct nfsd4_free_stateid {
 	stateid_t	fr_stateid;         /* request */
 };
 
+struct nfsd4_get_dir_delegation {
+	u32		notification_types[1];	/* request */
+	u32		nf_status;		/* response */
+	nfs4_verifier	cookieverf;		/* response */
+	stateid_t	stateid;		/* response */
+};
+
 /* also used for NVERIFY */
 struct nfsd4_verify {
 	u32		ve_bmval[3];        /* request */
@@ -797,6 +804,7 @@ struct nfsd4_op {
 		struct nfsd4_reclaim_complete	reclaim_complete;
 		struct nfsd4_test_stateid	test_stateid;
 		struct nfsd4_free_stateid	free_stateid;
+		struct nfsd4_get_dir_delegation	get_dir_delegation;
 		struct nfsd4_getdeviceinfo	getdeviceinfo;
 		struct nfsd4_layoutget		layoutget;
 		struct nfsd4_layoutcommit	layoutcommit;
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index ef8d2d618d5b..11ad088b411d 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -701,6 +701,11 @@ enum state_protect_how4 {
 	SP4_SSV		= 2
 };
 
+enum get_dir_delegation_non_fatal_res {
+	GDD4_OK		= 0,
+	GDD4_UNAVAIL	= 1
+};
+
 enum pnfs_layouttype {
 	LAYOUT_NFSV4_1_FILES  = 1,
 	LAYOUT_OSD2_OBJECTS = 2,

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 13/24] nfsd: check for delegation conflicts vs. the same client
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (11 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 12/24] nfsd: encoders and decoders for GET_DIR_DELEGATION Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 14/24] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

RFC 8881 requires that the server reply with GDD_UNAVAIL when the client
requests a directory delegation that it already holds.

When setting a directory delegation, check that the client assocated
with the stateid doesn't match an existing delegation. If it does,
reject the setlease attempt.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4state.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c52e807f9672..778c1c6e3b54 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -88,6 +88,7 @@ void nfsd4_end_grace(struct nfsd_net *nn);
 static void _free_cpntf_state_locked(struct nfsd_net *nn, struct nfs4_cpntf_state *cps);
 static void nfsd4_file_hash_remove(struct nfs4_file *fi);
 static void deleg_reaper(struct nfsd_net *nn);
+static bool nfsd_dir_deleg_conflict(struct file_lease *new, struct file_lease *old);
 
 /* Locking: */
 
@@ -5262,6 +5263,31 @@ static const struct lease_manager_operations nfsd_lease_mng_ops = {
 	.lm_change = nfsd_change_deleg_cb,
 };
 
+static const struct lease_manager_operations nfsd_dir_lease_mng_ops = {
+	.lm_breaker_owns_lease = nfsd_breaker_owns_lease,
+	.lm_break = nfsd_break_deleg_cb,
+	.lm_change = nfsd_change_deleg_cb,
+	.lm_set_conflict = nfsd_dir_deleg_conflict,
+};
+
+static bool
+nfsd_dir_deleg_conflict(struct file_lease *new, struct file_lease *old)
+{
+	struct nfs4_delegation *od, *nd;
+
+	/* Only conflicts with other nfsd dir delegs */
+	if (old->fl_lmops != &nfsd_dir_lease_mng_ops)
+		return false;
+
+	od = old->c.flc_owner;
+	nd = new->c.flc_owner;
+
+	/* Are these for the same client? No bueno if so */
+	if (od->dl_stid.sc_client == nd->dl_stid.sc_client)
+		return true;
+	return false;
+}
+
 static __be32 nfsd4_check_seqid(struct nfsd4_compound_state *cstate, struct nfs4_stateowner *so, u32 seqid)
 {
 	if (nfsd4_has_session(cstate))
@@ -5609,12 +5635,13 @@ static struct file_lease *nfs4_alloc_init_lease(struct nfs4_delegation *dp,
 	fl = locks_alloc_lease();
 	if (!fl)
 		return NULL;
-	fl->fl_lmops = &nfsd_lease_mng_ops;
 	fl->c.flc_flags = FL_DELEG;
 	fl->c.flc_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->c.flc_owner = (fl_owner_t)dp;
 	fl->c.flc_pid = current->tgid;
 	fl->c.flc_file = dp->dl_stid.sc_file->fi_deleg_file->nf_file;
+	fl->fl_lmops = S_ISDIR(file_inode(fl->c.flc_file)->i_mode) ?
+				&nfsd_dir_lease_mng_ops : &nfsd_lease_mng_ops;
 	return fl;
 }
 

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 14/24] nfsd: wire up GET_DIR_DELEGATION handling
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (12 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 13/24] nfsd: check for delegation conflicts vs. the same client Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-17 15:42   ` Chuck Lever
  2024-03-15 16:53 ` [PATCH RFC 15/24] nfs: fix nfs_stateid_hash prototype when CONFIG_CRC32 isn't set Jeff Layton
                   ` (9 subsequent siblings)
  23 siblings, 1 reply; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

Add a new routine for acquiring a read delegation on a directory. Since
the same CB_RECALL/DELEGRETURN infrastrure is used for regular and
directory delegations, we can just use a normal nfs4_delegation to
represent it.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4proc.c  | 23 ++++++++++++++++--
 fs/nfsd/nfs4state.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 fs/nfsd/state.h     |  5 ++++
 3 files changed, 94 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 7973fe17bf3c..1a2c90f4ea53 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2179,9 +2179,28 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
 			 union nfsd4_op_u *u)
 {
 	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
+	struct nfs4_delegation *dd;
+	struct nfsd_file *nf;
+	__be32 status;
+
+	status = nfsd_file_acquire_dir(rqstp, &cstate->current_fh, &nf);
+	if (status != nfs_ok)
+		return status;
+
+	dd = nfsd_get_dir_deleg(cstate, gdd, nf);
+	if (IS_ERR(dd)) {
+		int err = PTR_ERR(dd);
+
+		if (err != -EAGAIN)
+			return nfserrno(err);
+		gdd->nf_status = GDD4_UNAVAIL;
+		return nfs_ok;
+	}
 
-	/* FIXME: actually return a delegation */
-	gdd->nf_status = GDD4_UNAVAIL;
+	gdd->nf_status = GDD4_OK;
+	memcpy(&gdd->stateid, &dd->dl_stid.sc_stateid, sizeof(gdd->stateid));
+	memset(&gdd->cookieverf, '\0', sizeof(gdd->cookieverf));
+	nfs4_put_stid(&dd->dl_stid);
 	return nfs_ok;
 }
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 778c1c6e3b54..36574aedc211 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -8874,7 +8874,7 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode,
 			}
 break_lease:
 			nfsd_stats_wdeleg_getattr_inc(nn);
-			dp = fl->fl_owner;
+			dp = fl->c.flc_owner;
 			ncf = &dp->dl_cb_fattr;
 			nfs4_cb_getattr(&dp->dl_cb_fattr);
 			spin_unlock(&ctx->flc_lock);
@@ -8912,3 +8912,70 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode,
 	spin_unlock(&ctx->flc_lock);
 	return 0;
 }
+
+struct nfs4_delegation *
+nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
+		   struct nfsd4_get_dir_delegation *gdd,
+		   struct nfsd_file *nf)
+{
+	struct nfs4_client *clp = cstate->clp;
+	struct nfs4_delegation *dp;
+	struct file_lease *fl;
+	struct nfs4_file *fp;
+	int status = 0;
+
+	fp = nfsd4_alloc_file();
+	if (!fp)
+		return ERR_PTR(-ENOMEM);
+
+	nfsd4_file_init(&cstate->current_fh, fp);
+	fp->fi_deleg_file = nf;
+	fp->fi_delegees = 1;
+
+	spin_lock(&state_lock);
+	spin_lock(&fp->fi_lock);
+	if (nfs4_delegation_exists(clp, fp))
+		status = -EAGAIN;
+	spin_unlock(&fp->fi_lock);
+	spin_unlock(&state_lock);
+
+	if (status)
+		goto out_delegees;
+
+	status = -ENOMEM;
+	dp = alloc_init_deleg(clp, fp, NULL, NFS4_OPEN_DELEGATE_READ);
+	if (!dp)
+		goto out_delegees;
+
+	fl = nfs4_alloc_init_lease(dp, NFS4_OPEN_DELEGATE_READ);
+	if (!fl)
+		goto out_put_stid;
+
+	status = kernel_setlease(nf->nf_file,
+				 fl->c.flc_type, &fl, NULL);
+	if (fl)
+		locks_free_lease(fl);
+	if (status)
+		goto out_put_stid;
+
+	spin_lock(&state_lock);
+	spin_lock(&clp->cl_lock);
+	spin_lock(&fp->fi_lock);
+	status = hash_delegation_locked(dp, fp);
+	spin_unlock(&fp->fi_lock);
+	spin_unlock(&clp->cl_lock);
+	spin_unlock(&state_lock);
+
+	if (status)
+		goto out_unlock;
+
+	return dp;
+out_unlock:
+	kernel_setlease(fp->fi_deleg_file->nf_file, F_UNLCK, NULL, (void **)&dp);
+out_put_stid:
+	nfs4_put_stid(&dp->dl_stid);
+out_delegees:
+	put_deleg_file(fp);
+	return ERR_PTR(status);
+}
+
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 01c6f3445646..20551483cc7b 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -782,4 +782,9 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
 
 extern __be32 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp,
 		struct inode *inode, bool *file_modified, u64 *size);
+
+struct nfsd4_get_dir_delegation;
+struct nfs4_delegation *nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
+						struct nfsd4_get_dir_delegation *gdd,
+						struct nfsd_file *nf);
 #endif   /* NFSD4_STATE_H */

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 15/24] nfs: fix nfs_stateid_hash prototype when CONFIG_CRC32 isn't set
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (13 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 14/24] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 16/24] nfs: remove unused NFS_CALL macro Jeff Layton
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

The alternate version without CONFIG_CRC32 should have the argument as
const.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index e3722ce6722e..9203b3bb78b3 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -868,7 +868,7 @@ static inline u32 nfs_stateid_hash(const nfs4_stateid *stateid)
 				NFS4_STATEID_OTHER_SIZE);
 }
 #else
-static inline u32 nfs_stateid_hash(nfs4_stateid *stateid)
+static inline u32 nfs_stateid_hash(const nfs4_stateid *stateid)
 {
 	return 0;
 }

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 16/24] nfs: remove unused NFS_CALL macro
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (14 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 15/24] nfs: fix nfs_stateid_hash prototype when CONFIG_CRC32 isn't set Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 17/24] nfs: add cache_validity to the nfs_inode_event tracepoints Jeff Layton
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

Nothing uses this, and thank goodness as the syntax is horrid.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/nfs_xdr.h | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 539b57fbf3ce..d09b9773b20c 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1820,13 +1820,6 @@ struct nfs_rpc_ops {
 	void	(*disable_swap)(struct inode *inode);
 };
 
-/*
- * 	NFS_CALL(getattr, inode, (fattr));
- * into
- *	NFS_PROTO(inode)->getattr(fattr);
- */
-#define NFS_CALL(op, inode, args)	NFS_PROTO(inode)->op args
-
 /*
  * Function vectors etc. for the NFS client
  */

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 17/24] nfs: add cache_validity to the nfs_inode_event tracepoints
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (15 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 16/24] nfs: remove unused NFS_CALL macro Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 18/24] nfs: add a tracepoint to nfs_inode_detach_delegation_locked Jeff Layton
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

Managing the cache_validity flags is the deep voodoo of NFS cache
coherency. Let's have a little extra visibility into that value via the
nfs_inode_event tracepoints.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/nfstrace.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index afedb449b54f..e0cd3601d1f7 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -56,6 +56,7 @@ DECLARE_EVENT_CLASS(nfs_inode_event,
 			__field(u32, fhandle)
 			__field(u64, fileid)
 			__field(u64, version)
+			__field(unsigned long, cache_validity)
 		),
 
 		TP_fast_assign(
@@ -64,14 +65,17 @@ DECLARE_EVENT_CLASS(nfs_inode_event,
 			__entry->fileid = nfsi->fileid;
 			__entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
 			__entry->version = inode_peek_iversion_raw(inode);
+			__entry->cache_validity = nfsi->cache_validity;
 		),
 
 		TP_printk(
-			"fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu ",
+			"fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu cache_validity=0x%lx (%s)",
 			MAJOR(__entry->dev), MINOR(__entry->dev),
 			(unsigned long long)__entry->fileid,
 			__entry->fhandle,
-			(unsigned long long)__entry->version
+			(unsigned long long)__entry->version,
+			__entry->cache_validity,
+			nfs_show_cache_validity(__entry->cache_validity)
 		)
 );
 

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 18/24] nfs: add a tracepoint to nfs_inode_detach_delegation_locked
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (16 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 17/24] nfs: add cache_validity to the nfs_inode_event tracepoints Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 19/24] nfs: new tracepoint in nfs_delegation_need_return Jeff Layton
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

We have a tracepoint for setting a delegation and reclaiming them. Add a
tracepoint for when the delegation is being detached from the inode.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/delegation.c | 2 ++
 fs/nfs/nfs4trace.h  | 1 +
 2 files changed, 3 insertions(+)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index d4a42ce0c7e3..a331a2dbae12 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -342,6 +342,8 @@ nfs_detach_delegation_locked(struct nfs_inode *nfsi,
 		rcu_dereference_protected(nfsi->delegation,
 				lockdep_is_held(&clp->cl_lock));
 
+	trace_nfs4_detach_delegation(&nfsi->vfs_inode, delegation->type);
+
 	if (deleg_cur == NULL || delegation != deleg_cur)
 		return NULL;
 
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index fd7cb15b08b2..b4ebac1961cc 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -926,6 +926,7 @@ DECLARE_EVENT_CLASS(nfs4_set_delegation_event,
 			TP_ARGS(inode, fmode))
 DEFINE_NFS4_SET_DELEGATION_EVENT(nfs4_set_delegation);
 DEFINE_NFS4_SET_DELEGATION_EVENT(nfs4_reclaim_delegation);
+DEFINE_NFS4_SET_DELEGATION_EVENT(nfs4_detach_delegation);
 
 TRACE_EVENT(nfs4_delegreturn_exit,
 		TP_PROTO(

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 19/24] nfs: new tracepoint in nfs_delegation_need_return
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (17 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 18/24] nfs: add a tracepoint to nfs_inode_detach_delegation_locked Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 20/24] nfs: new tracepoint in match_stateid operation Jeff Layton
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

Add a tracepoint in the function that decides whether to return a
delegation to the server.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/delegation.c |  2 ++
 fs/nfs/nfs4trace.h  | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index a331a2dbae12..88dbec4739d3 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -571,6 +571,8 @@ static bool nfs_delegation_need_return(struct nfs_delegation *delegation)
 {
 	bool ret = false;
 
+	trace_nfs_delegation_need_return(delegation);
+
 	if (test_and_clear_bit(NFS_DELEGATION_RETURN, &delegation->flags))
 		ret = true;
 	else if (test_bit(NFS_DELEGATION_RETURN_IF_CLOSED, &delegation->flags)) {
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index b4ebac1961cc..e5adfb755dc7 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -14,6 +14,8 @@
 #include <trace/misc/fs.h>
 #include <trace/misc/nfs.h>
 
+#include "delegation.h"
+
 #define show_nfs_fattr_flags(valid) \
 	__print_flags((unsigned long)valid, "|", \
 		{ NFS_ATTR_FATTR_TYPE, "TYPE" }, \
@@ -928,6 +930,51 @@ DEFINE_NFS4_SET_DELEGATION_EVENT(nfs4_set_delegation);
 DEFINE_NFS4_SET_DELEGATION_EVENT(nfs4_reclaim_delegation);
 DEFINE_NFS4_SET_DELEGATION_EVENT(nfs4_detach_delegation);
 
+#define show_delegation_flags(flags) \
+	__print_flags(flags, "|", \
+		{ BIT(NFS_DELEGATION_NEED_RECLAIM), "NEED_RECLAIM" }, \
+		{ BIT(NFS_DELEGATION_RETURN), "RETURN" }, \
+		{ BIT(NFS_DELEGATION_RETURN_IF_CLOSED), "RETURN_IF_CLOSED" }, \
+		{ BIT(NFS_DELEGATION_REFERENCED), "REFERENCED" }, \
+		{ BIT(NFS_DELEGATION_RETURNING), "RETURNING" }, \
+		{ BIT(NFS_DELEGATION_REVOKED), "REVOKED" }, \
+		{ BIT(NFS_DELEGATION_TEST_EXPIRED), "TEST_EXPIRED" }, \
+		{ BIT(NFS_DELEGATION_INODE_FREEING), "INODE_FREEING" }, \
+		{ BIT(NFS_DELEGATION_RETURN_DELAYED), "RETURN_DELAYED" })
+
+DECLARE_EVENT_CLASS(nfs4_delegation_event,
+		TP_PROTO(
+			const struct nfs_delegation *delegation
+		),
+
+		TP_ARGS(delegation),
+
+		TP_STRUCT__entry(
+			__field(u32, fhandle)
+			__field(unsigned int, fmode)
+			__field(unsigned long, flags)
+		),
+
+		TP_fast_assign(
+			__entry->fhandle = nfs_fhandle_hash(NFS_FH(delegation->inode));
+			__entry->fmode = delegation->type;
+			__entry->flags = delegation->flags;
+		),
+
+		TP_printk(
+			"fhandle=0x%08x fmode=%s flags=%s",
+			__entry->fhandle, show_fs_fmode_flags(__entry->fmode),
+			show_delegation_flags(__entry->flags)
+		)
+);
+#define DEFINE_NFS4_DELEGATION_EVENT(name) \
+	DEFINE_EVENT(nfs4_delegation_event, name, \
+			TP_PROTO( \
+				const struct nfs_delegation *delegation \
+			), \
+			TP_ARGS(delegation))
+DEFINE_NFS4_DELEGATION_EVENT(nfs_delegation_need_return);
+
 TRACE_EVENT(nfs4_delegreturn_exit,
 		TP_PROTO(
 			const struct nfs4_delegreturnargs *args,

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 20/24] nfs: new tracepoint in match_stateid operation
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (18 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 19/24] nfs: new tracepoint in nfs_delegation_need_return Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 21/24] nfs: add a GDD_GETATTR rpc operation Jeff Layton
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

Add new tracepoints in the NFSv4 match_stateid minorversion op that show
the info in both stateids.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/nfs4proc.c  |  4 ++++
 fs/nfs/nfs4trace.h | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 815996cb27fc..d518388bd0d6 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -10432,6 +10432,8 @@ nfs41_free_lock_state(struct nfs_server *server, struct nfs4_lock_state *lsp)
 static bool nfs41_match_stateid(const nfs4_stateid *s1,
 		const nfs4_stateid *s2)
 {
+	trace_nfs41_match_stateid(s1, s2);
+
 	if (s1->type != s2->type)
 		return false;
 
@@ -10449,6 +10451,8 @@ static bool nfs41_match_stateid(const nfs4_stateid *s1,
 static bool nfs4_match_stateid(const nfs4_stateid *s1,
 		const nfs4_stateid *s2)
 {
+	trace_nfs4_match_stateid(s1, s2);
+
 	return nfs4_stateid_match(s1, s2);
 }
 
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index e5adfb755dc7..f531728a5b4a 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -1467,6 +1467,62 @@ DECLARE_EVENT_CLASS(nfs4_inode_stateid_callback_event,
 DEFINE_NFS4_INODE_STATEID_CALLBACK_EVENT(nfs4_cb_recall);
 DEFINE_NFS4_INODE_STATEID_CALLBACK_EVENT(nfs4_cb_layoutrecall_file);
 
+#define show_stateid_type(type) \
+	__print_symbolic(type, \
+		{ NFS4_INVALID_STATEID_TYPE, "INVALID" }, \
+		{ NFS4_SPECIAL_STATEID_TYPE, "SPECIAL" }, \
+		{ NFS4_OPEN_STATEID_TYPE, "OPEN" }, \
+		{ NFS4_LOCK_STATEID_TYPE, "LOCK" }, \
+		{ NFS4_DELEGATION_STATEID_TYPE, "DELEGATION" }, \
+		{ NFS4_LAYOUT_STATEID_TYPE, "LAYOUT" },	\
+		{ NFS4_PNFS_DS_STATEID_TYPE, "PNFS_DS" }, \
+		{ NFS4_REVOKED_STATEID_TYPE, "REVOKED" })
+
+DECLARE_EVENT_CLASS(nfs4_match_stateid_event,
+		TP_PROTO(
+			const nfs4_stateid *s1,
+			const nfs4_stateid *s2
+		),
+
+		TP_ARGS(s1, s2),
+
+		TP_STRUCT__entry(
+			__field(int, s1_seq)
+			__field(int, s2_seq)
+			__field(u32, s1_hash)
+			__field(u32, s2_hash)
+			__field(int, s1_type)
+			__field(int, s2_type)
+		),
+
+		TP_fast_assign(
+			__entry->s1_seq = s1->seqid;
+			__entry->s1_hash = nfs_stateid_hash(s1);
+			__entry->s1_type = s1->type;
+			__entry->s2_seq = s2->seqid;
+			__entry->s2_hash = nfs_stateid_hash(s2);
+			__entry->s2_type = s2->type;
+		),
+
+		TP_printk(
+			"s1=%s:%x:%u s2=%s:%x:%u",
+			show_stateid_type(__entry->s1_type),
+			__entry->s1_hash, __entry->s1_seq,
+			show_stateid_type(__entry->s2_type),
+			__entry->s2_hash, __entry->s2_seq
+		)
+);
+
+#define DEFINE_NFS4_MATCH_STATEID_EVENT(name) \
+	DEFINE_EVENT(nfs4_match_stateid_event, name, \
+			TP_PROTO( \
+				const nfs4_stateid *s1, \
+				const nfs4_stateid *s2 \
+			), \
+			TP_ARGS(s1, s2))
+DEFINE_NFS4_MATCH_STATEID_EVENT(nfs41_match_stateid);
+DEFINE_NFS4_MATCH_STATEID_EVENT(nfs4_match_stateid);
+
 DECLARE_EVENT_CLASS(nfs4_idmap_event,
 		TP_PROTO(
 			const char *name,

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 21/24] nfs: add a GDD_GETATTR rpc operation
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (19 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 20/24] nfs: new tracepoint in match_stateid operation Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 20:50   ` Anna Schumaker
  2024-03-15 16:53 ` [PATCH RFC 22/24] nfs: skip dentry revalidation when parent dir has a delegation Jeff Layton
                   ` (2 subsequent siblings)
  23 siblings, 1 reply; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

Add a new compound that does a GET_DIR_DELEGATION just before doing a
GETATTR on an inode. Add a delegation stateid and a nf_status code to
struct nfs4_getattr_res to store the result.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/nfs4xdr.c        | 136 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/nfs4.h    |   1 +
 include/linux/nfs_xdr.h |   2 +
 3 files changed, 139 insertions(+)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 1416099dfcd1..c28025018bda 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -391,6 +391,22 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
 				XDR_QUADLEN(NFS4_MAX_SESSIONID_LEN) + 5)
 #define encode_reclaim_complete_maxsz	(op_encode_hdr_maxsz + 4)
 #define decode_reclaim_complete_maxsz	(op_decode_hdr_maxsz + 4)
+#define encode_get_dir_delegation_maxsz (op_encode_hdr_maxsz +				\
+					 4 /* gdda_signal_deleg_avail */ +		\
+					 8 /* gdda_notification_types */ +		\
+					 nfstime4_maxsz  /* gdda_child_attr_delay */ +	\
+					 nfstime4_maxsz  /* gdda_dir_attr_delay */ +	\
+					 nfs4_fattr_bitmap_maxsz /* gdda_child_attributes */ + \
+					 nfs4_fattr_bitmap_maxsz /* gdda_dir_attributes */)
+
+#define decode_get_dir_delegation_maxsz (op_encode_hdr_maxsz +				\
+					 4 /* gddrnf_status */ +			\
+					 encode_verifier_maxsz /* gddr_cookieverf */ +	\
+					 encode_stateid_maxsz /* gddr_stateid */ +	\
+					 8 /* gddr_notification */ +			\
+					 nfs4_fattr_bitmap_maxsz /* gddr_child_attributes */ + \
+					 nfs4_fattr_bitmap_maxsz /* gddr_dir_attributes */)
+
 #define encode_getdeviceinfo_maxsz (op_encode_hdr_maxsz + \
 				XDR_QUADLEN(NFS4_DEVICEID4_SIZE) + \
 				1 /* layout type */ + \
@@ -636,6 +652,18 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
 				decode_putfh_maxsz + \
 				decode_getattr_maxsz + \
 				decode_renew_maxsz)
+#define NFS4_enc_gdd_getattr_sz	(compound_encode_hdr_maxsz + \
+				encode_sequence_maxsz + \
+				encode_putfh_maxsz + \
+				encode_get_dir_delegation_maxsz + \
+				encode_getattr_maxsz + \
+				encode_renew_maxsz)
+#define NFS4_dec_gdd_getattr_sz	(compound_decode_hdr_maxsz + \
+				decode_sequence_maxsz + \
+				decode_putfh_maxsz + \
+				decode_get_dir_delegation_maxsz + \
+				decode_getattr_maxsz + \
+				decode_renew_maxsz)
 #define NFS4_enc_lookup_sz	(compound_encode_hdr_maxsz + \
 				encode_sequence_maxsz + \
 				encode_putfh_maxsz + \
@@ -1981,6 +2009,30 @@ static void encode_sequence(struct xdr_stream *xdr,
 }
 
 #ifdef CONFIG_NFS_V4_1
+static void
+encode_get_dir_delegation(struct xdr_stream *xdr, struct compound_hdr *hdr)
+{
+	__be32 *p;
+	struct timespec64 ts = {};
+	u32 zerobm[1] = {};
+
+	encode_op_hdr(xdr, OP_GET_DIR_DELEGATION, decode_get_dir_delegation_maxsz, hdr);
+
+	/* We can't handle CB_RECALLABLE_OBJ_AVAIL yet */
+	xdr_stream_encode_bool(xdr, false);
+
+	/* for now, we request no notification types */
+	xdr_encode_bitmap4(xdr, zerobm, ARRAY_SIZE(zerobm));
+
+	/* Request no attribute updates */
+	p = reserve_space(xdr, 12 + 12);
+	p = xdr_encode_nfstime4(p, &ts);
+	xdr_encode_nfstime4(p, &ts);
+
+	xdr_encode_bitmap4(xdr, zerobm, ARRAY_SIZE(zerobm));
+	xdr_encode_bitmap4(xdr, zerobm, ARRAY_SIZE(zerobm));
+}
+
 static void
 encode_getdeviceinfo(struct xdr_stream *xdr,
 		     const struct nfs4_getdeviceinfo_args *args,
@@ -2334,6 +2386,25 @@ static void nfs4_xdr_enc_getattr(struct rpc_rqst *req, struct xdr_stream *xdr,
 	encode_nops(&hdr);
 }
 
+/*
+ * Encode GDD_GETATTR request
+ */
+static void nfs4_xdr_enc_gdd_getattr(struct rpc_rqst *req, struct xdr_stream *xdr,
+				     const void *data)
+{
+	const struct nfs4_getattr_arg *args = data;
+	struct compound_hdr hdr = {
+		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
+	};
+
+	encode_compound_hdr(xdr, req, &hdr);
+	encode_sequence(xdr, &args->seq_args, &hdr);
+	encode_putfh(xdr, args->fh, &hdr);
+	encode_get_dir_delegation(xdr, &hdr);
+	encode_getfattr(xdr, args->bitmask, &hdr);
+	encode_nops(&hdr);
+}
+
 /*
  * Encode a CLOSE request
  */
@@ -5919,6 +5990,43 @@ static int decode_layout_stateid(struct xdr_stream *xdr, nfs4_stateid *stateid)
 	return decode_stateid(xdr, stateid);
 }
 
+static int decode_get_dir_delegation(struct xdr_stream *xdr,
+				     struct nfs4_getattr_res *res)
+{
+	nfs4_verifier	cookieverf;
+	int		status;
+	u32		bm[1];
+
+	status = decode_op_hdr(xdr, OP_GET_DIR_DELEGATION);
+	if (status)
+		return status;
+
+	if (xdr_stream_decode_u32(xdr, &res->nf_status))
+		return -EIO;
+
+	if (res->nf_status == GDD4_UNAVAIL)
+		return xdr_inline_decode(xdr, 4) ? 0 : -EIO;
+
+	status = decode_verifier(xdr, &cookieverf);
+	if (status)
+		return status;
+
+	status = decode_delegation_stateid(xdr, &res->deleg);
+	if (status)
+		return status;
+
+	status = decode_bitmap4(xdr, bm, ARRAY_SIZE(bm));
+	if (status < 0)
+		return status;
+	status = decode_bitmap4(xdr, bm, ARRAY_SIZE(bm));
+	if (status < 0)
+		return status;
+	status = decode_bitmap4(xdr, bm, ARRAY_SIZE(bm));
+	if (status < 0)
+		return status;
+	return 0;
+}
+
 static int decode_getdeviceinfo(struct xdr_stream *xdr,
 				struct nfs4_getdeviceinfo_res *res)
 {
@@ -6455,6 +6563,33 @@ static int nfs4_xdr_dec_getattr(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
 	return status;
 }
 
+/*
+ * Decode GDD_GETATTR response
+ */
+static int nfs4_xdr_dec_gdd_getattr(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
+				    void *data)
+{
+	struct nfs4_getattr_res *res = data;
+	struct compound_hdr hdr;
+	int status;
+
+	status = decode_compound_hdr(xdr, &hdr);
+	if (status)
+		goto out;
+	status = decode_sequence(xdr, &res->seq_res, rqstp);
+	if (status)
+		goto out;
+	status = decode_putfh(xdr);
+	if (status)
+		goto out;
+	status = decode_get_dir_delegation(xdr, res);
+	if (status)
+		goto out;
+	status = decode_getfattr(xdr, res->fattr, res->server);
+out:
+	return status;
+}
+
 /*
  * Encode an SETACL request
  */
@@ -7704,6 +7839,7 @@ const struct rpc_procinfo nfs4_procedures[] = {
 	PROC41(BIND_CONN_TO_SESSION,
 			enc_bind_conn_to_session, dec_bind_conn_to_session),
 	PROC41(DESTROY_CLIENTID,enc_destroy_clientid,	dec_destroy_clientid),
+	PROC41(GDD_GETATTR,	enc_gdd_getattr,	dec_gdd_getattr),
 	PROC42(SEEK,		enc_seek,		dec_seek),
 	PROC42(ALLOCATE,	enc_allocate,		dec_allocate),
 	PROC42(DEALLOCATE,	enc_deallocate,		dec_deallocate),
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 11ad088b411d..86cbfd50ecd1 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -681,6 +681,7 @@ enum {
 	NFSPROC4_CLNT_LISTXATTRS,
 	NFSPROC4_CLNT_REMOVEXATTR,
 	NFSPROC4_CLNT_READ_PLUS,
+	NFSPROC4_CLNT_GDD_GETATTR,
 };
 
 /* nfs41 types */
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index d09b9773b20c..85ee37ccc25e 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1072,6 +1072,8 @@ struct nfs4_getattr_res {
 	struct nfs4_sequence_res	seq_res;
 	const struct nfs_server *	server;
 	struct nfs_fattr *		fattr;
+	nfs4_stateid			deleg;
+	u32				nf_status;
 };
 
 struct nfs4_link_arg {

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 22/24] nfs: skip dentry revalidation when parent dir has a delegation
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (20 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 21/24] nfs: add a GDD_GETATTR rpc operation Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 23/24] nfs: optionally request a delegation on GETATTR Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 24/24] nfs: add a module parameter to disable directory delegations Jeff Layton
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

If the parent has a valid delegation, then test whether the
dentry->d_time matches the current change attr on the directory. If it
does, then we can declare the dentry valid with no further checks.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/dir.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index ac505671efbd..061648c73116 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2188,6 +2188,15 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
 }
 EXPORT_SYMBOL_GPL(nfs_atomic_open);
 
+static int
+nfs_lookup_revalidate_delegated_parent(struct inode *dir, struct dentry *dentry,
+				       struct inode *inode)
+{
+	return nfs_lookup_revalidate_done(dir, dentry, inode,
+					  nfs_verify_change_attribute(dir, dentry->d_time) ?
+					  1 : 0);
+}
+
 static int
 nfs4_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
 			  unsigned int flags)
@@ -2212,6 +2221,9 @@ nfs4_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
 	if (nfs_verifier_is_delegated(dentry))
 		return nfs_lookup_revalidate_delegated(dir, dentry, inode);
 
+	if (nfs_have_delegated_attributes(dir))
+		return nfs_lookup_revalidate_delegated_parent(dir, dentry, inode);
+
 	/* NFS only supports OPEN on regular files */
 	if (!S_ISREG(inode->i_mode))
 		goto full_reval;

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 23/24] nfs: optionally request a delegation on GETATTR
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (21 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 22/24] nfs: skip dentry revalidation when parent dir has a delegation Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  2024-03-15 16:53 ` [PATCH RFC 24/24] nfs: add a module parameter to disable directory delegations Jeff Layton
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

We expect directory delegations to most helpful when their children have
frequently revalidated dentries. The way we usually revalidate them is
to check that the dentry's d_time field matches the change attribute on
the parent. Most dentry revalidations that have to issue an RPC end up
as a GETATTR on the wire for the parent, if the parent hasn't changed,
then the dentry is fine.

Add a new NFS_INO_GDD_GETATTR flag to the nfsi->flags field. Whenever we
validate a dentry vs. its parent, set that flag on the parent.  Then,
when we next do a GETATTR vs. the parent, test and clear the flag and do
a GDD_GETATTR if it was set. If the the GET_DIR_DELEGATION succeeds, set
the delegation in the inode.

Finally, a lot of servers don't support GET_DIR_DELEGATION, so add a
NFS_CAP_* bit to track whether the server supports it, and enable it by
default on v4.1+, and retry without a delegation if we get back an
ENOTSUPP.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/delegation.c       |  1 +
 fs/nfs/dir.c              |  8 ++++++++
 fs/nfs/nfs4proc.c         | 45 ++++++++++++++++++++++++++++++++++++++++++---
 include/linux/nfs_fs.h    |  1 +
 include/linux/nfs_fs_sb.h |  1 +
 5 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 88dbec4739d3..14c14bd03465 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -356,6 +356,7 @@ nfs_detach_delegation_locked(struct nfs_inode *nfsi,
 	delegation->inode = NULL;
 	rcu_assign_pointer(nfsi->delegation, NULL);
 	spin_unlock(&delegation->lock);
+	clear_bit(NFS_INO_GDD_GETATTR, &nfsi->flags);
 	return delegation;
 }
 
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 061648c73116..930fe7e14914 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1504,12 +1504,20 @@ static int nfs_dentry_verify_change(struct inode *dir, struct dentry *dentry)
 static int nfs_check_verifier(struct inode *dir, struct dentry *dentry,
 			      int rcu_walk)
 {
+	struct nfs_inode *nfsi = NFS_I(dir);
+
 	if (IS_ROOT(dentry))
 		return 1;
 	if (NFS_SERVER(dir)->flags & NFS_MOUNT_LOOKUP_CACHE_NONE)
 		return 0;
 	if (!nfs_dentry_verify_change(dir, dentry))
 		return 0;
+	/*
+	 * The dentry matches the directory's change attribute, so
+	 * we're likely revalidating here. Flag the dir so that we
+	 * ask for a delegation on the next getattr.
+	 */
+	set_bit(NFS_INO_GDD_GETATTR, &nfsi->flags);
 	/* Revalidate nfsi->cache_change_attribute before we declare a match */
 	if (nfs_mapping_need_revalidate_inode(dir)) {
 		if (rcu_walk)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index d518388bd0d6..3dbe9a18c9be 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4318,6 +4318,21 @@ static int nfs4_get_referral(struct rpc_clnt *client, struct inode *dir,
 	return status;
 }
 
+static bool should_request_dir_deleg(struct inode *inode)
+{
+	if (!inode)
+		return false;
+	if (!S_ISDIR(inode->i_mode))
+		return false;
+	if (!nfs_server_capable(inode, NFS_CAP_GET_DIR_DELEG))
+		return false;
+	if (!test_and_clear_bit(NFS_INO_GDD_GETATTR, &(NFS_I(inode)->flags)))
+		return false;
+	if (nfs4_have_delegation(inode, FMODE_READ))
+		return false;
+	return true;
+}
+
 static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
 				struct nfs_fattr *fattr, struct inode *inode)
 {
@@ -4331,11 +4346,12 @@ static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
 		.server = server,
 	};
 	struct rpc_message msg = {
-		.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_GETATTR],
 		.rpc_argp = &args,
 		.rpc_resp = &res,
 	};
 	unsigned short task_flags = 0;
+	bool gdd;
+	int status;
 
 	if (nfs4_has_session(server->nfs_client))
 		task_flags = RPC_TASK_MOVEABLE;
@@ -4344,11 +4360,32 @@ static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
 	if (inode && (server->flags & NFS_MOUNT_SOFTREVAL))
 		task_flags |= RPC_TASK_TIMEOUT;
 
+retry:
+	gdd = should_request_dir_deleg(inode);
+	if (gdd)
+		msg.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_GDD_GETATTR];
+	else
+		msg.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_GETATTR];
+
 	nfs4_bitmap_copy_adjust(bitmask, nfs4_bitmask(server, fattr->label), inode, 0);
 	nfs_fattr_init(fattr);
 	nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 0);
-	return nfs4_do_call_sync(server->client, server, &msg,
-			&args.seq_args, &res.seq_res, task_flags);
+	status = nfs4_do_call_sync(server->client, server, &msg,
+				   &args.seq_args, &res.seq_res, task_flags);
+	switch (status) {
+	case 0:
+		if (gdd && res.nf_status == GDD4_OK)
+			status = nfs_inode_set_delegation(inode, current_cred(), FMODE_READ,
+							  &res.deleg, 0);
+		break;
+	case -ENOTSUPP:
+		/* If the server doesn't support GET_DIR_DELEGATION, retry without it */
+		if (gdd) {
+			server->caps &= ~NFS_CAP_GET_DIR_DELEG;
+			goto retry;
+		}
+	}
+	return status;
 }
 
 int nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
@@ -10552,6 +10589,7 @@ static const struct nfs4_minor_version_ops nfs_v4_1_minor_ops = {
 	.minor_version = 1,
 	.init_caps = NFS_CAP_READDIRPLUS
 		| NFS_CAP_ATOMIC_OPEN
+		| NFS_CAP_GET_DIR_DELEG
 		| NFS_CAP_POSIX_LOCK
 		| NFS_CAP_STATEID_NFSV41
 		| NFS_CAP_ATOMIC_OPEN_V1
@@ -10578,6 +10616,7 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
 	.minor_version = 2,
 	.init_caps = NFS_CAP_READDIRPLUS
 		| NFS_CAP_ATOMIC_OPEN
+		| NFS_CAP_GET_DIR_DELEG
 		| NFS_CAP_POSIX_LOCK
 		| NFS_CAP_STATEID_NFSV41
 		| NFS_CAP_ATOMIC_OPEN_V1
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index f5ce7b101146..5140e4dac98a 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -318,6 +318,7 @@ struct nfs4_copy_state {
 #define NFS_INO_LAYOUTCOMMITTING (10)		/* layoutcommit inflight */
 #define NFS_INO_LAYOUTSTATS	(11)		/* layoutstats inflight */
 #define NFS_INO_ODIRECT		(12)		/* I/O setting is O_DIRECT */
+#define NFS_INO_GDD_GETATTR	(13)		/* Ask for dir deleg on next GETATTR */
 
 static inline struct nfs_inode *NFS_I(const struct inode *inode)
 {
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 92de074e63b9..0ab744cf52d7 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -278,6 +278,7 @@ struct nfs_server {
 #define NFS_CAP_LGOPEN		(1U << 5)
 #define NFS_CAP_CASE_INSENSITIVE	(1U << 6)
 #define NFS_CAP_CASE_PRESERVING	(1U << 7)
+#define NFS_CAP_GET_DIR_DELEG	(1U << 13)
 #define NFS_CAP_POSIX_LOCK	(1U << 14)
 #define NFS_CAP_UIDGID_NOMAP	(1U << 15)
 #define NFS_CAP_STATEID_NFSV41	(1U << 16)

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH RFC 24/24] nfs: add a module parameter to disable directory delegations
  2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (22 preceding siblings ...)
  2024-03-15 16:53 ` [PATCH RFC 23/24] nfs: optionally request a delegation on GETATTR Jeff Layton
@ 2024-03-15 16:53 ` Jeff Layton
  23 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 16:53 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev,
	Jeff Layton

For testing purposes, add a module parameter that will prevent the
client from requesting further directory delegations.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/nfs4proc.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 3dbe9a18c9be..a85a594cad88 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4318,8 +4318,14 @@ static int nfs4_get_referral(struct rpc_clnt *client, struct inode *dir,
 	return status;
 }
 
+static bool nfs_dir_delegation_enabled = true;
+module_param(nfs_dir_delegation_enabled, bool, 0644);
+MODULE_PARM_DESC(nfs_dir_delegation_enabled, "Enable directory delegations?");
+
 static bool should_request_dir_deleg(struct inode *inode)
 {
+	if (!nfs_dir_delegation_enabled)
+		return false;
 	if (!inode)
 		return false;
 	if (!S_ISDIR(inode->i_mode))

-- 
2.44.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 21/24] nfs: add a GDD_GETATTR rpc operation
  2024-03-15 16:53 ` [PATCH RFC 21/24] nfs: add a GDD_GETATTR rpc operation Jeff Layton
@ 2024-03-15 20:50   ` Anna Schumaker
  2024-03-15 21:29     ` Jeff Layton
  0 siblings, 1 reply; 47+ messages in thread
From: Anna Schumaker @ 2024-03-15 20:50 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

Hi Jeff,

On Fri, Mar 15, 2024 at 12:54 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> Add a new compound that does a GET_DIR_DELEGATION just before doing a
> GETATTR on an inode. Add a delegation stateid and a nf_status code to
> struct nfs4_getattr_res to store the result.
>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/nfs/nfs4xdr.c        | 136 ++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/nfs4.h    |   1 +
>  include/linux/nfs_xdr.h |   2 +
>  3 files changed, 139 insertions(+)
>
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index 1416099dfcd1..c28025018bda 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -391,6 +391,22 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
>                                 XDR_QUADLEN(NFS4_MAX_SESSIONID_LEN) + 5)
>  #define encode_reclaim_complete_maxsz  (op_encode_hdr_maxsz + 4)
>  #define decode_reclaim_complete_maxsz  (op_decode_hdr_maxsz + 4)
> +#define encode_get_dir_delegation_maxsz (op_encode_hdr_maxsz +                         \
> +                                        4 /* gdda_signal_deleg_avail */ +              \
> +                                        8 /* gdda_notification_types */ +              \
> +                                        nfstime4_maxsz  /* gdda_child_attr_delay */ +  \
> +                                        nfstime4_maxsz  /* gdda_dir_attr_delay */ +    \
> +                                        nfs4_fattr_bitmap_maxsz /* gdda_child_attributes */ + \
> +                                        nfs4_fattr_bitmap_maxsz /* gdda_dir_attributes */)
> +
> +#define decode_get_dir_delegation_maxsz (op_encode_hdr_maxsz +                         \
> +                                        4 /* gddrnf_status */ +                        \
> +                                        encode_verifier_maxsz /* gddr_cookieverf */ +  \
> +                                        encode_stateid_maxsz /* gddr_stateid */ +      \
> +                                        8 /* gddr_notification */ +                    \
> +                                        nfs4_fattr_bitmap_maxsz /* gddr_child_attributes */ + \
> +                                        nfs4_fattr_bitmap_maxsz /* gddr_dir_attributes */)
> +
>  #define encode_getdeviceinfo_maxsz (op_encode_hdr_maxsz + \
>                                 XDR_QUADLEN(NFS4_DEVICEID4_SIZE) + \
>                                 1 /* layout type */ + \
> @@ -636,6 +652,18 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
>                                 decode_putfh_maxsz + \
>                                 decode_getattr_maxsz + \
>                                 decode_renew_maxsz)
> +#define NFS4_enc_gdd_getattr_sz        (compound_encode_hdr_maxsz + \
> +                               encode_sequence_maxsz + \
> +                               encode_putfh_maxsz + \
> +                               encode_get_dir_delegation_maxsz + \
> +                               encode_getattr_maxsz + \
> +                               encode_renew_maxsz)
> +#define NFS4_dec_gdd_getattr_sz        (compound_decode_hdr_maxsz + \
> +                               decode_sequence_maxsz + \
> +                               decode_putfh_maxsz + \
> +                               decode_get_dir_delegation_maxsz + \
> +                               decode_getattr_maxsz + \
> +                               decode_renew_maxsz)
>  #define NFS4_enc_lookup_sz     (compound_encode_hdr_maxsz + \
>                                 encode_sequence_maxsz + \
>                                 encode_putfh_maxsz + \
> @@ -1981,6 +2009,30 @@ static void encode_sequence(struct xdr_stream *xdr,
>  }
>
>  #ifdef CONFIG_NFS_V4_1
> +static void
> +encode_get_dir_delegation(struct xdr_stream *xdr, struct compound_hdr *hdr)
> +{
> +       __be32 *p;
> +       struct timespec64 ts = {};
> +       u32 zerobm[1] = {};
> +
> +       encode_op_hdr(xdr, OP_GET_DIR_DELEGATION, decode_get_dir_delegation_maxsz, hdr);
> +
> +       /* We can't handle CB_RECALLABLE_OBJ_AVAIL yet */
> +       xdr_stream_encode_bool(xdr, false);
> +
> +       /* for now, we request no notification types */
> +       xdr_encode_bitmap4(xdr, zerobm, ARRAY_SIZE(zerobm));
> +
> +       /* Request no attribute updates */
> +       p = reserve_space(xdr, 12 + 12);
> +       p = xdr_encode_nfstime4(p, &ts);
> +       xdr_encode_nfstime4(p, &ts);
> +
> +       xdr_encode_bitmap4(xdr, zerobm, ARRAY_SIZE(zerobm));
> +       xdr_encode_bitmap4(xdr, zerobm, ARRAY_SIZE(zerobm));
> +}
> +
>  static void
>  encode_getdeviceinfo(struct xdr_stream *xdr,
>                      const struct nfs4_getdeviceinfo_args *args,
> @@ -2334,6 +2386,25 @@ static void nfs4_xdr_enc_getattr(struct rpc_rqst *req, struct xdr_stream *xdr,
>         encode_nops(&hdr);
>  }
>
> +/*
> + * Encode GDD_GETATTR request
> + */
> +static void nfs4_xdr_enc_gdd_getattr(struct rpc_rqst *req, struct xdr_stream *xdr,
> +                                    const void *data)
> +{
> +       const struct nfs4_getattr_arg *args = data;
> +       struct compound_hdr hdr = {
> +               .minorversion = nfs4_xdr_minorversion(&args->seq_args),
> +       };
> +
> +       encode_compound_hdr(xdr, req, &hdr);
> +       encode_sequence(xdr, &args->seq_args, &hdr);
> +       encode_putfh(xdr, args->fh, &hdr);
> +       encode_get_dir_delegation(xdr, &hdr);
> +       encode_getfattr(xdr, args->bitmask, &hdr);
> +       encode_nops(&hdr);
> +}
> +

This function should be under a "#ifdef CONFIG_NFS_V4_1" to avoid the
following compiler error:

fs/nfs/nfs4xdr.c:2403:2: error: call to undeclared function
'encode_get_dir_delegation'; ISO C99 and later do not support implicit
function declarations [-Wimplicit-function-declaration]
 2403 |         encode_get_dir_delegation(xdr, &hdr);
      |         ^


>  /*
>   * Encode a CLOSE request
>   */
> @@ -5919,6 +5990,43 @@ static int decode_layout_stateid(struct xdr_stream *xdr, nfs4_stateid *stateid)
>         return decode_stateid(xdr, stateid);
>  }
>
> +static int decode_get_dir_delegation(struct xdr_stream *xdr,
> +                                    struct nfs4_getattr_res *res)
> +{
> +       nfs4_verifier   cookieverf;
> +       int             status;
> +       u32             bm[1];
> +
> +       status = decode_op_hdr(xdr, OP_GET_DIR_DELEGATION);
> +       if (status)
> +               return status;
> +
> +       if (xdr_stream_decode_u32(xdr, &res->nf_status))
> +               return -EIO;
> +
> +       if (res->nf_status == GDD4_UNAVAIL)
> +               return xdr_inline_decode(xdr, 4) ? 0 : -EIO;
> +
> +       status = decode_verifier(xdr, &cookieverf);
> +       if (status)
> +               return status;
> +
> +       status = decode_delegation_stateid(xdr, &res->deleg);
> +       if (status)
> +               return status;
> +
> +       status = decode_bitmap4(xdr, bm, ARRAY_SIZE(bm));
> +       if (status < 0)
> +               return status;
> +       status = decode_bitmap4(xdr, bm, ARRAY_SIZE(bm));
> +       if (status < 0)
> +               return status;
> +       status = decode_bitmap4(xdr, bm, ARRAY_SIZE(bm));
> +       if (status < 0)
> +               return status;
> +       return 0;
> +}
> +
>  static int decode_getdeviceinfo(struct xdr_stream *xdr,
>                                 struct nfs4_getdeviceinfo_res *res)
>  {
> @@ -6455,6 +6563,33 @@ static int nfs4_xdr_dec_getattr(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
>         return status;
>  }
>
> +/*
> + * Decode GDD_GETATTR response
> + */
> +static int nfs4_xdr_dec_gdd_getattr(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
> +                                   void *data)
> +{
> +       struct nfs4_getattr_res *res = data;
> +       struct compound_hdr hdr;
> +       int status;
> +
> +       status = decode_compound_hdr(xdr, &hdr);
> +       if (status)
> +               goto out;
> +       status = decode_sequence(xdr, &res->seq_res, rqstp);
> +       if (status)
> +               goto out;
> +       status = decode_putfh(xdr);
> +       if (status)
> +               goto out;
> +       status = decode_get_dir_delegation(xdr, res);
> +       if (status)
> +               goto out;
> +       status = decode_getfattr(xdr, res->fattr, res->server);
> +out:
> +       return status;
> +}
> +

This needs to be under the same #ifdef, too.

Thanks,
 Anna

>  /*
>   * Encode an SETACL request
>   */
> @@ -7704,6 +7839,7 @@ const struct rpc_procinfo nfs4_procedures[] = {
>         PROC41(BIND_CONN_TO_SESSION,
>                         enc_bind_conn_to_session, dec_bind_conn_to_session),
>         PROC41(DESTROY_CLIENTID,enc_destroy_clientid,   dec_destroy_clientid),
> +       PROC41(GDD_GETATTR,     enc_gdd_getattr,        dec_gdd_getattr),
>         PROC42(SEEK,            enc_seek,               dec_seek),
>         PROC42(ALLOCATE,        enc_allocate,           dec_allocate),
>         PROC42(DEALLOCATE,      enc_deallocate,         dec_deallocate),
> diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> index 11ad088b411d..86cbfd50ecd1 100644
> --- a/include/linux/nfs4.h
> +++ b/include/linux/nfs4.h
> @@ -681,6 +681,7 @@ enum {
>         NFSPROC4_CLNT_LISTXATTRS,
>         NFSPROC4_CLNT_REMOVEXATTR,
>         NFSPROC4_CLNT_READ_PLUS,
> +       NFSPROC4_CLNT_GDD_GETATTR,
>  };
>
>  /* nfs41 types */
> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> index d09b9773b20c..85ee37ccc25e 100644
> --- a/include/linux/nfs_xdr.h
> +++ b/include/linux/nfs_xdr.h
> @@ -1072,6 +1072,8 @@ struct nfs4_getattr_res {
>         struct nfs4_sequence_res        seq_res;
>         const struct nfs_server *       server;
>         struct nfs_fattr *              fattr;
> +       nfs4_stateid                    deleg;
> +       u32                             nf_status;
>  };
>
>  struct nfs4_link_arg {
>
> --
> 2.44.0
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 21/24] nfs: add a GDD_GETATTR rpc operation
  2024-03-15 20:50   ` Anna Schumaker
@ 2024-03-15 21:29     ` Jeff Layton
  0 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-15 21:29 UTC (permalink / raw)
  To: Anna Schumaker
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Fri, 2024-03-15 at 16:50 -0400, Anna Schumaker wrote:
> Hi Jeff,
> 
> On Fri, Mar 15, 2024 at 12:54 PM Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > Add a new compound that does a GET_DIR_DELEGATION just before doing a
> > GETATTR on an inode. Add a delegation stateid and a nf_status code to
> > struct nfs4_getattr_res to store the result.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/nfs/nfs4xdr.c        | 136 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/nfs4.h    |   1 +
> >  include/linux/nfs_xdr.h |   2 +
> >  3 files changed, 139 insertions(+)
> > 
> > diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> > index 1416099dfcd1..c28025018bda 100644
> > --- a/fs/nfs/nfs4xdr.c
> > +++ b/fs/nfs/nfs4xdr.c
> > @@ -391,6 +391,22 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
> >                                 XDR_QUADLEN(NFS4_MAX_SESSIONID_LEN) + 5)
> >  #define encode_reclaim_complete_maxsz  (op_encode_hdr_maxsz + 4)
> >  #define decode_reclaim_complete_maxsz  (op_decode_hdr_maxsz + 4)
> > +#define encode_get_dir_delegation_maxsz (op_encode_hdr_maxsz +                         \
> > +                                        4 /* gdda_signal_deleg_avail */ +              \
> > +                                        8 /* gdda_notification_types */ +              \
> > +                                        nfstime4_maxsz  /* gdda_child_attr_delay */ +  \
> > +                                        nfstime4_maxsz  /* gdda_dir_attr_delay */ +    \
> > +                                        nfs4_fattr_bitmap_maxsz /* gdda_child_attributes */ + \
> > +                                        nfs4_fattr_bitmap_maxsz /* gdda_dir_attributes */)
> > +
> > +#define decode_get_dir_delegation_maxsz (op_encode_hdr_maxsz +                         \
> > +                                        4 /* gddrnf_status */ +                        \
> > +                                        encode_verifier_maxsz /* gddr_cookieverf */ +  \
> > +                                        encode_stateid_maxsz /* gddr_stateid */ +      \
> > +                                        8 /* gddr_notification */ +                    \
> > +                                        nfs4_fattr_bitmap_maxsz /* gddr_child_attributes */ + \
> > +                                        nfs4_fattr_bitmap_maxsz /* gddr_dir_attributes */)
> > +
> >  #define encode_getdeviceinfo_maxsz (op_encode_hdr_maxsz + \
> >                                 XDR_QUADLEN(NFS4_DEVICEID4_SIZE) + \
> >                                 1 /* layout type */ + \
> > @@ -636,6 +652,18 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
> >                                 decode_putfh_maxsz + \
> >                                 decode_getattr_maxsz + \
> >                                 decode_renew_maxsz)
> > +#define NFS4_enc_gdd_getattr_sz        (compound_encode_hdr_maxsz + \
> > +                               encode_sequence_maxsz + \
> > +                               encode_putfh_maxsz + \
> > +                               encode_get_dir_delegation_maxsz + \
> > +                               encode_getattr_maxsz + \
> > +                               encode_renew_maxsz)
> > +#define NFS4_dec_gdd_getattr_sz        (compound_decode_hdr_maxsz + \
> > +                               decode_sequence_maxsz + \
> > +                               decode_putfh_maxsz + \
> > +                               decode_get_dir_delegation_maxsz + \
> > +                               decode_getattr_maxsz + \
> > +                               decode_renew_maxsz)
> >  #define NFS4_enc_lookup_sz     (compound_encode_hdr_maxsz + \
> >                                 encode_sequence_maxsz + \
> >                                 encode_putfh_maxsz + \
> > @@ -1981,6 +2009,30 @@ static void encode_sequence(struct xdr_stream *xdr,
> >  }
> > 
> >  #ifdef CONFIG_NFS_V4_1
> > +static void
> > +encode_get_dir_delegation(struct xdr_stream *xdr, struct compound_hdr *hdr)
> > +{
> > +       __be32 *p;
> > +       struct timespec64 ts = {};
> > +       u32 zerobm[1] = {};
> > +
> > +       encode_op_hdr(xdr, OP_GET_DIR_DELEGATION, decode_get_dir_delegation_maxsz, hdr);
> > +
> > +       /* We can't handle CB_RECALLABLE_OBJ_AVAIL yet */
> > +       xdr_stream_encode_bool(xdr, false);
> > +
> > +       /* for now, we request no notification types */
> > +       xdr_encode_bitmap4(xdr, zerobm, ARRAY_SIZE(zerobm));
> > +
> > +       /* Request no attribute updates */
> > +       p = reserve_space(xdr, 12 + 12);
> > +       p = xdr_encode_nfstime4(p, &ts);
> > +       xdr_encode_nfstime4(p, &ts);
> > +
> > +       xdr_encode_bitmap4(xdr, zerobm, ARRAY_SIZE(zerobm));
> > +       xdr_encode_bitmap4(xdr, zerobm, ARRAY_SIZE(zerobm));
> > +}
> > +
> >  static void
> >  encode_getdeviceinfo(struct xdr_stream *xdr,
> >                      const struct nfs4_getdeviceinfo_args *args,
> > @@ -2334,6 +2386,25 @@ static void nfs4_xdr_enc_getattr(struct rpc_rqst *req, struct xdr_stream *xdr,
> >         encode_nops(&hdr);
> >  }
> > 
> > +/*
> > + * Encode GDD_GETATTR request
> > + */
> > +static void nfs4_xdr_enc_gdd_getattr(struct rpc_rqst *req, struct xdr_stream *xdr,
> > +                                    const void *data)
> > +{
> > +       const struct nfs4_getattr_arg *args = data;
> > +       struct compound_hdr hdr = {
> > +               .minorversion = nfs4_xdr_minorversion(&args->seq_args),
> > +       };
> > +
> > +       encode_compound_hdr(xdr, req, &hdr);
> > +       encode_sequence(xdr, &args->seq_args, &hdr);
> > +       encode_putfh(xdr, args->fh, &hdr);
> > +       encode_get_dir_delegation(xdr, &hdr);
> > +       encode_getfattr(xdr, args->bitmask, &hdr);
> > +       encode_nops(&hdr);
> > +}
> > +
> 
> This function should be under a "#ifdef CONFIG_NFS_V4_1" to avoid the
> following compiler error:
> 
> fs/nfs/nfs4xdr.c:2403:2: error: call to undeclared function
> 'encode_get_dir_delegation'; ISO C99 and later do not support implicit
> function declarations [-Wimplicit-function-declaration]
>  2403 |         encode_get_dir_delegation(xdr, &hdr);
>       |         ^
> 

Thanks Anna! I'll fix up both problems before the next iteration.

> 
> >  /*
> >   * Encode a CLOSE request
> >   */
> > @@ -5919,6 +5990,43 @@ static int decode_layout_stateid(struct xdr_stream *xdr, nfs4_stateid *stateid)
> >         return decode_stateid(xdr, stateid);
> >  }
> > 
> > +static int decode_get_dir_delegation(struct xdr_stream *xdr,
> > +                                    struct nfs4_getattr_res *res)
> > +{
> > +       nfs4_verifier   cookieverf;
> > +       int             status;
> > +       u32             bm[1];
> > +
> > +       status = decode_op_hdr(xdr, OP_GET_DIR_DELEGATION);
> > +       if (status)
> > +               return status;
> > +
> > +       if (xdr_stream_decode_u32(xdr, &res->nf_status))
> > +               return -EIO;
> > +
> > +       if (res->nf_status == GDD4_UNAVAIL)
> > +               return xdr_inline_decode(xdr, 4) ? 0 : -EIO;
> > +
> > +       status = decode_verifier(xdr, &cookieverf);
> > +       if (status)
> > +               return status;
> > +
> > +       status = decode_delegation_stateid(xdr, &res->deleg);
> > +       if (status)
> > +               return status;
> > +
> > +       status = decode_bitmap4(xdr, bm, ARRAY_SIZE(bm));
> > +       if (status < 0)
> > +               return status;
> > +       status = decode_bitmap4(xdr, bm, ARRAY_SIZE(bm));
> > +       if (status < 0)
> > +               return status;
> > +       status = decode_bitmap4(xdr, bm, ARRAY_SIZE(bm));
> > +       if (status < 0)
> > +               return status;
> > +       return 0;
> > +}
> > +
> >  static int decode_getdeviceinfo(struct xdr_stream *xdr,
> >                                 struct nfs4_getdeviceinfo_res *res)
> >  {
> > @@ -6455,6 +6563,33 @@ static int nfs4_xdr_dec_getattr(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
> >         return status;
> >  }
> > 
> > +/*
> > + * Decode GDD_GETATTR response
> > + */
> > +static int nfs4_xdr_dec_gdd_getattr(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
> > +                                   void *data)
> > +{
> > +       struct nfs4_getattr_res *res = data;
> > +       struct compound_hdr hdr;
> > +       int status;
> > +
> > +       status = decode_compound_hdr(xdr, &hdr);
> > +       if (status)
> > +               goto out;
> > +       status = decode_sequence(xdr, &res->seq_res, rqstp);
> > +       if (status)
> > +               goto out;
> > +       status = decode_putfh(xdr);
> > +       if (status)
> > +               goto out;
> > +       status = decode_get_dir_delegation(xdr, res);
> > +       if (status)
> > +               goto out;
> > +       status = decode_getfattr(xdr, res->fattr, res->server);
> > +out:
> > +       return status;
> > +}
> > +
> 
> This needs to be under the same #ifdef, too.
> 
> Thanks,
>  Anna
> 
> >  /*
> >   * Encode an SETACL request
> >   */
> > @@ -7704,6 +7839,7 @@ const struct rpc_procinfo nfs4_procedures[] = {
> >         PROC41(BIND_CONN_TO_SESSION,
> >                         enc_bind_conn_to_session, dec_bind_conn_to_session),
> >         PROC41(DESTROY_CLIENTID,enc_destroy_clientid,   dec_destroy_clientid),
> > +       PROC41(GDD_GETATTR,     enc_gdd_getattr,        dec_gdd_getattr),
> >         PROC42(SEEK,            enc_seek,               dec_seek),
> >         PROC42(ALLOCATE,        enc_allocate,           dec_allocate),
> >         PROC42(DEALLOCATE,      enc_deallocate,         dec_deallocate),
> > diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> > index 11ad088b411d..86cbfd50ecd1 100644
> > --- a/include/linux/nfs4.h
> > +++ b/include/linux/nfs4.h
> > @@ -681,6 +681,7 @@ enum {
> >         NFSPROC4_CLNT_LISTXATTRS,
> >         NFSPROC4_CLNT_REMOVEXATTR,
> >         NFSPROC4_CLNT_READ_PLUS,
> > +       NFSPROC4_CLNT_GDD_GETATTR,
> >  };
> > 
> >  /* nfs41 types */
> > diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> > index d09b9773b20c..85ee37ccc25e 100644
> > --- a/include/linux/nfs_xdr.h
> > +++ b/include/linux/nfs_xdr.h
> > @@ -1072,6 +1072,8 @@ struct nfs4_getattr_res {
> >         struct nfs4_sequence_res        seq_res;
> >         const struct nfs_server *       server;
> >         struct nfs_fattr *              fattr;
> > +       nfs4_stateid                    deleg;
> > +       u32                             nf_status;
> >  };
> > 
> >  struct nfs4_link_arg {
> > 
> > --
> > 2.44.0
> > 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 03/24] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
  2024-03-15 16:52 ` [PATCH RFC 03/24] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink} Jeff Layton
@ 2024-03-16 23:57   ` Al Viro
  2024-03-17 14:09     ` Jeff Layton
  0 siblings, 1 reply; 47+ messages in thread
From: Al Viro @ 2024-03-16 23:57 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Christian Brauner, Jan Kara, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Fri, Mar 15, 2024 at 12:52:54PM -0400, Jeff Layton wrote:
> @@ -4603,9 +4606,12 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
>  	else if (max_links && inode->i_nlink >= max_links)
>  		error = -EMLINK;
>  	else {
> -		error = try_break_deleg(inode, delegated_inode);
> -		if (!error)
> -			error = dir->i_op->link(old_dentry, dir, new_dentry);
> +		error = try_break_deleg(dir, delegated_inode);
> +		if (!error) {
> +			error = try_break_deleg(inode, delegated_inode);
> +			if (!error)
> +				error = dir->i_op->link(old_dentry, dir, new_dentry);
> +		}

A minor nit: that might be easier to follow as
		error = try_break_deleg(dir, delegated_inode);
		if (!error)
			error = try_break_deleg(inode, delegated_inode);
		if (!error)
			error = dir->i_op->link(old_dentry, dir, new_dentry);

and let the compiler deal with optimizing it - any C compiler is going to be
able to figure out that one out.  vfs_link() is a mix of those styles anyway -
we have
        if (!error && (inode->i_state & I_LINKABLE)) {
                spin_lock(&inode->i_lock);
                inode->i_state &= ~I_LINKABLE;
                spin_unlock(&inode->i_lock);
        }
immediately afterwards; might as well make that consistent, especially since
you are getting more shallow nesting that way.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath
  2024-03-15 16:52 ` [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
@ 2024-03-17  0:19   ` Al Viro
  2024-03-17 12:23     ` Jeff Layton
  2024-03-18  8:25   ` Stefan Metzmacher
  1 sibling, 1 reply; 47+ messages in thread
From: Al Viro @ 2024-03-17  0:19 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Christian Brauner, Jan Kara, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Fri, Mar 15, 2024 at 12:52:57PM -0400, Jeff Layton wrote:
> In order to add directory delegation support, we need to break
> delegations on the parent whenever there is going to be a change in the
> directory.
> 
> Add a delegated_inode parameter to lookup_open and have it break the
> delegation. Then, open_last_lookups can wait for the delegation break
> and retry the call to lookup_open once it's done.

> @@ -3490,6 +3490,11 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,

Wait a sec - are you going to do anything to the atomic_open side of things?
 
>  	/* Negative dentry, just create the file */
>  	if (!dentry->d_inode && (open_flag & O_CREAT)) {
> +		/* but break the directory lease first! */
> +		error = try_break_deleg(dir_inode, delegated_inode);
> +		if (error)
> +			goto out_dput;

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath
  2024-03-17  0:19   ` Al Viro
@ 2024-03-17 12:23     ` Jeff Layton
  0 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-17 12:23 UTC (permalink / raw)
  To: Al Viro
  Cc: Christian Brauner, Jan Kara, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Sun, 2024-03-17 at 00:19 +0000, Al Viro wrote:
> On Fri, Mar 15, 2024 at 12:52:57PM -0400, Jeff Layton wrote:
> > In order to add directory delegation support, we need to break
> > delegations on the parent whenever there is going to be a change in the
> > directory.
> > 
> > Add a delegated_inode parameter to lookup_open and have it break the
> > delegation. Then, open_last_lookups can wait for the delegation break
> > and retry the call to lookup_open once it's done.
> 
> > @@ -3490,6 +3490,11 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
> 
> Wait a sec - are you going to do anything to the atomic_open side of things?
> 
> 

Hmm good point. I was thinking that all of the filesystems that had
atomic_open didn't support leases. I'm wrong though -- there are some
that currently do:

9p: It's a network filesystem, and I don't think it has any sort of
asynchronous notification or delegation-like object, does it? It might
be best though to just make it call simple_nosetlease.

fuse: fuse allows leases today. I doubt we can get away with turning
that off now. There probably ought to be a way for the userland driver
to opt-in or out of allowing built-in lease support maybe a flag or
something?

ntfs3: IDGI. Why does ntfs3 (which is a local filesystem, unless I'm
mistaken) have an atomic_open? Shouldn't lookup+open be fine, like with
most local filesystems?

vboxsf: Probably the same situation as 9p. Can we just disable leases?

I'll spin up a patchset soon to add proper setlease handlers to all of
the above. Then we can then guard against allowing generic_setlease on
filesystems by default on filesystems with an atomic_open handler.

Another (maybe better) idea might be to require filesystems to specify a
setlease handler if they want them enabled. We could just set the
existing local filesystems to generic_setlease. That would make lease
support a strictly opt-in thing, which is probably the best idea for
avoiding surprises with them.

>  
> >  	/* Negative dentry, just create the file */
> >  	if (!dentry->d_inode && (open_flag & O_CREAT)) {
> > +		/* but break the directory lease first! */
> > +		error = try_break_deleg(dir_inode, delegated_inode);
> > +		if (error)
> > +			goto out_dput;

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 03/24] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
  2024-03-16 23:57   ` Al Viro
@ 2024-03-17 14:09     ` Jeff Layton
  0 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-17 14:09 UTC (permalink / raw)
  To: Al Viro
  Cc: Christian Brauner, Jan Kara, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Sat, 2024-03-16 at 23:57 +0000, Al Viro wrote:
> On Fri, Mar 15, 2024 at 12:52:54PM -0400, Jeff Layton wrote:
> > @@ -4603,9 +4606,12 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
> >  	else if (max_links && inode->i_nlink >= max_links)
> >  		error = -EMLINK;
> >  	else {
> > -		error = try_break_deleg(inode, delegated_inode);
> > -		if (!error)
> > -			error = dir->i_op->link(old_dentry, dir, new_dentry);
> > +		error = try_break_deleg(dir, delegated_inode);
> > +		if (!error) {
> > +			error = try_break_deleg(inode, delegated_inode);
> > +			if (!error)
> > +				error = dir->i_op->link(old_dentry, dir, new_dentry);
> > +		}
> 
> A minor nit: that might be easier to follow as
> 		error = try_break_deleg(dir, delegated_inode);
> 		if (!error)
> 			error = try_break_deleg(inode, delegated_inode);
> 		if (!error)
> 			error = dir->i_op->link(old_dentry, dir, new_dentry);
> 
> and let the compiler deal with optimizing it - any C compiler is going to be
> able to figure out that one out.  vfs_link() is a mix of those styles anyway -
> we have
>         if (!error && (inode->i_state & I_LINKABLE)) {
>                 spin_lock(&inode->i_lock);
>                 inode->i_state &= ~I_LINKABLE;
>                 spin_unlock(&inode->i_lock);
>         }
> immediately afterwards; might as well make that consistent, especially since
> you are getting more shallow nesting that way.

Sounds good. Fixed in my tree.

Thanks for the review so far!
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback
  2024-03-15 16:52 ` [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback Jeff Layton
@ 2024-03-17 14:56   ` Chuck Lever
  2024-03-18 11:07     ` Jeff Layton
  2024-03-20 13:13   ` Christian Brauner
  1 sibling, 1 reply; 47+ messages in thread
From: Chuck Lever @ 2024-03-17 14:56 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Fri, Mar 15, 2024 at 12:52:53PM -0400, Jeff Layton wrote:
> The NFSv4.1 protocol adds support for directory delegations, but it
> specifies that if you already have a delegation and try to request a new
> one on the same filehandle, the server must reply that the delegation is
> unavailable.
> 
> Add a new lease_manager callback to allow the lease manager (nfsd in
> this case) to impose extra checks when performing a setlease.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/locks.c               |  5 +++++
>  include/linux/filelock.h | 10 ++++++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index cb4b35d26162..415cca8e9565 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1822,6 +1822,11 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr
>  			continue;
>  		}
>  
> +		/* Allow the lease manager to veto the setlease */
> +		if (lease->fl_lmops->lm_set_conflict &&
> +		    lease->fl_lmops->lm_set_conflict(lease, fl))
> +			goto out;
> +
>  		/*
>  		 * No exclusive leases if someone else has a lease on
>  		 * this file:
> diff --git a/include/linux/filelock.h b/include/linux/filelock.h
> index daee999d05f3..c5fc768087df 100644
> --- a/include/linux/filelock.h
> +++ b/include/linux/filelock.h
> @@ -49,6 +49,16 @@ struct lease_manager_operations {
>  	int (*lm_change)(struct file_lease *, int, struct list_head *);
>  	void (*lm_setup)(struct file_lease *, void **);
>  	bool (*lm_breaker_owns_lease)(struct file_lease *);
> +
> +	/**
> +	 * lm_set_conflict - extra conditions for setlease
> +	 * @new: new file_lease being set
> +	 * @old: old (extant) file_lease
> +	 *
> +	 * This allows the lease manager to add extra conditions when
> +	 * setting a lease.

To make it clear which return value causes add_lease() to abort, I'd
rather see API contract-style descriptions of the meaning of the
return values instead of this design note. Something like:

 * Return values:
 *   %true: @new and @old conflict
 *   %false: No conflict detected


> +	 */
> +	bool (*lm_set_conflict)(struct file_lease *new, struct file_lease *old);
>  };
>  
>  struct lock_manager {
> 
> -- 
> 2.44.0
> 

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 11/24] nfsd: allow DELEGRETURN on directories
  2024-03-15 16:53 ` [PATCH RFC 11/24] nfsd: allow DELEGRETURN on directories Jeff Layton
@ 2024-03-17 15:09   ` Chuck Lever
  2024-03-17 16:03     ` Trond Myklebust
  0 siblings, 1 reply; 47+ messages in thread
From: Chuck Lever @ 2024-03-17 15:09 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Fri, Mar 15, 2024 at 12:53:02PM -0400, Jeff Layton wrote:
> fh_verify only allows you to filter on a single type of inode, so have
> nfsd4_delegreturn not filter by type. Once fh_verify returns, do the
> appropriate check of the type and return an error if it's not a regular
> file or directory.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/nfsd/nfs4state.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 17d09d72632b..c52e807f9672 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -7425,12 +7425,24 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	struct nfs4_delegation *dp;
>  	stateid_t *stateid = &dr->dr_stateid;
>  	struct nfs4_stid *s;
> +	umode_t mode;
>  	__be32 status;
>  	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
>  
> -	if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
> +	if ((status = fh_verify(rqstp, &cstate->current_fh, 0, 0)))
>  		return status;
>  
> +	mode = d_inode(cstate->current_fh.fh_dentry)->i_mode & S_IFMT;
> +	switch(mode) {
> +	case S_IFREG:
> +	case S_IFDIR:
> +		break;
> +	case S_IFLNK:
> +		return nfserr_symlink;
> +	default:
> +		return nfserr_inval;
> +	}
> +

RFC 8881 Section 15.2 does not list NFS4ERR_SYMLINK among the
valid status codes for the DELEGRETURN operation. Maybe the naked
fh_verify() call has gotten it wrong all these years...?


>  	status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG, 0, &s, nn);
>  	if (status)
>  		goto out;
> 
> -- 
> 2.44.0
> 

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 12/24] nfsd: encoders and decoders for GET_DIR_DELEGATION
  2024-03-15 16:53 ` [PATCH RFC 12/24] nfsd: encoders and decoders for GET_DIR_DELEGATION Jeff Layton
@ 2024-03-17 15:34   ` Chuck Lever
  2024-03-18 16:12     ` Jeff Layton
  0 siblings, 1 reply; 47+ messages in thread
From: Chuck Lever @ 2024-03-17 15:34 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Fri, Mar 15, 2024 at 12:53:03PM -0400, Jeff Layton wrote:
> This adds basic infrastructure for handing GET_DIR_DELEGATION calls from
> clients, including the  decoders and encoders. For now, the server side
> always just returns that the  delegation is GDDR_UNAVAIL (and that we
> won't call back).
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/nfsd/nfs4proc.c   | 30 ++++++++++++++++++++++
>  fs/nfsd/nfs4xdr.c    | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++--
>  fs/nfsd/xdr4.h       |  8 ++++++
>  include/linux/nfs4.h |  5 ++++
>  4 files changed, 113 insertions(+), 2 deletions(-)

Just a handful of style preferences below.


> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 2927b1263f08..7973fe17bf3c 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -2173,6 +2173,18 @@ nfsd4_layout_verify(struct svc_export *exp, unsigned int layout_type)
>  	return nfsd4_layout_ops[layout_type];
>  }
>  
> +static __be32
> +nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
> +			 struct nfsd4_compound_state *cstate,
> +			 union nfsd4_op_u *u)
> +{
> +	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> +
> +	/* FIXME: actually return a delegation */
> +	gdd->nf_status = GDD4_UNAVAIL;
> +	return nfs_ok;
> +}
> +
>  static __be32
>  nfsd4_getdeviceinfo(struct svc_rqst *rqstp,
>  		struct nfsd4_compound_state *cstate, union nfsd4_op_u *u)
> @@ -3082,6 +3094,18 @@ static u32 nfsd4_copy_notify_rsize(const struct svc_rqst *rqstp,
>  		* sizeof(__be32);
>  }
>  
> +static u32 nfsd4_get_dir_delegation_rsize(const struct svc_rqst *rqstp,
> +					  const struct nfsd4_op *op)
> +{
> +	return (op_encode_hdr_size +
> +		1 /* gddr_status */ +
> +		op_encode_verifier_maxsz +
> +		op_encode_stateid_maxsz +
> +		2 /* gddr_notification */ +
> +		2 /* gddr_child_attributes */ +
> +		2 /* gddr_dir_attributes */);
> +}
> +
>  #ifdef CONFIG_NFSD_PNFS
>  static u32 nfsd4_getdeviceinfo_rsize(const struct svc_rqst *rqstp,
>  				     const struct nfsd4_op *op)
> @@ -3470,6 +3494,12 @@ static const struct nfsd4_operation nfsd4_ops[] = {
>  		.op_get_currentstateid = nfsd4_get_freestateid,
>  		.op_rsize_bop = nfsd4_only_status_rsize,
>  	},
> +	[OP_GET_DIR_DELEGATION] = {
> +		.op_func = nfsd4_get_dir_delegation,
> +		.op_flags = OP_MODIFIES_SOMETHING,
> +		.op_name = "OP_GET_DIR_DELEGATION",
> +		.op_rsize_bop = nfsd4_get_dir_delegation_rsize,
> +	},
>  #ifdef CONFIG_NFSD_PNFS
>  	[OP_GETDEVICEINFO] = {
>  		.op_func = nfsd4_getdeviceinfo,
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index fac938f563ad..3718bef74e9f 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -1732,6 +1732,40 @@ nfsd4_decode_free_stateid(struct nfsd4_compoundargs *argp,
>  	return nfsd4_decode_stateid4(argp, &free_stateid->fr_stateid);
>  }
>  
> +static __be32
> +nfsd4_decode_get_dir_delegation(struct nfsd4_compoundargs *argp,
> +		union nfsd4_op_u *u)
> +{
> +	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> +	struct timespec64 ts;
> +	u32 signal_deleg_avail;
> +	u32 attrs[1];

I know this isn't how we've done XDR in the past, but I'd rather
see these dummy args as fields in struct nfsd4_get_dir_delegation,
and also move the comments about whether each argument is supported
to the putative nfsd4_proc_get_dir_delegation().

The actual implementation of GET_DIR_DELEGATION is in nfs4proc.c,
after all, not here. This is simply a translation function.


> +	__be32 status;
> +
> +	memset(gdd, 0, sizeof(*gdd));
> +
> +	/* No signal_avail support for now (and maybe never) */
> +	if (xdr_stream_decode_bool(argp->xdr, &signal_deleg_avail) < 0)
> +		return nfserr_bad_xdr;
> +	status = nfsd4_decode_bitmap4(argp, gdd->notification_types,
> +				      ARRAY_SIZE(gdd->notification_types));
> +	if (status)
> +		return status;
> +
> +	/* For now, we don't support child or dir attr change notification */
> +	status = nfsd4_decode_nfstime4(argp, &ts);
> +	if (status)
> +		return status;
> +	/* No dir attr notification support yet either */
> +	status = nfsd4_decode_nfstime4(argp, &ts);
> +	if (status)
> +		return status;
> +	status = nfsd4_decode_bitmap4(argp, attrs, ARRAY_SIZE(attrs));
> +	if (status)
> +		return status;
> +	return nfsd4_decode_bitmap4(argp, attrs, ARRAY_SIZE(attrs));
> +}
> +
>  #ifdef CONFIG_NFSD_PNFS
>  static __be32
>  nfsd4_decode_getdeviceinfo(struct nfsd4_compoundargs *argp,
> @@ -2370,7 +2404,7 @@ static const nfsd4_dec nfsd4_dec_ops[] = {
>  	[OP_CREATE_SESSION]	= nfsd4_decode_create_session,
>  	[OP_DESTROY_SESSION]	= nfsd4_decode_destroy_session,
>  	[OP_FREE_STATEID]	= nfsd4_decode_free_stateid,
> -	[OP_GET_DIR_DELEGATION]	= nfsd4_decode_notsupp,
> +	[OP_GET_DIR_DELEGATION]	= nfsd4_decode_get_dir_delegation,
>  #ifdef CONFIG_NFSD_PNFS
>  	[OP_GETDEVICEINFO]	= nfsd4_decode_getdeviceinfo,
>  	[OP_GETDEVICELIST]	= nfsd4_decode_notsupp,
> @@ -5002,6 +5036,40 @@ nfsd4_encode_device_addr4(struct xdr_stream *xdr,
>  	return nfserr_toosmall;
>  }
>  
> +static __be32
> +nfsd4_encode_get_dir_delegation(struct nfsd4_compoundres *resp, __be32 nfserr,
> +				union nfsd4_op_u *u)
> +{
> +	struct xdr_stream *xdr = resp->xdr;
> +	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> +
> +	/* Encode the GDDR_* status code */

In other encoders, I've used simply the name of the field as it is
in the RFC as a documenting comment. That's more clear, and is
easily grep-able. So:

	/* gddrnf_status */


> +	if (xdr_stream_encode_u32(xdr, gdd->nf_status) != XDR_UNIT)
> +		return nfserr_resource;
> +
> +	/* if it's GDD4_UNAVAIL then we're (almost) done */
> +	if (gdd->nf_status == GDD4_UNAVAIL) {

I prefer using a switch for XDR unions. That makes our
implementation look more like the XDR definition; easier for humans
to audit and modify.


> +		/* We never call back */
> +		return nfsd4_encode_bool(xdr, false);

Again, let's move this boolean to struct nfsd4_get_dir_delegation to
enable nfsd4_proc_get_dir_delegation to decide in the future.


> +	}
> +
> +	/* GDD4_OK case */

If a switch is used, then this comment becomes a real piece of
self-verifying code:

	case GDD4_OK:


> +	nfserr = nfsd4_encode_verifier4(xdr, &gdd->cookieverf);
> +	if (nfserr)
> +		return nfserr;
> +	nfserr = nfsd4_encode_stateid4(xdr, &gdd->stateid);
> +	if (nfserr)
> +		return nfserr;
> +	/* No notifications (yet) */
> +	nfserr = nfsd4_encode_bitmap4(xdr, 0, 0, 0);
> +	if (nfserr)
> +		return nfserr;
> +	nfserr = nfsd4_encode_bitmap4(xdr, 0, 0, 0);
> +	if (nfserr)
> +		return nfserr;
> +	return nfsd4_encode_bitmap4(xdr, 0, 0, 0);

All these as well can go in struct nfsd4_get_dir_delegation.


> +}
> +
>  static __be32
>  nfsd4_encode_getdeviceinfo(struct nfsd4_compoundres *resp, __be32 nfserr,
>  		union nfsd4_op_u *u)
> @@ -5580,7 +5648,7 @@ static const nfsd4_enc nfsd4_enc_ops[] = {
>  	[OP_CREATE_SESSION]	= nfsd4_encode_create_session,
>  	[OP_DESTROY_SESSION]	= nfsd4_encode_noop,
>  	[OP_FREE_STATEID]	= nfsd4_encode_noop,
> -	[OP_GET_DIR_DELEGATION]	= nfsd4_encode_noop,
> +	[OP_GET_DIR_DELEGATION]	= nfsd4_encode_get_dir_delegation,
>  #ifdef CONFIG_NFSD_PNFS
>  	[OP_GETDEVICEINFO]	= nfsd4_encode_getdeviceinfo,
>  	[OP_GETDEVICELIST]	= nfsd4_encode_noop,
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index 415516c1b27e..27de75f32dea 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -518,6 +518,13 @@ struct nfsd4_free_stateid {
>  	stateid_t	fr_stateid;         /* request */
>  };
>  
> +struct nfsd4_get_dir_delegation {
> +	u32		notification_types[1];	/* request */
> +	u32		nf_status;		/* response */
> +	nfs4_verifier	cookieverf;		/* response */
> +	stateid_t	stateid;		/* response */
> +};
> +
>  /* also used for NVERIFY */
>  struct nfsd4_verify {
>  	u32		ve_bmval[3];        /* request */
> @@ -797,6 +804,7 @@ struct nfsd4_op {
>  		struct nfsd4_reclaim_complete	reclaim_complete;
>  		struct nfsd4_test_stateid	test_stateid;
>  		struct nfsd4_free_stateid	free_stateid;
> +		struct nfsd4_get_dir_delegation	get_dir_delegation;
>  		struct nfsd4_getdeviceinfo	getdeviceinfo;
>  		struct nfsd4_layoutget		layoutget;
>  		struct nfsd4_layoutcommit	layoutcommit;
> diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> index ef8d2d618d5b..11ad088b411d 100644
> --- a/include/linux/nfs4.h
> +++ b/include/linux/nfs4.h
> @@ -701,6 +701,11 @@ enum state_protect_how4 {
>  	SP4_SSV		= 2
>  };
>  
> +enum get_dir_delegation_non_fatal_res {
> +	GDD4_OK		= 0,
> +	GDD4_UNAVAIL	= 1
> +};
> +
>  enum pnfs_layouttype {
>  	LAYOUT_NFSV4_1_FILES  = 1,
>  	LAYOUT_OSD2_OBJECTS = 2,
> 
> -- 
> 2.44.0
> 

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 14/24] nfsd: wire up GET_DIR_DELEGATION handling
  2024-03-15 16:53 ` [PATCH RFC 14/24] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
@ 2024-03-17 15:42   ` Chuck Lever
  0 siblings, 0 replies; 47+ messages in thread
From: Chuck Lever @ 2024-03-17 15:42 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Fri, Mar 15, 2024 at 12:53:05PM -0400, Jeff Layton wrote:
> Add a new routine for acquiring a read delegation on a directory. Since
> the same CB_RECALL/DELEGRETURN infrastrure is used for regular and
> directory delegations, we can just use a normal nfs4_delegation to
> represent it.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/nfsd/nfs4proc.c  | 23 ++++++++++++++++--
>  fs/nfsd/nfs4state.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  fs/nfsd/state.h     |  5 ++++
>  3 files changed, 94 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 7973fe17bf3c..1a2c90f4ea53 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -2179,9 +2179,28 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
>  			 union nfsd4_op_u *u)
>  {
>  	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> +	struct nfs4_delegation *dd;
> +	struct nfsd_file *nf;
> +	__be32 status;
> +
> +	status = nfsd_file_acquire_dir(rqstp, &cstate->current_fh, &nf);
> +	if (status != nfs_ok)
> +		return status;
> +
> +	dd = nfsd_get_dir_deleg(cstate, gdd, nf);
> +	if (IS_ERR(dd)) {
> +		int err = PTR_ERR(dd);
> +
> +		if (err != -EAGAIN)
> +			return nfserrno(err);
> +		gdd->nf_status = GDD4_UNAVAIL;
> +		return nfs_ok;
> +	}
>  
> -	/* FIXME: actually return a delegation */
> -	gdd->nf_status = GDD4_UNAVAIL;
> +	gdd->nf_status = GDD4_OK;
> +	memcpy(&gdd->stateid, &dd->dl_stid.sc_stateid, sizeof(gdd->stateid));
> +	memset(&gdd->cookieverf, '\0', sizeof(gdd->cookieverf));
> +	nfs4_put_stid(&dd->dl_stid);
>  	return nfs_ok;
>  }
>  
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 778c1c6e3b54..36574aedc211 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -8874,7 +8874,7 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode,
>  			}
>  break_lease:
>  			nfsd_stats_wdeleg_getattr_inc(nn);
> -			dp = fl->fl_owner;
> +			dp = fl->c.flc_owner;
>  			ncf = &dp->dl_cb_fattr;
>  			nfs4_cb_getattr(&dp->dl_cb_fattr);
>  			spin_unlock(&ctx->flc_lock);
> @@ -8912,3 +8912,70 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode,
>  	spin_unlock(&ctx->flc_lock);
>  	return 0;
>  }
> +

I'll admit to not having examined the below function closely, but
since it is not a static function, can you add a kdoc comment and
a few salient points like, if there is something NFSD doesn't
implement (yet), or short cuts that would be good for readers to
know about?


> +struct nfs4_delegation *
> +nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
> +		   struct nfsd4_get_dir_delegation *gdd,
> +		   struct nfsd_file *nf)
> +{
> +	struct nfs4_client *clp = cstate->clp;
> +	struct nfs4_delegation *dp;
> +	struct file_lease *fl;
> +	struct nfs4_file *fp;
> +	int status = 0;
> +
> +	fp = nfsd4_alloc_file();
> +	if (!fp)
> +		return ERR_PTR(-ENOMEM);
> +
> +	nfsd4_file_init(&cstate->current_fh, fp);
> +	fp->fi_deleg_file = nf;
> +	fp->fi_delegees = 1;
> +
> +	spin_lock(&state_lock);
> +	spin_lock(&fp->fi_lock);
> +	if (nfs4_delegation_exists(clp, fp))
> +		status = -EAGAIN;
> +	spin_unlock(&fp->fi_lock);
> +	spin_unlock(&state_lock);
> +
> +	if (status)
> +		goto out_delegees;
> +
> +	status = -ENOMEM;
> +	dp = alloc_init_deleg(clp, fp, NULL, NFS4_OPEN_DELEGATE_READ);
> +	if (!dp)
> +		goto out_delegees;
> +
> +	fl = nfs4_alloc_init_lease(dp, NFS4_OPEN_DELEGATE_READ);
> +	if (!fl)
> +		goto out_put_stid;
> +
> +	status = kernel_setlease(nf->nf_file,
> +				 fl->c.flc_type, &fl, NULL);
> +	if (fl)
> +		locks_free_lease(fl);
> +	if (status)
> +		goto out_put_stid;
> +
> +	spin_lock(&state_lock);
> +	spin_lock(&clp->cl_lock);
> +	spin_lock(&fp->fi_lock);
> +	status = hash_delegation_locked(dp, fp);
> +	spin_unlock(&fp->fi_lock);
> +	spin_unlock(&clp->cl_lock);
> +	spin_unlock(&state_lock);
> +
> +	if (status)
> +		goto out_unlock;
> +
> +	return dp;
> +out_unlock:
> +	kernel_setlease(fp->fi_deleg_file->nf_file, F_UNLCK, NULL, (void **)&dp);
> +out_put_stid:
> +	nfs4_put_stid(&dp->dl_stid);
> +out_delegees:
> +	put_deleg_file(fp);
> +	return ERR_PTR(status);
> +}
> +
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 01c6f3445646..20551483cc7b 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -782,4 +782,9 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
>  
>  extern __be32 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp,
>  		struct inode *inode, bool *file_modified, u64 *size);
> +
> +struct nfsd4_get_dir_delegation;
> +struct nfs4_delegation *nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
> +						struct nfsd4_get_dir_delegation *gdd,
> +						struct nfsd_file *nf);
>  #endif   /* NFSD4_STATE_H */
> 
> -- 
> 2.44.0
> 

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 11/24] nfsd: allow DELEGRETURN on directories
  2024-03-17 15:09   ` Chuck Lever
@ 2024-03-17 16:03     ` Trond Myklebust
  2024-03-18 11:22       ` Jeff Layton
  0 siblings, 1 reply; 47+ messages in thread
From: Trond Myklebust @ 2024-03-17 16:03 UTC (permalink / raw)
  To: jlayton, chuck.lever
  Cc: senozhatsky, sfrench, ecryptfs, linux-unionfs, davem, viro, anna,
	jack, tom, pabeni, netdev, linux-fsdevel, linux-kernel,
	ronniesahlberg, samba-technical, dhowells, Dai.Ngo, kuba, rafael,
	alex.aring, pc, amir73il, kolga, sprasad, code, brauner, gregkh,
	edumazet, linux-cifs, linkinjeon, neilb, linux-nfs, netfs,
	miklos

On Sun, 2024-03-17 at 11:09 -0400, Chuck Lever wrote:
> On Fri, Mar 15, 2024 at 12:53:02PM -0400, Jeff Layton wrote:
> > fh_verify only allows you to filter on a single type of inode, so
> > have
> > nfsd4_delegreturn not filter by type. Once fh_verify returns, do
> > the
> > appropriate check of the type and return an error if it's not a
> > regular
> > file or directory.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/nfsd/nfs4state.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 17d09d72632b..c52e807f9672 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -7425,12 +7425,24 @@ nfsd4_delegreturn(struct svc_rqst *rqstp,
> > struct nfsd4_compound_state *cstate,
> >  	struct nfs4_delegation *dp;
> >  	stateid_t *stateid = &dr->dr_stateid;
> >  	struct nfs4_stid *s;
> > +	umode_t mode;
> >  	__be32 status;
> >  	struct nfsd_net *nn = net_generic(SVC_NET(rqstp),
> > nfsd_net_id);
> >  
> > -	if ((status = fh_verify(rqstp, &cstate->current_fh,
> > S_IFREG, 0)))
> > +	if ((status = fh_verify(rqstp, &cstate->current_fh, 0,
> > 0)))
> >  		return status;
> >  
> > +	mode = d_inode(cstate->current_fh.fh_dentry)->i_mode &
> > S_IFMT;
> > +	switch(mode) {
> > +	case S_IFREG:
> > +	case S_IFDIR:
> > +		break;
> > +	case S_IFLNK:
> > +		return nfserr_symlink;
> > +	default:
> > +		return nfserr_inval;
> > +	}
> > +
> 
> RFC 8881 Section 15.2 does not list NFS4ERR_SYMLINK among the
> valid status codes for the DELEGRETURN operation. Maybe the naked
> fh_verify() call has gotten it wrong all these years...?

The WANT_DELEGATION operation allows the server to hand out delegations
for aggressive caching of symlinks. It is not an error to return that
delegation using DELEGRETURN.

Furthermore, provided that the presented stateid is actually valid, it
is also sufficient to uniquely identify the file to which it is
associated (see RFC8881 Section 8.2.4), so the filehandle should be
considered mostly irrelevant for operations like DELEGRETURN.


-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath
  2024-03-15 16:52 ` [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
  2024-03-17  0:19   ` Al Viro
@ 2024-03-18  8:25   ` Stefan Metzmacher
  2024-03-18 10:21     ` Jeff Layton
  1 sibling, 1 reply; 47+ messages in thread
From: Stefan Metzmacher @ 2024-03-18  8:25 UTC (permalink / raw)
  To: Jeff Layton, Alexander Viro, Christian Brauner, Jan Kara,
	Chuck Lever, Alexander Aring, Trond Myklebust, Anna Schumaker,
	Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
	Tom Talpey, Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev

Hi Jeff,

> In order to add directory delegation support, we need to break
> delegations on the parent whenever there is going to be a change in the
> directory.
> 
> Add a delegated_inode parameter to lookup_open and have it break the
> delegation. Then, open_last_lookups can wait for the delegation break
> and retry the call to lookup_open once it's done.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>   fs/namei.c | 22 ++++++++++++++++++----
>   1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index f00d8d708001..88598a62ec64 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3404,7 +3404,7 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
>    */
>   static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
>   				  const struct open_flags *op,
> -				  bool got_write)
> +				  bool got_write, struct inode **delegated_inode)

Does NFS has a concept of lease keys and parent lease keys?

In SMB it's possible that the client passes a lease key (16 client chosen bytes) to a directory open,
when asking for a directory lease.

Then operations on files within that directory, take that lease key from the directory as
'parent lease keys' in addition to a unique lease key for the file.

That way a client can avoid breaking its own directory leases when creating/move/delete... files
in the directory.

metze

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath
  2024-03-18  8:25   ` Stefan Metzmacher
@ 2024-03-18 10:21     ` Jeff Layton
  0 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-18 10:21 UTC (permalink / raw)
  To: Stefan Metzmacher, Alexander Viro, Christian Brauner, Jan Kara,
	Chuck Lever, Alexander Aring, Trond Myklebust, Anna Schumaker,
	Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
	Tom Talpey, Greg Kroah-Hartman, Rafael J. Wysocki, David Howells,
	Tyler Hicks, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Miklos Szeredi, Amir Goldstein, Namjae Jeon, Sergey Senozhatsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, netfs, ecryptfs, linux-unionfs, netdev

On Mon, 2024-03-18 at 09:25 +0100, Stefan Metzmacher wrote:
> Hi Jeff,
> 
> > In order to add directory delegation support, we need to break
> > delegations on the parent whenever there is going to be a change in the
> > directory.
> > 
> > Add a delegated_inode parameter to lookup_open and have it break the
> > delegation. Then, open_last_lookups can wait for the delegation break
> > and retry the call to lookup_open once it's done.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >   fs/namei.c | 22 ++++++++++++++++++----
> >   1 file changed, 18 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/namei.c b/fs/namei.c
> > index f00d8d708001..88598a62ec64 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -3404,7 +3404,7 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
> >    */
> >   static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
> >   				  const struct open_flags *op,
> > -				  bool got_write)
> > +				  bool got_write, struct inode **delegated_inode)
> 
> Does NFS has a concept of lease keys and parent lease keys?
> 
> In SMB it's possible that the client passes a lease key (16 client chosen bytes) to a directory open,
> when asking for a directory lease.
> 
> Then operations on files within that directory, take that lease key from the directory as
> 'parent lease keys' in addition to a unique lease key for the file.
> 
> That way a client can avoid breaking its own directory leases when creating/move/delete... files
> in the directory.
> 

No, it's a bit different with NFSv4 directory delegations. A delegation
is given vs a filehandle (which is analogous to an inode), and it gets a
stateid, which just uniquely identifies it. There is no real association
with the parent.

When you request the dir delegation, you can request to be notified when
something changes instead of the server recalling it. The server may or
may not grant that request. Notifications are not implemented in this
patchset as of yet. I'm focusing on getting the recall handling right
first, and then I'll plan to add that in a later phase.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback
  2024-03-17 14:56   ` Chuck Lever
@ 2024-03-18 11:07     ` Jeff Layton
  0 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-18 11:07 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Sun, 2024-03-17 at 10:56 -0400, Chuck Lever wrote:
> On Fri, Mar 15, 2024 at 12:52:53PM -0400, Jeff Layton wrote:
> > The NFSv4.1 protocol adds support for directory delegations, but it
> > specifies that if you already have a delegation and try to request a new
> > one on the same filehandle, the server must reply that the delegation is
> > unavailable.
> > 
> > Add a new lease_manager callback to allow the lease manager (nfsd in
> > this case) to impose extra checks when performing a setlease.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/locks.c               |  5 +++++
> >  include/linux/filelock.h | 10 ++++++++++
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/fs/locks.c b/fs/locks.c
> > index cb4b35d26162..415cca8e9565 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -1822,6 +1822,11 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr
> >  			continue;
> >  		}
> >  
> > +		/* Allow the lease manager to veto the setlease */
> > +		if (lease->fl_lmops->lm_set_conflict &&
> > +		    lease->fl_lmops->lm_set_conflict(lease, fl))
> > +			goto out;
> > +
> >  		/*
> >  		 * No exclusive leases if someone else has a lease on
> >  		 * this file:
> > diff --git a/include/linux/filelock.h b/include/linux/filelock.h
> > index daee999d05f3..c5fc768087df 100644
> > --- a/include/linux/filelock.h
> > +++ b/include/linux/filelock.h
> > @@ -49,6 +49,16 @@ struct lease_manager_operations {
> >  	int (*lm_change)(struct file_lease *, int, struct list_head *);
> >  	void (*lm_setup)(struct file_lease *, void **);
> >  	bool (*lm_breaker_owns_lease)(struct file_lease *);
> > +
> > +	/**
> > +	 * lm_set_conflict - extra conditions for setlease
> > +	 * @new: new file_lease being set
> > +	 * @old: old (extant) file_lease
> > +	 *
> > +	 * This allows the lease manager to add extra conditions when
> > +	 * setting a lease.
> 
> To make it clear which return value causes add_lease() to abort, I'd
> rather see API contract-style descriptions of the meaning of the
> return values instead of this design note. Something like:
> 
>  * Return values:
>  *   %true: @new and @old conflict
>  *   %false: No conflict detected
> 
> 

Thanks. I added this to the patch in my tree.

> > +	 */
> > +	bool (*lm_set_conflict)(struct file_lease *new, struct file_lease *old);
> >  };
> >  
> >  struct lock_manager {
> > 
> > -- 
> > 2.44.0
> > 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 11/24] nfsd: allow DELEGRETURN on directories
  2024-03-17 16:03     ` Trond Myklebust
@ 2024-03-18 11:22       ` Jeff Layton
  0 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-18 11:22 UTC (permalink / raw)
  To: Trond Myklebust, chuck.lever
  Cc: senozhatsky, sfrench, ecryptfs, linux-unionfs, davem, viro, anna,
	jack, tom, pabeni, netdev, linux-fsdevel, linux-kernel,
	ronniesahlberg, samba-technical, dhowells, Dai.Ngo, kuba, rafael,
	alex.aring, pc, amir73il, kolga, sprasad, code, brauner, gregkh,
	edumazet, linux-cifs, linkinjeon, neilb, linux-nfs, netfs,
	miklos

On Sun, 2024-03-17 at 16:03 +0000, Trond Myklebust wrote:
> On Sun, 2024-03-17 at 11:09 -0400, Chuck Lever wrote:
> > On Fri, Mar 15, 2024 at 12:53:02PM -0400, Jeff Layton wrote:
> > > fh_verify only allows you to filter on a single type of inode, so
> > > have
> > > nfsd4_delegreturn not filter by type. Once fh_verify returns, do
> > > the
> > > appropriate check of the type and return an error if it's not a
> > > regular
> > > file or directory.
> > > 
> > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > ---
> > >  fs/nfsd/nfs4state.c | 14 +++++++++++++-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 17d09d72632b..c52e807f9672 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -7425,12 +7425,24 @@ nfsd4_delegreturn(struct svc_rqst *rqstp,
> > > struct nfsd4_compound_state *cstate,
> > >  	struct nfs4_delegation *dp;
> > >  	stateid_t *stateid = &dr->dr_stateid;
> > >  	struct nfs4_stid *s;
> > > +	umode_t mode;
> > >  	__be32 status;
> > >  	struct nfsd_net *nn = net_generic(SVC_NET(rqstp),
> > > nfsd_net_id);
> > >  
> > > -	if ((status = fh_verify(rqstp, &cstate->current_fh,
> > > S_IFREG, 0)))
> > > +	if ((status = fh_verify(rqstp, &cstate->current_fh, 0,
> > > 0)))
> > >  		return status;
> > >  
> > > +	mode = d_inode(cstate->current_fh.fh_dentry)->i_mode &
> > > S_IFMT;
> > > +	switch(mode) {
> > > +	case S_IFREG:
> > > +	case S_IFDIR:
> > > +		break;
> > > +	case S_IFLNK:
> > > +		return nfserr_symlink;
> > > +	default:
> > > +		return nfserr_inval;
> > > +	}
> > > +
> > 
> > RFC 8881 Section 15.2 does not list NFS4ERR_SYMLINK among the
> > valid status codes for the DELEGRETURN operation. Maybe the naked
> > fh_verify() call has gotten it wrong all these years...?
> 
> The WANT_DELEGATION operation allows the server to hand out delegations
> for aggressive caching of symlinks. It is not an error to return that
> delegation using DELEGRETURN.
> 
> Furthermore, provided that the presented stateid is actually valid, it
> is also sufficient to uniquely identify the file to which it is
> associated (see RFC8881 Section 8.2.4), so the filehandle should be
> considered mostly irrelevant for operations like DELEGRETURN.
> 

Ok. I think we can probably just drop the switch altogether. We already
don't validate that the stateid is associated with current_fh, AFAICT.
It looks possible to send a valid stateid alongside an FH that refers to
a completely different file, and we'll just accept it.

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 12/24] nfsd: encoders and decoders for GET_DIR_DELEGATION
  2024-03-17 15:34   ` Chuck Lever
@ 2024-03-18 16:12     ` Jeff Layton
  2024-03-18 16:19       ` Chuck Lever
  0 siblings, 1 reply; 47+ messages in thread
From: Jeff Layton @ 2024-03-18 16:12 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Sun, 2024-03-17 at 11:34 -0400, Chuck Lever wrote:
> On Fri, Mar 15, 2024 at 12:53:03PM -0400, Jeff Layton wrote:
> > This adds basic infrastructure for handing GET_DIR_DELEGATION calls from
> > clients, including the  decoders and encoders. For now, the server side
> > always just returns that the  delegation is GDDR_UNAVAIL (and that we
> > won't call back).
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/nfsd/nfs4proc.c   | 30 ++++++++++++++++++++++
> >  fs/nfsd/nfs4xdr.c    | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  fs/nfsd/xdr4.h       |  8 ++++++
> >  include/linux/nfs4.h |  5 ++++
> >  4 files changed, 113 insertions(+), 2 deletions(-)
> 
> Just a handful of style preferences below.
> 

Those comments all make sense. I'll respin along those lines.

Also, I may go ahead and send this patch separately from the rest of the
series. I think it would be best to have trivial support for
GET_DIR_DELEGATION in the kernel server as soon as possible.

Eventually, clients may start sending these calls, and it's better if we
can just return a non-fatal error instead of sending back NFSERR_NOTSUPP
when they do.


> 
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index 2927b1263f08..7973fe17bf3c 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -2173,6 +2173,18 @@ nfsd4_layout_verify(struct svc_export *exp, unsigned int layout_type)
> >  	return nfsd4_layout_ops[layout_type];
> >  }
> >  
> > +static __be32
> > +nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
> > +			 struct nfsd4_compound_state *cstate,
> > +			 union nfsd4_op_u *u)
> > +{
> > +	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> > +
> > +	/* FIXME: actually return a delegation */
> > +	gdd->nf_status = GDD4_UNAVAIL;
> > +	return nfs_ok;
> > +}
> > +
> >  static __be32
> >  nfsd4_getdeviceinfo(struct svc_rqst *rqstp,
> >  		struct nfsd4_compound_state *cstate, union nfsd4_op_u *u)
> > @@ -3082,6 +3094,18 @@ static u32 nfsd4_copy_notify_rsize(const struct svc_rqst *rqstp,
> >  		* sizeof(__be32);
> >  }
> >  
> > +static u32 nfsd4_get_dir_delegation_rsize(const struct svc_rqst *rqstp,
> > +					  const struct nfsd4_op *op)
> > +{
> > +	return (op_encode_hdr_size +
> > +		1 /* gddr_status */ +
> > +		op_encode_verifier_maxsz +
> > +		op_encode_stateid_maxsz +
> > +		2 /* gddr_notification */ +
> > +		2 /* gddr_child_attributes */ +
> > +		2 /* gddr_dir_attributes */);
> > +}
> > +
> >  #ifdef CONFIG_NFSD_PNFS
> >  static u32 nfsd4_getdeviceinfo_rsize(const struct svc_rqst *rqstp,
> >  				     const struct nfsd4_op *op)
> > @@ -3470,6 +3494,12 @@ static const struct nfsd4_operation nfsd4_ops[] = {
> >  		.op_get_currentstateid = nfsd4_get_freestateid,
> >  		.op_rsize_bop = nfsd4_only_status_rsize,
> >  	},
> > +	[OP_GET_DIR_DELEGATION] = {
> > +		.op_func = nfsd4_get_dir_delegation,
> > +		.op_flags = OP_MODIFIES_SOMETHING,
> > +		.op_name = "OP_GET_DIR_DELEGATION",
> > +		.op_rsize_bop = nfsd4_get_dir_delegation_rsize,
> > +	},
> >  #ifdef CONFIG_NFSD_PNFS
> >  	[OP_GETDEVICEINFO] = {
> >  		.op_func = nfsd4_getdeviceinfo,
> > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> > index fac938f563ad..3718bef74e9f 100644
> > --- a/fs/nfsd/nfs4xdr.c
> > +++ b/fs/nfsd/nfs4xdr.c
> > @@ -1732,6 +1732,40 @@ nfsd4_decode_free_stateid(struct nfsd4_compoundargs *argp,
> >  	return nfsd4_decode_stateid4(argp, &free_stateid->fr_stateid);
> >  }
> >  
> > +static __be32
> > +nfsd4_decode_get_dir_delegation(struct nfsd4_compoundargs *argp,
> > +		union nfsd4_op_u *u)
> > +{
> > +	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> > +	struct timespec64 ts;
> > +	u32 signal_deleg_avail;
> > +	u32 attrs[1];
> 
> I know this isn't how we've done XDR in the past, but I'd rather
> see these dummy args as fields in struct nfsd4_get_dir_delegation,
> and also move the comments about whether each argument is supported
> to the putative nfsd4_proc_get_dir_delegation().
> 
> The actual implementation of GET_DIR_DELEGATION is in nfs4proc.c,
> after all, not here. This is simply a translation function.
> 
> 
> > +	__be32 status;
> > +
> > +	memset(gdd, 0, sizeof(*gdd));
> > +
> > +	/* No signal_avail support for now (and maybe never) */
> > +	if (xdr_stream_decode_bool(argp->xdr, &signal_deleg_avail) < 0)
> > +		return nfserr_bad_xdr;
> > +	status = nfsd4_decode_bitmap4(argp, gdd->notification_types,
> > +				      ARRAY_SIZE(gdd->notification_types));
> > +	if (status)
> > +		return status;
> > +
> > +	/* For now, we don't support child or dir attr change notification */
> > +	status = nfsd4_decode_nfstime4(argp, &ts);
> > +	if (status)
> > +		return status;
> > +	/* No dir attr notification support yet either */
> > +	status = nfsd4_decode_nfstime4(argp, &ts);
> > +	if (status)
> > +		return status;
> > +	status = nfsd4_decode_bitmap4(argp, attrs, ARRAY_SIZE(attrs));
> > +	if (status)
> > +		return status;
> > +	return nfsd4_decode_bitmap4(argp, attrs, ARRAY_SIZE(attrs));
> > +}
> > +
> >  #ifdef CONFIG_NFSD_PNFS
> >  static __be32
> >  nfsd4_decode_getdeviceinfo(struct nfsd4_compoundargs *argp,
> > @@ -2370,7 +2404,7 @@ static const nfsd4_dec nfsd4_dec_ops[] = {
> >  	[OP_CREATE_SESSION]	= nfsd4_decode_create_session,
> >  	[OP_DESTROY_SESSION]	= nfsd4_decode_destroy_session,
> >  	[OP_FREE_STATEID]	= nfsd4_decode_free_stateid,
> > -	[OP_GET_DIR_DELEGATION]	= nfsd4_decode_notsupp,
> > +	[OP_GET_DIR_DELEGATION]	= nfsd4_decode_get_dir_delegation,
> >  #ifdef CONFIG_NFSD_PNFS
> >  	[OP_GETDEVICEINFO]	= nfsd4_decode_getdeviceinfo,
> >  	[OP_GETDEVICELIST]	= nfsd4_decode_notsupp,
> > @@ -5002,6 +5036,40 @@ nfsd4_encode_device_addr4(struct xdr_stream *xdr,
> >  	return nfserr_toosmall;
> >  }
> >  
> > +static __be32
> > +nfsd4_encode_get_dir_delegation(struct nfsd4_compoundres *resp, __be32 nfserr,
> > +				union nfsd4_op_u *u)
> > +{
> > +	struct xdr_stream *xdr = resp->xdr;
> > +	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> > +
> > +	/* Encode the GDDR_* status code */
> 
> In other encoders, I've used simply the name of the field as it is
> in the RFC as a documenting comment. That's more clear, and is
> easily grep-able. So:
> 
> 	/* gddrnf_status */
> 
> 
> > +	if (xdr_stream_encode_u32(xdr, gdd->nf_status) != XDR_UNIT)
> > +		return nfserr_resource;
> > +
> > +	/* if it's GDD4_UNAVAIL then we're (almost) done */
> > +	if (gdd->nf_status == GDD4_UNAVAIL) {
> 
> I prefer using a switch for XDR unions. That makes our
> implementation look more like the XDR definition; easier for humans
> to audit and modify.
> 
> 
> > +		/* We never call back */
> > +		return nfsd4_encode_bool(xdr, false);
> 
> Again, let's move this boolean to struct nfsd4_get_dir_delegation to
> enable nfsd4_proc_get_dir_delegation to decide in the future.
> 
> 
> > +	}
> > +
> > +	/* GDD4_OK case */
> 
> If a switch is used, then this comment becomes a real piece of
> self-verifying code:
> 
> 	case GDD4_OK:
> 
> 
> > +	nfserr = nfsd4_encode_verifier4(xdr, &gdd->cookieverf);
> > +	if (nfserr)
> > +		return nfserr;
> > +	nfserr = nfsd4_encode_stateid4(xdr, &gdd->stateid);
> > +	if (nfserr)
> > +		return nfserr;
> > +	/* No notifications (yet) */
> > +	nfserr = nfsd4_encode_bitmap4(xdr, 0, 0, 0);
> > +	if (nfserr)
> > +		return nfserr;
> > +	nfserr = nfsd4_encode_bitmap4(xdr, 0, 0, 0);
> > +	if (nfserr)
> > +		return nfserr;
> > +	return nfsd4_encode_bitmap4(xdr, 0, 0, 0);
> 
> All these as well can go in struct nfsd4_get_dir_delegation.
> 
> 
> > +}
> > +
> >  static __be32
> >  nfsd4_encode_getdeviceinfo(struct nfsd4_compoundres *resp, __be32 nfserr,
> >  		union nfsd4_op_u *u)
> > @@ -5580,7 +5648,7 @@ static const nfsd4_enc nfsd4_enc_ops[] = {
> >  	[OP_CREATE_SESSION]	= nfsd4_encode_create_session,
> >  	[OP_DESTROY_SESSION]	= nfsd4_encode_noop,
> >  	[OP_FREE_STATEID]	= nfsd4_encode_noop,
> > -	[OP_GET_DIR_DELEGATION]	= nfsd4_encode_noop,
> > +	[OP_GET_DIR_DELEGATION]	= nfsd4_encode_get_dir_delegation,
> >  #ifdef CONFIG_NFSD_PNFS
> >  	[OP_GETDEVICEINFO]	= nfsd4_encode_getdeviceinfo,
> >  	[OP_GETDEVICELIST]	= nfsd4_encode_noop,
> > diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> > index 415516c1b27e..27de75f32dea 100644
> > --- a/fs/nfsd/xdr4.h
> > +++ b/fs/nfsd/xdr4.h
> > @@ -518,6 +518,13 @@ struct nfsd4_free_stateid {
> >  	stateid_t	fr_stateid;         /* request */
> >  };
> >  
> > +struct nfsd4_get_dir_delegation {
> > +	u32		notification_types[1];	/* request */
> > +	u32		nf_status;		/* response */
> > +	nfs4_verifier	cookieverf;		/* response */
> > +	stateid_t	stateid;		/* response */
> > +};
> > +
> >  /* also used for NVERIFY */
> >  struct nfsd4_verify {
> >  	u32		ve_bmval[3];        /* request */
> > @@ -797,6 +804,7 @@ struct nfsd4_op {
> >  		struct nfsd4_reclaim_complete	reclaim_complete;
> >  		struct nfsd4_test_stateid	test_stateid;
> >  		struct nfsd4_free_stateid	free_stateid;
> > +		struct nfsd4_get_dir_delegation	get_dir_delegation;
> >  		struct nfsd4_getdeviceinfo	getdeviceinfo;
> >  		struct nfsd4_layoutget		layoutget;
> >  		struct nfsd4_layoutcommit	layoutcommit;
> > diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> > index ef8d2d618d5b..11ad088b411d 100644
> > --- a/include/linux/nfs4.h
> > +++ b/include/linux/nfs4.h
> > @@ -701,6 +701,11 @@ enum state_protect_how4 {
> >  	SP4_SSV		= 2
> >  };
> >  
> > +enum get_dir_delegation_non_fatal_res {
> > +	GDD4_OK		= 0,
> > +	GDD4_UNAVAIL	= 1
> > +};
> > +
> >  enum pnfs_layouttype {
> >  	LAYOUT_NFSV4_1_FILES  = 1,
> >  	LAYOUT_OSD2_OBJECTS = 2,
> > 
> > -- 
> > 2.44.0
> > 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 12/24] nfsd: encoders and decoders for GET_DIR_DELEGATION
  2024-03-18 16:12     ` Jeff Layton
@ 2024-03-18 16:19       ` Chuck Lever
  0 siblings, 0 replies; 47+ messages in thread
From: Chuck Lever @ 2024-03-18 16:19 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Mon, Mar 18, 2024 at 12:12:06PM -0400, Jeff Layton wrote:
> On Sun, 2024-03-17 at 11:34 -0400, Chuck Lever wrote:
> > On Fri, Mar 15, 2024 at 12:53:03PM -0400, Jeff Layton wrote:
> > > This adds basic infrastructure for handing GET_DIR_DELEGATION calls from
> > > clients, including the  decoders and encoders. For now, the server side
> > > always just returns that the  delegation is GDDR_UNAVAIL (and that we
> > > won't call back).
> > > 
> > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > ---
> > >  fs/nfsd/nfs4proc.c   | 30 ++++++++++++++++++++++
> > >  fs/nfsd/nfs4xdr.c    | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> > >  fs/nfsd/xdr4.h       |  8 ++++++
> > >  include/linux/nfs4.h |  5 ++++
> > >  4 files changed, 113 insertions(+), 2 deletions(-)
> > 
> > Just a handful of style preferences below.
> > 
> 
> Those comments all make sense. I'll respin along those lines.
> 
> Also, I may go ahead and send this patch separately from the rest of the
> series. I think it would be best to have trivial support for
> GET_DIR_DELEGATION in the kernel server as soon as possible.
> 
> Eventually, clients may start sending these calls, and it's better if we
> can just return a non-fatal error instead of sending back NFSERR_NOTSUPP
> when they do.

I have no objection to the XDR pieces going into NFSD before the
rest of the series. Thanks for this implementation!


> > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > > index 2927b1263f08..7973fe17bf3c 100644
> > > --- a/fs/nfsd/nfs4proc.c
> > > +++ b/fs/nfsd/nfs4proc.c
> > > @@ -2173,6 +2173,18 @@ nfsd4_layout_verify(struct svc_export *exp, unsigned int layout_type)
> > >  	return nfsd4_layout_ops[layout_type];
> > >  }
> > >  
> > > +static __be32
> > > +nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
> > > +			 struct nfsd4_compound_state *cstate,
> > > +			 union nfsd4_op_u *u)
> > > +{
> > > +	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> > > +
> > > +	/* FIXME: actually return a delegation */
> > > +	gdd->nf_status = GDD4_UNAVAIL;
> > > +	return nfs_ok;
> > > +}
> > > +
> > >  static __be32
> > >  nfsd4_getdeviceinfo(struct svc_rqst *rqstp,
> > >  		struct nfsd4_compound_state *cstate, union nfsd4_op_u *u)
> > > @@ -3082,6 +3094,18 @@ static u32 nfsd4_copy_notify_rsize(const struct svc_rqst *rqstp,
> > >  		* sizeof(__be32);
> > >  }
> > >  
> > > +static u32 nfsd4_get_dir_delegation_rsize(const struct svc_rqst *rqstp,
> > > +					  const struct nfsd4_op *op)
> > > +{
> > > +	return (op_encode_hdr_size +
> > > +		1 /* gddr_status */ +
> > > +		op_encode_verifier_maxsz +
> > > +		op_encode_stateid_maxsz +
> > > +		2 /* gddr_notification */ +
> > > +		2 /* gddr_child_attributes */ +
> > > +		2 /* gddr_dir_attributes */);
> > > +}
> > > +
> > >  #ifdef CONFIG_NFSD_PNFS
> > >  static u32 nfsd4_getdeviceinfo_rsize(const struct svc_rqst *rqstp,
> > >  				     const struct nfsd4_op *op)
> > > @@ -3470,6 +3494,12 @@ static const struct nfsd4_operation nfsd4_ops[] = {
> > >  		.op_get_currentstateid = nfsd4_get_freestateid,
> > >  		.op_rsize_bop = nfsd4_only_status_rsize,
> > >  	},
> > > +	[OP_GET_DIR_DELEGATION] = {
> > > +		.op_func = nfsd4_get_dir_delegation,
> > > +		.op_flags = OP_MODIFIES_SOMETHING,
> > > +		.op_name = "OP_GET_DIR_DELEGATION",
> > > +		.op_rsize_bop = nfsd4_get_dir_delegation_rsize,
> > > +	},
> > >  #ifdef CONFIG_NFSD_PNFS
> > >  	[OP_GETDEVICEINFO] = {
> > >  		.op_func = nfsd4_getdeviceinfo,
> > > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> > > index fac938f563ad..3718bef74e9f 100644
> > > --- a/fs/nfsd/nfs4xdr.c
> > > +++ b/fs/nfsd/nfs4xdr.c
> > > @@ -1732,6 +1732,40 @@ nfsd4_decode_free_stateid(struct nfsd4_compoundargs *argp,
> > >  	return nfsd4_decode_stateid4(argp, &free_stateid->fr_stateid);
> > >  }
> > >  
> > > +static __be32
> > > +nfsd4_decode_get_dir_delegation(struct nfsd4_compoundargs *argp,
> > > +		union nfsd4_op_u *u)
> > > +{
> > > +	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> > > +	struct timespec64 ts;
> > > +	u32 signal_deleg_avail;
> > > +	u32 attrs[1];
> > 
> > I know this isn't how we've done XDR in the past, but I'd rather
> > see these dummy args as fields in struct nfsd4_get_dir_delegation,
> > and also move the comments about whether each argument is supported
> > to the putative nfsd4_proc_get_dir_delegation().
> > 
> > The actual implementation of GET_DIR_DELEGATION is in nfs4proc.c,
> > after all, not here. This is simply a translation function.
> > 
> > 
> > > +	__be32 status;
> > > +
> > > +	memset(gdd, 0, sizeof(*gdd));
> > > +
> > > +	/* No signal_avail support for now (and maybe never) */
> > > +	if (xdr_stream_decode_bool(argp->xdr, &signal_deleg_avail) < 0)
> > > +		return nfserr_bad_xdr;
> > > +	status = nfsd4_decode_bitmap4(argp, gdd->notification_types,
> > > +				      ARRAY_SIZE(gdd->notification_types));
> > > +	if (status)
> > > +		return status;
> > > +
> > > +	/* For now, we don't support child or dir attr change notification */
> > > +	status = nfsd4_decode_nfstime4(argp, &ts);
> > > +	if (status)
> > > +		return status;
> > > +	/* No dir attr notification support yet either */
> > > +	status = nfsd4_decode_nfstime4(argp, &ts);
> > > +	if (status)
> > > +		return status;
> > > +	status = nfsd4_decode_bitmap4(argp, attrs, ARRAY_SIZE(attrs));
> > > +	if (status)
> > > +		return status;
> > > +	return nfsd4_decode_bitmap4(argp, attrs, ARRAY_SIZE(attrs));
> > > +}
> > > +
> > >  #ifdef CONFIG_NFSD_PNFS
> > >  static __be32
> > >  nfsd4_decode_getdeviceinfo(struct nfsd4_compoundargs *argp,
> > > @@ -2370,7 +2404,7 @@ static const nfsd4_dec nfsd4_dec_ops[] = {
> > >  	[OP_CREATE_SESSION]	= nfsd4_decode_create_session,
> > >  	[OP_DESTROY_SESSION]	= nfsd4_decode_destroy_session,
> > >  	[OP_FREE_STATEID]	= nfsd4_decode_free_stateid,
> > > -	[OP_GET_DIR_DELEGATION]	= nfsd4_decode_notsupp,
> > > +	[OP_GET_DIR_DELEGATION]	= nfsd4_decode_get_dir_delegation,
> > >  #ifdef CONFIG_NFSD_PNFS
> > >  	[OP_GETDEVICEINFO]	= nfsd4_decode_getdeviceinfo,
> > >  	[OP_GETDEVICELIST]	= nfsd4_decode_notsupp,
> > > @@ -5002,6 +5036,40 @@ nfsd4_encode_device_addr4(struct xdr_stream *xdr,
> > >  	return nfserr_toosmall;
> > >  }
> > >  
> > > +static __be32
> > > +nfsd4_encode_get_dir_delegation(struct nfsd4_compoundres *resp, __be32 nfserr,
> > > +				union nfsd4_op_u *u)
> > > +{
> > > +	struct xdr_stream *xdr = resp->xdr;
> > > +	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> > > +
> > > +	/* Encode the GDDR_* status code */
> > 
> > In other encoders, I've used simply the name of the field as it is
> > in the RFC as a documenting comment. That's more clear, and is
> > easily grep-able. So:
> > 
> > 	/* gddrnf_status */
> > 
> > 
> > > +	if (xdr_stream_encode_u32(xdr, gdd->nf_status) != XDR_UNIT)
> > > +		return nfserr_resource;
> > > +
> > > +	/* if it's GDD4_UNAVAIL then we're (almost) done */
> > > +	if (gdd->nf_status == GDD4_UNAVAIL) {
> > 
> > I prefer using a switch for XDR unions. That makes our
> > implementation look more like the XDR definition; easier for humans
> > to audit and modify.
> > 
> > 
> > > +		/* We never call back */
> > > +		return nfsd4_encode_bool(xdr, false);
> > 
> > Again, let's move this boolean to struct nfsd4_get_dir_delegation to
> > enable nfsd4_proc_get_dir_delegation to decide in the future.
> > 
> > 
> > > +	}
> > > +
> > > +	/* GDD4_OK case */
> > 
> > If a switch is used, then this comment becomes a real piece of
> > self-verifying code:
> > 
> > 	case GDD4_OK:
> > 
> > 
> > > +	nfserr = nfsd4_encode_verifier4(xdr, &gdd->cookieverf);
> > > +	if (nfserr)
> > > +		return nfserr;
> > > +	nfserr = nfsd4_encode_stateid4(xdr, &gdd->stateid);
> > > +	if (nfserr)
> > > +		return nfserr;
> > > +	/* No notifications (yet) */
> > > +	nfserr = nfsd4_encode_bitmap4(xdr, 0, 0, 0);
> > > +	if (nfserr)
> > > +		return nfserr;
> > > +	nfserr = nfsd4_encode_bitmap4(xdr, 0, 0, 0);
> > > +	if (nfserr)
> > > +		return nfserr;
> > > +	return nfsd4_encode_bitmap4(xdr, 0, 0, 0);
> > 
> > All these as well can go in struct nfsd4_get_dir_delegation.
> > 
> > 
> > > +}
> > > +
> > >  static __be32
> > >  nfsd4_encode_getdeviceinfo(struct nfsd4_compoundres *resp, __be32 nfserr,
> > >  		union nfsd4_op_u *u)
> > > @@ -5580,7 +5648,7 @@ static const nfsd4_enc nfsd4_enc_ops[] = {
> > >  	[OP_CREATE_SESSION]	= nfsd4_encode_create_session,
> > >  	[OP_DESTROY_SESSION]	= nfsd4_encode_noop,
> > >  	[OP_FREE_STATEID]	= nfsd4_encode_noop,
> > > -	[OP_GET_DIR_DELEGATION]	= nfsd4_encode_noop,
> > > +	[OP_GET_DIR_DELEGATION]	= nfsd4_encode_get_dir_delegation,
> > >  #ifdef CONFIG_NFSD_PNFS
> > >  	[OP_GETDEVICEINFO]	= nfsd4_encode_getdeviceinfo,
> > >  	[OP_GETDEVICELIST]	= nfsd4_encode_noop,
> > > diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> > > index 415516c1b27e..27de75f32dea 100644
> > > --- a/fs/nfsd/xdr4.h
> > > +++ b/fs/nfsd/xdr4.h
> > > @@ -518,6 +518,13 @@ struct nfsd4_free_stateid {
> > >  	stateid_t	fr_stateid;         /* request */
> > >  };
> > >  
> > > +struct nfsd4_get_dir_delegation {
> > > +	u32		notification_types[1];	/* request */
> > > +	u32		nf_status;		/* response */
> > > +	nfs4_verifier	cookieverf;		/* response */
> > > +	stateid_t	stateid;		/* response */
> > > +};
> > > +
> > >  /* also used for NVERIFY */
> > >  struct nfsd4_verify {
> > >  	u32		ve_bmval[3];        /* request */
> > > @@ -797,6 +804,7 @@ struct nfsd4_op {
> > >  		struct nfsd4_reclaim_complete	reclaim_complete;
> > >  		struct nfsd4_test_stateid	test_stateid;
> > >  		struct nfsd4_free_stateid	free_stateid;
> > > +		struct nfsd4_get_dir_delegation	get_dir_delegation;
> > >  		struct nfsd4_getdeviceinfo	getdeviceinfo;
> > >  		struct nfsd4_layoutget		layoutget;
> > >  		struct nfsd4_layoutcommit	layoutcommit;
> > > diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> > > index ef8d2d618d5b..11ad088b411d 100644
> > > --- a/include/linux/nfs4.h
> > > +++ b/include/linux/nfs4.h
> > > @@ -701,6 +701,11 @@ enum state_protect_how4 {
> > >  	SP4_SSV		= 2
> > >  };
> > >  
> > > +enum get_dir_delegation_non_fatal_res {
> > > +	GDD4_OK		= 0,
> > > +	GDD4_UNAVAIL	= 1
> > > +};
> > > +
> > >  enum pnfs_layouttype {
> > >  	LAYOUT_NFSV4_1_FILES  = 1,
> > >  	LAYOUT_OSD2_OBJECTS = 2,
> > > 
> > > -- 
> > > 2.44.0
> > > 
> > 
> 
> -- 
> Jeff Layton <jlayton@kernel.org>

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback
  2024-03-15 16:52 ` [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback Jeff Layton
  2024-03-17 14:56   ` Chuck Lever
@ 2024-03-20 13:13   ` Christian Brauner
  2024-03-20 13:17     ` Jeff Layton
  1 sibling, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2024-03-20 13:13 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Jan Kara, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Fri, Mar 15, 2024 at 12:52:53PM -0400, Jeff Layton wrote:
> The NFSv4.1 protocol adds support for directory delegations, but it
> specifies that if you already have a delegation and try to request a new
> one on the same filehandle, the server must reply that the delegation is
> unavailable.
> 
> Add a new lease_manager callback to allow the lease manager (nfsd in
> this case) to impose extra checks when performing a setlease.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/locks.c               |  5 +++++
>  include/linux/filelock.h | 10 ++++++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index cb4b35d26162..415cca8e9565 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1822,6 +1822,11 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr
>  			continue;
>  		}
>  
> +		/* Allow the lease manager to veto the setlease */
> +		if (lease->fl_lmops->lm_set_conflict &&
> +		    lease->fl_lmops->lm_set_conflict(lease, fl))
> +			goto out;
> +
>  		/*
>  		 * No exclusive leases if someone else has a lease on
>  		 * this file:
> diff --git a/include/linux/filelock.h b/include/linux/filelock.h
> index daee999d05f3..c5fc768087df 100644
> --- a/include/linux/filelock.h
> +++ b/include/linux/filelock.h
> @@ -49,6 +49,16 @@ struct lease_manager_operations {
>  	int (*lm_change)(struct file_lease *, int, struct list_head *);
>  	void (*lm_setup)(struct file_lease *, void **);
>  	bool (*lm_breaker_owns_lease)(struct file_lease *);
> +
> +	/**
> +	 * lm_set_conflict - extra conditions for setlease
> +	 * @new: new file_lease being set
> +	 * @old: old (extant) file_lease
> +	 *
> +	 * This allows the lease manager to add extra conditions when
> +	 * setting a lease.
> +	 */
> +	bool (*lm_set_conflict)(struct file_lease *new, struct file_lease *old);

Minor, but it seems a bit misnamed to me. I'd recommend calling this
lm_may_set_lease() or lm_may_lease().

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback
  2024-03-20 13:13   ` Christian Brauner
@ 2024-03-20 13:17     ` Jeff Layton
  0 siblings, 0 replies; 47+ messages in thread
From: Jeff Layton @ 2024-03-20 13:17 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexander Viro, Jan Kara, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Wed, 2024-03-20 at 14:13 +0100, Christian Brauner wrote:
> On Fri, Mar 15, 2024 at 12:52:53PM -0400, Jeff Layton wrote:
> > The NFSv4.1 protocol adds support for directory delegations, but it
> > specifies that if you already have a delegation and try to request a new
> > one on the same filehandle, the server must reply that the delegation is
> > unavailable.
> > 
> > Add a new lease_manager callback to allow the lease manager (nfsd in
> > this case) to impose extra checks when performing a setlease.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/locks.c               |  5 +++++
> >  include/linux/filelock.h | 10 ++++++++++
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/fs/locks.c b/fs/locks.c
> > index cb4b35d26162..415cca8e9565 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -1822,6 +1822,11 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr
> >  			continue;
> >  		}
> >  
> > +		/* Allow the lease manager to veto the setlease */
> > +		if (lease->fl_lmops->lm_set_conflict &&
> > +		    lease->fl_lmops->lm_set_conflict(lease, fl))
> > +			goto out;
> > +
> >  		/*
> >  		 * No exclusive leases if someone else has a lease on
> >  		 * this file:
> > diff --git a/include/linux/filelock.h b/include/linux/filelock.h
> > index daee999d05f3..c5fc768087df 100644
> > --- a/include/linux/filelock.h
> > +++ b/include/linux/filelock.h
> > @@ -49,6 +49,16 @@ struct lease_manager_operations {
> >  	int (*lm_change)(struct file_lease *, int, struct list_head *);
> >  	void (*lm_setup)(struct file_lease *, void **);
> >  	bool (*lm_breaker_owns_lease)(struct file_lease *);
> > +
> > +	/**
> > +	 * lm_set_conflict - extra conditions for setlease
> > +	 * @new: new file_lease being set
> > +	 * @old: old (extant) file_lease
> > +	 *
> > +	 * This allows the lease manager to add extra conditions when
> > +	 * setting a lease.
> > +	 */
> > +	bool (*lm_set_conflict)(struct file_lease *new, struct file_lease *old);
> 
> Minor, but it seems a bit misnamed to me. I'd recommend calling this
> lm_may_set_lease() or lm_may_lease().
> 

Yeah, I wasn't thrilled with the name either. Your suggestions sound
reasonable to me. I'll change it to something along those lines.

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 08/24] vfs: make vfs_mknod break delegations on parent directory
  2024-03-15 16:52 ` [PATCH RFC 08/24] vfs: make vfs_mknod " Jeff Layton
@ 2024-03-20 13:42   ` Christian Brauner
  2024-03-20 20:12     ` Jeff Layton
  0 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2024-03-20 13:42 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Jan Kara, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

>  int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
> -              umode_t, dev_t);
> +              umode_t, dev_t, struct inode **);

So we will have at least the following helpers with an additional
delegated inode argument.

vfs_unlink()
vfs_link()
notify_change()
vfs_create()
vfs_mknod()
vfs_mkdir()
vfs_rmdir()

From looking at callers all these helpers will be called with non-NULL
delegated inode argument in vfs only. Unless it is generally conceivable
that other callers will want to pass a non-NULL inode argument over time
it might make more sense to add vfs_<operation>_delegated() or
__vfs_<operation>() and make vfs_mknod() and friends exported wrappers
around it.

I mean it's a matter of preference ultimately but this seems cleaner to
me. So at least for the new ones we should consider it. Would also make
the patch smaller.

>  int vfs_symlink(struct mnt_idmap *, struct inode *,
>  		struct dentry *, const char *);
>  int vfs_link(struct dentry *, struct mnt_idmap *, struct inode *,
> @@ -1879,7 +1879,7 @@ static inline int vfs_whiteout(struct mnt_idmap *idmap,
>  			       struct inode *dir, struct dentry *dentry)
>  {
>  	return vfs_mknod(idmap, dir, dentry, S_IFCHR | WHITEOUT_MODE,
> -			 WHITEOUT_DEV);
> +			 WHITEOUT_DEV, NULL);
>  }
>  
>  struct file *kernel_tmpfile_open(struct mnt_idmap *idmap,
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 0748e7ea5210..34fbcc90c984 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1227,7 +1227,7 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr,
>  	idmap = mnt_idmap(parent.mnt);
>  	err = security_path_mknod(&parent, dentry, mode, 0);
>  	if (!err)
> -		err = vfs_mknod(idmap, d_inode(parent.dentry), dentry, mode, 0);
> +		err = vfs_mknod(idmap, d_inode(parent.dentry), dentry, mode, 0, NULL);
>  	if (err)
>  		goto out_path;
>  	err = mutex_lock_interruptible(&u->bindlock);
> 
> -- 
> 2.44.0
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 08/24] vfs: make vfs_mknod break delegations on parent directory
  2024-03-20 13:42   ` Christian Brauner
@ 2024-03-20 20:12     ` Jeff Layton
  2024-03-22 14:13       ` Christian Brauner
  0 siblings, 1 reply; 47+ messages in thread
From: Jeff Layton @ 2024-03-20 20:12 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexander Viro, Jan Kara, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Wed, 2024-03-20 at 14:42 +0100, Christian Brauner wrote:
> >  int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
> > -              umode_t, dev_t);
> > +              umode_t, dev_t, struct inode **);
> 
> So we will have at least the following helpers with an additional
> delegated inode argument.
> 
> vfs_unlink()
> vfs_link()
> notify_change()
> vfs_create()
> vfs_mknod()
> vfs_mkdir()
> vfs_rmdir()
> 
> From looking at callers all these helpers will be called with non-NULL
> delegated inode argument in vfs only. Unless it is generally conceivable
> that other callers will want to pass a non-NULL inode argument over time
> it might make more sense to add vfs_<operation>_delegated() or
> __vfs_<operation>() and make vfs_mknod() and friends exported wrappers
> around it.
> 
> I mean it's a matter of preference ultimately but this seems cleaner to
> me. So at least for the new ones we should consider it. Would also make
> the patch smaller.
> 

Good suggestion. I just respun along those lines and it's a lot cleaner.
I'm still testing it but here is the new diffstat. It's a little larger
actually, but it keeps the changes more confined to namei.c:

jlayton@tleilax:~/git/linux$ git diff master --stat
 fs/locks.c                |  12 +++-
 fs/namei.c                | 227 ++++++++++++++++++++++++++++++++++++++++++++++--------------------
 fs/nfs/delegation.c       |   5 ++
 fs/nfs/dir.c              |  20 ++++++
 fs/nfs/internal.h         |   2 +-
 fs/nfs/nfs4file.c         |   2 +
 fs/nfs/nfs4proc.c         |  62 +++++++++++++++++-
 fs/nfs/nfs4trace.h        | 104 ++++++++++++++++++++++++++++++
 fs/nfs/nfs4xdr.c          | 136 +++++++++++++++++++++++++++++++++++++++
 fs/nfs/nfstrace.h         |   8 ++-
 fs/nfsd/filecache.c       |  37 +++++++++--
 fs/nfsd/filecache.h       |   2 +
 fs/nfsd/nfs4proc.c        |  48 ++++++++++++++
 fs/nfsd/nfs4state.c       | 113 ++++++++++++++++++++++++++++++++-
 fs/nfsd/nfs4xdr.c         |  91 ++++++++++++++++++++++++++-
 fs/nfsd/state.h           |   5 ++
 fs/nfsd/vfs.c             |   5 +-
 fs/nfsd/vfs.h             |   2 +-
 fs/nfsd/xdr4.h            |  19 ++++++
 fs/smb/client/cifsfs.c    |   3 +
 include/linux/filelock.h  |  14 +++++
 include/linux/nfs4.h      |   7 +++
 include/linux/nfs_fs.h    |   1 +
 include/linux/nfs_fs_sb.h |   1 +
 include/linux/nfs_xdr.h   |   2 +
 25 files changed, 838 insertions(+), 90 deletions(-)

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH RFC 08/24] vfs: make vfs_mknod break delegations on parent directory
  2024-03-20 20:12     ` Jeff Layton
@ 2024-03-22 14:13       ` Christian Brauner
  0 siblings, 0 replies; 47+ messages in thread
From: Christian Brauner @ 2024-03-22 14:13 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Jan Kara, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Howells, Tyler Hicks, Neil Brown,
	Olga Kornievskaia, Dai Ngo, Miklos Szeredi, Amir Goldstein,
	Namjae Jeon, Sergey Senozhatsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, netfs, ecryptfs,
	linux-unionfs, netdev

On Wed, Mar 20, 2024 at 04:12:29PM -0400, Jeff Layton wrote:
> On Wed, 2024-03-20 at 14:42 +0100, Christian Brauner wrote:
> > >  int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
> > > -              umode_t, dev_t);
> > > +              umode_t, dev_t, struct inode **);
> > 
> > So we will have at least the following helpers with an additional
> > delegated inode argument.
> > 
> > vfs_unlink()
> > vfs_link()
> > notify_change()
> > vfs_create()
> > vfs_mknod()
> > vfs_mkdir()
> > vfs_rmdir()
> > 
> > From looking at callers all these helpers will be called with non-NULL
> > delegated inode argument in vfs only. Unless it is generally conceivable
> > that other callers will want to pass a non-NULL inode argument over time
> > it might make more sense to add vfs_<operation>_delegated() or
> > __vfs_<operation>() and make vfs_mknod() and friends exported wrappers
> > around it.
> > 
> > I mean it's a matter of preference ultimately but this seems cleaner to
> > me. So at least for the new ones we should consider it. Would also make
> > the patch smaller.
> > 
> 
> Good suggestion. I just respun along those lines and it's a lot cleaner.
> I'm still testing it but here is the new diffstat. It's a little larger
> actually, but it keeps the changes more confined to namei.c:

Sounds good to me!

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2024-03-22 14:14 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-15 16:52 [PATCH RFC 00/24] vfs, nfsd, nfs: implement directory delegations Jeff Layton
2024-03-15 16:52 ` [PATCH RFC 01/24] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
2024-03-15 16:52 ` [PATCH RFC 02/24] filelock: add a lm_set_conflict lease_manager callback Jeff Layton
2024-03-17 14:56   ` Chuck Lever
2024-03-18 11:07     ` Jeff Layton
2024-03-20 13:13   ` Christian Brauner
2024-03-20 13:17     ` Jeff Layton
2024-03-15 16:52 ` [PATCH RFC 03/24] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink} Jeff Layton
2024-03-16 23:57   ` Al Viro
2024-03-17 14:09     ` Jeff Layton
2024-03-15 16:52 ` [PATCH RFC 04/24] vfs: allow mkdir to wait for delegation break on parent Jeff Layton
2024-03-15 16:52 ` [PATCH RFC 05/24] vfs: allow rmdir " Jeff Layton
2024-03-15 16:52 ` [PATCH RFC 06/24] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
2024-03-17  0:19   ` Al Viro
2024-03-17 12:23     ` Jeff Layton
2024-03-18  8:25   ` Stefan Metzmacher
2024-03-18 10:21     ` Jeff Layton
2024-03-15 16:52 ` [PATCH RFC 07/24] vfs: make vfs_create break delegations on parent directory Jeff Layton
2024-03-15 16:52 ` [PATCH RFC 08/24] vfs: make vfs_mknod " Jeff Layton
2024-03-20 13:42   ` Christian Brauner
2024-03-20 20:12     ` Jeff Layton
2024-03-22 14:13       ` Christian Brauner
2024-03-15 16:53 ` [PATCH RFC 09/24] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 10/24] nfsd: allow filecache to hold S_IFDIR files Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 11/24] nfsd: allow DELEGRETURN on directories Jeff Layton
2024-03-17 15:09   ` Chuck Lever
2024-03-17 16:03     ` Trond Myklebust
2024-03-18 11:22       ` Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 12/24] nfsd: encoders and decoders for GET_DIR_DELEGATION Jeff Layton
2024-03-17 15:34   ` Chuck Lever
2024-03-18 16:12     ` Jeff Layton
2024-03-18 16:19       ` Chuck Lever
2024-03-15 16:53 ` [PATCH RFC 13/24] nfsd: check for delegation conflicts vs. the same client Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 14/24] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
2024-03-17 15:42   ` Chuck Lever
2024-03-15 16:53 ` [PATCH RFC 15/24] nfs: fix nfs_stateid_hash prototype when CONFIG_CRC32 isn't set Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 16/24] nfs: remove unused NFS_CALL macro Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 17/24] nfs: add cache_validity to the nfs_inode_event tracepoints Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 18/24] nfs: add a tracepoint to nfs_inode_detach_delegation_locked Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 19/24] nfs: new tracepoint in nfs_delegation_need_return Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 20/24] nfs: new tracepoint in match_stateid operation Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 21/24] nfs: add a GDD_GETATTR rpc operation Jeff Layton
2024-03-15 20:50   ` Anna Schumaker
2024-03-15 21:29     ` Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 22/24] nfs: skip dentry revalidation when parent dir has a delegation Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 23/24] nfs: optionally request a delegation on GETATTR Jeff Layton
2024-03-15 16:53 ` [PATCH RFC 24/24] nfs: add a module parameter to disable directory delegations Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).