* [PATCH 0/3] Eliminate delegation self-conflicts
@ 2017-08-25 21:52 J. Bruce Fields
  2017-08-25 21:52 ` [PATCH 1/3] fs: cleanup to hide some details of delegation logic J. Bruce Fields
                   ` (3 more replies)
  0 siblings, 4 replies; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-25 21:52 UTC (permalink / raw)
  To: linux-nfs; +Cc: linux-fsdevel, Trond Myklebust, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

This is my attempt to fix the NFS server so we don't unnecessarily
recall delegations when the operation breaking the delegation comes from
the same client that holds the delegation.

To do that we need some way to pass the identity of the breaker down
through the VFS.

I didn't feel like adding another argument to all the VFS functions that
this might need to be passed down through.  But all of those functions
already take a struct inode **delegated_inode, so instead I turned that
into a single-member struct deleg_break_ctl *, which I then added a
second member to.
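
A minimal userspace sketch of that idea, not the kernel code.  The struct
is the one patch 2/3 calls struct deleg_break_ctl, and the member names
delegated_inode and who are taken from the patches; struct inode here is
a stand-in, and model_op() and deleg_ctl_demo() are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct inode (assumption, illustration only). */
struct inode { int ino; };

/* Wrapper that replaces the bare "struct inode **delegated_inode" argument. */
struct deleg_break_ctl {
	struct inode *delegated_inode;	/* out: inode whose delegation needs breaking */
	void *who;			/* in: opaque identity of the breaker, may be NULL */
};

/*
 * Hypothetical callee.  Its signature stays the same however many members
 * the struct grows; a NULL ctl means "caller does not want the inode".
 * Returns 1 if a breaker identity was supplied, 0 otherwise.
 */
static int model_op(struct inode *victim, struct deleg_break_ctl *ctl)
{
	if (ctl)
		ctl->delegated_inode = victim;
	return (ctl && ctl->who) ? 1 : 0;
}

/* Exercises the three ways a caller can invoke model_op(). */
static int deleg_ctl_demo(void)
{
	struct inode file = { .ino = 42 };
	int client = 7;				/* stands in for a request or client */
	struct deleg_break_ctl local = { 0 };	/* ordinary syscall path: no identity */
	struct deleg_break_ctl nfsd = { .who = &client };

	if (model_op(&file, NULL) != 0)
		return -1;
	if (model_op(&file, &local) != 0 || local.delegated_inode != &file)
		return -2;
	if (model_op(&file, &nfsd) != 1 || nfsd.delegated_inode != &file)
		return -3;
	return 0;
}
```

Callers that have no identity to report simply zero-initialize the
struct, so only nfsd has to know about the extra member.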

I dunno, welcome any more straightforward ways of doing this if anyone
has suggestions.

My first attempt was to do this by instead checking for conflicts in the
caller (nfsd) and then passing down just one bit telling the lease
code that conflicts had already been checked so it didn't need to check
again.  But that's much too early to check for conflicts, since the
caller doesn't have the necessary inode locks yet.

I'm still missing testing.  Regression tests pass, but I haven't
actually confirmed that the self-conflicts are gone!  Off to go hack on
pynfs....

--b.

J. Bruce Fields (3):
  fs: cleanup to hide some details of delegation logic
  fs: hide another detail of delegation logic
  nfsd: clients don't need to break their own delegations

 Documentation/filesystems/Locking |  2 ++
 fs/attr.c                         | 10 +++---
 fs/locks.c                        |  7 +++-
 fs/namei.c                        | 70 ++++++++++++++++++---------------------
 fs/nfsd/nfs4state.c               | 40 ++++++++++++++++++++++
 fs/nfsd/vfs.c                     | 26 ++++++++++++---
 fs/open.c                         | 24 ++++++--------
 fs/utimes.c                       | 12 +++----
 include/linux/fs.h                | 61 +++++++++++++++++++++-------------
 9 files changed, 159 insertions(+), 93 deletions(-)

-- 
2.13.5


* [PATCH 1/3] fs: cleanup to hide some details of delegation logic
  2017-08-25 21:52 [PATCH 0/3] Eliminate delegation self-conflicts J. Bruce Fields
@ 2017-08-25 21:52 ` J. Bruce Fields
  2017-08-28  3:54   ` NeilBrown
  2017-08-25 21:52 ` [PATCH 2/3] fs: hide another detail " J. Bruce Fields
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-25 21:52 UTC (permalink / raw)
  To: linux-nfs; +Cc: linux-fsdevel, Trond Myklebust, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Pull the checks for delegated_inode into break_deleg_wait() to simplify
the callers a little.

No change in behavior.
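
The resulting caller pattern can be modelled in userspace roughly as
follows.  This is a sketch under stated assumptions: everything prefixed
model_ is hypothetical, struct inode is a stand-in, MODEL_EWOULDBLOCK is
an illustrative constant, and the inode reference counting (ihold/iput)
done by the real helpers is left out.  Only DELEG_RETRY and the shape of
break_deleg_wait() follow the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DELEG_RETRY		1
#define MODEL_EWOULDBLOCK	11	/* stand-in for the kernel's EWOULDBLOCK */

/* Stand-in inode: "delegated" is nonzero while a delegation is outstanding. */
struct inode { int delegated; };

/* Stand-in for the blocking break_deleg(): afterwards the delegation is gone. */
static int model_break_deleg(struct inode *inode)
{
	inode->delegated = 0;
	return 0;
}

/* Same shape as the new helper: pass the error through unless a break is pending. */
static int model_break_deleg_wait(struct inode **delegated_inode, int error)
{
	if (!*delegated_inode)
		return error;
	error = model_break_deleg(*delegated_inode);
	*delegated_inode = NULL;
	return error ? error : DELEG_RETRY;
}

/* Stand-in for a VFS operation that refuses to run while a delegation exists. */
static int model_op(struct inode *inode, struct inode **delegated_inode)
{
	if (inode->delegated) {
		*delegated_inode = inode;
		return -MODEL_EWOULDBLOCK;
	}
	return 0;
}

/* Caller pattern after this patch; *attempts counts how often model_op() ran. */
static int model_caller(struct inode *inode, int *attempts)
{
	struct inode *delegated_inode = NULL;
	int error;

retry_deleg:
	(*attempts)++;
	error = model_op(inode, &delegated_inode);
	error = model_break_deleg_wait(&delegated_inode, error);
	if (error == DELEG_RETRY)
		goto retry_deleg;
	return error;
}
```

Because DELEG_RETRY is positive, it cannot be confused with 0 or with a
negative errno value, which is what lets the callers drop their own
delegated_inode checks.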

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/namei.c         | 26 ++++++++++----------------
 fs/open.c          | 16 ++++++----------
 fs/utimes.c        |  8 +++-----
 include/linux/fs.h | 12 +++++++-----
 4 files changed, 26 insertions(+), 36 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ddb6a7c2b3d4..5a93be7b2c9c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4048,11 +4048,9 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 	if (inode)
 		iput(inode);	/* truncate the inode here */
 	inode = NULL;
-	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
-		if (!error)
-			goto retry_deleg;
-	}
+	error = break_deleg_wait(&delegated_inode, error);
+	if (error == DELEG_RETRY)
+		goto retry_deleg;
 	mnt_drop_write(path.mnt);
 exit1:
 	path_put(&path);
@@ -4283,12 +4281,10 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
 	error = vfs_link(old_path.dentry, new_path.dentry->d_inode, new_dentry, &delegated_inode);
 out_dput:
 	done_path_create(&new_path, new_dentry);
-	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
-		if (!error) {
-			path_put(&old_path);
-			goto retry;
-		}
+	error = break_deleg_wait(&delegated_inode, error);
+	if (error == DELEG_RETRY) {
+		path_put(&old_path);
+		goto retry;
 	}
 	if (retry_estale(error, how)) {
 		path_put(&old_path);
@@ -4601,11 +4597,9 @@ SYSCALL_DEFINE5(renameat2, int, olddfd, const char __user *, oldname,
 	dput(old_dentry);
 exit3:
 	unlock_rename(new_path.dentry, old_path.dentry);
-	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
-		if (!error)
-			goto retry_deleg;
-	}
+	error = break_deleg_wait(&delegated_inode, error);
+	if (error == DELEG_RETRY)
+		goto retry_deleg;
 	mnt_drop_write(old_path.mnt);
 exit2:
 	if (retry_estale(error, lookup_flags))
diff --git a/fs/open.c b/fs/open.c
index 35bb784763a4..d49e9385e45d 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -532,11 +532,9 @@ static int chmod_common(const struct path *path, umode_t mode)
 	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 out_unlock:
 	inode_unlock(inode);
-	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
-		if (!error)
-			goto retry_deleg;
-	}
+	error = break_deleg_wait(&delegated_inode, error);
+	if (error == DELEG_RETRY)
+		goto retry_deleg;
 	mnt_drop_write(path->mnt);
 	return error;
 }
@@ -611,11 +609,9 @@ static int chown_common(const struct path *path, uid_t user, gid_t group)
 	if (!error)
 		error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	inode_unlock(inode);
-	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
-		if (!error)
-			goto retry_deleg;
-	}
+	error = break_deleg_wait(&delegated_inode, error);
+	if (error == DELEG_RETRY)
+		goto retry_deleg;
 	return error;
 }
 
diff --git a/fs/utimes.c b/fs/utimes.c
index 6571d8c848a0..75467b7ebfce 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -89,11 +89,9 @@ static int utimes_common(const struct path *path, struct timespec *times)
 	inode_lock(inode);
 	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	inode_unlock(inode);
-	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
-		if (!error)
-			goto retry_deleg;
-	}
+	error = break_deleg_wait(&delegated_inode, error);
+	if (error == DELEG_RETRY)
+		goto retry_deleg;
 
 	mnt_drop_write(path->mnt);
 out:
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e1fd5d21248..1d0d2fde1766 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2288,14 +2288,16 @@ static inline int try_break_deleg(struct inode *inode, struct inode **delegated_
 	return ret;
 }
 
-static inline int break_deleg_wait(struct inode **delegated_inode)
-{
-	int ret;
+#define DELEG_RETRY 1
 
-	ret = break_deleg(*delegated_inode, O_WRONLY);
+static inline int break_deleg_wait(struct inode **delegated_inode, int error)
+{
+	if (!*delegated_inode)
+		return error;
+	error = break_deleg(*delegated_inode, O_WRONLY);
 	iput(*delegated_inode);
 	*delegated_inode = NULL;
-	return ret;
+	return error ? error : DELEG_RETRY;
 }
 
 static inline int break_layout(struct inode *inode, bool wait)
-- 
2.13.5


* [PATCH 2/3] fs: hide another detail of delegation logic
  2017-08-25 21:52 [PATCH 0/3] Eliminate delegation self-conflicts J. Bruce Fields
  2017-08-25 21:52 ` [PATCH 1/3] fs: cleanup to hide some details of delegation logic J. Bruce Fields
@ 2017-08-25 21:52 ` J. Bruce Fields
  2017-08-28  4:43   ` NeilBrown
  2017-08-25 21:52 ` [PATCH 3/3] nfsd: clients don't need to break their own delegations J. Bruce Fields
  2017-08-26 18:06 ` [PATCH 0/3] Eliminate delegation self-conflicts Chuck Lever
  3 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-25 21:52 UTC (permalink / raw)
  To: linux-nfs; +Cc: linux-fsdevel, Trond Myklebust, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Pass around a new struct deleg_break_ctl instead of pointers to inode
pointers; in a future patch I want to use this to pass a little more
information from the nfs server to the lease code.

No change in behavior.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/attr.c          | 10 +++++-----
 fs/namei.c         | 50 +++++++++++++++++++++++++-------------------------
 fs/open.c          | 12 ++++++------
 fs/utimes.c        |  6 +++---
 include/linux/fs.h | 34 ++++++++++++++++++++--------------
 5 files changed, 59 insertions(+), 53 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index 135304146120..255315dbca32 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -185,23 +185,23 @@ EXPORT_SYMBOL(setattr_copy);
  * notify_change - modify attributes of a filesytem object
  * @dentry:	object affected
  * @iattr:	new attributes
- * @delegated_inode: returns inode, if the inode is delegated
+ * @deleg_break_ctl: used to return inode, if the inode is delegated
  *
  * The caller must hold the i_mutex on the affected object.
  *
  * If notify_change discovers a delegation in need of breaking,
  * it will return -EWOULDBLOCK and return a reference to the inode in
- * delegated_inode.  The caller should then break the delegation and
+ * deleg_break_ctl.  The caller should then break the delegation and
  * retry.  Because breaking a delegation may take a long time, the
  * caller should drop the i_mutex before doing so.
  *
- * Alternatively, a caller may pass NULL for delegated_inode.  This may
+ * Alternatively, a caller may pass NULL for deleg_break_ctl. This may
  * be appropriate for callers that expect the underlying filesystem not
  * to be NFS exported.  Also, passing NULL is fine for callers holding
  * the file open for write, as there can be no conflicting delegation in
  * that case.
  */
-int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **delegated_inode)
+int notify_change(struct dentry * dentry, struct iattr * attr, struct deleg_break_ctl *deleg_break_ctl)
 {
 	struct inode *inode = dentry->d_inode;
 	umode_t mode = inode->i_mode;
@@ -304,7 +304,7 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de
 	error = security_inode_setattr(dentry, attr);
 	if (error)
 		return error;
-	error = try_break_deleg(inode, delegated_inode);
+	error = try_break_deleg(inode, deleg_break_ctl);
 	if (error)
 		return error;
 
diff --git a/fs/namei.c b/fs/namei.c
index 5a93be7b2c9c..a75ab583aee7 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3941,21 +3941,21 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
  * vfs_unlink - unlink a filesystem object
  * @dir:	parent directory
  * @dentry:	victim
- * @delegated_inode: returns victim inode, if the inode is delegated.
+ * @deleg_break_ctl: used to return victim inode, if the inode is delegated.
  *
  * The caller must hold dir->i_mutex.
  *
  * If vfs_unlink discovers a delegation, it will return -EWOULDBLOCK and
- * return a reference to the inode in delegated_inode.  The caller
+ * return a reference to the inode in deleg_break_ctl.  The caller
  * should then break the delegation on that inode and retry.  Because
  * breaking a delegation may take a long time, the caller should drop
  * dir->i_mutex before doing so.
  *
- * Alternatively, a caller may pass NULL for delegated_inode.  This may
+ * Alternatively, a caller may pass NULL for deleg_break_ctl.  This may
  * be appropriate for callers that expect the underlying filesystem not
  * to be NFS exported.
  */
-int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegated_inode)
+int vfs_unlink(struct inode *dir, struct dentry *dentry, struct deleg_break_ctl *deleg_break_ctl)
 {
 	struct inode *target = dentry->d_inode;
 	int error = may_delete(dir, dentry, 0);
@@ -3972,7 +3972,7 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
 	else {
 		error = security_inode_unlink(dir, dentry);
 		if (!error) {
-			error = try_break_deleg(target, delegated_inode);
+			error = try_break_deleg(target, deleg_break_ctl);
 			if (error)
 				goto out;
 			error = dir->i_op->unlink(dir, dentry);
@@ -4010,7 +4010,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 	struct qstr last;
 	int type;
 	struct inode *inode = NULL;
-	struct inode *delegated_inode = NULL;
+	struct deleg_break_ctl deleg_break_ctl = {};
 	unsigned int lookup_flags = 0;
 retry:
 	name = filename_parentat(dfd, getname(pathname), lookup_flags,
@@ -4040,7 +4040,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 		error = security_path_unlink(&path, dentry);
 		if (error)
 			goto exit2;
-		error = vfs_unlink(path.dentry->d_inode, dentry, &delegated_inode);
+		error = vfs_unlink(path.dentry->d_inode, dentry, &deleg_break_ctl);
 exit2:
 		dput(dentry);
 	}
@@ -4048,7 +4048,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 	if (inode)
 		iput(inode);	/* truncate the inode here */
 	inode = NULL;
-	error = break_deleg_wait(&delegated_inode, error);
+	error = break_deleg_wait(&deleg_break_ctl, error);
 	if (error == DELEG_RETRY)
 		goto retry_deleg;
 	mnt_drop_write(path.mnt);
@@ -4150,21 +4150,21 @@ SYSCALL_DEFINE2(symlink, const char __user *, oldname, const char __user *, newn
  * @old_dentry:	object to be linked
  * @dir:	new parent
  * @new_dentry:	where to create the new link
- * @delegated_inode: returns inode needing a delegation break
+ * @deleg_break_ctl: returns inode needing a delegation break
  *
  * The caller must hold dir->i_mutex
  *
  * If vfs_link discovers a delegation on the to-be-linked file in need
  * of breaking, it will return -EWOULDBLOCK and return a reference to the
- * inode in delegated_inode.  The caller should then break the delegation
+ * inode in deleg_break_ctl.  The caller should then break the delegation
  * and retry.  Because breaking a delegation may take a long time, the
  * caller should drop the i_mutex before doing so.
  *
- * Alternatively, a caller may pass NULL for delegated_inode.  This may
+ * Alternatively, a caller may pass NULL for deleg_break_ctl.  This may
  * be appropriate for callers that expect the underlying filesystem not
  * to be NFS exported.
  */
-int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry, struct inode **delegated_inode)
+int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry, struct deleg_break_ctl *deleg_break_ctl)
 {
 	struct inode *inode = old_dentry->d_inode;
 	unsigned max_links = dir->i_sb->s_max_links;
@@ -4208,7 +4208,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 	else if (max_links && inode->i_nlink >= max_links)
 		error = -EMLINK;
 	else {
-		error = try_break_deleg(inode, delegated_inode);
+		error = try_break_deleg(inode, deleg_break_ctl);
 		if (!error)
 			error = dir->i_op->link(old_dentry, dir, new_dentry);
 	}
@@ -4239,7 +4239,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
 {
 	struct dentry *new_dentry;
 	struct path old_path, new_path;
-	struct inode *delegated_inode = NULL;
+	struct deleg_break_ctl deleg_break_ctl = {};
 	int how = 0;
 	int error;
 
@@ -4278,10 +4278,10 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
 	error = security_path_link(old_path.dentry, &new_path, new_dentry);
 	if (error)
 		goto out_dput;
-	error = vfs_link(old_path.dentry, new_path.dentry->d_inode, new_dentry, &delegated_inode);
+	error = vfs_link(old_path.dentry, new_path.dentry->d_inode, new_dentry, &deleg_break_ctl);
 out_dput:
 	done_path_create(&new_path, new_dentry);
-	error = break_deleg_wait(&delegated_inode, error);
+	error = break_deleg_wait(&deleg_break_ctl, error);
 	if (error == DELEG_RETRY) {
 		path_put(&old_path);
 		goto retry;
@@ -4308,19 +4308,19 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
  * @old_dentry:	source
  * @new_dir:	parent of destination
  * @new_dentry:	destination
- * @delegated_inode: returns an inode needing a delegation break
+ * @deleg_break_ctl: used to return an inode needing a delegation break
  * @flags:	rename flags
  *
  * The caller must hold multiple mutexes--see lock_rename()).
  *
  * If vfs_rename discovers a delegation in need of breaking at either
  * the source or destination, it will return -EWOULDBLOCK and return a
- * reference to the inode in delegated_inode.  The caller should then
+ * reference to the inode in deleg_break_ctl.  The caller should then
  * break the delegation and retry.  Because breaking a delegation may
  * take a long time, the caller should drop all locks before doing
  * so.
  *
- * Alternatively, a caller may pass NULL for delegated_inode.  This may
+ * Alternatively, a caller may pass NULL for deleg_break_ctl.  This may
  * be appropriate for callers that expect the underlying filesystem not
  * to be NFS exported.
  *
@@ -4354,7 +4354,7 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
  */
 int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	       struct inode *new_dir, struct dentry *new_dentry,
-	       struct inode **delegated_inode, unsigned int flags)
+	       struct deleg_break_ctl *deleg_break_ctl, unsigned int flags)
 {
 	int error;
 	bool is_dir = d_is_dir(old_dentry);
@@ -4431,12 +4431,12 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (is_dir && !(flags & RENAME_EXCHANGE) && target)
 		shrink_dcache_parent(new_dentry);
 	if (!is_dir) {
-		error = try_break_deleg(source, delegated_inode);
+		error = try_break_deleg(source, deleg_break_ctl);
 		if (error)
 			goto out;
 	}
 	if (target && !new_is_dir) {
-		error = try_break_deleg(target, delegated_inode);
+		error = try_break_deleg(target, deleg_break_ctl);
 		if (error)
 			goto out;
 	}
@@ -4485,7 +4485,7 @@ SYSCALL_DEFINE5(renameat2, int, olddfd, const char __user *, oldname,
 	struct path old_path, new_path;
 	struct qstr old_last, new_last;
 	int old_type, new_type;
-	struct inode *delegated_inode = NULL;
+	struct deleg_break_ctl deleg_break_ctl = {};
 	struct filename *from;
 	struct filename *to;
 	unsigned int lookup_flags = 0, target_flags = LOOKUP_RENAME_TARGET;
@@ -4590,14 +4590,14 @@ SYSCALL_DEFINE5(renameat2, int, olddfd, const char __user *, oldname,
 		goto exit5;
 	error = vfs_rename(old_path.dentry->d_inode, old_dentry,
 			   new_path.dentry->d_inode, new_dentry,
-			   &delegated_inode, flags);
+			   &deleg_break_ctl, flags);
 exit5:
 	dput(new_dentry);
 exit4:
 	dput(old_dentry);
 exit3:
 	unlock_rename(new_path.dentry, old_path.dentry);
-	error = break_deleg_wait(&delegated_inode, error);
+	error = break_deleg_wait(&deleg_break_ctl, error);
 	if (error == DELEG_RETRY)
 		goto retry_deleg;
 	mnt_drop_write(old_path.mnt);
diff --git a/fs/open.c b/fs/open.c
index d49e9385e45d..6c6443476316 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -515,7 +515,7 @@ SYSCALL_DEFINE1(chroot, const char __user *, filename)
 static int chmod_common(const struct path *path, umode_t mode)
 {
 	struct inode *inode = path->dentry->d_inode;
-	struct inode *delegated_inode = NULL;
+	struct deleg_break_ctl deleg_break_ctl = {};
 	struct iattr newattrs;
 	int error;
 
@@ -529,10 +529,10 @@ static int chmod_common(const struct path *path, umode_t mode)
 		goto out_unlock;
 	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-	error = notify_change(path->dentry, &newattrs, &delegated_inode);
+	error = notify_change(path->dentry, &newattrs, &deleg_break_ctl);
 out_unlock:
 	inode_unlock(inode);
-	error = break_deleg_wait(&delegated_inode, error);
+	error = break_deleg_wait(&deleg_break_ctl, error);
 	if (error == DELEG_RETRY)
 		goto retry_deleg;
 	mnt_drop_write(path->mnt);
@@ -578,7 +578,7 @@ SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 static int chown_common(const struct path *path, uid_t user, gid_t group)
 {
 	struct inode *inode = path->dentry->d_inode;
-	struct inode *delegated_inode = NULL;
+	struct deleg_break_ctl deleg_break_ctl = {};
 	int error;
 	struct iattr newattrs;
 	kuid_t uid;
@@ -607,9 +607,9 @@ static int chown_common(const struct path *path, uid_t user, gid_t group)
 	inode_lock(inode);
 	error = security_path_chown(path, uid, gid);
 	if (!error)
-		error = notify_change(path->dentry, &newattrs, &delegated_inode);
+		error = notify_change(path->dentry, &newattrs, &deleg_break_ctl);
 	inode_unlock(inode);
-	error = break_deleg_wait(&delegated_inode, error);
+	error = break_deleg_wait(&deleg_break_ctl, error);
 	if (error == DELEG_RETRY)
 		goto retry_deleg;
 	return error;
diff --git a/fs/utimes.c b/fs/utimes.c
index 75467b7ebfce..9af7ca3810db 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -49,7 +49,7 @@ static int utimes_common(const struct path *path, struct timespec *times)
 	int error;
 	struct iattr newattrs;
 	struct inode *inode = path->dentry->d_inode;
-	struct inode *delegated_inode = NULL;
+	struct deleg_break_ctl deleg_break_ctl = {};
 
 	error = mnt_want_write(path->mnt);
 	if (error)
@@ -87,9 +87,9 @@ static int utimes_common(const struct path *path, struct timespec *times)
 	}
 retry_deleg:
 	inode_lock(inode);
-	error = notify_change(path->dentry, &newattrs, &delegated_inode);
+	error = notify_change(path->dentry, &newattrs, &deleg_break_ctl);
 	inode_unlock(inode);
-	error = break_deleg_wait(&delegated_inode, error);
+	error = break_deleg_wait(&deleg_break_ctl, error);
 	if (error == DELEG_RETRY)
 		goto retry_deleg;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1d0d2fde1766..20a07375e60c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1564,6 +1564,11 @@ static inline void sb_start_intwrite(struct super_block *sb)
 
 extern bool inode_owner_or_capable(const struct inode *inode);
 
+/* Used to pass some information used by NFSv4 delegations */
+struct deleg_break_ctl {
+	struct inode *delegated_inode; /* inode with in-progress break */
+};
+
 /*
  * VFS helper functions..
  */
@@ -1571,10 +1576,10 @@ extern int vfs_create(struct inode *, struct dentry *, umode_t, bool);
 extern int vfs_mkdir(struct inode *, struct dentry *, umode_t);
 extern int vfs_mknod(struct inode *, struct dentry *, umode_t, dev_t);
 extern int vfs_symlink(struct inode *, struct dentry *, const char *);
-extern int vfs_link(struct dentry *, struct inode *, struct dentry *, struct inode **);
+extern int vfs_link(struct dentry *, struct inode *, struct dentry *, struct deleg_break_ctl *);
 extern int vfs_rmdir(struct inode *, struct dentry *);
-extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
-extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **, unsigned int);
+extern int vfs_unlink(struct inode *, struct dentry *, struct deleg_break_ctl *);
+extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct deleg_break_ctl *, unsigned int);
 extern int vfs_whiteout(struct inode *, struct dentry *);
 
 extern struct dentry *vfs_tmpfile(struct dentry *dentry, umode_t mode,
@@ -2276,13 +2281,13 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
 	return 0;
 }
 
-static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
+static inline int try_break_deleg(struct inode *inode, struct deleg_break_ctl *deleg_break_ctl)
 {
 	int ret;
 
 	ret = break_deleg(inode, O_WRONLY|O_NONBLOCK);
-	if (ret == -EWOULDBLOCK && delegated_inode) {
-		*delegated_inode = inode;
+	if (ret == -EWOULDBLOCK && deleg_break_ctl) {
+		deleg_break_ctl->delegated_inode = inode;
 		ihold(inode);
 	}
 	return ret;
@@ -2290,13 +2295,14 @@ static inline int try_break_deleg(struct inode *inode, struct inode **delegated_
 
 #define DELEG_RETRY 1
 
-static inline int break_deleg_wait(struct inode **delegated_inode, int error)
+static inline int break_deleg_wait(struct deleg_break_ctl *deleg_break_ctl, int error)
 {
-	if (!*delegated_inode)
+	if (!deleg_break_ctl->delegated_inode)
 		return error;
-	error = break_deleg(*delegated_inode, O_WRONLY);
-	iput(*delegated_inode);
-	*delegated_inode = NULL;
+
+	error = break_deleg(deleg_break_ctl->delegated_inode, O_WRONLY);
+	iput(deleg_break_ctl->delegated_inode);
+	deleg_break_ctl->delegated_inode = NULL;
 	return error ? error : DELEG_RETRY;
 }
 
@@ -2321,12 +2327,12 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
 	return 0;
 }
 
-static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
+static inline int try_break_deleg(struct inode *inode, struct deleg_break_ctl *deleg_break_ctl)
 {
 	return 0;
 }
 
-static inline int break_deleg_wait(struct inode **delegated_inode)
+static inline int break_deleg_wait(struct deleg_break_ctl *deleg_break_ctl, int error)
 {
 	BUG();
 	return 0;
@@ -2639,7 +2645,7 @@ extern void emergency_remount(void);
 #ifdef CONFIG_BLOCK
 extern sector_t bmap(struct inode *, sector_t);
 #endif
-extern int notify_change(struct dentry *, struct iattr *, struct inode **);
+extern int notify_change(struct dentry *, struct iattr *, struct deleg_break_ctl *);
 extern int inode_permission(struct inode *, int);
 extern int __inode_permission(struct inode *, int);
 extern int generic_permission(struct inode *, int);
-- 
2.13.5


* [PATCH 3/3] nfsd: clients don't need to break their own delegations
  2017-08-25 21:52 [PATCH 0/3] Eliminate delegation self-conflicts J. Bruce Fields
  2017-08-25 21:52 ` [PATCH 1/3] fs: cleanup to hide some details of delegation logic J. Bruce Fields
  2017-08-25 21:52 ` [PATCH 2/3] fs: hide another detail " J. Bruce Fields
@ 2017-08-25 21:52 ` J. Bruce Fields
  2017-08-28  4:32   ` NeilBrown
  2017-08-26 18:06 ` [PATCH 0/3] Eliminate delegation self-conflicts Chuck Lever
  3 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-25 21:52 UTC (permalink / raw)
  To: linux-nfs; +Cc: linux-fsdevel, Trond Myklebust, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

We currently revoke read delegations on any write open or any operation
that modifies file data or metadata (including rename, link, and
unlink).  But if the delegation in question is the only read delegation
and is held by the client performing the operation, that's not really
necessary.

It's not always possible to prevent this in the NFSv4.0 case, because
there's not always a way to determine which client an NFSv4.0 delegation
came from.  (In theory we could try to guess this from the transport
layer, e.g., by assuming all traffic on a given TCP connection comes
from the same client.  But that's not really correct.)

In the NFSv4.1 case the session layer always tells us the client.

This patch should remove such self-conflicts in all cases where we can
reliably determine the client from the compound.
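
The ownership rule can be sketched in userspace like this.  It is an
illustration only: struct client and struct delegation are stand-ins for
struct nfs4_client and struct nfs4_delegation, model_breaker_owns_lease()
is a hypothetical name, a plain singly linked list replaces the per-file
delegation list, and the locking the real callback needs is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct client { int id; };		/* stand-in for struct nfs4_client */

struct delegation {			/* stand-in for struct nfs4_delegation */
	const struct client *owner;
	const struct delegation *next;
};

/*
 * True only if the breaker is known and every delegation on the file
 * belongs to it.  A NULL breaker models the case where the client cannot
 * be determined (for example NFSv4.0), so the conflict is kept.
 */
static bool model_breaker_owns_lease(const struct client *breaker,
				     const struct delegation *delegations)
{
	const struct delegation *dl;

	if (!breaker)
		return false;
	for (dl = delegations; dl; dl = dl->next)
		if (dl->owner != breaker)
			return false;
	return true;
}
```

A single delegation held by any other client is enough to keep the
conflict, so the recall still happens whenever more than one client is
involved.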

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 Documentation/filesystems/Locking |  2 ++
 fs/locks.c                        |  7 ++++++-
 fs/nfsd/nfs4state.c               | 40 +++++++++++++++++++++++++++++++++++++++
 fs/nfsd/vfs.c                     | 26 ++++++++++++++++++++-----
 include/linux/fs.h                | 27 ++++++++++++++++----------
 5 files changed, 86 insertions(+), 16 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index fe25787ff6d4..8876a32df5ff 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -366,6 +366,7 @@ prototypes:
 	int (*lm_grant)(struct file_lock *, struct file_lock *, int);
 	void (*lm_break)(struct file_lock *); /* break_lease callback */
 	int (*lm_change)(struct file_lock **, int);
+	bool (*lm_breaker_owns_lease)(void *, struct file_lock *);
 
 locking rules:
 
@@ -376,6 +377,7 @@ lm_notify:		yes		yes			no
 lm_grant:		no		no			no
 lm_break:		yes		no			no
 lm_change		yes		no			no
+lm_breaker_owns_lease:	no		no			no
 
 [1]:	->lm_compare_owner and ->lm_owner_key are generally called with
 *an* inode->i_lock held. It may not be the i_lock of the inode
diff --git a/fs/locks.c b/fs/locks.c
index afefeb4ad6de..a3de5b96c81c 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1404,6 +1404,9 @@ static void time_out_leases(struct inode *inode, struct list_head *dispose)
 
 static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
 {
+	if (lease->fl_lmops->lm_breaker_owns_lease && breaker->fl_owner &&
+	    lease->fl_lmops->lm_breaker_owns_lease(breaker->fl_owner, lease))
+		return false;
 	if ((breaker->fl_flags & FL_LAYOUT) != (lease->fl_flags & FL_LAYOUT))
 		return false;
 	if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE))
@@ -1429,6 +1432,7 @@ any_leases_conflict(struct inode *inode, struct file_lock *breaker)
 /**
  *	__break_lease	-	revoke all outstanding leases on file
  *	@inode: the inode of the file to return
+ *	@who: if an nfs client is breaking, which client it is
  *	@mode: O_RDONLY: break only write leases; O_WRONLY or O_RDWR:
  *	    break all leases
  *	@type: FL_LEASE: break leases and delegations; FL_DELEG: break
@@ -1439,7 +1443,7 @@ any_leases_conflict(struct inode *inode, struct file_lock *breaker)
  *	a call to open() or truncate().  This function can sleep unless you
  *	specified %O_NONBLOCK to your open().
  */
-int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
+int __break_lease(struct inode *inode, void *who, unsigned int mode, unsigned int type)
 {
 	int error = 0;
 	struct file_lock_context *ctx;
@@ -1452,6 +1456,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 	if (IS_ERR(new_fl))
 		return PTR_ERR(new_fl);
 	new_fl->fl_flags = type;
+	new_fl->fl_owner = who;
 
 	/* typically we will check that ctx is non-NULL before calling */
 	ctx = smp_load_acquire(&inode->i_flctx);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0c04f81aa63b..fb15efcc4e08 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3825,6 +3825,45 @@ nfsd_break_deleg_cb(struct file_lock *fl)
 	return ret;
 }
 
+static struct nfs4_client *nfsd4_client_from_rqst(struct svc_rqst *rqst)
+{
+	struct nfsd4_compoundres *resp;
+
+	/*
+	 * In case it's possible we could be called from NLM or ACL
+	 * code?:
+	 */
+	if (rqst->rq_prog != NFS_PROGRAM)
+		return NULL;
+	if (rqst->rq_vers != 4)
+		return NULL;
+	resp = rqst->rq_resp;
+	return resp->cstate.clp;
+}
+
+static bool nfsd_breaker_owns_lease(void *who, struct file_lock *fl)
+{
+	struct svc_rqst *rqst = who;
+	struct nfs4_client *cl;
+	struct nfs4_delegation *dl;
+	struct nfs4_file *fi = (struct nfs4_file *)fl->fl_owner;
+	bool ret = true;
+
+	cl = nfsd4_client_from_rqst(rqst);
+	if (!cl)
+		return false;
+
+	spin_lock(&fi->fi_lock);
+	list_for_each_entry(dl, &fi->fi_delegations, dl_perfile) {
+		if (dl->dl_stid.sc_client != cl) {
+			ret = false;
+			break;
+		}
+	}
+	spin_unlock(&fi->fi_lock);
+	return ret;
+}
+
 static int
 nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
 		     struct list_head *dispose)
@@ -3836,6 +3875,7 @@ nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
 }
 
 static const struct lock_manager_operations nfsd_lease_mng_ops = {
+	.lm_breaker_owns_lease = nfsd_breaker_owns_lease,
 	.lm_break = nfsd_break_deleg_cb,
 	.lm_change = nfsd_change_deleg_cb,
 };
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 38d0383dc7f9..fe62bc744143 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -394,6 +394,10 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 	int		host_err;
 	bool		get_write_count;
 	bool		size_change = (iap->ia_valid & ATTR_SIZE);
+	struct deleg_break_ctl deleg_break_ctl = {
+			.delegated_inode = DELEG_NO_WAIT,
+			.who = rqstp
+	};
 
 	if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
 		accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE;
@@ -455,7 +459,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 			.ia_size	= iap->ia_size,
 		};
 
-		host_err = notify_change(dentry, &size_attr, NULL);
+		host_err = notify_change(dentry, &size_attr, &deleg_break_ctl);
 		if (host_err)
 			goto out_unlock;
 		iap->ia_valid &= ~ATTR_SIZE;
@@ -470,7 +474,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 	}
 
 	iap->ia_valid |= ATTR_CTIME;
-	host_err = notify_change(dentry, iap, NULL);
+	host_err = notify_change(dentry, iap, &deleg_break_ctl);
 
 out_unlock:
 	fh_unlock(fhp);
@@ -1553,6 +1557,10 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
 	struct inode	*dirp;
 	__be32		err;
 	int		host_err;
+	struct deleg_break_ctl deleg_break_ctl = {
+		.delegated_inode = DELEG_NO_WAIT,
+		.who = rqstp
+	};
 
 	err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_CREATE);
 	if (err)
@@ -1590,7 +1598,7 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
 	err = nfserr_noent;
 	if (d_really_is_negative(dold))
 		goto out_dput;
-	host_err = vfs_link(dold, dirp, dnew, NULL);
+	host_err = vfs_link(dold, dirp, dnew, &deleg_break_ctl);
 	if (!host_err) {
 		err = nfserrno(commit_metadata(ffhp));
 		if (!err)
@@ -1626,6 +1634,10 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 	struct inode	*fdir, *tdir;
 	__be32		err;
 	int		host_err;
+	struct deleg_break_ctl deleg_break_ctl = {
+		.delegated_inode = DELEG_NO_WAIT,
+		.who = rqstp
+	};
 
 	err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
 	if (err)
@@ -1683,7 +1695,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 	if (ffhp->fh_export->ex_path.dentry != tfhp->fh_export->ex_path.dentry)
 		goto out_dput_new;
 
-	host_err = vfs_rename(fdir, odentry, tdir, ndentry, NULL, 0);
+	host_err = vfs_rename(fdir, odentry, tdir, ndentry, &deleg_break_ctl, 0);
 	if (!host_err) {
 		host_err = commit_metadata(tfhp);
 		if (!host_err)
@@ -1722,6 +1734,10 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 	struct inode	*dirp;
 	__be32		err;
 	int		host_err;
+	struct deleg_break_ctl deleg_break_ctl = {
+		.delegated_inode = DELEG_NO_WAIT,
+		.who = rqstp
+	};
 
 	err = nfserr_acces;
 	if (!flen || isdotent(fname, flen))
@@ -1753,7 +1769,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 		type = d_inode(rdentry)->i_mode & S_IFMT;
 
 	if (type != S_IFDIR)
-		host_err = vfs_unlink(dirp, rdentry, NULL);
+		host_err = vfs_unlink(dirp, rdentry, &deleg_break_ctl);
 	else
 		host_err = vfs_rmdir(dirp, rdentry);
 	if (!host_err)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 20a07375e60c..8626071d9b54 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -953,6 +953,7 @@ struct lock_manager_operations {
 	bool (*lm_break)(struct file_lock *);
 	int (*lm_change)(struct file_lock *, int, struct list_head *);
 	void (*lm_setup)(struct file_lock *, void **);
+	bool (*lm_breaker_owns_lease)(void *, struct file_lock *);
 };
 
 struct lock_manager {
@@ -1082,7 +1083,7 @@ extern int vfs_test_lock(struct file *, struct file_lock *);
 extern int vfs_lock_file(struct file *, unsigned int, struct file_lock *, struct file_lock *);
 extern int vfs_cancel_lock(struct file *filp, struct file_lock *fl);
 extern int locks_lock_inode_wait(struct inode *inode, struct file_lock *fl);
-extern int __break_lease(struct inode *inode, unsigned int flags, unsigned int type);
+extern int __break_lease(struct inode *inode, void *who, unsigned int flags, unsigned int type);
 extern void lease_get_mtime(struct inode *, struct timespec *time);
 extern int generic_setlease(struct file *, long, struct file_lock **, void **priv);
 extern int vfs_setlease(struct file *, long, struct file_lock **, void **);
@@ -1193,7 +1194,7 @@ static inline int locks_lock_inode_wait(struct inode *inode, struct file_lock *f
 	return -ENOLCK;
 }
 
-static inline int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
+static inline int __break_lease(struct inode *inode, void *who, unsigned int mode, unsigned int type)
 {
 	return 0;
 }
@@ -1567,6 +1568,7 @@ extern bool inode_owner_or_capable(const struct inode *inode);
 /* Used to pass some information used by NFSv4 delegations */
 struct deleg_break_ctl {
 	struct inode *delegated_inode; /* inode with in-progress break */
+	void *who; /* who is breaking the lease (used by nfsd) */
 };
 
 /*
@@ -2263,11 +2265,11 @@ static inline int break_lease(struct inode *inode, unsigned int mode)
 	 */
 	smp_mb();
 	if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
-		return __break_lease(inode, mode, FL_LEASE);
+		return __break_lease(inode, NULL, mode, FL_LEASE);
 	return 0;
 }
 
-static inline int break_deleg(struct inode *inode, unsigned int mode)
+static inline int break_deleg(struct inode *inode, void *who, unsigned int mode)
 {
 	/*
 	 * Since this check is lockless, we must ensure that any refcounts
@@ -2277,16 +2279,20 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
 	 */
 	smp_mb();
 	if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
-		return __break_lease(inode, mode, FL_DELEG);
+		return __break_lease(inode, who, mode, FL_DELEG);
 	return 0;
 }
 
+#define DELEG_NO_WAIT ((struct inode *)1)
+
 static inline int try_break_deleg(struct inode *inode, struct deleg_break_ctl *deleg_break_ctl)
 {
 	int ret;
+	void *who = deleg_break_ctl ? deleg_break_ctl->who : NULL;
 
-	ret = break_deleg(inode, O_WRONLY|O_NONBLOCK);
-	if (ret == -EWOULDBLOCK && deleg_break_ctl) {
+	ret = break_deleg(inode, who, O_WRONLY|O_NONBLOCK);
+	if (ret == -EWOULDBLOCK && deleg_break_ctl &&
+			deleg_break_ctl->delegated_inode != DELEG_NO_WAIT) {
 		deleg_break_ctl->delegated_inode = inode;
 		ihold(inode);
 	}
@@ -2300,7 +2306,8 @@ static inline int break_deleg_wait(struct deleg_break_ctl *deleg_break_ctl, int
 	if (!deleg_break_ctl->delegated_inode)
 		return error;
 
-	error = break_deleg(deleg_break_ctl->delegated_inode, O_WRONLY);
+	error = break_deleg(deleg_break_ctl->delegated_inode,
+			  deleg_break_ctl->who, O_WRONLY);
 	iput(deleg_break_ctl->delegated_inode);
 	deleg_break_ctl->delegated_inode = NULL;
 	return error ? error : DELEG_RETRY;
@@ -2310,7 +2317,7 @@ static inline int break_layout(struct inode *inode, bool wait)
 {
 	smp_mb();
 	if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
-		return __break_lease(inode,
+		return __break_lease(inode, NULL,
 				wait ? O_WRONLY : O_WRONLY | O_NONBLOCK,
 				FL_LAYOUT);
 	return 0;
@@ -2322,7 +2329,7 @@ static inline int break_lease(struct inode *inode, unsigned int mode)
 	return 0;
 }
 
-static inline int break_deleg(struct inode *inode, unsigned int mode)
+static inline int break_deleg(struct inode *inode, void *who, unsigned int mode)
 {
 	return 0;
 }
-- 
2.13.5


* Re: [PATCH 0/3] Eliminate delegation self-conflicts
  2017-08-25 21:52 [PATCH 0/3] Eliminate delegation self-conflicts J. Bruce Fields
                   ` (2 preceding siblings ...)
  2017-08-25 21:52 ` [PATCH 3/3] nfsd: clients don't need to break their own delegations J. Bruce Fields
@ 2017-08-26 18:06 ` Chuck Lever
  2017-08-29 21:52   ` J. Bruce Fields
  3 siblings, 1 reply; 35+ messages in thread
From: Chuck Lever @ 2017-08-26 18:06 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Linux NFS Mailing List, linux-fsdevel, Trond Myklebust


> On Aug 25, 2017, at 5:52 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> 
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> This is my attempt to fix the NFS server so we don't unnecessarily
> recall delegations when the operation breaking the delegation comes from
> the same client that holds the delegation.
> 
> To do that we need some way to pass the identity of the breaker down
> through the VFS.
> 
> I didn't feel like adding another argument to all the VFS functions that
> this might need to be passed down through.  But all of those functions
> already take a struct inode **delegated_inode, so instead I turned that
> into a single-member struct deleg_break_ctl *, which I then added a second
> member to.
> 
> I dunno, welcome any more straightforward ways of doing this if anyone
> has suggestions.
> 
> My first attempt was to do this by instead checking for conflicts in the
> caller (nfsd) and then passing down just one bit telling the lease
> code conflicts had already been checked so it didn't need to.  But
> that's much too early to check for conflicts, since the caller doesn't
> have the necessary inode locks yet.
> 
> I'm still missing testing.  Regression tests pass, but I haven't
> actually confirmed that the self-conflicts are gone!  Off to go hack on
> pynfs....

FWIW, I observe a lot of delegation recall activity when running
the git regression tests on an NFSv4.x mount. This is a single
client.

Unpack a recent release of the git tarball.

$ cd src/git
$ make clean
$ make
$ make test

Easily scriptable, and you can "cd t/" and run individual
regression tests if you like.


> --b.
> 
> J. Bruce Fields (3):
>  fs: cleanup to hide some details of delegation logic
>  fs: hide another detail of delegation logic
>  nfsd: clients don't need to break their own delegations
> 
> Documentation/filesystems/Locking |  2 ++
> fs/attr.c                         | 10 +++---
> fs/locks.c                        |  7 +++-
> fs/namei.c                        | 70 ++++++++++++++++++---------------------
> fs/nfsd/nfs4state.c               | 40 ++++++++++++++++++++++
> fs/nfsd/vfs.c                     | 26 ++++++++++++---
> fs/open.c                         | 24 ++++++--------
> fs/utimes.c                       | 12 +++----
> include/linux/fs.h                | 61 +++++++++++++++++++++-------------
> 9 files changed, 159 insertions(+), 93 deletions(-)
> 
> -- 
> 2.13.5
> 

--
Chuck Lever


* Re: [PATCH 1/3] fs: cleanup to hide some details of delegation logic
  2017-08-25 21:52 ` [PATCH 1/3] fs: cleanup to hide some details of delegation logic J. Bruce Fields
@ 2017-08-28  3:54   ` NeilBrown
  2017-08-29 21:37     ` J. Bruce Fields
  0 siblings, 1 reply; 35+ messages in thread
From: NeilBrown @ 2017-08-28  3:54 UTC (permalink / raw)
  To: J. Bruce Fields, linux-nfs
  Cc: linux-fsdevel, Trond Myklebust, J. Bruce Fields


On Fri, Aug 25 2017, J. Bruce Fields wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
>
> Pull the checks for delegated_inode into break_deleg_wait() to simplify
> the callers a little.
>
> No change in behavior.
>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/namei.c         | 26 ++++++++++----------------
>  fs/open.c          | 16 ++++++----------
>  fs/utimes.c        |  8 +++-----
>  include/linux/fs.h | 12 +++++++-----
>  4 files changed, 26 insertions(+), 36 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index ddb6a7c2b3d4..5a93be7b2c9c 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4048,11 +4048,9 @@ static long do_unlinkat(int dfd, const char __user *pathname)
>  	if (inode)
>  		iput(inode);	/* truncate the inode here */
>  	inode = NULL;
> -	if (delegated_inode) {
> -		error = break_deleg_wait(&delegated_inode);
> -		if (!error)
> -			goto retry_deleg;
> -	}
> +	error = break_deleg_wait(&delegated_inode, error);
> +	if (error == DELEG_RETRY)
> +		goto retry_deleg;

<mode=bikeshed>

I don't like the "DELEG_RETRY".  You are comparing it against an
'error', but it doesn't start with '-E', so I get confused (happens
often).

If this read:

     if (error > 0)
          goto retry_deleg;

it would be much more obvious to me what was happening.  Clearly the
return value isn't an error, and it isn't "success" either.  This is a
pattern I've seen elsewhere.

Alternately you could use "-EAGAIN", but I suspect there is a risk of
unwanted side-effects if you re-use an existing code.
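
A minimal userspace sketch of the return convention being discussed, assuming a three-way result: a negative errno means failure, zero means done, and a positive value means "the delegation is now broken, retry". The names model_break_deleg_wait() and model_operation() are hypothetical stand-ins, not kernel functions, and the "pending" flag stands in for a non-NULL delegated_inode.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of break_deleg_wait(): 'pending' plays the role of
 * a recorded delegated_inode, 'break_result' the result of the blocking
 * break_deleg() call.
 */
static int model_break_deleg_wait(bool *pending, int error, int break_result)
{
	if (!*pending)
		return error;		/* nothing to wait for: pass result through */
	*pending = false;
	if (break_result)
		return break_result;	/* negative errno: the break itself failed */
	return 1;			/* positive: neither error nor success, retry */
}

/* Hypothetical caller: the first attempt hits a delegation, the second succeeds. */
static int model_operation(int *attempts)
{
	bool pending;
	int error;

retry:
	(*attempts)++;
	pending = (*attempts == 1);
	error = pending ? -EWOULDBLOCK : 0;
	error = model_break_deleg_wait(&pending, error, 0);
	if (error > 0)
		goto retry;
	return error;
}
```

With this shape the caller never compares against a named constant; any positive value simply means "go around again", and real errors stay negative.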

Thanks,
NeilBrown



* Re: [PATCH 3/3] nfsd: clients don't need to break their own delegations
  2017-08-25 21:52 ` [PATCH 3/3] nfsd: clients don't need to break their own delegations J. Bruce Fields
@ 2017-08-28  4:32   ` NeilBrown
  2017-08-29 21:49     ` J. Bruce Fields
  2017-09-07 22:01       ` J. Bruce Fields
  0 siblings, 2 replies; 35+ messages in thread
From: NeilBrown @ 2017-08-28  4:32 UTC (permalink / raw)
  To: J. Bruce Fields, linux-nfs
  Cc: linux-fsdevel, Trond Myklebust, J. Bruce Fields


On Fri, Aug 25 2017, J. Bruce Fields wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
>
> We currently revoke read delegations on any write open or any operation
> that modifies file data or metadata (including rename, link, and
> unlink).  But if the delegation in question is the only read delegation
> and is held by the client performing the operation, that's not really
> necessary.
>
> It's not always possible to prevent this in the NFSv4.0 case, because
> there's not always a way to determine which client an NFSv4.0 delegation
> came from.  (In theory we could try to guess this from the transport
> layer, e.g., by assuming all traffic on a given TCP connection comes
> from the same client.  But that's not really correct.)
>
> In the NFSv4.1 case the session layer always tells us the client.
>
> This patch should remove such self-conflicts in all cases where we can
> reliably determine the client from the compound.

I don't see any mention of the new DELEG_NO_WAIT, either here or where
the value is defined.  That means I have to figure it out for myself?
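
For reference, one reading of the include/linux/fs.h hunk in this patch is that DELEG_NO_WAIT is a sentinel a caller stores in delegated_inode to say "do not record the inode, I am not going to wait and retry". The following is a userspace model of that reading only: struct inode is reduced to a refcount, and fake_break_deleg() is a hypothetical stub that always reports a conflict.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct inode; only a refcount is modelled. */
struct inode { int refcount; };

struct deleg_break_ctl {
	struct inode *delegated_inode;	/* inode with in-progress break */
	void *who;			/* who is breaking the lease */
};

/* Sentinel as defined in the patch. */
#define DELEG_NO_WAIT ((struct inode *)1)

/* Hypothetical stub: pretend a conflicting delegation always exists. */
static int fake_break_deleg(struct inode *inode, void *who)
{
	(void)inode;
	(void)who;
	return -EWOULDBLOCK;
}

/* Follows the shape of try_break_deleg() in the patch. */
static int model_try_break_deleg(struct inode *inode, struct deleg_break_ctl *ctl)
{
	void *who = ctl ? ctl->who : NULL;
	int ret = fake_break_deleg(inode, who);

	if (ret == -EWOULDBLOCK && ctl &&
	    ctl->delegated_inode != DELEG_NO_WAIT) {
		ctl->delegated_inode = inode;
		inode->refcount++;	/* models ihold() */
	}
	return ret;
}
```

In this model a caller that starts with delegated_inode == NULL gets the inode recorded and a reference taken, while a caller that starts with DELEG_NO_WAIT just sees -EWOULDBLOCK and nothing is held on its behalf.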


>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  Documentation/filesystems/Locking |  2 ++
>  fs/locks.c                        |  7 ++++++-
>  fs/nfsd/nfs4state.c               | 40 +++++++++++++++++++++++++++++++++++++++
>  fs/nfsd/vfs.c                     | 26 ++++++++++++++++++++-----
>  include/linux/fs.h                | 27 ++++++++++++++++----------
>  5 files changed, 86 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
> index fe25787ff6d4..8876a32df5ff 100644
> --- a/Documentation/filesystems/Locking
> +++ b/Documentation/filesystems/Locking
> @@ -366,6 +366,7 @@ prototypes:
>  	int (*lm_grant)(struct file_lock *, struct file_lock *, int);
>  	void (*lm_break)(struct file_lock *); /* break_lease callback */
>  	int (*lm_change)(struct file_lock **, int);
> +	bool (*lm_breaker_owns_lease)(void *, struct file_lock *);
>  
>  locking rules:
>  
> @@ -376,6 +377,7 @@ lm_notify:		yes		yes			no
>  lm_grant:		no		no			no
>  lm_break:		yes		no			no
>  lm_change		yes		no			no
> +lm_breaker_owns_lease:	no		no			no
>  
>  [1]:	->lm_compare_owner and ->lm_owner_key are generally called with
>  *an* inode->i_lock held. It may not be the i_lock of the inode
> diff --git a/fs/locks.c b/fs/locks.c
> index afefeb4ad6de..a3de5b96c81c 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1404,6 +1404,9 @@ static void time_out_leases(struct inode *inode, struct list_head *dispose)
>  
>  static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
>  {
> +	if (lease->fl_lmops->lm_breaker_owns_lease && breaker->fl_owner &&
> +	    lease->fl_lmops->lm_breaker_owns_lease(breaker->fl_owner, lease))
> +		return false;
>  	if ((breaker->fl_flags & FL_LAYOUT) != (lease->fl_flags & FL_LAYOUT))
>  		return false;
>  	if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE))
> @@ -1429,6 +1432,7 @@ any_leases_conflict(struct inode *inode, struct file_lock *breaker)
>  /**
>   *	__break_lease	-	revoke all outstanding leases on file
>   *	@inode: the inode of the file to return
> + *	@who: if an nfs client is breaking, which client it is
>   *	@mode: O_RDONLY: break only write leases; O_WRONLY or O_RDWR:
>   *	    break all leases
>   *	@type: FL_LEASE: break leases and delegations; FL_DELEG: break
> @@ -1439,7 +1443,7 @@ any_leases_conflict(struct inode *inode, struct file_lock *breaker)
>   *	a call to open() or truncate().  This function can sleep unless you
>   *	specified %O_NONBLOCK to your open().
>   */
> -int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
> +int __break_lease(struct inode *inode, void *who, unsigned int mode, unsigned int type)
>  {
>  	int error = 0;
>  	struct file_lock_context *ctx;
> @@ -1452,6 +1456,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
>  	if (IS_ERR(new_fl))
>  		return PTR_ERR(new_fl);
>  	new_fl->fl_flags = type;
> +	new_fl->fl_owner = who;

When I saw this, I thought: "Shouldn't 'who' be 'fl_owner_t' rather than
'void*'".
Then I saw

/* legacy typedef, should eventually be removed */
typedef void *fl_owner_t;


Maybe you could do the world a favor and remove fl_owner_t in a
preliminary patch :-)


And it is kind-a weird that the "who" you pass to break_lease() is
different from the owner passed to vfs_setlease().

>  
>  	/* typically we will check that ctx is non-NULL before calling */
>  	ctx = smp_load_acquire(&inode->i_flctx);
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 0c04f81aa63b..fb15efcc4e08 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -3825,6 +3825,45 @@ nfsd_break_deleg_cb(struct file_lock *fl)
>  	return ret;
>  }
>  
> +static struct nfs4_client *nfsd4_client_from_rqst(struct svc_rqst *rqst)
> +{
> +	struct nfsd4_compoundres *resp;
> +
> +	/*
> +	 * In case it's possible we could be called from NLM or ACL
> +	 * code?:
> +	 */
> +	if (rqst->rq_prog != NFS_PROGRAM)
> +		return NULL;
> +	if (rqst->rq_vers != 4)
> +		return NULL;
> +	resp = rqst->rq_resp;
> +	return resp->cstate.clp;
> +}
> +
> +static bool nfsd_breaker_owns_lease(void *who, struct file_lock *fl)
> +{
> +	struct svc_rqst *rqst = who;
> +	struct nfs4_client *cl;
> +	struct nfs4_delegation *dl;
> +	struct nfs4_file *fi = (struct nfs4_file *)fl->fl_owner;
> +	bool ret = true;
> +
> +	cl = nfsd4_client_from_rqst(rqst);
> +	if (!cl)
> +		return false;
> +
> +	spin_lock(&fi->fi_lock);
> +	list_for_each_entry(dl, &fi->fi_delegations, dl_perfile) {
> +		if (dl->dl_stid.sc_client != cl) {
> +			ret = false;
> +			break;
> +		}
> +	}
> +	spin_unlock(&fi->fi_lock);
> +	return ret;
> +}

You haven't provided any documentation telling me what the
"lm_breaker_owns_lease" callback is meant to do.
So I look at this one piece of sample code -- it seems to compute:
  not (another client owns lease)
rather than
  (this client owns lease)
which is what I would have expected.

Given that any_leases_conflict() already loops over all leases, does
nfsd_breaker_owns_lease() need to loop as well?
Or does nfsd only take a single lease to cover all the delegations to
all the clients?
... hmmm, yes that does seem to be how nfsd works.

Would this all turn out to be easier if nfsd took a separate lease for
each client?  What would be the cost of that?
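
The difference between the two predicates above can be shown with a small userspace model. This is only a sketch under stated assumptions: delegations are reduced to an array of integer client ids, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * "not (another client owns lease)": true when every delegation on the
 * file belongs to the breaker. Note that it is also true when there are
 * no delegations at all.
 */
static bool no_other_client_holds(const int *holders, size_t n, int breaker)
{
	for (size_t i = 0; i < n; i++)
		if (holders[i] != breaker)
			return false;
	return true;
}

/* "(this client owns lease)": true when the breaker holds at least one. */
static bool breaker_holds(const int *holders, size_t n, int breaker)
{
	for (size_t i = 0; i < n; i++)
		if (holders[i] == breaker)
			return true;
	return false;
}
```

The two disagree when a second client also holds a delegation, and when the list is empty. If a single lease stands for every delegation on the file, the first form appears to be the one that makes skipping the break safe, since a delegation held by any other client still has to be recalled.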

Thanks,
NeilBrown



* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-08-25 21:52 ` [PATCH 2/3] fs: hide another detail " J. Bruce Fields
@ 2017-08-28  4:43   ` NeilBrown
  2017-08-29 21:40     ` J. Bruce Fields
  0 siblings, 1 reply; 35+ messages in thread
From: NeilBrown @ 2017-08-28  4:43 UTC (permalink / raw)
  To: J. Bruce Fields, linux-nfs
  Cc: linux-fsdevel, Trond Myklebust, J. Bruce Fields


On Fri, Aug 25 2017, J. Bruce Fields wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
>
> Pass around a new struct deleg_break_ctl instead of pointers to inode
> pointers; in a future patch I want to use this to pass a little more
> information from the nfs server to the lease code.

The information you are passing from the nfs server to the lease code is
largely ignored by the lease code and is passed back to the nfs server,
in the lm_breaker_owns_lease callback.

If try_break_deleg() passed the 'delegated_inode' pointer through to
__break_lease(), it could pass it through any_leases_conflict() and
leases_conflict() to the lm_breaker_owns_lease() callback.
Then container_of() could be used to access whatever other data nfsd had
stashed near the inode.  The common code wouldn't need to know any of
the details.
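
One possible reading of this suggestion, sketched in userspace C. Everything here is an assumption made for illustration: struct nfsd_break_ctx, its client_id field, and breaker_client_id() are hypothetical, and container_of() is re-defined locally in the usual offsetof() form rather than taken from kernel headers.

```c
#include <stddef.h>

struct inode;	/* opaque here; only pointers to it are used */

/* Userspace equivalent of the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Hypothetical nfsd-private wrapper: the generic code would only ever
 * see a pointer to the delegated_inode slot embedded in it.
 */
struct nfsd_break_ctx {
	int client_id;			/* stand-in for the breaking client */
	struct inode *delegated_inode;	/* slot handed to the VFS */
};

/* What an lm_breaker_owns_lease-style callback could do with that pointer. */
static int breaker_client_id(struct inode **delegated_inode)
{
	struct nfsd_break_ctx *ctx =
		container_of(delegated_inode, struct nfsd_break_ctx,
			     delegated_inode);

	return ctx->client_id;
}
```

The sketch only works if every pointer that reaches the callback is embedded in such a wrapper. A caller that passes the address of a plain struct inode * variable would be misread, so the callback would need some way to tell the two cases apart.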

Just a thought...

NeilBrown



* Re: [PATCH 1/3] fs: cleanup to hide some details of delegation logic
  2017-08-28  3:54   ` NeilBrown
@ 2017-08-29 21:37     ` J. Bruce Fields
  2017-08-30 19:50       ` Jeff Layton
  0 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-29 21:37 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Mon, Aug 28, 2017 at 01:54:12PM +1000, NeilBrown wrote:
> On Fri, Aug 25 2017, J. Bruce Fields wrote:
> 
> > From: "J. Bruce Fields" <bfields@redhat.com>
> >
> > Pull the checks for delegated_inode into break_deleg_wait() to simplify
> > the callers a little.
> >
> > No change in behavior.
> >
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > ---
> >  fs/namei.c         | 26 ++++++++++----------------
> >  fs/open.c          | 16 ++++++----------
> >  fs/utimes.c        |  8 +++-----
> >  include/linux/fs.h | 12 +++++++-----
> >  4 files changed, 26 insertions(+), 36 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index ddb6a7c2b3d4..5a93be7b2c9c 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -4048,11 +4048,9 @@ static long do_unlinkat(int dfd, const char __user *pathname)
> >  	if (inode)
> >  		iput(inode);	/* truncate the inode here */
> >  	inode = NULL;
> > -	if (delegated_inode) {
> > -		error = break_deleg_wait(&delegated_inode);
> > -		if (!error)
> > -			goto retry_deleg;
> > -	}
> > +	error = break_deleg_wait(&delegated_inode, error);
> > +	if (error == DELEG_RETRY)
> > +		goto retry_deleg;
> 
> <mode=bikeshed>
> 
> I don't like the "DELEG_RETRY".  You are comparing it against an
> 'error', but it doesn't start with '-E', so I get confused (happens
> often).
> 
> If this read:
> 
>      if (error > 0)
>           goto retry_deleg;
> 
> it would be much more obvious to me what was happening.  Clearly the
> return value isn't an error, and it isn't "success" either.  This is a
> pattern I've seen elsewhere.
> 
> Alternately you could use "-EAGAIN", but I suspect there is a risk of
> unwanted side-effects if you re-use an existing code.

Yes.  OK, I think I like your suggestion.  The change would look like
the following (untested).

--b.

diff --git a/fs/namei.c b/fs/namei.c
index 5a93be7b2c9c..e8688498aff7 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4049,7 +4049,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 		iput(inode);	/* truncate the inode here */
 	inode = NULL;
 	error = break_deleg_wait(&delegated_inode, error);
-	if (error == DELEG_RETRY)
+	if (error > 0)
 		goto retry_deleg;
 	mnt_drop_write(path.mnt);
 exit1:
@@ -4282,7 +4282,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
 out_dput:
 	done_path_create(&new_path, new_dentry);
 	error = break_deleg_wait(&delegated_inode, error);
-	if (error == DELEG_RETRY) {
+	if (error > 0) {
 		path_put(&old_path);
 		goto retry;
 	}
@@ -4598,7 +4598,7 @@ SYSCALL_DEFINE5(renameat2, int, olddfd, const char __user *, oldname,
 exit3:
 	unlock_rename(new_path.dentry, old_path.dentry);
 	error = break_deleg_wait(&delegated_inode, error);
-	if (error == DELEG_RETRY)
+	if (error > 0)
 		goto retry_deleg;
 	mnt_drop_write(old_path.mnt);
 exit2:
diff --git a/fs/open.c b/fs/open.c
index d49e9385e45d..80975c4dd146 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -533,7 +533,7 @@ static int chmod_common(const struct path *path, umode_t mode)
 out_unlock:
 	inode_unlock(inode);
 	error = break_deleg_wait(&delegated_inode, error);
-	if (error == DELEG_RETRY)
+	if (error > 0)
 		goto retry_deleg;
 	mnt_drop_write(path->mnt);
 	return error;
@@ -610,7 +610,7 @@ static int chown_common(const struct path *path, uid_t user, gid_t group)
 		error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	inode_unlock(inode);
 	error = break_deleg_wait(&delegated_inode, error);
-	if (error == DELEG_RETRY)
+	if (error > 0)
 		goto retry_deleg;
 	return error;
 }
diff --git a/fs/utimes.c b/fs/utimes.c
index 75467b7ebfce..4dc6717638e6 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -90,7 +90,7 @@ static int utimes_common(const struct path *path, struct timespec *times)
 	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	inode_unlock(inode);
 	error = break_deleg_wait(&delegated_inode, error);
-	if (error == DELEG_RETRY)
+	if (error > 0)
 		goto retry_deleg;
 
 	mnt_drop_write(path->mnt);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1d0d2fde1766..1c7f7be3f26d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2288,8 +2288,6 @@ static inline int try_break_deleg(struct inode *inode, struct inode **delegated_
 	return ret;
 }
 
-#define DELEG_RETRY 1
-
 static inline int break_deleg_wait(struct inode **delegated_inode, int error)
 {
 	if (!*delegated_inode)
@@ -2297,7 +2295,13 @@ static inline int break_deleg_wait(struct inode **delegated_inode, int error)
 	error = break_deleg(*delegated_inode, O_WRONLY);
 	iput(*delegated_inode);
 	*delegated_inode = NULL;
-	return error ? error : DELEG_RETRY;
+	if (error)
+		return error;
+	/*
+	 * Signal to the caller that it can retry the original operation
+	 * now that the delegation is broken:
+	 */
+	return 1;
 }
 
 static inline int break_layout(struct inode *inode, bool wait)


* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-08-28  4:43   ` NeilBrown
@ 2017-08-29 21:40     ` J. Bruce Fields
  2017-08-30  0:43       ` NeilBrown
  0 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-29 21:40 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Mon, Aug 28, 2017 at 02:43:05PM +1000, NeilBrown wrote:
> On Fri, Aug 25 2017, J. Bruce Fields wrote:
> 
> > From: "J. Bruce Fields" <bfields@redhat.com>
> >
> > Pass around a new struct deleg_break_ctl instead of pointers to inode
> > pointers; in a future patch I want to use this to pass a little more
> > information from the nfs server to the lease code.
> 
> The information you are passing from the nfs server to the lease code is
> largely ignored by the lease code and is passed back to the nfs server,
> in the lm_breaker_owns_lease callback.
> 
> If try_break_deleg() passed the 'delegated_inode' pointer through to
> __break_lease(), it could pass it through any_leases_conflict() and
> leases_conflict() to the lm_breaker_owns_lease() callback.
> Then container_of() could be used to access whatever other data nfsd had
> stashed near the inode.  The common code wouldn't need to know any of
> the details.

The new information that we need is some notion of "who" (really, which
NFSv4 client) is doing the operation (unlink, whatever) that breaks the
lease.  We can't get that information from an inode pointer.

I may just not understand your suggestion.

--b.


* Re: [PATCH 3/3] nfsd: clients don't need to break their own delegations
  2017-08-28  4:32   ` NeilBrown
@ 2017-08-29 21:49     ` J. Bruce Fields
  2018-03-16 14:43       ` J. Bruce Fields
  2017-09-07 22:01       ` J. Bruce Fields
  1 sibling, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-29 21:49 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

Now we get to harder questions....

On Mon, Aug 28, 2017 at 02:32:53PM +1000, NeilBrown wrote:
> On Fri, Aug 25 2017, J. Bruce Fields wrote:
> 
> > From: "J. Bruce Fields" <bfields@redhat.com>
> >
> > We currently revoke read delegations on any write open or any operation
> > that modifies file data or metadata (including rename, link, and
> > unlink).  But if the delegation in question is the only read delegation
> > and is held by the client performing the operation, that's not really
> > necessary.
> >
> > It's not always possible to prevent this in the NFSv4.0 case, because
> > there's not always a way to determine which client an NFSv4.0 delegation
> > came from.  (In theory we could try to guess this from the transport
> > layer, e.g., by assuming all traffic on a given TCP connection comes
> > from the same client.  But that's not really correct.)
> >
> > In the NFSv4.1 case the session layer always tells us the client.
> >
> > This patch should remove such self-conflicts in all cases where we can
> > reliably determine the client from the compound.
> 
> I don't see any mention of the new DELEG_NO_WAIT, either here or where
> the value is defined.  That means I have to figure it out for myself?

That's a fair complaint.

> When I saw this, I thought: "Shouldn't 'who' be 'fl_owner_t' rather than
> 'void*'".
> Then I saw
> 
> /* legacy typedef, should eventually be removed */
> typedef void *fl_owner_t;
> 
> 
> Maybe you could do the world a favor and remove fl_owner_t in a
> preliminary patch :-)

I remember trying that before and getting frustrated for some reason.
OK, I'll take another look.

> And it is kind-a weird that the "who" you pass to break_lease() is
> different from the owner passed to vfs_setlease().
> 
> >  
> >  	/* typically we will check that ctx is non-NULL before calling */
> >  	ctx = smp_load_acquire(&inode->i_flctx);
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 0c04f81aa63b..fb15efcc4e08 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -3825,6 +3825,45 @@ nfsd_break_deleg_cb(struct file_lock *fl)
> >  	return ret;
> >  }
> >  
> > +static struct nfs4_client *nfsd4_client_from_rqst(struct svc_rqst *rqst)
> > +{
> > +	struct nfsd4_compoundres *resp;
> > +
> > +	/*
> > +	 * In case it's possible we could be called from NLM or ACL
> > +	 * code?:
> > +	 */
> > +	if (rqst->rq_prog != NFS_PROGRAM)
> > +		return NULL;
> > +	if (rqst->rq_vers != 4)
> > +		return NULL;
> > +	resp = rqst->rq_resp;
> > +	return resp->cstate.clp;
> > +}
> > +
> > +static bool nfsd_breaker_owns_lease(void *who, struct file_lock *fl)
> > +{
> > +	struct svc_rqst *rqst = who;
> > +	struct nfs4_client *cl;
> > +	struct nfs4_delegation *dl;
> > +	struct nfs4_file *fi = (struct nfs4_file *)fl->fl_owner;
> > +	bool ret = true;
> > +
> > +	cl = nfsd4_client_from_rqst(rqst);
> > +	if (!cl)
> > +		return false;
> > +
> > +	spin_lock(&fi->fi_lock);
> > +	list_for_each_entry(dl, &fi->fi_delegations, dl_perfile) {
> > +		if (dl->dl_stid.sc_client != cl) {
> > +			ret = false;
> > +			break;
> > +		}
> > +	}
> > +	spin_unlock(&fi->fi_lock);
> > +	return ret;
> > +}
> 
> You haven't provided any documentation telling me what the
> "lm_breaker_owns_lease" callback is meant to do.
> So I look at this one piece of sample code -- it seems to compute:
>   not (another client owns lease)
> rather than
>   (this client owns lease)
> which is what I would have expected.
> 
> Given that any_leases_conflict() already loops over all leases, does
> nfsd_breaker_owns_lease() need to loop as well?
> Or does nfsd only take a single lease to cover all the delegations to
> all the clients?
> ... hmmm, yes that does seem to be how nfsd works.
> 
> Would this all turn out to be easier if nfsd took a separate lease for
> each client?  What would be the cost of that?

I'll have to remind myself.

I think it might have been forced by the decision to consolidate NFSv4
opens in a similar way.

Both decisions I regret since they've been a source of overly
complicated code that's had lots of bugs.  But I may just be forgetting
the drawbacks of the alternative.

Anyway, I'll review that code.  But I think the answer will be that we
need to live with it for now.

But, yes, this needs some comments and better variable names at a
minimum.

--b.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Eliminate delegation self-conflicts
  2017-08-26 18:06 ` [PATCH 0/3] Eliminate delegation self-conflicts Chuck Lever
@ 2017-08-29 21:52   ` J. Bruce Fields
  2017-08-29 23:39     ` Chuck Lever
  0 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-29 21:52 UTC (permalink / raw)
  To: Chuck Lever
  Cc: J. Bruce Fields, Linux NFS Mailing List, linux-fsdevel, Trond Myklebust

On Sat, Aug 26, 2017 at 02:06:05PM -0400, Chuck Lever wrote:
> 
> > On Aug 25, 2017, at 5:52 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> > 
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > This is my attempt to fix the NFS server so we don't unnecessarily
> > recall delegations when the operation breaking the delegation comes from
> > the same client that holds the delegation.
> > 
> > To do that we need some way to pass the identity of the breaker down
> > through the VFS.
> > 
> > I didn't feel like adding another argument to all the VFS functions that
> > this might need to be passed down through.  But all of those functions
> > already take a struct inode **delegated inode, so instead I turned that
> > into a single-member struct deleg_ctrl *, which I then added a second
> > member to.
> > 
> > I dunno, welcome any more straightforward ways of doing this if anyone
> > has suggestions.
> > 
> > My first attempt was to do this by instead checking for conflicts in the
> > caller (nfsd) and then passing down one just one bit telling the lease
> > code conflicts had already been checked so it didn't need to.  But
> > that's much too early to check for conflicts, since the caller doesn't
> > have the necessary inode locks yet.
> > 
> > I'm still missing testing.  Regression tests pass, but I haven't
> > actually confirmed that the self-conflicts are gone!  Off to go hack on
> > pynfs....
> 
> FWIW, I observe a lot of delegation recall activity when running

How are you measuring that?

> the git regression tests on an NFSv4.x mount. This is a single
> client.

> 
> Unpack a recent release of the git tarball.
> 
> $ cd src/git
> $ make clean
> $ make
> $ make test
> 
> Easily scriptable, and you can "cd t/" and run individual
> regression tests if you like.

Thanks, I'll look into that.

--b.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Eliminate delegation self-conflicts
  2017-08-29 21:52   ` J. Bruce Fields
@ 2017-08-29 23:39     ` Chuck Lever
  0 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2017-08-29 23:39 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: J. Bruce Fields, Linux NFS Mailing List, linux-fsdevel, Trond Myklebust


> On Aug 29, 2017, at 5:52 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Sat, Aug 26, 2017 at 02:06:05PM -0400, Chuck Lever wrote:
>> 
>>> On Aug 25, 2017, at 5:52 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>> 
>>> From: "J. Bruce Fields" <bfields@redhat.com>
>>> 
>>> This is my attempt to fix the NFS server so we don't unnecessarily
>>> recall delegations when the operation breaking the delegation comes from
>>> the same client that holds the delegation.
>>> 
>>> To do that we need some way to pass the identity of the breaker down
>>> through the VFS.
>>> 
>>> I didn't feel like adding another argument to all the VFS functions that
>>> this might need to be passed down through.  But all of those functions
>>> already take a struct inode **delegated inode, so instead I turned that
>>> into a single-member struct deleg_ctrl *, which I then added a second
>>> member to.
>>> 
>>> I dunno, welcome any more straightforward ways of doing this if anyone
>>> has suggestions.
>>> 
>>> My first attempt was to do this by instead checking for conflicts in the
>>> caller (nfsd) and then passing down one just one bit telling the lease
>>> code conflicts had already been checked so it didn't need to.  But
>>> that's much too early to check for conflicts, since the caller doesn't
>>> have the necessary inode locks yet.
>>> 
>>> I'm still missing testing.  Regression tests pass, but I haven't
>>> actually confirmed that the self-conflicts are gone!  Off to go hack on
>>> pynfs....
>> 
>> FWIW, I observe a lot of delegation recall activity when running
> 
> How are you measuring that?

My observation is based on wire traces.


>> the git regression tests on an NFSv4.x mount. This is a single
>> client.
> 
>> 
>> Unpack a recent release of the git tarball.
>> 
>> $ cd src/git
>> $ make clean
>> $ make
>> $ make test
>> 
>> Easily scriptable, and you can "cd t/" and run individual
>> regression tests if you like.
> 
> Thanks, I'll look into that.
> 
> --b.

--
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-08-29 21:40     ` J. Bruce Fields
@ 2017-08-30  0:43       ` NeilBrown
  2017-08-30 17:09         ` J. Bruce Fields
  0 siblings, 1 reply; 35+ messages in thread
From: NeilBrown @ 2017-08-30  0:43 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

[-- Attachment #1: Type: text/plain, Size: 2348 bytes --]

On Tue, Aug 29 2017, J. Bruce Fields wrote:

> On Mon, Aug 28, 2017 at 02:43:05PM +1000, NeilBrown wrote:
>> On Fri, Aug 25 2017, J. Bruce Fields wrote:
>> 
>> > From: "J. Bruce Fields" <bfields@redhat.com>
>> >
>> > Pass around a new struct deleg_break_ctl instead of pointers to inode
>> > pointers; in a future patch I want to use this to pass a little more
>> > information from the nfs server to the lease code.
>> 
>> The information you are passing from the nfs server to the lease code is
>> largely ignored by the lease code and is passed back to the nfs server,
>> in the lm_breaker_owns_lease call back.
>> 
>> If try_break_deleg() passed the 'delegated_inode' pointer through to
>> __break_lease(), it could pass it through any_leases_conflict() and
>> leases_conflict() to the lm_breaker_owns_lease() callback.
>> Then container_of() could be used to access whatever other data nfsd had
>> stashed near the inode.  The common code wouldn't need to know any of
>> the details.
>
> The new information that we need is some notion of "who" (really, which
> NFSv4 client) is doing the operation (unlink, whatever) that breaks the
> lease.  We can't get that information from an inode pointer.
>
> I may just not understand your suggestion.

Probably I was too terse.

I'm suggesting that nfsd have a local "struct deleg_break_ctl" (or
whatever name you like) which contains a 'struct inode *delegated_inode'
plus whatever else is useful to nfsd.
Then nfsd/vfs.c, when it calls things like vfs_unlink(), passes
 &dbc.delegated_inode
(where 'struct deleg_break_ctl dbc').
So the vfs code doesn't know about 'struct deleg_break_ctl', it just
knows about the 'struct inode ** inodep' like it does now, though with the
understanding that "DELEG_NO_WAIT" in the **inodep means the same as
inodep==NULL.

The vfs passes this same 'struct inode **' to lm_breaker_owns_lease() and
the nfsd code uses
   dbc = container_of(inodep, struct deleg_break_ctl, delegated_inode)
to get the dbc, and it can use the other fields however it likes.
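
A minimal userspace sketch of the container_of() idea described above,
not the actual nfsd code.  The layout of struct deleg_break_ctl, the
'who' member, the helper breaker_from_inodep() and the local
container_of() macro are all assumptions made for illustration; the
kernel has its own container_of().

```c
#include <stddef.h>

/* Stand-in for the kernel's struct inode; only used as an opaque type here. */
struct inode { int i_ino; };

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Hypothetical nfsd-private wrapper.  The VFS would only ever be handed
 * &dbc.delegated_inode and would know nothing about the other members.
 */
struct deleg_break_ctl {
	struct inode *delegated_inode;
	void *who;	/* assumed: identity of the breaking client */
};

/*
 * What an nfsd callback could do with the 'struct inode **' it receives:
 * step back to the enclosing structure and read the extra data.
 * This is only safe if the pointer really is embedded in such a structure.
 */
static void *breaker_from_inodep(struct inode **inodep)
{
	struct deleg_break_ctl *dbc =
		container_of(inodep, struct deleg_break_ctl, delegated_inode);

	return dbc->who;
}
```

The caveat in the last comment is the weak point of the approach: nothing
in the pointer itself says whether it came from nfsd.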

Then instead of the rather task-specific name "lm_breaker_owns_lease" we
could have a more general name like "lm_lease_compatible" ... or
something.  "lm_break_doesn't_see_this_lease_as_being_in_conflict" is a
bit long, and contains "'".

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-08-30  0:43       ` NeilBrown
@ 2017-08-30 17:09         ` J. Bruce Fields
  2017-08-30 23:26           ` NeilBrown
  0 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-30 17:09 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Wed, Aug 30, 2017 at 10:43:52AM +1000, NeilBrown wrote:
> On Tue, Aug 29 2017, J. Bruce Fields wrote:
> 
> > On Mon, Aug 28, 2017 at 02:43:05PM +1000, NeilBrown wrote:
> >> On Fri, Aug 25 2017, J. Bruce Fields wrote:
> >> 
> >> > From: "J. Bruce Fields" <bfields@redhat.com>
> >> >
> >> > Pass around a new struct deleg_break_ctl instead of pointers to inode
> >> > pointers; in a future patch I want to use this to pass a little more
> >> > information from the nfs server to the lease code.
> >> 
> >> The information you are passing from the nfs server to the lease code is
> >> largely ignored by the lease code and is passed back to the nfs server,
> >> in the lm_breaker_owns_lease call back.
> >> 
> >> If try_break_deleg() passed the 'delegated_inode' pointer through to
> >> __break_lease(), it could pass it through any_leases_conflict() and
> >> leases_conflict() to the lm_breaker_owns_lease() callback.
> >> Then container_of() could be used to access whatever other data nfsd had
> >> stashed near the inode.  The common code wouldn't need to know any of
> >> the details.
> >
> > The new information that we need is some notion of "who" (really, which
> > NFSv4 client) is doing the operation (unlink, whatever) that breaks the
> > lease.  We can't get that information from an inode pointer.
> >
> > I may just not understand your suggestion.
> 
> Probably I was too terse.
> 
> I'm suggesting that nfsd have a local "struct deleg_break_ctl" (or
> whatever name you like) which contains a 'struct inode *delegated_inode'
> plus whatever else is useful to nfsd.
> Then nfsd/vfs.c, when it calls things like vfs_unlink(), passes
>  &dbc.delegated_inode
> (where 'struct deleg_break_ctl dbc').
> So the vfs codes doesn't know about 'struct deleg_break_ctl', it just
> knows about the 'struct inode ** inodep' like it does now, though with the
> understanding that "DELEG_NO_WAIT" in the **inodep means that same as
> inodep==NULL.
> 
> The vfs passes this same 'struct **inode' to lm_breaker_owns_lease() and
> the nfsd code uses
>    dbc = container_of(inodep, struct deleg_break_ctl, delegated_inode)
> to get the dbc, and it can use the other fields however it likes.

Oh, now I understand.  That's an interesting idea.  I don't *think* it
works on its own, because I don't think we've got a way in that case to
know whether the passed-down delegated inode came from nfsd (and thus is
contained in a deleg_break_ctl structure).  We get the
lm_breaker_owns_lease operation from the lease that's already set on the
inode, but we don't know who that breaking operation is coming from.

Maybe something along those lines could be made to work.

> Then instead of the rather task-specific name "lm_breaker_owns_lease" we
> could have a more general name like "lm_lease_compatible" ... or
> something.  "lm_break_doesn't_see_this_lease_as_being_in_conflict" is a
> bit long, and contains "'".

Hah, yes.

--b.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/3] fs: cleanup to hide some details of delegation logic
  2017-08-29 21:37     ` J. Bruce Fields
@ 2017-08-30 19:50       ` Jeff Layton
  2017-08-31 21:10         ` J. Bruce Fields
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff Layton @ 2017-08-30 19:50 UTC (permalink / raw)
  To: J. Bruce Fields, NeilBrown; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Tue, 2017-08-29 at 17:37 -0400, J. Bruce Fields wrote:
> On Mon, Aug 28, 2017 at 01:54:12PM +1000, NeilBrown wrote:
> > On Fri, Aug 25 2017, J. Bruce Fields wrote:
> > 
> > > From: "J. Bruce Fields" <bfields@redhat.com>
> > > 
> > > Pull the checks for delegated_inode into break_deleg_wait() to
> > > simplify
> > > the callers a little.
> > > 
> > > No change in behavior.
> > > 
> > > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > > ---
> > >  fs/namei.c         | 26 ++++++++++----------------
> > >  fs/open.c          | 16 ++++++----------
> > >  fs/utimes.c        |  8 +++-----
> > >  include/linux/fs.h | 12 +++++++-----
> > >  4 files changed, 26 insertions(+), 36 deletions(-)
> > > 
> > > diff --git a/fs/namei.c b/fs/namei.c
> > > index ddb6a7c2b3d4..5a93be7b2c9c 100644
> > > --- a/fs/namei.c
> > > +++ b/fs/namei.c
> > > @@ -4048,11 +4048,9 @@ static long do_unlinkat(int dfd, const
> > > char __user *pathname)
> > >  	if (inode)
> > >  		iput(inode);	/* truncate the inode here
> > > */
> > >  	inode = NULL;
> > > -	if (delegated_inode) {
> > > -		error = break_deleg_wait(&delegated_inode);
> > > -		if (!error)
> > > -			goto retry_deleg;
> > > -	}
> > > +	error = break_deleg_wait(&delegated_inode, error);
> > > +	if (error == DELEG_RETRY)
> > > +		goto retry_deleg;
> > 
> > <mode=bikeshed>
> > 
> > I don't like the "DELEG_RETRY".  You are comparing it against an
> > 'error', but it doesn't start with '-E', so I get confused (happens
> > often).
> > 
> > If this read:
> > 
> >      if (error > 0)
> >           goto retry_deleg;
> > 
> > it would be much more obvious to me what was happening.  Clearly
> > the
> > return value isn't an error, and it isn't "success" either.  This
> > is a
> > pattern I've seen elsewhere.
> > 
> > Alternately you could use "-EAGAIN", but I suspect there is a risk
> > of
> > unwanted side-effects if you re-use an existing code.
> 
> Yes.  OK, I think I like your suggestion.  The change would look like
> the following (untested).
> 
> --b.
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 5a93be7b2c9c..e8688498aff7 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4049,7 +4049,7 @@ static long do_unlinkat(int dfd, const char
> __user *pathname)
>  		iput(inode);	/* truncate the inode here */
>  	inode = NULL;
>  	error = break_deleg_wait(&delegated_inode, error);
> -	if (error == DELEG_RETRY)
> +	if (error > 0)
>  		goto retry_deleg;
>  	mnt_drop_write(path.mnt);
>  exit1:
> @@ -4282,7 +4282,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char
> __user *, oldname,
>  out_dput:
>  	done_path_create(&new_path, new_dentry);
>  	error = break_deleg_wait(&delegated_inode, error);
> -	if (error == DELEG_RETRY) {
> +	if (error > 0) {
>  		path_put(&old_path);
>  		goto retry;
>  	}
> @@ -4598,7 +4598,7 @@ SYSCALL_DEFINE5(renameat2, int, olddfd, const
> char __user *, oldname,
>  exit3:
>  	unlock_rename(new_path.dentry, old_path.dentry);
>  	error = break_deleg_wait(&delegated_inode, error);
> -	if (error == DELEG_RETRY)
> +	if (error > 0)
>  		goto retry_deleg;
>  	mnt_drop_write(old_path.mnt);
>  exit2:
> diff --git a/fs/open.c b/fs/open.c
> index d49e9385e45d..80975c4dd146 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -533,7 +533,7 @@ static int chmod_common(const struct path *path,
> umode_t mode)
>  out_unlock:
>  	inode_unlock(inode);
>  	error = break_deleg_wait(&delegated_inode, error);
> -	if (error == DELEG_RETRY)
> +	if (error > 0)
>  		goto retry_deleg;
>  	mnt_drop_write(path->mnt);
>  	return error;
> @@ -610,7 +610,7 @@ static int chown_common(const struct path *path,
> uid_t user, gid_t group)
>  		error = notify_change(path->dentry, &newattrs,
> &delegated_inode);
>  	inode_unlock(inode);
>  	error = break_deleg_wait(&delegated_inode, error);
> -	if (error == DELEG_RETRY)
> +	if (error > 0)
>  		goto retry_deleg;
>  	return error;
>  }
> diff --git a/fs/utimes.c b/fs/utimes.c
> index 75467b7ebfce..4dc6717638e6 100644
> --- a/fs/utimes.c
> +++ b/fs/utimes.c
> @@ -90,7 +90,7 @@ static int utimes_common(const struct path *path,
> struct timespec *times)
>  	error = notify_change(path->dentry, &newattrs,
> &delegated_inode);
>  	inode_unlock(inode);
>  	error = break_deleg_wait(&delegated_inode, error);
> -	if (error == DELEG_RETRY)
> +	if (error > 0)
>  		goto retry_deleg;
>  
>  	mnt_drop_write(path->mnt);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 1d0d2fde1766..1c7f7be3f26d 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2288,8 +2288,6 @@ static inline int try_break_deleg(struct inode
> *inode, struct inode **delegated_
>  	return ret;
>  }
>  
> -#define DELEG_RETRY 1
> -
>  static inline int break_deleg_wait(struct inode **delegated_inode,
> int error)
>  {
>  	if (!*delegated_inode)
> @@ -2297,7 +2295,13 @@ static inline int break_deleg_wait(struct
> inode **delegated_inode, int error)
>  	error = break_deleg(*delegated_inode, O_WRONLY);
>  	iput(*delegated_inode);
>  	*delegated_inode = NULL;
> -	return error ? error : DELEG_RETRY;
> +	if (error)
> +		return error;
> +	/*
> +	 * Signal to the caller that it can retry the original
> operation
> +	 * now that the delegation is broken:
> +	 */
> +	return 1;
>  }
>  
>  static inline int break_layout(struct inode *inode, bool wait)

ACK, I like that better too. I think a kerneldoc header is probably
warranted here too, since this is a bit of an odd return situation.
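
For reference, the return convention agreed on above can be modelled in
userspace roughly as follows.  This is a sketch under assumptions, not
the kernel implementation: the recalls_pending field and the try_op()
and do_op() helpers are hypothetical, and the "wait" is simulated by
clearing a flag.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in inode; recalls_pending simulates an outstanding delegation. */
struct inode { int recalls_pending; };

/*
 * Model of break_deleg_wait(): returns the passed-in error if there is
 * nothing to wait on, or a positive value once the (simulated) recall
 * has completed, telling the caller to retry.
 */
static int break_deleg_wait(struct inode **delegated_inode, int error)
{
	if (!*delegated_inode)
		return error;
	(*delegated_inode)->recalls_pending = 0;	/* simulated wait */
	*delegated_inode = NULL;
	return 1;
}

/* Hypothetical operation that fails while a delegation is outstanding. */
static int try_op(struct inode *inode, struct inode **delegated_inode)
{
	if (inode->recalls_pending) {
		*delegated_inode = inode;
		return -EWOULDBLOCK;
	}
	return 0;
}

/* Caller-side pattern: negative is an error, zero is success, positive means retry. */
static int do_op(struct inode *inode, int *attempts)
{
	struct inode *delegated_inode = NULL;
	int error;

retry_deleg:
	(*attempts)++;
	error = try_op(inode, &delegated_inode);
	error = break_deleg_wait(&delegated_inode, error);
	if (error > 0)
		goto retry_deleg;
	return error;
}
```
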
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-08-30 17:09         ` J. Bruce Fields
@ 2017-08-30 23:26           ` NeilBrown
  2017-08-31 19:05             ` J. Bruce Fields
  0 siblings, 1 reply; 35+ messages in thread
From: NeilBrown @ 2017-08-30 23:26 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

[-- Attachment #1: Type: text/plain, Size: 3893 bytes --]

On Wed, Aug 30 2017, J. Bruce Fields wrote:

> On Wed, Aug 30, 2017 at 10:43:52AM +1000, NeilBrown wrote:
>> On Tue, Aug 29 2017, J. Bruce Fields wrote:
>> 
>> > On Mon, Aug 28, 2017 at 02:43:05PM +1000, NeilBrown wrote:
>> >> On Fri, Aug 25 2017, J. Bruce Fields wrote:
>> >> 
>> >> > From: "J. Bruce Fields" <bfields@redhat.com>
>> >> >
>> >> > Pass around a new struct deleg_break_ctl instead of pointers to inode
>> >> > pointers; in a future patch I want to use this to pass a little more
>> >> > information from the nfs server to the lease code.
>> >> 
>> >> The information you are passing from the nfs server to the lease code is
>> >> largely ignored by the lease code and is passed back to the nfs server,
>> >> in the lm_breaker_owns_lease call back.
>> >> 
>> >> If try_break_deleg() passed the 'delegated_inode' pointer through to
>> >> __break_lease(), it could pass it through any_leases_conflict() and
>> >> leases_conflict() to the lm_breaker_owns_lease() callback.
>> >> Then container_of() could be used to access whatever other data nfsd had
>> >> stashed near the inode.  The common code wouldn't need to know any of
>> >> the details.
>> >
>> > The new information that we need is some notion of "who" (really, which
>> > NFSv4 client) is doing the operation (unlink, whatever) that breaks the
>> > lease.  We can't get that information from an inode pointer.
>> >
>> > I may just not understand your suggestion.
>> 
>> Probably I was too terse.
>> 
>> I'm suggesting that nfsd have a local "struct deleg_break_ctl" (or
>> whatever name you like) which contains a 'struct inode *delegated_inode'
>> plus whatever else is useful to nfsd.
>> Then nfsd/vfs.c, when it calls things like vfs_unlink(), passes
>>  &dbc.delegated_inode
>> (where 'struct deleg_break_ctl dbc').
>> So the vfs codes doesn't know about 'struct deleg_break_ctl', it just
>> knows about the 'struct inode ** inodep' like it does now, though with the
>> understanding that "DELEG_NO_WAIT" in the **inodep means that same as
>> inodep==NULL.
>> 
>> The vfs passes this same 'struct **inode' to lm_breaker_owns_lease() and
>> the nfsd code uses
>>    dbc = container_of(inodep, struct deleg_break_ctl, delegated_inode)
>> to get the dbc, and it can use the other fields however it likes.
>
> Oh, now I understand.  That's an interesting idea.  I don't *think* it
> works on its own, because I don't think we've got a way in that case to
> know whether the passed-down delegated inode came from nfsd (and thus is
> contained in a deleg_break_ctl structure).  We get the
> lm_breaker_owns_lease operation from the lease that's already set on the
> inode, but we don't know who that breaking operation is coming from.

That is a perfectly valid criticism and one that, I think, applies
equally to your original code.

 +static bool nfsd_breaker_owns_lease(void *who, struct file_lock *fl)
 +{
 +	struct svc_rqst *rqst = who;

How does nfsd know that 'who' is an svc_rqst??

You can save your code by passing
   nfsd4_client_from_rqst(rqst)

as the 'who', and then testing
 +		if (dl->dl_stid.sc_client != who) {

in the loop in nfsd_breaker_owns_lease.  So the only action performed
on the void* is an equality test.
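
A rough userspace model of that variant, assuming simplified stand-in
types (struct deleg and the function breaker_owns_all() are
hypothetical names, and an array replaces the fi_delegations list):

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the nfsd types; only pointer identity matters here. */
struct nfs4_client { int id; };
struct deleg { const struct nfs4_client *client; };

/*
 * 'who' is treated as an opaque cookie and only ever compared for
 * equality, so a NULL or foreign cookie is never dereferenced.
 * Returns true when no delegation belongs to a client other than 'who',
 * which mirrors the "not (another client owns lease)" behaviour of the
 * posted patch, including returning true for an empty list.
 */
static bool breaker_owns_all(const void *who, const struct deleg *dl, size_t n)
{
	size_t i;

	if (!who)
		return false;
	for (i = 0; i < n; i++) {
		if (dl[i].client != who)
			return false;
	}
	return true;
}
```
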

I cannot save my code quite so easily. :-(

Thanks,
NeilBrown


>
> Maybe something along those lines could be made to work.
>
>> Then instead of the rather task-specific name "lm_breaker_owns_lease" we
>> could have a more general name like "lm_lease_compatible" ... or
>> something.  "lm_break_doesn't_see_this_lease_as_being_in_conflict" is a
>> bit long, and contains "'".
>
> Hah, yes.
>
> --b.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-08-30 23:26           ` NeilBrown
@ 2017-08-31 19:05             ` J. Bruce Fields
  2017-08-31 23:27               ` NeilBrown
  0 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-31 19:05 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Thu, Aug 31, 2017 at 09:26:28AM +1000, NeilBrown wrote:
> On Wed, Aug 30 2017, J. Bruce Fields wrote:
> 
> > On Wed, Aug 30, 2017 at 10:43:52AM +1000, NeilBrown wrote:
> >> I'm suggesting that nfsd have a local "struct deleg_break_ctl" (or
> >> whatever name you like) which contains a 'struct inode *delegated_inode'
> >> plus whatever else is useful to nfsd.
> >> Then nfsd/vfs.c, when it calls things like vfs_unlink(), passes
> >>  &dbc.delegated_inode
> >> (where 'struct deleg_break_ctl dbc').
> >> So the vfs codes doesn't know about 'struct deleg_break_ctl', it just
> >> knows about the 'struct inode ** inodep' like it does now, though with the
> >> understanding that "DELEG_NO_WAIT" in the **inodep means that same as
> >> inodep==NULL.
> >> 
> >> The vfs passes this same 'struct **inode' to lm_breaker_owns_lease() and
> >> the nfsd code uses
> >>    dbc = container_of(inodep, struct deleg_break_ctl, delegated_inode)
> >> to get the dbc, and it can use the other fields however it likes.
> >
> > Oh, now I understand.  That's an interesting idea.  I don't *think* it
> > works on its own, because I don't think we've got a way in that case to
> > know whether the passed-down delegated inode came from nfsd (and thus is
> > contained in a deleg_break_ctl structure).  We get the
> > lm_breaker_owns_lease operation from the lease that's already set on the
> > inode, but we don't know who that breaking operation is coming from.
> 
> That is a perfectly valid criticism and one that, I think, applies
> equally to your original code.
> 
>  +static bool nfsd_breaker_owns_lease(void *who, struct file_lock *fl)
>  +{
>  +	struct svc_rqst *rqst = who;
> 
> How does nfsd know that 'who' is an svc_rqst??

Only nfsd fills in the "who" field of deleg_break_ctl.  But non-nfsd
users do need to pass a non-NULL delegated_inode.

> You can save your code by passing
>    nfsd4_client_from_rqst(rqst)
> 
> as the 'who', and then testing
>  +		if (dl->dl_stid.sc_client != who) {
> 
> in the loop in nfsd_breaker_owns_lease.  So the only action performed
> on the void* is an equality test.

Yes, that sounds more robust, thanks for the suggestion.

--b.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/3] fs: cleanup to hide some details of delegation logic
  2017-08-30 19:50       ` Jeff Layton
@ 2017-08-31 21:10         ` J. Bruce Fields
  2017-08-31 23:13           ` Jeff Layton
  0 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-08-31 21:10 UTC (permalink / raw)
  To: Jeff Layton; +Cc: NeilBrown, linux-nfs, linux-fsdevel, Trond Myklebust

On Wed, Aug 30, 2017 at 03:50:59PM -0400, Jeff Layton wrote:
> ACK, I like that better too. I think a kerneldoc header is probably
> warranted here too, since this is a bit of an odd return situation.

Am I overdoing it?:

--b.

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6421feeda4bd..2261728cc900 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2285,6 +2285,25 @@ static inline int break_deleg(struct inode *inode, void *who, unsigned int mode)
 
 #define DELEG_NO_WAIT ((struct inode *)1)
 
+/**
+ * try_break_deleg - initiate a delegation break
+ * @inode: inode to break the delegation on
+ * @deleg_break_ctl: delegation state; see below
+ *
+ * VFS operations that are incompatible with a delegation call this to
+ * break any delegations on the inode first.  The caller must first lock
+ * the inode to prevent races with processes granting new delegations.
+ *
+ * Delegations may be slow to recall, so we initiate the recall but do
+ * not wait for it here while holding locks.  The caller should instead
+ * drop locks and call break_deleg_wait() which will wait for a recall,
+ * if there is one.  The inode to wait on will be stored in
+ * deleg_break_ctl, which also tracks who is breaking the delegation in
+ * the NFS case.  The caller can then retry the operation (possibly on a
+ * different inode, since a new lookup may have been required after
+ * reacquiring locks.)
+ */
+
 static inline int try_break_deleg(struct inode *inode, struct deleg_break_ctl *deleg_break_ctl)
 {
 	int ret;
@@ -2299,6 +2318,22 @@ static inline int try_break_deleg(struct inode *inode, struct deleg_break_ctl *d
 	return ret;
 }
 
+/**
+ * break_deleg_wait - wait on a delegation recall if necessary
+ * @deleg_break_ctl: delegation state
+ * @error: error to use if there is no delegation to wait on
+ *
+ * This should be called with the deleg_break_ctl previously passed to
+ * try_break_deleg().
+ *
+ * If the previous try_break_deleg() found no delegation in need of
+ * breaking, this is a no-op that just returns the given error.
+ *
+ * Otherwise it will wait for the delegation recall.  If the wait is
+ * successful, it will return a positive value to indicate to the caller
+ * that it should retry the operation that originally prompted the
+ * break.
+ */
 static inline int break_deleg_wait(struct deleg_break_ctl *deleg_break_ctl, int error)
 {
 	if (!deleg_break_ctl->delegated_inode)

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/3] fs: cleanup to hide some details of delegation logic
  2017-08-31 21:10         ` J. Bruce Fields
@ 2017-08-31 23:13           ` Jeff Layton
  0 siblings, 0 replies; 35+ messages in thread
From: Jeff Layton @ 2017-08-31 23:13 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: NeilBrown, linux-nfs, linux-fsdevel, Trond Myklebust

On Thu, 2017-08-31 at 17:10 -0400, J. Bruce Fields wrote:
> On Wed, Aug 30, 2017 at 03:50:59PM -0400, Jeff Layton wrote:
> > ACK, I like that better too. I think a kerneldoc header is probably
> > warranted here too, since this is a bit of an odd return situation.
> 
> Am I overdoing it?:
> 
> --b.
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 6421feeda4bd..2261728cc900 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2285,6 +2285,25 @@ static inline int break_deleg(struct inode *inode, void *who, unsigned int mode)
>  
>  #define DELEG_NO_WAIT ((struct inode *)1)
>  
> +/**
> + * try_break_deleg - initiate a delegation break
> + * @inode: inode to break the delegation on
> + * @deleg_break_ctl: delegation state; see below
> + *
> + * VFS operations that are incompatible with a delegation call this to
> + * break any delegations on the inode first.  The caller must first lock
> + * the inode to prevent races with processes granting new delegations.
> + *
> + * Delegations may be slow to recall, so we initiate the recall but do
> + * not wait for it here while holding locks.  The caller should instead
> + * drop locks and call break_deleg_wait() which will wait for a recall,
> + * if there is one.  The inode to wait on will be stored in
> + * deleg_break_ctl, which also tracks who is breaking the delegation in
> + * the NFS case.  The caller can then retry the operation (possibly on a
> + * different inode, since a new lookup may have been required after
> + * reacquiring locks.)
> + */
> +
>  static inline int try_break_deleg(struct inode *inode, struct deleg_break_ctl *deleg_break_ctl)
>  {
>  	int ret;
> @@ -2299,6 +2318,22 @@ static inline int try_break_deleg(struct inode *inode, struct deleg_break_ctl *d
>  	return ret;
>  }
>  
> +/**
> + * break_deleg_wait - wait on a delegation recall if necessary
> + * @deleg_break_ctl: delegation state
> + * @error: error to use if there is no delegation to wait on
> + *
> + * This should be called with the deleg_break_ctl previously passed to
> + * try_break_deleg().
> + *
> + * If the previous try_break_deleg() found no delegation in need of
> + * breaking, this is a no-op that just returns the given error.
> + *
> + * Otherwise it will wait for the delegation recall.  If the wait is
> + * successful, it will return a positive value to indicate to the caller
> + * that it should retry the operation that originally prompted the
> + * break.
> + */
>  static inline int break_deleg_wait(struct deleg_break_ctl *deleg_break_ctl, int error)
>  {
>  	if (!deleg_break_ctl->delegated_inode)

No, I like it. This is tricky code, and having the rationale and
detailed behavior spelled out is a good thing.

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-08-31 19:05             ` J. Bruce Fields
@ 2017-08-31 23:27               ` NeilBrown
  2017-09-01 16:18                 ` J. Bruce Fields
  0 siblings, 1 reply; 35+ messages in thread
From: NeilBrown @ 2017-08-31 23:27 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

[-- Attachment #1: Type: text/plain, Size: 3432 bytes --]

On Thu, Aug 31 2017, J. Bruce Fields wrote:

> On Thu, Aug 31, 2017 at 09:26:28AM +1000, NeilBrown wrote:
>> On Wed, Aug 30 2017, J. Bruce Fields wrote:
>> 
>> > On Wed, Aug 30, 2017 at 10:43:52AM +1000, NeilBrown wrote:
>> >> I'm suggesting that nfsd have a local "struct deleg_break_ctl" (or
>> >> whatever name you like) which contains a 'struct inode *delegated_inode'
>> >> plus whatever else is useful to nfsd.
>> >> Then nfsd/vfs.c, when it calls things like vfs_unlink(), passes
>> >>  &dbc.delegated_inode
>> >> (where 'struct deleg_break_ctl dbc').
>> >> So the vfs codes doesn't know about 'struct deleg_break_ctl', it just
>> >> knows about the 'struct inode ** inodep' like it does now, though with the
>> >> understanding that "DELEG_NO_WAIT" in the **inodep means that same as
>> >> inodep==NULL.
>> >> 
>> >> The vfs passes this same 'struct **inode' to lm_breaker_owns_lease() and
>> >> the nfsd code uses
>> >>    dbc = container_of(inodep, struct deleg_break_ctl, delegated_inode)
>> >> to get the dbc, and it can use the other fields however it likes.
>> >
>> > Oh, now I understand.  That's an interesting idea.  I don't *think* it
>> > works on its own, because I don't think we've got a way in that case to
>> > know whether the passed-down delegated inode came from nfsd (and thus is
>> > contained in a deleg_break_ctl structure).  We get the
>> > lm_breaker_owns_lease operation from the lease that's already set on the
>> > inode, but we don't know who that breaking operation is coming from.
>> 
>> That is a perfectly valid criticism and one that, I think, applies
>> equally to your original code.
>> 
>>  +static bool nfsd_breaker_owns_lease(void *who, struct file_lock *fl)
>>  +{
>>  +	struct svc_rqst *rqst = who;
>> 
>> How does nfsd know that 'who' is an svc_rqst??
>
> Only nfsd fills in the "who" field of deleg_break_ctl.  But non-nfsd
> users do need to pass a non-NULL delegated_inode.

Yes, of course...

So having been wrong about this code twice, I'm starting to get a feel
for what it does and why.  I still wonder if there might be a better
approach though.

You are changing the interface to pass a magic cookie with the meaning
"Don't bother breaking a delegation which matches this magic cookie".
Would it not be better to pass a delegation, and say "Don't bother
breaking this delegation".  And if it were a WRITE delegation, that
could be optimised as "don't bother breaking any delegation, I have a
write delegation so I have exclusive access".

Whenever we call a vfs_* function that will need to break delegations we
have already done the lookup and have the dentry and inode, so finding a
delegation shouldn't be prohibitive.

nfsd would need to find that delegation, prevent further delegations
being handed out, and check that there aren't already conflicting
delegations.  If there are conflicts, recall them.  Once there are no
conflicting delegations, make the vfs_ request.
One downside of this is that nfsd delegations would be recalled before
any others, rather than doing them all in parallel.  This could be
addressed by calling try_break_deleg() when recalling the nfsd
delegations.

This approach seems to be half-way between your original attempt that
you described, which is racy, and the attempt you posted which adds the
callback that I don't particularly like.

???

Thanks,
NeilBrown

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-08-31 23:27               ` NeilBrown
@ 2017-09-01 16:18                 ` J. Bruce Fields
  2017-09-04  4:52                   ` NeilBrown
  0 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-09-01 16:18 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Fri, Sep 01, 2017 at 09:27:54AM +1000, NeilBrown wrote:
> On Thu, Aug 31 2017, J. Bruce Fields wrote:
> > Only nfsd fills in the "who" field of deleg_break_ctl.  But non-nfsd
> > users do need to pass a non-NULL delegated_inode.
> 
> Yes, of course...

Just to be clear, I've taken your suggestion to pass the nfs4_client
instead of the request so we only have to do a comparison in the callback.
Among other things that should be a safe approach if we ever have
non-nfs breakers that want to pass a "who".  Patch appended below if
you're curious.
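
As a rough userspace illustration of that comparison-only callback (a
sketch, not the nfsd code: struct client, struct delegation and
breaker_owns_lease() are hypothetical stand-ins for nfs4_client,
nfs4_delegation and the lm_breaker_owns_lease operation), the breaker's
identity is treated as an opaque cookie that is compared but never
dereferenced:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nfs4_client. */
struct client {
	int id;
};

/* Hypothetical stand-in for struct nfs4_delegation on a per-file list. */
struct delegation {
	const struct client *owner;	/* models dl->dl_stid.sc_client */
	const struct delegation *next;
};

/*
 * Return true only if every delegation on the list is held by 'who'.
 * 'who' is only compared against the stored owners, so a cookie of
 * some unrelated type can never be misinterpreted; it simply fails
 * to match.  An empty list yields true, as in the loop it models.
 */
bool breaker_owns_lease(const void *who, const struct delegation *head)
{
	const struct delegation *dl;

	for (dl = head; dl; dl = dl->next)
		if (dl->owner != who)
			return false;
	return true;
}
```

Because the cookie is never dereferenced, a breaker outside nfsd that
passed some other pointer as "who" would at worst get a spurious
"does not own" answer rather than a bad memory access.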

> So having been wrong about this code twice, I'm starting to get a feel
> for what it does and why.  I still wonder if there might be a better
> approach though.
> 
> You are changing the interface to pass a magic cookie with the meaning
> "Don't bother breaking a delegation which matches this magic cookie".
> Would it not be better to pass a delegation, and say "Don't bother
> breaking this delegation".  And if it were a WRITE delegation, that
> could be optimised as "don't bother breaking any delegation, I have a
> write delegation so I have exclusive access".
> 
> Whenever we call a vfs_* function that will need to break delegations we
> have already done the lookup and have the dentry and inode, so finding a
> delegation shouldn't be prohibitive.
> 
> nfsd would need to find that delegation, prevent further delegations
> being handed out, and check that there aren't already conflicting
> delegations.  If there are conflicts, recall them.  Once there are no
> conflicting delegations, make the vfs_ request.

The way that we currently serialize setting, unsetting, and breaking
delegations is by locks on the delegated inodes which aren't taken till
deeper in the vfs code.

I guess you're suggesting adding a second mechanism to prevent
delegations being given out on the inode.  We could add an atomic
counter taken by each nfsd breaker while it's in progress.  Hrm.

--b.

> One downside of this is that nfsd delegations would be recalled before
> any others, rather than doing them all in parallel.  This could be
> addressed by calling try_break_deleg() when recalling the nfsd
> delegations.
> 
> This approach seems to be half-way between your original attempt that
> you described, which is racy, and the attempt you posted which adds the
> callback that I don't particularly like.

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fb15efcc4e08..b50a7492f47f 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3825,7 +3825,7 @@ nfsd_break_deleg_cb(struct file_lock *fl)
 	return ret;
 }
 
-static struct nfs4_client *nfsd4_client_from_rqst(struct svc_rqst *rqst)
+void *nfsd4_client_from_rqst(struct svc_rqst *rqst)
 {
 	struct nfsd4_compoundres *resp;
 
@@ -3843,19 +3843,14 @@ static struct nfs4_client *nfsd4_client_from_rqst(struct svc_rqst *rqst)
 
 static bool nfsd_breaker_owns_lease(void *who, struct file_lock *fl)
 {
-	struct svc_rqst *rqst = who;
 	struct nfs4_client *cl;
 	struct nfs4_delegation *dl;
 	struct nfs4_file *fi = (struct nfs4_file *)fl->fl_owner;
 	bool ret = true;
 
-	cl = nfsd4_client_from_rqst(rqst);
-	if (!cl)
-		return false;
-
 	spin_lock(&fi->fi_lock);
 	list_for_each_entry(dl, &fi->fi_delegations, dl_perfile) {
-		if (dl->dl_stid.sc_client != cl) {
+		if (dl->dl_stid.sc_client != who) {
 			ret = false;
 			break;
 		}
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index b9c538ab7a59..f7819ce6c817 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -125,6 +125,7 @@ void nfs4_reset_lease(time_t leasetime);
 int nfs4_reset_recoverydir(char *recdir);
 char * nfs4_recoverydir(void);
 bool nfsd4_spo_must_allow(struct svc_rqst *rqstp);
+void *nfsd4_client_from_rqst(struct svc_rqst *rqst);
 #else
 static inline int nfsd4_init_slabs(void) { return 0; }
 static inline void nfsd4_free_slabs(void) { }
@@ -139,6 +140,7 @@ static inline bool nfsd4_spo_must_allow(struct svc_rqst *rqstp)
 {
 	return false;
 }
+static inline void *nfsd4_client_from_rqst(struct svc_rqst *rqst) { return NULL; }
 #endif
 
 /*
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index fe62bc744143..fc9b9ad1d444 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -395,8 +395,8 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 	bool		get_write_count;
 	bool		size_change = (iap->ia_valid & ATTR_SIZE);
 	struct deleg_break_ctl deleg_break_ctl = {
-			.delegated_inode = DELEG_NO_WAIT,
-			.who = rqstp
+		.delegated_inode = DELEG_NO_WAIT,
+		.who = nfsd4_client_from_rqst(rqstp),
 	};
 
 	if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
@@ -1559,7 +1559,7 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
 	int		host_err;
 	struct deleg_break_ctl deleg_break_ctl = {
 		.delegated_inode = DELEG_NO_WAIT,
-		.who = rqstp
+		.who = nfsd4_client_from_rqst(rqstp),
 	};
 
 	err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_CREATE);
@@ -1636,7 +1636,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 	int		host_err;
 	struct deleg_break_ctl deleg_break_ctl = {
 		.delegated_inode = DELEG_NO_WAIT,
-		.who = rqstp
+		.who = nfsd4_client_from_rqst(rqstp),
 	};
 
 	err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
@@ -1736,7 +1736,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 	int		host_err;
 	struct deleg_break_ctl deleg_break_ctl = {
 		.delegated_inode = DELEG_NO_WAIT,
-		.who = rqstp
+		.who = nfsd4_client_from_rqst(rqstp),
 	};
 
 	err = nfserr_acces;

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-09-01 16:18                 ` J. Bruce Fields
@ 2017-09-04  4:52                   ` NeilBrown
  2017-09-05 19:56                     ` J. Bruce Fields
  0 siblings, 1 reply; 35+ messages in thread
From: NeilBrown @ 2017-09-04  4:52 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Fri, Sep 01 2017, J. Bruce Fields wrote:

>> 
>> nfsd would need to find that delegation, prevent further delegations
>> being handed out, and check that there aren't already conflicting
>> delegations.  If there are conflicts, recall them.  Once there are no
>> conflicting delegations, make the vfs_ request.
>
> The way that we currently serialize setting, unsetting, and breaking
> delegations is by locks on the delegated inodes which aren't taken till
> deeper in the vfs code.

Do we?
I can see nfs4_set_delegation adding a new delegation for a new client
without entering the vfs at all if there is already a lease held.
If there isn't a lease already, vfs_setlease() is called, which does
its own internal locking of course.  Much the same applies to unsetting
delegations.
Breaking delegations involves nfsd_break_deleg_cb() which has a comment
that it is called with i_lock held.... that seems to be used to
be sure that it is safe to take a reference to the delegation state id.
Is that the only dependency on the vfs locking, or did I miss something?

>
> I guess you're suggesting adding a second mechanism to prevent
> delegations being given out on the inode.  We could add an atomic
> counter taken by each nfsd breaker while it's in progress.  Hrm.

Something like that.
We would also need to be able to look up an nfs4_file by inode (why
*are* they hashed by file handle??) and add some wait queue somewhere
so the breaker could wait for a delegation to be returned.

My big-picture point is that any complexity created by NFSD's choice to
merge delegations to multiple clients into a single vfs-level delegation
should be handled by NFSD, and not imposed on the VFS.
It certainly makes sense for the VFS to understand that certain
operations are being performed by an fl_owner_t, and to allow
delegations to that owner to remain.  It doesn't make as much sense for
the VFS to understand that there is a finer granularity of ownership
than the one that it already supports.
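
To make the owner-based rule concrete, here is a simplified userspace
model of the conflict check (a sketch under assumptions: owner_t,
struct lease and lease_conflicts() are illustrative names, and the
read/write rule is a reduced version of what the lease code does):

```c
#include <stdbool.h>
#include <stddef.h>

/* Models fl_owner_t: an opaque identity that is only compared. */
typedef const void *owner_t;

enum lease_type { LEASE_READ, LEASE_WRITE };

struct lease {
	owner_t owner;
	enum lease_type type;
};

/*
 * Decide whether an operation by 'breaker_owner' has to break 'lease'.
 * A NULL breaker owner means "no identity given" and never matches,
 * so ordinary callers keep today's behaviour.
 */
bool lease_conflicts(const struct lease *lease, owner_t breaker_owner,
		     bool breaker_writes)
{
	/* An owner never conflicts with its own lease. */
	if (breaker_owner && breaker_owner == lease->owner)
		return false;
	/* Otherwise a conflict exists as soon as either side is a writer. */
	return lease->type == LEASE_WRITE || breaker_writes;
}
```

With a single owner token shared by all nfsd delegations, the VFS only
ever sees "held by nfsd" versus "held by someone else"; telling one
NFS client from another stays inside nfsd.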

Thanks,
NeilBrown

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-09-04  4:52                   ` NeilBrown
@ 2017-09-05 19:56                     ` J. Bruce Fields
  2017-09-05 21:35                       ` NeilBrown
  0 siblings, 1 reply; 35+ messages in thread
From: J. Bruce Fields @ 2017-09-05 19:56 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Mon, Sep 04, 2017 at 02:52:43PM +1000, NeilBrown wrote:
> On Fri, Sep 01 2017, J. Bruce Fields wrote:
> 
> >> 
> >> nfsd would need to find that delegation, prevent further delegations
> >> being handed out, and check that there aren't already conflicting
> >> delegations.  If there are conflicts, recall them.  Once there are no
> >> conflicting delegations, make the vfs_ request.
> >
> > The way that we currently serialize setting, unsetting, and breaking
> > delegations is by locks on the delegated inodes which aren't taken till
> > deeper in the vfs code.
> 
> Do we?
> I can see nfs4_set_delegation adding a new delegation for a new client
> without entering the vfs at all if there is already a lease held.

By "delegations", I meant locks of type FL_DELEG.  But even then I was
wrong, apologies.

There is an inode_trylock in generic_add_lease that will prevent any new
delegations from being given while the inode's locked.
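
The effect of that trylock can be modelled in userspace roughly as
follows (a sketch only: struct inode_model and the *_model helpers are
hypothetical, and a C11 atomic_flag stands in for the inode lock):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of an inode: a lock plus a delegation count. */
struct inode_model {
	atomic_flag locked;	/* stands in for the inode lock */
	int nr_delegations;
};

/* Try to take the lock without waiting; true on success. */
bool inode_trylock_model(struct inode_model *ino)
{
	return !atomic_flag_test_and_set(&ino->locked);
}

void inode_unlock_model(struct inode_model *ino)
{
	atomic_flag_clear(&ino->locked);
}

/*
 * Grant a delegation only if the lock can be taken immediately.
 * Returns 0 on success and -1 if the inode is busy, so an operation
 * that already holds the lock excludes new delegations for as long
 * as it runs.
 */
int add_delegation_model(struct inode_model *ino)
{
	if (!inode_trylock_model(ino))
		return -1;
	ino->nr_delegations++;
	inode_unlock_model(ino);
	return 0;
}
```

The point of the model is only the ordering: whoever holds the lock
across the break and the operation itself cannot race with a new
delegation being handed out.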

> If there isn't a lease already, vfs_setlease() is called, which does
> its own internal locking of course.  Much the same applies to unsetting
> delegations.
> Breaking delegations involves nfsd_break_deleg_cb() which has a comment
> that it is called with i_lock held.... that seems to be used to
> be sure that it is safe to take a reference to the delegation state id.
> Is that the only dependency on the vfs locking, or did I miss something?
> 
> >
> > I guess you're suggesting adding a second mechanism to prevent
> > delegations being given out on the inode.  We could add an atomic
> > counter taken by each nfsd breaker while it's in progress.  Hrm.
> 
> Something like that.
> We would also need to be able to look up an nfs4_file by inode (why
> *are* they hashed by file handle??)

Grepping the logs....  That was ca9432178378 "nfsd: Use the filehandle
to look up the struct nfs4_file instead of inode" which doesn't give a
full justification.  Later commits suggest it might be about keeping
nfsv4 state in many-to-one filehandle->inode cases (spec requirement, I
believe) and preventing the nfs4_file from pinning the inode (not seeing
immediately why that was an issue).

Anyway, I can't think of a reason why hashing the filehandle's a
problem.

> and add some wait queue somewhere
> so the breaker could wait for a delegation to be returned.

In the nfsd case we're just returning to the client immediately, so
that's not really necessary, though maybe it could be useful.

> My big-picture point is that any complexity created by NFSD's choice to
> merge delegations to multiple clients into a single vfs-level delegation
> should be handled by NFSD, and not imposed on the VFS.
> It certainly makes sense for the VFS to understand that certain
> operations are being performed by an fl_owner_t, and to allow
> delegations to that owner to remain.  It doesn't make as much sense for
> the VFS to understand that there is a finer granularity of ownership
> than the one that it already supports.

Fair enough, I'll think about that.

--b.

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-09-05 19:56                     ` J. Bruce Fields
@ 2017-09-05 21:35                       ` NeilBrown
  2017-09-06 16:03                         ` J. Bruce Fields
  0 siblings, 1 reply; 35+ messages in thread
From: NeilBrown @ 2017-09-05 21:35 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Tue, Sep 05 2017, J. Bruce Fields wrote:

> On Mon, Sep 04, 2017 at 02:52:43PM +1000, NeilBrown wrote:
>> On Fri, Sep 01 2017, J. Bruce Fields wrote:
>> 
>> >> 
>> >> nfsd would need to find that delegation, prevent further delegations
>> >> being handed out, and check that there aren't already conflicting
>> >> delegations.  If there are conflicts, recall them.  Once there are no
>> >> conflicting delegations, make the vfs_ request.
>> >
>> > The way that we currently serialize setting, unsetting, and breaking
>> > delegations is by locks on the delegated inodes which aren't taken till
>> > deeper in the vfs code.
>> 
>> Do we?
>> I can see nfs4_set_delegation adding a new delegation for a new client
>> without entering the vfs at all if there is already a lease held.
>
> By "delegations", I meant locks of type FL_DELEG.  But even then I was
> wrong, apologies.
>
> There is an inode_trylock in generic_add_lease that will prevent any new
> delegations from being given while the inode's locked.
>
>> If there isn't a lease already, vfs_setlease() is called, which does
>> its own internal locking of course.  Much the same applies to unsetting
>> delegations.
>> Breaking delegations involves nfsd_break_deleg_cb() which has a comment
>> that it is called with i_lock held.... that seems to be used to
>> be sure that it is safe to take a reference to the delegation state id.
>> Is that the only dependency on the vfs locking, or did I miss something?
>> 
>> >
>> > I guess you're suggesting adding a second mechanism to prevent
>> > delegations being given out on the inode.  We could add an atomic
>> > counter taken by each nfsd breaker while it's in progress.  Hrm.
>> 
>> Something like that.
>> We would also need to be able to look up an nfs4_file by inode (why
>> *are* they hashed by file handle??)
>
> Grepping the logs....  That was ca9432178378 "nfsd: Use the filehandle
> to look up the struct nfs4_file instead of inode" which doesn't give a
> full justification.  Later commits suggest it might be about keeping
> nfsv4 state in many-to-one filehandle->inode cases (spec requirement, I
> believe) and preventing the nfs4_file from pinning the inode (not seeing
> immediately why that was an issue).
>
> Anyway, I can't think of a reason why hashing the filehandle's a
> problem.

Thanks for the background.  I didn't see it as a problem exactly,
though I did wonder about different filehandles mapping to the same
nfs4_file (unlikely but possible).  You say that is required and I can
see how that might be.

My perspective was more that I do want to perform a lookup by inode.
When an UNLINK arrives we lookup the dentry and so know the inode.  Then
we want to see if the client holds a delegation.  So we want to find the
nfs4_file given the dentry/inode.  We could, of course, use
export_encode_fh, but that seems a bit round-about.
We could add a second index, but would need to allow that there could be
multiple nfs4_files for a given inode.

>
>> and add some wait queue somewhere
>> so the breaker could wait for a delegation to be returned.
>
> In the nfsd case we're just returning to the client immediately, so
> that's not really necessary, though maybe it could be useful.

Ah yes, so we do.  I inverted the logic in my mind.  That makes it easier.

>
>> My big-picture point is that any complexity created by NFSD's choice to
>> merge delegations to multiple clients into a single vfs-level delegation
>> should be handled by NFSD, and not imposed on the VFS.
>> It certainly makes sense for the VFS to understand that certain
>> operations are being performed by an fl_owner_t, and to allow
>> delegations to that owner to remain.  It doesn't make as much sense for
>> the VFS to understand that there is a finer granularity of ownership
>> than the one that it already supports.
>
> Fair enough, I'll think about that.

Thanks.  Below is a patch that does compile but is probably wrong in
various ways and definitely needs cleanliness work at least.  I provide
it just to be more concrete about my thinking.

NeilBrown

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index d2fb9c8ed205..c823a244f8b2 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -216,7 +216,7 @@ static int handle_create(const char *nodename, umode_t mode, kuid_t uid,
 		newattrs.ia_gid = gid;
 		newattrs.ia_valid = ATTR_MODE|ATTR_UID|ATTR_GID;
 		inode_lock(d_inode(dentry));
-		notify_change(dentry, &newattrs, NULL);
+		notify_change(dentry, &newattrs, NULL, NULL);
 		inode_unlock(d_inode(dentry));
 
 		/* mark as kernel-created inode */
@@ -323,9 +323,9 @@ static int handle_remove(const char *nodename, struct device *dev)
 			newattrs.ia_valid =
 				ATTR_UID|ATTR_GID|ATTR_MODE;
 			inode_lock(d_inode(dentry));
-			notify_change(dentry, &newattrs, NULL);
+			notify_change(dentry, &newattrs, NULL, NULL);
 			inode_unlock(d_inode(dentry));
-			err = vfs_unlink(d_inode(parent.dentry), dentry, NULL);
+			err = vfs_unlink(d_inode(parent.dentry), dentry, NULL, NULL);
 			if (!err || err == -ENOENT)
 				deleted = 1;
 		}
diff --git a/fs/attr.c b/fs/attr.c
index 135304146120..d94e516070af 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -185,6 +185,7 @@ EXPORT_SYMBOL(setattr_copy);
  * notify_change - modify attributes of a filesytem object
  * @dentry:	object affected
  * @iattr:	new attributes
+ * @owner:      allow delegations to this owner to remain
  * @delegated_inode: returns inode, if the inode is delegated
  *
  * The caller must hold the i_mutex on the affected object.
@@ -201,7 +202,7 @@ EXPORT_SYMBOL(setattr_copy);
  * the file open for write, as there can be no conflicting delegation in
  * that case.
  */
-int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **delegated_inode)
+int notify_change(struct dentry * dentry, struct iattr * attr, fl_owner_t owner, struct inode **delegated_inode)
 {
 	struct inode *inode = dentry->d_inode;
 	umode_t mode = inode->i_mode;
@@ -304,7 +305,7 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de
 	error = security_inode_setattr(dentry, attr);
 	if (error)
 		return error;
-	error = try_break_deleg(inode, delegated_inode);
+	error = try_break_deleg(inode, owner, delegated_inode);
 	if (error)
 		return error;
 
diff --git a/fs/inode.c b/fs/inode.c
index 50370599e371..c28fbb91b863 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1788,7 +1788,7 @@ static int __remove_privs(struct dentry *dentry, int kill)
 	 * Note we call this on write, so notify_change will not
 	 * encounter any conflicting delegations:
 	 */
-	return notify_change(dentry, &newattrs, NULL);
+	return notify_change(dentry, &newattrs, NULL, NULL);
 }
 
 /*
diff --git a/fs/locks.c b/fs/locks.c
index afefeb4ad6de..231d93bfbdc1 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1408,6 +1408,8 @@ static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
 		return false;
 	if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE))
 		return false;
+	if (breaker->fl_owner && breaker->fl_owner == lease->fl_owner)
+		return false;
 	return locks_conflict(breaker, lease);
 }
 
@@ -1429,6 +1431,7 @@ any_leases_conflict(struct inode *inode, struct file_lock *breaker)
 /**
  *	__break_lease	-	revoke all outstanding leases on file
  *	@inode: the inode of the file to return
+ *      @owner: if non-NULL, ignore leases held by this owner.
  *	@mode: O_RDONLY: break only write leases; O_WRONLY or O_RDWR:
  *	    break all leases
  *	@type: FL_LEASE: break leases and delegations; FL_DELEG: break
@@ -1439,7 +1442,7 @@ any_leases_conflict(struct inode *inode, struct file_lock *breaker)
  *	a call to open() or truncate().  This function can sleep unless you
  *	specified %O_NONBLOCK to your open().
  */
-int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
+int __break_lease(struct inode *inode, fl_owner_t owner, unsigned int mode, unsigned int type)
 {
 	int error = 0;
 	struct file_lock_context *ctx;
@@ -1452,6 +1455,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 	if (IS_ERR(new_fl))
 		return PTR_ERR(new_fl);
 	new_fl->fl_flags = type;
+	new_fl->fl_owner = owner;
 
 	/* typically we will check that ctx is non-NULL before calling */
 	ctx = smp_load_acquire(&inode->i_flctx);
diff --git a/fs/namei.c b/fs/namei.c
index ddb6a7c2b3d4..a1bf2ccdabb5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3941,6 +3941,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
  * vfs_unlink - unlink a filesystem object
  * @dir:	parent directory
  * @dentry:	victim
+ * @owner:	allow delegation to this owner to remain.
  * @delegated_inode: returns victim inode, if the inode is delegated.
  *
  * The caller must hold dir->i_mutex.
@@ -3955,7 +3956,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
  * be appropriate for callers that expect the underlying filesystem not
  * to be NFS exported.
  */
-int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegated_inode)
+int vfs_unlink(struct inode *dir, struct dentry *dentry, fl_owner_t owner, struct inode **delegated_inode)
 {
 	struct inode *target = dentry->d_inode;
 	int error = may_delete(dir, dentry, 0);
@@ -3972,7 +3973,7 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
 	else {
 		error = security_inode_unlink(dir, dentry);
 		if (!error) {
-			error = try_break_deleg(target, delegated_inode);
+			error = try_break_deleg(target, owner, delegated_inode);
 			if (error)
 				goto out;
 			error = dir->i_op->unlink(dir, dentry);
@@ -4040,7 +4041,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 		error = security_path_unlink(&path, dentry);
 		if (error)
 			goto exit2;
-		error = vfs_unlink(path.dentry->d_inode, dentry, &delegated_inode);
+		error = vfs_unlink(path.dentry->d_inode, dentry, NULL, &delegated_inode);
 exit2:
 		dput(dentry);
 	}
@@ -4049,7 +4050,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 		iput(inode);	/* truncate the inode here */
 	inode = NULL;
 	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
+		error = break_deleg_wait(NULL, &delegated_inode);
 		if (!error)
 			goto retry_deleg;
 	}
@@ -4152,6 +4153,7 @@ SYSCALL_DEFINE2(symlink, const char __user *, oldname, const char __user *, newn
  * @old_dentry:	object to be linked
  * @dir:	new parent
  * @new_dentry:	where to create the new link
+ * @owner:	allow delegation to this owner to remain
  * @delegated_inode: returns inode needing a delegation break
  *
  * The caller must hold dir->i_mutex
@@ -4166,7 +4168,8 @@ SYSCALL_DEFINE2(symlink, const char __user *, oldname, const char __user *, newn
  * be appropriate for callers that expect the underlying filesystem not
  * to be NFS exported.
  */
-int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry, struct inode **delegated_inode)
+int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry,
+	     fl_owner_t owner, struct inode **delegated_inode)
 {
 	struct inode *inode = old_dentry->d_inode;
 	unsigned max_links = dir->i_sb->s_max_links;
@@ -4210,7 +4213,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 	else if (max_links && inode->i_nlink >= max_links)
 		error = -EMLINK;
 	else {
-		error = try_break_deleg(inode, delegated_inode);
+		error = try_break_deleg(inode, owner, delegated_inode);
 		if (!error)
 			error = dir->i_op->link(old_dentry, dir, new_dentry);
 	}
@@ -4280,11 +4283,11 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
 	error = security_path_link(old_path.dentry, &new_path, new_dentry);
 	if (error)
 		goto out_dput;
-	error = vfs_link(old_path.dentry, new_path.dentry->d_inode, new_dentry, &delegated_inode);
+	error = vfs_link(old_path.dentry, new_path.dentry->d_inode, new_dentry, NULL, &delegated_inode);
 out_dput:
 	done_path_create(&new_path, new_dentry);
 	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
+		error = break_deleg_wait(NULL, &delegated_inode);
 		if (!error) {
 			path_put(&old_path);
 			goto retry;
@@ -4312,6 +4315,7 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
  * @old_dentry:	source
  * @new_dir:	parent of destination
  * @new_dentry:	destination
+ * @owner:	allow delegation to this owner to remain
  * @delegated_inode: returns an inode needing a delegation break
  * @flags:	rename flags
  *
@@ -4358,6 +4362,7 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
  */
 int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	       struct inode *new_dir, struct dentry *new_dentry,
+	       fl_owner_t owner,
 	       struct inode **delegated_inode, unsigned int flags)
 {
 	int error;
@@ -4435,12 +4440,12 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (is_dir && !(flags & RENAME_EXCHANGE) && target)
 		shrink_dcache_parent(new_dentry);
 	if (!is_dir) {
-		error = try_break_deleg(source, delegated_inode);
+		error = try_break_deleg(source, owner, delegated_inode);
 		if (error)
 			goto out;
 	}
 	if (target && !new_is_dir) {
-		error = try_break_deleg(target, delegated_inode);
+		error = try_break_deleg(target, owner, delegated_inode);
 		if (error)
 			goto out;
 	}
@@ -4594,7 +4599,7 @@ SYSCALL_DEFINE5(renameat2, int, olddfd, const char __user *, oldname,
 		goto exit5;
 	error = vfs_rename(old_path.dentry->d_inode, old_dentry,
 			   new_path.dentry->d_inode, new_dentry,
-			   &delegated_inode, flags);
+			   NULL, &delegated_inode, flags);
 exit5:
 	dput(new_dentry);
 exit4:
@@ -4602,7 +4607,7 @@ SYSCALL_DEFINE5(renameat2, int, olddfd, const char __user *, oldname,
 exit3:
 	unlock_rename(new_path.dentry, old_path.dentry);
 	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
+		error = break_deleg_wait(NULL, &delegated_inode);
 		if (!error)
 			goto retry_deleg;
 	}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0c04f81aa63b..e713484b93b3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3654,6 +3654,27 @@ find_file_locked(struct knfsd_fh *fh, unsigned int hashval)
 	return NULL;
 }
 
+static struct nfs4_file *
+find_deleg_file_by_inode(struct inode *ino)
+{
+	int i;
+	struct nfs4_file *fp;
+
+	if (!ino)
+		return NULL;
+
+	rcu_read_lock();
+	for (i = 0; i < FILE_HASH_SIZE; i++)
+		hlist_for_each_entry_rcu(fp, &file_hashtbl[i], fi_hash)
+			if (fp->fi_deleg_file && file_inode(fp->fi_deleg_file) == ino &&
+			    atomic_inc_not_zero(&fp->fi_ref)) {
+				rcu_read_unlock();
+				return fp;
+			}
+	rcu_read_unlock();
+	return NULL;
+}
+
 struct nfs4_file *
 find_file(struct knfsd_fh *fh)
 {
@@ -3825,6 +3846,49 @@ nfsd_break_deleg_cb(struct file_lock *fl)
 	return ret;
 }
 
+static struct nfs4_client *nfsd4_client_from_rqst(struct svc_rqst *rqst)
+{
+	struct nfsd4_compoundres *resp;
+
+	/*
+	 * In case it's possible we could be called from NLM or ACL
+	 * code?:
+	 */
+	if (rqst->rq_prog != NFS_PROGRAM)
+		return NULL;
+	if (rqst->rq_vers != 4)
+		return NULL;
+	resp = rqst->rq_resp;
+	return resp->cstate.clp;
+}
+
+int nfsd_conflicting_leases(struct dentry *dentry, struct svc_rqst *rqstp)
+{
+	struct nfs4_client *cl;
+	struct nfs4_delegation *dl;
+	struct nfs4_file *fi;
+	bool conflict;
+
+	cl = nfsd4_client_from_rqst(rqstp);
+	if (!cl)
+		return 0;
+	fi = find_deleg_file_by_inode(d_inode(dentry));
+	if (!fi)
+		return 0;
+
+	spin_lock(&fi->fi_lock);
+	conflict = false;
+	list_for_each_entry(dl, &fi->fi_delegations, dl_perfile) {
+		if (dl->dl_stid.sc_client != cl) {
+			fi->fi_had_conflict = true;
+			nfsd_break_one_deleg(dl);
+			conflict = true;
+		}
+	}
+	spin_unlock(&fi->fi_lock);
+	return conflict ? -EWOULDBLOCK : 0;
+}
+
 static int
 nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
 		     struct list_head *dispose)
@@ -4137,6 +4201,8 @@ static bool nfsd4_cb_channel_good(struct nfs4_client *clp)
 	return clp->cl_minorversion && clp->cl_cb_state == NFSD4_CB_UNKNOWN;
 }
 
+char nfsd_deleg_owner[1];
+
 static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)
 {
 	struct file_lock *fl;
@@ -4148,7 +4214,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)
 	fl->fl_flags = FL_DELEG;
 	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->fl_end = OFFSET_MAX;
-	fl->fl_owner = (fl_owner_t)fp;
+	fl->fl_owner = (fl_owner_t)nfsd_deleg_owner;
 	fl->fl_pid = current->tgid;
 	return fl;
 }
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index bc69d40c4e8b..c091633fe441 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -75,6 +75,9 @@ struct raparm_hbucket {
 #define RAPARM_HASH_MASK	(RAPARM_HASH_SIZE-1)
 static struct raparm_hbucket	raparm_hash[RAPARM_HASH_SIZE];
 
+int nfsd_conflicting_leases(struct dentry *dentry, struct svc_rqst *rqstp);
+extern char nfsd_deleg_owner[1];
+
 /* 
  * Called from nfsd_lookup and encode_dirent. Check if we have crossed 
  * a mount point.
@@ -455,7 +458,8 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 			.ia_size	= iap->ia_size,
 		};
 
-		host_err = notify_change(dentry, &size_attr, NULL);
+		host_err = nfsd_conflicting_leases(dentry, rqstp);
+		host_err = host_err ?: notify_change(dentry, &size_attr, nfsd_deleg_owner, NULL);
 		if (host_err)
 			goto out_unlock;
 		iap->ia_valid &= ~ATTR_SIZE;
@@ -470,7 +474,8 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 	}
 
 	iap->ia_valid |= ATTR_CTIME;
-	host_err = notify_change(dentry, iap, NULL);
+	host_err = nfsd_conflicting_leases(dentry, rqstp);
+	host_err = host_err ?: notify_change(dentry, iap, nfsd_deleg_owner, NULL);
 
 out_unlock:
 	fh_unlock(fhp);
@@ -1590,7 +1595,8 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
 	err = nfserr_noent;
 	if (d_really_is_negative(dold))
 		goto out_dput;
-	host_err = vfs_link(dold, dirp, dnew, NULL);
+	host_err = nfsd_conflicting_leases(dold, rqstp);
+	host_err = host_err ?: vfs_link(dold, dirp, dnew, nfsd_deleg_owner, NULL);
 	if (!host_err) {
 		err = nfserrno(commit_metadata(ffhp));
 		if (!err)
@@ -1683,7 +1689,9 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 	if (ffhp->fh_export->ex_path.dentry != tfhp->fh_export->ex_path.dentry)
 		goto out_dput_new;
 
-	host_err = vfs_rename(fdir, odentry, tdir, ndentry, NULL, 0);
+	host_err = nfsd_conflicting_leases(odentry, rqstp);
+	host_err |= nfsd_conflicting_leases(ndentry, rqstp);
+	host_err = host_err ?: vfs_rename(fdir, odentry, tdir, ndentry, nfsd_deleg_owner, NULL, 0);
 	if (!host_err) {
 		host_err = commit_metadata(tfhp);
 		if (!host_err)
@@ -1752,9 +1760,10 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 	if (!type)
 		type = d_inode(rdentry)->i_mode & S_IFMT;
 
-	if (type != S_IFDIR)
-		host_err = vfs_unlink(dirp, rdentry, NULL);
-	else
+	if (type != S_IFDIR) {
+		host_err = nfsd_conflicting_leases(rdentry, rqstp);
+		host_err = host_err ?: vfs_unlink(dirp, rdentry, nfsd_deleg_owner, NULL);
+	} else
 		host_err = vfs_rmdir(dirp, rdentry);
 	if (!host_err)
 		host_err = commit_metadata(fhp);
diff --git a/fs/open.c b/fs/open.c
index 35bb784763a4..fad27de55ec0 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -60,7 +60,7 @@ int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs,
 
 	inode_lock(dentry->d_inode);
 	/* Note any delegations or leases have already been broken: */
-	ret = notify_change(dentry, &newattrs, NULL);
+	ret = notify_change(dentry, &newattrs, NULL, NULL);
 	inode_unlock(dentry->d_inode);
 	return ret;
 }
@@ -529,11 +529,11 @@ static int chmod_common(const struct path *path, umode_t mode)
 		goto out_unlock;
 	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-	error = notify_change(path->dentry, &newattrs, &delegated_inode);
+	error = notify_change(path->dentry, &newattrs, NULL, &delegated_inode);
 out_unlock:
 	inode_unlock(inode);
 	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
+		error = break_deleg_wait(NULL, &delegated_inode);
 		if (!error)
 			goto retry_deleg;
 	}
@@ -609,10 +609,10 @@ static int chown_common(const struct path *path, uid_t user, gid_t group)
 	inode_lock(inode);
 	error = security_path_chown(path, uid, gid);
 	if (!error)
-		error = notify_change(path->dentry, &newattrs, &delegated_inode);
+		error = notify_change(path->dentry, &newattrs, NULL, &delegated_inode);
 	inode_unlock(inode);
 	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
+		error = break_deleg_wait(NULL, &delegated_inode);
 		if (!error)
 			goto retry_deleg;
 	}
diff --git a/fs/utimes.c b/fs/utimes.c
index 6571d8c848a0..c7b53d3602ce 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -87,10 +87,10 @@ static int utimes_common(const struct path *path, struct timespec *times)
 	}
 retry_deleg:
 	inode_lock(inode);
-	error = notify_change(path->dentry, &newattrs, &delegated_inode);
+	error = notify_change(path->dentry, &newattrs, NULL, &delegated_inode);
 	inode_unlock(inode);
 	if (delegated_inode) {
-		error = break_deleg_wait(&delegated_inode);
+		error = break_deleg_wait(NULL, &delegated_inode);
 		if (!error)
 			goto retry_deleg;
 	}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cc2e0f5a8fd1..6e434c677e4c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1084,7 +1084,7 @@ extern int vfs_test_lock(struct file *, struct file_lock *);
 extern int vfs_lock_file(struct file *, unsigned int, struct file_lock *, struct file_lock *);
 extern int vfs_cancel_lock(struct file *filp, struct file_lock *fl);
 extern int locks_lock_inode_wait(struct inode *inode, struct file_lock *fl);
-extern int __break_lease(struct inode *inode, unsigned int flags, unsigned int type);
+extern int __break_lease(struct inode *inode, fl_owner_t owner, unsigned int flags, unsigned int type);
 extern void lease_get_mtime(struct inode *, struct timespec *time);
 extern int generic_setlease(struct file *, long, struct file_lock **, void **priv);
 extern int vfs_setlease(struct file *, long, struct file_lock **, void **);
@@ -1195,7 +1195,7 @@ static inline int locks_lock_inode_wait(struct inode *inode, struct file_lock *f
 	return -ENOLCK;
 }
 
-static inline int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
+static inline int __break_lease(struct inode *inode, fl_owner_t owner, unsigned int mode, unsigned int type)
 {
 	return 0;
 }
@@ -1573,10 +1573,10 @@ extern int vfs_create(struct inode *, struct dentry *, umode_t, bool);
 extern int vfs_mkdir(struct inode *, struct dentry *, umode_t);
 extern int vfs_mknod(struct inode *, struct dentry *, umode_t, dev_t);
 extern int vfs_symlink(struct inode *, struct dentry *, const char *);
-extern int vfs_link(struct dentry *, struct inode *, struct dentry *, struct inode **);
+extern int vfs_link(struct dentry *, struct inode *, struct dentry *, fl_owner_t, struct inode **);
 extern int vfs_rmdir(struct inode *, struct dentry *);
-extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
-extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **, unsigned int);
+extern int vfs_unlink(struct inode *, struct dentry *, fl_owner_t, struct inode **);
+extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, fl_owner_t, struct inode **, unsigned int);
 extern int vfs_whiteout(struct inode *, struct dentry *);
 
 extern struct dentry *vfs_tmpfile(struct dentry *dentry, umode_t mode,
@@ -2260,11 +2260,11 @@ static inline int break_lease(struct inode *inode, unsigned int mode)
 	 */
 	smp_mb();
 	if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
-		return __break_lease(inode, mode, FL_LEASE);
+		return __break_lease(inode, NULL, mode, FL_LEASE);
 	return 0;
 }
 
-static inline int break_deleg(struct inode *inode, unsigned int mode)
+static inline int break_deleg(struct inode *inode, fl_owner_t owner, unsigned int mode)
 {
 	/*
 	 * Since this check is lockless, we must ensure that any refcounts
@@ -2274,15 +2274,15 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
 	 */
 	smp_mb();
 	if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
-		return __break_lease(inode, mode, FL_DELEG);
+		return __break_lease(inode, owner, mode, FL_DELEG);
 	return 0;
 }
 
-static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
+static inline int try_break_deleg(struct inode *inode, fl_owner_t owner, struct inode **delegated_inode)
 {
 	int ret;
 
-	ret = break_deleg(inode, O_WRONLY|O_NONBLOCK);
+	ret = break_deleg(inode, owner, O_WRONLY|O_NONBLOCK);
 	if (ret == -EWOULDBLOCK && delegated_inode) {
 		*delegated_inode = inode;
 		ihold(inode);
@@ -2290,11 +2290,11 @@ static inline int try_break_deleg(struct inode *inode, struct inode **delegated_
 	return ret;
 }
 
-static inline int break_deleg_wait(struct inode **delegated_inode)
+static inline int break_deleg_wait(fl_owner_t owner, struct inode **delegated_inode)
 {
 	int ret;
 
-	ret = break_deleg(*delegated_inode, O_WRONLY);
+	ret = break_deleg(*delegated_inode, owner, O_WRONLY);
 	iput(*delegated_inode);
 	*delegated_inode = NULL;
 	return ret;
@@ -2304,7 +2304,7 @@ static inline int break_layout(struct inode *inode, bool wait)
 {
 	smp_mb();
 	if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
-		return __break_lease(inode,
+		return __break_lease(inode, NULL,
 				wait ? O_WRONLY : O_WRONLY | O_NONBLOCK,
 				FL_LAYOUT);
 	return 0;
@@ -2316,17 +2316,17 @@ static inline int break_lease(struct inode *inode, unsigned int mode)
 	return 0;
 }
 
-static inline int break_deleg(struct inode *inode, unsigned int mode)
+static inline int break_deleg(struct inode *inode, fl_owner_t owner, unsigned int mode)
 {
 	return 0;
 }
 
-static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
+static inline int try_break_deleg(struct inode *inode, fl_owner_t owner, struct inode **delegated_inode)
 {
 	return 0;
 }
 
-static inline int break_deleg_wait(struct inode **delegated_inode)
+static inline int break_deleg_wait(fl_owner_t owner, struct inode **delegated_inode)
 {
 	BUG();
 	return 0;
@@ -2643,7 +2643,7 @@ extern void emergency_remount(void);
 #ifdef CONFIG_BLOCK
 extern sector_t bmap(struct inode *, sector_t);
 #endif
-extern int notify_change(struct dentry *, struct iattr *, struct inode **);
+extern int notify_change(struct dentry *, struct iattr *, fl_owner_t, struct inode **);
 extern int inode_permission(struct inode *, int);
 extern int __inode_permission(struct inode *, int);
 extern int generic_permission(struct inode *, int);


* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-09-05 21:35                       ` NeilBrown
@ 2017-09-06 16:03                         ` J. Bruce Fields
  2017-09-07  0:43                           ` NeilBrown
  2018-03-16 14:42                           ` J. Bruce Fields
  0 siblings, 2 replies; 35+ messages in thread
From: J. Bruce Fields @ 2017-09-06 16:03 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust

On Wed, Sep 06, 2017 at 07:35:47AM +1000, NeilBrown wrote:
> On Tue, Sep 05 2017, J. Bruce Fields wrote:
> 
> > On Mon, Sep 04, 2017 at 02:52:43PM +1000, NeilBrown wrote:
> >> and add some wait queue somewhere
> >> so the breaker could wait for a delegation to be returned.
> >
> > In the nfsd case we're just returning to the client immediately, so
> > that's not really necessary, though maybe it could be useful.
> 
> Ah yes, so we do.  I inverted the logic in my mind.  That makes it easier.

(Minor derail: it might be worth waiting briefly before returning
NFS4ERR_DELAY.

It would be easy enough to implement; the hard part would be testing
whether it helped.  I think the initial client retry time is 100ms
(NFS4_POLL_RETRY_MIN), so the wait would have to finish in less than
that often enough to matter.)

> Thanks.  Below is a patch that does compile but is probably wrong in
> various ways and definitely needs cleanliness work at least.  I provide
> it just to be more concrete about my thinking.

Gah, I hate having to patch every notify_change caller.  But maybe I
should get over that, the resulting logic is simpler.  Anyway, stripping
away all those callers:

Right, the advantage is that this makes checking for conflicts simple
and obvious:

> diff --git a/fs/locks.c b/fs/locks.c
> index afefeb4ad6de..231d93bfbdc1 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1408,6 +1408,8 @@ static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
>  		return false;
>  	if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE))
>  		return false;
> +	if (breaker->fl_owner && breaker->fl_owner == lease->fl_owner)
> +		return false;
>  	return locks_conflict(breaker, lease);
>  }
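
To make the effect of that check concrete, here is a minimal userspace
sketch, not the kernel code.  The model_* names and MODEL_FL_* flags are
made up for illustration, the FL_LAYOUT test is left out, and
model_locks_conflict() is only a simplified stand-in for locks_conflict():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's FL_* flags and struct file_lock. */
#define MODEL_FL_DELEG 0x1u
#define MODEL_FL_LEASE 0x2u

struct model_lock {
	unsigned int flags;	/* MODEL_FL_DELEG or MODEL_FL_LEASE */
	bool write;		/* true for a write lease or a write breaker */
	void *owner;		/* opaque owner token; NULL means "no owner" */
};

/* Simplified stand-in for locks_conflict(): conflict if either side writes. */
static bool model_locks_conflict(const struct model_lock *breaker,
				 const struct model_lock *lease)
{
	return breaker->write || lease->write;
}

static bool model_leases_conflict(const struct model_lock *lease,
				  const struct model_lock *breaker)
{
	assert(lease && breaker);
	/* A delegation break does not conflict with a plain lease. */
	if ((breaker->flags & MODEL_FL_DELEG) && (lease->flags & MODEL_FL_LEASE))
		return false;
	/* The added check: a non-NULL owner never conflicts with itself. */
	if (breaker->owner && breaker->owner == lease->owner)
		return false;
	return model_locks_conflict(breaker, lease);
}
```

In this model a breaker carrying the same non-NULL owner as the
delegation does not conflict, while a breaker with a NULL owner still
conflicts with every delegation.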

notify_change, vfs_unlink, etc., all get a new argument:

> + * @owner:	allow delegation to this owner to remain

And, right, we need a way to look up an nfs4_file by inode:

> +static struct nfs4_file *
> +find_deleg_file_by_inode(struct inode *ino)

(ignoring how we do it for now).

>  /* 
>   * Called from nfsd_lookup and encode_dirent. Check if we have crossed 
>   * a mount point.
> @@ -455,7 +458,8 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
>  			.ia_size	= iap->ia_size,
>  		};
>  
> -		host_err = notify_change(dentry, &size_attr, NULL);
> +		host_err = nfsd_conflicting_leases(dentry, rqstp);
> +		host_err = host_err ?: notify_change(dentry, &size_attr, nfsd_deleg_owner, NULL);

And then you recall nfsd delegations and delegations held by
(hypothetical) non-nfsd users separately, OK (also ignoring how).

There are no such users currently, so nfsd could just pass NULL.
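
The two-step shape quoted above (check nfsd's own delegations first, then
call into the VFS) can be modeled in userspace roughly as follows.  This
is only a sketch under assumptions: the model_* names, the integer client
IDs and the array of delegations are all hypothetical, and model_vfs_op()
merely stands in for notify_change() and friends:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one delegation on a file. */
struct model_deleg {
	int client_id;	/* which client holds the delegation */
	bool recalled;	/* set once a recall has been started */
};

/*
 * Step 1: start recalls for delegations held by other clients.  Returns
 * -EWOULDBLOCK if any were found, 0 if the caller conflicts only with
 * itself (or with nothing at all).
 */
static int model_conflicting_leases(struct model_deleg *delegs, size_t n,
				    int caller_id)
{
	bool conflict = false;
	size_t i;

	assert(delegs || n == 0);
	for (i = 0; i < n; i++) {
		if (delegs[i].client_id != caller_id) {
			delegs[i].recalled = true;
			conflict = true;
		}
	}
	return conflict ? -EWOULDBLOCK : 0;
}

/* Step 2 stand-in for the VFS call; counts invocations for inspection. */
static int model_vfs_calls;

static int model_vfs_op(void)
{
	model_vfs_calls++;
	return 0;
}

static int model_setattr(struct model_deleg *delegs, size_t n, int caller_id)
{
	int err = model_conflicting_leases(delegs, n, caller_id);

	/* Portable spelling of the patch's GNU "err ?: op()" idiom. */
	return err ? err : model_vfs_op();
}
```

The point of the chaining is that the VFS operation is attempted only
when the first step found no delegation belonging to another client.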

--b.


* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-09-06 16:03                         ` J. Bruce Fields
@ 2017-09-07  0:43                           ` NeilBrown
  2017-09-08 15:06                             ` J. Bruce Fields
  2018-03-16 14:42                           ` J. Bruce Fields
  1 sibling, 1 reply; 35+ messages in thread
From: NeilBrown @ 2017-09-07  0:43 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust



>> +		host_err = nfsd_conflicting_leases(dentry, rqstp);
>> +		host_err = host_err ?: notify_change(dentry, &size_attr, nfsd_deleg_owner, NULL);
>
> And then you recall nfsd delegations and delegations held by
> (hypothetical) non-nfsd users separately, OK (also ignoring how).
>
> There are no such users currently, so nfsd could just pass NULL.

I don't think so.  If we pass NULL (as the owner), then the VFS will
recall the one nfsd delegation that we want to preserve. ???

NeilBrown


* Re: [PATCH 3/3] nfsd: clients don't need to break their own delegations
  2017-08-28  4:32   ` NeilBrown
@ 2017-09-07 22:01       ` J. Bruce Fields
  2017-09-07 22:01       ` J. Bruce Fields
  1 sibling, 0 replies; 35+ messages in thread
From: J. Bruce Fields @ 2017-09-07 22:01 UTC (permalink / raw)
  To: NeilBrown; +Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, Trond Myklebust

On Mon, Aug 28, 2017 at 02:32:53PM +1000, NeilBrown wrote:
> /* legacy typedef, should eventually be removed */
> typedef void *fl_owner_t;
> 
> 
> Maybe you could do the world a favor and remove fl_owner_t in a
> preliminary patch :-)

Partly scripted, still a bit tedious, but I think it's right.  Honestly
I don't know what the motivation for the comment was, though.  Are there
no documentation or type-checking benefits to having the typedef?

The main annoyance was having this defined as a files_struct pointer,
which Christoph fixed some time ago.
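
On the type-checking half of that question, a small userspace sketch may
help.  It assumes nothing beyond standard C; the model_* names are
invented for illustration and only mirror the shape of
"typedef void *fl_owner_t;":

```c
#include <assert.h>
#include <stddef.h>

typedef void *model_owner_t;	/* mirrors "typedef void *fl_owner_t;" */

/* Two unrelated types that might both be used as lock owners. */
struct model_host { int id; };
struct model_file { int fd; };

static int model_same_owner(model_owner_t a, model_owner_t b)
{
	return a == b;
}

static void model_demo(void)
{
	struct model_host host = { 1 };
	struct model_file file = { 3 };

	/*
	 * Both calls compile without casts: any object pointer converts
	 * implicitly to void *, so the typedef names the parameter's role
	 * but does not restrict what can be passed.
	 */
	assert(model_same_owner(&host, &host));
	assert(!model_same_owner(&host, &file));
}
```

So in this sketch the typedef behaves exactly like a bare void * as far
as the compiler is concerned; whatever benefit it has is documentation.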

--b.

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 73e7d91f03dc..758ca1591b0c 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -864,7 +864,7 @@ struct file_operations {
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*mremap)(struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
-	int (*flush) (struct file *, fl_owner_t id);
+	int (*flush) (struct file *, void *id);
 	int (*release) (struct inode *, struct file *);
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 09f86ebfcc7b..35a185ac0de3 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -1809,7 +1809,7 @@ pfm_syswide_cleanup_other_cpu(pfm_context_t *ctx)
  * When caller is self-monitoring, the context is unloaded.
  */
 static int
-pfm_flush(struct file *filp, fl_owner_t id)
+pfm_flush(struct file *filp, void *id)
 {
 	pfm_context_t *ctx;
 	struct task_struct *task;
diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index ae2f740a82f1..8783335b3b85 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -1720,7 +1720,7 @@ static unsigned int spufs_mfc_poll(struct file *file,poll_table *wait)
 	return mask;
 }
 
-static int spufs_mfc_flush(struct file *file, fl_owner_t id)
+static int spufs_mfc_flush(struct file *file, void *id)
 {
 	struct spu_context *ctx = file->private_data;
 	int ret;
diff --git a/arch/tile/kernel/hardwall.c b/arch/tile/kernel/hardwall.c
index 2fd1694ac1d0..b2cf21d1edb0 100644
--- a/arch/tile/kernel/hardwall.c
+++ b/arch/tile/kernel/hardwall.c
@@ -1030,7 +1030,7 @@ static long hardwall_compat_ioctl(struct file *file,
 #endif
 
 /* The user process closed the file; revoke access to user networks. */
-static int hardwall_flush(struct file *file, fl_owner_t owner)
+static int hardwall_flush(struct file *file, void *owner)
 {
 	struct hardwall_info *info = file->private_data;
 	struct task_struct *task, *tmp;
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index f7665c31feca..02bfce18c912 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -3503,7 +3503,7 @@ static int binder_open(struct inode *nodp, struct file *filp)
 	return 0;
 }
 
-static int binder_flush(struct file *filp, fl_owner_t id)
+static int binder_flush(struct file *filp, void *id)
 {
 	struct binder_proc *proc = filp->private_data;
 
diff --git a/drivers/char/ps3flash.c b/drivers/char/ps3flash.c
index b526dc15c271..5a03dd0eb2f1 100644
--- a/drivers/char/ps3flash.c
+++ b/drivers/char/ps3flash.c
@@ -281,7 +281,7 @@ static ssize_t ps3flash_kernel_write(const void *buf, size_t count,
 	return res;
 }
 
-static int ps3flash_flush(struct file *file, fl_owner_t id)
+static int ps3flash_flush(struct file *file, void *id)
 {
 	return ps3flash_writeback(ps3flash_dev);
 }
diff --git a/drivers/char/xillybus/xillybus_core.c b/drivers/char/xillybus/xillybus_core.c
index b6c9cdead7f3..7e04a9df51e3 100644
--- a/drivers/char/xillybus/xillybus_core.c
+++ b/drivers/char/xillybus/xillybus_core.c
@@ -1156,7 +1156,7 @@ static int xillybus_myflush(struct xilly_channel *channel, long timeout)
 	return rc;
 }
 
-static int xillybus_flush(struct file *filp, fl_owner_t id)
+static int xillybus_flush(struct file *filp, void *id)
 {
 	if (!(filp->f_mode & FMODE_WRITE))
 		return 0;
diff --git a/drivers/firmware/efi/capsule-loader.c b/drivers/firmware/efi/capsule-loader.c
index ec8ac5c4dd84..f4d0c4805ec7 100644
--- a/drivers/firmware/efi/capsule-loader.c
+++ b/drivers/firmware/efi/capsule-loader.c
@@ -225,7 +225,7 @@ static ssize_t efi_capsule_write(struct file *file, const char __user *buff,
  *	will be treated as upload termination and will free those completed
  *	buffer pages and -ECANCELED will be returned.
  **/
-static int efi_capsule_flush(struct file *file, fl_owner_t id)
+static int efi_capsule_flush(struct file *file, void *id)
 {
 	int ret = 0;
 	struct capsule_info *cap_info = file->private_data;
diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index 925571475005..85117fecc292 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -342,7 +342,7 @@ static int evdev_fasync(int fd, struct file *file, int on)
 	return fasync_helper(fd, file, on, &client->fasync);
 }
 
-static int evdev_flush(struct file *file, fl_owner_t id)
+static int evdev_flush(struct file *file, void *id)
 {
 	struct evdev_client *client = file->private_data;
 	struct evdev *evdev = client->evdev;
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index f7e826142a72..c446a1450714 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -48,7 +48,7 @@ static unsigned int scif_fdpoll(struct file *f, poll_table *wait)
 	return __scif_pollfd(f, wait, priv);
 }
 
-static int scif_fdflush(struct file *f, fl_owner_t id)
+static int scif_fdflush(struct file *f, void *id)
 {
 	struct scif_endpt *ep = f->private_data;
 
diff --git a/drivers/scsi/osst.c b/drivers/scsi/osst.c
index 929ee7e88120..366bc57f0bfb 100644
--- a/drivers/scsi/osst.c
+++ b/drivers/scsi/osst.c
@@ -4822,7 +4822,7 @@ static int os_scsi_tape_open(struct inode * inode, struct file * filp)
 
 
 /* Flush the tape buffer before close */
-static int os_scsi_tape_flush(struct file * filp, fl_owner_t id)
+static int os_scsi_tape_flush(struct file * filp, void *id)
 {
 	int		      result = 0, result2;
 	struct osst_tape    * STp    = filp->private_data;
diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index 8e5013d9cad4..7d97641fcca9 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -1338,7 +1338,7 @@ static int st_open(struct inode *inode, struct file *filp)
 \f

 
 /* Flush the tape buffer before close */
-static int st_flush(struct file *filp, fl_owner_t id)
+static int st_flush(struct file *filp, void *id)
 {
 	int result = 0, result2;
 	unsigned char cmd[MAX_COMMAND_SIZE];
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index ab1c85c1ed38..5f12ea3f26d3 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2278,7 +2278,7 @@ static loff_t ll_file_seek(struct file *file, loff_t offset, int origin)
 					ll_file_maxbytes(inode), eof);
 }
 
-static int ll_flush(struct file *file, fl_owner_t id)
+static int ll_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	struct ll_inode_info *lli = ll_i2info(inode);
diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index 8f972247b1c1..cb248f3d9565 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -582,7 +582,7 @@ static ssize_t wdm_read
 	return rv;
 }
 
-static int wdm_flush(struct file *file, fl_owner_t id)
+static int wdm_flush(struct file *file, void *id)
 {
 	struct wdm_device *desc = file->private_data;
 
diff --git a/drivers/usb/usb-skeleton.c b/drivers/usb/usb-skeleton.c
index bb0bd732e29a..547fe43678e8 100644
--- a/drivers/usb/usb-skeleton.c
+++ b/drivers/usb/usb-skeleton.c
@@ -136,7 +136,7 @@ static int skel_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int skel_flush(struct file *file, fl_owner_t id)
+static int skel_flush(struct file *file, void *id)
 {
 	struct usb_skel *dev;
 	int res;
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 82e16556afea..210838d2fa85 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -728,7 +728,7 @@ extern int afs_writepages(struct address_space *, struct writeback_control *);
 extern void afs_pages_written_back(struct afs_vnode *, struct afs_call *);
 extern ssize_t afs_file_write(struct kiocb *, struct iov_iter *);
 extern int afs_writeback_all(struct afs_vnode *);
-extern int afs_flush(struct file *, fl_owner_t);
+extern int afs_flush(struct file *, void *);
 extern int afs_fsync(struct file *, loff_t, loff_t, int);
 
 /*
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 2d2fccd5044b..a38f03bd6859 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -767,7 +767,7 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  * Flush out all outstanding writes on a file opened for writing when it is
  * closed.
  */
-int afs_flush(struct file *file, fl_owner_t id)
+int afs_flush(struct file *file, void *id)
 {
 	_enter("");
 
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 30bf89b1fd9a..914e5af73d32 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -108,7 +108,7 @@ extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from);
 extern int cifs_lock(struct file *, int, struct file_lock *);
 extern int cifs_fsync(struct file *, loff_t, loff_t, int);
 extern int cifs_strict_fsync(struct file *, loff_t, loff_t, int);
-extern int cifs_flush(struct file *, fl_owner_t id);
+extern int cifs_flush(struct file *, void *id);
 extern int cifs_file_mmap(struct file * , struct vm_area_struct *);
 extern int cifs_file_strict_mmap(struct file * , struct vm_area_struct *);
 extern const struct file_operations cifs_dir_ops;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index bc09df6b473a..bda1fdf46937 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1165,7 +1165,7 @@ cifs_push_mandatory_locks(struct cifsFileInfo *cfile)
 }
 
 static __u32
-hash_lockowner(fl_owner_t owner)
+hash_lockowner(void *owner)
 {
 	return cifs_lock_secret ^ hash32_ptr((const void *)owner);
 }
@@ -2399,7 +2399,7 @@ int cifs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  * As file closes, flush all cached write data for this inode checking
  * for write behind errors.
  */
-int cifs_flush(struct file *file, fl_owner_t id)
+int cifs_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	int rc = 0;
diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index ca4e83750214..535039b38da8 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -290,7 +290,7 @@ static int ecryptfs_dir_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int ecryptfs_flush(struct file *file, fl_owner_t td)
+static int ecryptfs_flush(struct file *file, void *td)
 {
 	struct file *lower_file = ecryptfs_file_to_lower(file);
 
diff --git a/fs/exofs/file.c b/fs/exofs/file.c
index 28645f0640f7..d9e3c7ca3a0c 100644
--- a/fs/exofs/file.c
+++ b/fs/exofs/file.c
@@ -58,7 +58,7 @@ static int exofs_file_fsync(struct file *filp, loff_t start, loff_t end,
 	return ret;
 }
 
-static int exofs_flush(struct file *file, fl_owner_t id)
+static int exofs_flush(struct file *file, void *id)
 {
 	int ret = vfs_fsync(file, 0);
 	/* TODO: Flush the OSD target */
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 3ee4fdc3da9e..89fe98020374 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -254,8 +254,7 @@ void fuse_release_common(struct file *file, int opcode)
 	if (ff->flock) {
 		struct fuse_release_in *inarg = &req->misc.release.in;
 		inarg->release_flags |= FUSE_RELEASE_FLOCK_UNLOCK;
-		inarg->lock_owner = fuse_lock_owner_id(ff->fc,
-						       (fl_owner_t) file);
+		inarg->lock_owner = fuse_lock_owner_id(ff->fc, file);
 	}
 	/* Hold inode until release is finished */
 	req->misc.release.inode = igrab(file_inode(file));
@@ -307,7 +306,7 @@ EXPORT_SYMBOL_GPL(fuse_sync_release);
  * Scramble the ID space with XTEA, so that the value of the files_struct
  * pointer is not exposed to userspace.
  */
-u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id)
+u64 fuse_lock_owner_id(struct fuse_conn *fc, void *id)
 {
 	u32 *k = fc->scramble_key;
 	u64 v = (unsigned long) id;
@@ -390,7 +389,7 @@ static void fuse_sync_writes(struct inode *inode)
 	fuse_release_nowrite(inode);
 }
 
-static int fuse_flush(struct file *file, fl_owner_t id)
+static int fuse_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	struct fuse_conn *fc = get_fuse_conn(inode);
@@ -643,7 +642,7 @@ static size_t fuse_async_req_send(struct fuse_conn *fc, struct fuse_req *req,
 }
 
 static size_t fuse_send_read(struct fuse_req *req, struct fuse_io_priv *io,
-			     loff_t pos, size_t count, fl_owner_t owner)
+			     loff_t pos, size_t count, void *owner)
 {
 	struct file *file = io->file;
 	struct fuse_file *ff = file->private_data;
@@ -955,7 +954,7 @@ static void fuse_write_fill(struct fuse_req *req, struct fuse_file *ff,
 }
 
 static size_t fuse_send_write(struct fuse_req *req, struct fuse_io_priv *io,
-			      loff_t pos, size_t count, fl_owner_t owner)
+			      loff_t pos, size_t count, void *owner)
 {
 	struct file *file = io->file;
 	struct fuse_file *ff = file->private_data;
@@ -1348,7 +1347,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 
 	while (count) {
 		size_t nres;
-		fl_owner_t owner = current->files;
+		void *owner = current->files;
 		size_t nbytes = min(count, nmax);
 		err = fuse_get_user_pages(req, iter, &nbytes, write);
 		if (err && !nbytes)
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 1bd7ffdad593..5f28f493b40a 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -900,7 +900,7 @@ int fuse_valid_type(int m);
  */
 int fuse_allow_current_process(struct fuse_conn *fc);
 
-u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id);
+u64 fuse_lock_owner_id(struct fuse_conn *fc, void *id);
 
 void fuse_update_ctime(struct inode *inode);
 
diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index 066ac313ae5c..1734c05f3182 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -81,7 +81,7 @@ static inline uint32_t __nlm_alloc_pid(struct nlm_host *host)
 	return res;
 }
 
-static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, fl_owner_t owner)
+static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, void *owner)
 {
 	struct nlm_lockowner *lockowner;
 	list_for_each_entry(lockowner, &host->h_lockowners, list) {
@@ -92,7 +92,7 @@ static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, fl_owne
 	return NULL;
 }
 
-static struct nlm_lockowner *nlm_find_lockowner(struct nlm_host *host, fl_owner_t owner)
+static struct nlm_lockowner *nlm_find_lockowner(struct nlm_host *host, void *owner)
 {
 	struct nlm_lockowner *res, *new = NULL;
 
diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 82925f17ec45..2d72631ea968 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -45,7 +45,7 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 
 		/* Set up the missing parts of the file_lock structure */
 		lock->fl.fl_file  = file->f_file;
-		lock->fl.fl_owner = (fl_owner_t) host;
+		lock->fl.fl_owner = host;
 		lock->fl.fl_lmops = &nlmsvc_lock_operations;
 	}
 
diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c
index 07915162581d..655a8daee20e 100644
--- a/fs/lockd/svcproc.c
+++ b/fs/lockd/svcproc.c
@@ -75,7 +75,7 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 
 		/* Set up the missing parts of the file_lock structure */
 		lock->fl.fl_file  = file->f_file;
-		lock->fl.fl_owner = (fl_owner_t) host;
+		lock->fl.fl_owner = host;
 		lock->fl.fl_lmops = &nlmsvc_lock_operations;
 	}
 
diff --git a/fs/locks.c b/fs/locks.c
index a3de5b96c81c..9a3c476f163a 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2429,7 +2429,7 @@ int fcntl_setlk64(unsigned int fd, struct file *filp, unsigned int cmd,
  * from the task's fd array.  POSIX locks belonging to this task
  * are deleted at this time.
  */
-void locks_remove_posix(struct file *filp, fl_owner_t owner)
+void locks_remove_posix(struct file *filp, void *owner)
 {
 	int error;
 	struct inode *inode = locks_inode(filp);
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index af330c31f627..bca440fc09ff 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -136,7 +136,7 @@ EXPORT_SYMBOL_GPL(nfs_file_llseek);
  * Flush all dirty pages, and check for write errors.
  */
 static int
-nfs_file_flush(struct file *file, fl_owner_t id)
+nfs_file_flush(struct file *file, void *id)
 {
 	struct inode	*inode = file_inode(file);
 
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 109279d6d91b..bb2cc33c3753 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -901,7 +901,7 @@ struct nfs_open_context *alloc_nfs_open_context(struct dentry *dentry,
 	ctx->mode = f_mode;
 	ctx->flags = 0;
 	ctx->error = 0;
-	ctx->flock_owner = (fl_owner_t)filp;
+	ctx->flock_owner = filp;
 	nfs_init_lock_context(&ctx->lock_context);
 	ctx->lock_context.open_context = ctx;
 	INIT_LIST_HEAD(&ctx->list);
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 40bd05f05e74..0fbe16684519 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -145,7 +145,7 @@ struct nfs4_lock_state {
 	struct nfs_seqid_counter	ls_seqid;
 	nfs4_stateid		ls_stateid;
 	atomic_t		ls_count;
-	fl_owner_t		ls_owner;
+	void *			ls_owner;
 };
 
 /* bits for nfs4_state->flags */
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 0efba77789b9..1e09d9e4dd20 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -107,7 +107,7 @@ nfs4_file_open(struct inode *inode, struct file *filp)
  * Flush all dirty pages, and check for write errors.
  */
 static int
-nfs4_file_flush(struct file *file, fl_owner_t id)
+nfs4_file_flush(struct file *file, void *id)
 {
 	struct inode	*inode = file_inode(file);
 
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 0378e2257ca7..f04a501a6ae9 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -813,7 +813,7 @@ void nfs4_close_sync(struct nfs4_state *state, fmode_t fmode)
  */
 static struct nfs4_lock_state *
 __nfs4_find_lock_state(struct nfs4_state *state,
-		       fl_owner_t fl_owner, fl_owner_t fl_owner2)
+		       void *fl_owner, void *fl_owner2)
 {
 	struct nfs4_lock_state *pos, *ret = NULL;
 	list_for_each_entry(pos, &state->lock_states, ls_locks) {
@@ -834,7 +834,7 @@ __nfs4_find_lock_state(struct nfs4_state *state,
  * exists, return an uninitialized one.
  *
  */
-static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, fl_owner_t fl_owner)
+static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, void *fl_owner)
 {
 	struct nfs4_lock_state *lsp;
 	struct nfs_server *server = state->owner->so_server;
@@ -868,7 +868,7 @@ void nfs4_free_lock_state(struct nfs_server *server, struct nfs4_lock_state *lsp
  * exists, return an uninitialized one.
  *
  */
-static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, fl_owner_t owner)
+static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, void *owner)
 {
 	struct nfs4_lock_state *lsp, *new = NULL;
 	
@@ -959,7 +959,7 @@ static int nfs4_copy_lock_stateid(nfs4_stateid *dst,
 		const struct nfs_lock_context *l_ctx)
 {
 	struct nfs4_lock_state *lsp;
-	fl_owner_t fl_owner, fl_flock_owner;
+	void *fl_owner, *fl_flock_owner;
 	int ret = -ENOENT;
 
 	if (l_ctx == NULL)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b50a7492f47f..b0126767b5b5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1195,7 +1195,7 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
 
 	file = find_any_file(stp->st_stid.sc_file);
 	if (file)
-		filp_close(file, (fl_owner_t)lo);
+		filp_close(file, lo);
 	nfs4_free_ol_stateid(stid);
 }
 
@@ -4183,7 +4183,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)
 	fl->fl_flags = FL_DELEG;
 	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->fl_end = OFFSET_MAX;
-	fl->fl_owner = (fl_owner_t)fp;
+	fl->fl_owner = fp;
 	fl->fl_pid = current->tgid;
 	return fl;
 }
@@ -5446,8 +5446,8 @@ nfs4_transform_lock_offset(struct file_lock *lock)
 		lock->fl_end = OFFSET_MAX;
 }
 
-static fl_owner_t
-nfsd4_fl_get_owner(fl_owner_t owner)
+static void *
+nfsd4_fl_get_owner(void *owner)
 {
 	struct nfs4_lockowner *lo = (struct nfs4_lockowner *)owner;
 
@@ -5456,7 +5456,7 @@ nfsd4_fl_get_owner(fl_owner_t owner)
 }
 
 static void
-nfsd4_fl_put_owner(fl_owner_t owner)
+nfsd4_fl_put_owner(void *owner)
 {
 	struct nfs4_lockowner *lo = (struct nfs4_lockowner *)owner;
 
@@ -5886,7 +5886,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	file_lock = &nbl->nbl_lock;
 	file_lock->fl_type = fl_type;
-	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(&lock_sop->lo_owner));
+	file_lock->fl_owner = lockowner(nfs4_get_stateowner(&lock_sop->lo_owner));
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_file = filp;
 	file_lock->fl_flags = fl_flags;
@@ -6040,7 +6040,7 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	lo = find_lockowner_str(cstate->clp, &lockt->lt_owner);
 	if (lo)
-		file_lock->fl_owner = (fl_owner_t)lo;
+		file_lock->fl_owner = lo;
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_flags = FL_POSIX;
 
@@ -6102,7 +6102,7 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	}
 
 	file_lock->fl_type = F_UNLCK;
-	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(stp->st_stateowner));
+	file_lock->fl_owner = lockowner(nfs4_get_stateowner(stp->st_stateowner));
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_file = filp;
 	file_lock->fl_flags = FL_POSIX;
@@ -6161,7 +6161,7 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
 	if (flctx && !list_empty_careful(&flctx->flc_posix)) {
 		spin_lock(&flctx->flc_lock);
 		list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
-			if (fl->fl_owner == (fl_owner_t)lowner) {
+			if (fl->fl_owner == lowner) {
 				status = true;
 				break;
 			}
diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 2430a0415995..a0c804f75435 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -145,7 +145,7 @@ static struct fsnotify_ops dnotify_fsnotify_ops = {
  * dnotify_struct.  If that was the last dnotify_struct also remove the
  * fsnotify_mark.
  */
-void dnotify_flush(struct file *filp, fl_owner_t id)
+void dnotify_flush(struct file *filp, void *id)
 {
 	struct fsnotify_mark *fsn_mark;
 	struct dnotify_mark *dn_mark;
@@ -223,7 +223,7 @@ static __u32 convert_arg(unsigned long arg)
  * that list, or it |= the mask onto an existing dnofiy_struct.
  */
 static int attach_dn(struct dnotify_struct *dn, struct dnotify_mark *dn_mark,
-		     fl_owner_t id, int fd, struct file *filp, __u32 mask)
+		     void *id, int fd, struct file *filp, __u32 mask)
 {
 	struct dnotify_struct *odn;
 
@@ -259,7 +259,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
 	struct fsnotify_mark *new_fsn_mark, *fsn_mark;
 	struct dnotify_struct *dn;
 	struct inode *inode;
-	fl_owner_t id = current->files;
+	void *id = current->files;
 	struct file *f;
 	int destroy = 0, error = 0;
 	__u32 mask;
diff --git a/fs/open.c b/fs/open.c
index 946c646b39b0..7e9d537e1d32 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1119,7 +1119,7 @@ SYSCALL_DEFINE2(creat, const char __user *, pathname, umode_t, mode)
  * "id" is the POSIX thread ID. We use the
  * files pointer for this..
  */
-int filp_close(struct file *filp, fl_owner_t id)
+int filp_close(struct file *filp, void *id)
 {
 	int retval = 0;
 
diff --git a/include/linux/dnotify.h b/include/linux/dnotify.h
index 3290555a52ee..5c6b6004f2ca 100644
--- a/include/linux/dnotify.h
+++ b/include/linux/dnotify.h
@@ -13,7 +13,7 @@ struct dnotify_struct {
 	__u32			dn_mask;
 	int			dn_fd;
 	struct file *		dn_filp;
-	fl_owner_t		dn_owner;
+	void *			dn_owner;
 };
 
 #ifdef __KERNEL__
@@ -29,12 +29,12 @@ struct dnotify_struct {
 			    FS_MOVED_FROM | FS_MOVED_TO)
 
 extern int dir_notify_enable;
-extern void dnotify_flush(struct file *, fl_owner_t);
+extern void dnotify_flush(struct file *, void *);
 extern int fcntl_dirnotify(int, struct file *, unsigned long);
 
 #else
 
-static inline void dnotify_flush(struct file *filp, fl_owner_t id)
+static inline void dnotify_flush(struct file *filp, void *id)
 {
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2261728cc900..d0c19bb27f79 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -933,9 +933,6 @@ static inline struct file *get_file(struct file *f)
  */
 #define FILE_LOCK_DEFERRED 1
 
-/* legacy typedef, should eventually be removed */
-typedef void *fl_owner_t;
-
 struct file_lock;
 
 struct file_lock_operations {
@@ -946,8 +943,8 @@ struct file_lock_operations {
 struct lock_manager_operations {
 	int (*lm_compare_owner)(struct file_lock *, struct file_lock *);
 	unsigned long (*lm_owner_key)(struct file_lock *);
-	fl_owner_t (*lm_get_owner)(fl_owner_t);
-	void (*lm_put_owner)(fl_owner_t);
+	void *(*lm_get_owner)(void *);
+	void (*lm_put_owner)(void *);
 	void (*lm_notify)(struct file_lock *);	/* unblock callback */
 	int (*lm_grant)(struct file_lock *, int);
 	bool (*lm_break)(struct file_lock *);
@@ -996,7 +993,7 @@ struct file_lock {
 	struct list_head fl_list;	/* link into file_lock_context */
 	struct hlist_node fl_link;	/* node in global lists */
 	struct list_head fl_block;	/* circular list of blocked processes */
-	fl_owner_t fl_owner;
+	void *fl_owner;
 	unsigned int fl_flags;
 	unsigned char fl_type;
 	unsigned int fl_pid;
@@ -1073,7 +1070,7 @@ extern void locks_init_lock(struct file_lock *);
 extern struct file_lock * locks_alloc_lock(void);
 extern void locks_copy_lock(struct file_lock *, struct file_lock *);
 extern void locks_copy_conflock(struct file_lock *, struct file_lock *);
-extern void locks_remove_posix(struct file *, fl_owner_t);
+extern void locks_remove_posix(struct file *, void *);
 extern void locks_remove_file(struct file *);
 extern void locks_release_private(struct file_lock *);
 extern void posix_test_lock(struct file *, struct file_lock *);
@@ -1147,7 +1144,7 @@ static inline void locks_copy_lock(struct file_lock *new, struct file_lock *fl)
 	return;
 }
 
-static inline void locks_remove_posix(struct file *filp, fl_owner_t owner)
+static inline void locks_remove_posix(struct file *filp, void *owner)
 {
 	return;
 }
@@ -1682,7 +1679,7 @@ struct file_operations {
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
-	int (*flush) (struct file *, fl_owner_t id);
+	int (*flush) (struct file *, void *id);
 	int (*release) (struct inode *, struct file *);
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
@@ -2412,7 +2409,7 @@ extern struct file *filp_open(const char *, int, umode_t);
 extern struct file *file_open_root(struct dentry *, struct vfsmount *,
 				   const char *, int, umode_t);
 extern struct file * dentry_open(const struct path *, int, const struct cred *);
-extern int filp_close(struct file *, fl_owner_t id);
+extern int filp_close(struct file *, void *id);
 
 extern struct filename *getname_flags(const char __user *, int, int *);
 extern struct filename *getname(const char __user *);
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 3eca67728366..c7340e4bcd23 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -117,14 +117,14 @@ static inline struct sockaddr *nlm_srcaddr(const struct nlm_host *host)
 }
 
 /*
- * Map an fl_owner_t into a unique 32-bit "pid"
+ * Map a lock owner into a unique 32-bit "pid"
  */
 struct nlm_lockowner {
 	struct list_head list;
 	atomic_t count;
 
 	struct nlm_host *host;
-	fl_owner_t owner;
+	void *owner;
 	uint32_t pid;
 };
 
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 5cc91d6381a3..cfd37076fc4b 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -59,14 +59,14 @@ struct nfs_lock_context {
 	atomic_t count;
 	struct list_head list;
 	struct nfs_open_context *open_context;
-	fl_owner_t lockowner;
+	void *lockowner;
 	atomic_t io_count;
 };
 
 struct nfs4_state;
 struct nfs_open_context {
 	struct nfs_lock_context lock_context;
-	fl_owner_t flock_owner;
+	void *flock_owner;
 	struct dentry *dentry;
 	struct rpc_cred *cred;
 	struct nfs4_state *state;
diff --git a/include/trace/events/filelock.h b/include/trace/events/filelock.h
index 63a7680347cb..10a8b5a1b235 100644
--- a/include/trace/events/filelock.h
+++ b/include/trace/events/filelock.h
@@ -68,7 +68,7 @@ DECLARE_EVENT_CLASS(filelock_lock,
 		__field(unsigned long, i_ino)
 		__field(dev_t, s_dev)
 		__field(struct file_lock *, fl_next)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_pid)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
@@ -122,7 +122,7 @@ DECLARE_EVENT_CLASS(filelock_lease,
 		__field(unsigned long, i_ino)
 		__field(dev_t, s_dev)
 		__field(struct file_lock *, fl_next)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
 		__field(unsigned long, fl_break_time)
@@ -175,7 +175,7 @@ TRACE_EVENT(generic_add_lease,
 		__field(int, dcount)
 		__field(int, icount)
 		__field(dev_t, s_dev)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
 	),
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index eb1391b52c6f..d339f60223b5 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -507,7 +507,7 @@ static ssize_t mqueue_read_file(struct file *filp, char __user *u_data,
 	return ret;
 }
 
-static int mqueue_flush_file(struct file *filp, fl_owner_t id)
+static int mqueue_flush_file(struct file *filp, void *id)
 {
 	struct mqueue_inode_info *info = MQUEUE_I(file_inode(filp));
 


* Re: [PATCH 3/3] nfsd: clients don't need to break their own delegations
@ 2017-09-07 22:01       ` J. Bruce Fields
  0 siblings, 0 replies; 35+ messages in thread
From: J. Bruce Fields @ 2017-09-07 22:01 UTC (permalink / raw)
  To: NeilBrown; +Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, Trond Myklebust

On Mon, Aug 28, 2017 at 02:32:53PM +1000, NeilBrown wrote:
> /* legacy typedef, should eventually be removed */
> typedef void *fl_owner_t;
> 
> 
> Maybe you could do the world a favor and remove fl_owner_t in a
> preliminary patch :-)

Partly scripted, still a bit tedious, but I think it's right.  Honestly
I don't know what the motivation for the comment was, though.  Are there
no documentation or type-checking benefits to having the typedef?
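
One place where the typedef is not purely cosmetic is a declaration with
several declarators, which is worth keeping in mind for a scripted
replacement.  A minimal sketch in standalone userspace C, not kernel code:
owner_t is a hypothetical stand-in for fl_owner_t, and the helper names are
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's "typedef void *fl_owner_t;". */
typedef void *owner_t;

/* Evaluates to 1 if x has type "void *", 0 otherwise (C11 _Generic). */
#define IS_VOID_PTR(x) _Generic((x), void *: 1, default: 0)

/* With the typedef, every declarator in the list is a pointer. */
int typedef_form_ok(void)
{
	owner_t a = NULL, b = NULL;

	return IS_VOID_PTR(a) && IS_VOID_PTR(b) && a == b;
}

/*
 * Spelled out, each declarator needs its own '*'.  "void *a, b;" would
 * declare b as plain void, which the compiler rejects.
 */
int spelled_out_form_ok(void)
{
	void *a = NULL, *b = NULL;

	return IS_VOID_PTR(a) && IS_VOID_PTR(b) && a == b;
}

/* Runs both checks; aborts via assert() if either one fails. */
void check_declarations(void)
{
	assert(typedef_form_ok());
	assert(spelled_out_form_ok());
}
```

Calling check_declarations() from a test main() exercises both forms.  The
point is only that a textual fl_owner_t -> "void *" substitution has to add
a '*' per declarator wherever one declaration names more than one variable.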

The main annoyance was having this defined as a files_struct pointer,
which Christoph fixed some time ago.
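
To illustrate why the files_struct definition was awkward: with a concrete
pointer type, every caller whose owner is some other kind of object has to
cast, which is presumably where casts like "(fl_owner_t) host" came from.
A rough userspace sketch; old_owner_t, new_owner_t, struct host and the
helpers are hypothetical names chosen for the example, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

struct files_struct;                      /* opaque, as in the old typedef */
struct host { int id; };                  /* hypothetical non-files owner */

typedef struct files_struct *old_owner_t; /* old style: concrete pointer type */
typedef void *new_owner_t;                /* later style: alias for void * */

struct lock_old { old_owner_t owner; };
struct lock_new { new_owner_t owner; };

/* Old style: storing anything but a files_struct pointer needs a cast. */
void set_owner_old(struct lock_old *l, struct host *h)
{
	l->owner = (old_owner_t)h;
}

/* void * style: any object pointer converts implicitly, so no cast. */
void set_owner_new(struct lock_new *l, struct host *h)
{
	l->owner = h;
}

/* In this sketch owners are only compared by address. */
int same_owner(const void *a, const void *b)
{
	return a == b;
}

/* Exercises both setters and checks that the stored owner round-trips. */
void check_owners(void)
{
	struct host h = { .id = 1 };
	struct lock_old lo;
	struct lock_new ln;

	set_owner_old(&lo, &h);
	set_owner_new(&ln, &h);
	assert(same_owner(lo.owner, &h));
	assert(same_owner(ln.owner, &h));
}
```

The flip side is that the void * form accepts any object pointer without
complaint, so whatever the typedef still offers is documentation rather than
type checking.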

--b.

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 73e7d91f03dc..758ca1591b0c 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -864,7 +864,7 @@ struct file_operations {
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*mremap)(struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
-	int (*flush) (struct file *, fl_owner_t id);
+	int (*flush) (struct file *, void *id);
 	int (*release) (struct inode *, struct file *);
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 09f86ebfcc7b..35a185ac0de3 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -1809,7 +1809,7 @@ pfm_syswide_cleanup_other_cpu(pfm_context_t *ctx)
  * When caller is self-monitoring, the context is unloaded.
  */
 static int
-pfm_flush(struct file *filp, fl_owner_t id)
+pfm_flush(struct file *filp, void *id)
 {
 	pfm_context_t *ctx;
 	struct task_struct *task;
diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index ae2f740a82f1..8783335b3b85 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -1720,7 +1720,7 @@ static unsigned int spufs_mfc_poll(struct file *file,poll_table *wait)
 	return mask;
 }
 
-static int spufs_mfc_flush(struct file *file, fl_owner_t id)
+static int spufs_mfc_flush(struct file *file, void *id)
 {
 	struct spu_context *ctx = file->private_data;
 	int ret;
diff --git a/arch/tile/kernel/hardwall.c b/arch/tile/kernel/hardwall.c
index 2fd1694ac1d0..b2cf21d1edb0 100644
--- a/arch/tile/kernel/hardwall.c
+++ b/arch/tile/kernel/hardwall.c
@@ -1030,7 +1030,7 @@ static long hardwall_compat_ioctl(struct file *file,
 #endif
 
 /* The user process closed the file; revoke access to user networks. */
-static int hardwall_flush(struct file *file, fl_owner_t owner)
+static int hardwall_flush(struct file *file, void *owner)
 {
 	struct hardwall_info *info = file->private_data;
 	struct task_struct *task, *tmp;
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index f7665c31feca..02bfce18c912 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -3503,7 +3503,7 @@ static int binder_open(struct inode *nodp, struct file *filp)
 	return 0;
 }
 
-static int binder_flush(struct file *filp, fl_owner_t id)
+static int binder_flush(struct file *filp, void *id)
 {
 	struct binder_proc *proc = filp->private_data;
 
diff --git a/drivers/char/ps3flash.c b/drivers/char/ps3flash.c
index b526dc15c271..5a03dd0eb2f1 100644
--- a/drivers/char/ps3flash.c
+++ b/drivers/char/ps3flash.c
@@ -281,7 +281,7 @@ static ssize_t ps3flash_kernel_write(const void *buf, size_t count,
 	return res;
 }
 
-static int ps3flash_flush(struct file *file, fl_owner_t id)
+static int ps3flash_flush(struct file *file, void *id)
 {
 	return ps3flash_writeback(ps3flash_dev);
 }
diff --git a/drivers/char/xillybus/xillybus_core.c b/drivers/char/xillybus/xillybus_core.c
index b6c9cdead7f3..7e04a9df51e3 100644
--- a/drivers/char/xillybus/xillybus_core.c
+++ b/drivers/char/xillybus/xillybus_core.c
@@ -1156,7 +1156,7 @@ static int xillybus_myflush(struct xilly_channel *channel, long timeout)
 	return rc;
 }
 
-static int xillybus_flush(struct file *filp, fl_owner_t id)
+static int xillybus_flush(struct file *filp, void *id)
 {
 	if (!(filp->f_mode & FMODE_WRITE))
 		return 0;
diff --git a/drivers/firmware/efi/capsule-loader.c b/drivers/firmware/efi/capsule-loader.c
index ec8ac5c4dd84..f4d0c4805ec7 100644
--- a/drivers/firmware/efi/capsule-loader.c
+++ b/drivers/firmware/efi/capsule-loader.c
@@ -225,7 +225,7 @@ static ssize_t efi_capsule_write(struct file *file, const char __user *buff,
  *	will be treated as upload termination and will free those completed
  *	buffer pages and -ECANCELED will be returned.
  **/
-static int efi_capsule_flush(struct file *file, fl_owner_t id)
+static int efi_capsule_flush(struct file *file, void *id)
 {
 	int ret = 0;
 	struct capsule_info *cap_info = file->private_data;
diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index 925571475005..85117fecc292 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -342,7 +342,7 @@ static int evdev_fasync(int fd, struct file *file, int on)
 	return fasync_helper(fd, file, on, &client->fasync);
 }
 
-static int evdev_flush(struct file *file, fl_owner_t id)
+static int evdev_flush(struct file *file, void *id)
 {
 	struct evdev_client *client = file->private_data;
 	struct evdev *evdev = client->evdev;
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index f7e826142a72..c446a1450714 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -48,7 +48,7 @@ static unsigned int scif_fdpoll(struct file *f, poll_table *wait)
 	return __scif_pollfd(f, wait, priv);
 }
 
-static int scif_fdflush(struct file *f, fl_owner_t id)
+static int scif_fdflush(struct file *f, void *id)
 {
 	struct scif_endpt *ep = f->private_data;
 
diff --git a/drivers/scsi/osst.c b/drivers/scsi/osst.c
index 929ee7e88120..366bc57f0bfb 100644
--- a/drivers/scsi/osst.c
+++ b/drivers/scsi/osst.c
@@ -4822,7 +4822,7 @@ static int os_scsi_tape_open(struct inode * inode, struct file * filp)
 
 
 /* Flush the tape buffer before close */
-static int os_scsi_tape_flush(struct file * filp, fl_owner_t id)
+static int os_scsi_tape_flush(struct file * filp, void *id)
 {
 	int		      result = 0, result2;
 	struct osst_tape    * STp    = filp->private_data;
diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index 8e5013d9cad4..7d97641fcca9 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -1338,7 +1338,7 @@ static int st_open(struct inode *inode, struct file *filp)
 \f
 
 /* Flush the tape buffer before close */
-static int st_flush(struct file *filp, fl_owner_t id)
+static int st_flush(struct file *filp, void *id)
 {
 	int result = 0, result2;
 	unsigned char cmd[MAX_COMMAND_SIZE];
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index ab1c85c1ed38..5f12ea3f26d3 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2278,7 +2278,7 @@ static loff_t ll_file_seek(struct file *file, loff_t offset, int origin)
 					ll_file_maxbytes(inode), eof);
 }
 
-static int ll_flush(struct file *file, fl_owner_t id)
+static int ll_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	struct ll_inode_info *lli = ll_i2info(inode);
diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index 8f972247b1c1..cb248f3d9565 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -582,7 +582,7 @@ static ssize_t wdm_read
 	return rv;
 }
 
-static int wdm_flush(struct file *file, fl_owner_t id)
+static int wdm_flush(struct file *file, void *id)
 {
 	struct wdm_device *desc = file->private_data;
 
diff --git a/drivers/usb/usb-skeleton.c b/drivers/usb/usb-skeleton.c
index bb0bd732e29a..547fe43678e8 100644
--- a/drivers/usb/usb-skeleton.c
+++ b/drivers/usb/usb-skeleton.c
@@ -136,7 +136,7 @@ static int skel_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int skel_flush(struct file *file, fl_owner_t id)
+static int skel_flush(struct file *file, void *id)
 {
 	struct usb_skel *dev;
 	int res;
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 82e16556afea..210838d2fa85 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -728,7 +728,7 @@ extern int afs_writepages(struct address_space *, struct writeback_control *);
 extern void afs_pages_written_back(struct afs_vnode *, struct afs_call *);
 extern ssize_t afs_file_write(struct kiocb *, struct iov_iter *);
 extern int afs_writeback_all(struct afs_vnode *);
-extern int afs_flush(struct file *, fl_owner_t);
+extern int afs_flush(struct file *, void *);
 extern int afs_fsync(struct file *, loff_t, loff_t, int);
 
 /*
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 2d2fccd5044b..a38f03bd6859 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -767,7 +767,7 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  * Flush out all outstanding writes on a file opened for writing when it is
  * closed.
  */
-int afs_flush(struct file *file, fl_owner_t id)
+int afs_flush(struct file *file, void *id)
 {
 	_enter("");
 
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 30bf89b1fd9a..914e5af73d32 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -108,7 +108,7 @@ extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from);
 extern int cifs_lock(struct file *, int, struct file_lock *);
 extern int cifs_fsync(struct file *, loff_t, loff_t, int);
 extern int cifs_strict_fsync(struct file *, loff_t, loff_t, int);
-extern int cifs_flush(struct file *, fl_owner_t id);
+extern int cifs_flush(struct file *, void *id);
 extern int cifs_file_mmap(struct file * , struct vm_area_struct *);
 extern int cifs_file_strict_mmap(struct file * , struct vm_area_struct *);
 extern const struct file_operations cifs_dir_ops;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index bc09df6b473a..bda1fdf46937 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1165,7 +1165,7 @@ cifs_push_mandatory_locks(struct cifsFileInfo *cfile)
 }
 
 static __u32
-hash_lockowner(fl_owner_t owner)
+hash_lockowner(void *owner)
 {
 	return cifs_lock_secret ^ hash32_ptr((const void *)owner);
 }
@@ -2399,7 +2399,7 @@ int cifs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  * As file closes, flush all cached write data for this inode checking
  * for write behind errors.
  */
-int cifs_flush(struct file *file, fl_owner_t id)
+int cifs_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	int rc = 0;
diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index ca4e83750214..535039b38da8 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -290,7 +290,7 @@ static int ecryptfs_dir_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int ecryptfs_flush(struct file *file, fl_owner_t td)
+static int ecryptfs_flush(struct file *file, void *td)
 {
 	struct file *lower_file = ecryptfs_file_to_lower(file);
 
diff --git a/fs/exofs/file.c b/fs/exofs/file.c
index 28645f0640f7..d9e3c7ca3a0c 100644
--- a/fs/exofs/file.c
+++ b/fs/exofs/file.c
@@ -58,7 +58,7 @@ static int exofs_file_fsync(struct file *filp, loff_t start, loff_t end,
 	return ret;
 }
 
-static int exofs_flush(struct file *file, fl_owner_t id)
+static int exofs_flush(struct file *file, void *id)
 {
 	int ret = vfs_fsync(file, 0);
 	/* TODO: Flush the OSD target */
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 3ee4fdc3da9e..89fe98020374 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -254,8 +254,7 @@ void fuse_release_common(struct file *file, int opcode)
 	if (ff->flock) {
 		struct fuse_release_in *inarg = &req->misc.release.in;
 		inarg->release_flags |= FUSE_RELEASE_FLOCK_UNLOCK;
-		inarg->lock_owner = fuse_lock_owner_id(ff->fc,
-						       (fl_owner_t) file);
+		inarg->lock_owner = fuse_lock_owner_id(ff->fc, file);
 	}
 	/* Hold inode until release is finished */
 	req->misc.release.inode = igrab(file_inode(file));
@@ -307,7 +306,7 @@ EXPORT_SYMBOL_GPL(fuse_sync_release);
  * Scramble the ID space with XTEA, so that the value of the files_struct
  * pointer is not exposed to userspace.
  */
-u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id)
+u64 fuse_lock_owner_id(struct fuse_conn *fc, void *id)
 {
 	u32 *k = fc->scramble_key;
 	u64 v = (unsigned long) id;
@@ -390,7 +389,7 @@ static void fuse_sync_writes(struct inode *inode)
 	fuse_release_nowrite(inode);
 }
 
-static int fuse_flush(struct file *file, fl_owner_t id)
+static int fuse_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	struct fuse_conn *fc = get_fuse_conn(inode);
@@ -643,7 +642,7 @@ static size_t fuse_async_req_send(struct fuse_conn *fc, struct fuse_req *req,
 }
 
 static size_t fuse_send_read(struct fuse_req *req, struct fuse_io_priv *io,
-			     loff_t pos, size_t count, fl_owner_t owner)
+			     loff_t pos, size_t count, void *owner)
 {
 	struct file *file = io->file;
 	struct fuse_file *ff = file->private_data;
@@ -955,7 +954,7 @@ static void fuse_write_fill(struct fuse_req *req, struct fuse_file *ff,
 }
 
 static size_t fuse_send_write(struct fuse_req *req, struct fuse_io_priv *io,
-			      loff_t pos, size_t count, fl_owner_t owner)
+			      loff_t pos, size_t count, void *owner)
 {
 	struct file *file = io->file;
 	struct fuse_file *ff = file->private_data;
@@ -1348,7 +1347,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 
 	while (count) {
 		size_t nres;
-		fl_owner_t owner = current->files;
+		void *owner = current->files;
 		size_t nbytes = min(count, nmax);
 		err = fuse_get_user_pages(req, iter, &nbytes, write);
 		if (err && !nbytes)
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 1bd7ffdad593..5f28f493b40a 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -900,7 +900,7 @@ int fuse_valid_type(int m);
  */
 int fuse_allow_current_process(struct fuse_conn *fc);
 
-u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id);
+u64 fuse_lock_owner_id(struct fuse_conn *fc, void *id);
 
 void fuse_update_ctime(struct inode *inode);
 
diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index 066ac313ae5c..1734c05f3182 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -81,7 +81,7 @@ static inline uint32_t __nlm_alloc_pid(struct nlm_host *host)
 	return res;
 }
 
-static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, fl_owner_t owner)
+static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, void *owner)
 {
 	struct nlm_lockowner *lockowner;
 	list_for_each_entry(lockowner, &host->h_lockowners, list) {
@@ -92,7 +92,7 @@ static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, fl_owne
 	return NULL;
 }
 
-static struct nlm_lockowner *nlm_find_lockowner(struct nlm_host *host, fl_owner_t owner)
+static struct nlm_lockowner *nlm_find_lockowner(struct nlm_host *host, void *owner)
 {
 	struct nlm_lockowner *res, *new = NULL;
 
diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 82925f17ec45..2d72631ea968 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -45,7 +45,7 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 
 		/* Set up the missing parts of the file_lock structure */
 		lock->fl.fl_file  = file->f_file;
-		lock->fl.fl_owner = (fl_owner_t) host;
+		lock->fl.fl_owner = host;
 		lock->fl.fl_lmops = &nlmsvc_lock_operations;
 	}
 
diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c
index 07915162581d..655a8daee20e 100644
--- a/fs/lockd/svcproc.c
+++ b/fs/lockd/svcproc.c
@@ -75,7 +75,7 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 
 		/* Set up the missing parts of the file_lock structure */
 		lock->fl.fl_file  = file->f_file;
-		lock->fl.fl_owner = (fl_owner_t) host;
+		lock->fl.fl_owner = host;
 		lock->fl.fl_lmops = &nlmsvc_lock_operations;
 	}
 
diff --git a/fs/locks.c b/fs/locks.c
index a3de5b96c81c..9a3c476f163a 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2429,7 +2429,7 @@ int fcntl_setlk64(unsigned int fd, struct file *filp, unsigned int cmd,
  * from the task's fd array.  POSIX locks belonging to this task
  * are deleted at this time.
  */
-void locks_remove_posix(struct file *filp, fl_owner_t owner)
+void locks_remove_posix(struct file *filp, void *owner)
 {
 	int error;
 	struct inode *inode = locks_inode(filp);
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index af330c31f627..bca440fc09ff 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -136,7 +136,7 @@ EXPORT_SYMBOL_GPL(nfs_file_llseek);
  * Flush all dirty pages, and check for write errors.
  */
 static int
-nfs_file_flush(struct file *file, fl_owner_t id)
+nfs_file_flush(struct file *file, void *id)
 {
 	struct inode	*inode = file_inode(file);
 
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 109279d6d91b..bb2cc33c3753 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -901,7 +901,7 @@ struct nfs_open_context *alloc_nfs_open_context(struct dentry *dentry,
 	ctx->mode = f_mode;
 	ctx->flags = 0;
 	ctx->error = 0;
-	ctx->flock_owner = (fl_owner_t)filp;
+	ctx->flock_owner = filp;
 	nfs_init_lock_context(&ctx->lock_context);
 	ctx->lock_context.open_context = ctx;
 	INIT_LIST_HEAD(&ctx->list);
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 40bd05f05e74..0fbe16684519 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -145,7 +145,7 @@ struct nfs4_lock_state {
 	struct nfs_seqid_counter	ls_seqid;
 	nfs4_stateid		ls_stateid;
 	atomic_t		ls_count;
-	fl_owner_t		ls_owner;
+	void *			ls_owner;
 };
 
 /* bits for nfs4_state->flags */
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 0efba77789b9..1e09d9e4dd20 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -107,7 +107,7 @@ nfs4_file_open(struct inode *inode, struct file *filp)
  * Flush all dirty pages, and check for write errors.
  */
 static int
-nfs4_file_flush(struct file *file, fl_owner_t id)
+nfs4_file_flush(struct file *file, void *id)
 {
 	struct inode	*inode = file_inode(file);
 
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 0378e2257ca7..f04a501a6ae9 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -813,7 +813,7 @@ void nfs4_close_sync(struct nfs4_state *state, fmode_t fmode)
  */
 static struct nfs4_lock_state *
 __nfs4_find_lock_state(struct nfs4_state *state,
-		       fl_owner_t fl_owner, fl_owner_t fl_owner2)
+		       void *fl_owner, void *fl_owner2)
 {
 	struct nfs4_lock_state *pos, *ret = NULL;
 	list_for_each_entry(pos, &state->lock_states, ls_locks) {
@@ -834,7 +834,7 @@ __nfs4_find_lock_state(struct nfs4_state *state,
  * exists, return an uninitialized one.
  *
  */
-static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, fl_owner_t fl_owner)
+static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, void *fl_owner)
 {
 	struct nfs4_lock_state *lsp;
 	struct nfs_server *server = state->owner->so_server;
@@ -868,7 +868,7 @@ void nfs4_free_lock_state(struct nfs_server *server, struct nfs4_lock_state *lsp
  * exists, return an uninitialized one.
  *
  */
-static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, fl_owner_t owner)
+static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, void *owner)
 {
 	struct nfs4_lock_state *lsp, *new = NULL;
 	
@@ -959,7 +959,7 @@ static int nfs4_copy_lock_stateid(nfs4_stateid *dst,
 		const struct nfs_lock_context *l_ctx)
 {
 	struct nfs4_lock_state *lsp;
-	fl_owner_t fl_owner, fl_flock_owner;
+	void *fl_owner, fl_flock_owner;
 	int ret = -ENOENT;
 
 	if (l_ctx == NULL)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b50a7492f47f..b0126767b5b5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1195,7 +1195,7 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
 
 	file = find_any_file(stp->st_stid.sc_file);
 	if (file)
-		filp_close(file, (fl_owner_t)lo);
+		filp_close(file, lo);
 	nfs4_free_ol_stateid(stid);
 }
 
@@ -4183,7 +4183,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)
 	fl->fl_flags = FL_DELEG;
 	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->fl_end = OFFSET_MAX;
-	fl->fl_owner = (fl_owner_t)fp;
+	fl->fl_owner = fp;
 	fl->fl_pid = current->tgid;
 	return fl;
 }
@@ -5446,8 +5446,8 @@ nfs4_transform_lock_offset(struct file_lock *lock)
 		lock->fl_end = OFFSET_MAX;
 }
 
-static fl_owner_t
-nfsd4_fl_get_owner(fl_owner_t owner)
+static void *
+nfsd4_fl_get_owner(void *owner)
 {
 	struct nfs4_lockowner *lo = (struct nfs4_lockowner *)owner;
 
@@ -5456,7 +5456,7 @@ nfsd4_fl_get_owner(fl_owner_t owner)
 }
 
 static void
-nfsd4_fl_put_owner(fl_owner_t owner)
+nfsd4_fl_put_owner(void *owner)
 {
 	struct nfs4_lockowner *lo = (struct nfs4_lockowner *)owner;
 
@@ -5886,7 +5886,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	file_lock = &nbl->nbl_lock;
 	file_lock->fl_type = fl_type;
-	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(&lock_sop->lo_owner));
+	file_lock->fl_owner = lockowner(nfs4_get_stateowner(&lock_sop->lo_owner));
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_file = filp;
 	file_lock->fl_flags = fl_flags;
@@ -6040,7 +6040,7 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	lo = find_lockowner_str(cstate->clp, &lockt->lt_owner);
 	if (lo)
-		file_lock->fl_owner = (fl_owner_t)lo;
+		file_lock->fl_owner = lo;
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_flags = FL_POSIX;
 
@@ -6102,7 +6102,7 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	}
 
 	file_lock->fl_type = F_UNLCK;
-	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(stp->st_stateowner));
+	file_lock->fl_owner = lockowner(nfs4_get_stateowner(stp->st_stateowner));
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_file = filp;
 	file_lock->fl_flags = FL_POSIX;
@@ -6161,7 +6161,7 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
 	if (flctx && !list_empty_careful(&flctx->flc_posix)) {
 		spin_lock(&flctx->flc_lock);
 		list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
-			if (fl->fl_owner == (fl_owner_t)lowner) {
+			if (fl->fl_owner == lowner) {
 				status = true;
 				break;
 			}
diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 2430a0415995..a0c804f75435 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -145,7 +145,7 @@ static struct fsnotify_ops dnotify_fsnotify_ops = {
  * dnotify_struct.  If that was the last dnotify_struct also remove the
  * fsnotify_mark.
  */
-void dnotify_flush(struct file *filp, fl_owner_t id)
+void dnotify_flush(struct file *filp, void *id)
 {
 	struct fsnotify_mark *fsn_mark;
 	struct dnotify_mark *dn_mark;
@@ -223,7 +223,7 @@ static __u32 convert_arg(unsigned long arg)
  * that list, or it |= the mask onto an existing dnofiy_struct.
  */
 static int attach_dn(struct dnotify_struct *dn, struct dnotify_mark *dn_mark,
-		     fl_owner_t id, int fd, struct file *filp, __u32 mask)
+		     void *id, int fd, struct file *filp, __u32 mask)
 {
 	struct dnotify_struct *odn;
 
@@ -259,7 +259,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
 	struct fsnotify_mark *new_fsn_mark, *fsn_mark;
 	struct dnotify_struct *dn;
 	struct inode *inode;
-	fl_owner_t id = current->files;
+	void *id = current->files;
 	struct file *f;
 	int destroy = 0, error = 0;
 	__u32 mask;
diff --git a/fs/open.c b/fs/open.c
index 946c646b39b0..7e9d537e1d32 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1119,7 +1119,7 @@ SYSCALL_DEFINE2(creat, const char __user *, pathname, umode_t, mode)
  * "id" is the POSIX thread ID. We use the
  * files pointer for this..
  */
-int filp_close(struct file *filp, fl_owner_t id)
+int filp_close(struct file *filp, void *id)
 {
 	int retval = 0;
 
diff --git a/include/linux/dnotify.h b/include/linux/dnotify.h
index 3290555a52ee..5c6b6004f2ca 100644
--- a/include/linux/dnotify.h
+++ b/include/linux/dnotify.h
@@ -13,7 +13,7 @@ struct dnotify_struct {
 	__u32			dn_mask;
 	int			dn_fd;
 	struct file *		dn_filp;
-	fl_owner_t		dn_owner;
+	void *			dn_owner;
 };
 
 #ifdef __KERNEL__
@@ -29,12 +29,12 @@ struct dnotify_struct {
 			    FS_MOVED_FROM | FS_MOVED_TO)
 
 extern int dir_notify_enable;
-extern void dnotify_flush(struct file *, fl_owner_t);
+extern void dnotify_flush(struct file *, void *);
 extern int fcntl_dirnotify(int, struct file *, unsigned long);
 
 #else
 
-static inline void dnotify_flush(struct file *filp, fl_owner_t id)
+static inline void dnotify_flush(struct file *filp, void *id)
 {
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2261728cc900..d0c19bb27f79 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -933,9 +933,6 @@ static inline struct file *get_file(struct file *f)
  */
 #define FILE_LOCK_DEFERRED 1
 
-/* legacy typedef, should eventually be removed */
-typedef void *fl_owner_t;
-
 struct file_lock;
 
 struct file_lock_operations {
@@ -946,8 +943,8 @@ struct file_lock_operations {
 struct lock_manager_operations {
 	int (*lm_compare_owner)(struct file_lock *, struct file_lock *);
 	unsigned long (*lm_owner_key)(struct file_lock *);
-	fl_owner_t (*lm_get_owner)(fl_owner_t);
-	void (*lm_put_owner)(fl_owner_t);
+	void *(*lm_get_owner)(void *);
+	void (*lm_put_owner)(void *);
 	void (*lm_notify)(struct file_lock *);	/* unblock callback */
 	int (*lm_grant)(struct file_lock *, int);
 	bool (*lm_break)(struct file_lock *);
@@ -996,7 +993,7 @@ struct file_lock {
 	struct list_head fl_list;	/* link into file_lock_context */
 	struct hlist_node fl_link;	/* node in global lists */
 	struct list_head fl_block;	/* circular list of blocked processes */
-	fl_owner_t fl_owner;
+	void *fl_owner;
 	unsigned int fl_flags;
 	unsigned char fl_type;
 	unsigned int fl_pid;
@@ -1073,7 +1070,7 @@ extern void locks_init_lock(struct file_lock *);
 extern struct file_lock * locks_alloc_lock(void);
 extern void locks_copy_lock(struct file_lock *, struct file_lock *);
 extern void locks_copy_conflock(struct file_lock *, struct file_lock *);
-extern void locks_remove_posix(struct file *, fl_owner_t);
+extern void locks_remove_posix(struct file *, void *);
 extern void locks_remove_file(struct file *);
 extern void locks_release_private(struct file_lock *);
 extern void posix_test_lock(struct file *, struct file_lock *);
@@ -1147,7 +1144,7 @@ static inline void locks_copy_lock(struct file_lock *new, struct file_lock *fl)
 	return;
 }
 
-static inline void locks_remove_posix(struct file *filp, fl_owner_t owner)
+static inline void locks_remove_posix(struct file *filp, void *owner)
 {
 	return;
 }
@@ -1682,7 +1679,7 @@ struct file_operations {
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
-	int (*flush) (struct file *, fl_owner_t id);
+	int (*flush) (struct file *, void *id);
 	int (*release) (struct inode *, struct file *);
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
@@ -2412,7 +2409,7 @@ extern struct file *filp_open(const char *, int, umode_t);
 extern struct file *file_open_root(struct dentry *, struct vfsmount *,
 				   const char *, int, umode_t);
 extern struct file * dentry_open(const struct path *, int, const struct cred *);
-extern int filp_close(struct file *, fl_owner_t id);
+extern int filp_close(struct file *, void *id);
 
 extern struct filename *getname_flags(const char __user *, int, int *);
 extern struct filename *getname(const char __user *);
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 3eca67728366..c7340e4bcd23 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -117,14 +117,14 @@ static inline struct sockaddr *nlm_srcaddr(const struct nlm_host *host)
 }
 
 /*
- * Map an fl_owner_t into a unique 32-bit "pid"
+ * Map a lock owner into a unique 32-bit "pid"
  */
 struct nlm_lockowner {
 	struct list_head list;
 	atomic_t count;
 
 	struct nlm_host *host;
-	fl_owner_t owner;
+	void *owner;
 	uint32_t pid;
 };
 
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 5cc91d6381a3..cfd37076fc4b 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -59,14 +59,14 @@ struct nfs_lock_context {
 	atomic_t count;
 	struct list_head list;
 	struct nfs_open_context *open_context;
-	fl_owner_t lockowner;
+	void *lockowner;
 	atomic_t io_count;
 };
 
 struct nfs4_state;
 struct nfs_open_context {
 	struct nfs_lock_context lock_context;
-	fl_owner_t flock_owner;
+	void *flock_owner;
 	struct dentry *dentry;
 	struct rpc_cred *cred;
 	struct nfs4_state *state;
diff --git a/include/trace/events/filelock.h b/include/trace/events/filelock.h
index 63a7680347cb..10a8b5a1b235 100644
--- a/include/trace/events/filelock.h
+++ b/include/trace/events/filelock.h
@@ -68,7 +68,7 @@ DECLARE_EVENT_CLASS(filelock_lock,
 		__field(unsigned long, i_ino)
 		__field(dev_t, s_dev)
 		__field(struct file_lock *, fl_next)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_pid)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
@@ -122,7 +122,7 @@ DECLARE_EVENT_CLASS(filelock_lease,
 		__field(unsigned long, i_ino)
 		__field(dev_t, s_dev)
 		__field(struct file_lock *, fl_next)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
 		__field(unsigned long, fl_break_time)
@@ -175,7 +175,7 @@ TRACE_EVENT(generic_add_lease,
 		__field(int, dcount)
 		__field(int, icount)
 		__field(dev_t, s_dev)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
 	),
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index eb1391b52c6f..d339f60223b5 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -507,7 +507,7 @@ static ssize_t mqueue_read_file(struct file *filp, char __user *u_data,
 	return ret;
 }
 
-static int mqueue_flush_file(struct file *filp, fl_owner_t id)
+static int mqueue_flush_file(struct file *filp, void *id)
 {
 	struct mqueue_inode_info *info = MQUEUE_I(file_inode(filp));
 

* Re: [PATCH 3/3] nfsd: clients don't need to break their own delegations
  2017-09-07 22:01       ` J. Bruce Fields
  (?)
@ 2017-09-08  5:06       ` NeilBrown
  2017-09-08 15:05           ` J. Bruce Fields
  -1 siblings, 1 reply; 35+ messages in thread
From: NeilBrown @ 2017-09-08  5:06 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, Trond Myklebust

On Thu, Sep 07 2017, J. Bruce Fields wrote:

> On Mon, Aug 28, 2017 at 02:32:53PM +1000, NeilBrown wrote:
>> /* legacy typedef, should eventually be removed */
>> typedef void *fl_owner_t;
>> 
>> 
>> Maybe you could do the world a favor and remove fl_owner_t in a
>> preliminary patch :-)
>
> Partly scripted, still a bit tedious, but I think it's right.  Honestly
> I don't know what the motivation for the comment was, though.  Are there
> no documentation or type-checking benefits to having the typedef?

If it was an established practice throughout the kernel to use typedefs
to differentiate different 'void *', then maybe there would be a
documentation benefit.  Given the wide use of casts (you removed 9 I
think), I don't think there are significant type-checking benefits.

I don't like fl_owner_t because when you see it in the general context
of the kernel, you are likely to think that it means something
important.  Then you go hunting and find "Oh, it is just a void*". (That
is what happened to me:-).  The second reason that I don't like it is
that it requires all those casts that you removed.
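
For illustration only (not part of the patch): a minimal user-space sketch
of the two points above. The typedef line matches the one removed from
include/linux/fs.h; struct owner_a, struct owner_b and the helper
functions are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* The typedef as the patch removes it from include/linux/fs.h. */
typedef void *fl_owner_t;

/* Hypothetical stand-ins for two unrelated owner objects. */
struct owner_a { int id; };
struct owner_b { int id; };

/* Hypothetical helper: owners are compared by address only. */
int same_owner(fl_owner_t x, fl_owner_t y)
{
	return x == y;
}

/* Returns 1: the cast and the implicit conversion give the same pointer. */
int cast_is_redundant(void)
{
	struct owner_a a = { 1 };
	fl_owner_t with_cast = (fl_owner_t)&a;
	fl_owner_t without_cast = &a;	/* object pointers convert to void * implicitly */

	return same_owner(with_cast, without_cast);
}

/*
 * Returns 0 because the two objects have different addresses. The point
 * is that this compiles at all: the typedef does not stop unrelated
 * owner types from being mixed.
 */
int mixes_unrelated_owners(void)
{
	struct owner_a a = { 1 };
	struct owner_b b = { 2 };

	return same_owner(&a, &b);
}
```

Since any object pointer already converts to void * without a cast, the
(fl_owner_t) casts document nothing the compiler enforces.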

Reviewed-by: NeilBrown <neilb@suse.com>

Thanks,
NeilBrown


>
> The main annoyance was having this defined as a files_struct pointer,
> which Christoph fixed some time ago.
>
> --b.
>


* Re: [PATCH 3/3] nfsd: clients don't need to break their own delegations
  2017-09-08  5:06       ` NeilBrown
@ 2017-09-08 15:05           ` J. Bruce Fields
  0 siblings, 0 replies; 35+ messages in thread
From: J. Bruce Fields @ 2017-09-08 15:05 UTC (permalink / raw)
  To: NeilBrown; +Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, Trond Myklebust

On Fri, Sep 08, 2017 at 03:06:24PM +1000, NeilBrown wrote:
> On Thu, Sep 07 2017, J. Bruce Fields wrote:
> 
> > On Mon, Aug 28, 2017 at 02:32:53PM +1000, NeilBrown wrote:
> >> /* legacy typedef, should eventually be removed */
> >> typedef void *fl_owner_t;
> >> 
> >> 
> >> Maybe you could do the world a favor and remove fl_owner_t in a
> >> preliminary patch :-)
> >
> > Partly scripted, still a bit tedious, but I think it's right.  Honestly
> > I don't know what the motivation for the comment was, though.  Are there
> > no documentation or type-checking benefits to having the typedef?
> 
> If it was an established practice throughout the kernel to use typedefs
> to differentiate different 'void *', then maybe there would be a
> documentation benefit.  Given the wide use of casts (you removed 9 I
> think), I don't think there are significant type-checking benefits.
> 
> I don't like fl_owner_t because when you see it in the general context
> of the kernel, you are likely to think that it means something
> important.  Then you go hunting and find "Oh, it is just a void*". (That
> is what happened to me:-).  The second reason that I don't like it is
> that it requires all those casts that you removed.
> 
> Reviewed-by: NeilBrown <neilb@suse.com>

OK, I'll give it a shot.

Turns out I forgot to fold a couple small fixes into the patch before
posting; fixed version follows.
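
For illustration only: one pitfall a scripted typedef-to-pointer
substitution can run into is a declaration that lists several variables.
The earlier posting contains such a line in nfs4_copy_lock_stateid()
("void *fl_owner, fl_flock_owner;"). Whether that is one of the fixes
folded in here is an assumption, since the message does not say. In the
sketch below, owner_t and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void *owner_t;	/* hypothetical stand-in for fl_owner_t */

/* Evaluates to 1 if x has type void *, 0 otherwise (C11 _Generic). */
#define IS_VOID_PTR(x) _Generic((x), void *: 1, default: 0)

/* With the typedef, every declarator in the list is a pointer. */
int typedef_list_is_all_pointers(void)
{
	owner_t a = NULL, b = NULL;

	return IS_VOID_PTR(a) && IS_VOID_PTR(b);
}

/* Without the typedef, each declarator needs its own '*'. */
int expanded_list_is_all_pointers(void)
{
	void *a = NULL, *b = NULL;
	/* "void *a, b;" would declare b as plain void, which does not compile. */

	return IS_VOID_PTR(a) && IS_VOID_PTR(b);
}
```

A plain textual replacement of the type name with "void *" is therefore
safe for single declarations and parameters but needs a manual pass over
multi-variable declarations.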

--b.

Author: J. Bruce Fields <bfields@redhat.com>
Date:   Thu Sep 7 17:45:21 2017 -0400

    vfs: remove unnecessary fl_owner_t typedef
    
    The convention is to avoid this kind of typedef.  It doesn't
    seem useful, and it requires a lot of casts.
    
    Reviewed-by: NeilBrown <neilb@suse.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 73e7d91f03dc..758ca1591b0c 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -864,7 +864,7 @@ struct file_operations {
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*mremap)(struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
-	int (*flush) (struct file *, fl_owner_t id);
+	int (*flush) (struct file *, void *id);
 	int (*release) (struct inode *, struct file *);
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 09f86ebfcc7b..35a185ac0de3 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -1809,7 +1809,7 @@ pfm_syswide_cleanup_other_cpu(pfm_context_t *ctx)
  * When caller is self-monitoring, the context is unloaded.
  */
 static int
-pfm_flush(struct file *filp, fl_owner_t id)
+pfm_flush(struct file *filp, void *id)
 {
 	pfm_context_t *ctx;
 	struct task_struct *task;
diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index ae2f740a82f1..8783335b3b85 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -1720,7 +1720,7 @@ static unsigned int spufs_mfc_poll(struct file *file,poll_table *wait)
 	return mask;
 }
 
-static int spufs_mfc_flush(struct file *file, fl_owner_t id)
+static int spufs_mfc_flush(struct file *file, void *id)
 {
 	struct spu_context *ctx = file->private_data;
 	int ret;
diff --git a/arch/tile/kernel/hardwall.c b/arch/tile/kernel/hardwall.c
index 2fd1694ac1d0..b2cf21d1edb0 100644
--- a/arch/tile/kernel/hardwall.c
+++ b/arch/tile/kernel/hardwall.c
@@ -1030,7 +1030,7 @@ static long hardwall_compat_ioctl(struct file *file,
 #endif
 
 /* The user process closed the file; revoke access to user networks. */
-static int hardwall_flush(struct file *file, fl_owner_t owner)
+static int hardwall_flush(struct file *file, void *owner)
 {
 	struct hardwall_info *info = file->private_data;
 	struct task_struct *task, *tmp;
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index f7665c31feca..02bfce18c912 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -3503,7 +3503,7 @@ static int binder_open(struct inode *nodp, struct file *filp)
 	return 0;
 }
 
-static int binder_flush(struct file *filp, fl_owner_t id)
+static int binder_flush(struct file *filp, void *id)
 {
 	struct binder_proc *proc = filp->private_data;
 
diff --git a/drivers/char/ps3flash.c b/drivers/char/ps3flash.c
index b526dc15c271..5a03dd0eb2f1 100644
--- a/drivers/char/ps3flash.c
+++ b/drivers/char/ps3flash.c
@@ -281,7 +281,7 @@ static ssize_t ps3flash_kernel_write(const void *buf, size_t count,
 	return res;
 }
 
-static int ps3flash_flush(struct file *file, fl_owner_t id)
+static int ps3flash_flush(struct file *file, void *id)
 {
 	return ps3flash_writeback(ps3flash_dev);
 }
diff --git a/drivers/char/xillybus/xillybus_core.c b/drivers/char/xillybus/xillybus_core.c
index b6c9cdead7f3..7e04a9df51e3 100644
--- a/drivers/char/xillybus/xillybus_core.c
+++ b/drivers/char/xillybus/xillybus_core.c
@@ -1156,7 +1156,7 @@ static int xillybus_myflush(struct xilly_channel *channel, long timeout)
 	return rc;
 }
 
-static int xillybus_flush(struct file *filp, fl_owner_t id)
+static int xillybus_flush(struct file *filp, void *id)
 {
 	if (!(filp->f_mode & FMODE_WRITE))
 		return 0;
diff --git a/drivers/firmware/efi/capsule-loader.c b/drivers/firmware/efi/capsule-loader.c
index ec8ac5c4dd84..f4d0c4805ec7 100644
--- a/drivers/firmware/efi/capsule-loader.c
+++ b/drivers/firmware/efi/capsule-loader.c
@@ -225,7 +225,7 @@ static ssize_t efi_capsule_write(struct file *file, const char __user *buff,
  *	will be treated as upload termination and will free those completed
  *	buffer pages and -ECANCELED will be returned.
  **/
-static int efi_capsule_flush(struct file *file, fl_owner_t id)
+static int efi_capsule_flush(struct file *file, void *id)
 {
 	int ret = 0;
 	struct capsule_info *cap_info = file->private_data;
diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index 925571475005..85117fecc292 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -342,7 +342,7 @@ static int evdev_fasync(int fd, struct file *file, int on)
 	return fasync_helper(fd, file, on, &client->fasync);
 }
 
-static int evdev_flush(struct file *file, fl_owner_t id)
+static int evdev_flush(struct file *file, void *id)
 {
 	struct evdev_client *client = file->private_data;
 	struct evdev *evdev = client->evdev;
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index f7e826142a72..c446a1450714 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -48,7 +48,7 @@ static unsigned int scif_fdpoll(struct file *f, poll_table *wait)
 	return __scif_pollfd(f, wait, priv);
 }
 
-static int scif_fdflush(struct file *f, fl_owner_t id)
+static int scif_fdflush(struct file *f, void *id)
 {
 	struct scif_endpt *ep = f->private_data;
 
diff --git a/drivers/scsi/osst.c b/drivers/scsi/osst.c
index 929ee7e88120..366bc57f0bfb 100644
--- a/drivers/scsi/osst.c
+++ b/drivers/scsi/osst.c
@@ -4822,7 +4822,7 @@ static int os_scsi_tape_open(struct inode * inode, struct file * filp)
 
 
 /* Flush the tape buffer before close */
-static int os_scsi_tape_flush(struct file * filp, fl_owner_t id)
+static int os_scsi_tape_flush(struct file * filp, void *id)
 {
 	int		      result = 0, result2;
 	struct osst_tape    * STp    = filp->private_data;
diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index 8e5013d9cad4..7d97641fcca9 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -1338,7 +1338,7 @@ static int st_open(struct inode *inode, struct file *filp)
 \f

 
 /* Flush the tape buffer before close */
-static int st_flush(struct file *filp, fl_owner_t id)
+static int st_flush(struct file *filp, void *id)
 {
 	int result = 0, result2;
 	unsigned char cmd[MAX_COMMAND_SIZE];
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index ab1c85c1ed38..5f12ea3f26d3 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2278,7 +2278,7 @@ static loff_t ll_file_seek(struct file *file, loff_t offset, int origin)
 					ll_file_maxbytes(inode), eof);
 }
 
-static int ll_flush(struct file *file, fl_owner_t id)
+static int ll_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	struct ll_inode_info *lli = ll_i2info(inode);
diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index 8f972247b1c1..cb248f3d9565 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -582,7 +582,7 @@ static ssize_t wdm_read
 	return rv;
 }
 
-static int wdm_flush(struct file *file, fl_owner_t id)
+static int wdm_flush(struct file *file, void *id)
 {
 	struct wdm_device *desc = file->private_data;
 
diff --git a/drivers/usb/usb-skeleton.c b/drivers/usb/usb-skeleton.c
index bb0bd732e29a..547fe43678e8 100644
--- a/drivers/usb/usb-skeleton.c
+++ b/drivers/usb/usb-skeleton.c
@@ -136,7 +136,7 @@ static int skel_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int skel_flush(struct file *file, fl_owner_t id)
+static int skel_flush(struct file *file, void *id)
 {
 	struct usb_skel *dev;
 	int res;
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 82e16556afea..210838d2fa85 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -728,7 +728,7 @@ extern int afs_writepages(struct address_space *, struct writeback_control *);
 extern void afs_pages_written_back(struct afs_vnode *, struct afs_call *);
 extern ssize_t afs_file_write(struct kiocb *, struct iov_iter *);
 extern int afs_writeback_all(struct afs_vnode *);
-extern int afs_flush(struct file *, fl_owner_t);
+extern int afs_flush(struct file *, void *);
 extern int afs_fsync(struct file *, loff_t, loff_t, int);
 
 /*
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 2d2fccd5044b..a38f03bd6859 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -767,7 +767,7 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  * Flush out all outstanding writes on a file opened for writing when it is
  * closed.
  */
-int afs_flush(struct file *file, fl_owner_t id)
+int afs_flush(struct file *file, void *id)
 {
 	_enter("");
 
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 30bf89b1fd9a..914e5af73d32 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -108,7 +108,7 @@ extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from);
 extern int cifs_lock(struct file *, int, struct file_lock *);
 extern int cifs_fsync(struct file *, loff_t, loff_t, int);
 extern int cifs_strict_fsync(struct file *, loff_t, loff_t, int);
-extern int cifs_flush(struct file *, fl_owner_t id);
+extern int cifs_flush(struct file *, void *id);
 extern int cifs_file_mmap(struct file * , struct vm_area_struct *);
 extern int cifs_file_strict_mmap(struct file * , struct vm_area_struct *);
 extern const struct file_operations cifs_dir_ops;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index bc09df6b473a..bda1fdf46937 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1165,7 +1165,7 @@ cifs_push_mandatory_locks(struct cifsFileInfo *cfile)
 }
 
 static __u32
-hash_lockowner(fl_owner_t owner)
+hash_lockowner(void *owner)
 {
 	return cifs_lock_secret ^ hash32_ptr((const void *)owner);
 }
@@ -2399,7 +2399,7 @@ int cifs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  * As file closes, flush all cached write data for this inode checking
  * for write behind errors.
  */
-int cifs_flush(struct file *file, fl_owner_t id)
+int cifs_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	int rc = 0;
diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index ca4e83750214..535039b38da8 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -290,7 +290,7 @@ static int ecryptfs_dir_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int ecryptfs_flush(struct file *file, fl_owner_t td)
+static int ecryptfs_flush(struct file *file, void *td)
 {
 	struct file *lower_file = ecryptfs_file_to_lower(file);
 
diff --git a/fs/exofs/file.c b/fs/exofs/file.c
index 28645f0640f7..d9e3c7ca3a0c 100644
--- a/fs/exofs/file.c
+++ b/fs/exofs/file.c
@@ -58,7 +58,7 @@ static int exofs_file_fsync(struct file *filp, loff_t start, loff_t end,
 	return ret;
 }
 
-static int exofs_flush(struct file *file, fl_owner_t id)
+static int exofs_flush(struct file *file, void *id)
 {
 	int ret = vfs_fsync(file, 0);
 	/* TODO: Flush the OSD target */
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 3ee4fdc3da9e..89fe98020374 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -254,8 +254,7 @@ void fuse_release_common(struct file *file, int opcode)
 	if (ff->flock) {
 		struct fuse_release_in *inarg = &req->misc.release.in;
 		inarg->release_flags |= FUSE_RELEASE_FLOCK_UNLOCK;
-		inarg->lock_owner = fuse_lock_owner_id(ff->fc,
-						       (fl_owner_t) file);
+		inarg->lock_owner = fuse_lock_owner_id(ff->fc, file);
 	}
 	/* Hold inode until release is finished */
 	req->misc.release.inode = igrab(file_inode(file));
@@ -307,7 +306,7 @@ EXPORT_SYMBOL_GPL(fuse_sync_release);
  * Scramble the ID space with XTEA, so that the value of the files_struct
  * pointer is not exposed to userspace.
  */
-u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id)
+u64 fuse_lock_owner_id(struct fuse_conn *fc, void *id)
 {
 	u32 *k = fc->scramble_key;
 	u64 v = (unsigned long) id;
@@ -390,7 +389,7 @@ static void fuse_sync_writes(struct inode *inode)
 	fuse_release_nowrite(inode);
 }
 
-static int fuse_flush(struct file *file, fl_owner_t id)
+static int fuse_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	struct fuse_conn *fc = get_fuse_conn(inode);
@@ -643,7 +642,7 @@ static size_t fuse_async_req_send(struct fuse_conn *fc, struct fuse_req *req,
 }
 
 static size_t fuse_send_read(struct fuse_req *req, struct fuse_io_priv *io,
-			     loff_t pos, size_t count, fl_owner_t owner)
+			     loff_t pos, size_t count, void *owner)
 {
 	struct file *file = io->file;
 	struct fuse_file *ff = file->private_data;
@@ -955,7 +954,7 @@ static void fuse_write_fill(struct fuse_req *req, struct fuse_file *ff,
 }
 
 static size_t fuse_send_write(struct fuse_req *req, struct fuse_io_priv *io,
-			      loff_t pos, size_t count, fl_owner_t owner)
+			      loff_t pos, size_t count, void *owner)
 {
 	struct file *file = io->file;
 	struct fuse_file *ff = file->private_data;
@@ -1348,7 +1347,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 
 	while (count) {
 		size_t nres;
-		fl_owner_t owner = current->files;
+		void *owner = current->files;
 		size_t nbytes = min(count, nmax);
 		err = fuse_get_user_pages(req, iter, &nbytes, write);
 		if (err && !nbytes)
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 1bd7ffdad593..5f28f493b40a 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -900,7 +900,7 @@ int fuse_valid_type(int m);
  */
 int fuse_allow_current_process(struct fuse_conn *fc);
 
-u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id);
+u64 fuse_lock_owner_id(struct fuse_conn *fc, void *id);
 
 void fuse_update_ctime(struct inode *inode);
 
diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index 066ac313ae5c..1734c05f3182 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -81,7 +81,7 @@ static inline uint32_t __nlm_alloc_pid(struct nlm_host *host)
 	return res;
 }
 
-static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, fl_owner_t owner)
+static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, void *owner)
 {
 	struct nlm_lockowner *lockowner;
 	list_for_each_entry(lockowner, &host->h_lockowners, list) {
@@ -92,7 +92,7 @@ static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, fl_owne
 	return NULL;
 }
 
-static struct nlm_lockowner *nlm_find_lockowner(struct nlm_host *host, fl_owner_t owner)
+static struct nlm_lockowner *nlm_find_lockowner(struct nlm_host *host, void *owner)
 {
 	struct nlm_lockowner *res, *new = NULL;
 
diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 82925f17ec45..2d72631ea968 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -45,7 +45,7 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 
 		/* Set up the missing parts of the file_lock structure */
 		lock->fl.fl_file  = file->f_file;
-		lock->fl.fl_owner = (fl_owner_t) host;
+		lock->fl.fl_owner = host;
 		lock->fl.fl_lmops = &nlmsvc_lock_operations;
 	}
 
diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c
index 07915162581d..655a8daee20e 100644
--- a/fs/lockd/svcproc.c
+++ b/fs/lockd/svcproc.c
@@ -75,7 +75,7 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 
 		/* Set up the missing parts of the file_lock structure */
 		lock->fl.fl_file  = file->f_file;
-		lock->fl.fl_owner = (fl_owner_t) host;
+		lock->fl.fl_owner = host;
 		lock->fl.fl_lmops = &nlmsvc_lock_operations;
 	}
 
diff --git a/fs/locks.c b/fs/locks.c
index afefeb4ad6de..df7971a0175b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2424,7 +2424,7 @@ int fcntl_setlk64(unsigned int fd, struct file *filp, unsigned int cmd,
  * from the task's fd array.  POSIX locks belonging to this task
  * are deleted at this time.
  */
-void locks_remove_posix(struct file *filp, fl_owner_t owner)
+void locks_remove_posix(struct file *filp, void *owner)
 {
 	int error;
 	struct inode *inode = locks_inode(filp);
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index af330c31f627..bca440fc09ff 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -136,7 +136,7 @@ EXPORT_SYMBOL_GPL(nfs_file_llseek);
  * Flush all dirty pages, and check for write errors.
  */
 static int
-nfs_file_flush(struct file *file, fl_owner_t id)
+nfs_file_flush(struct file *file, void *id)
 {
 	struct inode	*inode = file_inode(file);
 
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 109279d6d91b..bb2cc33c3753 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -901,7 +901,7 @@ struct nfs_open_context *alloc_nfs_open_context(struct dentry *dentry,
 	ctx->mode = f_mode;
 	ctx->flags = 0;
 	ctx->error = 0;
-	ctx->flock_owner = (fl_owner_t)filp;
+	ctx->flock_owner = filp;
 	nfs_init_lock_context(&ctx->lock_context);
 	ctx->lock_context.open_context = ctx;
 	INIT_LIST_HEAD(&ctx->list);
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 40bd05f05e74..0fbe16684519 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -145,7 +145,7 @@ struct nfs4_lock_state {
 	struct nfs_seqid_counter	ls_seqid;
 	nfs4_stateid		ls_stateid;
 	atomic_t		ls_count;
-	fl_owner_t		ls_owner;
+	void *			ls_owner;
 };
 
 /* bits for nfs4_state->flags */
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 0efba77789b9..1e09d9e4dd20 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -107,7 +107,7 @@ nfs4_file_open(struct inode *inode, struct file *filp)
  * Flush all dirty pages, and check for write errors.
  */
 static int
-nfs4_file_flush(struct file *file, fl_owner_t id)
+nfs4_file_flush(struct file *file, void *id)
 {
 	struct inode	*inode = file_inode(file);
 
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 0378e2257ca7..8293afb5b5b7 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -813,7 +813,7 @@ void nfs4_close_sync(struct nfs4_state *state, fmode_t fmode)
  */
 static struct nfs4_lock_state *
 __nfs4_find_lock_state(struct nfs4_state *state,
-		       fl_owner_t fl_owner, fl_owner_t fl_owner2)
+		       void *fl_owner, void *fl_owner2)
 {
 	struct nfs4_lock_state *pos, *ret = NULL;
 	list_for_each_entry(pos, &state->lock_states, ls_locks) {
@@ -834,7 +834,7 @@ __nfs4_find_lock_state(struct nfs4_state *state,
  * exists, return an uninitialized one.
  *
  */
-static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, fl_owner_t fl_owner)
+static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, void *fl_owner)
 {
 	struct nfs4_lock_state *lsp;
 	struct nfs_server *server = state->owner->so_server;
@@ -868,7 +868,7 @@ void nfs4_free_lock_state(struct nfs_server *server, struct nfs4_lock_state *lsp
  * exists, return an uninitialized one.
  *
  */
-static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, fl_owner_t owner)
+static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, void *owner)
 {
 	struct nfs4_lock_state *lsp, *new = NULL;
 	
@@ -959,7 +959,7 @@ static int nfs4_copy_lock_stateid(nfs4_stateid *dst,
 		const struct nfs_lock_context *l_ctx)
 {
 	struct nfs4_lock_state *lsp;
-	fl_owner_t fl_owner, fl_flock_owner;
+	void *fl_owner, *fl_flock_owner;
 	int ret = -ENOENT;
 
 	if (l_ctx == NULL)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0c04f81aa63b..28e5bded6a4b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1195,7 +1195,7 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
 
 	file = find_any_file(stp->st_stid.sc_file);
 	if (file)
-		filp_close(file, (fl_owner_t)lo);
+		filp_close(file, lo);
 	nfs4_free_ol_stateid(stid);
 }
 
@@ -4148,7 +4148,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)
 	fl->fl_flags = FL_DELEG;
 	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->fl_end = OFFSET_MAX;
-	fl->fl_owner = (fl_owner_t)fp;
+	fl->fl_owner = fp;
 	fl->fl_pid = current->tgid;
 	return fl;
 }
@@ -5411,8 +5411,8 @@ nfs4_transform_lock_offset(struct file_lock *lock)
 		lock->fl_end = OFFSET_MAX;
 }
 
-static fl_owner_t
-nfsd4_fl_get_owner(fl_owner_t owner)
+static void *
+nfsd4_fl_get_owner(void *owner)
 {
 	struct nfs4_lockowner *lo = (struct nfs4_lockowner *)owner;
 
@@ -5421,7 +5421,7 @@ nfsd4_fl_get_owner(fl_owner_t owner)
 }
 
 static void
-nfsd4_fl_put_owner(fl_owner_t owner)
+nfsd4_fl_put_owner(void *owner)
 {
 	struct nfs4_lockowner *lo = (struct nfs4_lockowner *)owner;
 
@@ -5851,7 +5851,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	file_lock = &nbl->nbl_lock;
 	file_lock->fl_type = fl_type;
-	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(&lock_sop->lo_owner));
+	file_lock->fl_owner = lockowner(nfs4_get_stateowner(&lock_sop->lo_owner));
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_file = filp;
 	file_lock->fl_flags = fl_flags;
@@ -6005,7 +6005,7 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	lo = find_lockowner_str(cstate->clp, &lockt->lt_owner);
 	if (lo)
-		file_lock->fl_owner = (fl_owner_t)lo;
+		file_lock->fl_owner = lo;
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_flags = FL_POSIX;
 
@@ -6067,7 +6067,7 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	}
 
 	file_lock->fl_type = F_UNLCK;
-	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(stp->st_stateowner));
+	file_lock->fl_owner = lockowner(nfs4_get_stateowner(stp->st_stateowner));
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_file = filp;
 	file_lock->fl_flags = FL_POSIX;
@@ -6126,7 +6126,7 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
 	if (flctx && !list_empty_careful(&flctx->flc_posix)) {
 		spin_lock(&flctx->flc_lock);
 		list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
-			if (fl->fl_owner == (fl_owner_t)lowner) {
+			if (fl->fl_owner == lowner) {
 				status = true;
 				break;
 			}
diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 2430a0415995..a0c804f75435 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -145,7 +145,7 @@ static struct fsnotify_ops dnotify_fsnotify_ops = {
  * dnotify_struct.  If that was the last dnotify_struct also remove the
  * fsnotify_mark.
  */
-void dnotify_flush(struct file *filp, fl_owner_t id)
+void dnotify_flush(struct file *filp, void *id)
 {
 	struct fsnotify_mark *fsn_mark;
 	struct dnotify_mark *dn_mark;
@@ -223,7 +223,7 @@ static __u32 convert_arg(unsigned long arg)
  * that list, or it |= the mask onto an existing dnofiy_struct.
  */
 static int attach_dn(struct dnotify_struct *dn, struct dnotify_mark *dn_mark,
-		     fl_owner_t id, int fd, struct file *filp, __u32 mask)
+		     void *id, int fd, struct file *filp, __u32 mask)
 {
 	struct dnotify_struct *odn;
 
@@ -259,7 +259,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
 	struct fsnotify_mark *new_fsn_mark, *fsn_mark;
 	struct dnotify_struct *dn;
 	struct inode *inode;
-	fl_owner_t id = current->files;
+	void *id = current->files;
 	struct file *f;
 	int destroy = 0, error = 0;
 	__u32 mask;
diff --git a/fs/open.c b/fs/open.c
index 35bb784763a4..a2330170ad7c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1123,7 +1123,7 @@ SYSCALL_DEFINE2(creat, const char __user *, pathname, umode_t, mode)
  * "id" is the POSIX thread ID. We use the
  * files pointer for this..
  */
-int filp_close(struct file *filp, fl_owner_t id)
+int filp_close(struct file *filp, void *id)
 {
 	int retval = 0;
 
diff --git a/include/linux/dnotify.h b/include/linux/dnotify.h
index 3290555a52ee..5c6b6004f2ca 100644
--- a/include/linux/dnotify.h
+++ b/include/linux/dnotify.h
@@ -13,7 +13,7 @@ struct dnotify_struct {
 	__u32			dn_mask;
 	int			dn_fd;
 	struct file *		dn_filp;
-	fl_owner_t		dn_owner;
+	void *			dn_owner;
 };
 
 #ifdef __KERNEL__
@@ -29,12 +29,12 @@ struct dnotify_struct {
 			    FS_MOVED_FROM | FS_MOVED_TO)
 
 extern int dir_notify_enable;
-extern void dnotify_flush(struct file *, fl_owner_t);
+extern void dnotify_flush(struct file *, void *);
 extern int fcntl_dirnotify(int, struct file *, unsigned long);
 
 #else
 
-static inline void dnotify_flush(struct file *filp, fl_owner_t id)
+static inline void dnotify_flush(struct file *filp, void *id)
 {
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e1fd5d21248..a561d1a33e8f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -933,9 +933,6 @@ static inline struct file *get_file(struct file *f)
  */
 #define FILE_LOCK_DEFERRED 1
 
-/* legacy typedef, should eventually be removed */
-typedef void *fl_owner_t;
-
 struct file_lock;
 
 struct file_lock_operations {
@@ -946,8 +943,8 @@ struct file_lock_operations {
 struct lock_manager_operations {
 	int (*lm_compare_owner)(struct file_lock *, struct file_lock *);
 	unsigned long (*lm_owner_key)(struct file_lock *);
-	fl_owner_t (*lm_get_owner)(fl_owner_t);
-	void (*lm_put_owner)(fl_owner_t);
+	void *(*lm_get_owner)(void *);
+	void (*lm_put_owner)(void *);
 	void (*lm_notify)(struct file_lock *);	/* unblock callback */
 	int (*lm_grant)(struct file_lock *, int);
 	bool (*lm_break)(struct file_lock *);
@@ -995,7 +992,7 @@ struct file_lock {
 	struct list_head fl_list;	/* link into file_lock_context */
 	struct hlist_node fl_link;	/* node in global lists */
 	struct list_head fl_block;	/* circular list of blocked processes */
-	fl_owner_t fl_owner;
+	void *fl_owner;
 	unsigned int fl_flags;
 	unsigned char fl_type;
 	unsigned int fl_pid;
@@ -1072,7 +1069,7 @@ extern void locks_init_lock(struct file_lock *);
 extern struct file_lock * locks_alloc_lock(void);
 extern void locks_copy_lock(struct file_lock *, struct file_lock *);
 extern void locks_copy_conflock(struct file_lock *, struct file_lock *);
-extern void locks_remove_posix(struct file *, fl_owner_t);
+extern void locks_remove_posix(struct file *, void *);
 extern void locks_remove_file(struct file *);
 extern void locks_release_private(struct file_lock *);
 extern void posix_test_lock(struct file *, struct file_lock *);
@@ -1146,7 +1143,7 @@ static inline void locks_copy_lock(struct file_lock *new, struct file_lock *fl)
 	return;
 }
 
-static inline void locks_remove_posix(struct file *filp, fl_owner_t owner)
+static inline void locks_remove_posix(struct file *filp, void *owner)
 {
 	return;
 }
@@ -1675,7 +1672,7 @@ struct file_operations {
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
-	int (*flush) (struct file *, fl_owner_t id);
+	int (*flush) (struct file *, void *id);
 	int (*release) (struct inode *, struct file *);
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
@@ -2359,7 +2356,7 @@ extern struct file *filp_open(const char *, int, umode_t);
 extern struct file *file_open_root(struct dentry *, struct vfsmount *,
 				   const char *, int, umode_t);
 extern struct file * dentry_open(const struct path *, int, const struct cred *);
-extern int filp_close(struct file *, fl_owner_t id);
+extern int filp_close(struct file *, void *id);
 
 extern struct filename *getname_flags(const char __user *, int, int *);
 extern struct filename *getname(const char __user *);
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 3eca67728366..c7340e4bcd23 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -117,14 +117,14 @@ static inline struct sockaddr *nlm_srcaddr(const struct nlm_host *host)
 }
 
 /*
- * Map an fl_owner_t into a unique 32-bit "pid"
+ * Map a lock owner into a unique 32-bit "pid"
  */
 struct nlm_lockowner {
 	struct list_head list;
 	atomic_t count;
 
 	struct nlm_host *host;
-	fl_owner_t owner;
+	void *owner;
 	uint32_t pid;
 };
 
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 5cc91d6381a3..cfd37076fc4b 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -59,14 +59,14 @@ struct nfs_lock_context {
 	atomic_t count;
 	struct list_head list;
 	struct nfs_open_context *open_context;
-	fl_owner_t lockowner;
+	void *lockowner;
 	atomic_t io_count;
 };
 
 struct nfs4_state;
 struct nfs_open_context {
 	struct nfs_lock_context lock_context;
-	fl_owner_t flock_owner;
+	void *flock_owner;
 	struct dentry *dentry;
 	struct rpc_cred *cred;
 	struct nfs4_state *state;
diff --git a/include/trace/events/filelock.h b/include/trace/events/filelock.h
index 63a7680347cb..10a8b5a1b235 100644
--- a/include/trace/events/filelock.h
+++ b/include/trace/events/filelock.h
@@ -68,7 +68,7 @@ DECLARE_EVENT_CLASS(filelock_lock,
 		__field(unsigned long, i_ino)
 		__field(dev_t, s_dev)
 		__field(struct file_lock *, fl_next)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_pid)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
@@ -122,7 +122,7 @@ DECLARE_EVENT_CLASS(filelock_lease,
 		__field(unsigned long, i_ino)
 		__field(dev_t, s_dev)
 		__field(struct file_lock *, fl_next)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
 		__field(unsigned long, fl_break_time)
@@ -175,7 +175,7 @@ TRACE_EVENT(generic_add_lease,
 		__field(int, dcount)
 		__field(int, icount)
 		__field(dev_t, s_dev)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
 	),
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index eb1391b52c6f..d339f60223b5 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -507,7 +507,7 @@ static ssize_t mqueue_read_file(struct file *filp, char __user *u_data,
 	return ret;
 }
 
-static int mqueue_flush_file(struct file *filp, fl_owner_t id)
+static int mqueue_flush_file(struct file *filp, void *id)
 {
 	struct mqueue_inode_info *info = MQUEUE_I(file_inode(filp));
 


* Re: [PATCH 3/3] nfsd: clients don't need to break their own delegations
@ 2017-09-08 15:05           ` J. Bruce Fields
  0 siblings, 0 replies; 35+ messages in thread
From: J. Bruce Fields @ 2017-09-08 15:05 UTC (permalink / raw)
  To: NeilBrown; +Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, Trond Myklebust

On Fri, Sep 08, 2017 at 03:06:24PM +1000, NeilBrown wrote:
> On Thu, Sep 07 2017, J. Bruce Fields wrote:
> 
> > On Mon, Aug 28, 2017 at 02:32:53PM +1000, NeilBrown wrote:
> >> /* legacy typedef, should eventually be removed */
> >> typedef void *fl_owner_t;
> >> 
> >> 
> >> Maybe you could do the world a favor and remove fl_owner_t in a
> >> preliminary patch :-)
> >
> > Partly scripted, still a bit tedious, but I think it's right.  Honestly
> > I don't know what the motivation for the comment was, though.  Are there
> > no documentation or type-checking benefits to having the typedef?
> 
> If it was an established practice throughout the kernel to use typedefs
> to differentiate different 'void *', then maybe there would be a
> documentation benefit.  Given the wide use of casts (you removed 9 I
> think), I don't think there are significant type-checking benefits.
> 
> I don't like fl_owner_t because when you see it in the general context
> of the kernel, you are likely to think that it means something
> important.  Then you go hunting and find "Oh, it is just a void*". (That
> is what happened to me:-).  The second reason that I don't like it is
> that it requires all those casts that you removed.
> 
> Reviewed-by: NeilBrown <neilb@suse.com>
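
To make the point about casts concrete: in C any object pointer converts
to and from void * implicitly, so a typedef that is just void * gives the
compiler nothing extra to check, and the casts at the call sites are
noise. The following is a minimal userspace sketch under that assumption,
not kernel code. Every demo_* name is hypothetical and exists only for
illustration; only the fl_owner field name is borrowed from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the removed typedef. */
typedef void *demo_owner_t;

/* Hypothetical stand-ins for kernel structures, not the real layouts. */
struct demo_lockowner { int id; };
struct demo_unrelated { char tag; };

struct demo_file_lock {
	void *fl_owner;		/* opaque owner cookie, plain void * */
};

/* Takes the typedef'd type; behaves exactly like a void * parameter. */
static void demo_set_owner_typedef(struct demo_file_lock *fl, demo_owner_t owner)
{
	fl->fl_owner = owner;
}

/* Same thing with the typedef spelled out. */
static void demo_set_owner(struct demo_file_lock *fl, void *owner)
{
	fl->fl_owner = owner;
}

/* Owner comparison is a plain pointer comparison, no cast required. */
static bool demo_owned_by(const struct demo_file_lock *fl, const void *owner)
{
	return fl->fl_owner == owner;
}

/* Exercises the helpers; returns 0 if every check holds. */
static int demo_run(void)
{
	struct demo_lockowner lo = { 1 };
	struct demo_unrelated other = { 'x' };
	struct demo_file_lock fl = { NULL };

	/* struct pointer -> void *: implicit, so no cast is needed. */
	demo_set_owner(&fl, &lo);
	assert(demo_owned_by(&fl, &lo));
	assert(!demo_owned_by(&fl, &other));

	/* The typedef accepts an unrelated pointer just as silently,
	 * so it provides no type checking either. */
	demo_set_owner_typedef(&fl, &other);
	assert(demo_owned_by(&fl, &other));

	return 0;
}
```

Calling demo_run() from a test harness or from main() exercises both
setters. The takeaway is only that the two signatures are interchangeable
to the compiler, which is why the casts can go away along with the typedef.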

OK, I'll give it a shot.

Turns out I forgot to fold a couple small fixes into the patch before
posting; fixed version follows.

--b.

Author: J. Bruce Fields <bfields@redhat.com>
Date:   Thu Sep 7 17:45:21 2017 -0400

    vfs: remove unnecessary fl_owner_t typedef
    
    The convention is to avoid this kind of typedef.  It doesn't
    seem useful, and it requires a lot of casts.
    
    Reviewed-by: NeilBrown <neilb@suse.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 73e7d91f03dc..758ca1591b0c 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -864,7 +864,7 @@ struct file_operations {
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*mremap)(struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
-	int (*flush) (struct file *, fl_owner_t id);
+	int (*flush) (struct file *, void *id);
 	int (*release) (struct inode *, struct file *);
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 09f86ebfcc7b..35a185ac0de3 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -1809,7 +1809,7 @@ pfm_syswide_cleanup_other_cpu(pfm_context_t *ctx)
  * When caller is self-monitoring, the context is unloaded.
  */
 static int
-pfm_flush(struct file *filp, fl_owner_t id)
+pfm_flush(struct file *filp, void *id)
 {
 	pfm_context_t *ctx;
 	struct task_struct *task;
diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index ae2f740a82f1..8783335b3b85 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -1720,7 +1720,7 @@ static unsigned int spufs_mfc_poll(struct file *file,poll_table *wait)
 	return mask;
 }
 
-static int spufs_mfc_flush(struct file *file, fl_owner_t id)
+static int spufs_mfc_flush(struct file *file, void *id)
 {
 	struct spu_context *ctx = file->private_data;
 	int ret;
diff --git a/arch/tile/kernel/hardwall.c b/arch/tile/kernel/hardwall.c
index 2fd1694ac1d0..b2cf21d1edb0 100644
--- a/arch/tile/kernel/hardwall.c
+++ b/arch/tile/kernel/hardwall.c
@@ -1030,7 +1030,7 @@ static long hardwall_compat_ioctl(struct file *file,
 #endif
 
 /* The user process closed the file; revoke access to user networks. */
-static int hardwall_flush(struct file *file, fl_owner_t owner)
+static int hardwall_flush(struct file *file, void *owner)
 {
 	struct hardwall_info *info = file->private_data;
 	struct task_struct *task, *tmp;
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index f7665c31feca..02bfce18c912 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -3503,7 +3503,7 @@ static int binder_open(struct inode *nodp, struct file *filp)
 	return 0;
 }
 
-static int binder_flush(struct file *filp, fl_owner_t id)
+static int binder_flush(struct file *filp, void *id)
 {
 	struct binder_proc *proc = filp->private_data;
 
diff --git a/drivers/char/ps3flash.c b/drivers/char/ps3flash.c
index b526dc15c271..5a03dd0eb2f1 100644
--- a/drivers/char/ps3flash.c
+++ b/drivers/char/ps3flash.c
@@ -281,7 +281,7 @@ static ssize_t ps3flash_kernel_write(const void *buf, size_t count,
 	return res;
 }
 
-static int ps3flash_flush(struct file *file, fl_owner_t id)
+static int ps3flash_flush(struct file *file, void *id)
 {
 	return ps3flash_writeback(ps3flash_dev);
 }
diff --git a/drivers/char/xillybus/xillybus_core.c b/drivers/char/xillybus/xillybus_core.c
index b6c9cdead7f3..7e04a9df51e3 100644
--- a/drivers/char/xillybus/xillybus_core.c
+++ b/drivers/char/xillybus/xillybus_core.c
@@ -1156,7 +1156,7 @@ static int xillybus_myflush(struct xilly_channel *channel, long timeout)
 	return rc;
 }
 
-static int xillybus_flush(struct file *filp, fl_owner_t id)
+static int xillybus_flush(struct file *filp, void *id)
 {
 	if (!(filp->f_mode & FMODE_WRITE))
 		return 0;
diff --git a/drivers/firmware/efi/capsule-loader.c b/drivers/firmware/efi/capsule-loader.c
index ec8ac5c4dd84..f4d0c4805ec7 100644
--- a/drivers/firmware/efi/capsule-loader.c
+++ b/drivers/firmware/efi/capsule-loader.c
@@ -225,7 +225,7 @@ static ssize_t efi_capsule_write(struct file *file, const char __user *buff,
  *	will be treated as upload termination and will free those completed
  *	buffer pages and -ECANCELED will be returned.
  **/
-static int efi_capsule_flush(struct file *file, fl_owner_t id)
+static int efi_capsule_flush(struct file *file, void *id)
 {
 	int ret = 0;
 	struct capsule_info *cap_info = file->private_data;
diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index 925571475005..85117fecc292 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -342,7 +342,7 @@ static int evdev_fasync(int fd, struct file *file, int on)
 	return fasync_helper(fd, file, on, &client->fasync);
 }
 
-static int evdev_flush(struct file *file, fl_owner_t id)
+static int evdev_flush(struct file *file, void *id)
 {
 	struct evdev_client *client = file->private_data;
 	struct evdev *evdev = client->evdev;
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index f7e826142a72..c446a1450714 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -48,7 +48,7 @@ static unsigned int scif_fdpoll(struct file *f, poll_table *wait)
 	return __scif_pollfd(f, wait, priv);
 }
 
-static int scif_fdflush(struct file *f, fl_owner_t id)
+static int scif_fdflush(struct file *f, void *id)
 {
 	struct scif_endpt *ep = f->private_data;
 
diff --git a/drivers/scsi/osst.c b/drivers/scsi/osst.c
index 929ee7e88120..366bc57f0bfb 100644
--- a/drivers/scsi/osst.c
+++ b/drivers/scsi/osst.c
@@ -4822,7 +4822,7 @@ static int os_scsi_tape_open(struct inode * inode, struct file * filp)
 
 
 /* Flush the tape buffer before close */
-static int os_scsi_tape_flush(struct file * filp, fl_owner_t id)
+static int os_scsi_tape_flush(struct file * filp, void *id)
 {
 	int		      result = 0, result2;
 	struct osst_tape    * STp    = filp->private_data;
diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index 8e5013d9cad4..7d97641fcca9 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -1338,7 +1338,7 @@ static int st_open(struct inode *inode, struct file *filp)
 \f
 
 /* Flush the tape buffer before close */
-static int st_flush(struct file *filp, fl_owner_t id)
+static int st_flush(struct file *filp, void *id)
 {
 	int result = 0, result2;
 	unsigned char cmd[MAX_COMMAND_SIZE];
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index ab1c85c1ed38..5f12ea3f26d3 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2278,7 +2278,7 @@ static loff_t ll_file_seek(struct file *file, loff_t offset, int origin)
 					ll_file_maxbytes(inode), eof);
 }
 
-static int ll_flush(struct file *file, fl_owner_t id)
+static int ll_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	struct ll_inode_info *lli = ll_i2info(inode);
diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index 8f972247b1c1..cb248f3d9565 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -582,7 +582,7 @@ static ssize_t wdm_read
 	return rv;
 }
 
-static int wdm_flush(struct file *file, fl_owner_t id)
+static int wdm_flush(struct file *file, void *id)
 {
 	struct wdm_device *desc = file->private_data;
 
diff --git a/drivers/usb/usb-skeleton.c b/drivers/usb/usb-skeleton.c
index bb0bd732e29a..547fe43678e8 100644
--- a/drivers/usb/usb-skeleton.c
+++ b/drivers/usb/usb-skeleton.c
@@ -136,7 +136,7 @@ static int skel_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int skel_flush(struct file *file, fl_owner_t id)
+static int skel_flush(struct file *file, void *id)
 {
 	struct usb_skel *dev;
 	int res;
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 82e16556afea..210838d2fa85 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -728,7 +728,7 @@ extern int afs_writepages(struct address_space *, struct writeback_control *);
 extern void afs_pages_written_back(struct afs_vnode *, struct afs_call *);
 extern ssize_t afs_file_write(struct kiocb *, struct iov_iter *);
 extern int afs_writeback_all(struct afs_vnode *);
-extern int afs_flush(struct file *, fl_owner_t);
+extern int afs_flush(struct file *, void *);
 extern int afs_fsync(struct file *, loff_t, loff_t, int);
 
 /*
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 2d2fccd5044b..a38f03bd6859 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -767,7 +767,7 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  * Flush out all outstanding writes on a file opened for writing when it is
  * closed.
  */
-int afs_flush(struct file *file, fl_owner_t id)
+int afs_flush(struct file *file, void *id)
 {
 	_enter("");
 
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 30bf89b1fd9a..914e5af73d32 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -108,7 +108,7 @@ extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from);
 extern int cifs_lock(struct file *, int, struct file_lock *);
 extern int cifs_fsync(struct file *, loff_t, loff_t, int);
 extern int cifs_strict_fsync(struct file *, loff_t, loff_t, int);
-extern int cifs_flush(struct file *, fl_owner_t id);
+extern int cifs_flush(struct file *, void *id);
 extern int cifs_file_mmap(struct file * , struct vm_area_struct *);
 extern int cifs_file_strict_mmap(struct file * , struct vm_area_struct *);
 extern const struct file_operations cifs_dir_ops;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index bc09df6b473a..bda1fdf46937 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1165,7 +1165,7 @@ cifs_push_mandatory_locks(struct cifsFileInfo *cfile)
 }
 
 static __u32
-hash_lockowner(fl_owner_t owner)
+hash_lockowner(void *owner)
 {
 	return cifs_lock_secret ^ hash32_ptr((const void *)owner);
 }
@@ -2399,7 +2399,7 @@ int cifs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  * As file closes, flush all cached write data for this inode checking
  * for write behind errors.
  */
-int cifs_flush(struct file *file, fl_owner_t id)
+int cifs_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	int rc = 0;
diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index ca4e83750214..535039b38da8 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -290,7 +290,7 @@ static int ecryptfs_dir_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int ecryptfs_flush(struct file *file, fl_owner_t td)
+static int ecryptfs_flush(struct file *file, void *td)
 {
 	struct file *lower_file = ecryptfs_file_to_lower(file);
 
diff --git a/fs/exofs/file.c b/fs/exofs/file.c
index 28645f0640f7..d9e3c7ca3a0c 100644
--- a/fs/exofs/file.c
+++ b/fs/exofs/file.c
@@ -58,7 +58,7 @@ static int exofs_file_fsync(struct file *filp, loff_t start, loff_t end,
 	return ret;
 }
 
-static int exofs_flush(struct file *file, fl_owner_t id)
+static int exofs_flush(struct file *file, void *id)
 {
 	int ret = vfs_fsync(file, 0);
 	/* TODO: Flush the OSD target */
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 3ee4fdc3da9e..89fe98020374 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -254,8 +254,7 @@ void fuse_release_common(struct file *file, int opcode)
 	if (ff->flock) {
 		struct fuse_release_in *inarg = &req->misc.release.in;
 		inarg->release_flags |= FUSE_RELEASE_FLOCK_UNLOCK;
-		inarg->lock_owner = fuse_lock_owner_id(ff->fc,
-						       (fl_owner_t) file);
+		inarg->lock_owner = fuse_lock_owner_id(ff->fc, file);
 	}
 	/* Hold inode until release is finished */
 	req->misc.release.inode = igrab(file_inode(file));
@@ -307,7 +306,7 @@ EXPORT_SYMBOL_GPL(fuse_sync_release);
  * Scramble the ID space with XTEA, so that the value of the files_struct
  * pointer is not exposed to userspace.
  */
-u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id)
+u64 fuse_lock_owner_id(struct fuse_conn *fc, void *id)
 {
 	u32 *k = fc->scramble_key;
 	u64 v = (unsigned long) id;
@@ -390,7 +389,7 @@ static void fuse_sync_writes(struct inode *inode)
 	fuse_release_nowrite(inode);
 }
 
-static int fuse_flush(struct file *file, fl_owner_t id)
+static int fuse_flush(struct file *file, void *id)
 {
 	struct inode *inode = file_inode(file);
 	struct fuse_conn *fc = get_fuse_conn(inode);
@@ -643,7 +642,7 @@ static size_t fuse_async_req_send(struct fuse_conn *fc, struct fuse_req *req,
 }
 
 static size_t fuse_send_read(struct fuse_req *req, struct fuse_io_priv *io,
-			     loff_t pos, size_t count, fl_owner_t owner)
+			     loff_t pos, size_t count, void *owner)
 {
 	struct file *file = io->file;
 	struct fuse_file *ff = file->private_data;
@@ -955,7 +954,7 @@ static void fuse_write_fill(struct fuse_req *req, struct fuse_file *ff,
 }
 
 static size_t fuse_send_write(struct fuse_req *req, struct fuse_io_priv *io,
-			      loff_t pos, size_t count, fl_owner_t owner)
+			      loff_t pos, size_t count, void *owner)
 {
 	struct file *file = io->file;
 	struct fuse_file *ff = file->private_data;
@@ -1348,7 +1347,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 
 	while (count) {
 		size_t nres;
-		fl_owner_t owner = current->files;
+		void *owner = current->files;
 		size_t nbytes = min(count, nmax);
 		err = fuse_get_user_pages(req, iter, &nbytes, write);
 		if (err && !nbytes)
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 1bd7ffdad593..5f28f493b40a 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -900,7 +900,7 @@ int fuse_valid_type(int m);
  */
 int fuse_allow_current_process(struct fuse_conn *fc);
 
-u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id);
+u64 fuse_lock_owner_id(struct fuse_conn *fc, void *id);
 
 void fuse_update_ctime(struct inode *inode);
 
diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index 066ac313ae5c..1734c05f3182 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -81,7 +81,7 @@ static inline uint32_t __nlm_alloc_pid(struct nlm_host *host)
 	return res;
 }
 
-static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, fl_owner_t owner)
+static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, void *owner)
 {
 	struct nlm_lockowner *lockowner;
 	list_for_each_entry(lockowner, &host->h_lockowners, list) {
@@ -92,7 +92,7 @@ static struct nlm_lockowner *__nlm_find_lockowner(struct nlm_host *host, fl_owne
 	return NULL;
 }
 
-static struct nlm_lockowner *nlm_find_lockowner(struct nlm_host *host, fl_owner_t owner)
+static struct nlm_lockowner *nlm_find_lockowner(struct nlm_host *host, void *owner)
 {
 	struct nlm_lockowner *res, *new = NULL;
 
diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 82925f17ec45..2d72631ea968 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -45,7 +45,7 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 
 		/* Set up the missing parts of the file_lock structure */
 		lock->fl.fl_file  = file->f_file;
-		lock->fl.fl_owner = (fl_owner_t) host;
+		lock->fl.fl_owner = host;
 		lock->fl.fl_lmops = &nlmsvc_lock_operations;
 	}
 
diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c
index 07915162581d..655a8daee20e 100644
--- a/fs/lockd/svcproc.c
+++ b/fs/lockd/svcproc.c
@@ -75,7 +75,7 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 
 		/* Set up the missing parts of the file_lock structure */
 		lock->fl.fl_file  = file->f_file;
-		lock->fl.fl_owner = (fl_owner_t) host;
+		lock->fl.fl_owner = host;
 		lock->fl.fl_lmops = &nlmsvc_lock_operations;
 	}
 
diff --git a/fs/locks.c b/fs/locks.c
index afefeb4ad6de..df7971a0175b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2424,7 +2424,7 @@ int fcntl_setlk64(unsigned int fd, struct file *filp, unsigned int cmd,
  * from the task's fd array.  POSIX locks belonging to this task
  * are deleted at this time.
  */
-void locks_remove_posix(struct file *filp, fl_owner_t owner)
+void locks_remove_posix(struct file *filp, void *owner)
 {
 	int error;
 	struct inode *inode = locks_inode(filp);
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index af330c31f627..bca440fc09ff 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -136,7 +136,7 @@ EXPORT_SYMBOL_GPL(nfs_file_llseek);
  * Flush all dirty pages, and check for write errors.
  */
 static int
-nfs_file_flush(struct file *file, fl_owner_t id)
+nfs_file_flush(struct file *file, void *id)
 {
 	struct inode	*inode = file_inode(file);
 
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 109279d6d91b..bb2cc33c3753 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -901,7 +901,7 @@ struct nfs_open_context *alloc_nfs_open_context(struct dentry *dentry,
 	ctx->mode = f_mode;
 	ctx->flags = 0;
 	ctx->error = 0;
-	ctx->flock_owner = (fl_owner_t)filp;
+	ctx->flock_owner = filp;
 	nfs_init_lock_context(&ctx->lock_context);
 	ctx->lock_context.open_context = ctx;
 	INIT_LIST_HEAD(&ctx->list);
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 40bd05f05e74..0fbe16684519 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -145,7 +145,7 @@ struct nfs4_lock_state {
 	struct nfs_seqid_counter	ls_seqid;
 	nfs4_stateid		ls_stateid;
 	atomic_t		ls_count;
-	fl_owner_t		ls_owner;
+	void *			ls_owner;
 };
 
 /* bits for nfs4_state->flags */
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 0efba77789b9..1e09d9e4dd20 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -107,7 +107,7 @@ nfs4_file_open(struct inode *inode, struct file *filp)
  * Flush all dirty pages, and check for write errors.
  */
 static int
-nfs4_file_flush(struct file *file, fl_owner_t id)
+nfs4_file_flush(struct file *file, void *id)
 {
 	struct inode	*inode = file_inode(file);
 
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 0378e2257ca7..8293afb5b5b7 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -813,7 +813,7 @@ void nfs4_close_sync(struct nfs4_state *state, fmode_t fmode)
  */
 static struct nfs4_lock_state *
 __nfs4_find_lock_state(struct nfs4_state *state,
-		       fl_owner_t fl_owner, fl_owner_t fl_owner2)
+		       void *fl_owner, void *fl_owner2)
 {
 	struct nfs4_lock_state *pos, *ret = NULL;
 	list_for_each_entry(pos, &state->lock_states, ls_locks) {
@@ -834,7 +834,7 @@ __nfs4_find_lock_state(struct nfs4_state *state,
  * exists, return an uninitialized one.
  *
  */
-static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, fl_owner_t fl_owner)
+static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, void *fl_owner)
 {
 	struct nfs4_lock_state *lsp;
 	struct nfs_server *server = state->owner->so_server;
@@ -868,7 +868,7 @@ void nfs4_free_lock_state(struct nfs_server *server, struct nfs4_lock_state *lsp
  * exists, return an uninitialized one.
  *
  */
-static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, fl_owner_t owner)
+static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, void *owner)
 {
 	struct nfs4_lock_state *lsp, *new = NULL;
 	
@@ -959,7 +959,7 @@ static int nfs4_copy_lock_stateid(nfs4_stateid *dst,
 		const struct nfs_lock_context *l_ctx)
 {
 	struct nfs4_lock_state *lsp;
-	fl_owner_t fl_owner, fl_flock_owner;
+	void *fl_owner, *fl_flock_owner;
 	int ret = -ENOENT;
 
 	if (l_ctx == NULL)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0c04f81aa63b..28e5bded6a4b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1195,7 +1195,7 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
 
 	file = find_any_file(stp->st_stid.sc_file);
 	if (file)
-		filp_close(file, (fl_owner_t)lo);
+		filp_close(file, lo);
 	nfs4_free_ol_stateid(stid);
 }
 
@@ -4148,7 +4148,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)
 	fl->fl_flags = FL_DELEG;
 	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->fl_end = OFFSET_MAX;
-	fl->fl_owner = (fl_owner_t)fp;
+	fl->fl_owner = fp;
 	fl->fl_pid = current->tgid;
 	return fl;
 }
@@ -5411,8 +5411,8 @@ nfs4_transform_lock_offset(struct file_lock *lock)
 		lock->fl_end = OFFSET_MAX;
 }
 
-static fl_owner_t
-nfsd4_fl_get_owner(fl_owner_t owner)
+static void *
+nfsd4_fl_get_owner(void *owner)
 {
 	struct nfs4_lockowner *lo = (struct nfs4_lockowner *)owner;
 
@@ -5421,7 +5421,7 @@ nfsd4_fl_get_owner(fl_owner_t owner)
 }
 
 static void
-nfsd4_fl_put_owner(fl_owner_t owner)
+nfsd4_fl_put_owner(void *owner)
 {
 	struct nfs4_lockowner *lo = (struct nfs4_lockowner *)owner;
 
@@ -5851,7 +5851,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	file_lock = &nbl->nbl_lock;
 	file_lock->fl_type = fl_type;
-	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(&lock_sop->lo_owner));
+	file_lock->fl_owner = lockowner(nfs4_get_stateowner(&lock_sop->lo_owner));
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_file = filp;
 	file_lock->fl_flags = fl_flags;
@@ -6005,7 +6005,7 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	lo = find_lockowner_str(cstate->clp, &lockt->lt_owner);
 	if (lo)
-		file_lock->fl_owner = (fl_owner_t)lo;
+		file_lock->fl_owner = lo;
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_flags = FL_POSIX;
 
@@ -6067,7 +6067,7 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	}
 
 	file_lock->fl_type = F_UNLCK;
-	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(stp->st_stateowner));
+	file_lock->fl_owner = lockowner(nfs4_get_stateowner(stp->st_stateowner));
 	file_lock->fl_pid = current->tgid;
 	file_lock->fl_file = filp;
 	file_lock->fl_flags = FL_POSIX;
@@ -6126,7 +6126,7 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
 	if (flctx && !list_empty_careful(&flctx->flc_posix)) {
 		spin_lock(&flctx->flc_lock);
 		list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
-			if (fl->fl_owner == (fl_owner_t)lowner) {
+			if (fl->fl_owner == lowner) {
 				status = true;
 				break;
 			}
diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 2430a0415995..a0c804f75435 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -145,7 +145,7 @@ static struct fsnotify_ops dnotify_fsnotify_ops = {
  * dnotify_struct.  If that was the last dnotify_struct also remove the
  * fsnotify_mark.
  */
-void dnotify_flush(struct file *filp, fl_owner_t id)
+void dnotify_flush(struct file *filp, void *id)
 {
 	struct fsnotify_mark *fsn_mark;
 	struct dnotify_mark *dn_mark;
@@ -223,7 +223,7 @@ static __u32 convert_arg(unsigned long arg)
  * that list, or it |= the mask onto an existing dnofiy_struct.
  */
 static int attach_dn(struct dnotify_struct *dn, struct dnotify_mark *dn_mark,
-		     fl_owner_t id, int fd, struct file *filp, __u32 mask)
+		     void *id, int fd, struct file *filp, __u32 mask)
 {
 	struct dnotify_struct *odn;
 
@@ -259,7 +259,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
 	struct fsnotify_mark *new_fsn_mark, *fsn_mark;
 	struct dnotify_struct *dn;
 	struct inode *inode;
-	fl_owner_t id = current->files;
+	void *id = current->files;
 	struct file *f;
 	int destroy = 0, error = 0;
 	__u32 mask;
diff --git a/fs/open.c b/fs/open.c
index 35bb784763a4..a2330170ad7c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1123,7 +1123,7 @@ SYSCALL_DEFINE2(creat, const char __user *, pathname, umode_t, mode)
  * "id" is the POSIX thread ID. We use the
  * files pointer for this..
  */
-int filp_close(struct file *filp, fl_owner_t id)
+int filp_close(struct file *filp, void *id)
 {
 	int retval = 0;
 
diff --git a/include/linux/dnotify.h b/include/linux/dnotify.h
index 3290555a52ee..5c6b6004f2ca 100644
--- a/include/linux/dnotify.h
+++ b/include/linux/dnotify.h
@@ -13,7 +13,7 @@ struct dnotify_struct {
 	__u32			dn_mask;
 	int			dn_fd;
 	struct file *		dn_filp;
-	fl_owner_t		dn_owner;
+	void *			dn_owner;
 };
 
 #ifdef __KERNEL__
@@ -29,12 +29,12 @@ struct dnotify_struct {
 			    FS_MOVED_FROM | FS_MOVED_TO)
 
 extern int dir_notify_enable;
-extern void dnotify_flush(struct file *, fl_owner_t);
+extern void dnotify_flush(struct file *, void *);
 extern int fcntl_dirnotify(int, struct file *, unsigned long);
 
 #else
 
-static inline void dnotify_flush(struct file *filp, fl_owner_t id)
+static inline void dnotify_flush(struct file *filp, void *id)
 {
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e1fd5d21248..a561d1a33e8f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -933,9 +933,6 @@ static inline struct file *get_file(struct file *f)
  */
 #define FILE_LOCK_DEFERRED 1
 
-/* legacy typedef, should eventually be removed */
-typedef void *fl_owner_t;
-
 struct file_lock;
 
 struct file_lock_operations {
@@ -946,8 +943,8 @@ struct file_lock_operations {
 struct lock_manager_operations {
 	int (*lm_compare_owner)(struct file_lock *, struct file_lock *);
 	unsigned long (*lm_owner_key)(struct file_lock *);
-	fl_owner_t (*lm_get_owner)(fl_owner_t);
-	void (*lm_put_owner)(fl_owner_t);
+	void *(*lm_get_owner)(void *);
+	void (*lm_put_owner)(void *);
 	void (*lm_notify)(struct file_lock *);	/* unblock callback */
 	int (*lm_grant)(struct file_lock *, int);
 	bool (*lm_break)(struct file_lock *);
@@ -995,7 +992,7 @@ struct file_lock {
 	struct list_head fl_list;	/* link into file_lock_context */
 	struct hlist_node fl_link;	/* node in global lists */
 	struct list_head fl_block;	/* circular list of blocked processes */
-	fl_owner_t fl_owner;
+	void *fl_owner;
 	unsigned int fl_flags;
 	unsigned char fl_type;
 	unsigned int fl_pid;
@@ -1072,7 +1069,7 @@ extern void locks_init_lock(struct file_lock *);
 extern struct file_lock * locks_alloc_lock(void);
 extern void locks_copy_lock(struct file_lock *, struct file_lock *);
 extern void locks_copy_conflock(struct file_lock *, struct file_lock *);
-extern void locks_remove_posix(struct file *, fl_owner_t);
+extern void locks_remove_posix(struct file *, void *);
 extern void locks_remove_file(struct file *);
 extern void locks_release_private(struct file_lock *);
 extern void posix_test_lock(struct file *, struct file_lock *);
@@ -1146,7 +1143,7 @@ static inline void locks_copy_lock(struct file_lock *new, struct file_lock *fl)
 	return;
 }
 
-static inline void locks_remove_posix(struct file *filp, fl_owner_t owner)
+static inline void locks_remove_posix(struct file *filp, void *owner)
 {
 	return;
 }
@@ -1675,7 +1672,7 @@ struct file_operations {
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
-	int (*flush) (struct file *, fl_owner_t id);
+	int (*flush) (struct file *, void *id);
 	int (*release) (struct inode *, struct file *);
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
@@ -2359,7 +2356,7 @@ extern struct file *filp_open(const char *, int, umode_t);
 extern struct file *file_open_root(struct dentry *, struct vfsmount *,
 				   const char *, int, umode_t);
 extern struct file * dentry_open(const struct path *, int, const struct cred *);
-extern int filp_close(struct file *, fl_owner_t id);
+extern int filp_close(struct file *, void *id);
 
 extern struct filename *getname_flags(const char __user *, int, int *);
 extern struct filename *getname(const char __user *);
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 3eca67728366..c7340e4bcd23 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -117,14 +117,14 @@ static inline struct sockaddr *nlm_srcaddr(const struct nlm_host *host)
 }
 
 /*
- * Map an fl_owner_t into a unique 32-bit "pid"
+ * Map a lock owner into a unique 32-bit "pid"
  */
 struct nlm_lockowner {
 	struct list_head list;
 	atomic_t count;
 
 	struct nlm_host *host;
-	fl_owner_t owner;
+	void *owner;
 	uint32_t pid;
 };
 
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 5cc91d6381a3..cfd37076fc4b 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -59,14 +59,14 @@ struct nfs_lock_context {
 	atomic_t count;
 	struct list_head list;
 	struct nfs_open_context *open_context;
-	fl_owner_t lockowner;
+	void *lockowner;
 	atomic_t io_count;
 };
 
 struct nfs4_state;
 struct nfs_open_context {
 	struct nfs_lock_context lock_context;
-	fl_owner_t flock_owner;
+	void *flock_owner;
 	struct dentry *dentry;
 	struct rpc_cred *cred;
 	struct nfs4_state *state;
diff --git a/include/trace/events/filelock.h b/include/trace/events/filelock.h
index 63a7680347cb..10a8b5a1b235 100644
--- a/include/trace/events/filelock.h
+++ b/include/trace/events/filelock.h
@@ -68,7 +68,7 @@ DECLARE_EVENT_CLASS(filelock_lock,
 		__field(unsigned long, i_ino)
 		__field(dev_t, s_dev)
 		__field(struct file_lock *, fl_next)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_pid)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
@@ -122,7 +122,7 @@ DECLARE_EVENT_CLASS(filelock_lease,
 		__field(unsigned long, i_ino)
 		__field(dev_t, s_dev)
 		__field(struct file_lock *, fl_next)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
 		__field(unsigned long, fl_break_time)
@@ -175,7 +175,7 @@ TRACE_EVENT(generic_add_lease,
 		__field(int, dcount)
 		__field(int, icount)
 		__field(dev_t, s_dev)
-		__field(fl_owner_t, fl_owner)
+		__field(void *, fl_owner)
 		__field(unsigned int, fl_flags)
 		__field(unsigned char, fl_type)
 	),
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index eb1391b52c6f..d339f60223b5 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -507,7 +507,7 @@ static ssize_t mqueue_read_file(struct file *filp, char __user *u_data,
 	return ret;
 }
 
-static int mqueue_flush_file(struct file *filp, fl_owner_t id)
+static int mqueue_flush_file(struct file *filp, void *id)
 {
 	struct mqueue_inode_info *info = MQUEUE_I(file_inode(filp));
 

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-09-07  0:43                           ` NeilBrown
@ 2017-09-08 15:06                             ` J. Bruce Fields
  0 siblings, 0 replies; 35+ messages in thread
From: J. Bruce Fields @ 2017-09-08 15:06 UTC (permalink / raw)
  To: NeilBrown; +Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, Trond Myklebust

On Thu, Sep 07, 2017 at 10:43:29AM +1000, NeilBrown wrote:
> 
> >> +		host_err = nfsd_conflicting_leases(dentry, rqstp);
> >> +		host_err = host_err ?: notify_change(dentry, &size_attr, nfsd_deleg_owner, NULL);
> >
> > And then you recall nfsd delegations and delegations held by
> > (hypothetical) non-nfsd users separately, OK (also ignoring how).
> >
> > There are no such users currently, so nfsd could just pass NULL.
> 
> I don't think so.  If we pass NULL (as the owner), then the VFS will recall
> the one nfsd delegation that we want to preserve. ???

Oops, right.

--b.

* Re: [PATCH 2/3] fs: hide another detail of delegation logic
  2017-09-06 16:03                         ` J. Bruce Fields
  2017-09-07  0:43                           ` NeilBrown
@ 2018-03-16 14:42                           ` J. Bruce Fields
  1 sibling, 0 replies; 35+ messages in thread
From: J. Bruce Fields @ 2018-03-16 14:42 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: NeilBrown, linux-nfs, linux-fsdevel, Trond Myklebust

It's taken me a while to get back to this, sorry!  I'll try not to
assume anyone remembers the previous discussion....

On Wed, Sep 06, 2017 at 12:03:42PM -0400, J. Bruce Fields wrote:
> Gah, I hate having to patch every notify_change caller.  But maybe I
> should get over that, the resulting logic is simpler.

I went a little further down that path and now I think that we missed
some callers, and that this is a bigger problem than I previously
thought.

So to decide whether a given file operation conflicts with a
lease/delegation, we need to know "who" is performing the operation.  If
we track that by explicitly passing that "who", then we're adding a new
argument to every vfs function between nfsd and the break_lease call.
That'll only get worse, since:

	1. I want to add write delegations next, and that'll introduce
	   more delegation-breaking operations; and
	2. I imagine we'll want to make this feature available to
	   userspace eventually too (for servers like Samba and Ganesha),
	   and that'll mean propagating this even further.

We also considered doing the lease break in nfsd/vfs.c and adding some
new locking to prevent grants of new leases/delegations over an nfsd
operation.  I haven't tried that yet, but I don't like it being so
nfsd-specific, and I don't like having to do this on each operation.

So, I'm experimenting with passing the identity of the lease breaker in
the task instead; more precisely, in the task's struct cred.  Does that
make sense?

It certainly makes for a short patch.  The below applies after a bunch
of nfsd delegation code cleanup.
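
As a rough illustration of the idea (a userspace sketch, not the kernel
code), the breaker's identity can ride along in the cred and be compared
against the lease holder by a lock-manager callback.  The struct layouts,
`task_cred`, `would_recall()`, `struct client` and `struct delegation`
below are simplified stand-ins invented for this example; only the names
`lm_breaker_owns_lease`, `leases_conflict`, `fl_owner`, `fl_lmops` and
`lease_breaker` are taken from the patch that follows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct file_lock;

struct lock_manager_operations {
	/* Return true if the breaker is the holder of this lease. */
	bool (*lm_breaker_owns_lease)(void *breaker, struct file_lock *lease);
};

struct file_lock {
	void *fl_owner;
	const struct lock_manager_operations *fl_lmops;
};

/* Stand-in for struct cred: the breaker's identity travels with the task. */
struct cred {
	void *lease_breaker;
};

/* Models current_cred() for a single task (assumption for the sketch). */
struct cred task_cred;

struct client { int id; };
struct delegation { struct client *dl_client; };

/* The lease owner is a delegation; compare its client with the breaker. */
bool breaker_owns_lease(void *breaker, struct file_lock *lease)
{
	struct delegation *dl = lease->fl_owner;

	return dl->dl_client == breaker;
}

const struct lock_manager_operations deleg_ops = {
	.lm_breaker_owns_lease = breaker_owns_lease,
};

/* Simplified conflict check: only the self-conflict test is modeled. */
bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
{
	assert(lease && breaker);
	if (lease->fl_lmops && lease->fl_lmops->lm_breaker_owns_lease &&
	    breaker->fl_owner &&
	    lease->fl_lmops->lm_breaker_owns_lease(breaker->fl_owner, lease))
		return false;
	return true;
}

/* Models the lease-break path: no identity argument, it comes from the cred. */
bool would_recall(struct file_lock *lease)
{
	struct file_lock breaker = { .fl_owner = task_cred.lease_breaker };

	return leases_conflict(lease, &breaker);
}
```

With `task_cred.lease_breaker` set to the client that holds the
delegation, `would_recall()` returns false.  With a different client, or
with NULL (a breaker that carries no identity), it returns true, so the
delegation would still be recalled.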

--b.

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 75d2d57e2c44..739c55825330 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -366,6 +366,7 @@ prototypes:
 	int (*lm_grant)(struct file_lock *, struct file_lock *, int);
 	void (*lm_break)(struct file_lock *); /* break_lease callback */
 	int (*lm_change)(struct file_lock **, int);
+	bool (*lm_breaker_owns_lease)(void *, struct file_lock *);
 
 locking rules:
 
@@ -376,6 +377,7 @@ lm_notify:		yes		yes			no
 lm_grant:		no		no			no
 lm_break:		yes		no			no
 lm_change		yes		no			no
+lm_breaker_owns_lease:	no		no			no
 
 [1]:	->lm_compare_owner and ->lm_owner_key are generally called with
 *an* inode->i_lock held. It may not be the i_lock of the inode
diff --git a/fs/locks.c b/fs/locks.c
index 63aa52bcdf5a..22ed02b20559 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1414,6 +1414,9 @@ static void time_out_leases(struct inode *inode, struct list_head *dispose)
 
 static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
 {
+	if (lease->fl_lmops->lm_breaker_owns_lease && breaker->fl_owner &&
+	    lease->fl_lmops->lm_breaker_owns_lease(breaker->fl_owner, lease))
+		return false;
 	if ((breaker->fl_flags & FL_LAYOUT) != (lease->fl_flags & FL_LAYOUT))
 		return false;
 	if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE))
@@ -1462,6 +1465,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 	if (IS_ERR(new_fl))
 		return PTR_ERR(new_fl);
 	new_fl->fl_flags = type;
+	new_fl->fl_owner = current_cred()->lease_breaker;
 
 	/* typically we will check that ctx is non-NULL before calling */
 	ctx = smp_load_acquire(&inode->i_flctx);
diff --git a/fs/nfsd/auth.c b/fs/nfsd/auth.c
index fdf2aad73470..d2562ae21b8e 100644
--- a/fs/nfsd/auth.c
+++ b/fs/nfsd/auth.c
@@ -81,6 +81,8 @@ int nfsd_setuser(struct svc_rqst *rqstp, struct svc_export *exp)
 	else
 		new->cap_effective = cap_raise_nfsd_set(new->cap_effective,
 							new->cap_permitted);
+	if (rqstp->rq_lease_breaker)
+		new->lease_breaker = *rqstp->rq_lease_breaker;
 	validate_process_creds();
 	put_cred(override_creds(new));
 	put_cred(new);
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index a0bed2b2004d..e1341dbaf657 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1713,6 +1713,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
 		op->status = status;
 		goto encode_op;
 	}
+	rqstp->rq_lease_breaker = (void **)&cstate->clp;
 
 	while (!status && resp->opcnt < args->opcnt) {
 		op = &args->ops[resp->opcnt++];
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 87e546f3f792..5d9e9877a49e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3927,6 +3927,13 @@ nfsd_break_deleg_cb(struct file_lock *fl)
 	return ret;
 }
 
+static bool nfsd_breaker_owns_lease(void *breaker, struct file_lock *fl)
+{
+	struct nfs4_delegation *dl = fl->fl_owner;
+
+	return dl->dl_stid.sc_client == breaker;
+}
+
 static int
 nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
 		     struct list_head *dispose)
@@ -3938,6 +3945,7 @@ nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
 }
 
 static const struct lock_manager_operations nfsd_lease_mng_ops = {
+	.lm_breaker_owns_lease = nfsd_breaker_owns_lease,
 	.lm_break = nfsd_break_deleg_cb,
 	.lm_change = nfsd_change_deleg_cb,
 };
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 89cb484f1cfb..c4e86a0a8dd3 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -803,6 +803,7 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32 *statp)
 		*statp = rpc_garbage_args;
 		return 1;
 	}
+	rqstp->rq_lease_breaker = NULL;
 	/*
 	 * Give the xdr decoder a chance to change this if it wants
 	 * (necessary in the NFSv4.0 compound case)
diff --git a/include/linux/cred.h b/include/linux/cred.h
index 631286535d0f..d567c27eebae 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -139,6 +139,9 @@ struct cred {
 	struct key	*thread_keyring; /* keyring private to this thread */
 	struct key	*request_key_auth; /* assumed request_key authority */
 #endif
+#ifdef CONFIG_FILE_LOCKING
+	void		*lease_breaker; /* identify NFS client breaking a delegation */
+#endif
 #ifdef CONFIG_SECURITY
 	void		*security;	/* subjective LSM security */
 #endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ef9269bf7e69..43e1d2f47cb3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -959,6 +959,7 @@ struct lock_manager_operations {
 	bool (*lm_break)(struct file_lock *);
 	int (*lm_change)(struct file_lock *, int, struct list_head *);
 	void (*lm_setup)(struct file_lock *, void **);
+	bool (*lm_breaker_owns_lease)(void *, struct file_lock *);
 };
 
 struct lock_manager {
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 786ae2255f05..f2bbfee1662a 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -293,6 +293,7 @@ struct svc_rqst {
 	struct svc_cacherep *	rq_cacherep;	/* cache info */
 	struct task_struct	*rq_task;	/* service thread */
 	spinlock_t		rq_lock;	/* per-request lock */
+	void **			rq_lease_breaker; /* The v4 client breaking a lease */
 };
 
 #define SVC_NET(svc_rqst)	(svc_rqst->rq_xprt->xpt_net)
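
One detail in the patch above is that rq_lease_breaker is a `void **`
rather than a `void *`.  A plausible reading, which the message does not
state, is that cstate->clp is not yet filled in when compound processing
starts, so the request records the address of that slot and
nfsd_setuser() copies whatever value is there when it builds the new
cred.  A userspace sketch of that late binding follows; all types and
the helpers `begin_compound()` and `set_user()` are hypothetical
stand-ins, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct client { int id; };

/* Per-compound state: the client is unknown until some op identifies it. */
struct compound_state {
	void *clp;		/* stand-in for the client pointer */
};

/* Per-request state: remembers where to look, not what was there. */
struct request {
	void **lease_breaker;
};

struct cred {
	void *lease_breaker;
};

/* Models the start of compound processing: record the slot's address. */
void begin_compound(struct request *rq, struct compound_state *cs)
{
	rq->lease_breaker = &cs->clp;
}

/* Models credential setup: copy the slot's current value, if there is one. */
void set_user(const struct request *rq, struct cred *new)
{
	assert(rq && new);
	new->lease_breaker = NULL;
	if (rq->lease_breaker)
		new->lease_breaker = *rq->lease_breaker;
}
```

In this model a cred built before the client is known carries a NULL
breaker, while one built after `clp` is assigned picks up the client
without the request having to be updated again.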

* Re: [PATCH 3/3] nfsd: clients don't need to break their own delegations
  2017-08-29 21:49     ` J. Bruce Fields
@ 2018-03-16 14:43       ` J. Bruce Fields
  0 siblings, 0 replies; 35+ messages in thread
From: J. Bruce Fields @ 2018-03-16 14:43 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: NeilBrown, linux-nfs, linux-fsdevel, Trond Myklebust

On Tue, Aug 29, 2017 at 05:49:07PM -0400, J. Bruce Fields wrote:
> Now we get to harder questions....
> 
> On Mon, Aug 28, 2017 at 02:32:53PM +1000, NeilBrown wrote:
> > Would this all turn out to be easier if nfsd took a separate lease for
> > each client?  What would be the cost of that?
> 
> I'll have to remind myself.

OK, done and I think it's actually nicer that way.  I'll post the whole
series later today.

--b.

Thread overview: 35+ messages
2017-08-25 21:52 [PATCH 0/3] Eliminate delegation self-conflicts J. Bruce Fields
2017-08-25 21:52 ` [PATCH 1/3] fs: cleanup to hide some details of delegation logic J. Bruce Fields
2017-08-28  3:54   ` NeilBrown
2017-08-29 21:37     ` J. Bruce Fields
2017-08-30 19:50       ` Jeff Layton
2017-08-31 21:10         ` J. Bruce Fields
2017-08-31 23:13           ` Jeff Layton
2017-08-25 21:52 ` [PATCH 2/3] fs: hide another detail " J. Bruce Fields
2017-08-28  4:43   ` NeilBrown
2017-08-29 21:40     ` J. Bruce Fields
2017-08-30  0:43       ` NeilBrown
2017-08-30 17:09         ` J. Bruce Fields
2017-08-30 23:26           ` NeilBrown
2017-08-31 19:05             ` J. Bruce Fields
2017-08-31 23:27               ` NeilBrown
2017-09-01 16:18                 ` J. Bruce Fields
2017-09-04  4:52                   ` NeilBrown
2017-09-05 19:56                     ` J. Bruce Fields
2017-09-05 21:35                       ` NeilBrown
2017-09-06 16:03                         ` J. Bruce Fields
2017-09-07  0:43                           ` NeilBrown
2017-09-08 15:06                             ` J. Bruce Fields
2018-03-16 14:42                           ` J. Bruce Fields
2017-08-25 21:52 ` [PATCH 3/3] nfsd: clients don't need to break their own delegations J. Bruce Fields
2017-08-28  4:32   ` NeilBrown
2017-08-29 21:49     ` J. Bruce Fields
2018-03-16 14:43       ` J. Bruce Fields
2017-09-07 22:01     ` J. Bruce Fields
2017-09-07 22:01       ` J. Bruce Fields
2017-09-08  5:06       ` NeilBrown
2017-09-08 15:05         ` J. Bruce Fields
2017-09-08 15:05           ` J. Bruce Fields
2017-08-26 18:06 ` [PATCH 0/3] Eliminate delegation self-conflicts Chuck Lever
2017-08-29 21:52   ` J. Bruce Fields
2017-08-29 23:39     ` Chuck Lever
