* [PATCH 00/16] Implement NFSv4 delegations, take 9
@ 2013-07-17 20:50 J. Bruce Fields
  2013-07-17 20:50 ` [PATCH] nfsd4: fix minorversion support interface J. Bruce Fields
                   ` (9 more replies)
  0 siblings, 10 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Changes since version 8, thanks to dchinner and jlayton for review:

	- additional warnings in lock_two_nondirectories
	- lock_two_nondirectories handles NULL second argument,
	  simplifying vfs_rename_other
	- kerneldoc comments on notify_change, vfs_link, vfs_rename,
	  vfs_unlink, to explain delegated_inode argument.
	- make clear non-support of write delegations in
	  generic_add_lease
	- rebase to 3.11-rc1

Introduction copied from previous posting:

This patch series implements read delegations, which allow NFSv4 clients
to perform read opens without contacting the server.  In exchange, the
server promises to call back to clients before modifying the data,
metadata, or set of links pointing to a file.

The main recent change was in response to review from Linus, who didn't
want us to hang holding directory i_mutexes while waiting for timeouts
on unresponsive clients.

So, this version of the series drops the i_mutex before waiting.  The
logic ends up looking something like:

        acquire locks
        look up inode
        test for delegation; if found:
                take reference on inode
                release locks
                wait for delegation break
                drop reference on inode
                retry

The initial test for a delegation happens after the lock on the
delegated inode is acquired, but additional directory mutexes may have
been acquired further up the call stack.  I therefore add a "struct
inode **" argument to any intervening functions, which we use to pass
the inode back up to the caller in the case it needs to wait for the
delegation to be broken.

I also allow callers to pass in NULL for the "struct inode **" argument
to indicate they'd rather just fail than wait for a delegation.  For
example, as long as ecryptfs isn't exportable I assume they'd rather not
see retry logic there that they won't use.  But I may have misjudged in
some of these cases.
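
As a rough illustration of the retry logic and the "struct inode **"
convention described above, here is a minimal userspace sketch.  It is
not code from this series: struct toy_inode, the toy_* functions, the
boolean standing in for i_mutex, and the choice of -EWOULDBLOCK as the
"delegation found" error are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an inode; not the kernel's struct inode. */
struct toy_inode {
	int refcount;
	bool delegated;		/* a client currently holds a delegation */
	bool locked;		/* stands in for i_mutex */
	int nlink;
};

/*
 * Inner operation, called with the lock held.  If the inode is
 * delegated, pass it back through *delegated_inode (with a reference)
 * so the caller can wait after dropping its locks.  A NULL
 * delegated_inode means "just fail, I won't wait".
 */
int toy_unlink_locked(struct toy_inode *inode,
		      struct toy_inode **delegated_inode)
{
	assert(inode->locked);
	if (inode->delegated) {
		if (delegated_inode) {
			inode->refcount++;
			*delegated_inode = inode;
		}
		return -EWOULDBLOCK;	/* assumed error code */
	}
	inode->nlink--;
	return 0;
}

/* Stand-in for waiting on the recall; here it completes at once. */
void toy_wait_for_break(struct toy_inode *inode)
{
	assert(!inode->locked);	/* never wait with the lock held */
	inode->delegated = false;
}

/* Outer caller: lock, try, and on a delegation unlock, wait, retry. */
int toy_unlink(struct toy_inode *inode)
{
	struct toy_inode *delegated_inode;
	int error;

retry:
	delegated_inode = NULL;
	inode->locked = true;			/* acquire locks */
	error = toy_unlink_locked(inode, &delegated_inode);
	inode->locked = false;			/* release locks */
	if (delegated_inode) {
		toy_wait_for_break(delegated_inode);
		delegated_inode->refcount--;	/* drop reference */
		goto retry;
	}
	return error;
}
```

Calling toy_unlink() on a delegated toy inode goes around the loop once
and then succeeds, while calling toy_unlink_locked() directly with a
NULL second argument fails immediately and takes no reference.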

J. Bruce Fields (16):
  vfs: pull ext4's double-i_mutex-locking into common code
  vfs: don't use PARENT/CHILD lock classes for non-directories
  vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
  vfs: take i_mutex on renamed file
  locks: introduce new FL_DELEG lock flag
  locks: implement delegations
  namei: minor vfs_unlink cleanup
  locks: break delegations on unlink
  locks: helper functions for delegation breaking
  locks: break delegations on rename
  locks: break delegations on link
  locks: break delegations on any attribute modification
  nfsd4: minor nfs4_setlease cleanup
  nfsd4: delay setting current_fh in open
  nfsd4: close open-deleg/unlink/rename race
  nfsd4: break only delegations when appropriate

 Documentation/filesystems/directory-locking |   31 ++++--
 drivers/base/devtmpfs.c                     |    6 +-
 fs/attr.c                                   |   25 ++++-
 fs/cachefiles/interface.c                   |    4 +-
 fs/cachefiles/namei.c                       |    4 +-
 fs/ecryptfs/inode.c                         |    6 +-
 fs/ext4/ext4.h                              |    2 -
 fs/ext4/ioctl.c                             |    4 +-
 fs/ext4/move_extent.c                       |   40 +-------
 fs/hpfs/namei.c                             |    2 +-
 fs/inode.c                                  |   42 ++++++++-
 fs/locks.c                                  |   57 ++++++++---
 fs/namei.c                                  |  135 +++++++++++++++++++++++----
 fs/nfsd/nfs4proc.c                          |   36 +++----
 fs/nfsd/nfs4state.c                         |   66 ++++++++++---
 fs/nfsd/vfs.c                               |   41 ++------
 fs/nfsd/xdr4.h                              |    3 +-
 fs/open.c                                   |   22 ++++-
 fs/utimes.c                                 |    9 +-
 include/linux/fs.h                          |   72 +++++++++++---
 ipc/mqueue.c                                |    2 +-
 21 files changed, 433 insertions(+), 176 deletions(-)

-- 
1.7.9.5



* [PATCH] nfsd4: fix minorversion support interface
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
@ 2013-07-17 20:50 ` J. Bruce Fields
  2013-07-17 21:08   ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

You can turn on or off support for minorversions using e.g.

	echo "-4.2" >/proc/fs/nfsd/versions

However, the current implementation is a little wonky.  For example, the
above will turn off 4.2 support, but it will also turn *on* 4.1 support.

This didn't matter as long as we only had 2 minorversions, which was
true till very recently.

Track support for each minorversion separately instead, and do a little
cleanup while we're here.
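
To make the side effect concrete, here is a small userspace sketch that
contrasts a single "highest supported minorversion" counter with one
flag per minorversion.  The toy_* names and TOY_MAX_MINOR are made up
for the example; only the arithmetic in toy_old_clear() is modelled on
the code being replaced.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_MINOR 2	/* highest minorversion this sketch knows */

/* Old scheme: one number, meaning "everything up to N is on". */
unsigned int toy_old_max = 1;

void toy_old_clear(unsigned int minor)
{
	if (minor == 0)
		return;
	toy_old_max = minor - 1;	/* "-4.2" sets the maximum to 1 */
}

bool toy_old_test(unsigned int minor)
{
	return minor <= toy_old_max;
}

/* New scheme: one flag per minorversion. */
bool toy_new_enabled[TOY_MAX_MINOR + 1] = { [0] = true };

void toy_new_set(unsigned int minor, bool on)
{
	if (minor <= TOY_MAX_MINOR)
		toy_new_enabled[minor] = on;
}

bool toy_new_test(unsigned int minor)
{
	return minor <= TOY_MAX_MINOR && toy_new_enabled[minor];
}
```

Starting with only 4.0 enabled (toy_old_max = 0), toy_old_clear(2)
leaves 4.1 enabled as a side effect, whereas toy_new_set(2, false)
touches only the 4.2 flag.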

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4proc.c |    2 +-
 fs/nfsd/nfsd.h     |    1 -
 fs/nfsd/nfssvc.c   |   13 +++++++------
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index a7cee86..0d4c410 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1293,7 +1293,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
 	 * According to RFC3010, this takes precedence over all other errors.
 	 */
 	status = nfserr_minor_vers_mismatch;
-	if (args->minorversion > nfsd_supported_minorversion)
+	if (nfsd_minorversion(args->minorversion, NFSD_TEST) <= 0)
 		goto out;
 
 	status = nfs41_check_op_ordering(args);
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 2bbd94e..30f34ab 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -53,7 +53,6 @@ struct readdir_cd {
 extern struct svc_program	nfsd_program;
 extern struct svc_version	nfsd_version2, nfsd_version3,
 				nfsd_version4;
-extern u32			nfsd_supported_minorversion;
 extern struct mutex		nfsd_mutex;
 extern spinlock_t		nfsd_drc_lock;
 extern unsigned long		nfsd_drc_max_mem;
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 6b9f48c..760c85a 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -116,7 +116,10 @@ struct svc_program		nfsd_program = {
 
 };
 
-u32 nfsd_supported_minorversion = 1;
+static bool nfsd_supported_minorversions[NFSD_SUPPORTED_MINOR_VERSION + 1] = {
+	[0] = 1,
+	[1] = 1,
+};
 
 int nfsd_vers(int vers, enum vers_op change)
 {
@@ -151,15 +154,13 @@ int nfsd_minorversion(u32 minorversion, enum vers_op change)
 		return -1;
 	switch(change) {
 	case NFSD_SET:
-		nfsd_supported_minorversion = minorversion;
+		nfsd_supported_minorversions[minorversion] = true;
 		break;
 	case NFSD_CLEAR:
-		if (minorversion == 0)
-			return -1;
-		nfsd_supported_minorversion = minorversion - 1;
+		nfsd_supported_minorversions[minorversion] = false;
 		break;
 	case NFSD_TEST:
-		return minorversion <= nfsd_supported_minorversion;
+		return nfsd_supported_minorversions[minorversion];
 	case NFSD_AVAIL:
 		return minorversion <= NFSD_SUPPORTED_MINOR_VERSION;
 	}
-- 
1.7.9.5



* [PATCH 01/16] vfs: pull ext4's double-i_mutex-locking into common code
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
@ 2013-07-17 20:50     ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields,
	Andreas Dilger

From: "J. Bruce Fields" <bfields@redhat.com>

We want to do this elsewhere as well.

Also catch any attempts to use it for directories (where this ordering
would conflict with ancestor-first directory ordering in lock_rename).
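
The ordering rule can be sketched in userspace as follows.  This is an
illustration only: struct toy_node, its fake lock, and the sequence
counter are invented for the example, and unlike the kernel helper
(which only warns on directories) the sketch refuses them outright.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy object with a fake lock; not the kernel's struct inode. */
struct toy_node {
	int is_dir;
	int locked;
	unsigned long seq;	/* position in the global locking order */
};

unsigned long toy_lock_seq;	/* bumped on every acquisition */

void toy_lock(struct toy_node *n)
{
	assert(!n->locked);
	n->locked = 1;
	n->seq = ++toy_lock_seq;
}

/*
 * Lock one or two non-directories, lower address first.
 * b may be NULL or equal to a.  Returns the number of locks taken,
 * or -1 (taking nothing) if either object is a directory.
 */
int toy_lock_two(struct toy_node *a, struct toy_node *b)
{
	if (a->is_dir || (b && b->is_dir))
		return -1;
	if (!b || a == b) {
		toy_lock(a);
		return 1;
	}
	if ((uintptr_t)a < (uintptr_t)b) {
		toy_lock(a);
		toy_lock(b);
	} else {
		toy_lock(b);
		toy_lock(a);
	}
	return 2;
}

void toy_unlock_two(struct toy_node *a, struct toy_node *b)
{
	a->locked = 0;
	if (b && b != a)
		b->locked = 0;
}
```

Whichever order the two arguments are passed in, the node at the lower
address is always locked first, so two callers locking the same pair
cannot each end up holding one lock while waiting for the other.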

Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Dave Chinner <david@fromorbit.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/ext4/ext4.h        |    2 --
 fs/ext4/ioctl.c       |    4 ++--
 fs/ext4/move_extent.c |   40 ++--------------------------------------
 fs/inode.c            |   36 ++++++++++++++++++++++++++++++++++++
 include/linux/fs.h    |    3 +++
 5 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b577e45..c8fcdfd 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2699,8 +2699,6 @@ extern void ext4_double_down_write_data_sem(struct inode *first,
 					    struct inode *second);
 extern void ext4_double_up_write_data_sem(struct inode *orig_inode,
 					  struct inode *donor_inode);
-void ext4_inode_double_lock(struct inode *inode1, struct inode *inode2);
-void ext4_inode_double_unlock(struct inode *inode1, struct inode *inode2);
 extern int ext4_move_extents(struct file *o_filp, struct file *d_filp,
 			     __u64 start_orig, __u64 start_donor,
 			     __u64 len, __u64 *moved_len);
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 9491ac0..12048f7 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -129,7 +129,7 @@ static long swap_inode_boot_loader(struct super_block *sb,
 
 	/* Protect orig inodes against a truncate and make sure,
 	 * that only 1 swap_inode_boot_loader is running. */
-	ext4_inode_double_lock(inode, inode_bl);
+	lock_two_nondirectories(inode, inode_bl);
 
 	truncate_inode_pages(&inode->i_data, 0);
 	truncate_inode_pages(&inode_bl->i_data, 0);
@@ -204,7 +204,7 @@ static long swap_inode_boot_loader(struct super_block *sb,
 	ext4_inode_resume_unlocked_dio(inode);
 	ext4_inode_resume_unlocked_dio(inode_bl);
 
-	ext4_inode_double_unlock(inode, inode_bl);
+	unlock_two_nondirectories(inode, inode_bl);
 
 	iput(inode_bl);
 
diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index e86dddb..f591a75 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -1203,42 +1203,6 @@ mext_check_arguments(struct inode *orig_inode,
 }
 
 /**
- * ext4_inode_double_lock - Lock i_mutex on both @inode1 and @inode2
- *
- * @inode1:	the inode structure
- * @inode2:	the inode structure
- *
- * Lock two inodes' i_mutex
- */
-void
-ext4_inode_double_lock(struct inode *inode1, struct inode *inode2)
-{
-	BUG_ON(inode1 == inode2);
-	if (inode1 < inode2) {
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_CHILD);
-	} else {
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_PARENT);
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_CHILD);
-	}
-}
-
-/**
- * ext4_inode_double_unlock - Release i_mutex on both @inode1 and @inode2
- *
- * @inode1:     the inode that is released first
- * @inode2:     the inode that is released second
- *
- */
-
-void
-ext4_inode_double_unlock(struct inode *inode1, struct inode *inode2)
-{
-	mutex_unlock(&inode1->i_mutex);
-	mutex_unlock(&inode2->i_mutex);
-}
-
-/**
  * ext4_move_extents - Exchange the specified range of a file
  *
  * @o_filp:		file structure of the original file
@@ -1327,7 +1291,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp,
 		return -EINVAL;
 	}
 	/* Protect orig and donor inodes against a truncate */
-	ext4_inode_double_lock(orig_inode, donor_inode);
+	lock_two_nondirectories(orig_inode, donor_inode);
 
 	/* Wait for all existing dio workers */
 	ext4_inode_block_unlocked_dio(orig_inode);
@@ -1535,7 +1499,7 @@ out:
 	ext4_double_up_write_data_sem(orig_inode, donor_inode);
 	ext4_inode_resume_unlocked_dio(orig_inode);
 	ext4_inode_resume_unlocked_dio(donor_inode);
-	ext4_inode_double_unlock(orig_inode, donor_inode);
+	unlock_two_nondirectories(orig_inode, donor_inode);
 
 	return ret;
 }
diff --git a/fs/inode.c b/fs/inode.c
index d6dfb09..57d273e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -982,6 +982,42 @@ void unlock_new_inode(struct inode *inode)
 EXPORT_SYMBOL(unlock_new_inode);
 
 /**
+ * lock_two_nondirectories - take two i_mutexes on non-directory objects
+ * @inode1: first inode to lock
+ * @inode2: second inode to lock
+ */
+void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
+{
+	WARN_ON_ONCE(S_ISDIR(inode1->i_mode));
+	if (inode1 == inode2 || !inode2) {
+		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
+		return;
+	}
+	WARN_ON_ONCE(S_ISDIR(inode2->i_mode));
+	if (inode1 < inode2) {
+		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
+		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_CHILD);
+	} else {
+		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_PARENT);
+		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_CHILD);
+	}
+}
+EXPORT_SYMBOL(lock_two_nondirectories);
+
+/**
+ * unlock_two_nondirectories - release locks from lock_two_nondirectories()
+ * @inode1: first inode to unlock
+ * @inode2: second inode to unlock
+ */
+void unlock_two_nondirectories(struct inode *inode1, struct inode *inode2)
+{
+	mutex_unlock(&inode1->i_mutex);
+	if (inode2 && inode2 != inode1)
+		mutex_unlock(&inode2->i_mutex);
+}
+EXPORT_SYMBOL(unlock_two_nondirectories);
+
+/**
  * iget5_locked - obtain an inode from a mounted file system
  * @sb:		super block of file system
  * @hashval:	hash value (usually inode number) to get
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9818747..431427a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -636,6 +636,9 @@ enum inode_i_mutex_lock_class
 	I_MUTEX_QUOTA
 };
 
+void lock_two_nondirectories(struct inode *, struct inode*);
+void unlock_two_nondirectories(struct inode *, struct inode*);
+
 /*
  * NOTE: in a 32bit arch with a preemptable kernel and
  * an UP compile the i_size_read/write must be atomic
-- 
1.7.9.5


* [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
  2013-07-17 20:50 ` [PATCH] nfsd4: fix minorversion support interface J. Bruce Fields
@ 2013-07-17 20:50 ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 03/16] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas J. Bruce Fields
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Reserve I_MUTEX_PARENT and I_MUTEX_CHILD for locking of actual
directories.

(Also I_MUTEX_QUOTA isn't really a meaningful name for this locking
class any more; fixed in a later patch.)

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/inode.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 57d273e..f642102 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -995,11 +995,11 @@ void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
 	}
 	WARN_ON_ONCE(S_ISDIR(inode2->i_mode));
 	if (inode1 < inode2) {
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_CHILD);
+		mutex_lock(&inode1->i_mutex);
+		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_QUOTA);
 	} else {
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_PARENT);
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_CHILD);
+		mutex_lock(&inode2->i_mutex);
+		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_QUOTA);
 	}
 }
 EXPORT_SYMBOL(lock_two_nondirectories);
-- 
1.7.9.5



* [PATCH 03/16] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
  2013-07-17 20:50 ` [PATCH] nfsd4: fix minorversion support interface J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
@ 2013-07-17 20:50 ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 04/16] vfs: take i_mutex on renamed file J. Bruce Fields
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

I_MUTEX_QUOTA is now just being used whenever we want to lock two
non-directories.  So the name isn't right.  I_MUTEX_NONDIR2 isn't
especially elegant but it's the best I could think of.

Also fix some outdated documentation.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/inode.c         |    4 ++--
 include/linux/fs.h |    9 ++++++---
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index f642102..4178e91 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -996,10 +996,10 @@ void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
 	WARN_ON_ONCE(S_ISDIR(inode2->i_mode));
 	if (inode1 < inode2) {
 		mutex_lock(&inode1->i_mutex);
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_QUOTA);
+		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_NONDIR2);
 	} else {
 		mutex_lock(&inode2->i_mutex);
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_QUOTA);
+		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_NONDIR2);
 	}
 }
 EXPORT_SYMBOL(lock_two_nondirectories);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 431427a..edca504 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -622,10 +622,13 @@ static inline int inode_unhashed(struct inode *inode)
  * 0: the object of the current VFS operation
  * 1: parent
  * 2: child/target
- * 3: quota file
+ * 3: xattr
+ * 4: second non-directory
+ * The last is for certain operations (such as rename) which lock two
+ * non-directories at once.
  *
  * The locking order between these classes is
- * parent -> child -> normal -> xattr -> quota
+ * parent -> child -> normal -> xattr -> second non-directory
  */
 enum inode_i_mutex_lock_class
 {
@@ -633,7 +636,7 @@ enum inode_i_mutex_lock_class
 	I_MUTEX_PARENT,
 	I_MUTEX_CHILD,
 	I_MUTEX_XATTR,
-	I_MUTEX_QUOTA
+	I_MUTEX_NONDIR2
 };
 
 void lock_two_nondirectories(struct inode *, struct inode*);
-- 
1.7.9.5



* [PATCH 04/16] vfs: take i_mutex on renamed file
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
                   ` (2 preceding siblings ...)
  2013-07-17 20:50 ` [PATCH 03/16] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas J. Bruce Fields
@ 2013-07-17 20:50 ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 08/16] locks: break delegations on unlink J. Bruce Fields
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

A read delegation is used by NFSv4 as a guarantee that a client can
perform local read opens without informing the server.

The open operation takes the last component of the pathname as an
argument, and thus is also a lookup operation.  Giving the client the
above guarantee therefore means informing the client before we allow
anything that would change the set of names pointing to the inode.

Therefore, we need to break delegations on rename, link, and unlink.

We also need to prevent new delegations from being acquired while one of
these operations is in progress.

We could add some completely new locking for that purpose, but it's
simpler to use the i_mutex, since that's already taken by all the
operations we care about.

The single exception is rename.  So, modify rename to take the i_mutex
on the file that is being renamed.

Also fix up lockdep and Documentation/filesystems/directory-locking to
reflect the change.
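
A toy model of why holding the lock on the renamed file matters: if
whoever hands out delegations must also take that lock, then no new
delegation can appear while a rename is in flight.  Everything here is
hypothetical (struct toy_file, the toy_* helpers, and a boolean in place
of i_mutex), and where this single-threaded sketch simply refuses, a
real implementation would presumably block on the mutex instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a file's inode; not the kernel's struct inode. */
struct toy_file {
	bool mutex_held;	/* stands in for i_mutex */
	bool delegated;
};

/* Grant a delegation only if no name-changing operation holds the lock. */
bool toy_try_grant_delegation(struct toy_file *f)
{
	if (f->mutex_held)
		return false;
	f->delegated = true;
	return true;
}

/* Begin a rename: lock the source and, if there is one, the target. */
void toy_rename_begin(struct toy_file *source, struct toy_file *target)
{
	source->mutex_held = true;
	if (target && target != source)
		target->mutex_held = true;
}

/* Finish a rename: release whatever toy_rename_begin() took. */
void toy_rename_end(struct toy_file *source, struct toy_file *target)
{
	source->mutex_held = false;
	if (target && target != source)
		target->mutex_held = false;
}
```

Between toy_rename_begin() and toy_rename_end(), an attempt to grant a
delegation on the source fails; once the rename has finished, the same
attempt succeeds.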

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 Documentation/filesystems/directory-locking |   31 +++++++++++++++++++--------
 fs/namei.c                                  |   10 ++++-----
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking
index ff7b611..09bbf9a 100644
--- a/Documentation/filesystems/directory-locking
+++ b/Documentation/filesystems/directory-locking
@@ -2,6 +2,10 @@
 kinds of locks - per-inode (->i_mutex) and per-filesystem
 (->s_vfs_rename_mutex).
 
+	When taking the i_mutex on multiple non-directory objects, we
+always acquire the locks in order by increasing address.  We'll call
+that "inode pointer" order in the following.
+
 	For our purposes all operations fall in 5 classes:
 
 1) read access.  Locking rules: caller locks directory we are accessing.
@@ -12,8 +16,9 @@ kinds of locks - per-inode (->i_mutex) and per-filesystem
 locks victim and calls the method.
 
 4) rename() that is _not_ cross-directory.  Locking rules: caller locks
-the parent, finds source and target, if target already exists - locks it
-and then calls the method.
+the parent and finds source and target.  If target already exists, lock
+it.  If source is a non-directory, lock it.  If that means we need to
+lock both, lock them in inode pointer order.
 
 5) link creation.  Locking rules:
 	* lock parent
@@ -30,7 +35,9 @@ rules:
 		fail with -ENOTEMPTY
 	* if new parent is equal to or is a descendent of source
 		fail with -ELOOP
-	* if target exists - lock it.
+	* If target exists, lock it.  If source is a non-directory, lock
+	  it.  In case that means we need to lock both source and target,
+	  do so in inode pointer order.
 	* call the method.
 
 
@@ -56,9 +63,11 @@ objects - A < B iff A is an ancestor of B.
     renames will be blocked on filesystem lock and we don't start changing
     the order until we had acquired all locks).
 
-(3) any operation holds at most one lock on non-directory object and
-    that lock is acquired after all other locks.  (Proof: see descriptions
-    of operations).
+(3) locks on non-directory objects are acquired only after locks on
+    directory objects, and are acquired in inode pointer order.
+    (Proof: all operations but renames take lock on at most one
+    non-directory object, except renames, which take locks on source and
+    target in inode pointer order in the case they are not directories.)
 
 	Now consider the minimal deadlock.  Each process is blocked on
 attempt to acquire some lock and already holds at least one lock.  Let's
@@ -66,9 +75,13 @@ consider the set of contended locks.  First of all, filesystem lock is
 not contended, since any process blocked on it is not holding any locks.
 Thus all processes are blocked on ->i_mutex.
 
-	Non-directory objects are not contended due to (3).  Thus link
-creation can't be a part of deadlock - it can't be blocked on source
-and it means that it doesn't hold any locks.
+	By (3), any process holding a non-directory lock can only be
+waiting on another non-directory lock with a larger address.  Therefore
+the process holding the "largest" such lock can always make progress, and
+non-directory objects are not included in the set of contended locks.
+
+	Thus link creation can't be a part of deadlock - it can't be
+blocked on source and it means that it doesn't hold any locks.
 
 	Any contended object is either held by cross-directory rename or
 has a child that is also contended.  Indeed, suppose that it is held by
diff --git a/fs/namei.c b/fs/namei.c
index 8b61d10..6c91448 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3730,7 +3730,8 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
  *	   That's where 4.4 screws up. Current fix: serialization on
  *	   sb->s_vfs_rename_mutex. We might be more accurate, but that's another
  *	   story.
- *	c) we have to lock _three_ objects - parents and victim (if it exists).
+ *	c) we have to lock _four_ objects - parents and victim (if it exists),
+ *	   and source (if it is not a directory).
  *	   And that - after we got ->i_mutex on parents (until then we don't know
  *	   whether the target exists).  Solution: try to be smart with locking
  *	   order for inodes.  We rely on the fact that tree topology may change
@@ -3806,6 +3807,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
 			    struct inode *new_dir, struct dentry *new_dentry)
 {
 	struct inode *target = new_dentry->d_inode;
+	struct inode *source = old_dentry->d_inode;
 	int error;
 
 	error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry);
@@ -3813,8 +3815,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
 		return error;
 
 	dget(new_dentry);
-	if (target)
-		mutex_lock(&target->i_mutex);
+	lock_two_nondirectories(source, target);
 
 	error = -EBUSY;
 	if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
@@ -3829,8 +3830,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
 	if (!(old_dir->i_sb->s_type->fs_flags & FS_RENAME_DOES_D_MOVE))
 		d_move(old_dentry, new_dentry);
 out:
-	if (target)
-		mutex_unlock(&target->i_mutex);
+	unlock_two_nondirectories(source, target);
 	dput(new_dentry);
 	return error;
 }
-- 
1.7.9.5

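A note on the ordering argument in the directory-locking hunk above: rule
(3) works because every caller takes non-directory locks in increasing
address order.  The following is only a userspace sketch of that idea using
pthread mutexes; struct obj, order_pair(), lock_two() and unlock_two() are
hypothetical names for illustration, not the kernel's
lock_two_nondirectories() implementation.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode; only the lock matters here. */
struct obj {
	pthread_mutex_t lock;
};

/* Put the lower address first so all callers agree on the order. */
static void order_pair(struct obj **a, struct obj **b)
{
	if (*b && (uintptr_t)*a > (uintptr_t)*b) {
		struct obj *tmp = *a;
		*a = *b;
		*b = tmp;
	}
}

/* Lock one or two objects; b may be NULL or the same object as a. */
static void lock_two(struct obj *a, struct obj *b)
{
	order_pair(&a, &b);
	pthread_mutex_lock(&a->lock);
	if (b && b != a)
		pthread_mutex_lock(&b->lock);
}

/* Unlock order does not matter for deadlock avoidance. */
static void unlock_two(struct obj *a, struct obj *b)
{
	pthread_mutex_unlock(&a->lock);
	if (b && b != a)
		pthread_mutex_unlock(&b->lock);
}
```

Because the holder of the highest-addressed lock never waits on another
lock of this class, at least one process can always make progress.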

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/16] locks: introduce new FL_DELEG lock flag
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
@ 2013-07-17 20:50     ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

For now FL_DELEG is just a synonym for FL_LEASE.  So this patch doesn't
change behavior.

Next we'll modify break_lease to treat FL_DELEG leases differently, to
account for the fact that NFSv4 delegations should be broken in more
situations than Windows oplocks.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
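For readers skimming the diff, here is a minimal userspace sketch of what
"synonym" means in practice: after this patch IS_LEASE() accepts either
flag.  The flag values are copied from the fs.h hunk of this patch; the
cut-down struct file_lock and the is_lease() wrapper are assumptions made
only so the example compiles on its own.

```c
#include <stdbool.h>

/* Flag values as they appear in the include/linux/fs.h hunk. */
#define FL_POSIX	1
#define FL_FLOCK	2
#define FL_DELEG	4	/* NFSv4 delegation */
#define FL_ACCESS	8
#define FL_EXISTS	16
#define FL_LEASE	32

/* Hypothetical cut-down stand-in for the kernel's struct file_lock. */
struct file_lock {
	unsigned int fl_flags;
};

/* A delegation now counts as a lease for every IS_LEASE() user. */
#define IS_LEASE(fl)	((fl)->fl_flags & (FL_LEASE|FL_DELEG))

/* Illustrative wrapper returning a plain boolean. */
static bool is_lease(const struct file_lock *fl)
{
	return IS_LEASE(fl) != 0;
}
```

A lock carrying FL_DELEG therefore takes every code path that FL_LEASE
locks take, which is why behavior does not change yet.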
 fs/locks.c          |    2 +-
 fs/nfsd/nfs4state.c |    2 +-
 include/linux/fs.h  |    1 +
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index b27a300..6e46dede 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -134,7 +134,7 @@
 
 #define IS_POSIX(fl)	(fl->fl_flags & FL_POSIX)
 #define IS_FLOCK(fl)	(fl->fl_flags & FL_FLOCK)
-#define IS_LEASE(fl)	(fl->fl_flags & FL_LEASE)
+#define IS_LEASE(fl)	(fl->fl_flags & (FL_LEASE|FL_DELEG))
 
 static bool lease_breaking(struct file_lock *fl)
 {
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 280acef..1698816 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3010,7 +3010,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f
 		return NULL;
 	locks_init_lock(fl);
 	fl->fl_lmops = &nfsd_lease_mng_ops;
-	fl->fl_flags = FL_LEASE;
+	fl->fl_flags = FL_DELEG;
 	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->fl_end = OFFSET_MAX;
 	fl->fl_owner = (fl_owner_t)(dp->dl_file);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index edca504..f8a2343 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -887,6 +887,7 @@ static inline int file_check_writeable(struct file *filp)
 
 #define FL_POSIX	1
 #define FL_FLOCK	2
+#define FL_DELEG	4	/* NFSv4 delegation */
 #define FL_ACCESS	8	/* not trying to lock, just looking */
 #define FL_EXISTS	16	/* when unlocking, test for existence */
 #define FL_LEASE	32	/* lease held on this file */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/16] locks: implement delegations
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
@ 2013-07-17 20:50     ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Implement NFSv4 delegations at the vfs level using the new FL_DELEG lock
type.

Note nfsd is the only delegation user and is only using read
delegations.  Warn on any attempt to set a write delegation for now.
We'll come back to that case later.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
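The effect of the new "type" argument can be summarized as a small truth
table: a FL_LEASE breaker conflicts with both leases and delegations, while
a FL_DELEG breaker skips plain leases.  The sketch below models that in
userspace.  leases_conflict() follows the hunk in this patch; the reduced
struct file_lock, the lock_type enum, and the simplified locks_conflict()
(conflict if either side is a write lock) are assumptions for illustration.

```c
#include <stdbool.h>

#define FL_DELEG	4
#define FL_LEASE	32

/* Stand-ins for F_RDLCK / F_WRLCK. */
enum lock_type { RDLCK, WRLCK };

/* Hypothetical cut-down stand-in for the kernel's struct file_lock. */
struct file_lock {
	unsigned int fl_flags;
	enum lock_type fl_type;
};

/* Simplified assumption: a conflict exists if either lock is a write lock. */
static bool locks_conflict(const struct file_lock *a, const struct file_lock *b)
{
	return a->fl_type == WRLCK || b->fl_type == WRLCK;
}

/* A delegation-only breaker never conflicts with a plain lease. */
static bool leases_conflict(const struct file_lock *lease,
			    const struct file_lock *breaker)
{
	if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE))
		return false;
	return locks_conflict(breaker, lease);
}
```

In this model break_lease() corresponds to a breaker with FL_LEASE set and
break_deleg() to a breaker with FL_DELEG set.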
 fs/locks.c         |   55 ++++++++++++++++++++++++++++++++++++++++++----------
 include/linux/fs.h |   18 ++++++++++++++---
 2 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 6e46dede..7336920 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1292,28 +1292,40 @@ static void time_out_leases(struct inode *inode)
 	}
 }
 
+static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
+{
+	if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE))
+		return false;
+	return locks_conflict(breaker, lease);
+}
+
 /**
  *	__break_lease	-	revoke all outstanding leases on file
  *	@inode: the inode of the file to return
- *	@mode: the open mode (read or write)
+ *	@mode: O_RDONLY: break only write leases; O_WRONLY or O_RDWR:
+ *	    break all leases
+ *	@type: FL_LEASE: break leases and delegations; FL_DELEG: break
+ *	    only delegations
  *
  *	break_lease (inlined for speed) has checked there already is at least
  *	some kind of lock (maybe a lease) on this file.  Leases are broken on
  *	a call to open() or truncate().  This function can sleep unless you
  *	specified %O_NONBLOCK to your open().
  */
-int __break_lease(struct inode *inode, unsigned int mode)
+int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 {
 	int error = 0;
 	struct file_lock *new_fl, *flock;
 	struct file_lock *fl;
 	unsigned long break_time;
 	int i_have_this_lease = 0;
+	bool lease_conflict = false;
 	int want_write = (mode & O_ACCMODE) != O_RDONLY;
 
 	new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK);
 	if (IS_ERR(new_fl))
 		return PTR_ERR(new_fl);
+	new_fl->fl_flags = type;
 
 	spin_lock(&inode->i_lock);
 
@@ -1323,13 +1335,16 @@ int __break_lease(struct inode *inode, unsigned int mode)
 	if ((flock == NULL) || !IS_LEASE(flock))
 		goto out;
 
-	if (!locks_conflict(flock, new_fl))
+	for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+		if (leases_conflict(fl, new_fl)) {
+			lease_conflict = true;
+			if (fl->fl_owner == current->files)
+				i_have_this_lease = 1;
+		}
+	}
+	if (!lease_conflict)
 		goto out;
 
-	for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next)
-		if (fl->fl_owner == current->files)
-			i_have_this_lease = 1;
-
 	break_time = 0;
 	if (lease_break_time > 0) {
 		break_time = jiffies + lease_break_time * HZ;
@@ -1338,6 +1353,8 @@ int __break_lease(struct inode *inode, unsigned int mode)
 	}
 
 	for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+		if (!leases_conflict(fl, new_fl))
+			continue;
 		if (want_write) {
 			if (fl->fl_flags & FL_UNLOCK_PENDING)
 				continue;
@@ -1379,7 +1396,7 @@ restart:
 		 */
 		for (flock = inode->i_flock; flock && IS_LEASE(flock);
 				flock = flock->fl_next) {
-			if (locks_conflict(new_fl, flock))
+			if (leases_conflict(new_fl, flock))
 				goto restart;
 		}
 		error = 0;
@@ -1460,9 +1477,26 @@ static int generic_add_lease(struct file *filp, long arg, struct file_lock **flp
 	struct file_lock *fl, **before, **my_before = NULL, *lease;
 	struct dentry *dentry = filp->f_path.dentry;
 	struct inode *inode = dentry->d_inode;
+	bool is_deleg = (*flp)->fl_flags & FL_DELEG;
 	int error;
 
 	lease = *flp;
+	/*
+	 * In the delegation case we need mutual exclusion with
+	 * a number of operations that take the i_mutex.  We trylock
+	 * because delegations are an optional optimization, and if
+	 * there's some chance of a conflict--we'd rather not
+	 * bother, maybe that's a sign this just isn't a good file to
+	 * hand out a delegation on.
+	 */
+	if (is_deleg && !mutex_trylock(&inode->i_mutex))
+		return -EAGAIN;
+
+	if (is_deleg && arg == F_WRLCK) {
+		/* Write delegations are not currently supported: */
+		WARN_ON_ONCE(1);
+		return -EINVAL;
+	}
 
 	error = -EAGAIN;
 	if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
@@ -1514,9 +1548,10 @@ static int generic_add_lease(struct file *filp, long arg, struct file_lock **flp
 		goto out;
 
 	locks_insert_lock(before, lease);
-	return 0;
-
+	error = 0;
 out:
+	if (is_deleg)
+		mutex_unlock(&inode->i_mutex);
 	return error;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f8a2343..bc1d8b8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1029,7 +1029,7 @@ extern int vfs_test_lock(struct file *, struct file_lock *);
 extern int vfs_lock_file(struct file *, unsigned int, struct file_lock *, struct file_lock *);
 extern int vfs_cancel_lock(struct file *filp, struct file_lock *fl);
 extern int flock_lock_file_wait(struct file *filp, struct file_lock *fl);
-extern int __break_lease(struct inode *inode, unsigned int flags);
+extern int __break_lease(struct inode *inode, unsigned int flags, unsigned int type);
 extern void lease_get_mtime(struct inode *, struct timespec *time);
 extern int generic_setlease(struct file *, long, struct file_lock **);
 extern int vfs_setlease(struct file *, long, struct file_lock **);
@@ -1138,7 +1138,7 @@ static inline int flock_lock_file_wait(struct file *filp,
 	return -ENOLCK;
 }
 
-static inline int __break_lease(struct inode *inode, unsigned int mode)
+static inline int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 {
 	return 0;
 }
@@ -1963,9 +1963,17 @@ static inline int locks_verify_truncate(struct inode *inode,
 static inline int break_lease(struct inode *inode, unsigned int mode)
 {
 	if (inode->i_flock)
-		return __break_lease(inode, mode);
+		return __break_lease(inode, mode, FL_LEASE);
 	return 0;
 }
+
+static inline int break_deleg(struct inode *inode, unsigned int mode)
+{
+	if (inode->i_flock)
+		return __break_lease(inode, mode, FL_DELEG);
+	return 0;
+}
+
 #else /* !CONFIG_FILE_LOCKING */
 static inline int locks_mandatory_locked(struct inode *inode)
 {
@@ -2005,6 +2013,10 @@ static inline int break_lease(struct inode *inode, unsigned int mode)
 	return 0;
 }
 
+static inline int break_deleg(struct inode *inode, unsigned int mode)
+{
+	return 0;
+}
 #endif /* CONFIG_FILE_LOCKING */
 
 /* fs/open.c */
-- 
1.7.9.5

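The trylock comment in generic_add_lease above describes an "opportunistic"
pattern: if the inode is busy, give up on the delegation rather than wait.
A rough userspace analogue with pthread_mutex_trylock() might look like the
following.  struct toy_inode and toy_add_delegation() are hypothetical
names, and the counter merely stands in for inserting the lease.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for an inode with its i_mutex. */
struct toy_inode {
	pthread_mutex_t i_mutex;
	int nr_delegations;
};

/*
 * Try to hand out a delegation.  Returns 0 on success, or -EAGAIN when
 * someone else holds the mutex; the caller simply goes without.
 */
static int toy_add_delegation(struct toy_inode *inode)
{
	if (pthread_mutex_trylock(&inode->i_mutex) != 0)
		return -EAGAIN;

	/* Stand-in for the real lease-list insertion. */
	inode->nr_delegations++;

	pthread_mutex_unlock(&inode->i_mutex);
	return 0;
}
```

Since a delegation is only an optimization, failing fast here costs the
client an extra round trip at most and never blocks the server thread.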

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/16] namei: minor vfs_unlink cleanup
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
@ 2013-07-17 20:50     ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

We'll be using dentry->d_inode in one more place.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/namei.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 6c91448..2b44960 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3433,6 +3433,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
 
 int vfs_unlink(struct inode *dir, struct dentry *dentry)
 {
+	struct inode *target = dentry->d_inode;
 	int error = may_delete(dir, dentry, 0);
 
 	if (error)
@@ -3441,7 +3442,7 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry)
 	if (!dir->i_op->unlink)
 		return -EPERM;
 
-	mutex_lock(&dentry->d_inode->i_mutex);
+	mutex_lock(&target->i_mutex);
 	if (d_mountpoint(dentry))
 		error = -EBUSY;
 	else {
@@ -3452,11 +3453,11 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry)
 				dont_mount(dentry);
 		}
 	}
-	mutex_unlock(&dentry->d_inode->i_mutex);
+	mutex_unlock(&target->i_mutex);
 
 	/* We don't d_delete() NFS sillyrenamed files--they still exist. */
 	if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
-		fsnotify_link_count(dentry->d_inode);
+		fsnotify_link_count(target);
 		d_delete(dentry);
 	}
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/16] locks: break delegations on unlink
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
                   ` (3 preceding siblings ...)
  2013-07-17 20:50 ` [PATCH 04/16] vfs: take i_mutex on renamed file J. Bruce Fields
@ 2013-07-17 20:50 ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 10/16] locks: break delegations on rename J. Bruce Fields
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields,
	David Howells, Tyler Hicks, Dustin Kirkland

From: "J. Bruce Fields" <bfields@redhat.com>

We need to break delegations on any operation that changes the set of
links pointing to an inode.  Start with unlink.

Such operations also hold the i_mutex on a parent directory.  Breaking a
delegation may require waiting for a timeout (by default 90 seconds) in
the case of an unresponsive NFS client.  To avoid blocking all directory
operations, we therefore drop locks before waiting for the delegation.
The logic then looks like:

	acquire locks
	...
	test for delegation; if found:
		take reference on inode
		release locks
		wait for delegation break
		drop reference on inode
		retry

It is possible this could never terminate.  (Even if we take precautions
to prevent another delegation being acquired on the same inode, we could
get a different inode on each retry.)  But this seems very unlikely.

The initial test for a delegation happens after the lock on the target
inode is acquired, but the directory inode may have been acquired
further up the call stack.  We therefore add a "struct inode **"
argument to any intervening functions, which we use to pass the inode
back up to the caller in the case it needs a delegation synchronously
broken.

Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
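The retry logic described in the commit message can be modelled in plain
userspace C.  This is only a sketch under stated assumptions: struct
toy_inode, the lock_depth counter, and every toy_* function are
hypothetical, the "wait" completes immediately, and the non-blocking test
is factored into a helper purely for readability.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode. */
struct toy_inode {
	int refcount;
	bool delegated;
	bool linked;
};

static int lock_depth;	/* number of "locks" currently held */

/* Non-blocking test: report the delegated inode back to the caller. */
static int toy_try_break_deleg(struct toy_inode *inode,
			       struct toy_inode **delegated_inode)
{
	if (!inode->delegated)
		return 0;
	if (delegated_inode) {
		*delegated_inode = inode;
		inode->refcount++;	/* models ihold() */
	}
	return -EWOULDBLOCK;
}

/* Blocking break: only legal once all locks have been dropped. */
static int toy_break_deleg_wait(struct toy_inode *inode)
{
	if (lock_depth != 0)
		return -EDEADLK;
	inode->delegated = false;	/* pretend the client returned it */
	return 0;
}

/* Called with locks held; never waits for a delegation. */
static int toy_vfs_unlink(struct toy_inode *inode,
			  struct toy_inode **delegated_inode)
{
	int error = toy_try_break_deleg(inode, delegated_inode);

	if (error)
		return error;
	inode->linked = false;
	return 0;
}

static int toy_unlinkat(struct toy_inode *inode)
{
	struct toy_inode *delegated_inode = NULL;
	int error;

retry_deleg:
	lock_depth++;				/* acquire locks */
	error = toy_vfs_unlink(inode, &delegated_inode);
	lock_depth--;				/* release locks */
	if (delegated_inode) {
		error = toy_break_deleg_wait(delegated_inode);
		delegated_inode->refcount--;	/* models iput() */
		delegated_inode = NULL;
		if (!error)
			goto retry_deleg;
	}
	return error;
}
```

The reference taken before the locks are dropped keeps the inode alive
while waiting, and callers that pass NULL just see -EWOULDBLOCK.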
 drivers/base/devtmpfs.c |    2 +-
 fs/cachefiles/namei.c   |    2 +-
 fs/ecryptfs/inode.c     |    2 +-
 fs/namei.c              |   42 +++++++++++++++++++++++++++++++++++++++---
 fs/nfsd/vfs.c           |    2 +-
 include/linux/fs.h      |    2 +-
 ipc/mqueue.c            |    2 +-
 7 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 7413d06..1b8490e 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -324,7 +324,7 @@ static int handle_remove(const char *nodename, struct device *dev)
 			mutex_lock(&dentry->d_inode->i_mutex);
 			notify_change(dentry, &newattrs);
 			mutex_unlock(&dentry->d_inode->i_mutex);
-			err = vfs_unlink(parent.dentry->d_inode, dentry);
+			err = vfs_unlink(parent.dentry->d_inode, dentry, NULL);
 			if (!err || err == -ENOENT)
 				deleted = 1;
 		}
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 25badd1..6846fcef 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -294,7 +294,7 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 		if (ret < 0) {
 			cachefiles_io_error(cache, "Unlink security error");
 		} else {
-			ret = vfs_unlink(dir->d_inode, rep);
+			ret = vfs_unlink(dir->d_inode, rep, NULL);
 
 			if (preemptive)
 				cachefiles_mark_object_buried(cache, rep);
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 67e9b63..0b8b632 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -153,7 +153,7 @@ static int ecryptfs_do_unlink(struct inode *dir, struct dentry *dentry,
 
 	dget(lower_dentry);
 	lower_dir_dentry = lock_parent(lower_dentry);
-	rc = vfs_unlink(lower_dir_inode, lower_dentry);
+	rc = vfs_unlink(lower_dir_inode, lower_dentry, NULL);
 	if (rc) {
 		printk(KERN_ERR "Error in vfs_unlink; rc = [%d]\n", rc);
 		goto out_unlock;
diff --git a/fs/namei.c b/fs/namei.c
index 2b44960..2826bbd 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3431,7 +3431,25 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
 	return do_rmdir(AT_FDCWD, pathname);
 }
 
-int vfs_unlink(struct inode *dir, struct dentry *dentry)
+/**
+ * vfs_unlink - unlink a filesystem object
+ * @dir:	parent directory
+ * @dentry:	victim
+ * @delegated_inode: returns victim inode, if the inode is delegated.
+ *
+ * The caller must hold dir->i_mutex.
+ *
+ * If vfs_unlink discovers a delegation, it will return -EWOULDBLOCK and
+ * return a reference to the inode in delegated_inode.  The caller
+ * should then break the delegation on that inode and retry.  Because
+ * breaking a delegation may take a long time, the caller should drop
+ * dir->i_mutex before doing so.
+ *
+ * Alternatively, a caller may pass NULL for delegated_inode.  This may
+ * be appropriate for callers that expect the underlying filesystem not
+ * to be NFS exported.
+ */
+int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegated_inode)
 {
 	struct inode *target = dentry->d_inode;
 	int error = may_delete(dir, dentry, 0);
@@ -3448,11 +3466,20 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry)
 	else {
 		error = security_inode_unlink(dir, dentry);
 		if (!error) {
+			error = break_deleg(target, O_WRONLY|O_NONBLOCK);
+			if (error) {
+				if (error == -EWOULDBLOCK && delegated_inode) {
+					*delegated_inode = target;
+					ihold(target);
+				}
+				goto out;
+			}
 			error = dir->i_op->unlink(dir, dentry);
 			if (!error)
 				dont_mount(dentry);
 		}
 	}
+out:
 	mutex_unlock(&target->i_mutex);
 
 	/* We don't d_delete() NFS sillyrenamed files--they still exist. */
@@ -3477,6 +3504,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 	struct dentry *dentry;
 	struct nameidata nd;
 	struct inode *inode = NULL;
+	struct inode *delegated_inode = NULL;
 	unsigned int lookup_flags = 0;
 retry:
 	name = user_path_parent(dfd, pathname, &nd, lookup_flags);
@@ -3491,7 +3519,7 @@ retry:
 	error = mnt_want_write(nd.path.mnt);
 	if (error)
 		goto exit1;
-
+retry_deleg:
 	mutex_lock_nested(&nd.path.dentry->d_inode->i_mutex, I_MUTEX_PARENT);
 	dentry = lookup_hash(&nd);
 	error = PTR_ERR(dentry);
@@ -3506,13 +3534,21 @@ retry:
 		error = security_path_unlink(&nd.path, dentry);
 		if (error)
 			goto exit2;
-		error = vfs_unlink(nd.path.dentry->d_inode, dentry);
+		error = vfs_unlink(nd.path.dentry->d_inode, dentry, &delegated_inode);
 exit2:
 		dput(dentry);
 	}
 	mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
 	if (inode)
 		iput(inode);	/* truncate the inode here */
+	inode = NULL;
+	if (delegated_inode) {
+		error = break_deleg(delegated_inode, O_WRONLY);
+		iput(delegated_inode);
+		delegated_inode = NULL;
+		if (!error)
+			goto retry_deleg;
+	}
 	mnt_drop_write(nd.path.mnt);
 exit1:
 	path_put(&nd.path);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 8ff6a00..ff1fc44 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1910,7 +1910,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 	if (host_err)
 		goto out_put;
 	if (type != S_IFDIR)
-		host_err = vfs_unlink(dirp, rdentry);
+		host_err = vfs_unlink(dirp, rdentry, NULL);
 	else
 		host_err = vfs_rmdir(dirp, rdentry);
 	if (!host_err)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index bc1d8b8..ba29b37 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1469,7 +1469,7 @@ extern int vfs_mknod(struct inode *, struct dentry *, umode_t, dev_t);
 extern int vfs_symlink(struct inode *, struct dentry *, const char *);
 extern int vfs_link(struct dentry *, struct inode *, struct dentry *);
 extern int vfs_rmdir(struct inode *, struct dentry *);
-extern int vfs_unlink(struct inode *, struct dentry *);
+extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
 extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *);
 
 /*
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index ae1996d..95827ce 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -886,7 +886,7 @@ SYSCALL_DEFINE1(mq_unlink, const char __user *, u_name)
 		err = -ENOENT;
 	} else {
 		ihold(inode);
-		err = vfs_unlink(dentry->d_parent->d_inode, dentry);
+		err = vfs_unlink(dentry->d_parent->d_inode, dentry, NULL);
 	}
 	dput(dentry);
 
-- 
1.7.9.5
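
For readers following the control flow, the retry loop this patch adds to
do_unlinkat can be modelled in userspace. This is only a sketch under
stated assumptions: toy_inode, toy_dir and every toy_* function are
hypothetical stand-ins rather than kernel interfaces, and the blocking
wait is modelled as completing instantly. The point it illustrates is
that the nonblocking test runs with the directory lock held, while the
wait runs only after the lock is dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct inode and a locked parent directory. */
struct toy_inode {
	int refcount;
	int delegated;	/* nonzero while a client holds a delegation */
	int unlinked;
};

struct toy_dir {
	int locked;	/* models the parent's i_mutex */
	int passes;	/* how many times the locked section ran */
};

/* Shaped like vfs_unlink(): runs under the directory lock, never sleeps. */
int toy_vfs_unlink(struct toy_dir *dir, struct toy_inode *target,
		   struct toy_inode **delegated_inode)
{
	assert(dir->locked);
	if (target->delegated) {
		if (delegated_inode) {
			*delegated_inode = target;
			target->refcount++;	/* models ihold() */
		}
		return -EWOULDBLOCK;
	}
	target->unlinked = 1;
	return 0;
}

/* Models the blocking break_deleg(); must not run under the lock. */
int toy_wait_for_break(struct toy_dir *dir, struct toy_inode *inode)
{
	assert(!dir->locked);
	inode->delegated = 0;	/* assumption: the client returns it at once */
	return 0;
}

/* Shaped like the retry_deleg loop in do_unlinkat(). */
int toy_unlinkat(struct toy_dir *dir, struct toy_inode *target)
{
	struct toy_inode *delegated_inode = NULL;
	int error;

retry_deleg:
	dir->locked = 1;
	dir->passes++;
	error = toy_vfs_unlink(dir, target, &delegated_inode);
	dir->locked = 0;
	if (delegated_inode) {
		error = toy_wait_for_break(dir, delegated_inode);
		delegated_inode->refcount--;	/* models iput() */
		delegated_inode = NULL;
		if (!error)
			goto retry_deleg;
	}
	return error;
}
```

A caller that passes NULL for delegated_inode, as nfsd_unlink and
mq_unlink do above, simply sees -EWOULDBLOCK and no reference is taken.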


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/16] locks: helper functions for delegation breaking
@ 2013-07-17 20:50     ` J. Bruce Fields
  0 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Factor the nonblocking delegation test and the blocking wait out of the
unlink path into helpers; we'll need the same logic for rename and link.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/namei.c         |   13 +++----------
 include/linux/fs.h |   33 +++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 2826bbd..d3b6a35 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3466,14 +3466,9 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
 	else {
 		error = security_inode_unlink(dir, dentry);
 		if (!error) {
-			error = break_deleg(target, O_WRONLY|O_NONBLOCK);
-			if (error) {
-				if (error == -EWOULDBLOCK && delegated_inode) {
-					*delegated_inode = target;
-					ihold(target);
-				}
+			error = try_break_deleg(target, delegated_inode);
+			if (error)
 				goto out;
-			}
 			error = dir->i_op->unlink(dir, dentry);
 			if (!error)
 				dont_mount(dentry);
@@ -3543,9 +3538,7 @@ exit2:
 		iput(inode);	/* truncate the inode here */
 	inode = NULL;
 	if (delegated_inode) {
-		error = break_deleg(delegated_inode, O_WRONLY);
-		iput(delegated_inode);
-		delegated_inode = NULL;
+		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
 	}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ba29b37..43a3506 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1907,6 +1907,9 @@ extern bool our_mnt(struct vfsmount *mnt);
 
 extern int current_umask(void);
 
+extern void ihold(struct inode * inode);
+extern void iput(struct inode *);
+
 /* /sys/fs */
 extern struct kobject *fs_kobj;
 
@@ -1974,6 +1977,28 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
 	return 0;
 }
 
+static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
+{
+	int ret;
+
+	ret = break_deleg(inode, O_WRONLY|O_NONBLOCK);
+	if (ret == -EWOULDBLOCK && delegated_inode) {
+		*delegated_inode = inode;
+		ihold(inode);
+	}
+	return ret;
+}
+
+static inline int break_deleg_wait(struct inode **delegated_inode)
+{
+	int ret;
+
+	ret = break_deleg(*delegated_inode, O_WRONLY);
+	iput(*delegated_inode);
+	*delegated_inode = NULL;
+	return ret;
+}
+
 #else /* !CONFIG_FILE_LOCKING */
 static inline int locks_mandatory_locked(struct inode *inode)
 {
@@ -2017,6 +2042,12 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
 {
 	return 0;
 }
+
+static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
+{
+	return 0;
+}
+
 #endif /* CONFIG_FILE_LOCKING */
 
 /* fs/open.c */
@@ -2346,8 +2377,6 @@ extern loff_t vfs_llseek(struct file *file, loff_t offset, int whence);
 extern int inode_init_always(struct super_block *, struct inode *);
 extern void inode_init_once(struct inode *);
 extern void address_space_init_once(struct address_space *mapping);
-extern void ihold(struct inode * inode);
-extern void iput(struct inode *);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
 extern int inode_needs_sync(struct inode *inode);
-- 
1.7.9.5
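
The reference-counting contract between the two helpers may be easier to
see in isolation. The following is a minimal userspace sketch, not the
kernel code: toy_inode and the toy_* functions are hypothetical, and the
blocking mode of toy_break_deleg is assumed to succeed immediately. It
mirrors the shape of try_break_deleg and break_deleg_wait as added to
include/linux/fs.h in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inode. */
struct toy_inode {
	int refcount;
	int delegated;	/* nonzero while a client holds a delegation */
};

void toy_ihold(struct toy_inode *inode)
{
	inode->refcount++;
}

void toy_iput(struct toy_inode *inode)
{
	assert(inode->refcount > 0);
	inode->refcount--;
}

/*
 * Model of break_deleg(): in nonblocking mode an outstanding delegation
 * yields -EWOULDBLOCK; in blocking mode we pretend the client returns
 * the delegation right away.
 */
int toy_break_deleg(struct toy_inode *inode, int nonblock)
{
	if (!inode->delegated)
		return 0;
	if (nonblock)
		return -EWOULDBLOCK;
	inode->delegated = 0;
	return 0;
}

/* On -EWOULDBLOCK, hand the caller a referenced inode, if it asked for one. */
int toy_try_break_deleg(struct toy_inode *inode,
			struct toy_inode **delegated_inode)
{
	int ret = toy_break_deleg(inode, 1);

	if (ret == -EWOULDBLOCK && delegated_inode) {
		*delegated_inode = inode;
		toy_ihold(inode);
	}
	return ret;
}

/* Wait for the break, then drop the reference and clear the pointer. */
int toy_break_deleg_wait(struct toy_inode **delegated_inode)
{
	int ret = toy_break_deleg(*delegated_inode, 0);

	toy_iput(*delegated_inode);
	*delegated_inode = NULL;
	return ret;
}
```

In this model every reference taken by toy_try_break_deleg is dropped by
exactly one toy_break_deleg_wait, so the count is balanced whether or
not the wait succeeds.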


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/16] locks: break delegations on rename
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
                   ` (4 preceding siblings ...)
  2013-07-17 20:50 ` [PATCH 08/16] locks: break delegations on unlink J. Bruce Fields
@ 2013-07-17 20:50 ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 13/16] nfsd4: minor nfs4_setlease cleanup J. Bruce Fields
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields,
	David Howells

From: "J. Bruce Fields" <bfields@redhat.com>

Cc: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/cachefiles/namei.c |    2 +-
 fs/namei.c            |   47 +++++++++++++++++++++++++++++++++++++++++++----
 fs/nfsd/vfs.c         |    2 +-
 include/linux/fs.h    |    2 +-
 4 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 6846fcef..1363eae 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -396,7 +396,7 @@ try_again:
 		cachefiles_io_error(cache, "Rename security error %d", ret);
 	} else {
 		ret = vfs_rename(dir->d_inode, rep,
-				 cache->graveyard->d_inode, grave);
+				 cache->graveyard->d_inode, grave, NULL);
 		if (ret != 0 && ret != -ENOMEM)
 			cachefiles_io_error(cache,
 					    "Rename failed with error %d", ret);
diff --git a/fs/namei.c b/fs/namei.c
index d3b6a35..0554c9a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3834,7 +3834,8 @@ out:
 }
 
 static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
-			    struct inode *new_dir, struct dentry *new_dentry)
+			    struct inode *new_dir, struct dentry *new_dentry,
+			    struct inode **delegated_inode)
 {
 	struct inode *target = new_dentry->d_inode;
 	struct inode *source = old_dentry->d_inode;
@@ -3851,6 +3852,14 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
 	if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
 		goto out;
 
+	error = try_break_deleg(source, delegated_inode);
+	if (error)
+		goto out;
+	if (target) {
+		error = try_break_deleg(target, delegated_inode);
+		if (error)
+			goto out;
+	}
 	error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry);
 	if (error)
 		goto out;
@@ -3865,8 +3874,30 @@ out:
 	return error;
 }
 
+/**
+ * vfs_rename - rename a filesystem object
+ * @old_dir:	parent of source
+ * @old_dentry:	source
+ * @new_dir:	parent of destination
+ * @new_dentry:	destination
+ * @delegated_inode: returns an inode needing a delegation break
+ *
+ * The caller must hold multiple mutexes--see lock_rename().
+ *
+ * If vfs_rename discovers a delegation in need of breaking at either
+ * the source or destination, it will return -EWOULDBLOCK and return a
+ * reference to the inode in delegated_inode.  The caller should then
+ * break the delegation and retry.  Because breaking a delegation may
+ * take a long time, the caller should drop all locks before doing
+ * so.
+ *
+ * Alternatively, a caller may pass NULL for delegated_inode.  This may
+ * be appropriate for callers that expect the underlying filesystem not
+ * to be NFS exported.
+ */
 int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
-	       struct inode *new_dir, struct dentry *new_dentry)
+	       struct inode *new_dir, struct dentry *new_dentry,
+	       struct inode **delegated_inode)
 {
 	int error;
 	int is_dir = S_ISDIR(old_dentry->d_inode->i_mode);
@@ -3894,7 +3925,7 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (is_dir)
 		error = vfs_rename_dir(old_dir,old_dentry,new_dir,new_dentry);
 	else
-		error = vfs_rename_other(old_dir,old_dentry,new_dir,new_dentry);
+		error = vfs_rename_other(old_dir,old_dentry,new_dir,new_dentry,delegated_inode);
 	if (!error)
 		fsnotify_move(old_dir, new_dir, old_name, is_dir,
 			      new_dentry->d_inode, old_dentry);
@@ -3910,6 +3941,7 @@ SYSCALL_DEFINE4(renameat, int, olddfd, const char __user *, oldname,
 	struct dentry *old_dentry, *new_dentry;
 	struct dentry *trap;
 	struct nameidata oldnd, newnd;
+	struct inode *delegated_inode = NULL;
 	struct filename *from;
 	struct filename *to;
 	unsigned int lookup_flags = 0;
@@ -3949,6 +3981,7 @@ retry:
 	newnd.flags &= ~LOOKUP_PARENT;
 	newnd.flags |= LOOKUP_RENAME_TARGET;
 
+retry_deleg:
 	trap = lock_rename(new_dir, old_dir);
 
 	old_dentry = lookup_hash(&oldnd);
@@ -3985,13 +4018,19 @@ retry:
 	if (error)
 		goto exit5;
 	error = vfs_rename(old_dir->d_inode, old_dentry,
-				   new_dir->d_inode, new_dentry);
+				   new_dir->d_inode, new_dentry,
+				   &delegated_inode);
 exit5:
 	dput(new_dentry);
 exit4:
 	dput(old_dentry);
 exit3:
 	unlock_rename(new_dir, old_dir);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 	mnt_drop_write(oldnd.path.mnt);
 exit2:
 	if (retry_estale(error, lookup_flags))
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index ff1fc44..fbf22d2 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1837,7 +1837,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 		if (host_err)
 			goto out_dput_new;
 	}
-	host_err = vfs_rename(fdir, odentry, tdir, ndentry);
+	host_err = vfs_rename(fdir, odentry, tdir, ndentry, NULL);
 	if (!host_err) {
 		host_err = commit_metadata(tfhp);
 		if (!host_err)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 43a3506..74aa0c7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1470,7 +1470,7 @@ extern int vfs_symlink(struct inode *, struct dentry *, const char *);
 extern int vfs_link(struct dentry *, struct inode *, struct dentry *);
 extern int vfs_rmdir(struct inode *, struct dentry *);
 extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
-extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *);
+extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **);
 
 /*
  * VFS dentry helper functions.
-- 
1.7.9.5
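
One consequence of the rename change is that each pass through
vfs_rename_other reports at most one delegated inode, because the first
failing try_break_deleg jumps straight to out. When both source and
target are delegated, the renameat loop therefore runs three times. The
userspace sketch below illustrates this; toy_inode and the toy_*
functions are hypothetical, the locking is reduced to comments, and the
blocking wait is assumed to succeed immediately.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inode. */
struct toy_inode {
	int refcount;
	int delegated;	/* nonzero while a client holds a delegation */
};

/* Nonblocking attempt, shaped like try_break_deleg(). */
int toy_try_break_deleg(struct toy_inode *inode,
			struct toy_inode **delegated_inode)
{
	if (!inode->delegated)
		return 0;
	if (delegated_inode) {
		*delegated_inode = inode;
		inode->refcount++;	/* models ihold() */
	}
	return -EWOULDBLOCK;
}

/* Shaped like vfs_rename_other(): source first, then target if present. */
int toy_vfs_rename(struct toy_inode *source, struct toy_inode *target,
		   struct toy_inode **delegated_inode)
{
	int error = toy_try_break_deleg(source, delegated_inode);

	if (error)
		return error;
	if (target) {
		error = toy_try_break_deleg(target, delegated_inode);
		if (error)
			return error;
	}
	return 0;	/* the real code calls ->rename() at this point */
}

/* Shaped like the renameat retry loop; returns the number of passes. */
int toy_renameat(struct toy_inode *source, struct toy_inode *target)
{
	struct toy_inode *delegated_inode = NULL;
	int passes = 0;
	int error;

retry_deleg:
	passes++;
	/* lock_rename() would be taken here */
	error = toy_vfs_rename(source, target, &delegated_inode);
	/* unlock_rename() would be called here, before any waiting */
	if (delegated_inode) {
		delegated_inode->delegated = 0;	/* wait, modelled as instant */
		delegated_inode->refcount--;	/* models iput() */
		delegated_inode = NULL;
		goto retry_deleg;
	}
	return error ? error : passes;
}
```

With a NULL target, as when the destination name does not yet exist,
only the source is checked and at most two passes are needed.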


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/16] locks: break delegations on link
@ 2013-07-17 20:50     ` J. Bruce Fields
  0 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields,
	Tyler Hicks, Dustin Kirkland

From: "J. Bruce Fields" <bfields@redhat.com>

Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/ecryptfs/inode.c |    2 +-
 fs/namei.c          |   36 ++++++++++++++++++++++++++++++++----
 fs/nfsd/vfs.c       |    2 +-
 include/linux/fs.h  |    2 +-
 4 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 0b8b632..cb3ac33 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -475,7 +475,7 @@ static int ecryptfs_link(struct dentry *old_dentry, struct inode *dir,
 	dget(lower_new_dentry);
 	lower_dir_dentry = lock_parent(lower_new_dentry);
 	rc = vfs_link(lower_old_dentry, lower_dir_dentry->d_inode,
-		      lower_new_dentry);
+		      lower_new_dentry, NULL);
 	if (rc || !lower_new_dentry->d_inode)
 		goto out_lock;
 	rc = ecryptfs_interpose(lower_new_dentry, new_dentry, dir->i_sb);
diff --git a/fs/namei.c b/fs/namei.c
index 0554c9a..bd1f78e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3631,7 +3631,26 @@ SYSCALL_DEFINE2(symlink, const char __user *, oldname, const char __user *, newn
 	return sys_symlinkat(oldname, AT_FDCWD, newname);
 }
 
-int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry)
+/**
+ * vfs_link - create a new link
+ * @old_dentry:	object to be linked
+ * @dir:	new parent
+ * @new_dentry:	where to create the new link
+ * @delegated_inode: returns inode needing a delegation break
+ *
+ * The caller must hold dir->i_mutex.
+ *
+ * If vfs_link discovers a delegation on the to-be-linked file in need
+ * of breaking, it will return -EWOULDBLOCK and return a reference to the
+ * inode in delegated_inode.  The caller should then break the delegation
+ * and retry.  Because breaking a delegation may take a long time, the
+ * caller should drop the i_mutex before doing so.
+ *
+ * Alternatively, a caller may pass NULL for delegated_inode.  This may
+ * be appropriate for callers that expect the underlying filesystem not
+ * to be NFS exported.
+ */
+int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry, struct inode **delegated_inode)
 {
 	struct inode *inode = old_dentry->d_inode;
 	unsigned max_links = dir->i_sb->s_max_links;
@@ -3667,8 +3686,11 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 		error =  -ENOENT;
 	else if (max_links && inode->i_nlink >= max_links)
 		error = -EMLINK;
-	else
-		error = dir->i_op->link(old_dentry, dir, new_dentry);
+	else {
+		error = try_break_deleg(inode, delegated_inode);
+		if (!error)
+			error = dir->i_op->link(old_dentry, dir, new_dentry);
+	}
 
 	if (!error && (inode->i_state & I_LINKABLE)) {
 		spin_lock(&inode->i_lock);
@@ -3695,6 +3717,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
 {
 	struct dentry *new_dentry;
 	struct path old_path, new_path;
+	struct inode *delegated_inode = NULL;
 	int how = 0;
 	int error;
 
@@ -3733,9 +3756,14 @@ retry:
 	error = security_path_link(old_path.dentry, &new_path, new_dentry);
 	if (error)
 		goto out_dput;
-	error = vfs_link(old_path.dentry, new_path.dentry->d_inode, new_dentry);
+	error = vfs_link(old_path.dentry, new_path.dentry->d_inode, new_dentry, &delegated_inode);
 out_dput:
 	done_path_create(&new_path, new_dentry);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry;
+	}
 	if (retry_estale(error, how)) {
 		how |= LOOKUP_REVAL;
 		goto retry;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index fbf22d2..5479fff 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1736,7 +1736,7 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
 		err = nfserrno(host_err);
 		goto out_dput;
 	}
-	host_err = vfs_link(dold, dirp, dnew);
+	host_err = vfs_link(dold, dirp, dnew, NULL);
 	if (!host_err) {
 		err = nfserrno(commit_metadata(ffhp));
 		if (!err)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 74aa0c7..a2403f9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1467,7 +1467,7 @@ extern int vfs_create(struct inode *, struct dentry *, umode_t, bool);
 extern int vfs_mkdir(struct inode *, struct dentry *, umode_t);
 extern int vfs_mknod(struct inode *, struct dentry *, umode_t, dev_t);
 extern int vfs_symlink(struct inode *, struct dentry *, const char *);
-extern int vfs_link(struct dentry *, struct inode *, struct dentry *);
+extern int vfs_link(struct dentry *, struct inode *, struct dentry *, struct inode **);
 extern int vfs_rmdir(struct inode *, struct dentry *);
 extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
 extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/16] locks: break delegations on any attribute modification
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
@ 2013-07-17 20:50     ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	jlayton-H+wXaHxf7aLQT0dZR+AlfA, Dave Chinner, J. Bruce Fields,
	Mikulas Patocka, David Howells, Tyler Hicks, Dustin Kirkland

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

NFSv4 uses leases to guarantee that clients can cache metadata as well
as data, so break any delegation before modifying an inode's attributes.

Cc: Mikulas Patocka <mikulas-TTVWCEgN8Z9G4ohzP4jBZS1Fcj925eT/@public.gmane.org>
Cc: David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Tyler Hicks <tyhicks-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Cc: Dustin Kirkland <dustin.kirkland-Bv2LyzZ6GzxBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 drivers/base/devtmpfs.c   |    4 ++--
 fs/attr.c                 |   25 ++++++++++++++++++++++++-
 fs/cachefiles/interface.c |    4 ++--
 fs/ecryptfs/inode.c       |    2 +-
 fs/hpfs/namei.c           |    2 +-
 fs/inode.c                |    6 +++++-
 fs/nfsd/vfs.c             |    8 ++++++--
 fs/open.c                 |   22 ++++++++++++++++++----
 fs/utimes.c               |    9 ++++++++-
 include/linux/fs.h        |    2 +-
 10 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 1b8490e..0f38201 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -216,7 +216,7 @@ static int handle_create(const char *nodename, umode_t mode, kuid_t uid,
 		newattrs.ia_gid = gid;
 		newattrs.ia_valid = ATTR_MODE|ATTR_UID|ATTR_GID;
 		mutex_lock(&dentry->d_inode->i_mutex);
-		notify_change(dentry, &newattrs);
+		notify_change(dentry, &newattrs, NULL);
 		mutex_unlock(&dentry->d_inode->i_mutex);
 
 		/* mark as kernel-created inode */
@@ -322,7 +322,7 @@ static int handle_remove(const char *nodename, struct device *dev)
 			newattrs.ia_valid =
 				ATTR_UID|ATTR_GID|ATTR_MODE;
 			mutex_lock(&dentry->d_inode->i_mutex);
-			notify_change(dentry, &newattrs);
+			notify_change(dentry, &newattrs, NULL);
 			mutex_unlock(&dentry->d_inode->i_mutex);
 			err = vfs_unlink(parent.dentry->d_inode, dentry, NULL);
 			if (!err || err == -ENOENT)
diff --git a/fs/attr.c b/fs/attr.c
index 1449adb..267968d 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -167,7 +167,27 @@ void setattr_copy(struct inode *inode, const struct iattr *attr)
 }
 EXPORT_SYMBOL(setattr_copy);
 
-int notify_change(struct dentry * dentry, struct iattr * attr)
+/**
+ * notify_change - modify attributes of a filesystem object
+ * @dentry:	object affected
+ * @attr:	new attributes
+ * @delegated_inode: returns inode, if the inode is delegated
+ *
+ * The caller must hold the i_mutex on the affected object.
+ *
+ * If notify_change discovers a delegation in need of breaking,
+ * it will return -EWOULDBLOCK and return a reference to the inode in
+ * delegated_inode.  The caller should then break the delegation and
+ * retry.  Because breaking a delegation may take a long time, the
+ * caller should drop the i_mutex before doing so.
+ *
+ * Alternatively, a caller may pass NULL for delegated_inode.  This may
+ * be appropriate for callers that expect the underlying filesystem not
+ * to be NFS exported.  Also, passing NULL is fine for callers holding
+ * the file open for write, as there can be no conflicting delegation in
+ * that case.
+ */
+int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **delegated_inode)
 {
 	struct inode *inode = dentry->d_inode;
 	umode_t mode = inode->i_mode;
@@ -243,6 +263,9 @@ int notify_change(struct dentry * dentry, struct iattr * attr)
 	error = security_inode_setattr(dentry, attr);
 	if (error)
 		return error;
+	error = try_break_deleg(inode, delegated_inode);
+	if (error)
+		return error;
 
 	if (inode->i_op->setattr)
 		error = inode->i_op->setattr(dentry, attr);
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index d4c1206..ccc01f1 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -424,14 +424,14 @@ static int cachefiles_attr_changed(struct fscache_object *_object)
 		_debug("discard tail %llx", oi_size);
 		newattrs.ia_valid = ATTR_SIZE;
 		newattrs.ia_size = oi_size & PAGE_MASK;
-		ret = notify_change(object->backer, &newattrs);
+		ret = notify_change(object->backer, &newattrs, NULL);
 		if (ret < 0)
 			goto truncate_failed;
 	}
 
 	newattrs.ia_valid = ATTR_SIZE;
 	newattrs.ia_size = ni_size;
-	ret = notify_change(object->backer, &newattrs);
+	ret = notify_change(object->backer, &newattrs, NULL);
 
 truncate_failed:
 	mutex_unlock(&object->backer->d_inode->i_mutex);
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index cb3ac33..7b19ebb 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -992,7 +992,7 @@ static int ecryptfs_setattr(struct dentry *dentry, struct iattr *ia)
 		lower_ia.ia_valid &= ~ATTR_MODE;
 
 	mutex_lock(&lower_dentry->d_inode->i_mutex);
-	rc = notify_change(lower_dentry, &lower_ia);
+	rc = notify_change(lower_dentry, &lower_ia, NULL);
 	mutex_unlock(&lower_dentry->d_inode->i_mutex);
 out:
 	fsstack_copy_attr_all(inode, lower_inode);
diff --git a/fs/hpfs/namei.c b/fs/hpfs/namei.c
index 345713d..1b39afd 100644
--- a/fs/hpfs/namei.c
+++ b/fs/hpfs/namei.c
@@ -407,7 +407,7 @@ again:
 			/*printk("HPFS: truncating file before delete.\n");*/
 			newattrs.ia_size = 0;
 			newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
-			err = notify_change(dentry, &newattrs);
+			err = notify_change(dentry, &newattrs, NULL);
 			put_write_access(inode);
 			if (!err)
 				goto again;
diff --git a/fs/inode.c b/fs/inode.c
index 4178e91..82a5a30 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1642,7 +1642,11 @@ static int __remove_suid(struct dentry *dentry, int kill)
 	struct iattr newattrs;
 
 	newattrs.ia_valid = ATTR_FORCE | kill;
-	return notify_change(dentry, &newattrs);
+	/*
+	 * Note we call this on write, so notify_change will not
+	 * encounter any conflicting delegations:
+	 */
+	return notify_change(dentry, &newattrs, NULL);
 }
 
 int file_remove_suid(struct file *file)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 5479fff..2586f6d 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -427,7 +427,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 			goto out_nfserr;
 		fh_lock(fhp);
 
-		host_err = notify_change(dentry, iap);
+		host_err = notify_change(dentry, iap, NULL);
 		err = nfserrno(host_err);
 		fh_unlock(fhp);
 	}
@@ -987,7 +987,11 @@ static void kill_suid(struct dentry *dentry)
 	ia.ia_valid = ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
-	notify_change(dentry, &ia);
+	/*
+	 * Note we call this on write, so notify_change will not
+	 * encounter any conflicting delegations:
+	 */
+	notify_change(dentry, &ia, NULL);
 	mutex_unlock(&dentry->d_inode->i_mutex);
 }
 
diff --git a/fs/open.c b/fs/open.c
index 9156cb0..68e50fd 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -57,7 +57,8 @@ int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs,
 		newattrs.ia_valid |= ret | ATTR_FORCE;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
-	ret = notify_change(dentry, &newattrs);
+	/* Note any delegations or leases have already been broken: */
+	ret = notify_change(dentry, &newattrs, NULL);
 	mutex_unlock(&dentry->d_inode->i_mutex);
 	return ret;
 }
@@ -464,21 +465,28 @@ out:
 static int chmod_common(struct path *path, umode_t mode)
 {
 	struct inode *inode = path->dentry->d_inode;
+	struct inode *delegated_inode = NULL;
 	struct iattr newattrs;
 	int error;
 
 	error = mnt_want_write(path->mnt);
 	if (error)
 		return error;
+retry_deleg:
 	mutex_lock(&inode->i_mutex);
 	error = security_path_chmod(path, mode);
 	if (error)
 		goto out_unlock;
 	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-	error = notify_change(path->dentry, &newattrs);
+	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 out_unlock:
 	mutex_unlock(&inode->i_mutex);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 	mnt_drop_write(path->mnt);
 	return error;
 }
@@ -523,6 +531,7 @@ SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 static int chown_common(struct path *path, uid_t user, gid_t group)
 {
 	struct inode *inode = path->dentry->d_inode;
+	struct inode *delegated_inode = NULL;
 	int error;
 	struct iattr newattrs;
 	kuid_t uid;
@@ -547,12 +556,17 @@ static int chown_common(struct path *path, uid_t user, gid_t group)
 	if (!S_ISDIR(inode->i_mode))
 		newattrs.ia_valid |=
 			ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
+retry_deleg:
 	mutex_lock(&inode->i_mutex);
 	error = security_path_chown(path, uid, gid);
 	if (!error)
-		error = notify_change(path->dentry, &newattrs);
+		error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	mutex_unlock(&inode->i_mutex);
-
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 	return error;
 }
 
diff --git a/fs/utimes.c b/fs/utimes.c
index f4fb7ec..aa138d6 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -53,6 +53,7 @@ static int utimes_common(struct path *path, struct timespec *times)
 	int error;
 	struct iattr newattrs;
 	struct inode *inode = path->dentry->d_inode;
+	struct inode *delegated_inode = NULL;
 
 	error = mnt_want_write(path->mnt);
 	if (error)
@@ -101,9 +102,15 @@ static int utimes_common(struct path *path, struct timespec *times)
 				goto mnt_drop_write_and_out;
 		}
 	}
+retry_deleg:
 	mutex_lock(&inode->i_mutex);
-	error = notify_change(path->dentry, &newattrs);
+	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	mutex_unlock(&inode->i_mutex);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 
 mnt_drop_write_and_out:
 	mnt_drop_write(path->mnt);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a2403f9..638cdae 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2268,7 +2268,7 @@ extern void emergency_remount(void);
 #ifdef CONFIG_BLOCK
 extern sector_t bmap(struct inode *, sector_t);
 #endif
-extern int notify_change(struct dentry *, struct iattr *);
+extern int notify_change(struct dentry *, struct iattr *, struct inode **);
 extern int inode_permission(struct inode *, int);
 extern int generic_permission(struct inode *, int);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/16] locks: break delegations on any attribute modification
@ 2013-07-17 20:50     ` J. Bruce Fields
  0 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields,
	Mikulas Patocka, David Howells, Tyler Hicks, Dustin Kirkland

From: "J. Bruce Fields" <bfields@redhat.com>

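NFSv4 uses leases to guarantee that clients can cache metadata as well
as data.  So break any delegation before modifying a file's attributes:
notify_change() takes a new delegated_inode argument, and the chmod,
chown, and utimes paths use it to drop the i_mutex, wait for the
delegation to be returned, and retry.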
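
For readers who want the caller-side pattern in isolation, here is a
rough userspace sketch of the lock / try / unlock / wait / retry loop.
It is only a model under stated assumptions: struct toy_inode and the
toy_* functions are hypothetical stand-ins invented for illustration,
not the kernel's notify_change() or break_deleg_wait().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inode; not the kernel definition. */
struct toy_inode {
	bool locked;		/* models i_mutex being held */
	bool delegated;		/* some client holds a delegation */
	int refcount;
	unsigned mode;
};

/* Models notify_change(): refuses to block while the lock is held. */
static int toy_notify_change(struct toy_inode *inode, unsigned mode,
			     struct toy_inode **delegated_inode)
{
	assert(inode->locked);	/* caller must hold the lock */
	if (inode->delegated) {
		if (delegated_inode) {
			inode->refcount++;	/* reference handed to caller */
			*delegated_inode = inode;
		}
		return -EWOULDBLOCK;
	}
	inode->mode = mode;
	return 0;
}

/* Models break_deleg_wait(): here the recall completes immediately. */
static int toy_break_deleg_wait(struct toy_inode **delegated_inode)
{
	struct toy_inode *inode = *delegated_inode;

	assert(!inode->locked);	/* never wait with the lock held */
	inode->delegated = false;
	inode->refcount--;	/* drop the reference taken above */
	*delegated_inode = NULL;
	return 0;
}

/* Caller-side loop, shaped like chmod_common() in the diff. */
static int toy_chmod(struct toy_inode *inode, unsigned mode)
{
	struct toy_inode *delegated_inode = NULL;
	int error;

retry_deleg:
	inode->locked = true;
	error = toy_notify_change(inode, mode, &delegated_inode);
	inode->locked = false;
	if (delegated_inode) {
		error = toy_break_deleg_wait(&delegated_inode);
		if (!error)
			goto retry_deleg;
	}
	return error;
}
```

The point of the out-parameter is that the slow wait happens only
after the lock has been dropped, and the extra reference keeps the
inode alive across that window.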

Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 drivers/base/devtmpfs.c   |    4 ++--
 fs/attr.c                 |   25 ++++++++++++++++++++++++-
 fs/cachefiles/interface.c |    4 ++--
 fs/ecryptfs/inode.c       |    2 +-
 fs/hpfs/namei.c           |    2 +-
 fs/inode.c                |    6 +++++-
 fs/nfsd/vfs.c             |    8 ++++++--
 fs/open.c                 |   22 ++++++++++++++++++----
 fs/utimes.c               |    9 ++++++++-
 include/linux/fs.h        |    2 +-
 10 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 1b8490e..0f38201 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -216,7 +216,7 @@ static int handle_create(const char *nodename, umode_t mode, kuid_t uid,
 		newattrs.ia_gid = gid;
 		newattrs.ia_valid = ATTR_MODE|ATTR_UID|ATTR_GID;
 		mutex_lock(&dentry->d_inode->i_mutex);
-		notify_change(dentry, &newattrs);
+		notify_change(dentry, &newattrs, NULL);
 		mutex_unlock(&dentry->d_inode->i_mutex);
 
 		/* mark as kernel-created inode */
@@ -322,7 +322,7 @@ static int handle_remove(const char *nodename, struct device *dev)
 			newattrs.ia_valid =
 				ATTR_UID|ATTR_GID|ATTR_MODE;
 			mutex_lock(&dentry->d_inode->i_mutex);
-			notify_change(dentry, &newattrs);
+			notify_change(dentry, &newattrs, NULL);
 			mutex_unlock(&dentry->d_inode->i_mutex);
 			err = vfs_unlink(parent.dentry->d_inode, dentry, NULL);
 			if (!err || err == -ENOENT)
diff --git a/fs/attr.c b/fs/attr.c
index 1449adb..267968d 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -167,7 +167,27 @@ void setattr_copy(struct inode *inode, const struct iattr *attr)
 }
 EXPORT_SYMBOL(setattr_copy);
 
-int notify_change(struct dentry * dentry, struct iattr * attr)
+/**
+ * notify_change - modify attributes of a filesystem object
+ * @dentry:	object affected
+ * @attr:	new attributes
+ * @delegated_inode: returns inode, if the inode is delegated
+ *
+ * The caller must hold the i_mutex on the affected object.
+ *
+ * If notify_change discovers a delegation in need of breaking,
+ * it will return -EWOULDBLOCK and return a reference to the inode in
+ * delegated_inode.  The caller should then break the delegation and
+ * retry.  Because breaking a delegation may take a long time, the
+ * caller should drop the i_mutex before doing so.
+ *
+ * Alternatively, a caller may pass NULL for delegated_inode.  This may
+ * be appropriate for callers that expect the underlying filesystem not
+ * to be NFS exported.  Also, passing NULL is fine for callers holding
+ * the file open for write, as there can be no conflicting delegation in
+ * that case.
+ */
+int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **delegated_inode)
 {
 	struct inode *inode = dentry->d_inode;
 	umode_t mode = inode->i_mode;
@@ -243,6 +263,9 @@ int notify_change(struct dentry * dentry, struct iattr * attr)
 	error = security_inode_setattr(dentry, attr);
 	if (error)
 		return error;
+	error = try_break_deleg(inode, delegated_inode);
+	if (error)
+		return error;
 
 	if (inode->i_op->setattr)
 		error = inode->i_op->setattr(dentry, attr);
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index d4c1206..ccc01f1 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -424,14 +424,14 @@ static int cachefiles_attr_changed(struct fscache_object *_object)
 		_debug("discard tail %llx", oi_size);
 		newattrs.ia_valid = ATTR_SIZE;
 		newattrs.ia_size = oi_size & PAGE_MASK;
-		ret = notify_change(object->backer, &newattrs);
+		ret = notify_change(object->backer, &newattrs, NULL);
 		if (ret < 0)
 			goto truncate_failed;
 	}
 
 	newattrs.ia_valid = ATTR_SIZE;
 	newattrs.ia_size = ni_size;
-	ret = notify_change(object->backer, &newattrs);
+	ret = notify_change(object->backer, &newattrs, NULL);
 
 truncate_failed:
 	mutex_unlock(&object->backer->d_inode->i_mutex);
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index cb3ac33..7b19ebb 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -992,7 +992,7 @@ static int ecryptfs_setattr(struct dentry *dentry, struct iattr *ia)
 		lower_ia.ia_valid &= ~ATTR_MODE;
 
 	mutex_lock(&lower_dentry->d_inode->i_mutex);
-	rc = notify_change(lower_dentry, &lower_ia);
+	rc = notify_change(lower_dentry, &lower_ia, NULL);
 	mutex_unlock(&lower_dentry->d_inode->i_mutex);
 out:
 	fsstack_copy_attr_all(inode, lower_inode);
diff --git a/fs/hpfs/namei.c b/fs/hpfs/namei.c
index 345713d..1b39afd 100644
--- a/fs/hpfs/namei.c
+++ b/fs/hpfs/namei.c
@@ -407,7 +407,7 @@ again:
 			/*printk("HPFS: truncating file before delete.\n");*/
 			newattrs.ia_size = 0;
 			newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
-			err = notify_change(dentry, &newattrs);
+			err = notify_change(dentry, &newattrs, NULL);
 			put_write_access(inode);
 			if (!err)
 				goto again;
diff --git a/fs/inode.c b/fs/inode.c
index 4178e91..82a5a30 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1642,7 +1642,11 @@ static int __remove_suid(struct dentry *dentry, int kill)
 	struct iattr newattrs;
 
 	newattrs.ia_valid = ATTR_FORCE | kill;
-	return notify_change(dentry, &newattrs);
+	/*
+	 * Note we call this on write, so notify_change will not
+	 * encounter any conflicting delegations:
+	 */
+	return notify_change(dentry, &newattrs, NULL);
 }
 
 int file_remove_suid(struct file *file)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 5479fff..2586f6d 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -427,7 +427,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 			goto out_nfserr;
 		fh_lock(fhp);
 
-		host_err = notify_change(dentry, iap);
+		host_err = notify_change(dentry, iap, NULL);
 		err = nfserrno(host_err);
 		fh_unlock(fhp);
 	}
@@ -987,7 +987,11 @@ static void kill_suid(struct dentry *dentry)
 	ia.ia_valid = ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
-	notify_change(dentry, &ia);
+	/*
+	 * Note we call this on write, so notify_change will not
+	 * encounter any conflicting delegations:
+	 */
+	notify_change(dentry, &ia, NULL);
 	mutex_unlock(&dentry->d_inode->i_mutex);
 }
 
diff --git a/fs/open.c b/fs/open.c
index 9156cb0..68e50fd 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -57,7 +57,8 @@ int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs,
 		newattrs.ia_valid |= ret | ATTR_FORCE;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
-	ret = notify_change(dentry, &newattrs);
+	/* Note any delegations or leases have already been broken: */
+	ret = notify_change(dentry, &newattrs, NULL);
 	mutex_unlock(&dentry->d_inode->i_mutex);
 	return ret;
 }
@@ -464,21 +465,28 @@ out:
 static int chmod_common(struct path *path, umode_t mode)
 {
 	struct inode *inode = path->dentry->d_inode;
+	struct inode *delegated_inode = NULL;
 	struct iattr newattrs;
 	int error;
 
 	error = mnt_want_write(path->mnt);
 	if (error)
 		return error;
+retry_deleg:
 	mutex_lock(&inode->i_mutex);
 	error = security_path_chmod(path, mode);
 	if (error)
 		goto out_unlock;
 	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-	error = notify_change(path->dentry, &newattrs);
+	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 out_unlock:
 	mutex_unlock(&inode->i_mutex);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 	mnt_drop_write(path->mnt);
 	return error;
 }
@@ -523,6 +531,7 @@ SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 static int chown_common(struct path *path, uid_t user, gid_t group)
 {
 	struct inode *inode = path->dentry->d_inode;
+	struct inode *delegated_inode = NULL;
 	int error;
 	struct iattr newattrs;
 	kuid_t uid;
@@ -547,12 +556,17 @@ static int chown_common(struct path *path, uid_t user, gid_t group)
 	if (!S_ISDIR(inode->i_mode))
 		newattrs.ia_valid |=
 			ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
+retry_deleg:
 	mutex_lock(&inode->i_mutex);
 	error = security_path_chown(path, uid, gid);
 	if (!error)
-		error = notify_change(path->dentry, &newattrs);
+		error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	mutex_unlock(&inode->i_mutex);
-
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 	return error;
 }
 
diff --git a/fs/utimes.c b/fs/utimes.c
index f4fb7ec..aa138d6 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -53,6 +53,7 @@ static int utimes_common(struct path *path, struct timespec *times)
 	int error;
 	struct iattr newattrs;
 	struct inode *inode = path->dentry->d_inode;
+	struct inode *delegated_inode = NULL;
 
 	error = mnt_want_write(path->mnt);
 	if (error)
@@ -101,9 +102,15 @@ static int utimes_common(struct path *path, struct timespec *times)
 				goto mnt_drop_write_and_out;
 		}
 	}
+retry_deleg:
 	mutex_lock(&inode->i_mutex);
-	error = notify_change(path->dentry, &newattrs);
+	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	mutex_unlock(&inode->i_mutex);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 
 mnt_drop_write_and_out:
 	mnt_drop_write(path->mnt);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a2403f9..638cdae 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2268,7 +2268,7 @@ extern void emergency_remount(void);
 #ifdef CONFIG_BLOCK
 extern sector_t bmap(struct inode *, sector_t);
 #endif
-extern int notify_change(struct dentry *, struct iattr *);
+extern int notify_change(struct dentry *, struct iattr *, struct inode **);
 extern int inode_permission(struct inode *, int);
 extern int generic_permission(struct inode *, int);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 13/16] nfsd4: minor nfs4_setlease cleanup
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
                   ` (5 preceding siblings ...)
  2013-07-17 20:50 ` [PATCH 10/16] locks: break delegations on rename J. Bruce Fields
@ 2013-07-17 20:50 ` J. Bruce Fields
  2013-07-26 10:53   ` Jeff Layton
       [not found] ` <1374094217-31493-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

As far as I can tell, the per-client cl_delegations list is used only
under the state lock, so we may as well add the delegation to it only
after vfs_setlease() succeeds, which simplifies the error path.
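
A small sketch of the reordering idea, assuming a toy singly linked
list in place of the kernel's list_head and a hypothetical
toy_setlease() in place of vfs_setlease(): do the step that can fail
first, and publish the object on the list only once it has succeeded,
so the failure path has nothing to unlink.  Unlike the patch, which
always returns -ENOMEM, this sketch simply propagates the status.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy list types; hypothetical, not the kernel's struct list_head. */
struct toy_node { struct toy_node *next; };
struct toy_list { struct toy_node *head; size_t len; };

static void toy_list_add(struct toy_list *l, struct toy_node *n)
{
	n->next = l->head;
	l->head = n;
	l->len++;
}

/* Stand-in for the fallible step (vfs_setlease() in the patch). */
static int toy_setlease(bool should_fail)
{
	return should_fail ? -EAGAIN : 0;
}

/* Fallible step first; add to the list only after it succeeded. */
static int toy_install(struct toy_list *clnt_list, struct toy_node *deleg,
		       bool should_fail)
{
	int status = toy_setlease(should_fail);

	if (status)
		return status;	/* nothing was published, nothing to undo */
	toy_list_add(clnt_list, deleg);
	return 0;
}
```

This ordering is only safe because, as the message says, every user of
the list runs under the same lock, so nobody can observe the window
between the two steps.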

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4state.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1698816..7c91b6c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3028,18 +3028,18 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 	if (!fl)
 		return -ENOMEM;
 	fl->fl_file = find_readable_file(fp);
-	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
-	if (status) {
-		list_del_init(&dp->dl_perclnt);
-		locks_free_lock(fl);
-		return -ENOMEM;
-	}
+	if (status)
+		goto out_free;
+	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 	fp->fi_lease = fl;
 	fp->fi_deleg_file = get_file(fl->fl_file);
 	atomic_set(&fp->fi_delegees, 1);
 	list_add(&dp->dl_perfile, &fp->fi_delegations);
 	return 0;
+out_free:
+	locks_free_lock(fl);
+	return -ENOMEM;
 }
 
 static int nfs4_set_delegation(struct nfs4_delegation *dp)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 14/16] nfsd4: delay setting current_fh in open
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
@ 2013-07-17 20:50     ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

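This is basically a no-op, to simplify a following patch: do_open_lookup()
now hands the looked-up filehandle back to nfsd4_open(), which copies it
into current_fh only after nfsd4_process_open2() has run.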
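
The ownership change can be sketched in userspace as follows.  This is
an illustration under assumptions, not the nfsd code: struct toy_fh,
toy_lookup() and toy_open() are hypothetical names, and a plain struct
copy stands in for fh_dup2().  The callee allocates into the caller's
pointer, and the caller does the copy and the free at its single exit
point.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct svc_fh. */
struct toy_fh { int handle; };

/* Allocates *resfh; the caller owns it from then on, even on error. */
static int toy_lookup(const char *name, struct toy_fh **resfh)
{
	*resfh = malloc(sizeof(**resfh));
	if (!*resfh)
		return -ENOMEM;
	(*resfh)->handle = 0;
	if (strcmp(name, "missing") == 0)
		return -ENOENT;	/* failure after allocation */
	(*resfh)->handle = 42;
	return 0;
}

static int toy_open(struct toy_fh *current_fh, const char *name)
{
	struct toy_fh *resfh = NULL;
	int status;

	status = toy_lookup(name, &resfh);
	if (status)
		goto out;
	/* Later stages can use resfh while current_fh is still the parent. */
out:
	if (resfh && resfh != current_fh) {
		*current_fh = *resfh;	/* models fh_dup2(); runs on every exit, as in the patch */
		free(resfh);
	}
	return status;
}
```

Keeping current_fh untouched until the end is what lets a later patch
pass both the parent and the result down the call chain.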

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4proc.c |   36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index a7cee86..4c0cbeb 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -230,17 +230,17 @@ static void nfsd4_set_open_owner_reply_cache(struct nfsd4_compound_state *cstate
 }
 
 static __be32
-do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, struct nfsd4_open *open)
+do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
+	       struct nfsd4_open *open, struct svc_fh **resfh)
 {
 	struct svc_fh *current_fh = &cstate->current_fh;
-	struct svc_fh *resfh;
 	int accmode;
 	__be32 status;
 
-	resfh = kmalloc(sizeof(struct svc_fh), GFP_KERNEL);
-	if (!resfh)
+	*resfh = kmalloc(sizeof(struct svc_fh), GFP_KERNEL);
+	if (!*resfh)
 		return nfserr_jukebox;
-	fh_init(resfh, NFS4_FHSIZE);
+	fh_init(*resfh, NFS4_FHSIZE);
 	open->op_truncate = 0;
 
 	if (open->op_create) {
@@ -265,7 +265,7 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru
 		 */
 		status = do_nfsd_create(rqstp, current_fh, open->op_fname.data,
 					open->op_fname.len, &open->op_iattr,
-					resfh, open->op_createmode,
+					*resfh, open->op_createmode,
 					(u32 *)open->op_verf.data,
 					&open->op_truncate, &open->op_created);
 
@@ -282,29 +282,26 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru
 							FATTR4_WORD1_TIME_MODIFY);
 	} else {
 		status = nfsd_lookup(rqstp, current_fh,
-				     open->op_fname.data, open->op_fname.len, resfh);
+				     open->op_fname.data, open->op_fname.len, *resfh);
 		fh_unlock(current_fh);
 	}
 	if (status)
 		goto out;
-	status = nfsd_check_obj_isreg(resfh);
+	status = nfsd_check_obj_isreg(*resfh);
 	if (status)
 		goto out;
 
 	if (is_create_with_attrs(open) && open->op_acl != NULL)
-		do_set_nfs4_acl(rqstp, resfh, open->op_acl, open->op_bmval);
+		do_set_nfs4_acl(rqstp, *resfh, open->op_acl, open->op_bmval);
 
-	nfsd4_set_open_owner_reply_cache(cstate, open, resfh);
+	nfsd4_set_open_owner_reply_cache(cstate, open, *resfh);
 	accmode = NFSD_MAY_NOP;
 	if (open->op_created ||
 			open->op_claim_type == NFS4_OPEN_CLAIM_DELEGATE_CUR)
 		accmode |= NFSD_MAY_OWNER_OVERRIDE;
-	status = do_open_permission(rqstp, resfh, open, accmode);
+	status = do_open_permission(rqstp, *resfh, open, accmode);
 	set_change_info(&open->op_cinfo, current_fh);
-	fh_dup2(current_fh, resfh);
 out:
-	fh_put(resfh);
-	kfree(resfh);
 	return status;
 }
 
@@ -357,6 +354,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	   struct nfsd4_open *open)
 {
 	__be32 status;
+	struct svc_fh *resfh = NULL;
 	struct nfsd4_compoundres *resp;
 	struct net *net = SVC_NET(rqstp);
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
@@ -423,7 +421,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	switch (open->op_claim_type) {
 		case NFS4_OPEN_CLAIM_DELEGATE_CUR:
 		case NFS4_OPEN_CLAIM_NULL:
-			status = do_open_lookup(rqstp, cstate, open);
+			status = do_open_lookup(rqstp, cstate, open, &resfh);
 			if (status)
 				goto out;
 			break;
@@ -439,6 +437,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 			status = do_open_fhandle(rqstp, cstate, open);
 			if (status)
 				goto out;
+			resfh = &cstate->current_fh;
 			break;
 		case NFS4_OPEN_CLAIM_DELEG_PREV_FH:
              	case NFS4_OPEN_CLAIM_DELEGATE_PREV:
@@ -458,9 +457,14 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	 * successful, it (1) truncates the file if open->op_truncate was
 	 * set, (2) sets open->op_stateid, (3) sets open->op_delegation.
 	 */
-	status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
+	status = nfsd4_process_open2(rqstp, resfh, open);
 	WARN_ON(status && open->op_created);
 out:
+	if (resfh && resfh != &cstate->current_fh) {
+		fh_dup2(&cstate->current_fh, resfh);
+		fh_put(resfh);
+		kfree(resfh);
+	}
 	nfsd4_cleanup_open_state(open, status);
 	if (open->op_openowner && !nfsd4_has_session(cstate))
 		cstate->replay_owner = &open->op_openowner->oo_owner;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 15/16] nfsd4: close open-deleg/unlink/rename race
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
@ 2013-07-17 20:50     ` J. Bruce Fields
  2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

If a file is unlinked or renamed between the time when we do the local
open and the time when we get the delegation, then we will return to the
client indicating that it holds a delegation even though the file no
longer exists under the name it was opened under.

But a client performing an open-by-name, when it is returned a
delegation, must be able to assume that the file is still linked at the
name it was opened under.

So, pass the parent filehandle into the delegation and lease-setting
code, and use it to re-lookup the file after we get the lease.
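
As a rough model of the check described above (a sketch only: the toy
directory table and the toy_* functions are hypothetical, and a flag
stands in for the lease taken by vfs_setlease()), the idea is to take
the lease first and only then confirm that the name still resolves to
the object that was opened, backing the lease out if it does not:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy directory; an entry with ino == 0 is an empty slot. */
struct toy_dirent { const char *name; int ino; };
struct toy_dir { struct toy_dirent ents[4]; };

/* Returns the inode number bound to name, or 0 if there is none. */
static int toy_lookup(const struct toy_dir *dir, const char *name)
{
	size_t n = sizeof(dir->ents) / sizeof(dir->ents[0]);

	for (size_t i = 0; i < n; i++)
		if (dir->ents[i].ino && strcmp(dir->ents[i].name, name) == 0)
			return dir->ents[i].ino;
	return 0;
}

/* Models nfsd4_name_still_same(): does the name still map to our file? */
static bool toy_name_still_same(const struct toy_dir *parent,
				const char *name, int opened_ino)
{
	return toy_lookup(parent, name) == opened_ino;
}

/* Returns true if the delegation may be handed out. */
static bool toy_grant_delegation(const struct toy_dir *parent,
				 const char *name, int opened_ino,
				 bool *lease_held)
{
	*lease_held = true;	/* lease first: later unlink/rename must break it */
	if (!toy_name_still_same(parent, name, opened_ino)) {
		*lease_held = false;	/* models the F_UNLCK on the error path */
		return false;
	}
	return true;
}
```

The order matters: once the lease is held, a later unlink or rename has
to break it, so a lookup that succeeds after that point cannot be
silently invalidated before the client hears about it.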

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4proc.c  |    2 +-
 fs/nfsd/nfs4state.c |   52 +++++++++++++++++++++++++++++++++++++++++++--------
 fs/nfsd/xdr4.h      |    3 ++-
 3 files changed, 47 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 4c0cbeb..f44b29d 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -457,7 +457,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	 * successful, it (1) truncates the file if open->op_truncate was
 	 * set, (2) sets open->op_stateid, (3) sets open->op_delegation.
 	 */
-	status = nfsd4_process_open2(rqstp, resfh, open);
+	status = nfsd4_process_open2(rqstp, resfh, open, &cstate->current_fh);
 	WARN_ON(status && open->op_created);
 out:
 	if (resfh && resfh != &cstate->current_fh) {
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7c91b6c..193f2bb 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3018,7 +3018,28 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f
 	return fl;
 }
 
-static int nfs4_setlease(struct nfs4_delegation *dp)
+static bool nfsd4_name_still_same(struct svc_fh *parent, struct nfsd4_open *open, struct dentry *dentry)
+{
+	struct xdr_netobj *name = &open->op_fname;
+	struct dentry *res;
+	bool ret;
+
+	if (parent->fh_dentry == dentry)
+		/* This was an open by filehandle, we don't care: */
+		return true;
+	if (nfsd_mountpoint(dentry, parent->fh_export))
+		/* We assume those never change */
+		return true;
+	mutex_lock(&parent->fh_dentry->d_inode->i_mutex); /* XXX? */
+	res = lookup_one_len(name->data, parent->fh_dentry, name->len);
+	mutex_unlock(&parent->fh_dentry->d_inode->i_mutex);
+	ret = res == dentry;
+	if (!IS_ERR(res))
+		dput(res);
+	return ret;
+}
+
+static int nfs4_setlease(struct nfs4_delegation *dp, struct nfsd4_open *open, struct svc_fh *parent)
 {
 	struct nfs4_file *fp = dp->dl_file;
 	struct file_lock *fl;
@@ -3031,23 +3052,37 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
 	if (status)
 		goto out_free;
+	if (!nfsd4_name_still_same(parent, open, fl->fl_file->f_dentry))
+		goto out_unlease;
+	spin_lock(&recall_lock);
+	if (fp->fi_had_conflict)
+		/*
+		 * whoops, already broken, but before we got a chance to
+		 * install our delegation; never mind:
+		 */
+		goto out_unlock;
+	list_add(&dp->dl_perfile, &fp->fi_delegations);
+	spin_unlock(&recall_lock);
 	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 	fp->fi_lease = fl;
 	fp->fi_deleg_file = get_file(fl->fl_file);
 	atomic_set(&fp->fi_delegees, 1);
-	list_add(&dp->dl_perfile, &fp->fi_delegations);
 	return 0;
+out_unlock:
+	spin_unlock(&recall_lock);
+out_unlease:
+	vfs_setlease(fl->fl_file, F_UNLCK, &fl);
 out_free:
 	locks_free_lock(fl);
 	return -ENOMEM;
 }
 
-static int nfs4_set_delegation(struct nfs4_delegation *dp)
+static int nfs4_set_delegation(struct nfs4_delegation *dp, struct nfsd4_open *open, struct svc_fh *parent)
 {
 	struct nfs4_file *fp = dp->dl_file;
 
 	if (!fp->fi_lease)
-		return nfs4_setlease(dp);
+		return nfs4_setlease(dp, open, parent);
 	spin_lock(&recall_lock);
 	if (fp->fi_had_conflict) {
 		spin_unlock(&recall_lock);
@@ -3089,7 +3124,8 @@ static void nfsd4_open_deleg_none_ext(struct nfsd4_open *open, int status)
  */
 static void
 nfs4_open_delegation(struct net *net, struct svc_fh *fh,
-		     struct nfsd4_open *open, struct nfs4_ol_stateid *stp)
+		     struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
+		     struct svc_fh *parent)
 {
 	struct nfs4_delegation *dp;
 	struct nfs4_openowner *oo = container_of(stp->st_stateowner, struct nfs4_openowner, oo_owner);
@@ -3132,7 +3168,7 @@ nfs4_open_delegation(struct net *net, struct svc_fh *fh,
 	dp = alloc_init_deleg(oo->oo_owner.so_client, stp, fh);
 	if (dp == NULL)
 		goto out_no_deleg;
-	status = nfs4_set_delegation(dp);
+	status = nfs4_set_delegation(dp, open, parent);
 	if (status)
 		goto out_free;
 
@@ -3181,7 +3217,7 @@ static void nfsd4_deleg_xgrade_none_ext(struct nfsd4_open *open,
  * called with nfs4_lock_state() held.
  */
 __be32
-nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_open *open)
+nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_open *open, struct svc_fh *parent)
 {
 	struct nfsd4_compoundres *resp = rqstp->rq_resp;
 	struct nfs4_client *cl = open->op_openowner->oo_owner.so_client;
@@ -3250,7 +3286,7 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
 	* Attempt to hand out a delegation. No error return, because the
 	* OPEN succeeds even if we fail.
 	*/
-	nfs4_open_delegation(SVC_NET(rqstp), current_fh, open, stp);
+	nfs4_open_delegation(SVC_NET(rqstp), current_fh, open, stp, parent);
 nodeleg:
 	status = nfs_ok;
 
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index b3ed644..3058885 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -596,7 +596,8 @@ __be32 nfsd4_reclaim_complete(struct svc_rqst *, struct nfsd4_compound_state *,
 extern __be32 nfsd4_process_open1(struct nfsd4_compound_state *,
 		struct nfsd4_open *open, struct nfsd_net *nn);
 extern __be32 nfsd4_process_open2(struct svc_rqst *rqstp,
-		struct svc_fh *current_fh, struct nfsd4_open *open);
+		struct svc_fh *current_fh, struct nfsd4_open *open,
+		struct svc_fh *parent);
 extern void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status);
 extern __be32 nfsd4_open_confirm(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *, struct nfsd4_open_confirm *oc);
-- 
1.7.9.5

* [PATCH 15/16] nfsd4: close open-deleg/unlink/rename race
@ 2013-07-17 20:50     ` J. Bruce Fields
  0 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

If a file is unlinked or renamed between the time when we do the local
open and the time when we get the delegation, then we will return to the
client indicating that it holds a delegation even though the file no
longer exists under the name it was open under.

But a client performing an open-by-name, when it is returned a
delegation, must be able to assume that the file is still linked at the
name it was opened under.

So, pass the parent filehandle into the delegation and lease-setting
code, and use it to re-lookup the file after we get the lease.
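
For readers who want to see the ordering in isolation, here is a minimal
userspace sketch of the "take the lease first, then re-check the name" idea.
It is a toy model, not the nfsd code: struct toy_dir, toy_lookup(),
toy_unlink() and toy_grant_delegation() are hypothetical names, and the bool
lease flag only stands in for vfs_setlease().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_DIR_SLOTS 4

/* Toy directory: slot i links names[i] to object id objs[i]; 0 means empty. */
struct toy_dir {
	const char *names[TOY_DIR_SLOTS];
	int objs[TOY_DIR_SLOTS];
};

/* Return the object currently linked at 'name', or 0 if there is none. */
static int toy_lookup(const struct toy_dir *dir, const char *name)
{
	for (size_t i = 0; i < TOY_DIR_SLOTS; i++)
		if (dir->objs[i] && strcmp(dir->names[i], name) == 0)
			return dir->objs[i];
	return 0;
}

/* Drop the link at 'name', as a racing unlink or rename would. */
static void toy_unlink(struct toy_dir *dir, const char *name)
{
	for (size_t i = 0; i < TOY_DIR_SLOTS; i++)
		if (dir->objs[i] && strcmp(dir->names[i], name) == 0)
			dir->objs[i] = 0;
}

/*
 * Try to hand out a delegation on 'obj', which was opened as 'name'.
 * The lease is taken first and the name is re-checked second, so a
 * removal that happened before the lease existed is caught here.
 */
static bool toy_grant_delegation(const struct toy_dir *dir, const char *name,
				 int obj, bool *lease_held)
{
	*lease_held = true;		/* stands in for taking the lease */
	if (toy_lookup(dir, name) != obj) {
		*lease_held = false;	/* stands in for dropping it again */
		return false;
	}
	return true;
}

/* Usage: the same request succeeds or fails depending on an earlier unlink. */
void toy_deleg_demo(void)
{
	struct toy_dir dir = { { "a", "b", NULL, NULL }, { 1, 2, 0, 0 } };
	bool lease = false;

	assert(toy_grant_delegation(&dir, "a", 1, &lease) && lease);
	toy_unlink(&dir, "a");
	assert(!toy_grant_delegation(&dir, "a", 1, &lease) && !lease);
}
```

The order matters because, in this series, an unlink or rename that arrives
once the lease is in place has to break it. The re-lookup therefore only
needs to catch changes that slipped in before the lease was taken.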

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4proc.c  |    2 +-
 fs/nfsd/nfs4state.c |   52 +++++++++++++++++++++++++++++++++++++++++++--------
 fs/nfsd/xdr4.h      |    3 ++-
 3 files changed, 47 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 4c0cbeb..f44b29d 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -457,7 +457,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	 * successful, it (1) truncates the file if open->op_truncate was
 	 * set, (2) sets open->op_stateid, (3) sets open->op_delegation.
 	 */
-	status = nfsd4_process_open2(rqstp, resfh, open);
+	status = nfsd4_process_open2(rqstp, resfh, open, &cstate->current_fh);
 	WARN_ON(status && open->op_created);
 out:
 	if (resfh && resfh != &cstate->current_fh) {
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7c91b6c..193f2bb 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3018,7 +3018,28 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f
 	return fl;
 }
 
-static int nfs4_setlease(struct nfs4_delegation *dp)
+static bool nfsd4_name_still_same(struct svc_fh *parent, struct nfsd4_open *open, struct dentry *dentry)
+{
+	struct xdr_netobj *name = &open->op_fname;
+	struct dentry *res;
+	bool ret;
+
+	if (parent->fh_dentry == dentry)
+		/* This was an open by filehandle, we don't care: */
+		return true;
+	if (nfsd_mountpoint(dentry, parent->fh_export))
+		/* We assume those never change */
+		return true;
+	mutex_lock(&parent->fh_dentry->d_inode->i_mutex); /* XXX? */
+	res = lookup_one_len(name->data, parent->fh_dentry, name->len);
+	mutex_unlock(&parent->fh_dentry->d_inode->i_mutex);
+	ret = res == dentry;
+	if (!IS_ERR(res))
+		dput(res);
+	return ret;
+}
+
+static int nfs4_setlease(struct nfs4_delegation *dp, struct nfsd4_open *open, struct svc_fh *parent)
 {
 	struct nfs4_file *fp = dp->dl_file;
 	struct file_lock *fl;
@@ -3031,23 +3052,37 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
 	if (status)
 		goto out_free;
+	if (!nfsd4_name_still_same(parent, open, fl->fl_file->f_dentry))
+		goto out_unlease;
+	spin_lock(&recall_lock);
+	if (fp->fi_had_conflict)
+		/*
+		 * whoops, already broken, but before we got a chance to
+		 * install our delegation; never mind:
+		 */
+		 goto out_unlock;
+	list_add(&dp->dl_perfile, &fp->fi_delegations);
+	spin_unlock(&recall_lock);
 	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 	fp->fi_lease = fl;
 	fp->fi_deleg_file = get_file(fl->fl_file);
 	atomic_set(&fp->fi_delegees, 1);
-	list_add(&dp->dl_perfile, &fp->fi_delegations);
 	return 0;
+out_unlock:
+	spin_unlock(&recall_lock);
+out_unlease:
+	vfs_setlease(fl->fl_file, F_UNLCK, &fl);
 out_free:
 	locks_free_lock(fl);
 	return -ENOMEM;
 }
 
-static int nfs4_set_delegation(struct nfs4_delegation *dp)
+static int nfs4_set_delegation(struct nfs4_delegation *dp, struct nfsd4_open *open, struct svc_fh *parent)
 {
 	struct nfs4_file *fp = dp->dl_file;
 
 	if (!fp->fi_lease)
-		return nfs4_setlease(dp);
+		return nfs4_setlease(dp, open, parent);
 	spin_lock(&recall_lock);
 	if (fp->fi_had_conflict) {
 		spin_unlock(&recall_lock);
@@ -3089,7 +3124,8 @@ static void nfsd4_open_deleg_none_ext(struct nfsd4_open *open, int status)
  */
 static void
 nfs4_open_delegation(struct net *net, struct svc_fh *fh,
-		     struct nfsd4_open *open, struct nfs4_ol_stateid *stp)
+		     struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
+		     struct svc_fh *parent)
 {
 	struct nfs4_delegation *dp;
 	struct nfs4_openowner *oo = container_of(stp->st_stateowner, struct nfs4_openowner, oo_owner);
@@ -3132,7 +3168,7 @@ nfs4_open_delegation(struct net *net, struct svc_fh *fh,
 	dp = alloc_init_deleg(oo->oo_owner.so_client, stp, fh);
 	if (dp == NULL)
 		goto out_no_deleg;
-	status = nfs4_set_delegation(dp);
+	status = nfs4_set_delegation(dp, open, parent);
 	if (status)
 		goto out_free;
 
@@ -3181,7 +3217,7 @@ static void nfsd4_deleg_xgrade_none_ext(struct nfsd4_open *open,
  * called with nfs4_lock_state() held.
  */
 __be32
-nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_open *open)
+nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_open *open, struct svc_fh *parent)
 {
 	struct nfsd4_compoundres *resp = rqstp->rq_resp;
 	struct nfs4_client *cl = open->op_openowner->oo_owner.so_client;
@@ -3250,7 +3286,7 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
 	* Attempt to hand out a delegation. No error return, because the
 	* OPEN succeeds even if we fail.
 	*/
-	nfs4_open_delegation(SVC_NET(rqstp), current_fh, open, stp);
+	nfs4_open_delegation(SVC_NET(rqstp), current_fh, open, stp, parent);
 nodeleg:
 	status = nfs_ok;
 
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index b3ed644..3058885 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -596,7 +596,8 @@ __be32 nfsd4_reclaim_complete(struct svc_rqst *, struct nfsd4_compound_state *,
 extern __be32 nfsd4_process_open1(struct nfsd4_compound_state *,
 		struct nfsd4_open *open, struct nfsd_net *nn);
 extern __be32 nfsd4_process_open2(struct svc_rqst *rqstp,
-		struct svc_fh *current_fh, struct nfsd4_open *open);
+		struct svc_fh *current_fh, struct nfsd4_open *open,
+		struct svc_fh *parent);
 extern void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status);
 extern __be32 nfsd4_open_confirm(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *, struct nfsd4_open_confirm *oc);
-- 
1.7.9.5


* [PATCH 16/16] nfsd4: break only delegations when appropriate
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
                   ` (7 preceding siblings ...)
       [not found] ` <1374094217-31493-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-07-17 20:50 ` J. Bruce Fields
       [not found]   ` <1374094217-31493-18-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2013-07-17 21:09 ` [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
  9 siblings, 1 reply; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 20:50 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, jlayton, Dave Chinner, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

As a temporary fix, nfsd was breaking all leases on unlink, link,
rename, and setattr.

Now that we can distinguish between leases and delegations, we can be
nicer and break only the delegations, and not bother lease-holders with
operations they don't care about.

And we get to delete some code while we're at it.
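
To make the lease/delegation distinction concrete, here is a small userspace
model. It is only a sketch under stated assumptions: enum toy_kind,
enum toy_op, struct toy_lock and toy_break() are hypothetical names, not the
fs/locks.c interfaces. The model assumes that a conflicting open for write
disturbs both kinds of lock, while unlink, link, rename and setattr disturb
only delegations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_LEASE, TOY_DELEG };
enum toy_op { TOY_OPEN_WRITE, TOY_UNLINK, TOY_RENAME, TOY_LINK, TOY_SETATTR };

struct toy_lock {
	enum toy_kind kind;
	bool breaking;		/* a break has been started on this lock */
};

/* Does operation 'op' conflict with a lock of the given kind? */
static bool toy_conflicts(enum toy_op op, enum toy_kind kind)
{
	if (op == TOY_OPEN_WRITE)
		return true;		/* data access: everyone cares */
	return kind == TOY_DELEG;	/* namespace/attr change: delegations only */
}

/* Start breaking every conflicting lock; return how many breaks were started. */
static size_t toy_break(struct toy_lock *locks, size_t n, enum toy_op op)
{
	size_t started = 0;

	for (size_t i = 0; i < n; i++) {
		if (locks[i].breaking || !toy_conflicts(op, locks[i].kind))
			continue;
		locks[i].breaking = true;
		started++;
	}
	return started;
}

/* Usage: an unlink leaves the plain lease alone, a write open does not. */
void toy_break_demo(void)
{
	struct toy_lock locks[] = { { TOY_LEASE, false }, { TOY_DELEG, false } };

	assert(toy_break(locks, 2, TOY_UNLINK) == 1);
	assert(!locks[0].breaking && locks[1].breaking);
	assert(toy_break(locks, 2, TOY_OPEN_WRITE) == 1);
	assert(locks[0].breaking);
}
```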

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/vfs.c |   27 ---------------------------
 1 file changed, 27 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 2586f6d..51a5ede 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -273,13 +273,6 @@ out:
 	return err;
 }
 
-static int nfsd_break_lease(struct inode *inode)
-{
-	if (!S_ISREG(inode->i_mode))
-		return 0;
-	return break_lease(inode, O_WRONLY | O_NONBLOCK);
-}
-
 /*
  * Commit metadata changes to stable storage.
  */
@@ -422,9 +415,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 
 	err = nfserr_notsync;
 	if (!check_guard || guardtime == inode->i_ctime.tv_sec) {
-		host_err = nfsd_break_lease(inode);
-		if (host_err)
-			goto out_nfserr;
 		fh_lock(fhp);
 
 		host_err = notify_change(dentry, iap, NULL);
@@ -1735,11 +1725,6 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
 	err = nfserr_noent;
 	if (!dold->d_inode)
 		goto out_dput;
-	host_err = nfsd_break_lease(dold->d_inode);
-	if (host_err) {
-		err = nfserrno(host_err);
-		goto out_dput;
-	}
 	host_err = vfs_link(dold, dirp, dnew, NULL);
 	if (!host_err) {
 		err = nfserrno(commit_metadata(ffhp));
@@ -1833,14 +1818,6 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 	if (ffhp->fh_export->ex_path.dentry != tfhp->fh_export->ex_path.dentry)
 		goto out_dput_new;
 
-	host_err = nfsd_break_lease(odentry->d_inode);
-	if (host_err)
-		goto out_dput_new;
-	if (ndentry->d_inode) {
-		host_err = nfsd_break_lease(ndentry->d_inode);
-		if (host_err)
-			goto out_dput_new;
-	}
 	host_err = vfs_rename(fdir, odentry, tdir, ndentry, NULL);
 	if (!host_err) {
 		host_err = commit_metadata(tfhp);
@@ -1910,16 +1887,12 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 	if (!type)
 		type = rdentry->d_inode->i_mode & S_IFMT;
 
-	host_err = nfsd_break_lease(rdentry->d_inode);
-	if (host_err)
-		goto out_put;
 	if (type != S_IFDIR)
 		host_err = vfs_unlink(dirp, rdentry, NULL);
 	else
 		host_err = vfs_rmdir(dirp, rdentry);
 	if (!host_err)
 		host_err = commit_metadata(fhp);
-out_put:
 	dput(rdentry);
 
 out_nfserr:
-- 
1.7.9.5


* Re: [PATCH] nfsd4: fix minorversion support interface
  2013-07-17 20:50 ` [PATCH] nfsd4: fix minorversion support interface J. Bruce Fields
@ 2013-07-17 21:08   ` J. Bruce Fields
  0 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 21:08 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Al Viro, linux-nfs, linux-fsdevel, jlayton, Dave Chinner

On Wed, Jul 17, 2013 at 04:50:01PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> You can turn on or off support for minorversions using e.g.
> 
> 	echo "-4.2" >/proc/fs/nfsd/versions
> 
> However, the current implementation is a little wonky.  For example, the
> above will turn off 4.2 support, but it will also turn *on* 4.1 support.

Argh, sorry, I mistakenly fed an unrelated patch on the git-send-email
commandline: just ignore this one patch.

--b.

> 
> This didn't matter as long as we only had 2 minorversions, which was
> true till very recently.
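
The behaviour described in the quoted text can be reproduced with a small
userspace model of the two bookkeeping schemes. This is a sketch, not the
nfssvc.c code: the old_* and new_* helpers and TOY_MAX_MINOR are hypothetical
names, and the maximum minor version is assumed to be 2.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_MINOR 2	/* assumed highest minor version */

/* Old scheme: one high-water mark; every version up to it counts as enabled. */
static unsigned int old_supported = 1;

static void old_set(unsigned int v)
{
	old_supported = v;
}

static int old_clear(unsigned int v)
{
	if (v == 0)
		return -1;
	old_supported = v - 1;	/* side effect: re-enables everything below v */
	return 0;
}

static bool old_test(unsigned int v)
{
	return v <= old_supported;
}

/* New scheme: one independent flag per minor version. */
static bool new_supported[TOY_MAX_MINOR + 1] = { [0] = true, [1] = true };

static void new_clear(unsigned int v)
{
	if (v <= TOY_MAX_MINOR)
		new_supported[v] = false;
}

static bool new_test(unsigned int v)
{
	return v <= TOY_MAX_MINOR && new_supported[v];
}

/* Usage: with 4.1 off, clearing 4.2 turns 4.1 back on only in the old scheme. */
void toy_minorversion_demo(void)
{
	old_set(0);			/* only 4.0 enabled */
	assert(!old_test(1));
	old_clear(2);			/* "-4.2" */
	assert(old_test(1));		/* 4.1 came back */

	new_clear(1);
	new_clear(2);
	assert(new_test(0) && !new_test(1) && !new_test(2));
}
```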
> 
> And do a little cleanup here.
> 
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/nfsd/nfs4proc.c |    2 +-
>  fs/nfsd/nfsd.h     |    1 -
>  fs/nfsd/nfssvc.c   |   13 +++++++------
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index a7cee86..0d4c410 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1293,7 +1293,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
>  	 * According to RFC3010, this takes precedence over all other errors.
>  	 */
>  	status = nfserr_minor_vers_mismatch;
> -	if (args->minorversion > nfsd_supported_minorversion)
> +	if (nfsd_minorversion(args->minorversion, NFSD_TEST) <= 0)
>  		goto out;
>  
>  	status = nfs41_check_op_ordering(args);
> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> index 2bbd94e..30f34ab 100644
> --- a/fs/nfsd/nfsd.h
> +++ b/fs/nfsd/nfsd.h
> @@ -53,7 +53,6 @@ struct readdir_cd {
>  extern struct svc_program	nfsd_program;
>  extern struct svc_version	nfsd_version2, nfsd_version3,
>  				nfsd_version4;
> -extern u32			nfsd_supported_minorversion;
>  extern struct mutex		nfsd_mutex;
>  extern spinlock_t		nfsd_drc_lock;
>  extern unsigned long		nfsd_drc_max_mem;
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 6b9f48c..760c85a 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -116,7 +116,10 @@ struct svc_program		nfsd_program = {
>  
>  };
>  
> -u32 nfsd_supported_minorversion = 1;
> +static bool nfsd_supported_minorversions[NFSD_SUPPORTED_MINOR_VERSION + 1] = {
> +	[0] = 1,
> +	[1] = 1,
> +};
>  
>  int nfsd_vers(int vers, enum vers_op change)
>  {
> @@ -151,15 +154,13 @@ int nfsd_minorversion(u32 minorversion, enum vers_op change)
>  		return -1;
>  	switch(change) {
>  	case NFSD_SET:
> -		nfsd_supported_minorversion = minorversion;
> +		nfsd_supported_minorversions[minorversion] = true;
>  		break;
>  	case NFSD_CLEAR:
> -		if (minorversion == 0)
> -			return -1;
> -		nfsd_supported_minorversion = minorversion - 1;
> +		nfsd_supported_minorversions[minorversion] = false;
>  		break;
>  	case NFSD_TEST:
> -		return minorversion <= nfsd_supported_minorversion;
> +		return nfsd_supported_minorversions[minorversion];
>  	case NFSD_AVAIL:
>  		return minorversion <= NFSD_SUPPORTED_MINOR_VERSION;
>  	}
> -- 
> 1.7.9.5
> 

* Re: [PATCH 00/16] Implement NFSv4 delegations, take 9
  2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
                   ` (8 preceding siblings ...)
  2013-07-17 20:50 ` [PATCH 16/16] nfsd4: break only delegations when appropriate J. Bruce Fields
@ 2013-07-17 21:09 ` J. Bruce Fields
  9 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-17 21:09 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Al Viro, linux-nfs, linux-fsdevel, jlayton, Dave Chinner

On Wed, Jul 17, 2013 at 04:50:00PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> Changes since version 8, thanks to dchinner and jlayton for review:
> 
> 	- additional warnings in lock_two_nondirectories
> 	- lock_two_nondirectories handles NULL second argument,
> 	  simplifying vfs_rename_other
> 	- kerneldoc comments on notify_change, vfs_link, vfs_rename,
> 	  vfs_unlink, to explain delegated_inode argument.
> 	- make clear non-support of write delegations in
> 	  generic_add_lease
> 	- rebase to 3.11-rc1

(Not addressed: Dave's concerns about possible incompatibility with
ordering of filesystem-specific locks, in xfs and possibly also in other
filesystems.)

--b.

> 
> Introduction copied from previous posting:
> 
> This patch series implements read delegations, which allow NFSv4 clients
> to perform read opens without contacting the server, by promising to
> call back to clients before modifying the data, metadata, or set of
> links pointing to a file.
> 
> The main recent change was in response to review from Linus, who didn't
> want us to hang under directory i_mutex's on timeouts communicating with
> unresponsive clients.
> 
> So, this version of the series drops the i_mutex before waiting.  The
> logic ends up looking something like:
> 
>         acquire locks
>         look up inode
>         test for delegation; if found:
>                 take reference on inode
>                 release locks
>                 wait for delegation break
>                 drop reference on inode
>                 retry
> 
> The initial test for a delegation happens after the lock on the
> delegated inode is acquired, but additional directory mutexes may have
> been acquired further up the call stack.  I therefore add a "struct
> inode **" argument to any intervening functions, which we use to pass
> the inode back up to the caller in the case it needs to wait for the
> delegation to be broken.
> 
> I also allow callers to pass in NULL for the "struct inode **" argument
> to indicate they'd rather just fail than wait for a delegation.  For
> example, as long as ecryptfs isn't exportable I assume they'd rather not
> see retry logic there that they won't use.  But I may have misjudged in
> some of these cases.
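
The retry pattern quoted above can be sketched in userspace as follows. This
is a simplified model under stated assumptions, not the fs/namei.c code:
struct toy_inode, toy_try_break(), toy_break_wait() and toy_remove() are
hypothetical names, the "locks" are a plain counter, and waiting for the
client is modelled as the delegation simply going away.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	int refs;		/* reference count */
	bool delegated;		/* a client currently holds a delegation */
};

static int toy_locks_held;	/* number of "mutexes" currently held */
static int toy_waits_locked;	/* waits that happened while a lock was held */

/* Non-blocking test; optionally hands a referenced inode back to the caller. */
static int toy_try_break(struct toy_inode *inode, struct toy_inode **delegated)
{
	if (!inode->delegated)
		return 0;
	if (delegated) {
		*delegated = inode;
		inode->refs++;
	}
	return -EWOULDBLOCK;
}

/* Blocking wait for the break, then drop the reference taken above. */
static int toy_break_wait(struct toy_inode **delegated)
{
	if (toy_locks_held)
		toy_waits_locked++;	/* would be the bug the series avoids */
	(*delegated)->delegated = false;	/* client returned the delegation */
	(*delegated)->refs--;
	*delegated = NULL;
	return 0;
}

/* An unlink-like operation; may_wait == false models passing NULL. */
static int toy_remove(struct toy_inode *target, bool may_wait)
{
	struct toy_inode *delegated = NULL;
	int error;

retry:
	toy_locks_held++;			/* acquire locks */
	error = toy_try_break(target, may_wait ? &delegated : NULL);
	toy_locks_held--;			/* release locks */
	if (delegated) {
		error = toy_break_wait(&delegated);	/* no locks held here */
		if (!error)
			goto retry;
	}
	return error;
}

/* Usage: a caller that may not wait fails; one that may wait succeeds. */
void toy_retry_demo(void)
{
	struct toy_inode inode = { 1, true };

	assert(toy_remove(&inode, false) == -EWOULDBLOCK);
	assert(toy_remove(&inode, true) == 0);
	assert(inode.refs == 1 && toy_waits_locked == 0);
}
```

Handing the inode back through the pointer argument is what lets the
outermost caller do the waiting, after every lock taken further up the call
stack has been released.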
> 
> J. Bruce Fields (16):
>   vfs: pull ext4's double-i_mutex-locking into common code
>   vfs: don't use PARENT/CHILD lock classes for non-directories
>   vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
>   vfs: take i_mutex on renamed file
>   locks: introduce new FL_DELEG lock flag
>   locks: implement delegations
>   namei: minor vfs_unlink cleanup
>   locks: break delegations on unlink
>   locks: helper functions for delegation breaking
>   locks: break delegations on rename
>   locks: break delegations on link
>   locks: break delegations on any attribute modification
>   nfsd4: minor nfs4_setlease cleanup
>   nfsd4: delay setting current_fh in open
>   nfsd4: close open-deleg/unlink/rename race
>   nfsd4: break only delegations when appropriate
> 
>  Documentation/filesystems/directory-locking |   31 ++++--
>  drivers/base/devtmpfs.c                     |    6 +-
>  fs/attr.c                                   |   25 ++++-
>  fs/cachefiles/interface.c                   |    4 +-
>  fs/cachefiles/namei.c                       |    4 +-
>  fs/ecryptfs/inode.c                         |    6 +-
>  fs/ext4/ext4.h                              |    2 -
>  fs/ext4/ioctl.c                             |    4 +-
>  fs/ext4/move_extent.c                       |   40 +-------
>  fs/hpfs/namei.c                             |    2 +-
>  fs/inode.c                                  |   42 ++++++++-
>  fs/locks.c                                  |   57 ++++++++---
>  fs/namei.c                                  |  135 +++++++++++++++++++++++----
>  fs/nfsd/nfs4proc.c                          |   36 +++----
>  fs/nfsd/nfs4state.c                         |   66 ++++++++++---
>  fs/nfsd/vfs.c                               |   41 ++------
>  fs/nfsd/xdr4.h                              |    3 +-
>  fs/open.c                                   |   22 ++++-
>  fs/utimes.c                                 |    9 +-
>  include/linux/fs.h                          |   72 +++++++++++---
>  ipc/mqueue.c                                |    2 +-
>  21 files changed, 433 insertions(+), 176 deletions(-)
> 
> -- 
> 1.7.9.5
> 

* Re: [PATCH 09/16] locks: helper functions for delegation breaking
  2013-07-17 20:50     ` J. Bruce Fields
@ 2013-07-26 10:50         ` Jeff Layton
  -1 siblings, 0 replies; 43+ messages in thread
From: Jeff Layton @ 2013-07-26 10:50 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Al Viro, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Dave Chinner

On Wed, 17 Jul 2013 16:50:10 -0400
"J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

> From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> 
> We'll need the same logic for rename and link.
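
As an illustration of why a shared helper pays off, here is a userspace
sketch of how a rename-like caller with two inodes might reuse it. The rename
and link patches themselves are not quoted in this message, so everything
here is an assumption: struct toy_inode, toy_try_break_deleg() and
toy_rename_check() are hypothetical names that only mirror the shape of
try_break_deleg() from the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	int refs;		/* reference count */
	bool delegated;		/* a client currently holds a delegation */
};

/*
 * Model of the helper: report -EWOULDBLOCK for a delegated inode and,
 * if the caller supplied somewhere to put it, return a referenced inode.
 */
static int toy_try_break_deleg(struct toy_inode *inode,
			       struct toy_inode **delegated_inode)
{
	if (!inode->delegated)
		return 0;
	if (delegated_inode) {
		*delegated_inode = inode;
		inode->refs++;
	}
	return -EWOULDBLOCK;
}

/*
 * Hypothetical rename-like check: the source is tested first, then the
 * target if one exists. The first inode that blocks is the one reported.
 */
static int toy_rename_check(struct toy_inode *source, struct toy_inode *target,
			    struct toy_inode **delegated_inode)
{
	int error = toy_try_break_deleg(source, delegated_inode);

	if (error)
		return error;
	if (target)
		error = toy_try_break_deleg(target, delegated_inode);
	return error;
}

/* Usage: only the delegated target is handed back, with one extra reference. */
void toy_rename_demo(void)
{
	struct toy_inode src = { 1, false }, tgt = { 1, true };
	struct toy_inode *delegated = NULL;

	assert(toy_rename_check(&src, &tgt, &delegated) == -EWOULDBLOCK);
	assert(delegated == &tgt && tgt.refs == 2 && src.refs == 1);
}
```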
> 
> Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
>  fs/namei.c         |   13 +++----------
>  include/linux/fs.h |   33 +++++++++++++++++++++++++++++++--
>  2 files changed, 34 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 2826bbd..d3b6a35 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3466,14 +3466,9 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
>  	else {
>  		error = security_inode_unlink(dir, dentry);
>  		if (!error) {
> -			error = break_deleg(target, O_WRONLY|O_NONBLOCK);
> -			if (error) {
> -				if (error == -EWOULDBLOCK && delegated_inode) {
> -					*delegated_inode = target;
> -					ihold(target);
> -				}
> +			error = try_break_deleg(target, delegated_inode);
> +			if (error)
>  				goto out;
> -			}
>  			error = dir->i_op->unlink(dir, dentry);
>  			if (!error)
>  				dont_mount(dentry);
> @@ -3543,9 +3538,7 @@ exit2:
>  		iput(inode);	/* truncate the inode here */
>  	inode = NULL;
>  	if (delegated_inode) {
> -		error = break_deleg(delegated_inode, O_WRONLY);
> -		iput(delegated_inode);
> -		delegated_inode = NULL;
> +		error = break_deleg_wait(&delegated_inode);
>  		if (!error)
>  			goto retry_deleg;
>  	}
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index ba29b37..43a3506 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1907,6 +1907,9 @@ extern bool our_mnt(struct vfsmount *mnt);
>  
>  extern int current_umask(void);
>  
> +extern void ihold(struct inode * inode);
> +extern void iput(struct inode *);
> +
>  /* /sys/fs */
>  extern struct kobject *fs_kobj;
>  
> @@ -1974,6 +1977,28 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
>  	return 0;
>  }
>  
> +static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
> +{
> +	int ret;
> +
> +	ret = break_deleg(inode, O_WRONLY|O_NONBLOCK);
> +	if (ret == -EWOULDBLOCK && delegated_inode) {
> +		*delegated_inode = inode;
> +		ihold(inode);
> +	}
> +	return ret;
> +}
> +
> +static inline int break_deleg_wait(struct inode **delegated_inode)
> +{
> +	int ret;
> +
> +	ret = break_deleg(*delegated_inode, O_WRONLY);
> +	iput(*delegated_inode);
> +	*delegated_inode = NULL;
> +	return ret;
> +}
> +
>  #else /* !CONFIG_FILE_LOCKING */
>  static inline int locks_mandatory_locked(struct inode *inode)
>  {
> @@ -2017,6 +2042,12 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
>  {
>  	return 0;
>  }
> +
> +static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
> +{
> +	return 0;
> +}
> +
>  #endif /* CONFIG_FILE_LOCKING */
>  
>  /* fs/open.c */
> @@ -2346,8 +2377,6 @@ extern loff_t vfs_llseek(struct file *file, loff_t offset, int whence);
>  extern int inode_init_always(struct super_block *, struct inode *);
>  extern void inode_init_once(struct inode *);
>  extern void address_space_init_once(struct address_space *mapping);
> -extern void ihold(struct inode * inode);
> -extern void iput(struct inode *);
>  extern struct inode * igrab(struct inode *);
>  extern ino_t iunique(struct super_block *, ino_t);
>  extern int inode_needs_sync(struct inode *inode);

Acked-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

* Re: [PATCH 12/16] locks: break delegations on any attribute modification
  2013-07-17 20:50     ` J. Bruce Fields
@ 2013-07-26 10:50         ` Jeff Layton
  -1 siblings, 0 replies; 43+ messages in thread
From: Jeff Layton @ 2013-07-26 10:50 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Al Viro, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Dave Chinner,
	Mikulas Patocka, David Howells, Tyler Hicks, Dustin Kirkland

On Wed, 17 Jul 2013 16:50:13 -0400
"J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

> From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> 
> NFSv4 uses leases to guarantee that clients can cache metadata as well
> as data.
> 
> Cc: Mikulas Patocka <mikulas-TTVWCEgN8Z9G4ohzP4jBZS1Fcj925eT/@public.gmane.org>
> Cc: David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Tyler Hicks <tyhicks-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> Cc: Dustin Kirkland <dustin.kirkland-Bv2LyzZ6GzxBDgjK7y7TUQ@public.gmane.org>
> Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/base/devtmpfs.c   |    4 ++--
>  fs/attr.c                 |   25 ++++++++++++++++++++++++-
>  fs/cachefiles/interface.c |    4 ++--
>  fs/ecryptfs/inode.c       |    2 +-
>  fs/hpfs/namei.c           |    2 +-
>  fs/inode.c                |    6 +++++-
>  fs/nfsd/vfs.c             |    8 ++++++--
>  fs/open.c                 |   22 ++++++++++++++++++----
>  fs/utimes.c               |    9 ++++++++-
>  include/linux/fs.h        |    2 +-
>  10 files changed, 68 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
> index 1b8490e..0f38201 100644
> --- a/drivers/base/devtmpfs.c
> +++ b/drivers/base/devtmpfs.c
> @@ -216,7 +216,7 @@ static int handle_create(const char *nodename, umode_t mode, kuid_t uid,
>  		newattrs.ia_gid = gid;
>  		newattrs.ia_valid = ATTR_MODE|ATTR_UID|ATTR_GID;
>  		mutex_lock(&dentry->d_inode->i_mutex);
> -		notify_change(dentry, &newattrs);
> +		notify_change(dentry, &newattrs, NULL);
>  		mutex_unlock(&dentry->d_inode->i_mutex);
>  
>  		/* mark as kernel-created inode */
> @@ -322,7 +322,7 @@ static int handle_remove(const char *nodename, struct device *dev)
>  			newattrs.ia_valid =
>  				ATTR_UID|ATTR_GID|ATTR_MODE;
>  			mutex_lock(&dentry->d_inode->i_mutex);
> -			notify_change(dentry, &newattrs);
> +			notify_change(dentry, &newattrs, NULL);
>  			mutex_unlock(&dentry->d_inode->i_mutex);
>  			err = vfs_unlink(parent.dentry->d_inode, dentry, NULL);
>  			if (!err || err == -ENOENT)
> diff --git a/fs/attr.c b/fs/attr.c
> index 1449adb..267968d 100644
> --- a/fs/attr.c
> +++ b/fs/attr.c
> @@ -167,7 +167,27 @@ void setattr_copy(struct inode *inode, const struct iattr *attr)
>  }
>  EXPORT_SYMBOL(setattr_copy);
>  
> -int notify_change(struct dentry * dentry, struct iattr * attr)
> +/**
> + * notify_change - modify attributes of a filesystem object
> + * @dentry:	object affected
> + * @attr:	new attributes
> + * @delegated_inode: returns inode, if the inode is delegated
> + *
> + * The caller must hold the i_mutex on the affected object.
> + *
> + * If notify_change discovers a delegation in need of breaking,
> + * it will return -EWOULDBLOCK and return a reference to the inode in
> + * delegated_inode.  The caller should then break the delegation and
> + * retry.  Because breaking a delegation may take a long time, the
> + * caller should drop the i_mutex before doing so.
> + *
> + * Alternatively, a caller may pass NULL for delegated_inode.  This may
> + * be appropriate for callers that expect the underlying filesystem not
> + * to be NFS exported.  Also, passing NULL is fine for callers holding
> + * the file open for write, as there can be no conflicting delegation in
> + * that case.
> + */
> +int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **delegated_inode)
>  {
>  	struct inode *inode = dentry->d_inode;
>  	umode_t mode = inode->i_mode;
> @@ -243,6 +263,9 @@ int notify_change(struct dentry * dentry, struct iattr * attr)
>  	error = security_inode_setattr(dentry, attr);
>  	if (error)
>  		return error;
> +	error = try_break_deleg(inode, delegated_inode);
> +	if (error)
> +		return error;
>  
>  	if (inode->i_op->setattr)
>  		error = inode->i_op->setattr(dentry, attr);
> diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
> index d4c1206..ccc01f1 100644
> --- a/fs/cachefiles/interface.c
> +++ b/fs/cachefiles/interface.c
> @@ -424,14 +424,14 @@ static int cachefiles_attr_changed(struct fscache_object *_object)
>  		_debug("discard tail %llx", oi_size);
>  		newattrs.ia_valid = ATTR_SIZE;
>  		newattrs.ia_size = oi_size & PAGE_MASK;
> -		ret = notify_change(object->backer, &newattrs);
> +		ret = notify_change(object->backer, &newattrs, NULL);
>  		if (ret < 0)
>  			goto truncate_failed;
>  	}
>  
>  	newattrs.ia_valid = ATTR_SIZE;
>  	newattrs.ia_size = ni_size;
> -	ret = notify_change(object->backer, &newattrs);
> +	ret = notify_change(object->backer, &newattrs, NULL);
>  
>  truncate_failed:
>  	mutex_unlock(&object->backer->d_inode->i_mutex);
> diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> index cb3ac33..7b19ebb 100644
> --- a/fs/ecryptfs/inode.c
> +++ b/fs/ecryptfs/inode.c
> @@ -992,7 +992,7 @@ static int ecryptfs_setattr(struct dentry *dentry, struct iattr *ia)
>  		lower_ia.ia_valid &= ~ATTR_MODE;
>  
>  	mutex_lock(&lower_dentry->d_inode->i_mutex);
> -	rc = notify_change(lower_dentry, &lower_ia);
> +	rc = notify_change(lower_dentry, &lower_ia, NULL);
>  	mutex_unlock(&lower_dentry->d_inode->i_mutex);
>  out:
>  	fsstack_copy_attr_all(inode, lower_inode);
> diff --git a/fs/hpfs/namei.c b/fs/hpfs/namei.c
> index 345713d..1b39afd 100644
> --- a/fs/hpfs/namei.c
> +++ b/fs/hpfs/namei.c
> @@ -407,7 +407,7 @@ again:
>  			/*printk("HPFS: truncating file before delete.\n");*/
>  			newattrs.ia_size = 0;
>  			newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
> -			err = notify_change(dentry, &newattrs);
> +			err = notify_change(dentry, &newattrs, NULL);
>  			put_write_access(inode);
>  			if (!err)
>  				goto again;
> diff --git a/fs/inode.c b/fs/inode.c
> index 4178e91..82a5a30 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -1642,7 +1642,11 @@ static int __remove_suid(struct dentry *dentry, int kill)
>  	struct iattr newattrs;
>  
>  	newattrs.ia_valid = ATTR_FORCE | kill;
> -	return notify_change(dentry, &newattrs);
> +	/*
> +	 * Note we call this on write, so notify_change will not
> +	 * encounter any conflicting delegations:
> +	 */
> +	return notify_change(dentry, &newattrs, NULL);
>  }
>  
>  int file_remove_suid(struct file *file)
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 5479fff..2586f6d 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -427,7 +427,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
>  			goto out_nfserr;
>  		fh_lock(fhp);
>  
> -		host_err = notify_change(dentry, iap);
> +		host_err = notify_change(dentry, iap, NULL);
>  		err = nfserrno(host_err);
>  		fh_unlock(fhp);
>  	}
> @@ -987,7 +987,11 @@ static void kill_suid(struct dentry *dentry)
>  	ia.ia_valid = ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
>  
>  	mutex_lock(&dentry->d_inode->i_mutex);
> -	notify_change(dentry, &ia);
> +	/*
> +	 * Note we call this on write, so notify_change will not
> +	 * encounter any conflicting delegations:
> +	 */
> +	notify_change(dentry, &ia, NULL);
>  	mutex_unlock(&dentry->d_inode->i_mutex);
>  }
>  
> diff --git a/fs/open.c b/fs/open.c
> index 9156cb0..68e50fd 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -57,7 +57,8 @@ int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs,
>  		newattrs.ia_valid |= ret | ATTR_FORCE;
>  
>  	mutex_lock(&dentry->d_inode->i_mutex);
> -	ret = notify_change(dentry, &newattrs);
> +	/* Note any delegations or leases have already been broken: */
> +	ret = notify_change(dentry, &newattrs, NULL);
>  	mutex_unlock(&dentry->d_inode->i_mutex);
>  	return ret;
>  }
> @@ -464,21 +465,28 @@ out:
>  static int chmod_common(struct path *path, umode_t mode)
>  {
>  	struct inode *inode = path->dentry->d_inode;
> +	struct inode *delegated_inode = NULL;
>  	struct iattr newattrs;
>  	int error;
>  
>  	error = mnt_want_write(path->mnt);
>  	if (error)
>  		return error;
> +retry_deleg:
>  	mutex_lock(&inode->i_mutex);
>  	error = security_path_chmod(path, mode);
>  	if (error)
>  		goto out_unlock;
>  	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
>  	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
> -	error = notify_change(path->dentry, &newattrs);
> +	error = notify_change(path->dentry, &newattrs, &delegated_inode);
>  out_unlock:
>  	mutex_unlock(&inode->i_mutex);
> +	if (delegated_inode) {
> +		error = break_deleg_wait(&delegated_inode);
> +		if (!error)
> +			goto retry_deleg;
> +	}
>  	mnt_drop_write(path->mnt);
>  	return error;
>  }
> @@ -523,6 +531,7 @@ SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
>  static int chown_common(struct path *path, uid_t user, gid_t group)
>  {
>  	struct inode *inode = path->dentry->d_inode;
> +	struct inode *delegated_inode = NULL;
>  	int error;
>  	struct iattr newattrs;
>  	kuid_t uid;
> @@ -547,12 +556,17 @@ static int chown_common(struct path *path, uid_t user, gid_t group)
>  	if (!S_ISDIR(inode->i_mode))
>  		newattrs.ia_valid |=
>  			ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
> +retry_deleg:
>  	mutex_lock(&inode->i_mutex);
>  	error = security_path_chown(path, uid, gid);
>  	if (!error)
> -		error = notify_change(path->dentry, &newattrs);
> +		error = notify_change(path->dentry, &newattrs, &delegated_inode);
>  	mutex_unlock(&inode->i_mutex);
> -
> +	if (delegated_inode) {
> +		error = break_deleg_wait(&delegated_inode);
> +		if (!error)
> +			goto retry_deleg;
> +	}
>  	return error;
>  }
>  
> diff --git a/fs/utimes.c b/fs/utimes.c
> index f4fb7ec..aa138d6 100644
> --- a/fs/utimes.c
> +++ b/fs/utimes.c
> @@ -53,6 +53,7 @@ static int utimes_common(struct path *path, struct timespec *times)
>  	int error;
>  	struct iattr newattrs;
>  	struct inode *inode = path->dentry->d_inode;
> +	struct inode *delegated_inode = NULL;
>  
>  	error = mnt_want_write(path->mnt);
>  	if (error)
> @@ -101,9 +102,15 @@ static int utimes_common(struct path *path, struct timespec *times)
>  				goto mnt_drop_write_and_out;
>  		}
>  	}
> +retry_deleg:
>  	mutex_lock(&inode->i_mutex);
> -	error = notify_change(path->dentry, &newattrs);
> +	error = notify_change(path->dentry, &newattrs, &delegated_inode);
>  	mutex_unlock(&inode->i_mutex);
> +	if (delegated_inode) {
> +		error = break_deleg_wait(&delegated_inode);
> +		if (!error)
> +			goto retry_deleg;
> +	}
>  
>  mnt_drop_write_and_out:
>  	mnt_drop_write(path->mnt);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index a2403f9..638cdae 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2268,7 +2268,7 @@ extern void emergency_remount(void);
>  #ifdef CONFIG_BLOCK
>  extern sector_t bmap(struct inode *, sector_t);
>  #endif
> -extern int notify_change(struct dentry *, struct iattr *);
> +extern int notify_change(struct dentry *, struct iattr *, struct inode **);
>  extern int inode_permission(struct inode *, int);
>  extern int generic_permission(struct inode *, int);
>  

Acked-by: Jeff Layton <jlayton@redhat.com>
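
For readers following the thread: the calling convention in the
notify_change kerneldoc quoted above (fail with -EWOULDBLOCK while the
lock is held, drop the lock, wait for the delegation break, retry) can
be modelled in plain userspace C.  This is only a sketch under stated
assumptions.  struct fake_inode and the fake_* functions are
hypothetical stand-ins, not kernel interfaces; the "lock" is a flag and
the "wait" just clears the delegation.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; not a kernel structure. */
struct fake_inode {
	int delegated;	/* nonzero while a client holds a delegation */
	int locked;	/* models i_mutex: 1 while held */
	int mode;
};

/*
 * Models notify_change(): called with the lock held.  If a delegation is
 * outstanding, report the inode through *delegated_inode (when the caller
 * supplied one) and fail with -EWOULDBLOCK rather than waiting.
 */
int fake_notify_change(struct fake_inode *inode, int new_mode,
		       struct fake_inode **delegated_inode)
{
	if (inode->delegated) {
		if (delegated_inode)
			*delegated_inode = inode;
		return -EWOULDBLOCK;
	}
	inode->mode = new_mode;
	return 0;
}

/*
 * Models break_deleg_wait(): must run without the lock held.  In this
 * sketch, waiting under the lock is reported as -EDEADLK (an assumption
 * made for illustration only).
 */
int fake_break_deleg_wait(struct fake_inode **delegated_inode)
{
	struct fake_inode *inode = *delegated_inode;

	*delegated_inode = NULL;
	if (inode->locked)
		return -EDEADLK;
	inode->delegated = 0;	/* pretend the client returned the delegation */
	return 0;
}

/* Same shape as the retry_deleg loop added to chmod_common(). */
int fake_chmod(struct fake_inode *inode, int new_mode)
{
	struct fake_inode *delegated_inode = NULL;
	int error;

retry_deleg:
	inode->locked = 1;
	error = fake_notify_change(inode, new_mode, &delegated_inode);
	inode->locked = 0;
	if (delegated_inode) {
		error = fake_break_deleg_wait(&delegated_inode);
		if (!error)
			goto retry_deleg;
	}
	return error;
}
```

A caller that passes NULL for the third argument simply sees
-EWOULDBLOCK and has no inode to wait on, which is why the kerneldoc
limits that case to callers that do not expect a conflicting delegation.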

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 13/16] nfsd4: minor nfs4_setlease cleanup
  2013-07-17 20:50 ` [PATCH 13/16] nfsd4: minor nfs4_setlease cleanup J. Bruce Fields
@ 2013-07-26 10:53   ` Jeff Layton
  0 siblings, 0 replies; 43+ messages in thread
From: Jeff Layton @ 2013-07-26 10:53 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Al Viro, linux-nfs, linux-fsdevel, Dave Chinner

On Wed, 17 Jul 2013 16:50:14 -0400
"J. Bruce Fields" <bfields@redhat.com> wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> As far as I can tell, this list is used only under the state lock, so we
> may as well do this in the simpler order.
> 
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/nfsd/nfs4state.c |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 1698816..7c91b6c 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -3028,18 +3028,18 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
>  	if (!fl)
>  		return -ENOMEM;
>  	fl->fl_file = find_readable_file(fp);
> -	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
>  	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
> -	if (status) {
> -		list_del_init(&dp->dl_perclnt);
> -		locks_free_lock(fl);
> -		return -ENOMEM;
> -	}
> +	if (status)
> +		goto out_free;
> +	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
>  	fp->fi_lease = fl;
>  	fp->fi_deleg_file = get_file(fl->fl_file);
>  	atomic_set(&fp->fi_delegees, 1);
>  	list_add(&dp->dl_perfile, &fp->fi_delegations);
>  	return 0;
> +out_free:
> +	locks_free_lock(fl);
> +	return -ENOMEM;
>  }
>  
>  static int nfs4_set_delegation(struct nfs4_delegation *dp)

Acked-by: Jeff Layton <jlayton@redhat.com>
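
As an illustration of the reordering in the quoted patch (do the step
that can fail first, add to the list only on success, and keep one
cleanup label), here is a small userspace sketch.  All names are
hypothetical: fake_vfs_setlease stands in for vfs_setlease, and the
singly linked list stands in for cl_delegations.  No locking is
modelled.

```c
#include <errno.h>
#include <stdlib.h>

struct deleg {
	struct deleg *next;
	void *lease;		/* stands in for the struct file_lock */
};

struct client {
	struct deleg *delegations;	/* models cl_delegations */
};

/* Hypothetical fallible step standing in for vfs_setlease(). */
int fake_vfs_setlease(int should_fail)
{
	return should_fail ? -EAGAIN : 0;
}

int fake_nfs4_setlease(struct client *clp, struct deleg *dp, int should_fail)
{
	void *fl = malloc(16);
	int status;

	if (!fl)
		return -ENOMEM;
	status = fake_vfs_setlease(should_fail);
	if (status)
		goto out_free;
	/* Publish only after the fallible step has succeeded. */
	dp->next = clp->delegations;
	clp->delegations = dp;
	dp->lease = fl;
	return 0;
out_free:
	free(fl);
	/* Like the quoted patch, report -ENOMEM whatever status was. */
	return -ENOMEM;
}
```

With this ordering the failure path never has to undo a list insertion,
which is the simplification the commit message refers to.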

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 14/16] nfsd4: delay setting current_fh in open
@ 2013-07-26 11:11         ` Jeff Layton
  0 siblings, 0 replies; 43+ messages in thread
From: Jeff Layton @ 2013-07-26 11:11 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Al Viro, linux-nfs, linux-fsdevel, Dave Chinner

On Wed, 17 Jul 2013 16:50:15 -0400
"J. Bruce Fields" <bfields@redhat.com> wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> This is basically a no-op, to simplify a following patch.
> 
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/nfsd/nfs4proc.c |   36 ++++++++++++++++++++----------------
>  1 file changed, 20 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index a7cee86..4c0cbeb 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -230,17 +230,17 @@ static void nfsd4_set_open_owner_reply_cache(struct nfsd4_compound_state *cstate
>  }
>  
>  static __be32
> -do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, struct nfsd4_open *open)
> +do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state
> +*cstate, struct nfsd4_open *open, struct svc_fh **resfh)
	^^^
nit: weird indentation here

>  {
>  	struct svc_fh *current_fh = &cstate->current_fh;
> -	struct svc_fh *resfh;
>  	int accmode;
>  	__be32 status;
>  
> -	resfh = kmalloc(sizeof(struct svc_fh), GFP_KERNEL);
> -	if (!resfh)
> +	*resfh = kmalloc(sizeof(struct svc_fh), GFP_KERNEL);
> +	if (!*resfh)
>  		return nfserr_jukebox;
> -	fh_init(resfh, NFS4_FHSIZE);
> +	fh_init(*resfh, NFS4_FHSIZE);
>  	open->op_truncate = 0;
>  
>  	if (open->op_create) {
> @@ -265,7 +265,7 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru
>  		 */
>  		status = do_nfsd_create(rqstp, current_fh, open->op_fname.data,
>  					open->op_fname.len, &open->op_iattr,
> -					resfh, open->op_createmode,
> +					*resfh, open->op_createmode,
>  					(u32 *)open->op_verf.data,
>  					&open->op_truncate, &open->op_created);
>  
> @@ -282,29 +282,26 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru
>  							FATTR4_WORD1_TIME_MODIFY);
>  	} else {
>  		status = nfsd_lookup(rqstp, current_fh,
> -				     open->op_fname.data, open->op_fname.len, resfh);
> +				     open->op_fname.data, open->op_fname.len, *resfh);
>  		fh_unlock(current_fh);
>  	}
>  	if (status)
>  		goto out;
> -	status = nfsd_check_obj_isreg(resfh);
> +	status = nfsd_check_obj_isreg(*resfh);
>  	if (status)
>  		goto out;
>  
>  	if (is_create_with_attrs(open) && open->op_acl != NULL)
> -		do_set_nfs4_acl(rqstp, resfh, open->op_acl, open->op_bmval);
> +		do_set_nfs4_acl(rqstp, *resfh, open->op_acl, open->op_bmval);
>  
> -	nfsd4_set_open_owner_reply_cache(cstate, open, resfh);
> +	nfsd4_set_open_owner_reply_cache(cstate, open, *resfh);
>  	accmode = NFSD_MAY_NOP;
>  	if (open->op_created ||
>  			open->op_claim_type == NFS4_OPEN_CLAIM_DELEGATE_CUR)
>  		accmode |= NFSD_MAY_OWNER_OVERRIDE;
> -	status = do_open_permission(rqstp, resfh, open, accmode);
> +	status = do_open_permission(rqstp, *resfh, open, accmode);
>  	set_change_info(&open->op_cinfo, current_fh);
> -	fh_dup2(current_fh, resfh);
>  out:
> -	fh_put(resfh);
> -	kfree(resfh);
>  	return status;
>  }
>  
> @@ -357,6 +354,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	   struct nfsd4_open *open)
>  {
>  	__be32 status;
> +	struct svc_fh *resfh = NULL;
>  	struct nfsd4_compoundres *resp;
>  	struct net *net = SVC_NET(rqstp);
>  	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> @@ -423,7 +421,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	switch (open->op_claim_type) {
>  		case NFS4_OPEN_CLAIM_DELEGATE_CUR:
>  		case NFS4_OPEN_CLAIM_NULL:
> -			status = do_open_lookup(rqstp, cstate, open);
> +			status = do_open_lookup(rqstp, cstate, open, &resfh);
>  			if (status)
>  				goto out;
>  			break;
> @@ -439,6 +437,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  			status = do_open_fhandle(rqstp, cstate, open);
>  			if (status)
>  				goto out;
> +			resfh = &cstate->current_fh;
>  			break;
>  		case NFS4_OPEN_CLAIM_DELEG_PREV_FH:
>               	case NFS4_OPEN_CLAIM_DELEGATE_PREV:
> @@ -458,9 +457,14 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	 * successful, it (1) truncates the file if open->op_truncate was
>  	 * set, (2) sets open->op_stateid, (3) sets open->op_delegation.
>  	 */
> -	status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
> +	status = nfsd4_process_open2(rqstp, resfh, open);
>  	WARN_ON(status && open->op_created);
>  out:
> +	if (resfh && resfh != &cstate->current_fh) {
> +		fh_dup2(&cstate->current_fh, resfh);
> +		fh_put(resfh);
> +		kfree(resfh);
> +	}
>  	nfsd4_cleanup_open_state(open, status);
>  	if (open->op_openowner && !nfsd4_has_session(cstate))
>  		cstate->replay_owner = &open->op_openowner->oo_owner;

Acked-by: Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 15/16] nfsd4: close open-deleg/unlink/rename race
  2013-07-17 20:50     ` J. Bruce Fields
  (?)
@ 2013-07-26 11:23     ` Jeff Layton
       [not found]       ` <20130726072326.56113a2c-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
  -1 siblings, 1 reply; 43+ messages in thread
From: Jeff Layton @ 2013-07-26 11:23 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Al Viro, linux-nfs, linux-fsdevel, Dave Chinner

On Wed, 17 Jul 2013 16:50:16 -0400
"J. Bruce Fields" <bfields@redhat.com> wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> If a file is unlinked or renamed between the time when we do the local
> open and the time when we get the delegation, then we will return to the
> client indicating that it holds a delegation even though the file no
> longer exists under the name it was open under.
> 
> But a client performing an open-by-name, when it is returned a
> delegation, must be able to assume that the file is still linked at the
> name it was opened under.
> 
> So, pass the parent filehandle into the delegation and lease-setting
> code, and use it to re-lookup the file after we get the lease.
> 
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/nfsd/nfs4proc.c  |    2 +-
>  fs/nfsd/nfs4state.c |   52 +++++++++++++++++++++++++++++++++++++++++++--------
>  fs/nfsd/xdr4.h      |    3 ++-
>  3 files changed, 47 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 4c0cbeb..f44b29d 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -457,7 +457,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	 * successful, it (1) truncates the file if open->op_truncate was
>  	 * set, (2) sets open->op_stateid, (3) sets open->op_delegation.
>  	 */
> -	status = nfsd4_process_open2(rqstp, resfh, open);
> +	status = nfsd4_process_open2(rqstp, resfh, open, &cstate->current_fh);
>  	WARN_ON(status && open->op_created);
>  out:
>  	if (resfh && resfh != &cstate->current_fh) {
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 7c91b6c..193f2bb 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -3018,7 +3018,28 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f
>  	return fl;
>  }
>  
> -static int nfs4_setlease(struct nfs4_delegation *dp)
> +static bool nfsd4_name_still_same(struct svc_fh *parent, struct nfsd4_open *open, struct dentry *dentry)
> +{
> +	struct xdr_netobj *name = &open->op_fname;
> +	struct dentry *res;
> +	bool ret;
> +
> +	if (parent->fh_dentry == dentry)
> +		/* This was an open by filehandle, we don't care: */
> +		return true;
> +	if (nfsd_mountpoint(dentry, parent->fh_export))
> +		/* We assume those never change */
> +		return true;
> +	mutex_lock(&parent->fh_dentry->d_inode->i_mutex); /* XXX? */
> +	res = lookup_one_len(name->data, parent->fh_dentry, name->len);
> +	mutex_unlock(&parent->fh_dentry->d_inode->i_mutex);
> +	ret = res == dentry;
> +	if (!IS_ERR(res))
> +		dput(res);
> +	return ret;
> +}
> +
> +static int nfs4_setlease(struct nfs4_delegation *dp, struct nfsd4_open *open, struct svc_fh *parent)
>  {
>  	struct nfs4_file *fp = dp->dl_file;
>  	struct file_lock *fl;
> @@ -3031,23 +3052,37 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
>  	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
>  	if (status)
>  		goto out_free;
> +	if (!nfsd4_name_still_same(parent, open, fl->fl_file->f_dentry))
> +		goto out_unlease;
> +	spin_lock(&recall_lock);
> +	if (fp->fi_had_conflict)
> +		/*
> +		 * whoops, already broken, but before we got a chance to
> +		 * install our delegation; never mind:
> +		 */
> +		 goto out_unlock;
> +	list_add(&dp->dl_perfile, &fp->fi_delegations);
> +	spin_unlock(&recall_lock);
>  	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
>  	fp->fi_lease = fl;
>  	fp->fi_deleg_file = get_file(fl->fl_file);
>  	atomic_set(&fp->fi_delegees, 1);
> -	list_add(&dp->dl_perfile, &fp->fi_delegations);
>  	return 0;
> +out_unlock:
> +	spin_unlock(&recall_lock);
> +out_unlease:
> +	vfs_setlease(fl->fl_file, F_UNLCK, &fl);
>  out_free:
>  	locks_free_lock(fl);
>  	return -ENOMEM;

Seems a little odd to return -ENOMEM when fi_had_conflict is true, but
from looking over the code I think that eventually becomes something
else so it shouldn't affect anything.
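
To illustrate why the specific error code makes no difference today, here
is a minimal userspace sketch. It is not the nfsd code: try_set_delegation(),
process_open() and struct open_result are hypothetical stand-ins. It assumes
only what the patch comment says, namely that the OPEN succeeds even if the
delegation attempt fails:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the result of an OPEN; not an nfsd structure. */
struct open_result {
	int status;		/* 0 on success */
	bool has_delegation;	/* true only if a delegation was handed out */
};

/* Hypothetical stand-in for the delegation setup: 0 or a negative errno. */
static int try_set_delegation(bool had_conflict)
{
	if (had_conflict)
		return -ENOMEM;	/* the exact code is irrelevant to the caller */
	return 0;
}

/* The open succeeds whether or not the delegation attempt fails. */
static struct open_result process_open(bool had_conflict)
{
	struct open_result res = { .status = 0, .has_delegation = false };

	/* Any nonzero status collapses into "no delegation". */
	if (try_set_delegation(had_conflict) == 0)
		res.has_delegation = true;
	return res;
}
```

In this sketch the caller never looks at which errno came back, so -ENOMEM
and any other negative value behave identically.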

>  }
>  
> -static int nfs4_set_delegation(struct nfs4_delegation *dp)
> +static int nfs4_set_delegation(struct nfs4_delegation *dp, struct nfsd4_open *open, struct svc_fh *parent)
>  {
>  	struct nfs4_file *fp = dp->dl_file;
>  
>  	if (!fp->fi_lease)
> -		return nfs4_setlease(dp);
> +		return nfs4_setlease(dp, open, parent);
>  	spin_lock(&recall_lock);
>  	if (fp->fi_had_conflict) {
>  		spin_unlock(&recall_lock);
> @@ -3089,7 +3124,8 @@ static void nfsd4_open_deleg_none_ext(struct nfsd4_open *open, int status)
>   */
>  static void
>  nfs4_open_delegation(struct net *net, struct svc_fh *fh,
> -		     struct nfsd4_open *open, struct nfs4_ol_stateid *stp)
> +		     struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
> +		     struct svc_fh *parent)
>  {
>  	struct nfs4_delegation *dp;
>  	struct nfs4_openowner *oo = container_of(stp->st_stateowner, struct nfs4_openowner, oo_owner);
> @@ -3132,7 +3168,7 @@ nfs4_open_delegation(struct net *net, struct svc_fh *fh,
>  	dp = alloc_init_deleg(oo->oo_owner.so_client, stp, fh);
>  	if (dp == NULL)
>  		goto out_no_deleg;
> -	status = nfs4_set_delegation(dp);
> +	status = nfs4_set_delegation(dp, open, parent);
>  	if (status)
>  		goto out_free;
>  
> @@ -3181,7 +3217,7 @@ static void nfsd4_deleg_xgrade_none_ext(struct nfsd4_open *open,
>   * called with nfs4_lock_state() held.
>   */
>  __be32
> -nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_open *open)
> +nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_open *open, struct svc_fh *parent)
>  {
>  	struct nfsd4_compoundres *resp = rqstp->rq_resp;
>  	struct nfs4_client *cl = open->op_openowner->oo_owner.so_client;
> @@ -3250,7 +3286,7 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
>  	* Attempt to hand out a delegation. No error return, because the
>  	* OPEN succeeds even if we fail.
>  	*/
> -	nfs4_open_delegation(SVC_NET(rqstp), current_fh, open, stp);
> +	nfs4_open_delegation(SVC_NET(rqstp), current_fh, open, stp, parent);
>  nodeleg:
>  	status = nfs_ok;
>  
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index b3ed644..3058885 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -596,7 +596,8 @@ __be32 nfsd4_reclaim_complete(struct svc_rqst *, struct nfsd4_compound_state *,
>  extern __be32 nfsd4_process_open1(struct nfsd4_compound_state *,
>  		struct nfsd4_open *open, struct nfsd_net *nn);
>  extern __be32 nfsd4_process_open2(struct svc_rqst *rqstp,
> -		struct svc_fh *current_fh, struct nfsd4_open *open);
> +		struct svc_fh *current_fh, struct nfsd4_open *open,
> +		struct svc_fh *parent);
>  extern void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status);
>  extern __be32 nfsd4_open_confirm(struct svc_rqst *rqstp,
>  		struct nfsd4_compound_state *, struct nfsd4_open_confirm *oc);


Acked-by: Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 16/16] nfsd4: break only delegations when appropriate
@ 2013-07-26 11:24       ` Jeff Layton
  0 siblings, 0 replies; 43+ messages in thread
From: Jeff Layton @ 2013-07-26 11:24 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Al Viro, linux-nfs, linux-fsdevel, Dave Chinner

On Wed, 17 Jul 2013 16:50:17 -0400
"J. Bruce Fields" <bfields@redhat.com> wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> As a temporary fix, nfsd was breaking all leases on unlink, link,
> rename, and setattr.
> 
> Now that we can distinguish between leases and delegations, we can be
> nicer and break only the delegations, and not bother lease-holders with
> operations they don't care about.
> 
> And we get to delete some code while we're at it.
> 
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/nfsd/vfs.c |   27 ---------------------------
>  1 file changed, 27 deletions(-)
> 
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 2586f6d..51a5ede 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -273,13 +273,6 @@ out:
>  	return err;
>  }
>  
> -static int nfsd_break_lease(struct inode *inode)
> -{
> -	if (!S_ISREG(inode->i_mode))
> -		return 0;
> -	return break_lease(inode, O_WRONLY | O_NONBLOCK);
> -}
> -
>  /*
>   * Commit metadata changes to stable storage.
>   */
> @@ -422,9 +415,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
>  
>  	err = nfserr_notsync;
>  	if (!check_guard || guardtime == inode->i_ctime.tv_sec) {
> -		host_err = nfsd_break_lease(inode);
> -		if (host_err)
> -			goto out_nfserr;
>  		fh_lock(fhp);
>  
>  		host_err = notify_change(dentry, iap, NULL);
> @@ -1735,11 +1725,6 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
>  	err = nfserr_noent;
>  	if (!dold->d_inode)
>  		goto out_dput;
> -	host_err = nfsd_break_lease(dold->d_inode);
> -	if (host_err) {
> -		err = nfserrno(host_err);
> -		goto out_dput;
> -	}
>  	host_err = vfs_link(dold, dirp, dnew, NULL);
>  	if (!host_err) {
>  		err = nfserrno(commit_metadata(ffhp));
> @@ -1833,14 +1818,6 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
>  	if (ffhp->fh_export->ex_path.dentry != tfhp->fh_export->ex_path.dentry)
>  		goto out_dput_new;
>  
> -	host_err = nfsd_break_lease(odentry->d_inode);
> -	if (host_err)
> -		goto out_dput_new;
> -	if (ndentry->d_inode) {
> -		host_err = nfsd_break_lease(ndentry->d_inode);
> -		if (host_err)
> -			goto out_dput_new;
> -	}
>  	host_err = vfs_rename(fdir, odentry, tdir, ndentry, NULL);
>  	if (!host_err) {
>  		host_err = commit_metadata(tfhp);
> @@ -1910,16 +1887,12 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
>  	if (!type)
>  		type = rdentry->d_inode->i_mode & S_IFMT;
>  
> -	host_err = nfsd_break_lease(rdentry->d_inode);
> -	if (host_err)
> -		goto out_put;
>  	if (type != S_IFDIR)
>  		host_err = vfs_unlink(dirp, rdentry, NULL);
>  	else
>  		host_err = vfs_rmdir(dirp, rdentry);
>  	if (!host_err)
>  		host_err = commit_metadata(fhp);
> -out_put:
>  	dput(rdentry);
>  
>  out_nfserr:

Acked-by: Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 15/16] nfsd4: close open-deleg/unlink/rename race
@ 2013-07-26 16:04           ` J. Bruce Fields
  0 siblings, 0 replies; 43+ messages in thread
From: J. Bruce Fields @ 2013-07-26 16:04 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Al Viro, linux-nfs, linux-fsdevel, Dave Chinner

On Fri, Jul 26, 2013 at 07:23:26AM -0400, Jeff Layton wrote:
> On Wed, 17 Jul 2013 16:50:16 -0400
> "J. Bruce Fields" <bfields@redhat.com> wrote:
> 
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > If a file is unlinked or renamed between the time when we do the local
> > open and the time when we get the delegation, then we will return to the
> > client indicating that it holds a delegation even though the file no
> > longer exists under the name it was open under.
> > 
> > But a client performing an open-by-name, when it is returned a
> > delegation, must be able to assume that the file is still linked at the
> > name it was opened under.
> > 
> > So, pass the parent filehandle into the delegation and lease-setting
> > code, and use it to re-lookup the file after we get the lease.
> > 
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > ---
> >  fs/nfsd/nfs4proc.c  |    2 +-
> >  fs/nfsd/nfs4state.c |   52 +++++++++++++++++++++++++++++++++++++++++++--------
> >  fs/nfsd/xdr4.h      |    3 ++-
> >  3 files changed, 47 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index 4c0cbeb..f44b29d 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -457,7 +457,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >  	 * successful, it (1) truncates the file if open->op_truncate was
> >  	 * set, (2) sets open->op_stateid, (3) sets open->op_delegation.
> >  	 */
> > -	status = nfsd4_process_open2(rqstp, resfh, open);
> > +	status = nfsd4_process_open2(rqstp, resfh, open, &cstate->current_fh);
> >  	WARN_ON(status && open->op_created);
> >  out:
> >  	if (resfh && resfh != &cstate->current_fh) {
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 7c91b6c..193f2bb 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -3018,7 +3018,28 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f
> >  	return fl;
> >  }
> >  
> > -static int nfs4_setlease(struct nfs4_delegation *dp)
> > +static bool nfsd4_name_still_same(struct svc_fh *parent, struct nfsd4_open *open, struct dentry *dentry)
> > +{
> > +	struct xdr_netobj *name = &open->op_fname;
> > +	struct dentry *res;
> > +	bool ret;
> > +
> > +	if (parent->fh_dentry == dentry)
> > +		/* This was an open by filehandle, we don't care: */
> > +		return true;
> > +	if (nfsd_mountpoint(dentry, parent->fh_export))
> > +		/* We assume those never change */
> > +		return true;
> > +	mutex_lock(&parent->fh_dentry->d_inode->i_mutex); /* XXX? */
> > +	res = lookup_one_len(name->data, parent->fh_dentry, name->len);
> > +	mutex_unlock(&parent->fh_dentry->d_inode->i_mutex);
> > +	ret = res == dentry;
> > +	if (!IS_ERR(res))
> > +		dput(res);
> > +	return ret;
> > +}
> > +
> > +static int nfs4_setlease(struct nfs4_delegation *dp, struct nfsd4_open *open, struct svc_fh *parent)
> >  {
> >  	struct nfs4_file *fp = dp->dl_file;
> >  	struct file_lock *fl;
> > @@ -3031,23 +3052,37 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
> >  	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
> >  	if (status)
> >  		goto out_free;
> > +	if (!nfsd4_name_still_same(parent, open, fl->fl_file->f_dentry))
> > +		goto out_unlease;
> > +	spin_lock(&recall_lock);
> > +	if (fp->fi_had_conflict)
> > +		/*
> > +		 * whoops, already broken, but before we got a chance to
> > +		 * install our delegation; never mind:
> > +		 */
> > +		 goto out_unlock;
> > +	list_add(&dp->dl_perfile, &fp->fi_delegations);
> > +	spin_unlock(&recall_lock);
> >  	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
> >  	fp->fi_lease = fl;
> >  	fp->fi_deleg_file = get_file(fl->fl_file);
> >  	atomic_set(&fp->fi_delegees, 1);
> > -	list_add(&dp->dl_perfile, &fp->fi_delegations);
> >  	return 0;
> > +out_unlock:
> > +	spin_unlock(&recall_lock);
> > +out_unlease:
> > +	vfs_setlease(fl->fl_file, F_UNLCK, &fl);
> >  out_free:
> >  	locks_free_lock(fl);
> >  	return -ENOMEM;
> 
> Seems a little odd to return -ENOMEM when fi_had_conflict is true, but
> from looking over the code I think that eventually becomes something
> else so it shouldn't affect anything.

Yeah.  Errors setting a lease are ignored.  But I suppose eventually
(e.g. as we implement more 4.1 stuff that tells the client more about why
delegations failed) we may want to use the error code.  So, probably fix
this up with something like:


 	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
 	if (status)
 		goto out_free;
+	status = -EAGAIN;
 	if (!nfsd4_name_still_same(parent, open, fl->fl_file->f_dentry))
 		goto out_unlease;
 	spin_lock(&recall_lock);
@@ -3074,7 +3075,7 @@ out_unlease:
 	vfs_setlease(fl->fl_file, F_UNLCK, &fl);
 out_free:
 	locks_free_lock(fl);
-	return -ENOMEM;
+	return status;
 }
 
in a later patch.
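
As a self-contained illustration of that pattern, here is a userspace
sketch. It is not the nfsd code: set_lease_sketch() and its parameters are
hypothetical and only simulate which step fails. The point is that status is
given a default before each group of failure points, so the shared cleanup
labels can return it unchanged:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * lease_err simulates the return value of the lease-setting call;
 * name_changed and had_conflict simulate the two later checks.
 */
static int set_lease_sketch(int lease_err, bool name_changed, bool had_conflict)
{
	int status;
	char *fl = malloc(1);	/* stands in for the allocated lease */

	if (!fl)
		return -ENOMEM;
	status = lease_err;
	if (status)
		goto out_free;
	status = -EAGAIN;	/* default for every failure after this point */
	if (name_changed)
		goto out_unlease;
	if (had_conflict)
		goto out_unlease;
	/* Success: real code would keep the lease; the sketch just frees it. */
	free(fl);
	return 0;
out_unlease:
	/* real code would drop the lease here before falling through */
out_free:
	free(fl);
	return status;
}
```

With this shape, a failure from the lease call keeps its own error code,
while the name check and the conflict check both report -EAGAIN.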

Thanks!

Actually this was an accident, I didn't mean to send out these last four
"nfsd4:" patches yet.

They depend on the rest of this series, but they're much more
nfsd-specific (and probably shouldn't go through the vfs tree, for example).

--b.

> 
> >  }
> >  
> > -static int nfs4_set_delegation(struct nfs4_delegation *dp)
> > +static int nfs4_set_delegation(struct nfs4_delegation *dp, struct nfsd4_open *open, struct svc_fh *parent)
> >  {
> >  	struct nfs4_file *fp = dp->dl_file;
> >  
> >  	if (!fp->fi_lease)
> > -		return nfs4_setlease(dp);
> > +		return nfs4_setlease(dp, open, parent);
> >  	spin_lock(&recall_lock);
> >  	if (fp->fi_had_conflict) {
> >  		spin_unlock(&recall_lock);
> > @@ -3089,7 +3124,8 @@ static void nfsd4_open_deleg_none_ext(struct nfsd4_open *open, int status)
> >   */
> >  static void
> >  nfs4_open_delegation(struct net *net, struct svc_fh *fh,
> > -		     struct nfsd4_open *open, struct nfs4_ol_stateid *stp)
> > +		     struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
> > +		     struct svc_fh *parent)
> >  {
> >  	struct nfs4_delegation *dp;
> >  	struct nfs4_openowner *oo = container_of(stp->st_stateowner, struct nfs4_openowner, oo_owner);
> > @@ -3132,7 +3168,7 @@ nfs4_open_delegation(struct net *net, struct svc_fh *fh,
> >  	dp = alloc_init_deleg(oo->oo_owner.so_client, stp, fh);
> >  	if (dp == NULL)
> >  		goto out_no_deleg;
> > -	status = nfs4_set_delegation(dp);
> > +	status = nfs4_set_delegation(dp, open, parent);
> >  	if (status)
> >  		goto out_free;
> >  
> > @@ -3181,7 +3217,7 @@ static void nfsd4_deleg_xgrade_none_ext(struct nfsd4_open *open,
> >   * called with nfs4_lock_state() held.
> >   */
> >  __be32
> > -nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_open *open)
> > +nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_open *open, struct svc_fh *parent)
> >  {
> >  	struct nfsd4_compoundres *resp = rqstp->rq_resp;
> >  	struct nfs4_client *cl = open->op_openowner->oo_owner.so_client;
> > @@ -3250,7 +3286,7 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
> >  	* Attempt to hand out a delegation. No error return, because the
> >  	* OPEN succeeds even if we fail.
> >  	*/
> > -	nfs4_open_delegation(SVC_NET(rqstp), current_fh, open, stp);
> > +	nfs4_open_delegation(SVC_NET(rqstp), current_fh, open, stp, parent);
> >  nodeleg:
> >  	status = nfs_ok;
> >  
> > diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> > index b3ed644..3058885 100644
> > --- a/fs/nfsd/xdr4.h
> > +++ b/fs/nfsd/xdr4.h
> > @@ -596,7 +596,8 @@ __be32 nfsd4_reclaim_complete(struct svc_rqst *, struct nfsd4_compound_state *,
> >  extern __be32 nfsd4_process_open1(struct nfsd4_compound_state *,
> >  		struct nfsd4_open *open, struct nfsd_net *nn);
> >  extern __be32 nfsd4_process_open2(struct svc_rqst *rqstp,
> > -		struct svc_fh *current_fh, struct nfsd4_open *open);
> > +		struct svc_fh *current_fh, struct nfsd4_open *open,
> > +		struct svc_fh *parent);
> >  extern void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status);
> >  extern __be32 nfsd4_open_confirm(struct svc_rqst *rqstp,
> >  		struct nfsd4_compound_state *, struct nfsd4_open_confirm *oc);
> 
> 
> Acked-by: Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 14/16] nfsd4: delay setting current_fh in open
From: J. Bruce Fields @ 2013-07-26 16:04 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Al Viro, linux-nfs, linux-fsdevel, Dave Chinner

On Fri, Jul 26, 2013 at 07:11:53AM -0400, Jeff Layton wrote:
> On Wed, 17 Jul 2013 16:50:15 -0400
> "J. Bruce Fields" <bfields@redhat.com> wrote:
> 
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > This is basically a no-op, to simplify a following patch.
> > 
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > ---
> >  fs/nfsd/nfs4proc.c |   36 ++++++++++++++++++++----------------
> >  1 file changed, 20 insertions(+), 16 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index a7cee86..4c0cbeb 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -230,17 +230,17 @@ static void nfsd4_set_open_owner_reply_cache(struct nfsd4_compound_state *cstate
> >  }
> >  
> >  static __be32
> > -do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, struct nfsd4_open *open)
> > +do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state
> > +*cstate, struct nfsd4_open *open, struct svc_fh **resfh)
> 	^^^
> nit: weird indentation here

Thanks, fixed.

--b.

> 
> >  {
> >  	struct svc_fh *current_fh = &cstate->current_fh;
> > -	struct svc_fh *resfh;
> >  	int accmode;
> >  	__be32 status;
> >  
> > -	resfh = kmalloc(sizeof(struct svc_fh), GFP_KERNEL);
> > -	if (!resfh)
> > +	*resfh = kmalloc(sizeof(struct svc_fh), GFP_KERNEL);
> > +	if (!*resfh)
> >  		return nfserr_jukebox;
> > -	fh_init(resfh, NFS4_FHSIZE);
> > +	fh_init(*resfh, NFS4_FHSIZE);
> >  	open->op_truncate = 0;
> >  
> >  	if (open->op_create) {
> > @@ -265,7 +265,7 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru
> >  		 */
> >  		status = do_nfsd_create(rqstp, current_fh, open->op_fname.data,
> >  					open->op_fname.len, &open->op_iattr,
> > -					resfh, open->op_createmode,
> > +					*resfh, open->op_createmode,
> >  					(u32 *)open->op_verf.data,
> >  					&open->op_truncate, &open->op_created);
> >  
> > @@ -282,29 +282,26 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru
> >  							FATTR4_WORD1_TIME_MODIFY);
> >  	} else {
> >  		status = nfsd_lookup(rqstp, current_fh,
> > -				     open->op_fname.data, open->op_fname.len, resfh);
> > +				     open->op_fname.data, open->op_fname.len, *resfh);
> >  		fh_unlock(current_fh);
> >  	}
> >  	if (status)
> >  		goto out;
> > -	status = nfsd_check_obj_isreg(resfh);
> > +	status = nfsd_check_obj_isreg(*resfh);
> >  	if (status)
> >  		goto out;
> >  
> >  	if (is_create_with_attrs(open) && open->op_acl != NULL)
> > -		do_set_nfs4_acl(rqstp, resfh, open->op_acl, open->op_bmval);
> > +		do_set_nfs4_acl(rqstp, *resfh, open->op_acl, open->op_bmval);
> >  
> > -	nfsd4_set_open_owner_reply_cache(cstate, open, resfh);
> > +	nfsd4_set_open_owner_reply_cache(cstate, open, *resfh);
> >  	accmode = NFSD_MAY_NOP;
> >  	if (open->op_created ||
> >  			open->op_claim_type == NFS4_OPEN_CLAIM_DELEGATE_CUR)
> >  		accmode |= NFSD_MAY_OWNER_OVERRIDE;
> > -	status = do_open_permission(rqstp, resfh, open, accmode);
> > +	status = do_open_permission(rqstp, *resfh, open, accmode);
> >  	set_change_info(&open->op_cinfo, current_fh);
> > -	fh_dup2(current_fh, resfh);
> >  out:
> > -	fh_put(resfh);
> > -	kfree(resfh);
> >  	return status;
> >  }
> >  
> > @@ -357,6 +354,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >  	   struct nfsd4_open *open)
> >  {
> >  	__be32 status;
> > +	struct svc_fh *resfh = NULL;
> >  	struct nfsd4_compoundres *resp;
> >  	struct net *net = SVC_NET(rqstp);
> >  	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > @@ -423,7 +421,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >  	switch (open->op_claim_type) {
> >  		case NFS4_OPEN_CLAIM_DELEGATE_CUR:
> >  		case NFS4_OPEN_CLAIM_NULL:
> > -			status = do_open_lookup(rqstp, cstate, open);
> > +			status = do_open_lookup(rqstp, cstate, open, &resfh);
> >  			if (status)
> >  				goto out;
> >  			break;
> > @@ -439,6 +437,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >  			status = do_open_fhandle(rqstp, cstate, open);
> >  			if (status)
> >  				goto out;
> > +			resfh = &cstate->current_fh;
> >  			break;
> >  		case NFS4_OPEN_CLAIM_DELEG_PREV_FH:
> >               	case NFS4_OPEN_CLAIM_DELEGATE_PREV:
> > @@ -458,9 +457,14 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >  	 * successful, it (1) truncates the file if open->op_truncate was
> >  	 * set, (2) sets open->op_stateid, (3) sets open->op_delegation.
> >  	 */
> > -	status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
> > +	status = nfsd4_process_open2(rqstp, resfh, open);
> >  	WARN_ON(status && open->op_created);
> >  out:
> > +	if (resfh && resfh != &cstate->current_fh) {
> > +		fh_dup2(&cstate->current_fh, resfh);
> > +		fh_put(resfh);
> > +		kfree(resfh);
> > +	}
> >  	nfsd4_cleanup_open_state(open, status);
> >  	if (open->op_openowner && !nfsd4_has_session(cstate))
> >  		cstate->replay_owner = &open->op_openowner->oo_owner;
> 
> Acked-by: Jeff Layton <jlayton@redhat.com>


* Re: [PATCH 15/16] nfsd4: close open-deleg/unlink/rename race
From: J. Bruce Fields @ 2013-07-26 21:14 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Jeff Layton, Al Viro, linux-nfs, linux-fsdevel, Dave Chinner

On Fri, Jul 26, 2013 at 12:04:33PM -0400, J. Bruce Fields wrote:
> On Fri, Jul 26, 2013 at 07:23:26AM -0400, Jeff Layton wrote:
> > On Wed, 17 Jul 2013 16:50:16 -0400
> > "J. Bruce Fields" <bfields@redhat.com> wrote:
> > 
> > > From: "J. Bruce Fields" <bfields@redhat.com>
> > > 
> > > If a file is unlinked or renamed between the time when we do the local
> > > open and the time when we get the delegation, then we will return to the
> > > client indicating that it holds a delegation even though the file no
> > > longer exists under the name it was open under.
> > > 
> > > But a client performing an open-by-name, when it is returned a
> > > delegation, must be able to assume that the file is still linked at the
> > > name it was opened under.
> > > 
> > > So, pass the parent filehandle into the delegation and lease-setting
> > > code, and use it to re-lookup the file after we get the lease.
> > > 
> > > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > > ---
> > >  fs/nfsd/nfs4proc.c  |    2 +-
> > >  fs/nfsd/nfs4state.c |   52 +++++++++++++++++++++++++++++++++++++++++++--------
> > >  fs/nfsd/xdr4.h      |    3 ++-
> > >  3 files changed, 47 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > > index 4c0cbeb..f44b29d 100644
> > > --- a/fs/nfsd/nfs4proc.c
> > > +++ b/fs/nfsd/nfs4proc.c
> > > @@ -457,7 +457,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > >  	 * successful, it (1) truncates the file if open->op_truncate was
> > >  	 * set, (2) sets open->op_stateid, (3) sets open->op_delegation.
> > >  	 */
> > > -	status = nfsd4_process_open2(rqstp, resfh, open);
> > > +	status = nfsd4_process_open2(rqstp, resfh, open, &cstate->current_fh);
> > >  	WARN_ON(status && open->op_created);
> > >  out:
> > >  	if (resfh && resfh != &cstate->current_fh) {
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 7c91b6c..193f2bb 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -3018,7 +3018,28 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f
> > >  	return fl;
> > >  }
> > >  
> > > -static int nfs4_setlease(struct nfs4_delegation *dp)
> > > +static bool nfsd4_name_still_same(struct svc_fh *parent, struct nfsd4_open *open, struct dentry *dentry)
> > > +{
> > > +	struct xdr_netobj *name = &open->op_fname;
> > > +	struct dentry *res;
> > > +	bool ret;
> > > +
> > > +	if (parent->fh_dentry == dentry)
> > > +		/* This was an open by filehandle, we don't care: */
> > > +		return true;
> > > +	if (nfsd_mountpoint(dentry, parent->fh_export))
> > > +		/* We assume those never change */
> > > +		return true;
> > > +	mutex_lock(&parent->fh_dentry->d_inode->i_mutex); /* XXX? */
> > > +	res = lookup_one_len(name->data, parent->fh_dentry, name->len);
> > > +	mutex_unlock(&parent->fh_dentry->d_inode->i_mutex);
> > > +	ret = res == dentry;
> > > +	if (!IS_ERR(res))
> > > +		dput(res);
> > > +	return ret;
> > > +}
> > > +
> > > +static int nfs4_setlease(struct nfs4_delegation *dp, struct nfsd4_open *open, struct svc_fh *parent)
> > >  {
> > >  	struct nfs4_file *fp = dp->dl_file;
> > >  	struct file_lock *fl;
> > > @@ -3031,23 +3052,37 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
> > >  	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
> > >  	if (status)
> > >  		goto out_free;
> > > +	if (!nfsd4_name_still_same(parent, open, fl->fl_file->f_dentry))
> > > +		goto out_unlease;
> > > +	spin_lock(&recall_lock);
> > > +	if (fp->fi_had_conflict)
> > > +		/*
> > > +		 * whoops, already broken, but before we got a chance to
> > > +		 * install our delegation; never mind:
> > > +		 */
> > > +		 goto out_unlock;
> > > +	list_add(&dp->dl_perfile, &fp->fi_delegations);
> > > +	spin_unlock(&recall_lock);
> > >  	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
> > >  	fp->fi_lease = fl;
> > >  	fp->fi_deleg_file = get_file(fl->fl_file);
> > >  	atomic_set(&fp->fi_delegees, 1);
> > > -	list_add(&dp->dl_perfile, &fp->fi_delegations);
> > >  	return 0;
> > > +out_unlock:
> > > +	spin_unlock(&recall_lock);
> > > +out_unlease:
> > > +	vfs_setlease(fl->fl_file, F_UNLCK, &fl);
> > >  out_free:
> > >  	locks_free_lock(fl);
> > >  	return -ENOMEM;
> > 
> > Seems a little odd to return -ENOMEM when fi_had_conflict is true, but
> > from looking over the code I think that eventually becomes something
> > else so it shouldn't affect anything.
> 
> Yeah.  Errors setting a lease are ignored.  But I suppose eventually
> (e.g. as we implement more 4.1 stuff that tells client more about why
> delegations failed) we may want to use that.

Oh, actually we already do enough of that for the difference to be
visible to a 4.1 client (it will get either WND4_CONTENTION or
WND4_RESOURCE).  I doubt it makes much difference, since probably no
client is even using this information yet, but I'll queue this up for
3.12.

--b.
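
For readers following the quoted patch above: the ordering it relies on
(take the lease first, then re-lookup the name, and drop the lease if
the name no longer resolves to the same file) can be modeled in a few
lines of userspace C.  This is only a sketch under stated assumptions,
not the nfsd code: struct toy_dir, struct toy_lease, name_still_same()
and try_delegate() are hypothetical names invented for illustration,
and the one-entry "directory" stands in for the real dentry lookup.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical toy directory: a single name slot pointing at an object id. */
struct toy_dir {
	const char *name;
	int obj;		/* 0 means "no entry" */
};

/* Hypothetical lease record; "held" models a lease that is in place. */
struct toy_lease {
	int obj;
	bool held;
};

/* Re-lookup: does NAME in DIR still resolve to the object we opened? */
bool name_still_same(const struct toy_dir *dir, const char *name, int obj)
{
	return dir->obj != 0 && strcmp(dir->name, name) == 0 &&
	       dir->obj == obj;
}

/*
 * Take the lease first, then re-check the name.  If the entry was
 * unlinked or renamed between the open and the lease, give the lease
 * back and report that no delegation should be handed out.
 */
bool try_delegate(const struct toy_dir *dir, const char *name, int obj,
		  struct toy_lease *lease)
{
	lease->obj = obj;
	lease->held = true;		/* models a successful setlease */
	if (!name_still_same(dir, name, obj)) {
		lease->held = false;	/* models the unlease cleanup path */
		return false;
	}
	return true;
}
```

The point of checking after the lease is held, rather than before, is
that any later unlink or rename then has to break the lease, so the
client is called back instead of being left with a stale delegation.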

commit b1948a641daefe8d128749f3d419ed24d529a8ed
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Fri Jul 26 16:57:20 2013 -0400

    nfsd4: fix setlease error return
    
    This actually makes a difference in the 4.1 case, since we use the
    status to decide what reason to give the client for the delegation
    refusal (see nfsd4_open_deleg_none_ext), and in theory a client might
    choose suboptimal behavior if we give the wrong answer.
    
    Reported-by: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1cb6211..1852f53 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3028,7 +3028,7 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 	if (status) {
 		list_del_init(&dp->dl_perclnt);
 		locks_free_lock(fl);
-		return -ENOMEM;
+		return status;
 	}
 	fp->fi_lease = fl;
 	fp->fi_deleg_file = get_file(fl->fl_file);
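
To illustrate why the one-line change above is visible to a 4.1 client,
here is a small userspace model.  It is a sketch, not the kernel code:
the enum, refusal_reason(), setlease_old() and setlease_new() are
hypothetical names, and the mapping (-EAGAIN reported as contention,
every other failure as a resource problem) is an assumption about how
the status is interpreted when choosing between WND4_CONTENTION and
WND4_RESOURCE.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the two refusal reasons named in the thread. */
enum deleg_refusal {
	REFUSAL_CONTENTION,	/* models WND4_CONTENTION */
	REFUSAL_RESOURCE,	/* models WND4_RESOURCE */
};

/*
 * Assumed mapping: -EAGAIN means a conflicting holder, anything else is
 * treated as a resource problem.  Only meaningful for a failed attempt.
 */
enum deleg_refusal refusal_reason(int status)
{
	assert(status != 0);
	return status == -EAGAIN ? REFUSAL_CONTENTION : REFUSAL_RESOURCE;
}

/* Before the fix: every failure collapses to -ENOMEM. */
int setlease_old(int vfs_status)
{
	return vfs_status ? -ENOMEM : 0;
}

/* After the fix: the underlying status is passed through unchanged. */
int setlease_new(int vfs_status)
{
	return vfs_status;
}
```

Under these assumptions, a lease attempt that fails with -EAGAIN is
reported as a resource problem by the old return path and as contention
by the new one, which is the difference the commit message refers to.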


end of thread, other threads:[~2013-07-26 21:14 UTC | newest]

Thread overview: 43+ messages
2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
2013-07-17 20:50 ` [PATCH] nfsd4: fix minorversion support interface J. Bruce Fields
2013-07-17 21:08   ` J. Bruce Fields
2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
2013-07-17 20:50 ` [PATCH 03/16] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas J. Bruce Fields
2013-07-17 20:50 ` [PATCH 04/16] vfs: take i_mutex on renamed file J. Bruce Fields
2013-07-17 20:50 ` [PATCH 08/16] locks: break delegations on unlink J. Bruce Fields
2013-07-17 20:50 ` [PATCH 10/16] locks: break delegations on rename J. Bruce Fields
2013-07-17 20:50 ` [PATCH 13/16] nfsd4: minor nfs4_setlease cleanup J. Bruce Fields
2013-07-26 10:53   ` Jeff Layton
     [not found] ` <1374094217-31493-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-17 20:50   ` [PATCH 01/16] vfs: pull ext4's double-i_mutex-locking into common code J. Bruce Fields
2013-07-17 20:50     ` J. Bruce Fields
2013-07-17 20:50   ` [PATCH 05/16] locks: introduce new FL_DELEG lock flag J. Bruce Fields
2013-07-17 20:50     ` J. Bruce Fields
2013-07-17 20:50   ` [PATCH 06/16] locks: implement delegations J. Bruce Fields
2013-07-17 20:50     ` J. Bruce Fields
2013-07-17 20:50   ` [PATCH 07/16] namei: minor vfs_unlink cleanup J. Bruce Fields
2013-07-17 20:50     ` J. Bruce Fields
2013-07-17 20:50   ` [PATCH 09/16] locks: helper functions for delegation breaking J. Bruce Fields
2013-07-17 20:50     ` J. Bruce Fields
     [not found]     ` <1374094217-31493-11-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-26 10:50       ` Jeff Layton
2013-07-26 10:50         ` Jeff Layton
2013-07-17 20:50   ` [PATCH 11/16] locks: break delegations on link J. Bruce Fields
2013-07-17 20:50     ` J. Bruce Fields
2013-07-17 20:50   ` [PATCH 12/16] locks: break delegations on any attribute modification J. Bruce Fields
2013-07-17 20:50     ` J. Bruce Fields
     [not found]     ` <1374094217-31493-14-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-26 10:50       ` Jeff Layton
2013-07-26 10:50         ` Jeff Layton
2013-07-17 20:50   ` [PATCH 14/16] nfsd4: delay setting current_fh in open J. Bruce Fields
2013-07-17 20:50     ` J. Bruce Fields
     [not found]     ` <1374094217-31493-16-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-26 11:11       ` Jeff Layton
2013-07-26 11:11         ` Jeff Layton
2013-07-26 16:04         ` J. Bruce Fields
2013-07-17 20:50   ` [PATCH 15/16] nfsd4: close open-deleg/unlink/rename race J. Bruce Fields
2013-07-17 20:50     ` J. Bruce Fields
2013-07-26 11:23     ` Jeff Layton
     [not found]       ` <20130726072326.56113a2c-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2013-07-26 16:04         ` J. Bruce Fields
2013-07-26 16:04           ` J. Bruce Fields
2013-07-26 21:14           ` J. Bruce Fields
2013-07-17 20:50 ` [PATCH 16/16] nfsd4: break only delegations when appropriate J. Bruce Fields
     [not found]   ` <1374094217-31493-18-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-26 11:24     ` Jeff Layton
2013-07-26 11:24       ` Jeff Layton
2013-07-17 21:09 ` [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
