All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC v2 0/7] NFS Force Unmounting
@ 2017-11-10 22:37 Joshua Watt
  2017-11-10 22:37 ` [RFC v2 1/7] SUNRPC: Add flag to kill new tasks Joshua Watt
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Joshua Watt @ 2017-11-10 22:37 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Trond Myklebust
  Cc: linux-nfs, Al Viro, J . Bruce Fields, David Howells, Joshua Watt

Reworked patch set in response to Neil's comments. This patchset
requires his patch as a prerequisite:
    http://www.spinics.net/lists/linux-nfs/msg66341.html

These patches implements behavior that allows a failed server to be "cut
off" (e.g. all RPC tasks fail with -EIO) with the following sequence of
operations:

 mount -o remount,serverfailed
 umount -f

In the event you want to try and recover the server, you can remount to
remove the serverfailed status:
 
 mount -o remount,noserverfailed

Although, I'm not sure what implications that might have for data loss.

For the most part I think it is pretty straight forward, the only thing
I am unsure about is the last two patches. This behavior is really
useful in my use case (where the server disappears) as it prevents the
umount -f from blocking while trying to send some DESTROY_SESSION
messages to the client (as long as you mounted with nosharecache).

Version 1 can be found at:
    http://www.spinics.net/lists/linux-nfs/msg66348.html

Thanks in advance for the reviews and comments

Joshua Watt (7):
  SUNRPC: Add flag to kill new tasks
  SUNRPC: Expose kill_new_tasks in debugfs
  SUNRPC: Simplify client shutdown
  NFS: Add mount flags mask
  NFS: Add serverfailed mount option
  NFS: Propagate NFS_MOUNT_UNSHARED to clients
  NFS: Propagate operations to unshared clients

 fs/nfs/client.c                |  9 +++-
 fs/nfs/inode.c                 |  6 +++
 fs/nfs/internal.h              |  1 +
 fs/nfs/nfs3client.c            |  2 +
 fs/nfs/nfs4client.c            |  4 ++
 fs/nfs/super.c                 | 98 ++++++++++++++++++++++++++----------------
 include/linux/nfs_fs_sb.h      |  1 +
 include/linux/sunrpc/clnt.h    |  1 +
 include/uapi/linux/nfs_mount.h |  1 +
 net/sunrpc/clnt.c              | 11 ++---
 net/sunrpc/debugfs.c           |  4 ++
 net/sunrpc/sched.c             |  3 ++
 12 files changed, 95 insertions(+), 46 deletions(-)

-- 
2.13.6


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC v2 1/7] SUNRPC: Add flag to kill new tasks
  2017-11-10 22:37 [RFC v2 0/7] NFS Force Unmounting Joshua Watt
@ 2017-11-10 22:37 ` Joshua Watt
  2017-11-10 22:37 ` [RFC v2 2/7] SUNRPC: Expose kill_new_tasks in debugfs Joshua Watt
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Joshua Watt @ 2017-11-10 22:37 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Trond Myklebust
  Cc: linux-nfs, Al Viro, J . Bruce Fields, David Howells, Joshua Watt

The flag causes any new tasks that are queued to exit immediately with
-EIO instead of executing. This will allow clients (particularly NFS) to
prevents these task from delaying shutdown of the RPC session longer
than necessary.

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 include/linux/sunrpc/clnt.h | 1 +
 net/sunrpc/clnt.c           | 3 ---
 net/sunrpc/sched.c          | 3 +++
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index 71c237e8240e..94f4e464de1b 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -54,6 +54,7 @@ struct rpc_clnt {
 				cl_noretranstimeo: 1,/* No retransmit timeouts */
 				cl_autobind : 1,/* use getport() */
 				cl_chatty   : 1;/* be verbose */
+	bool			cl_kill_new_tasks;	/* Kill all new tasks */
 
 	struct rpc_rtt *	cl_rtt;		/* RTO estimator data */
 	const struct rpc_timeout *cl_timeout;	/* Timeout strategy */
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index df4ecb042ebe..aa64cd667696 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -818,9 +818,6 @@ void rpc_killall_tasks(struct rpc_clnt *clnt)
 {
 	struct rpc_task	*rovr;
 
-
-	if (list_empty(&clnt->cl_tasks))
-		return;
 	dprintk("RPC:       killing all tasks for client %p\n", clnt);
 	/*
 	 * Spin lock all_tasks to prevent changes...
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 0cc83839c13c..2c9fe5c0319b 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -748,6 +748,9 @@ static void __rpc_execute(struct rpc_task *task)
 	dprintk("RPC: %5u __rpc_execute flags=0x%x\n",
 			task->tk_pid, task->tk_flags);
 
+	if (task->tk_client->cl_kill_new_tasks)
+		rpc_exit(task, -EIO);
+
 	WARN_ON_ONCE(RPC_IS_QUEUED(task));
 	if (RPC_IS_QUEUED(task))
 		return;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC v2 2/7] SUNRPC: Expose kill_new_tasks in debugfs
  2017-11-10 22:37 [RFC v2 0/7] NFS Force Unmounting Joshua Watt
  2017-11-10 22:37 ` [RFC v2 1/7] SUNRPC: Add flag to kill new tasks Joshua Watt
@ 2017-11-10 22:37 ` Joshua Watt
  2017-11-10 22:37 ` [RFC v2 3/7] SUNRPC: Simplify client shutdown Joshua Watt
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Joshua Watt @ 2017-11-10 22:37 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Trond Myklebust
  Cc: linux-nfs, Al Viro, J . Bruce Fields, David Howells, Joshua Watt

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 net/sunrpc/debugfs.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/sunrpc/debugfs.c b/net/sunrpc/debugfs.c
index e980d2a493de..2ca67b681c1e 100644
--- a/net/sunrpc/debugfs.c
+++ b/net/sunrpc/debugfs.c
@@ -160,6 +160,10 @@ rpc_clnt_debugfs_register(struct rpc_clnt *clnt)
 	if (!debugfs_create_symlink("xprt", clnt->cl_debugfs, name))
 		goto out_err;
 
+	if (!debugfs_create_bool("kill_new_tasks", S_IRUSR | S_IWUSR,
+				 clnt->cl_debugfs, &clnt->cl_kill_new_tasks))
+		goto out_err;
+
 	return;
 out_err:
 	debugfs_remove_recursive(clnt->cl_debugfs);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC v2 3/7] SUNRPC: Simplify client shutdown
  2017-11-10 22:37 [RFC v2 0/7] NFS Force Unmounting Joshua Watt
  2017-11-10 22:37 ` [RFC v2 1/7] SUNRPC: Add flag to kill new tasks Joshua Watt
  2017-11-10 22:37 ` [RFC v2 2/7] SUNRPC: Expose kill_new_tasks in debugfs Joshua Watt
@ 2017-11-10 22:37 ` Joshua Watt
  2017-11-10 22:37 ` [RFC v2 4/7] NFS: Add mount flags mask Joshua Watt
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Joshua Watt @ 2017-11-10 22:37 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Trond Myklebust
  Cc: linux-nfs, Al Viro, J . Bruce Fields, David Howells, Joshua Watt

Use the flag to kill all new tasks when shutting down instead of
repeatedly killing all the pending tasks.

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 net/sunrpc/clnt.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index aa64cd667696..b00793c30518 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -850,11 +850,9 @@ void rpc_shutdown_client(struct rpc_clnt *clnt)
 			clnt->cl_program->name,
 			rcu_dereference(clnt->cl_xprt)->servername);
 
-	while (!list_empty(&clnt->cl_tasks)) {
-		rpc_killall_tasks(clnt);
-		wait_event_timeout(destroy_wait,
-			list_empty(&clnt->cl_tasks), 1*HZ);
-	}
+	clnt->cl_kill_new_tasks = true;
+	rpc_killall_tasks(clnt);
+	wait_event(destroy_wait, list_empty(&clnt->cl_tasks));
 
 	rpc_release_client(clnt);
 }
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC v2 4/7] NFS: Add mount flags mask
  2017-11-10 22:37 [RFC v2 0/7] NFS Force Unmounting Joshua Watt
                   ` (2 preceding siblings ...)
  2017-11-10 22:37 ` [RFC v2 3/7] SUNRPC: Simplify client shutdown Joshua Watt
@ 2017-11-10 22:37 ` Joshua Watt
  2017-11-10 22:37 ` [RFC v2 5/7] NFS: Add serverfailed mount option Joshua Watt
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Joshua Watt @ 2017-11-10 22:37 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Trond Myklebust
  Cc: linux-nfs, Al Viro, J . Bruce Fields, David Howells, Joshua Watt

Track which mount options were specified so that only the flags that
actually changed will be updated when remounting.

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 fs/nfs/internal.h |  1 +
 fs/nfs/super.c    | 58 ++++++++++++++++++++++++++-----------------------------
 2 files changed, 28 insertions(+), 31 deletions(-)

diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index f9a4a5524bd5..f4308b730c1c 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -90,6 +90,7 @@ struct nfs_client_initdata {
  */
 struct nfs_parsed_mount_data {
 	int			flags;
+	int			flags_mask;
 	unsigned int		rsize, wsize;
 	unsigned int		timeo, retrans;
 	unsigned int		acregmin, acregmax,
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index cf6de998294d..777a0cc22704 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -1197,6 +1197,16 @@ static int nfs_get_option_ul_bound(substring_t args[], unsigned long *option,
 	return 0;
 }
 
+static void
+set_flag(struct nfs_parsed_mount_data *mnt, int flag, bool set)
+{
+	mnt->flags_mask |= flag;
+	if (set)
+		mnt->flags |= flag;
+	else
+		mnt->flags &= ~flag;
+}
+
 /*
  * Error-check and convert a string of mount options from user space into
  * a data structure.  The whole mount string is processed; bad options are
@@ -1248,75 +1258,61 @@ static int nfs_parse_mount_options(char *raw,
 		 * boolean options:  foo/nofoo
 		 */
 		case Opt_soft:
-			mnt->flags |= NFS_MOUNT_SOFT;
-			break;
 		case Opt_hard:
-			mnt->flags &= ~NFS_MOUNT_SOFT;
+			set_flag(mnt, NFS_MOUNT_SOFT, token == Opt_soft);
 			break;
 		case Opt_posix:
-			mnt->flags |= NFS_MOUNT_POSIX;
-			break;
 		case Opt_noposix:
-			mnt->flags &= ~NFS_MOUNT_POSIX;
+			set_flag(mnt, NFS_MOUNT_POSIX, token == Opt_posix);
 			break;
 		case Opt_cto:
-			mnt->flags &= ~NFS_MOUNT_NOCTO;
-			break;
 		case Opt_nocto:
-			mnt->flags |= NFS_MOUNT_NOCTO;
+			set_flag(mnt, NFS_MOUNT_NOCTO, token == Opt_nocto);
 			break;
 		case Opt_ac:
-			mnt->flags &= ~NFS_MOUNT_NOAC;
-			break;
 		case Opt_noac:
-			mnt->flags |= NFS_MOUNT_NOAC;
+			set_flag(mnt, NFS_MOUNT_NOAC, token == Opt_noac);
 			break;
 		case Opt_lock:
-			mnt->flags &= ~NFS_MOUNT_NONLM;
-			mnt->flags &= ~(NFS_MOUNT_LOCAL_FLOCK |
-					NFS_MOUNT_LOCAL_FCNTL);
-			break;
 		case Opt_nolock:
-			mnt->flags |= NFS_MOUNT_NONLM;
-			mnt->flags |= (NFS_MOUNT_LOCAL_FLOCK |
-				       NFS_MOUNT_LOCAL_FCNTL);
+			set_flag(mnt, NFS_MOUNT_NONLM | NFS_MOUNT_LOCAL_FLOCK |
+				 NFS_MOUNT_LOCAL_FCNTL,
+				 token == Opt_nolock);
 			break;
 		case Opt_udp:
 			mnt->flags &= ~NFS_MOUNT_TCP;
+			mnt->flags_mask |= NFS_MOUNT_TCP;
 			mnt->nfs_server.protocol = XPRT_TRANSPORT_UDP;
 			break;
 		case Opt_tcp:
 			mnt->flags |= NFS_MOUNT_TCP;
+			mnt->flags_mask |= NFS_MOUNT_TCP;
 			mnt->nfs_server.protocol = XPRT_TRANSPORT_TCP;
 			break;
 		case Opt_rdma:
 			mnt->flags |= NFS_MOUNT_TCP; /* for side protocols */
+			mnt->flags_mask |= NFS_MOUNT_TCP;
 			mnt->nfs_server.protocol = XPRT_TRANSPORT_RDMA;
 			xprt_load_transport(p);
 			break;
 		case Opt_acl:
-			mnt->flags &= ~NFS_MOUNT_NOACL;
-			break;
 		case Opt_noacl:
-			mnt->flags |= NFS_MOUNT_NOACL;
+			set_flag(mnt, NFS_MOUNT_NOACL, token == Opt_noacl);
 			break;
 		case Opt_rdirplus:
-			mnt->flags &= ~NFS_MOUNT_NORDIRPLUS;
-			break;
 		case Opt_nordirplus:
-			mnt->flags |= NFS_MOUNT_NORDIRPLUS;
+			set_flag(mnt, NFS_MOUNT_NORDIRPLUS,
+				 token == Opt_nordirplus);
 			break;
 		case Opt_sharecache:
-			mnt->flags &= ~NFS_MOUNT_UNSHARED;
-			break;
 		case Opt_nosharecache:
-			mnt->flags |= NFS_MOUNT_UNSHARED;
+			set_flag(mnt, NFS_MOUNT_UNSHARED,
+				 token == Opt_nosharecache);
 			break;
 		case Opt_resvport:
-			mnt->flags &= ~NFS_MOUNT_NORESVPORT;
-			break;
 		case Opt_noresvport:
-			mnt->flags |= NFS_MOUNT_NORESVPORT;
+			set_flag(mnt, NFS_MOUNT_NORESVPORT,
+				 token == Opt_noresvport);
 			break;
 		case Opt_fscache:
 			mnt->options |= NFS_OPTION_FSCACHE;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC v2 5/7] NFS: Add serverfailed mount option
  2017-11-10 22:37 [RFC v2 0/7] NFS Force Unmounting Joshua Watt
                   ` (3 preceding siblings ...)
  2017-11-10 22:37 ` [RFC v2 4/7] NFS: Add mount flags mask Joshua Watt
@ 2017-11-10 22:37 ` Joshua Watt
  2017-11-10 22:45   ` Chuck Lever
  2017-11-10 22:37 ` [RFC v2 6/7] NFS: Propagate NFS_MOUNT_UNSHARED to clients Joshua Watt
  2017-11-10 22:37 ` [RFC v2 7/7] NFS: Propagate operations to unshared clients Joshua Watt
  6 siblings, 1 reply; 11+ messages in thread
From: Joshua Watt @ 2017-11-10 22:37 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Trond Myklebust
  Cc: linux-nfs, Al Viro, J . Bruce Fields, David Howells, Joshua Watt

Specifying the serverfailed mount option causes all subsequent RPC tasks
that are queued to fail with -EIO instead of timing out. For example, if
a server has disappeared, the sequence:

 mount -o remount,serverfailed
 umount -f

will ensure that all pending I/O requests are cancelled, and any new
requests will also fail. In the event that the server returns, the flag
can be removed with a remount:

 mount -o remount,noserverfailed

Although bringing the server back may result in a loss of data

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 fs/nfs/inode.c                 |  6 ++++++
 fs/nfs/super.c                 | 30 +++++++++++++++++++++++++-----
 include/uapi/linux/nfs_mount.h |  1 +
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 134d9f560240..55b7cd40508d 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -723,6 +723,12 @@ static void nfs_readdirplus_parent_cache_hit(struct dentry *dentry)
 
 static bool nfs_need_revalidate_inode(struct inode *inode)
 {
+	/* If the server has failed, it is not going to respond, so don't try
+	 * and revalidated (otherwise, the serverfailed flag can't be cleared by
+	 * a remount)
+	 */
+	if (NFS_SERVER(inode)->flags & NFS_MOUNT_SERVERFAILED)
+		return false;
 	if (NFS_I(inode)->cache_validity &
 			(NFS_INO_INVALID_ATTR|NFS_INO_INVALID_LABEL))
 		return true;
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 777a0cc22704..bca38e1cdd85 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -90,6 +90,7 @@ enum {
 	Opt_resvport, Opt_noresvport,
 	Opt_fscache, Opt_nofscache,
 	Opt_migration, Opt_nomigration,
+	Opt_serverfailed, Opt_noserverfailed,
 
 	/* Mount options that take integer arguments */
 	Opt_port,
@@ -151,6 +152,8 @@ static const match_table_t nfs_mount_option_tokens = {
 	{ Opt_nofscache, "nofsc" },
 	{ Opt_migration, "migration" },
 	{ Opt_nomigration, "nomigration" },
+	{ Opt_serverfailed, "serverfailed" },
+	{ Opt_noserverfailed, "noserverfailed" },
 
 	{ Opt_port, "port=%s" },
 	{ Opt_rsize, "rsize=%s" },
@@ -637,6 +640,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
 		{ NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" },
 		{ NFS_MOUNT_UNSHARED, ",nosharecache", "" },
 		{ NFS_MOUNT_NORESVPORT, ",noresvport", "" },
+		{ NFS_MOUNT_SERVERFAILED, ",serverfailed", "" },
 		{ 0, NULL, NULL }
 	};
 	const struct proc_nfs_info *nfs_infop;
@@ -1330,6 +1334,11 @@ static int nfs_parse_mount_options(char *raw,
 		case Opt_nomigration:
 			mnt->options &= ~NFS_OPTION_MIGRATION;
 			break;
+		case Opt_serverfailed:
+		case Opt_noserverfailed:
+			set_flag(mnt, NFS_MOUNT_SERVERFAILED,
+				 token == Opt_serverfailed);
+			break;
 
 		/*
 		 * options that take numeric values
@@ -2192,6 +2201,8 @@ static int nfs_validate_text_mount_data(void *options,
 	return -EINVAL;
 }
 
+#define NFS_REMOUNT_CHANGE_FLAGS (NFS_MOUNT_SERVERFAILED)
+
 #define NFS_REMOUNT_CMP_FLAGMASK ~(NFS_MOUNT_INTR \
 		| NFS_MOUNT_SECURE \
 		| NFS_MOUNT_TCP \
@@ -2200,7 +2211,8 @@ static int nfs_validate_text_mount_data(void *options,
 		| NFS_MOUNT_NONLM \
 		| NFS_MOUNT_BROKEN_SUID \
 		| NFS_MOUNT_STRICTLOCK \
-		| NFS_MOUNT_LEGACY_INTERFACE)
+		| NFS_MOUNT_LEGACY_INTERFACE \
+		| NFS_REMOUNT_CHANGE_FLAGS)
 
 #define NFS_MOUNT_CMP_FLAGMASK (NFS_REMOUNT_CMP_FLAGMASK & \
 		~(NFS_MOUNT_UNSHARED | NFS_MOUNT_NORESVPORT))
@@ -2209,12 +2221,15 @@ static int
 nfs_compare_and_set_remount_data(struct nfs_server *nfss,
 				 struct nfs_parsed_mount_data *data)
 {
+	int changed_flags_mask = data->flags_mask & NFS_REMOUNT_CHANGE_FLAGS;
+	struct rpc_clnt *cl = nfss->client;
+
 	if ((data->flags ^ nfss->flags) & NFS_REMOUNT_CMP_FLAGMASK ||
 	    data->rsize != nfss->rsize ||
 	    data->wsize != nfss->wsize ||
 	    data->version != nfss->nfs_client->rpc_ops->version ||
 	    data->minorversion != nfss->nfs_client->cl_minorversion ||
-	    !nfs_auth_info_match(&data->auth_info, nfss->client->cl_auth->au_flavor) ||
+	    !nfs_auth_info_match(&data->auth_info, cl->cl_auth->au_flavor) ||
 	    data->acregmin != nfss->acregmin / HZ ||
 	    data->acregmax != nfss->acregmax / HZ ||
 	    data->acdirmin != nfss->acdirmin / HZ ||
@@ -2225,13 +2240,16 @@ nfs_compare_and_set_remount_data(struct nfs_server *nfss,
 			  (struct sockaddr *)&nfss->nfs_client->cl_addr))
 		return -EINVAL;
 
-	if (data->retrans != nfss->client->cl_timeout->to_retries ||
-	    data->timeo != (10U * nfss->client->cl_timeout->to_initval / HZ)) {
+	/* Update flags */
+	nfss->flags = (nfss->flags & ~changed_flags_mask) |
+		(data->flags & changed_flags_mask);
+
+	if (data->retrans != cl->cl_timeout->to_retries ||
+	    data->timeo != (10U * cl->cl_timeout->to_initval / HZ)) {
 		/* Note that this will affect all mounts from the same server,
 		 * that use the same protocol.  The timeouts are always forced
 		 * to be the same.
 		 */
-		struct rpc_clnt *cl = nfss->client;
 		if (cl->cl_timeout != &cl->cl_timeout_default)
 			memcpy(&cl->cl_timeout_default, cl->cl_timeout,
 			       sizeof(struct rpc_timeout));
@@ -2239,6 +2257,8 @@ nfs_compare_and_set_remount_data(struct nfs_server *nfss,
 		cl->cl_timeout_default.to_initval = data->timeo * HZ / 10U;
 	}
 
+	cl->cl_kill_new_tasks = !!(nfss->flags & NFS_MOUNT_SERVERFAILED);
+
 	return 0;
 }
 
diff --git a/include/uapi/linux/nfs_mount.h b/include/uapi/linux/nfs_mount.h
index e44e00616ab5..8ad931cdca20 100644
--- a/include/uapi/linux/nfs_mount.h
+++ b/include/uapi/linux/nfs_mount.h
@@ -74,5 +74,6 @@ struct nfs_mount_data {
 
 #define NFS_MOUNT_LOCAL_FLOCK	0x100000
 #define NFS_MOUNT_LOCAL_FCNTL	0x200000
+#define NFS_MOUNT_SERVERFAILED	0x400000
 
 #endif
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC v2 6/7] NFS: Propagate NFS_MOUNT_UNSHARED to clients
  2017-11-10 22:37 [RFC v2 0/7] NFS Force Unmounting Joshua Watt
                   ` (4 preceding siblings ...)
  2017-11-10 22:37 ` [RFC v2 5/7] NFS: Add serverfailed mount option Joshua Watt
@ 2017-11-10 22:37 ` Joshua Watt
  2017-11-10 22:37 ` [RFC v2 7/7] NFS: Propagate operations to unshared clients Joshua Watt
  6 siblings, 0 replies; 11+ messages in thread
From: Joshua Watt @ 2017-11-10 22:37 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Trond Myklebust
  Cc: linux-nfs, Al Viro, J . Bruce Fields, David Howells, Joshua Watt

Propagating the flag allows operations to be performed on the client
without affecting other mounts to the same client, much like operations
on the superblock will not affect other superblocks.

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 fs/nfs/client.c           | 9 +++++++--
 fs/nfs/nfs3client.c       | 2 ++
 fs/nfs/nfs4client.c       | 4 ++++
 include/linux/nfs_fs_sb.h | 1 +
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 22880ef6d8dd..c5872286f322 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -297,6 +297,10 @@ static struct nfs_client *nfs_match_client(const struct nfs_client_initdata *dat
 		if (clp->cl_cons_state < 0)
 			continue;
 
+		/* Don't match clients that don't want to be shared */
+		if (test_bit(NFS_CS_UNSHARED, &clp->cl_flags))
+			continue;
+
 		/* Different NFS versions cannot share the same nfs_client */
 		if (clp->rpc_ops != data->nfs_mod->rpc_ops)
 			continue;
@@ -386,7 +390,7 @@ nfs_found_client(const struct nfs_client_initdata *cl_init,
  */
 struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init)
 {
-	struct nfs_client *clp, *new = NULL;
+	struct nfs_client *clp = NULL, *new = NULL;
 	struct nfs_net *nn = net_generic(cl_init->net, nfs_net_id);
 	const struct nfs_rpc_ops *rpc_ops = cl_init->nfs_mod->rpc_ops;
 
@@ -399,7 +403,8 @@ struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init)
 	do {
 		spin_lock(&nn->nfs_client_lock);
 
-		clp = nfs_match_client(cl_init);
+		if (!test_bit(NFS_CS_UNSHARED, &cl_init->init_flags))
+			clp = nfs_match_client(cl_init);
 		if (clp) {
 			spin_unlock(&nn->nfs_client_lock);
 			if (new)
diff --git a/fs/nfs/nfs3client.c b/fs/nfs/nfs3client.c
index 7879f2a0fcfd..b62bed41053f 100644
--- a/fs/nfs/nfs3client.c
+++ b/fs/nfs/nfs3client.c
@@ -102,6 +102,8 @@ struct nfs_client *nfs3_set_ds_client(struct nfs_server *mds_srv,
 
 	if (mds_srv->flags & NFS_MOUNT_NORESVPORT)
 		set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags);
+	if (mds_srv->flags & NFS_MOUNT_UNSHARED)
+		__set_bit(NFS_CS_UNSHARED, &cl_init.init_flags);
 
 	/* Use the MDS nfs_client cl_ipaddr. */
 	nfs_init_timeout_values(&ds_timeout, ds_proto, ds_timeo, ds_retrans);
diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index e9bea90dc017..e51467f5b3bc 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -852,6 +852,8 @@ static int nfs4_set_client(struct nfs_server *server,
 		set_bit(NFS_CS_MIGRATION, &cl_init.init_flags);
 	if (test_bit(NFS_MIG_TSM_POSSIBLE, &server->mig_status))
 		set_bit(NFS_CS_TSM_POSSIBLE, &cl_init.init_flags);
+	if (server->flags & NFS_MOUNT_UNSHARED)
+		__set_bit(NFS_CS_UNSHARED, &cl_init.init_flags);
 
 	/* Allocate or find a client reference we can use */
 	clp = nfs_get_client(&cl_init);
@@ -910,6 +912,8 @@ struct nfs_client *nfs4_set_ds_client(struct nfs_server *mds_srv,
 
 	if (mds_srv->flags & NFS_MOUNT_NORESVPORT)
 		__set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags);
+	if (mds_srv->flags & NFS_MOUNT_UNSHARED)
+		__set_bit(NFS_CS_UNSHARED, &cl_init.init_flags);
 
 	/*
 	 * Set an authflavor equual to the MDS value. Use the MDS nfs_client
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 286b71c418b4..8761f729921a 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -44,6 +44,7 @@ struct nfs_client {
 #define NFS_CS_INFINITE_SLOTS	3		/* - don't limit TCP slots */
 #define NFS_CS_NO_RETRANS_TIMEOUT	4	/* - Disable retransmit timeouts */
 #define NFS_CS_TSM_POSSIBLE	5		/* - Maybe state migration */
+#define NFS_CS_UNSHARED		6		/* - Client state is not shared */
 	struct sockaddr_storage	cl_addr;	/* server identifier */
 	size_t			cl_addrlen;
 	char *			cl_hostname;	/* hostname of server */
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC v2 7/7] NFS: Propagate operations to unshared clients
  2017-11-10 22:37 [RFC v2 0/7] NFS Force Unmounting Joshua Watt
                   ` (5 preceding siblings ...)
  2017-11-10 22:37 ` [RFC v2 6/7] NFS: Propagate NFS_MOUNT_UNSHARED to clients Joshua Watt
@ 2017-11-10 22:37 ` Joshua Watt
  6 siblings, 0 replies; 11+ messages in thread
From: Joshua Watt @ 2017-11-10 22:37 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Trond Myklebust
  Cc: linux-nfs, Al Viro, J . Bruce Fields, David Howells, Joshua Watt

If a struct nfs_server's client is not shared with other superblocks,
propagate the umount_begin and serverfailed behavior to it.

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 fs/nfs/super.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index bca38e1cdd85..0348fd43ffe4 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -911,6 +911,11 @@ void nfs_umount_begin(struct super_block *sb)
 	rpc = server->client;
 	if (!IS_ERR(rpc))
 		rpc_killall_tasks(rpc);
+
+	rpc = server->nfs_client->cl_rpcclient;
+	if (test_bit(NFS_CS_UNSHARED, &server->nfs_client->cl_flags) &&
+	    !IS_ERR(rpc))
+		rpc_killall_tasks(rpc);
 }
 EXPORT_SYMBOL_GPL(nfs_umount_begin);
 
@@ -2223,6 +2228,7 @@ nfs_compare_and_set_remount_data(struct nfs_server *nfss,
 {
 	int changed_flags_mask = data->flags_mask & NFS_REMOUNT_CHANGE_FLAGS;
 	struct rpc_clnt *cl = nfss->client;
+	bool serverfailed;
 
 	if ((data->flags ^ nfss->flags) & NFS_REMOUNT_CMP_FLAGMASK ||
 	    data->rsize != nfss->rsize ||
@@ -2257,7 +2263,11 @@ nfs_compare_and_set_remount_data(struct nfs_server *nfss,
 		cl->cl_timeout_default.to_initval = data->timeo * HZ / 10U;
 	}
 
-	cl->cl_kill_new_tasks = !!(nfss->flags & NFS_MOUNT_SERVERFAILED);
+	serverfailed = !!(nfss->flags & NFS_MOUNT_SERVERFAILED);
+	cl->cl_kill_new_tasks = serverfailed;
+	if (test_bit(NFS_CS_UNSHARED, &nfss->nfs_client->cl_flags))
+		nfss->nfs_client->cl_rpcclient->cl_kill_new_tasks =
+			serverfailed;
 
 	return 0;
 }
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC v2 5/7] NFS: Add serverfailed mount option
  2017-11-10 22:37 ` [RFC v2 5/7] NFS: Add serverfailed mount option Joshua Watt
@ 2017-11-10 22:45   ` Chuck Lever
  2017-11-13 16:29     ` Joshua Watt
  0 siblings, 1 reply; 11+ messages in thread
From: Chuck Lever @ 2017-11-10 22:45 UTC (permalink / raw)
  To: Joshua Watt
  Cc: NeilBrown, Jeff Layton, Trond Myklebust, Linux NFS Mailing List,
	Al Viro, Bruce Fields, David Howells


> On Nov 10, 2017, at 5:37 PM, Joshua Watt <jpewhacker@gmail.com> wrote:
> 
> Specifying the serverfailed mount option causes all subsequent RPC tasks
> that are queued to fail with -EIO instead of timing out. For example, if
> a server has disappeared, the sequence:
> 
> mount -o remount,serverfailed
> umount -f
> 
> will ensure that all pending I/O requests are cancelled, and any new
> requests will also fail. In the event that the server returns, the flag
> can be removed with a remount:
> 
> mount -o remount,noserverfailed
> 
> Although bringing the server back may result in a loss of data

Hi Joshua, interesting work!

A couple of things I'm wondering:

1. Does this also change the serverfailed setting on submounts
(ie mounts that the kernel did when traversing an NFSv4 referral or
when going from the server's pseudofs into a real fs). These need
to be unmounted first before the parent mount can be unmounted,
and in the latter case they would all be backed by the same stuck
NFS server.

2. If there is a hanging sync(2), does the umount -f unstick it?
That seems relevant for Neil's "make shutdown reliable" use case.
I would like a stuck NFS mount not to result in local file system
corruption, if possible, during a hard shutdown.


> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> ---
> fs/nfs/inode.c                 |  6 ++++++
> fs/nfs/super.c                 | 30 +++++++++++++++++++++++++-----
> include/uapi/linux/nfs_mount.h |  1 +
> 3 files changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 134d9f560240..55b7cd40508d 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -723,6 +723,12 @@ static void nfs_readdirplus_parent_cache_hit(struct dentry *dentry)
> 
> static bool nfs_need_revalidate_inode(struct inode *inode)
> {
> +	/* If the server has failed, it is not going to respond, so don't try
> +	 * and revalidated (otherwise, the serverfailed flag can't be cleared by
> +	 * a remount)
> +	 */
> +	if (NFS_SERVER(inode)->flags & NFS_MOUNT_SERVERFAILED)
> +		return false;
> 	if (NFS_I(inode)->cache_validity &
> 			(NFS_INO_INVALID_ATTR|NFS_INO_INVALID_LABEL))
> 		return true;
> diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> index 777a0cc22704..bca38e1cdd85 100644
> --- a/fs/nfs/super.c
> +++ b/fs/nfs/super.c
> @@ -90,6 +90,7 @@ enum {
> 	Opt_resvport, Opt_noresvport,
> 	Opt_fscache, Opt_nofscache,
> 	Opt_migration, Opt_nomigration,
> +	Opt_serverfailed, Opt_noserverfailed,
> 
> 	/* Mount options that take integer arguments */
> 	Opt_port,
> @@ -151,6 +152,8 @@ static const match_table_t nfs_mount_option_tokens = {
> 	{ Opt_nofscache, "nofsc" },
> 	{ Opt_migration, "migration" },
> 	{ Opt_nomigration, "nomigration" },
> +	{ Opt_serverfailed, "serverfailed" },
> +	{ Opt_noserverfailed, "noserverfailed" },
> 
> 	{ Opt_port, "port=%s" },
> 	{ Opt_rsize, "rsize=%s" },
> @@ -637,6 +640,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
> 		{ NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" },
> 		{ NFS_MOUNT_UNSHARED, ",nosharecache", "" },
> 		{ NFS_MOUNT_NORESVPORT, ",noresvport", "" },
> +		{ NFS_MOUNT_SERVERFAILED, ",serverfailed", "" },
> 		{ 0, NULL, NULL }
> 	};
> 	const struct proc_nfs_info *nfs_infop;
> @@ -1330,6 +1334,11 @@ static int nfs_parse_mount_options(char *raw,
> 		case Opt_nomigration:
> 			mnt->options &= ~NFS_OPTION_MIGRATION;
> 			break;
> +		case Opt_serverfailed:
> +		case Opt_noserverfailed:
> +			set_flag(mnt, NFS_MOUNT_SERVERFAILED,
> +				 token == Opt_serverfailed);
> +			break;
> 
> 		/*
> 		 * options that take numeric values
> @@ -2192,6 +2201,8 @@ static int nfs_validate_text_mount_data(void *options,
> 	return -EINVAL;
> }
> 
> +#define NFS_REMOUNT_CHANGE_FLAGS (NFS_MOUNT_SERVERFAILED)
> +
> #define NFS_REMOUNT_CMP_FLAGMASK ~(NFS_MOUNT_INTR \
> 		| NFS_MOUNT_SECURE \
> 		| NFS_MOUNT_TCP \
> @@ -2200,7 +2211,8 @@ static int nfs_validate_text_mount_data(void *options,
> 		| NFS_MOUNT_NONLM \
> 		| NFS_MOUNT_BROKEN_SUID \
> 		| NFS_MOUNT_STRICTLOCK \
> -		| NFS_MOUNT_LEGACY_INTERFACE)
> +		| NFS_MOUNT_LEGACY_INTERFACE \
> +		| NFS_REMOUNT_CHANGE_FLAGS)
> 
> #define NFS_MOUNT_CMP_FLAGMASK (NFS_REMOUNT_CMP_FLAGMASK & \
> 		~(NFS_MOUNT_UNSHARED | NFS_MOUNT_NORESVPORT))
> @@ -2209,12 +2221,15 @@ static int
> nfs_compare_and_set_remount_data(struct nfs_server *nfss,
> 				 struct nfs_parsed_mount_data *data)
> {
> +	int changed_flags_mask = data->flags_mask & NFS_REMOUNT_CHANGE_FLAGS;
> +	struct rpc_clnt *cl = nfss->client;
> +
> 	if ((data->flags ^ nfss->flags) & NFS_REMOUNT_CMP_FLAGMASK ||
> 	    data->rsize != nfss->rsize ||
> 	    data->wsize != nfss->wsize ||
> 	    data->version != nfss->nfs_client->rpc_ops->version ||
> 	    data->minorversion != nfss->nfs_client->cl_minorversion ||
> -	    !nfs_auth_info_match(&data->auth_info, nfss->client->cl_auth->au_flavor) ||
> +	    !nfs_auth_info_match(&data->auth_info, cl->cl_auth->au_flavor) ||
> 	    data->acregmin != nfss->acregmin / HZ ||
> 	    data->acregmax != nfss->acregmax / HZ ||
> 	    data->acdirmin != nfss->acdirmin / HZ ||
> @@ -2225,13 +2240,16 @@ nfs_compare_and_set_remount_data(struct nfs_server *nfss,
> 			  (struct sockaddr *)&nfss->nfs_client->cl_addr))
> 		return -EINVAL;
> 
> -	if (data->retrans != nfss->client->cl_timeout->to_retries ||
> -	    data->timeo != (10U * nfss->client->cl_timeout->to_initval / HZ)) {
> +	/* Update flags */
> +	nfss->flags = (nfss->flags & ~changed_flags_mask) |
> +		(data->flags & changed_flags_mask);
> +
> +	if (data->retrans != cl->cl_timeout->to_retries ||
> +	    data->timeo != (10U * cl->cl_timeout->to_initval / HZ)) {
> 		/* Note that this will affect all mounts from the same server,
> 		 * that use the same protocol.  The timeouts are always forced
> 		 * to be the same.
> 		 */
> -		struct rpc_clnt *cl = nfss->client;
> 		if (cl->cl_timeout != &cl->cl_timeout_default)
> 			memcpy(&cl->cl_timeout_default, cl->cl_timeout,
> 			       sizeof(struct rpc_timeout));
> @@ -2239,6 +2257,8 @@ nfs_compare_and_set_remount_data(struct nfs_server *nfss,
> 		cl->cl_timeout_default.to_initval = data->timeo * HZ / 10U;
> 	}
> 
> +	cl->cl_kill_new_tasks = !!(nfss->flags & NFS_MOUNT_SERVERFAILED);
> +
> 	return 0;
> }
> 
> diff --git a/include/uapi/linux/nfs_mount.h b/include/uapi/linux/nfs_mount.h
> index e44e00616ab5..8ad931cdca20 100644
> --- a/include/uapi/linux/nfs_mount.h
> +++ b/include/uapi/linux/nfs_mount.h
> @@ -74,5 +74,6 @@ struct nfs_mount_data {
> 
> #define NFS_MOUNT_LOCAL_FLOCK	0x100000
> #define NFS_MOUNT_LOCAL_FCNTL	0x200000
> +#define NFS_MOUNT_SERVERFAILED	0x400000
> 
> #endif
> -- 
> 2.13.6
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC v2 5/7] NFS: Add serverfailed mount option
  2017-11-10 22:45   ` Chuck Lever
@ 2017-11-13 16:29     ` Joshua Watt
  2017-11-13 17:23       ` Chuck Lever
  0 siblings, 1 reply; 11+ messages in thread
From: Joshua Watt @ 2017-11-13 16:29 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Jeff Layton, Trond Myklebust, Linux NFS Mailing List,
	Al Viro, Bruce Fields, David Howells

On Fri, 2017-11-10 at 17:45 -0500, Chuck Lever wrote:
> > On Nov 10, 2017, at 5:37 PM, Joshua Watt <jpewhacker@gmail.com>
> > wrote:
> > 
> > Specifying the serverfailed mount option causes all subsequent RPC
> > tasks
> > that are queued to fail with -EIO instead of timing out. For
> > example, if
> > a server has disappeared, the sequence:
> > 
> > mount -o remount,serverfailed
> > umount -f
> > 
> > will ensure that all pending I/O requests are cancelled, and any
> > new
> > requests will also fail. In the event that the server returns, the
> > flag
> > can be removed with a remount:
> > 
> > mount -o remount,noserverfailed
> > 
> > Although bringing the server back may result in a loss of data
> 
> Hi Joshua, interesting work!
> 
> A couple of things I'm wondering:
> 
> 1. Does this also change the serverfailed setting on submounts
> (ie mounts that the kernel did when traversing an NFSv4 referral or
> when going from the server's pseudofs into a real fs). These need
> to be unmounted first before the parent mount can be unmounted,
> and in the latter case they would all be backed by the same stuck
> NFS server.

I don't honestly know, but I will find out. Our use case doens't deal
with either of those cases much.

> 
> 2. If there is a hanging sync(2), does the umount -f unstick it?
> That seems relevant for Neil's "make shutdown reliable" use case.
> I would like a stuck NFS mount not to result in local file system
> corruption, if possible, during a hard shutdown.
> 

Several previous posts have aluded to calling "umount -f" repeatedly to
get a "stuck" NFS mount to actually unmount. This patch set is
effectively a way of automating that process. More formally, the
sequence of commands:

 mount PATH -o remount,serverfailed
 umount -f PATH

Is closely approximated by:

 while(1)
     umount2(PATH, MNT_FORCE);

from userspace.

In retrospect, I think that the "umount -f" shouldn't be required after
remount: If the serverfailed mount option is specified it should also
kill all pending RPCs. A umount -f is undesirable, because you may not
actually want to (potentially) unmount the file system just to cancel
the RPCs.

More directly to your question: I don't honestly know if this can
"unstick" a hanging sync(2). If repeatedly calling umount -f will
unstick it, then these patches will exhibit the same behavior. I will
continue to do some research on this, but if anyone knows the answer
they might be able to chime in.

Could you clarify on what you mean by a stuck NFS mount not resulting
in local filesystem corruption? I'm not sure I undertsand well enough
to follow that comment.

Thanks for the feedback

> 
> > Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> > ---
> > fs/nfs/inode.c                 |  6 ++++++
> > fs/nfs/super.c                 | 30 +++++++++++++++++++++++++-----
> > include/uapi/linux/nfs_mount.h |  1 +
> > 3 files changed, 32 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> > index 134d9f560240..55b7cd40508d 100644
> > --- a/fs/nfs/inode.c
> > +++ b/fs/nfs/inode.c
> > @@ -723,6 +723,12 @@ static void
> > nfs_readdirplus_parent_cache_hit(struct dentry *dentry)
> > 
> > static bool nfs_need_revalidate_inode(struct inode *inode)
> > {
> > +	/* If the server has failed, it is not going to respond,
> > so don't try
> > +	 * and revalidated (otherwise, the serverfailed flag can't
> > be cleared by
> > +	 * a remount)
> > +	 */
> > +	if (NFS_SERVER(inode)->flags & NFS_MOUNT_SERVERFAILED)
> > +		return false;
> > 	if (NFS_I(inode)->cache_validity &
> > 			(NFS_INO_INVALID_ATTR|NFS_INO_INVALID_LABEL))
> > 		return true;
> > diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> > index 777a0cc22704..bca38e1cdd85 100644
> > --- a/fs/nfs/super.c
> > +++ b/fs/nfs/super.c
> > @@ -90,6 +90,7 @@ enum {
> > 	Opt_resvport, Opt_noresvport,
> > 	Opt_fscache, Opt_nofscache,
> > 	Opt_migration, Opt_nomigration,
> > +	Opt_serverfailed, Opt_noserverfailed,
> > 
> > 	/* Mount options that take integer arguments */
> > 	Opt_port,
> > @@ -151,6 +152,8 @@ static const match_table_t
> > nfs_mount_option_tokens = {
> > 	{ Opt_nofscache, "nofsc" },
> > 	{ Opt_migration, "migration" },
> > 	{ Opt_nomigration, "nomigration" },
> > +	{ Opt_serverfailed, "serverfailed" },
> > +	{ Opt_noserverfailed, "noserverfailed" },
> > 
> > 	{ Opt_port, "port=%s" },
> > 	{ Opt_rsize, "rsize=%s" },
> > @@ -637,6 +640,7 @@ static void nfs_show_mount_options(struct
> > seq_file *m, struct nfs_server *nfss,
> > 		{ NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" },
> > 		{ NFS_MOUNT_UNSHARED, ",nosharecache", "" },
> > 		{ NFS_MOUNT_NORESVPORT, ",noresvport", "" },
> > +		{ NFS_MOUNT_SERVERFAILED, ",serverfailed", "" },
> > 		{ 0, NULL, NULL }
> > 	};
> > 	const struct proc_nfs_info *nfs_infop;
> > @@ -1330,6 +1334,11 @@ static int nfs_parse_mount_options(char
> > *raw,
> > 		case Opt_nomigration:
> > 			mnt->options &= ~NFS_OPTION_MIGRATION;
> > 			break;
> > +		case Opt_serverfailed:
> > +		case Opt_noserverfailed:
> > +			set_flag(mnt, NFS_MOUNT_SERVERFAILED,
> > +				 token == Opt_serverfailed);
> > +			break;
> > 
> > 		/*
> > 		 * options that take numeric values
> > @@ -2192,6 +2201,8 @@ static int nfs_validate_text_mount_data(void
> > *options,
> > 	return -EINVAL;
> > }
> > 
> > +#define NFS_REMOUNT_CHANGE_FLAGS (NFS_MOUNT_SERVERFAILED)
> > +
> > #define NFS_REMOUNT_CMP_FLAGMASK ~(NFS_MOUNT_INTR \
> > 		| NFS_MOUNT_SECURE \
> > 		| NFS_MOUNT_TCP \
> > @@ -2200,7 +2211,8 @@ static int nfs_validate_text_mount_data(void
> > *options,
> > 		| NFS_MOUNT_NONLM \
> > 		| NFS_MOUNT_BROKEN_SUID \
> > 		| NFS_MOUNT_STRICTLOCK \
> > -		| NFS_MOUNT_LEGACY_INTERFACE)
> > +		| NFS_MOUNT_LEGACY_INTERFACE \
> > +		| NFS_REMOUNT_CHANGE_FLAGS)
> > 
> > #define NFS_MOUNT_CMP_FLAGMASK (NFS_REMOUNT_CMP_FLAGMASK & \
> > 		~(NFS_MOUNT_UNSHARED | NFS_MOUNT_NORESVPORT))
> > @@ -2209,12 +2221,15 @@ static int
> > nfs_compare_and_set_remount_data(struct nfs_server *nfss,
> > 				 struct nfs_parsed_mount_data *data)
> > {
> > +	int changed_flags_mask = data->flags_mask &
> > NFS_REMOUNT_CHANGE_FLAGS;
> > +	struct rpc_clnt *cl = nfss->client;
> > +
> > 	if ((data->flags ^ nfss->flags) & NFS_REMOUNT_CMP_FLAGMASK ||
> > 	    data->rsize != nfss->rsize ||
> > 	    data->wsize != nfss->wsize ||
> > 	    data->version != nfss->nfs_client->rpc_ops->version ||
> > 	    data->minorversion != nfss->nfs_client->cl_minorversion ||
> > -	    !nfs_auth_info_match(&data->auth_info, nfss->client-
> > >cl_auth->au_flavor) ||
> > +	    !nfs_auth_info_match(&data->auth_info, cl->cl_auth-
> > >au_flavor) ||
> > 	    data->acregmin != nfss->acregmin / HZ ||
> > 	    data->acregmax != nfss->acregmax / HZ ||
> > 	    data->acdirmin != nfss->acdirmin / HZ ||
> > @@ -2225,13 +2240,16 @@ nfs_compare_and_set_remount_data(struct
> > nfs_server *nfss,
> > 			  (struct sockaddr *)&nfss->nfs_client-
> > >cl_addr))
> > 		return -EINVAL;
> > 
> > -	if (data->retrans != nfss->client->cl_timeout->to_retries
> > ||
> > -	    data->timeo != (10U * nfss->client->cl_timeout-
> > >to_initval / HZ)) {
> > +	/* Update flags */
> > +	nfss->flags = (nfss->flags & ~changed_flags_mask) |
> > +		(data->flags & changed_flags_mask);
> > +
> > +	if (data->retrans != cl->cl_timeout->to_retries ||
> > +	    data->timeo != (10U * cl->cl_timeout->to_initval /
> > HZ)) {
> > 		/* Note that this will affect all mounts from the same
> > server,
> > 		 * that use the same protocol.  The timeouts are always
> > forced
> > 		 * to be the same.
> > 		 */
> > -		struct rpc_clnt *cl = nfss->client;
> > 		if (cl->cl_timeout != &cl->cl_timeout_default)
> > 			memcpy(&cl->cl_timeout_default, cl->cl_timeout,
> > 			       sizeof(struct rpc_timeout));
> > @@ -2239,6 +2257,8 @@ nfs_compare_and_set_remount_data(struct
> > nfs_server *nfss,
> > 		cl->cl_timeout_default.to_initval = data->timeo * HZ /
> > 10U;
> > 	}
> > 
> > +	cl->cl_kill_new_tasks = !!(nfss->flags &
> > NFS_MOUNT_SERVERFAILED);
> > +
> > 	return 0;
> > }
> > 
> > diff --git a/include/uapi/linux/nfs_mount.h
> > b/include/uapi/linux/nfs_mount.h
> > index e44e00616ab5..8ad931cdca20 100644
> > --- a/include/uapi/linux/nfs_mount.h
> > +++ b/include/uapi/linux/nfs_mount.h
> > @@ -74,5 +74,6 @@ struct nfs_mount_data {
> > 
> > #define NFS_MOUNT_LOCAL_FLOCK	0x100000
> > #define NFS_MOUNT_LOCAL_FCNTL	0x200000
> > +#define NFS_MOUNT_SERVERFAILED	0x400000
> > 
> > #endif
> > -- 
> > 2.13.6
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-
> > nfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> Chuck Lever
> 
> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC v2 5/7] NFS: Add serverfailed mount option
  2017-11-13 16:29     ` Joshua Watt
@ 2017-11-13 17:23       ` Chuck Lever
  0 siblings, 0 replies; 11+ messages in thread
From: Chuck Lever @ 2017-11-13 17:23 UTC (permalink / raw)
  To: Joshua Watt
  Cc: NeilBrown, Jeff Layton, Trond Myklebust, Linux NFS Mailing List,
	Al Viro, Bruce Fields, David Howells


> On Nov 13, 2017, at 11:29 AM, Joshua Watt <jpewhacker@gmail.com> wrote:
> 
> On Fri, 2017-11-10 at 17:45 -0500, Chuck Lever wrote:
>>> On Nov 10, 2017, at 5:37 PM, Joshua Watt <jpewhacker@gmail.com>
>>> wrote:
>>> 
>>> Specifying the serverfailed mount option causes all subsequent RPC
>>> tasks
>>> that are queued to fail with -EIO instead of timing out. For
>>> example, if
>>> a server has disappeared, the sequence:
>>> 
>>> mount -o remount,serverfailed
>>> umount -f
>>> 
>>> will ensure that all pending I/O requests are cancelled, and any
>>> new
>>> requests will also fail. In the event that the server returns, the
>>> flag
>>> can be removed with a remount:
>>> 
>>> mount -o remount,noserverfailed
>>> 
>>> Although bringing the server back may result in a loss of data
>> 
>> Hi Joshua, interesting work!
>> 
>> A couple of things I'm wondering:
>> 
>> 1. Does this also change the serverfailed setting on submounts
>> (ie mounts that the kernel did when traversing an NFSv4 referral or
>> when going from the server's pseudofs into a real fs). These need
>> to be unmounted first before the parent mount can be unmounted,
>> and in the latter case they would all be backed by the same stuck
>> NFS server.
> 
> I don't honestly know, but I will find out. Our use case doens't deal
> with either of those cases much.
> 
>> 
>> 2. If there is a hanging sync(2), does the umount -f unstick it?
>> That seems relevant for Neil's "make shutdown reliable" use case.
>> I would like a stuck NFS mount not to result in local file system
>> corruption, if possible, during a hard shutdown.
>> 
> 
> Several previous posts have aluded to calling "umount -f" repeatedly to
> get a "stuck" NFS mount to actually unmount. This patch set is
> effectively a way of automating that process. More formally, the
> sequence of commands:
> 
> mount PATH -o remount,serverfailed
> umount -f PATH
> 
> Is closely approximated by:
> 
> while(1)
>     umount2(PATH, MNT_FORCE);
> 
> from userspace.
> 
> In retrospect, I think that the "umount -f" shouldn't be required after
> remount: If the serverfailed mount option is specified it should also
> kill all pending RPCs. A umount -f is undesirable, because you may not
> actually want to (potentially) unmount the file system just to cancel
> the RPCs.
> 
> More directly to your question: I don't honestly know if this can
> "unstick" a hanging sync(2). If repeatedly calling umount -f will
> unstick it, then these patches will exhibit the same behavior. I will
> continue to do some research on this, but if anyone knows the answer
> they might be able to chime in.
> 
> Could you clarify on what you mean by a stuck NFS mount not resulting
> in local filesystem corruption? I'm not sure I undertsand well enough
> to follow that comment.

During my own development work, I often cause NFS mount failures
that leave an NFS mount deadlocked or otherwise stuck. It behaves
as if the server is not responding.

If I'm sloppy, I will sometimes be looking at kernel source code
trying to debug the problem and may modify one or more local files
before then rebooting the client to unstick the NFS mount.

In these cases I use "echo b > /proc/sysrq-trigger". Sometimes
I type "sync" first, and that will often hang, then I have to
pop the "reset" button. In some rare instances, after the client
restarts, there can be file system corruption on the client's
local file systems (say, empty files that used to have content),
or for example missing entries in /var/log/messages.

I could be wrong, but I attribute this to the sync(2) hanging on
the stuck NFS mount before it can flush dirty data in the local
file systems on the client.

I suspect this is not uncommon during typical NFS usage scenarios,
and it could be one cause of a stuck shutdown when an NFS mount
is stuck.

So what I'm wondering is if you remount "soft,timeo=1" and let
the NFS mount unstick itself, is that enough to allow an already
running sync(2) to complete? It probably is.


> Thanks for the feedback
> 
>> 
>>> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
>>> ---
>>> fs/nfs/inode.c                 |  6 ++++++
>>> fs/nfs/super.c                 | 30 +++++++++++++++++++++++++-----
>>> include/uapi/linux/nfs_mount.h |  1 +
>>> 3 files changed, 32 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>>> index 134d9f560240..55b7cd40508d 100644
>>> --- a/fs/nfs/inode.c
>>> +++ b/fs/nfs/inode.c
>>> @@ -723,6 +723,12 @@ static void
>>> nfs_readdirplus_parent_cache_hit(struct dentry *dentry)
>>> 
>>> static bool nfs_need_revalidate_inode(struct inode *inode)
>>> {
>>> +	/* If the server has failed, it is not going to respond,
>>> so don't try
>>> +	 * and revalidated (otherwise, the serverfailed flag can't
>>> be cleared by
>>> +	 * a remount)
>>> +	 */
>>> +	if (NFS_SERVER(inode)->flags & NFS_MOUNT_SERVERFAILED)
>>> +		return false;
>>> 	if (NFS_I(inode)->cache_validity &
>>> 			(NFS_INO_INVALID_ATTR|NFS_INO_INVALID_LABEL))
>>> 		return true;
>>> diff --git a/fs/nfs/super.c b/fs/nfs/super.c
>>> index 777a0cc22704..bca38e1cdd85 100644
>>> --- a/fs/nfs/super.c
>>> +++ b/fs/nfs/super.c
>>> @@ -90,6 +90,7 @@ enum {
>>> 	Opt_resvport, Opt_noresvport,
>>> 	Opt_fscache, Opt_nofscache,
>>> 	Opt_migration, Opt_nomigration,
>>> +	Opt_serverfailed, Opt_noserverfailed,
>>> 
>>> 	/* Mount options that take integer arguments */
>>> 	Opt_port,
>>> @@ -151,6 +152,8 @@ static const match_table_t
>>> nfs_mount_option_tokens = {
>>> 	{ Opt_nofscache, "nofsc" },
>>> 	{ Opt_migration, "migration" },
>>> 	{ Opt_nomigration, "nomigration" },
>>> +	{ Opt_serverfailed, "serverfailed" },
>>> +	{ Opt_noserverfailed, "noserverfailed" },
>>> 
>>> 	{ Opt_port, "port=%s" },
>>> 	{ Opt_rsize, "rsize=%s" },
>>> @@ -637,6 +640,7 @@ static void nfs_show_mount_options(struct
>>> seq_file *m, struct nfs_server *nfss,
>>> 		{ NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" },
>>> 		{ NFS_MOUNT_UNSHARED, ",nosharecache", "" },
>>> 		{ NFS_MOUNT_NORESVPORT, ",noresvport", "" },
>>> +		{ NFS_MOUNT_SERVERFAILED, ",serverfailed", "" },
>>> 		{ 0, NULL, NULL }
>>> 	};
>>> 	const struct proc_nfs_info *nfs_infop;
>>> @@ -1330,6 +1334,11 @@ static int nfs_parse_mount_options(char
>>> *raw,
>>> 		case Opt_nomigration:
>>> 			mnt->options &= ~NFS_OPTION_MIGRATION;
>>> 			break;
>>> +		case Opt_serverfailed:
>>> +		case Opt_noserverfailed:
>>> +			set_flag(mnt, NFS_MOUNT_SERVERFAILED,
>>> +				 token == Opt_serverfailed);
>>> +			break;
>>> 
>>> 		/*
>>> 		 * options that take numeric values
>>> @@ -2192,6 +2201,8 @@ static int nfs_validate_text_mount_data(void
>>> *options,
>>> 	return -EINVAL;
>>> }
>>> 
>>> +#define NFS_REMOUNT_CHANGE_FLAGS (NFS_MOUNT_SERVERFAILED)
>>> +
>>> #define NFS_REMOUNT_CMP_FLAGMASK ~(NFS_MOUNT_INTR \
>>> 		| NFS_MOUNT_SECURE \
>>> 		| NFS_MOUNT_TCP \
>>> @@ -2200,7 +2211,8 @@ static int nfs_validate_text_mount_data(void
>>> *options,
>>> 		| NFS_MOUNT_NONLM \
>>> 		| NFS_MOUNT_BROKEN_SUID \
>>> 		| NFS_MOUNT_STRICTLOCK \
>>> -		| NFS_MOUNT_LEGACY_INTERFACE)
>>> +		| NFS_MOUNT_LEGACY_INTERFACE \
>>> +		| NFS_REMOUNT_CHANGE_FLAGS)
>>> 
>>> #define NFS_MOUNT_CMP_FLAGMASK (NFS_REMOUNT_CMP_FLAGMASK & \
>>> 		~(NFS_MOUNT_UNSHARED | NFS_MOUNT_NORESVPORT))
>>> @@ -2209,12 +2221,15 @@ static int
>>> nfs_compare_and_set_remount_data(struct nfs_server *nfss,
>>> 				 struct nfs_parsed_mount_data *data)
>>> {
>>> +	int changed_flags_mask = data->flags_mask &
>>> NFS_REMOUNT_CHANGE_FLAGS;
>>> +	struct rpc_clnt *cl = nfss->client;
>>> +
>>> 	if ((data->flags ^ nfss->flags) & NFS_REMOUNT_CMP_FLAGMASK ||
>>> 	    data->rsize != nfss->rsize ||
>>> 	    data->wsize != nfss->wsize ||
>>> 	    data->version != nfss->nfs_client->rpc_ops->version ||
>>> 	    data->minorversion != nfss->nfs_client->cl_minorversion ||
>>> -	    !nfs_auth_info_match(&data->auth_info, nfss->client-
>>>> cl_auth->au_flavor) ||
>>> +	    !nfs_auth_info_match(&data->auth_info, cl->cl_auth-
>>>> au_flavor) ||
>>> 	    data->acregmin != nfss->acregmin / HZ ||
>>> 	    data->acregmax != nfss->acregmax / HZ ||
>>> 	    data->acdirmin != nfss->acdirmin / HZ ||
>>> @@ -2225,13 +2240,16 @@ nfs_compare_and_set_remount_data(struct
>>> nfs_server *nfss,
>>> 			  (struct sockaddr *)&nfss->nfs_client-
>>>> cl_addr))
>>> 		return -EINVAL;
>>> 
>>> -	if (data->retrans != nfss->client->cl_timeout->to_retries
>>> ||
>>> -	    data->timeo != (10U * nfss->client->cl_timeout-
>>>> to_initval / HZ)) {
>>> +	/* Update flags */
>>> +	nfss->flags = (nfss->flags & ~changed_flags_mask) |
>>> +		(data->flags & changed_flags_mask);
>>> +
>>> +	if (data->retrans != cl->cl_timeout->to_retries ||
>>> +	    data->timeo != (10U * cl->cl_timeout->to_initval /
>>> HZ)) {
>>> 		/* Note that this will affect all mounts from the same
>>> server,
>>> 		 * that use the same protocol.  The timeouts are always
>>> forced
>>> 		 * to be the same.
>>> 		 */
>>> -		struct rpc_clnt *cl = nfss->client;
>>> 		if (cl->cl_timeout != &cl->cl_timeout_default)
>>> 			memcpy(&cl->cl_timeout_default, cl->cl_timeout,
>>> 			       sizeof(struct rpc_timeout));
>>> @@ -2239,6 +2257,8 @@ nfs_compare_and_set_remount_data(struct
>>> nfs_server *nfss,
>>> 		cl->cl_timeout_default.to_initval = data->timeo * HZ /
>>> 10U;
>>> 	}
>>> 
>>> +	cl->cl_kill_new_tasks = !!(nfss->flags &
>>> NFS_MOUNT_SERVERFAILED);
>>> +
>>> 	return 0;
>>> }
>>> 
>>> diff --git a/include/uapi/linux/nfs_mount.h
>>> b/include/uapi/linux/nfs_mount.h
>>> index e44e00616ab5..8ad931cdca20 100644
>>> --- a/include/uapi/linux/nfs_mount.h
>>> +++ b/include/uapi/linux/nfs_mount.h
>>> @@ -74,5 +74,6 @@ struct nfs_mount_data {
>>> 
>>> #define NFS_MOUNT_LOCAL_FLOCK	0x100000
>>> #define NFS_MOUNT_LOCAL_FCNTL	0x200000
>>> +#define NFS_MOUNT_SERVERFAILED	0x400000
>>> 
>>> #endif
>>> -- 
>>> 2.13.6
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-
>>> nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> --
>> Chuck Lever

--
Chuck Lever




^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-11-13 17:24 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-10 22:37 [RFC v2 0/7] NFS Force Unmounting Joshua Watt
2017-11-10 22:37 ` [RFC v2 1/7] SUNRPC: Add flag to kill new tasks Joshua Watt
2017-11-10 22:37 ` [RFC v2 2/7] SUNRPC: Expose kill_new_tasks in debugfs Joshua Watt
2017-11-10 22:37 ` [RFC v2 3/7] SUNRPC: Simplify client shutdown Joshua Watt
2017-11-10 22:37 ` [RFC v2 4/7] NFS: Add mount flags mask Joshua Watt
2017-11-10 22:37 ` [RFC v2 5/7] NFS: Add serverfailed mount option Joshua Watt
2017-11-10 22:45   ` Chuck Lever
2017-11-13 16:29     ` Joshua Watt
2017-11-13 17:23       ` Chuck Lever
2017-11-10 22:37 ` [RFC v2 6/7] NFS: Propagate NFS_MOUNT_UNSHARED to clients Joshua Watt
2017-11-10 22:37 ` [RFC v2 7/7] NFS: Propagate operations to unshared clients Joshua Watt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.