* NFS Force Unmounting
@ 2017-10-25 17:11 Joshua Watt
  2017-10-30 20:20 ` J. Bruce Fields
  0 siblings, 1 reply; 36+ messages in thread
From: Joshua Watt @ 2017-10-25 17:11 UTC
  To: linux-nfs

Hello,

I'm working on a networked embedded system where NFS servers can come
and go from the network, and I've discovered that the kernel NFS client
makes it difficult to clean up applications in a timely manner when the
server disappears (and yes, I am mounting with "soft" and relatively
short timeouts). I currently have a user space mechanism that can
quickly detect when the server disappears, and does a umount() with the
MNT_FORCE and MNT_DETACH flags. Using MNT_DETACH prevents new accesses
to files on the defunct remote server, and I have traced through the
code to see that MNT_FORCE does indeed cancel any current RPC tasks
with -EIO. However, this isn't sufficient for my use case because if a
user space application isn't currently waiting on an RPC task that gets
canceled, it will have to time out again before it detects the
disconnect. For example, if a simple client is copying a file from the
NFS server, and happens to not be waiting on the RPC task in the read()
call when umount() occurs, it will be none the wiser and loop around to
call read() again, which must then try the whole NFS timeout + recovery
before the failure is detected. If a client is more complex and has a
lot of open file descriptors, it will typically have to wait for each
one to time out, leading to very long delays.
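
For reference, the user space side described above can be sketched roughly as follows. This is only a minimal illustration, not the actual code from my system: force_detach() is a hypothetical helper name, and it assumes the glibc umount2() wrapper, which is the variant of the call that accepts flags.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: force-detach a mount whose server has vanished.
 * MNT_FORCE aborts RPC tasks that are in flight; MNT_DETACH lazily
 * detaches the mount so that no new path lookups reach it.
 * Returns 0 on success or a negative errno value on failure.
 */
static int force_detach(const char *mountpoint)
{
    if (umount2(mountpoint, MNT_FORCE | MNT_DETACH) != 0) {
        int err = errno;

        fprintf(stderr, "umount2(%s): %s\n", mountpoint, strerror(err));
        return -err;
    }
    return 0;
}
```

The call needs CAP_SYS_ADMIN, and it only affects tasks that are blocked in an RPC at that moment, which is exactly the limitation described above.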

The (naive?) solution seems to be to add some flag in either the NFS
client or the RPC client that gets set in nfs_umount_begin(). This
would cause all subsequent operations to fail with an error code
instead of being queued as an RPC task and then timing out. In our
example client, the application would then get the -EIO
immediately on the next (and all subsequent) read() calls.
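
To make the idea concrete, here is a small user space model of the proposed behavior. It is not kernel code and not a patch: struct mock_client, mock_umount_begin() and mock_read() are hypothetical names invented for this sketch, standing in for the client state, nfs_umount_begin(), and a read that would normally be queued as an RPC task.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for per-mount client state; not a kernel struct. */
struct mock_client {
    atomic_bool shutdown;  /* set once a forced unmount has begun */
    size_t remaining;      /* bytes the pretend server can still return */
};

/* Models the proposed change to nfs_umount_begin(): mark the client dead. */
static void mock_umount_begin(struct mock_client *clnt)
{
    atomic_store(&clnt->shutdown, true);
}

/*
 * Models a read. Once the flag is set, fail immediately with -EIO
 * instead of queuing work that would have to time out first.
 */
static long mock_read(struct mock_client *clnt, char *buf, size_t len)
{
    size_t n;

    if (atomic_load(&clnt->shutdown))
        return -EIO;

    n = len < clnt->remaining ? len : clnt->remaining;
    memset(buf, 'x', n);
    clnt->remaining -= n;
    return (long)n;
}
```

A copy loop that calls mock_read() repeatedly gets data until mock_umount_begin() runs, and -EIO on every call after that, without waiting for a timeout on each file descriptor.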

There does seem to be some precedent for doing this (especially with
network file systems), as both cifs (CifsExiting) and ceph
(CEPH_MOUNT_SHUTDOWN) appear to implement this behavior (at least from
looking at the code; I haven't verified the runtime behavior).

Are there any pitfalls I'm overlooking?

Thanks,
Joshua Watt



Thread overview: 36+ messages
2017-10-25 17:11 NFS Force Unmounting Joshua Watt
2017-10-30 20:20 ` J. Bruce Fields
2017-10-30 21:04   ` Joshua Watt
2017-10-30 21:09   ` NeilBrown
2017-10-31 14:41     ` Jeff Layton
2017-10-31 14:55       ` Chuck Lever
2017-10-31 17:04         ` Joshua Watt
2017-10-31 19:46           ` Chuck Lever
2017-11-01  0:53       ` NeilBrown
2017-11-01  2:22         ` Chuck Lever
2017-11-01 14:38           ` Joshua Watt
2017-11-02  0:15           ` NeilBrown
2017-11-02 19:46             ` Chuck Lever
2017-11-02 21:51               ` NeilBrown
2017-11-01 17:24     ` Jeff Layton
2017-11-01 23:13       ` NeilBrown
2017-11-02 12:09         ` Jeff Layton
2017-11-02 14:54           ` Joshua Watt
2017-11-08  3:30             ` NeilBrown
2017-11-08 12:08               ` Jeff Layton
2017-11-08 15:52                 ` J. Bruce Fields
2017-11-08 22:34                   ` NeilBrown
2017-11-08 23:52                     ` Trond Myklebust
2017-11-09 19:48                       ` Joshua Watt
2017-11-10  0:16                         ` NeilBrown
2017-11-08 14:59             ` [RFC 0/4] " Joshua Watt
2017-11-08 14:59               ` [RFC 1/4] SUNRPC: Add flag to kill new tasks Joshua Watt
2017-11-10  1:39                 ` NeilBrown
2017-11-08 14:59               ` [RFC 2/4] SUNRPC: Kill client tasks from debugfs Joshua Watt
2017-11-10  1:47                 ` NeilBrown
2017-11-10 14:13                   ` Joshua Watt
2017-11-08 14:59               ` [RFC 3/4] SUNRPC: Simplify client shutdown Joshua Watt
2017-11-10  1:50                 ` NeilBrown
2017-11-08 14:59               ` [RFC 4/4] NFS: Add forcekill mount option Joshua Watt
2017-11-10  2:01                 ` NeilBrown
2017-11-10 14:16                   ` Joshua Watt
