From: NeilBrown <neilb-IBi9RG/b67k@public.gmane.org>
To: Joshua Watt <jpewhacker-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Trond Myklebust
<trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>,
"J . Bruce Fields"
<bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [RFC v4 0/9] NFS Force Unmounting
Date: Fri, 15 Dec 2017 08:52:39 +1100 [thread overview]
Message-ID: <876099nfiw.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <1513275773.3888.20.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 5786 bytes --]
On Thu, Dec 14 2017, Joshua Watt wrote:
> On Fri, 2017-12-08 at 13:10 +1100, NeilBrown wrote:
>> On Wed, Dec 06 2017, Jeff Layton wrote:
>>
>> > On Wed, 2017-12-06 at 10:34 +1100, NeilBrown wrote:
>> > >
>> > > The new semantic for MNT_DETACH|MNT_FORCE is interesting.
>> > > As it was never possible before (from /bin/umount), it should be
>> > > safe to
>> > > add a new meaning.
>> > > The meaning is effectively "detach the filesystem from the
>> > > namespace and
>> > > detach the transport from the filesystem", which sounds like it
>> > > is
>> > > useful.
>> > > It is worth highlighting this, and maybe even cc:ing
>> > > linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org ... done that.
>> > >
>> >
>> > I'm not thrilled with the new flag combo, personally. Given that
>> > we're
>> > introducing new behavior here, I think it wouldn't hurt to add a
>> > new
>> > UMOUNT_* flag for this (UMOUNT_NUKE_FROM_ORBIT?).
>>
>> Suppose we did... MNT_FORCE_PONIES. What would be the semantics of
>> this
>> flag? Once we had it, would anyone ever want to use MNT_FORCE again?
>>
>> MNT_FORCE is already fairly heavy handled. It abort an arbitrary
>> collections of RPC requests being sent for the given filesystem, no
>> matter where else that filesystem might be mounted.
>> Is it ever safe to use this flag unless you have good reason to
>> believe
>> that the server is not available and there is no point pretending any
>> more?
>> And if that is the case, why not use the new MNT_FORCE_PONIES which
>> is
>> at least predictable and reliable.
>>
>> We've talking a lot about the one NFS filesystem being mounted in
>> multiple containers. MNT_FORCE is already a problem for such mounts
>> as
>> one contains can kill requests generated from another
>> container. Maybe
>> MNT_FORCE needs to be restricted to "real" root.
>> Once we restrict it, do we need to keep it from being too harsh?
>>
>> What would be really nice is a timeout for umount, and for sync.
>> The timeout only starts when the filesystem stops making progress for
>> writeback. If it eventually does timeout, then the caller can fall
>> back
>> to MNT_DETACH if they are in a container, or MNT_FORCE if not.
>> (Maybe MNT_FORCE should map to MNT_DETACH in a container??? or maybe
>> not).
>>
>> There is a lot here that still isn't clear to me, but one this does
>> seem
>> to be becoming clear: MNT_FORCE as it stands is nearly useless and
>> it
>> would serve is well to find a semantic that it actually useful, and
>> impose that.
>
> Trying to keep the discussion going... does anyone else have thoughts
> on this?
It's a challenge, isn't it ... keeping people on-task to make forward
progress.
If only we could all meet in the canteen at 4pm every Friday and discuss
these things over drinks. I don't suppose any of the video conference
tools support timeshifting, so we can each do 4pm in our own time
zone....
I would like to arrange that nothing can block indefinitely on
->s_umount. This probably means that the various "flush data" calls
made under this lock need a timeout, or to be interruptible.
Then both umount and remount could be sure of getting ->s_umount
without undue delay.
Then I would like MNT_FORCE *not* to abort requests before trying to get
the lock, but instead to be passed down to ->kill_sb().
We probably cannot pass it explicitly, but could set a flag while
->s_umount is held.
This flag might be handled by generic_shutdown_super(), causing
it to purge any unwritten data, rather than call sync_filesystems().
This way, if the filesystem is mounted elsewhere, then the MNT_FORCE has
no effect. If it is a final mount, then it cleans up properly.
Your need to cause writes to start failing would be achieved by
performing a remount, either just setting "soft,retrans=0,timeo=1", or
by setting some special-purpose mount option.
In order for s_umount not to be held indefinite, the generic things that
need to be fixed include:
__writeback_inodes_wb() calls writeback_sb_inodes() under the lock.
This needs to be interruptible
Same for try_to_writeback_inodes_sb() -> __writeback_inodes_sb_nr()
sync_sync and do_sync_work call iterate_supers() with various handlers, and these need
to be interruptible.
and do_remount_sb needs to not block.
Finding a way to interrupt those writeback calls would be tricky,
especially as we need to trigger the interrupt without holding s_umount.
I really like the idea that an umount attempt would interrupt a sync().
Currently sync() can block indefinitely, which is occasionally
inconvenient.
If "umount" (or "umount -f" at least) on a filesystem would abort the
sync of that filesystem, take the lock and clean up more forcefully,
that would make for fairly clean shutdown processing.
1/ call sync() in a separate thread.
2/ wait until Dirty in /proc/meminfo stops changing
3/ umount -f every remaining filesystem. Even if the
umount fails, the sync will abort.
Part of this effort would require making sure that SIGKILL really kills
processes blocked on filesystem IO.
So:
1/ make sure all filesystem IO waits are TASK_KILLABLE
2/ find a way to interrupt any write-back wait when there is a pending
remount or umount. Possibly the write-back thing would need to
retry after the umount/remount, I'm not sure.
3/ Cause MNT_FORCE to set a superblock flag, and have
generic_shutdown_super() and/or ->kill_sb() interpret this flag to
be very forceful
4/ Possibly introduce new NFS mount option which causes all requests
to fail
5/ Teach NFS to support remount of this option, and of soft, retrans,
timeo.
How does that sound?
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
next prev parent reply other threads:[~2017-12-14 21:52 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20171117174552.18722-1-JPEWhacker@gmail.com>
[not found] ` <1512398194.7031.56.camel@gmail.com>
[not found] ` <1512398194.7031.56.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-12-05 23:34 ` [RFC v4 0/9] NFS Force Unmounting NeilBrown
[not found] ` <87indksq9q.fsf-wvvUuzkyo1HefUI2i7LXDhCRmIWqnp/j@public.gmane.org>
2017-12-06 13:03 ` Jeff Layton
[not found] ` <1512565420.4048.21.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-12-06 16:40 ` Joshua Watt
2017-12-08 2:10 ` NeilBrown
[not found] ` <87bmjaq89r.fsf-wvvUuzkyo1HefUI2i7LXDhCRmIWqnp/j@public.gmane.org>
2017-12-14 18:22 ` Joshua Watt
[not found] ` <1513275773.3888.20.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-12-14 21:52 ` NeilBrown [this message]
[not found] ` <876099nfiw.fsf-wvvUuzkyo1HefUI2i7LXDhCRmIWqnp/j@public.gmane.org>
2017-12-18 21:48 ` Joshua Watt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=876099nfiw.fsf@notabene.neil.brown.name \
--to=neilb-ibi9rg/b67k@public.gmane.org \
--cc=bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org \
--cc=dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=jpewhacker-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
--cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org \
--cc=viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).