From: Alan Jenkins <alan.christopher.jenkins@gmail.com>
To: David Howells <dhowells@redhat.com>, viro@zeniv.linux.org.uk
Cc: linux-api@vger.kernel.org, torvalds@linux-foundation.org,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/33] vfs: syscall: Add open_tree(2) to reference or clone a mount [ver #11]
Date: Thu, 2 Aug 2018 18:31:06 +0100 [thread overview]
Message-ID: <7a292a44-7e17-572b-2d1e-0085ef400010@gmail.com> (raw)
In-Reply-To: <153313705165.13253.4602180607294286849.stgit@warthog.procyon.org.uk>
Hi
I found this interesting, though I don't entirely follow the kernel
mount/unmount code. I had one puzzle about the code, and two questions
which I was largely able to answer.
On 01/08/18 16:24, David Howells wrote:
> +void dissolve_on_fput(struct vfsmount *mnt)
> +{
> + namespace_lock();
> + lock_mount_hash();
> + mntget(mnt);
> + umount_tree(real_mount(mnt), UMOUNT_SYNC);
> + unlock_mount_hash();
> + namespace_unlock();
> +}
Can I ask why UMOUNT_SYNC is used here? I feel like I must have missed
something, but doesn't it skip the IS_MNT_LOCKED() check in
disconnect_mount() ?
UMOUNT_SYNC seems used for non-lazy unmounts, and in internal cleanups
where userspace wouldn't be able to see. But I think userspace can keep
watching in this case, e.g. by `fd2 = openat(fd, ".", O_PATH)` (or `fd2
= open_tree(fd, ".", 0)` ?). I would think this function should avoid
using UMOUNT_SYNC, like lazy unmount avoids it.
> From: Al Viro <viro@zeniv.linux.org.uk>
>
> open_tree(dfd, pathname, flags)
>
> Returns an O_PATH-opened file descriptor or an error.
> dfd and pathname specify the location to open, in usual
> fashion (see e.g. fstatat(2)). flags should be an OR of
> some of the following:
> * AT_PATH_EMPTY, AT_NO_AUTOMOUNT, AT_SYMLINK_NOFOLLOW -
> same meanings as usual
> * OPEN_TREE_CLOEXEC - make the resulting descriptor
> close-on-exec
> * OPEN_TREE_CLONE or OPEN_TREE_CLONE | AT_RECURSIVE -
> instead of opening the location in question, create a detached
> mount tree matching the subtree rooted at location specified by
> dfd/pathname. With AT_RECURSIVE the entire subtree is cloned,
> without it - only the part within in the mount containing the
> location in question. In other words, the same as mount --rbind
> or mount --bind would've taken.
One of the limitations documented for `mount --bind`, is that `mount -o
bind,ro` is not atomic. There's a workaround if you need it, it's just
a bit clunky. I wondered if it was possible to improve `mount` by
changing the mount flags between OPEN_TREE_CLONE and move_mount().
fd = open_tree(..., OPEN_TREE_CLONE);
fchdir(fd);
mount(NULL, ".", NULL, MS_REMOUNT | MS_BIND | newbindflags, NULL);
move_mount(fd, ...);
Another closely-related limitation of `mount`, is that it can't
atomically set the propagation type at mount time.
My conclusion was the above doesn't quite work yet. do_remount() still
uses check_mnt(), so it doesn't accept detached mounts. I wonder if it
can be changed in future.
> The detached tree will be
> dissolved on the final close of obtained file.
My last question turned out very dull, feel free to ignore.
It seems the only way to use MNT_FORCE[1], is to first attach the
filesystem somewhere (and close the file descriptor). I wondered if
there was a way to make things more regular. close_and_umount() feels
too obscure to live though...
[1] "Ask the filesystem to abort pending requests before attempting
theunmount. This may allow the unmount to complete without waitingfor an
inaccessible server. If, after aborting requests, someprocesses still
have active references to the filesystem, theunmount will still fail."
...and I suppose it's much less useful than I thought. The point of
MNT_FORCE is to kick out processes that were blocked _trying to access a
file by name_, e.g. open() or stat(). But if we're considering a
detached mount, then it's impossible to access it by name alone. You
need an fd (or cwd or root), which would stop the filesystem being
unmounted anyway. close_and_umount(fd, MNT_FORCE) is pointless unless
your process has other threads accessing the filesystem through the same
fd, but that's a really bad idea anyway.
It could prevent someone else getting stuck indefinitely on
/proc/$PID/fd, but that's still very obscure.
Regards
Alan
next prev parent reply other threads:[~2018-08-02 19:23 UTC|newest]
Thread overview: 116+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-01 15:23 [PATCH 00/33] VFS: Introduce filesystem context [ver #11] David Howells
2018-08-01 15:24 ` [PATCH 01/33] vfs: syscall: Add open_tree(2) to reference or clone a mount " David Howells
2018-08-02 17:31 ` Alan Jenkins [this message]
2018-08-02 21:29 ` Al Viro
2018-08-02 21:51 ` David Howells
2018-08-02 23:46 ` Alan Jenkins
2018-08-01 15:24 ` [PATCH 02/33] vfs: syscall: Add move_mount(2) to move mounts around " David Howells
2018-08-01 15:24 ` [PATCH 03/33] teach move_mount(2) to work with OPEN_TREE_CLONE " David Howells
2018-10-12 14:25 ` Alan Jenkins
2018-08-01 15:24 ` [PATCH 04/33] vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled " David Howells
2018-08-01 15:24 ` [PATCH 05/33] vfs: Introduce the basic header for the new mount API's filesystem context " David Howells
2018-08-01 15:24 ` [PATCH 06/33] vfs: Introduce logging functions " David Howells
2018-08-01 15:24 ` [PATCH 07/33] vfs: Add configuration parser helpers " David Howells
2018-08-01 15:24 ` [PATCH 08/33] vfs: Add LSM hooks for the new mount API " David Howells
2018-08-01 20:50 ` James Morris
2018-08-01 22:53 ` David Howells
2018-08-01 15:25 ` [PATCH 09/33] selinux: Implement the new mount API LSM hooks " David Howells
2018-08-01 15:25 ` [PATCH 10/33] smack: Implement filesystem context security " David Howells
2018-08-01 15:25 ` [PATCH 11/33] apparmor: Implement security hooks for the new mount API " David Howells
2018-08-01 15:25 ` [PATCH 12/33] tomoyo: " David Howells
2018-08-01 15:25 ` [PATCH 13/33] vfs: Separate changing mount flags full remount " David Howells
2018-08-01 15:25 ` [PATCH 14/33] vfs: Implement a filesystem superblock creation/configuration context " David Howells
2018-09-11 17:46 ` Guenter Roeck
2018-09-11 21:52 ` David Howells
2018-09-11 22:07 ` Guenter Roeck
2018-09-11 23:17 ` David Howells
2018-09-11 23:54 ` Guenter Roeck
2018-09-18 9:07 ` Sergey Senozhatsky
2018-09-18 9:40 ` Sergey Senozhatsky
2018-09-18 14:06 ` Guenter Roeck
2018-09-19 1:12 ` Sergey Senozhatsky
2018-09-19 1:26 ` Sergey Senozhatsky
2018-09-18 15:34 ` David Howells
2018-09-18 16:39 ` David Howells
2018-09-19 1:15 ` Sergey Senozhatsky
2018-09-18 17:43 ` David Howells
2018-09-18 9:54 ` Sergey Senozhatsky
2018-09-18 15:28 ` David Howells
2018-08-01 15:25 ` [PATCH 15/33] vfs: Remove unused code after filesystem context changes " David Howells
2018-08-01 15:25 ` [PATCH 16/33] procfs: Move proc_fill_super() to fs/proc/root.c " David Howells
2018-08-01 15:26 ` [PATCH 17/33] proc: Add fs_context support to procfs " David Howells
2018-08-01 15:26 ` [PATCH 18/33] ipc: Convert mqueue fs to fs_context " David Howells
2018-08-01 15:26 ` [PATCH 19/33] cpuset: Use " David Howells
2018-08-01 15:26 ` [PATCH 20/33] kernfs, sysfs, cgroup, intel_rdt: Support " David Howells
2018-08-01 15:26 ` [PATCH 21/33] hugetlbfs: Convert to " David Howells
2018-08-01 15:26 ` [PATCH 22/33] vfs: Remove kern_mount_data() " David Howells
2018-08-01 15:26 ` [PATCH 23/33] vfs: Provide documentation for new mount API " David Howells
2018-08-01 15:26 ` [PATCH 24/33] Make anon_inodes unconditional " David Howells
2018-08-01 15:26 ` [PATCH 25/33] vfs: syscall: Add fsopen() to prepare for superblock creation " David Howells
2018-08-01 15:27 ` [PATCH 26/33] vfs: Implement logging through fs_context " David Howells
2018-08-01 15:27 ` [PATCH 27/33] vfs: Add some logging to the core users of the fs_context log " David Howells
2018-08-01 15:27 ` [PATCH 28/33] vfs: syscall: Add fsconfig() for configuring and managing a context " David Howells
2018-08-06 17:28 ` Eric W. Biederman
2018-08-09 14:14 ` David Howells
2018-08-09 14:24 ` David Howells
2018-08-09 14:35 ` Miklos Szeredi
2018-08-09 15:32 ` Eric W. Biederman
2018-08-09 16:33 ` David Howells
2018-08-11 20:20 ` David Howells
2018-08-11 23:26 ` Andy Lutomirski
2018-08-01 15:27 ` [PATCH 29/33] vfs: syscall: Add fsmount() to create a mount for a superblock " David Howells
2018-08-01 15:27 ` [PATCH 30/33] vfs: syscall: Add fspick() to select a superblock for reconfiguration " David Howells
2018-08-24 14:51 ` Miklos Szeredi
2018-08-24 14:54 ` Andy Lutomirski
2018-08-01 15:27 ` [PATCH 31/33] afs: Add fs_context support " David Howells
2018-08-01 15:27 ` [PATCH 32/33] afs: Use fs_context to pass parameters over automount " David Howells
2018-08-01 15:27 ` [PATCH 33/33] vfs: Add a sample program for the new mount API " David Howells
2018-08-10 14:05 ` BUG: Mount ignores mount options Eric W. Biederman
2018-08-10 14:36 ` Andy Lutomirski
2018-08-10 15:17 ` Eric W. Biederman
2018-08-10 15:24 ` Al Viro
2018-08-10 15:11 ` Tetsuo Handa
2018-08-10 15:13 ` David Howells
2018-08-10 15:16 ` Al Viro
2018-08-11 1:05 ` Eric W. Biederman
2018-08-11 1:46 ` Theodore Y. Ts'o
2018-08-11 4:48 ` Eric W. Biederman
2018-08-11 17:47 ` Casey Schaufler
2018-08-15 4:03 ` Eric W. Biederman
2018-08-11 1:58 ` Al Viro
2018-08-11 2:17 ` Al Viro
2018-08-11 4:43 ` Eric W. Biederman
2018-08-13 12:54 ` Miklos Szeredi
2018-08-10 15:11 ` David Howells
2018-08-10 15:39 ` Theodore Y. Ts'o
2018-08-10 15:55 ` Casey Schaufler
2018-08-10 16:11 ` David Howells
2018-08-10 18:00 ` Eric W. Biederman
2018-08-10 15:53 ` David Howells
2018-08-10 16:14 ` Theodore Y. Ts'o
2018-08-10 20:06 ` Andy Lutomirski
2018-08-10 20:46 ` Theodore Y. Ts'o
2018-08-10 22:12 ` Darrick J. Wong
2018-08-10 23:54 ` Theodore Y. Ts'o
2018-08-11 0:38 ` Darrick J. Wong
2018-08-11 1:32 ` Eric W. Biederman
2018-08-13 16:35 ` Alan Cox
2018-08-13 16:48 ` Andy Lutomirski
2018-08-13 17:29 ` Al Viro
2018-08-13 19:00 ` James Morris
2018-08-13 19:20 ` Casey Schaufler
2018-08-15 23:29 ` Serge E. Hallyn
2018-08-11 0:28 ` Eric W. Biederman
2018-08-11 1:19 ` Eric W. Biederman
2018-08-11 7:29 ` David Howells
2018-08-11 16:31 ` Andy Lutomirski
2018-08-11 16:51 ` Al Viro
2018-08-15 16:31 ` Should we split the network filesystem setup into two phases? David Howells
2018-08-15 16:51 ` Andy Lutomirski
2018-08-16 3:51 ` Steve French
2018-08-16 5:06 ` Eric W. Biederman
2018-08-16 16:24 ` Steve French
2018-08-16 17:21 ` Eric W. Biederman
2018-08-16 17:23 ` Aurélien Aptel
2018-08-16 18:36 ` Steve French
2018-08-17 23:11 ` Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7a292a44-7e17-572b-2d1e-0085ef400010@gmail.com \
--to=alan.christopher.jenkins@gmail.com \
--cc=dhowells@redhat.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).