linux-kernel.vger.kernel.org archive mirror
From: ebiederm@xmission.com (Eric W. Biederman)
To: David Howells <dhowells@redhat.com>
Cc: trond.myklebust@hammerspace.com, anna.schumaker@netapp.com,
	sfrench@samba.org, steved@redhat.com, viro@zeniv.linux.org.uk,
	torvalds@linux-foundation.org,
	"Eric W. Biederman" <ebiederm@redhat.com>,
	linux-api@vger.kernel.org, linux-security-module@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org,
	linux-afs@lists.infradead.org, ceph-devel@vger.kernel.org,
	v9fs-developer@lists.sourceforge.net
Subject: Re: Should we split the network filesystem setup into two phases?
Date: Thu, 16 Aug 2018 00:06:06 -0500
Message-ID: <87pnyiew8x.fsf@xmission.com>
In-Reply-To: <17763.1534350685@warthog.procyon.org.uk> (David Howells's message of "Wed, 15 Aug 2018 17:31:25 +0100")

David Howells <dhowells@redhat.com> writes:

> Having just re-ported NFS on top of the new mount API stuff, I find that I
> don't really like the idea of superblocks being separated by communication
> parameters - especially when it might seem reasonable to be able to adjust
> those parameters.
>
> Does it make sense to abstract out the remote peer and allow (a) that to be
> configured separately from any superblocks using it and (b) that to be used to
> create superblocks?
>
> Note that what a 'remote peer' is would be different for different
> filesystems:
>
>  (*) For NFS, it would probably be a named server, with address(es) attached
>      to the name.  In lieu of actually having a name, the initial IP address
>      could be used.
>
>  (*) For CIFS, it would probably be a named server.  I'm not sure if CIFS
>      allows an abstraction for a share that can move about inside a domain.
>
>  (*) For AFS, it would be a cell, I think, where the actual fileserver(s) used
>      are a matter of direction from the Volume Location server.
>
>  (*) For 9P and Ceph, I don't really know.
>
> What could be configured?  Well, addresses, ports, timeouts.  Maybe protocol
> level negotiation - though not being able to explicitly specify, say, the
> particular version and minorversion on an NFS share would be problematic for
> backward compatibility.
>
> One advantage it could give us is that, if someone asks for server X, it might
> be easier to query userspace in some way for what the default parameters for X
> are.
>
> What might this look like in terms of userspace?  Well, we could overload the
> new mount API:
>
> 	peer1 = fsopen("nfs", FSOPEN_CREATE_PEER);
> 	fsconfig(peer1, FSCONFIG_SET_NS, "net", NULL, netns_fd);
> 	fsconfig(peer1, FSCONFIG_SET_STRING, "peer_name", "server.home", 0);
> 	fsconfig(peer1, FSCONFIG_SET_STRING, "vers", "4.2", 0);
> 	fsconfig(peer1, FSCONFIG_SET_STRING, "address", "tcp:192.168.1.1", 0);
> 	fsconfig(peer1, FSCONFIG_SET_STRING, "address", "tcp:192.168.1.2", 0);
> 	fsconfig(peer1, FSCONFIG_SET_STRING, "timeo", "122", 0);
> 	fsconfig(peer1, FSCONFIG_CMD_SET_UP_PEER, NULL, NULL, 0);
>
> 	peer2 = fsopen("nfs", FSOPEN_CREATE_PEER);
> 	fsconfig(peer2, FSCONFIG_SET_NS, "net", NULL, netns_fd);
> 	fsconfig(peer2, FSCONFIG_SET_STRING, "peer_name", "server2.home", 0);
> 	fsconfig(peer2, FSCONFIG_SET_STRING, "vers", "3", 0);
> 	fsconfig(peer2, FSCONFIG_SET_STRING, "address", "tcp:192.168.1.3", 0);
> 	fsconfig(peer2, FSCONFIG_SET_STRING, "address", "udp:192.168.1.4+6001", 0);
> 	fsconfig(peer2, FSCONFIG_CMD_SET_UP_PEER, NULL, NULL, 0);
>
> 	fs = fsopen("nfs", 0);
> 	fsconfig(fs, FSCONFIG_SET_PEER, "peer.1", NULL, peer1);
> 	fsconfig(fs, FSCONFIG_SET_PEER, "peer.2", NULL, peer2);
> 	fsconfig(fs, FSCONFIG_SET_STRING, "source", "/home/dhowells", 0);
> 	m = fsmount(fs, 0, 0);
>
> [Note that Eric's oft-repeated point about the 'creation' operation altering
>  established parameters still stands here.]
>
> You could also then reopen it for configuration, maybe by:
>
> 	peer = fspick(AT_FDCWD, "/mnt", FSPICK_PEER);
>
> or:
>
> 	peer = fspick(AT_FDCWD, "nfs:server.home", FSPICK_PEER_BY_NAME);
>
> though it might be better to give it its own syscall:
>
> 	peer = fspeer("nfs", "server.home", O_CLOEXEC);
> 	fsconfig(peer, FSCONFIG_SET_NS, "net", NULL, netns_fd);
> 	...
> 	fsconfig(peer, FSCONFIG_CMD_SET_UP_PEER, NULL, NULL, 0);
>
> In terms of alternative interfaces, I'm not sure how easy it would be to make
> it like cgroups where you go and create a dir in a special filesystem, say,
> "/sys/peers/nfs", because the peers records and names would have to be network
> namespaced.  Also, it might make it more difficult to use to create a root fs.
>
> On the other hand, being able to adjust the peer configuration by:
>
> 	echo 71 >/sys/peers/nfs/server.home/timeo
>
> does have a certain appeal.
>
> Also, netlink might be the right option, but I'm not sure how you'd pin the
> resultant object whilst you make use of it.
>
> A further thought: is it worth making this idea more general and
> encompassing non-network devices also?  This would run into issues of some
> logical sources being visible in some namespaces but not others.

Even network filesystems are going to face the challenge of filesystems
being visible in some network namespaces and not others, as some
filesystems will be visible on the internet and some will only be
visible on the appropriate local network.  Network namespaces
are sometimes used to deal with local networks that have
overlapping IP addresses.

I think you are proposing a model for network filesystems that is
essentially the same situation we are in with most block-device
filesystems today, where some parameters identify the local
filesystem instance and some parameters identify how the kernel
interacts with that filesystem instance.
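
As a rough illustration of that split, here is a minimal userspace sketch.
Every name in it (fs_identity, fs_tuning, same_instance, the netns_id
stand-in) is hypothetical and is not kernel API; it only shows that two
requests can name the same instance while differing in how the kernel is
asked to talk to it, and that the same address in two network namespaces
is not the same instance.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical names throughout; none of this is kernel API. */

/* Parameters that say *which* filesystem instance is meant. */
struct fs_identity {
	unsigned int netns_id;   /* stand-in for a network namespace */
	const char *server;      /* e.g. "192.168.1.1" */
	const char *export_path; /* e.g. "/home" */
};

/* Parameters that say *how* the kernel interacts with that instance. */
struct fs_tuning {
	unsigned int timeo;
	unsigned int retrans;
};

struct fs_params {
	struct fs_identity id;
	struct fs_tuning tune;
};

/* Two requests name the same instance when the identity parts match;
 * the tuning parts are deliberately ignored. */
static bool same_instance(const struct fs_params *a, const struct fs_params *b)
{
	assert(a && b);
	return a->id.netns_id == b->id.netns_id &&
	       strcmp(a->id.server, b->id.server) == 0 &&
	       strcmp(a->id.export_path, b->id.export_path) == 0;
}
```

With this shape, a request that changes only timeo compares equal to an
existing instance, while the same server address seen from a different
network namespace does not.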


For system efficiency there is a strong argument for having the fewest
number of filesystem instances we can.  Otherwise we will be caching the
same data twice and wasting space in RAM etc.


So I like the idea.


At least for devpts we always create a new filesystem instance every
time mount(2) is called.  NFS seems to have the option to create a new
filesystem instance every time mount(2) is called as well (even if the
filesystem parameters are the same).  And depending on the case I can
see the attraction for other filesystems as well.

So I don't think we can completely abandon the option for filesystems
to always create a new filesystem instance when mount(8) is called.
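
The two behaviours can be sketched side by side as a lookup-or-create
helper.  This is a userspace toy under stated assumptions, not how the VFS
does it: the fixed-size table, the flattened string key, and the names
get_instance and force_new are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_INSTANCES 8

/* Hypothetical stand-in for a filesystem instance. */
struct instance {
	char key[64];      /* identity parameters flattened to one string */
	bool private_inst; /* created with force_new; never shared */
	int refcount;
};

static struct instance table[MAX_INSTANCES];
static size_t nr_instances;

/* Reuse a shareable instance with a matching key unless the caller asks
 * for a fresh one.  Returns NULL when the table is full. */
static struct instance *get_instance(const char *key, bool force_new)
{
	assert(key && strlen(key) < sizeof(table[0].key));

	if (!force_new) {
		for (size_t i = 0; i < nr_instances; i++) {
			if (!table[i].private_inst &&
			    strcmp(table[i].key, key) == 0) {
				table[i].refcount++;
				return &table[i];
			}
		}
	}

	if (nr_instances == MAX_INSTANCES)
		return NULL;

	struct instance *inst = &table[nr_instances++];
	strcpy(inst->key, key);
	inst->private_inst = force_new;
	inst->refcount = 1;
	return inst;
}
```

Calling get_instance() twice with the same key and force_new false yields
one shared instance; passing force_new true yields a separate instance
even though the key is identical, which is the devpts-style behaviour.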



I most definitely support thinking this through and figuring out how it
best makes sense for the new filesystem API to create new filesystem
instances or fail to create new filesystem instances.


Eric

