All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
@ 2009-02-09 18:45 Trond Myklebust
  2009-02-09 18:45   ` Trond Myklebust
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-09 18:45 UTC (permalink / raw)
  To: linux-fsdevel, linux-nfs; +Cc: linux-kernel, viro

The following two patches attempt to improve NFSv4's ability to look up
the mount path on a remote server.

The first patch adds VFS support for walking the remote path, using a
temporary mount namespace to represent the server's namespace, so that
symlinks and referrals can be followed across remote filesystem and
server boundaries.
The second patch then uses this VFS helper in the NFSv4 mount code.

A later set of patches will clean out the legacy nfs4_path_walk, however
I'm hoping for some commentary from the VFS folks on whether or not this
approach is acceptable before I continue.

Cheers
  Trond

---

Trond Myklebust (2):
      NFSv4: Use vfs_path_lookup() instead of nfs4_path_walk()
      VFS: Add a VFS helper function vfs_remote_path_lookup()


 fs/namei.c                    |   75 ++++++++++++++++++++-
 fs/namespace.c                |   56 +++++++++++++--
 fs/nfs/super.c                |  149 +++++++++++++++++++++++++++++++++++------
 include/linux/mnt_namespace.h |    2 +
 include/linux/namei.h         |    2 +
 include/linux/nsproxy.h       |    1 
 kernel/nsproxy.c              |   11 +++
 7 files changed, 267 insertions(+), 29 deletions(-)

-- 
Signature

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [RFC PATCH 1/2] VFS: Add a VFS helper function vfs_remote_path_lookup()
@ 2009-02-09 18:45   ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-09 18:45 UTC (permalink / raw)
  To: linux-fsdevel, linux-nfs; +Cc: linux-kernel, viro

The purpose of this patch is to improve the mount path lookup support for
filesystems such as NFSv4, which require you to look up a mount path
string in a remote server's namespace.

Traversing such a path is pretty much identical to walking a local path,
in that it may involve following symlinks and even following referrals to
volumes that reside on other servers. Since the standard VFS path lookup
code already supports all these features (using in-kernel automounts for
following referrals) it would be nice to be able to reuse that code rather
than special case the mount path lookup in the NFS client.

This patch therefore defines a VFS helper function that sets up a temporary
mount namespace to represent the server namespace, and has the current
task pivot into that prior to doing the path lookup. Upon completion, it
pivots back into the original namespace, and destroys the private one.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/namei.c                    |   75 ++++++++++++++++++++++++++++++++++++++++-
 fs/namespace.c                |   56 ++++++++++++++++++++++++++-----
 include/linux/mnt_namespace.h |    2 +
 include/linux/namei.h         |    2 +
 include/linux/nsproxy.h       |    1 +
 kernel/nsproxy.c              |   11 ++++++
 6 files changed, 138 insertions(+), 9 deletions(-)


diff --git a/fs/namei.c b/fs/namei.c
index bbc15c2..9280299 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -25,7 +25,7 @@
 #include <linux/personality.h>
 #include <linux/security.h>
 #include <linux/syscalls.h>
-#include <linux/mount.h>
+#include <linux/mnt_namespace.h>
 #include <linux/audit.h>
 #include <linux/capability.h>
 #include <linux/file.h>
@@ -1119,6 +1119,79 @@ int vfs_path_lookup(struct dentry *dentry, struct vfsmount *mnt,
 }
 
 /**
+ * vfs_remote_path_lookup - look up a path in a remote server namespace
+ * @dentry:  pointer to dentry of the base directory
+ * @mnt: pointer to vfs mount of the base directory
+ * @name: pointer to file name
+ * @flags: lookup flags
+ * @nd: pointer to nameidata
+ *
+ * This function creates a private mount namespace and sets 'mnt' as the
+ * root volume before looking up the path 'name'.
+ * It is intended for use by filesystems like NFSv4, which has a mount
+ * path that is relative to a remote server's namespace, and where walking
+ * that path may involve following referrals/links to volumes that reside
+ * on yet other servers. The resulting in-kernel automount can now be
+ * done safely as it affects the private namespace only.
+ */
+int vfs_remote_path_lookup(struct dentry *dentry,
+		struct vfsmount *mnt, const char *name,
+		unsigned int flags, struct nameidata *nd)
+{
+	struct nsproxy *new_nsproxy, *orig_nsproxy;
+	struct mnt_namespace *new_mnt_ns;
+	struct fs_struct *new_fs, *orig_fs;
+	int error = -ENOMEM;
+
+	new_fs = copy_fs_struct(current->fs);
+	if (new_fs == NULL)
+		goto out_err;
+
+	new_mnt_ns = create_private_mnt_ns(mnt, new_fs);
+	if (new_mnt_ns == NULL)
+		goto out_put_fs_struct;
+
+	/* Create a private copy of current->nsproxy */
+	new_nsproxy = unshare_current_nsproxy();
+	error = PTR_ERR(new_nsproxy);
+	if (IS_ERR(new_nsproxy))
+		goto out_put_mnt_ns;
+
+	/* ...and substitute the private mount namespace */
+	put_mnt_ns(new_nsproxy->mnt_ns);
+	new_nsproxy->mnt_ns = new_mnt_ns;
+	get_mnt_ns(new_mnt_ns);
+
+	/* Save the old nsproxy */
+	orig_nsproxy = current->nsproxy;
+	get_nsproxy(orig_nsproxy);
+
+	/* Pivot into the new mount namespace */
+	switch_task_namespaces(current, new_nsproxy);
+	task_lock(current);
+	orig_fs = current->fs;
+	current->fs = new_fs;
+	task_unlock(current);
+
+	error = vfs_path_lookup(dentry, mnt, name, flags, nd);
+
+	/* Pivot back into the original namespace */
+	task_lock(current);
+	current->fs = orig_fs;
+	task_unlock(current);
+	switch_task_namespaces(current, orig_nsproxy);
+
+out_put_mnt_ns:
+	put_mnt_ns(new_mnt_ns);
+
+out_put_fs_struct:
+	put_fs_struct(new_fs);
+out_err:
+	return error;
+}
+EXPORT_SYMBOL_GPL(vfs_remote_path_lookup);
+
+/**
  * path_lookup_open - lookup a file path with open intent
  * @dfd: the directory to use as base, or AT_FDCWD
  * @name: pointer to file name
diff --git a/fs/namespace.c b/fs/namespace.c
index 228d8c4..72f20a6 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1963,6 +1963,21 @@ dput_out:
 	return retval;
 }
 
+static struct mnt_namespace *alloc_mnt_ns(void)
+{
+	struct mnt_namespace *new_ns;
+
+	new_ns = kmalloc(sizeof(struct mnt_namespace), GFP_KERNEL);
+	if (!new_ns)
+		return ERR_PTR(-ENOMEM);
+	atomic_set(&new_ns->count, 1);
+	new_ns->root = NULL;
+	INIT_LIST_HEAD(&new_ns->list);
+	init_waitqueue_head(&new_ns->poll);
+	new_ns->event = 0;
+	return new_ns;
+}
+
 /*
  * Allocate a new namespace structure and populate it with contents
  * copied from the namespace of the passed in task structure.
@@ -1974,14 +1989,9 @@ static struct mnt_namespace *dup_mnt_ns(struct mnt_namespace *mnt_ns,
 	struct vfsmount *rootmnt = NULL, *pwdmnt = NULL;
 	struct vfsmount *p, *q;
 
-	new_ns = kmalloc(sizeof(struct mnt_namespace), GFP_KERNEL);
-	if (!new_ns)
-		return ERR_PTR(-ENOMEM);
-
-	atomic_set(&new_ns->count, 1);
-	INIT_LIST_HEAD(&new_ns->list);
-	init_waitqueue_head(&new_ns->poll);
-	new_ns->event = 0;
+	new_ns = alloc_mnt_ns();
+	if (IS_ERR(new_ns))
+		return new_ns;
 
 	down_write(&namespace_sem);
 	/* First pass: copy the tree topology */
@@ -2045,6 +2055,36 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
 	return new_ns;
 }
 
+struct mnt_namespace *create_private_mnt_ns(struct vfsmount *mnt_root,
+		struct fs_struct *fs)
+{
+	struct mnt_namespace *new_ns;
+
+	new_ns = alloc_mnt_ns();
+	if (IS_ERR(new_ns))
+		return new_ns;
+
+	/* We're starting a completely fresh namespace, so we shouldn't need
+	 * to lock
+	 */
+	mnt_root->mnt_ns = new_ns;
+	new_ns->root = mnt_root;
+	list_add(&new_ns->list, &new_ns->root->mnt_list);
+
+	/* Also assume that the fs_struct is private, hence no locks... */
+	if (fs) {
+		dput(fs->pwd.dentry);
+		mntput(fs->pwd.mnt);
+		dput(fs->root.dentry);
+		mntput(fs->root.mnt);
+		fs->root.mnt = mntget(new_ns->root);
+		fs->root.dentry = dget(new_ns->root->mnt_root);
+		fs->pwd.mnt = mntget(new_ns->root);
+		fs->pwd.dentry = dget(new_ns->root->mnt_root);
+	}
+	return new_ns;
+}
+
 SYSCALL_DEFINE5(mount, char __user *, dev_name, char __user *, dir_name,
 		char __user *, type, unsigned long, flags, void __user *, data)
 {
diff --git a/include/linux/mnt_namespace.h b/include/linux/mnt_namespace.h
index 830bbcd..e81c076 100644
--- a/include/linux/mnt_namespace.h
+++ b/include/linux/mnt_namespace.h
@@ -22,6 +22,8 @@ struct proc_mounts {
 	int event;
 };
 
+extern struct mnt_namespace *create_private_mnt_ns(struct vfsmount *,
+		struct fs_struct *);
 extern struct mnt_namespace *copy_mnt_ns(unsigned long, struct mnt_namespace *,
 		struct fs_struct *);
 extern void __put_mnt_ns(struct mnt_namespace *ns);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index fc2e035..b9c0d1e 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -68,6 +68,8 @@ extern int kern_path(const char *, unsigned, struct path *);
 extern int path_lookup(const char *, unsigned, struct nameidata *);
 extern int vfs_path_lookup(struct dentry *, struct vfsmount *,
 			   const char *, unsigned int, struct nameidata *);
+extern int vfs_remote_path_lookup(struct dentry *, struct vfsmount *,
+		const char *, unsigned int , struct nameidata *);
 
 extern int path_lookup_open(int dfd, const char *name, unsigned lookup_flags, struct nameidata *, int open_flags);
 extern struct file *lookup_instantiate_filp(struct nameidata *nd, struct dentry *dentry,
diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index afad7de..3c12b79 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -67,6 +67,7 @@ void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
 void free_nsproxy(struct nsproxy *ns);
 int unshare_nsproxy_namespaces(unsigned long, struct nsproxy **,
 	struct fs_struct *);
+struct nsproxy *unshare_current_nsproxy(void);
 
 static inline void put_nsproxy(struct nsproxy *ns)
 {
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 63598dc..05ea102 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -169,6 +169,17 @@ void free_nsproxy(struct nsproxy *ns)
 }
 
 /*
+ * Unshare just the current->nsproxy itself.
+ */
+struct nsproxy *unshare_current_nsproxy(void)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return ERR_PTR(-EPERM);
+
+	return create_new_namespaces(0, current, current->fs);
+}
+
+/*
  * Called from unshare. Unshare all the namespaces part of nsproxy.
  * On success, returns the new nsproxy.
  */


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 2/2] NFSv4: Use vfs_path_lookup() instead of nfs4_path_walk()
@ 2009-02-09 18:45   ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-09 18:45 UTC (permalink / raw)
  To: linux-fsdevel, linux-nfs; +Cc: linux-kernel, viro

The NFSv4 mount code currently has several limitations which we really
don't want to keep. For one thing, it will not follow symlinks, and neither
will it follow referrals.
The correct approach here should be to reuse the existing VFS and NFS path
lookup code instead of duplicating it inside nfs4_path_walk().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/super.c |  149 ++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 129 insertions(+), 20 deletions(-)


diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index d6686f4..b75e551 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -42,6 +42,7 @@
 #include <linux/smp_lock.h>
 #include <linux/seq_file.h>
 #include <linux/mount.h>
+#include <linux/namei.h>
 #include <linux/nfs_idmap.h>
 #include <linux/vfs.h>
 #include <linux/inet.h>
@@ -264,10 +265,14 @@ static const struct super_operations nfs_sops = {
 #ifdef CONFIG_NFS_V4
 static int nfs4_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
+static int nfs4_remote_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static int nfs4_xdev_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static int nfs4_referral_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
+static int nfs4_remote_referral_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static void nfs4_kill_super(struct super_block *sb);
 
 static struct file_system_type nfs4_fs_type = {
@@ -278,6 +283,14 @@ static struct file_system_type nfs4_fs_type = {
 	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
 };
 
+static struct file_system_type nfs4_remote_fs_type = {
+	.owner		= THIS_MODULE,
+	.name		= "nfs4",
+	.get_sb		= nfs4_remote_get_sb,
+	.kill_sb	= nfs4_kill_super,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
+};
+
 struct file_system_type nfs4_xdev_fs_type = {
 	.owner		= THIS_MODULE,
 	.name		= "nfs4",
@@ -286,6 +299,14 @@ struct file_system_type nfs4_xdev_fs_type = {
 	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
 };
 
+static struct file_system_type nfs4_remote_referral_fs_type = {
+	.owner		= THIS_MODULE,
+	.name		= "nfs4",
+	.get_sb		= nfs4_remote_referral_get_sb,
+	.kill_sb	= nfs4_kill_super,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
+};
+
 struct file_system_type nfs4_referral_fs_type = {
 	.owner		= THIS_MODULE,
 	.name		= "nfs4",
@@ -2330,12 +2351,12 @@ out_no_client_address:
 }
 
 /*
- * Get the superblock for an NFS4 mountpoint
+ * Get the superblock for the NFS4 root partition
  */
-static int nfs4_get_sb(struct file_system_type *fs_type,
+static int nfs4_remote_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt)
 {
-	struct nfs_parsed_mount_data *data;
+	struct nfs_parsed_mount_data *data = raw_data;
 	struct super_block *s;
 	struct nfs_server *server;
 	struct nfs_fh *mntfh;
@@ -2346,18 +2367,12 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
 	};
 	int error = -ENOMEM;
 
-	data = kzalloc(sizeof(*data), GFP_KERNEL);
 	mntfh = kzalloc(sizeof(*mntfh), GFP_KERNEL);
 	if (data == NULL || mntfh == NULL)
 		goto out_free_fh;
 
 	security_init_mnt_opts(&data->lsm_opts);
 
-	/* Validate the mount data */
-	error = nfs4_validate_mount_data(raw_data, data, dev_name);
-	if (error < 0)
-		goto out;
-
 	/* Get a volume representation */
 	server = nfs4_create_server(data, mntfh);
 	if (IS_ERR(server)) {
@@ -2370,7 +2385,7 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
 		compare_super = NULL;
 
 	/* Get a superblock - note that we may end up sharing one that already exists */
-	s = sget(fs_type, compare_super, nfs_set_super, &sb_mntdata);
+	s = sget(&nfs4_fs_type, compare_super, nfs_set_super, &sb_mntdata);
 	if (IS_ERR(s)) {
 		error = PTR_ERR(s);
 		goto out_free;
@@ -2406,13 +2421,9 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
 	error = 0;
 
 out:
-	kfree(data->client_address);
-	kfree(data->nfs_server.export_path);
-	kfree(data->nfs_server.hostname);
 	security_free_mnt_opts(&data->lsm_opts);
 out_free_fh:
 	kfree(mntfh);
-	kfree(data);
 	return error;
 
 out_free:
@@ -2427,6 +2438,75 @@ error_splat_super:
 	goto out;
 }
 
+static int nfs_follow_remote_path(struct vfsmount *root_mnt,
+		const char *export_path, struct vfsmount *mnt_target)
+{
+	struct nameidata nd;
+	struct super_block *s;
+	int error;
+
+	error = vfs_remote_path_lookup(root_mnt->mnt_root, root_mnt,
+			export_path, LOOKUP_FOLLOW, &nd);
+
+	mntput(root_mnt);
+	if (error)
+		return error;
+
+	s = nd.path.mnt->mnt_sb;
+	down_write(&s->s_umount);
+	if (s->s_root) {
+		atomic_inc(&s->s_active);
+		mnt_target->mnt_root = dget(nd.path.dentry);
+		mnt_target->mnt_sb = s;
+		s->s_flags |= MS_ACTIVE;
+	}
+
+	path_put(&nd.path);
+	return 0;
+}
+
+/*
+ * Get the superblock for an NFS4 mountpoint
+ */
+static int nfs4_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt)
+{
+	struct nfs_parsed_mount_data *data;
+	char *export_path;
+	struct vfsmount *root_mnt;
+	int error = -ENOMEM;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (data == NULL)
+		goto out_free_data;
+
+	/* Validate the mount data */
+	error = nfs4_validate_mount_data(raw_data, data, dev_name);
+	if (error < 0)
+		goto out;
+
+	export_path = data->nfs_server.export_path;
+	data->nfs_server.export_path = "/";
+	root_mnt = vfs_kern_mount(&nfs4_remote_fs_type, flags, dev_name, data);
+	data->nfs_server.export_path = export_path;
+
+	error = PTR_ERR(root_mnt);
+	if (IS_ERR(root_mnt))
+		goto out;
+
+	error = nfs_follow_remote_path(root_mnt, export_path, mnt);
+
+out:
+	kfree(data->client_address);
+	kfree(data->nfs_server.export_path);
+	kfree(data->nfs_server.hostname);
+out_free_data:
+	kfree(data);
+	dprintk("<-- nfs4_referral_get_sb() = %d%s\n", error,
+			error != 0 ? " [error]" : "");
+	return error;
+}
+
 static void nfs4_kill_super(struct super_block *sb)
 {
 	struct nfs_server *server = NFS_SB(sb);
@@ -2522,12 +2602,9 @@ error_splat_super:
 	return error;
 }
 
-/*
- * Create an NFS4 server record on referral traversal
- */
-static int nfs4_referral_get_sb(struct file_system_type *fs_type, int flags,
-				const char *dev_name, void *raw_data,
-				struct vfsmount *mnt)
+static int nfs4_remote_referral_get_sb(struct file_system_type *fs_type,
+		int flags, const char *dev_name, void *raw_data,
+		struct vfsmount *mnt)
 {
 	struct nfs_clone_mount *data = raw_data;
 	struct super_block *s;
@@ -2607,4 +2684,36 @@ error_splat_super:
 	return error;
 }
 
+/*
+ * Create an NFS4 server record on referral traversal
+ */
+static int nfs4_referral_get_sb(struct file_system_type *fs_type,
+		int flags, const char *dev_name, void *raw_data,
+		struct vfsmount *mnt)
+{
+	struct nfs_clone_mount *data = raw_data;
+	char *export_path;
+	struct vfsmount *root_mnt;
+	int error;
+
+	dprintk("--> nfs4_referral_get_sb()\n");
+
+	export_path = data->mnt_path;
+	data->mnt_path = "/";
+
+	root_mnt = vfs_kern_mount(&nfs4_remote_referral_fs_type,
+			flags, dev_name, data);
+	data->mnt_path = export_path;
+
+	error = PTR_ERR(root_mnt);
+	if (IS_ERR(root_mnt))
+		goto out;
+
+	error = nfs_follow_remote_path(root_mnt, export_path, mnt);
+out:
+	dprintk("<-- nfs4_referral_get_sb() = %d%s\n", error,
+			error != 0 ? " [error]" : "");
+	return error;
+}
+
 #endif /* CONFIG_NFS_V4 */


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 1/2] VFS: Add a VFS helper function vfs_remote_path_lookup()
@ 2009-02-09 18:45   ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-09 18:45 UTC (permalink / raw)
  To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn

The purpose of this patch is to improve the mount path lookup support for
filesystems such as NFSv4, which require you to look up a mount path
string in a remote server's namespace.

Traversing such a path is pretty much identical to walking a local path,
in that it may involve following symlinks and even following referrals to
volumes that reside on other servers. Since the standard VFS path lookup
code already supports all these features (using in-kernel automounts for
following referrals) it would be nice to be able to reuse that code rather
than special case the mount path lookup in the NFS client.

This patch therefore defines a VFS helper function that sets up a temporary
mount namespace to represent the server namespace, and has the current
task pivot into that prior to doing the path lookup. Upon completion, it
pivots back into the original namespace, and destroys the private one.

Signed-off-by: Trond Myklebust <Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
---

 fs/namei.c                    |   75 ++++++++++++++++++++++++++++++++++++++++-
 fs/namespace.c                |   56 ++++++++++++++++++++++++++-----
 include/linux/mnt_namespace.h |    2 +
 include/linux/namei.h         |    2 +
 include/linux/nsproxy.h       |    1 +
 kernel/nsproxy.c              |   11 ++++++
 6 files changed, 138 insertions(+), 9 deletions(-)


diff --git a/fs/namei.c b/fs/namei.c
index bbc15c2..9280299 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -25,7 +25,7 @@
 #include <linux/personality.h>
 #include <linux/security.h>
 #include <linux/syscalls.h>
-#include <linux/mount.h>
+#include <linux/mnt_namespace.h>
 #include <linux/audit.h>
 #include <linux/capability.h>
 #include <linux/file.h>
@@ -1119,6 +1119,79 @@ int vfs_path_lookup(struct dentry *dentry, struct vfsmount *mnt,
 }
 
 /**
+ * vfs_remote_path_lookup - look up a path in a remote server namespace
+ * @dentry:  pointer to dentry of the base directory
+ * @mnt: pointer to vfs mount of the base directory
+ * @name: pointer to file name
+ * @flags: lookup flags
+ * @nd: pointer to nameidata
+ *
+ * This function creates a private mount namespace and sets 'mnt' as the
+ * root volume before looking up the path 'name'.
+ * It is intended for use by filesystems like NFSv4, which has a mount
+ * path that is relative to a remote server's namespace, and where walking
+ * that path may involve following referrals/links to volumes that reside
+ * on yet other servers. The resulting in-kernel automount can now be
+ * done safely as it affects the private namespace only.
+ */
+int vfs_remote_path_lookup(struct dentry *dentry,
+		struct vfsmount *mnt, const char *name,
+		unsigned int flags, struct nameidata *nd)
+{
+	struct nsproxy *new_nsproxy, *orig_nsproxy;
+	struct mnt_namespace *new_mnt_ns;
+	struct fs_struct *new_fs, *orig_fs;
+	int error = -ENOMEM;
+
+	new_fs = copy_fs_struct(current->fs);
+	if (new_fs == NULL)
+		goto out_err;
+
+	new_mnt_ns = create_private_mnt_ns(mnt, new_fs);
+	if (new_mnt_ns == NULL)
+		goto out_put_fs_struct;
+
+	/* Create a private copy of current->nsproxy */
+	new_nsproxy = unshare_current_nsproxy();
+	error = PTR_ERR(new_nsproxy);
+	if (IS_ERR(new_nsproxy))
+		goto out_put_mnt_ns;
+
+	/* ...and substitute the private mount namespace */
+	put_mnt_ns(new_nsproxy->mnt_ns);
+	new_nsproxy->mnt_ns = new_mnt_ns;
+	get_mnt_ns(new_mnt_ns);
+
+	/* Save the old nsproxy */
+	orig_nsproxy = current->nsproxy;
+	get_nsproxy(orig_nsproxy);
+
+	/* Pivot into the new mount namespace */
+	switch_task_namespaces(current, new_nsproxy);
+	task_lock(current);
+	orig_fs = current->fs;
+	current->fs = new_fs;
+	task_unlock(current);
+
+	error = vfs_path_lookup(dentry, mnt, name, flags, nd);
+
+	/* Pivot back into the original namespace */
+	task_lock(current);
+	current->fs = orig_fs;
+	task_unlock(current);
+	switch_task_namespaces(current, orig_nsproxy);
+
+out_put_mnt_ns:
+	put_mnt_ns(new_mnt_ns);
+
+out_put_fs_struct:
+	put_fs_struct(new_fs);
+out_err:
+	return error;
+}
+EXPORT_SYMBOL_GPL(vfs_remote_path_lookup);
+
+/**
  * path_lookup_open - lookup a file path with open intent
  * @dfd: the directory to use as base, or AT_FDCWD
  * @name: pointer to file name
diff --git a/fs/namespace.c b/fs/namespace.c
index 228d8c4..72f20a6 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1963,6 +1963,21 @@ dput_out:
 	return retval;
 }
 
+static struct mnt_namespace *alloc_mnt_ns(void)
+{
+	struct mnt_namespace *new_ns;
+
+	new_ns = kmalloc(sizeof(struct mnt_namespace), GFP_KERNEL);
+	if (!new_ns)
+		return ERR_PTR(-ENOMEM);
+	atomic_set(&new_ns->count, 1);
+	new_ns->root = NULL;
+	INIT_LIST_HEAD(&new_ns->list);
+	init_waitqueue_head(&new_ns->poll);
+	new_ns->event = 0;
+	return new_ns;
+}
+
 /*
  * Allocate a new namespace structure and populate it with contents
  * copied from the namespace of the passed in task structure.
@@ -1974,14 +1989,9 @@ static struct mnt_namespace *dup_mnt_ns(struct mnt_namespace *mnt_ns,
 	struct vfsmount *rootmnt = NULL, *pwdmnt = NULL;
 	struct vfsmount *p, *q;
 
-	new_ns = kmalloc(sizeof(struct mnt_namespace), GFP_KERNEL);
-	if (!new_ns)
-		return ERR_PTR(-ENOMEM);
-
-	atomic_set(&new_ns->count, 1);
-	INIT_LIST_HEAD(&new_ns->list);
-	init_waitqueue_head(&new_ns->poll);
-	new_ns->event = 0;
+	new_ns = alloc_mnt_ns();
+	if (IS_ERR(new_ns))
+		return new_ns;
 
 	down_write(&namespace_sem);
 	/* First pass: copy the tree topology */
@@ -2045,6 +2055,36 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
 	return new_ns;
 }
 
+struct mnt_namespace *create_private_mnt_ns(struct vfsmount *mnt_root,
+		struct fs_struct *fs)
+{
+	struct mnt_namespace *new_ns;
+
+	new_ns = alloc_mnt_ns();
+	if (IS_ERR(new_ns))
+		return new_ns;
+
+	/* We're starting a completely fresh namespace, so we shouldn't need
+	 * to lock
+	 */
+	mnt_root->mnt_ns = new_ns;
+	new_ns->root = mnt_root;
+	list_add(&new_ns->list, &new_ns->root->mnt_list);
+
+	/* Also assume that the fs_struct is private, hence no locks... */
+	if (fs) {
+		dput(fs->pwd.dentry);
+		mntput(fs->pwd.mnt);
+		dput(fs->root.dentry);
+		mntput(fs->root.mnt);
+		fs->root.mnt = mntget(new_ns->root);
+		fs->root.dentry = dget(new_ns->root->mnt_root);
+		fs->pwd.mnt = mntget(new_ns->root);
+		fs->pwd.dentry = dget(new_ns->root->mnt_root);
+	}
+	return new_ns;
+}
+
 SYSCALL_DEFINE5(mount, char __user *, dev_name, char __user *, dir_name,
 		char __user *, type, unsigned long, flags, void __user *, data)
 {
diff --git a/include/linux/mnt_namespace.h b/include/linux/mnt_namespace.h
index 830bbcd..e81c076 100644
--- a/include/linux/mnt_namespace.h
+++ b/include/linux/mnt_namespace.h
@@ -22,6 +22,8 @@ struct proc_mounts {
 	int event;
 };
 
+extern struct mnt_namespace *create_private_mnt_ns(struct vfsmount *,
+		struct fs_struct *);
 extern struct mnt_namespace *copy_mnt_ns(unsigned long, struct mnt_namespace *,
 		struct fs_struct *);
 extern void __put_mnt_ns(struct mnt_namespace *ns);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index fc2e035..b9c0d1e 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -68,6 +68,8 @@ extern int kern_path(const char *, unsigned, struct path *);
 extern int path_lookup(const char *, unsigned, struct nameidata *);
 extern int vfs_path_lookup(struct dentry *, struct vfsmount *,
 			   const char *, unsigned int, struct nameidata *);
+extern int vfs_remote_path_lookup(struct dentry *, struct vfsmount *,
+		const char *, unsigned int , struct nameidata *);
 
 extern int path_lookup_open(int dfd, const char *name, unsigned lookup_flags, struct nameidata *, int open_flags);
 extern struct file *lookup_instantiate_filp(struct nameidata *nd, struct dentry *dentry,
diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index afad7de..3c12b79 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -67,6 +67,7 @@ void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
 void free_nsproxy(struct nsproxy *ns);
 int unshare_nsproxy_namespaces(unsigned long, struct nsproxy **,
 	struct fs_struct *);
+struct nsproxy *unshare_current_nsproxy(void);
 
 static inline void put_nsproxy(struct nsproxy *ns)
 {
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 63598dc..05ea102 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -169,6 +169,17 @@ void free_nsproxy(struct nsproxy *ns)
 }
 
 /*
+ * Unshare just the current->nsproxy itself.
+ */
+struct nsproxy *unshare_current_nsproxy(void)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return ERR_PTR(-EPERM);
+
+	return create_new_namespaces(0, current, current->fs);
+}
+
+/*
  * Called from unshare. Unshare all the namespaces part of nsproxy.
  * On success, returns the new nsproxy.
  */

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 2/2] NFSv4: Use vfs_path_lookup() instead of nfs4_path_walk()
@ 2009-02-09 18:45   ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-09 18:45 UTC (permalink / raw)
  To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn

The NFSv4 mount code currently has several limitations which we really
don't want to keep. For one thing, it will not follow symlinks, and neither
will it follow referrals.
The correct approach here should be to reuse the existing VFS and NFS path
lookup code instead of duplicating it inside nfs4_path_walk().

Signed-off-by: Trond Myklebust <Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
---

 fs/nfs/super.c |  149 ++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 129 insertions(+), 20 deletions(-)


diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index d6686f4..b75e551 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -42,6 +42,7 @@
 #include <linux/smp_lock.h>
 #include <linux/seq_file.h>
 #include <linux/mount.h>
+#include <linux/namei.h>
 #include <linux/nfs_idmap.h>
 #include <linux/vfs.h>
 #include <linux/inet.h>
@@ -264,10 +265,14 @@ static const struct super_operations nfs_sops = {
 #ifdef CONFIG_NFS_V4
 static int nfs4_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
+static int nfs4_remote_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static int nfs4_xdev_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static int nfs4_referral_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
+static int nfs4_remote_referral_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static void nfs4_kill_super(struct super_block *sb);
 
 static struct file_system_type nfs4_fs_type = {
@@ -278,6 +283,14 @@ static struct file_system_type nfs4_fs_type = {
 	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
 };
 
+static struct file_system_type nfs4_remote_fs_type = {
+	.owner		= THIS_MODULE,
+	.name		= "nfs4",
+	.get_sb		= nfs4_remote_get_sb,
+	.kill_sb	= nfs4_kill_super,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
+};
+
 struct file_system_type nfs4_xdev_fs_type = {
 	.owner		= THIS_MODULE,
 	.name		= "nfs4",
@@ -286,6 +299,14 @@ struct file_system_type nfs4_xdev_fs_type = {
 	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
 };
 
+static struct file_system_type nfs4_remote_referral_fs_type = {
+	.owner		= THIS_MODULE,
+	.name		= "nfs4",
+	.get_sb		= nfs4_remote_referral_get_sb,
+	.kill_sb	= nfs4_kill_super,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
+};
+
 struct file_system_type nfs4_referral_fs_type = {
 	.owner		= THIS_MODULE,
 	.name		= "nfs4",
@@ -2330,12 +2351,12 @@ out_no_client_address:
 }
 
 /*
- * Get the superblock for an NFS4 mountpoint
+ * Get the superblock for the NFS4 root partition
  */
-static int nfs4_get_sb(struct file_system_type *fs_type,
+static int nfs4_remote_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt)
 {
-	struct nfs_parsed_mount_data *data;
+	struct nfs_parsed_mount_data *data = raw_data;
 	struct super_block *s;
 	struct nfs_server *server;
 	struct nfs_fh *mntfh;
@@ -2346,18 +2367,12 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
 	};
 	int error = -ENOMEM;
 
-	data = kzalloc(sizeof(*data), GFP_KERNEL);
 	mntfh = kzalloc(sizeof(*mntfh), GFP_KERNEL);
 	if (data == NULL || mntfh == NULL)
 		goto out_free_fh;
 
 	security_init_mnt_opts(&data->lsm_opts);
 
-	/* Validate the mount data */
-	error = nfs4_validate_mount_data(raw_data, data, dev_name);
-	if (error < 0)
-		goto out;
-
 	/* Get a volume representation */
 	server = nfs4_create_server(data, mntfh);
 	if (IS_ERR(server)) {
@@ -2370,7 +2385,7 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
 		compare_super = NULL;
 
 	/* Get a superblock - note that we may end up sharing one that already exists */
-	s = sget(fs_type, compare_super, nfs_set_super, &sb_mntdata);
+	s = sget(&nfs4_fs_type, compare_super, nfs_set_super, &sb_mntdata);
 	if (IS_ERR(s)) {
 		error = PTR_ERR(s);
 		goto out_free;
@@ -2406,13 +2421,9 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
 	error = 0;
 
 out:
-	kfree(data->client_address);
-	kfree(data->nfs_server.export_path);
-	kfree(data->nfs_server.hostname);
 	security_free_mnt_opts(&data->lsm_opts);
 out_free_fh:
 	kfree(mntfh);
-	kfree(data);
 	return error;
 
 out_free:
@@ -2427,6 +2438,75 @@ error_splat_super:
 	goto out;
 }
 
+static int nfs_follow_remote_path(struct vfsmount *root_mnt,
+		const char *export_path, struct vfsmount *mnt_target)
+{
+	struct nameidata nd;
+	struct super_block *s;
+	int error;
+
+	error = vfs_remote_path_lookup(root_mnt->mnt_root, root_mnt,
+			export_path, LOOKUP_FOLLOW, &nd);
+
+	mntput(root_mnt);
+	if (error)
+		return error;
+
+	s = nd.path.mnt->mnt_sb;
+	down_write(&s->s_umount);
+	if (s->s_root) {
+		atomic_inc(&s->s_active);
+		mnt_target->mnt_root = dget(nd.path.dentry);
+		mnt_target->mnt_sb = s;
+		s->s_flags |= MS_ACTIVE;
+	}
+
+	path_put(&nd.path);
+	return 0;
+}
+
+/*
+ * Get the superblock for an NFS4 mountpoint
+ */
+static int nfs4_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt)
+{
+	struct nfs_parsed_mount_data *data;
+	char *export_path;
+	struct vfsmount *root_mnt;
+	int error = -ENOMEM;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (data == NULL)
+		goto out_free_data;
+
+	/* Validate the mount data */
+	error = nfs4_validate_mount_data(raw_data, data, dev_name);
+	if (error < 0)
+		goto out;
+
+	export_path = data->nfs_server.export_path;
+	data->nfs_server.export_path = "/";
+	root_mnt = vfs_kern_mount(&nfs4_remote_fs_type, flags, dev_name, data);
+	data->nfs_server.export_path = export_path;
+
+	error = PTR_ERR(root_mnt);
+	if (IS_ERR(root_mnt))
+		goto out;
+
+	error = nfs_follow_remote_path(root_mnt, export_path, mnt);
+
+out:
+	kfree(data->client_address);
+	kfree(data->nfs_server.export_path);
+	kfree(data->nfs_server.hostname);
+out_free_data:
+	kfree(data);
+	dprintk("<-- nfs4_referral_get_sb() = %d%s\n", error,
+			error != 0 ? " [error]" : "");
+	return error;
+}
+
 static void nfs4_kill_super(struct super_block *sb)
 {
 	struct nfs_server *server = NFS_SB(sb);
@@ -2522,12 +2602,9 @@ error_splat_super:
 	return error;
 }
 
-/*
- * Create an NFS4 server record on referral traversal
- */
-static int nfs4_referral_get_sb(struct file_system_type *fs_type, int flags,
-				const char *dev_name, void *raw_data,
-				struct vfsmount *mnt)
+static int nfs4_remote_referral_get_sb(struct file_system_type *fs_type,
+		int flags, const char *dev_name, void *raw_data,
+		struct vfsmount *mnt)
 {
 	struct nfs_clone_mount *data = raw_data;
 	struct super_block *s;
@@ -2607,4 +2684,36 @@ error_splat_super:
 	return error;
 }
 
+/*
+ * Create an NFS4 server record on referral traversal
+ */
+static int nfs4_referral_get_sb(struct file_system_type *fs_type,
+		int flags, const char *dev_name, void *raw_data,
+		struct vfsmount *mnt)
+{
+	struct nfs_clone_mount *data = raw_data;
+	char *export_path;
+	struct vfsmount *root_mnt;
+	int error;
+
+	dprintk("--> nfs4_referral_get_sb()\n");
+
+	export_path = data->mnt_path;
+	data->mnt_path = "/";
+
+	root_mnt = vfs_kern_mount(&nfs4_remote_referral_fs_type,
+			flags, dev_name, data);
+	data->mnt_path = export_path;
+
+	error = PTR_ERR(root_mnt);
+	if (IS_ERR(root_mnt))
+		goto out;
+
+	error = nfs_follow_remote_path(root_mnt, export_path, mnt);
+out:
+	dprintk("<-- nfs4_referral_get_sb() = %d%s\n", error,
+			error != 0 ? " [error]" : "");
+	return error;
+}
+
 #endif /* CONFIG_NFS_V4 */

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 1/2] VFS: Add a VFS helper function vfs_remote_path_lookup()
@ 2009-02-09 18:45   ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-09 18:45 UTC (permalink / raw)
  To: linux-fsdevel, linux-nfs; +Cc: linux-kernel, viro

The purpose of this patch is to improve the mount path lookup support for
filesystems such as NFSv4, which require you to look up a mount path
string in a remote server's namespace.

Traversing such a path is pretty much identical to walking a local path,
in that it may involve following symlinks and even following referrals to
volumes that reside on other servers. Since the standard VFS path lookup
code already supports all these features (using in-kernel automounts for
following referrals) it would be nice to be able to reuse that code rather
than special case the mount path lookup in the NFS client.

This patch therefore defines a VFS helper function that sets up a temporary
mount namespace to represent the server namespace, and has the current
task pivot into that prior to doing the path lookup. Upon completion, it
pivots back into the original namespace, and destroys the private one.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/namei.c                    |   75 ++++++++++++++++++++++++++++++++++++++++-
 fs/namespace.c                |   56 ++++++++++++++++++++++++++-----
 include/linux/mnt_namespace.h |    2 +
 include/linux/namei.h         |    2 +
 include/linux/nsproxy.h       |    1 +
 kernel/nsproxy.c              |   11 ++++++
 6 files changed, 138 insertions(+), 9 deletions(-)


diff --git a/fs/namei.c b/fs/namei.c
index bbc15c2..9280299 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -25,7 +25,7 @@
 #include <linux/personality.h>
 #include <linux/security.h>
 #include <linux/syscalls.h>
-#include <linux/mount.h>
+#include <linux/mnt_namespace.h>
 #include <linux/audit.h>
 #include <linux/capability.h>
 #include <linux/file.h>
@@ -1119,6 +1119,79 @@ int vfs_path_lookup(struct dentry *dentry, struct vfsmount *mnt,
 }
 
 /**
+ * vfs_remote_path_lookup - look up a path in a remote server namespace
+ * @dentry:  pointer to dentry of the base directory
+ * @mnt: pointer to vfs mount of the base directory
+ * @name: pointer to file name
+ * @flags: lookup flags
+ * @nd: pointer to nameidata
+ *
+ * This function creates a private mount namespace and sets 'mnt' as the
+ * root volume before looking up the path 'name'.
+ * It is intended for use by filesystems like NFSv4, which has a mount
+ * path that is relative to a remote server's namespace, and where walking
+ * that path may involve following referrals/links to volumes that reside
+ * on yet other servers. The resulting in-kernel automount can now be
+ * done safely as it affects the private namespace only.
+ */
+int vfs_remote_path_lookup(struct dentry *dentry,
+		struct vfsmount *mnt, const char *name,
+		unsigned int flags, struct nameidata *nd)
+{
+	struct nsproxy *new_nsproxy, *orig_nsproxy;
+	struct mnt_namespace *new_mnt_ns;
+	struct fs_struct *new_fs, *orig_fs;
+	int error = -ENOMEM;
+
+	new_fs = copy_fs_struct(current->fs);
+	if (new_fs == NULL)
+		goto out_err;
+
+	new_mnt_ns = create_private_mnt_ns(mnt, new_fs);
+	if (new_mnt_ns == NULL)
+		goto out_put_fs_struct;
+
+	/* Create a private copy of current->nsproxy */
+	new_nsproxy = unshare_current_nsproxy();
+	error = PTR_ERR(new_nsproxy);
+	if (IS_ERR(new_nsproxy))
+		goto out_put_mnt_ns;
+
+	/* ...and substitute the private mount namespace */
+	put_mnt_ns(new_nsproxy->mnt_ns);
+	new_nsproxy->mnt_ns = new_mnt_ns;
+	get_mnt_ns(new_mnt_ns);
+
+	/* Save the old nsproxy */
+	orig_nsproxy = current->nsproxy;
+	get_nsproxy(orig_nsproxy);
+
+	/* Pivot into the new mount namespace */
+	switch_task_namespaces(current, new_nsproxy);
+	task_lock(current);
+	orig_fs = current->fs;
+	current->fs = new_fs;
+	task_unlock(current);
+
+	error = vfs_path_lookup(dentry, mnt, name, flags, nd);
+
+	/* Pivot back into the original namespace */
+	task_lock(current);
+	current->fs = orig_fs;
+	task_unlock(current);
+	switch_task_namespaces(current, orig_nsproxy);
+
+out_put_mnt_ns:
+	put_mnt_ns(new_mnt_ns);
+
+out_put_fs_struct:
+	put_fs_struct(new_fs);
+out_err:
+	return error;
+}
+EXPORT_SYMBOL_GPL(vfs_remote_path_lookup);
+
+/**
  * path_lookup_open - lookup a file path with open intent
  * @dfd: the directory to use as base, or AT_FDCWD
  * @name: pointer to file name
diff --git a/fs/namespace.c b/fs/namespace.c
index 228d8c4..72f20a6 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1963,6 +1963,21 @@ dput_out:
 	return retval;
 }
 
+static struct mnt_namespace *alloc_mnt_ns(void)
+{
+	struct mnt_namespace *new_ns;
+
+	new_ns = kmalloc(sizeof(struct mnt_namespace), GFP_KERNEL);
+	if (!new_ns)
+		return ERR_PTR(-ENOMEM);
+	atomic_set(&new_ns->count, 1);
+	new_ns->root = NULL;
+	INIT_LIST_HEAD(&new_ns->list);
+	init_waitqueue_head(&new_ns->poll);
+	new_ns->event = 0;
+	return new_ns;
+}
+
 /*
  * Allocate a new namespace structure and populate it with contents
  * copied from the namespace of the passed in task structure.
@@ -1974,14 +1989,9 @@ static struct mnt_namespace *dup_mnt_ns(struct mnt_namespace *mnt_ns,
 	struct vfsmount *rootmnt = NULL, *pwdmnt = NULL;
 	struct vfsmount *p, *q;
 
-	new_ns = kmalloc(sizeof(struct mnt_namespace), GFP_KERNEL);
-	if (!new_ns)
-		return ERR_PTR(-ENOMEM);
-
-	atomic_set(&new_ns->count, 1);
-	INIT_LIST_HEAD(&new_ns->list);
-	init_waitqueue_head(&new_ns->poll);
-	new_ns->event = 0;
+	new_ns = alloc_mnt_ns();
+	if (IS_ERR(new_ns))
+		return new_ns;
 
 	down_write(&namespace_sem);
 	/* First pass: copy the tree topology */
@@ -2045,6 +2055,36 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
 	return new_ns;
 }
 
+struct mnt_namespace *create_private_mnt_ns(struct vfsmount *mnt_root,
+		struct fs_struct *fs)
+{
+	struct mnt_namespace *new_ns;
+
+	new_ns = alloc_mnt_ns();
+	if (IS_ERR(new_ns))
+		return new_ns;
+
+	/* We're starting a completely fresh namespace, so we shouldn't need
+	 * to lock
+	 */
+	mnt_root->mnt_ns = new_ns;
+	new_ns->root = mnt_root;
+	list_add(&new_ns->list, &new_ns->root->mnt_list);
+
+	/* Also assume that the fs_struct is private, hence no locks... */
+	if (fs) {
+		dput(fs->pwd.dentry);
+		mntput(fs->pwd.mnt);
+		dput(fs->root.dentry);
+		mntput(fs->root.mnt);
+		fs->root.mnt = mntget(new_ns->root);
+		fs->root.dentry = dget(new_ns->root->mnt_root);
+		fs->pwd.mnt = mntget(new_ns->root);
+		fs->pwd.dentry = dget(new_ns->root->mnt_root);
+	}
+	return new_ns;
+}
+
 SYSCALL_DEFINE5(mount, char __user *, dev_name, char __user *, dir_name,
 		char __user *, type, unsigned long, flags, void __user *, data)
 {
diff --git a/include/linux/mnt_namespace.h b/include/linux/mnt_namespace.h
index 830bbcd..e81c076 100644
--- a/include/linux/mnt_namespace.h
+++ b/include/linux/mnt_namespace.h
@@ -22,6 +22,8 @@ struct proc_mounts {
 	int event;
 };
 
+extern struct mnt_namespace *create_private_mnt_ns(struct vfsmount *,
+		struct fs_struct *);
 extern struct mnt_namespace *copy_mnt_ns(unsigned long, struct mnt_namespace *,
 		struct fs_struct *);
 extern void __put_mnt_ns(struct mnt_namespace *ns);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index fc2e035..b9c0d1e 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -68,6 +68,8 @@ extern int kern_path(const char *, unsigned, struct path *);
 extern int path_lookup(const char *, unsigned, struct nameidata *);
 extern int vfs_path_lookup(struct dentry *, struct vfsmount *,
 			   const char *, unsigned int, struct nameidata *);
+extern int vfs_remote_path_lookup(struct dentry *, struct vfsmount *,
+		const char *, unsigned int , struct nameidata *);
 
 extern int path_lookup_open(int dfd, const char *name, unsigned lookup_flags, struct nameidata *, int open_flags);
 extern struct file *lookup_instantiate_filp(struct nameidata *nd, struct dentry *dentry,
diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index afad7de..3c12b79 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -67,6 +67,7 @@ void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
 void free_nsproxy(struct nsproxy *ns);
 int unshare_nsproxy_namespaces(unsigned long, struct nsproxy **,
 	struct fs_struct *);
+struct nsproxy *unshare_current_nsproxy(void);
 
 static inline void put_nsproxy(struct nsproxy *ns)
 {
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 63598dc..05ea102 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -169,6 +169,17 @@ void free_nsproxy(struct nsproxy *ns)
 }
 
 /*
+ * Unshare just the current->nsproxy itself.
+ */
+struct nsproxy *unshare_current_nsproxy(void)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return ERR_PTR(-EPERM);
+
+	return create_new_namespaces(0, current, current->fs);
+}
+
+/*
  * Called from unshare. Unshare all the namespaces part of nsproxy.
  * On success, returns the new nsproxy.
  */


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 2/2] NFSv4: Use vfs_path_lookup() instead of nfs4_path_walk()
@ 2009-02-09 18:45   ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-09 18:45 UTC (permalink / raw)
  To: linux-fsdevel, linux-nfs; +Cc: linux-kernel, viro

The NFSv4 mount code currently has several limitations which we really
don't want to keep. For one thing, it will not follow symlinks, and neither
will it follow referrals.
The correct approach here should be to reuse the existing VFS and NFS path
lookup code instead of duplicating it inside nfs4_path_walk().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/super.c |  149 ++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 129 insertions(+), 20 deletions(-)


diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index d6686f4..b75e551 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -42,6 +42,7 @@
 #include <linux/smp_lock.h>
 #include <linux/seq_file.h>
 #include <linux/mount.h>
+#include <linux/namei.h>
 #include <linux/nfs_idmap.h>
 #include <linux/vfs.h>
 #include <linux/inet.h>
@@ -264,10 +265,14 @@ static const struct super_operations nfs_sops = {
 #ifdef CONFIG_NFS_V4
 static int nfs4_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
+static int nfs4_remote_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static int nfs4_xdev_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static int nfs4_referral_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
+static int nfs4_remote_referral_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt);
 static void nfs4_kill_super(struct super_block *sb);
 
 static struct file_system_type nfs4_fs_type = {
@@ -278,6 +283,14 @@ static struct file_system_type nfs4_fs_type = {
 	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
 };
 
+static struct file_system_type nfs4_remote_fs_type = {
+	.owner		= THIS_MODULE,
+	.name		= "nfs4",
+	.get_sb		= nfs4_remote_get_sb,
+	.kill_sb	= nfs4_kill_super,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
+};
+
 struct file_system_type nfs4_xdev_fs_type = {
 	.owner		= THIS_MODULE,
 	.name		= "nfs4",
@@ -286,6 +299,14 @@ struct file_system_type nfs4_xdev_fs_type = {
 	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
 };
 
+static struct file_system_type nfs4_remote_referral_fs_type = {
+	.owner		= THIS_MODULE,
+	.name		= "nfs4",
+	.get_sb		= nfs4_remote_referral_get_sb,
+	.kill_sb	= nfs4_kill_super,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
+};
+
 struct file_system_type nfs4_referral_fs_type = {
 	.owner		= THIS_MODULE,
 	.name		= "nfs4",
@@ -2330,12 +2351,12 @@ out_no_client_address:
 }
 
 /*
- * Get the superblock for an NFS4 mountpoint
+ * Get the superblock for the NFS4 root partition
  */
-static int nfs4_get_sb(struct file_system_type *fs_type,
+static int nfs4_remote_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt)
 {
-	struct nfs_parsed_mount_data *data;
+	struct nfs_parsed_mount_data *data = raw_data;
 	struct super_block *s;
 	struct nfs_server *server;
 	struct nfs_fh *mntfh;
@@ -2346,18 +2367,12 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
 	};
 	int error = -ENOMEM;
 
-	data = kzalloc(sizeof(*data), GFP_KERNEL);
 	mntfh = kzalloc(sizeof(*mntfh), GFP_KERNEL);
 	if (data == NULL || mntfh == NULL)
 		goto out_free_fh;
 
 	security_init_mnt_opts(&data->lsm_opts);
 
-	/* Validate the mount data */
-	error = nfs4_validate_mount_data(raw_data, data, dev_name);
-	if (error < 0)
-		goto out;
-
 	/* Get a volume representation */
 	server = nfs4_create_server(data, mntfh);
 	if (IS_ERR(server)) {
@@ -2370,7 +2385,7 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
 		compare_super = NULL;
 
 	/* Get a superblock - note that we may end up sharing one that already exists */
-	s = sget(fs_type, compare_super, nfs_set_super, &sb_mntdata);
+	s = sget(&nfs4_fs_type, compare_super, nfs_set_super, &sb_mntdata);
 	if (IS_ERR(s)) {
 		error = PTR_ERR(s);
 		goto out_free;
@@ -2406,13 +2421,9 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
 	error = 0;
 
 out:
-	kfree(data->client_address);
-	kfree(data->nfs_server.export_path);
-	kfree(data->nfs_server.hostname);
 	security_free_mnt_opts(&data->lsm_opts);
 out_free_fh:
 	kfree(mntfh);
-	kfree(data);
 	return error;
 
 out_free:
@@ -2427,6 +2438,75 @@ error_splat_super:
 	goto out;
 }
 
+static int nfs_follow_remote_path(struct vfsmount *root_mnt,
+		const char *export_path, struct vfsmount *mnt_target)
+{
+	struct nameidata nd;
+	struct super_block *s;
+	int error;
+
+	error = vfs_remote_path_lookup(root_mnt->mnt_root, root_mnt,
+			export_path, LOOKUP_FOLLOW, &nd);
+
+	mntput(root_mnt);
+	if (error)
+		return error;
+
+	s = nd.path.mnt->mnt_sb;
+	down_write(&s->s_umount);
+	if (s->s_root) {
+		atomic_inc(&s->s_active);
+		mnt_target->mnt_root = dget(nd.path.dentry);
+		mnt_target->mnt_sb = s;
+		s->s_flags |= MS_ACTIVE;
+	}
+
+	path_put(&nd.path);
+	return 0;
+}
+
+/*
+ * Get the superblock for an NFS4 mountpoint
+ */
+static int nfs4_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt)
+{
+	struct nfs_parsed_mount_data *data;
+	char *export_path;
+	struct vfsmount *root_mnt;
+	int error = -ENOMEM;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (data == NULL)
+		goto out_free_data;
+
+	/* Validate the mount data */
+	error = nfs4_validate_mount_data(raw_data, data, dev_name);
+	if (error < 0)
+		goto out;
+
+	export_path = data->nfs_server.export_path;
+	data->nfs_server.export_path = "/";
+	root_mnt = vfs_kern_mount(&nfs4_remote_fs_type, flags, dev_name, data);
+	data->nfs_server.export_path = export_path;
+
+	error = PTR_ERR(root_mnt);
+	if (IS_ERR(root_mnt))
+		goto out;
+
+	error = nfs_follow_remote_path(root_mnt, export_path, mnt);
+
+out:
+	kfree(data->client_address);
+	kfree(data->nfs_server.export_path);
+	kfree(data->nfs_server.hostname);
+out_free_data:
+	kfree(data);
+	dprintk("<-- nfs4_referral_get_sb() = %d%s\n", error,
+			error != 0 ? " [error]" : "");
+	return error;
+}
+
 static void nfs4_kill_super(struct super_block *sb)
 {
 	struct nfs_server *server = NFS_SB(sb);
@@ -2522,12 +2602,9 @@ error_splat_super:
 	return error;
 }
 
-/*
- * Create an NFS4 server record on referral traversal
- */
-static int nfs4_referral_get_sb(struct file_system_type *fs_type, int flags,
-				const char *dev_name, void *raw_data,
-				struct vfsmount *mnt)
+static int nfs4_remote_referral_get_sb(struct file_system_type *fs_type,
+		int flags, const char *dev_name, void *raw_data,
+		struct vfsmount *mnt)
 {
 	struct nfs_clone_mount *data = raw_data;
 	struct super_block *s;
@@ -2607,4 +2684,36 @@ error_splat_super:
 	return error;
 }
 
+/*
+ * Create an NFS4 server record on referral traversal
+ */
+static int nfs4_referral_get_sb(struct file_system_type *fs_type,
+		int flags, const char *dev_name, void *raw_data,
+		struct vfsmount *mnt)
+{
+	struct nfs_clone_mount *data = raw_data;
+	char *export_path;
+	struct vfsmount *root_mnt;
+	int error;
+
+	dprintk("--> nfs4_referral_get_sb()\n");
+
+	export_path = data->mnt_path;
+	data->mnt_path = "/";
+
+	root_mnt = vfs_kern_mount(&nfs4_remote_referral_fs_type,
+			flags, dev_name, data);
+	data->mnt_path = export_path;
+
+	error = PTR_ERR(root_mnt);
+	if (IS_ERR(root_mnt))
+		goto out;
+
+	error = nfs_follow_remote_path(root_mnt, export_path, mnt);
+out:
+	dprintk("<-- nfs4_referral_get_sb() = %d%s\n", error,
+			error != 0 ? " [error]" : "");
+	return error;
+}
+
 #endif /* CONFIG_NFS_V4 */


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
@ 2009-02-10 15:58   ` J. Bruce Fields
  0 siblings, 0 replies; 18+ messages in thread
From: J. Bruce Fields @ 2009-02-10 15:58 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-fsdevel, linux-nfs, linux-kernel, viro

On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> The following two patches attempt to improve NFSv4's ability to look up
> the mount path on a remote server.
> 
> The first patch adds VFS support for walking the remote path, using a
> temporary mount namespace to represent the server's namespace, so that
> symlinks

I'm a bit confused about the symlink case--I take it you're assuming
that symlinks in the pseudofs should be interpreted as relative to the
server's namespace (in keeping with traditional implementations of
server exports), while symlinks elsewhere should continue to be
intepreted relative to the client's namespace.

Do the rfc's say anything about this?

--b.

> and referrals can be followed across remote filesystem and
> server boundaries.
> The second patch then uses this VFS helper in the NFSv4 mount code.
> 
> A later set of patches will clean out the legacy nfs4_path_walk, however
> I'm hoping for some commentary from the VFS folks on whether or not this
> approach is acceptable before I continue.
> 
> Cheers
>   Trond
> 
> ---
> 
> Trond Myklebust (2):
>       NFSv4: Use vfs_path_lookup() instead of nfs4_path_walk()
>       VFS: Add a VFS helper function vfs_remote_path_lookup()
> 
> 
>  fs/namei.c                    |   75 ++++++++++++++++++++-
>  fs/namespace.c                |   56 +++++++++++++--
>  fs/nfs/super.c                |  149 +++++++++++++++++++++++++++++++++++------
>  include/linux/mnt_namespace.h |    2 +
>  include/linux/namei.h         |    2 +
>  include/linux/nsproxy.h       |    1 
>  kernel/nsproxy.c              |   11 +++
>  7 files changed, 267 insertions(+), 29 deletions(-)
> 
> -- 
> Signature
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
@ 2009-02-10 15:58   ` J. Bruce Fields
  0 siblings, 0 replies; 18+ messages in thread
From: J. Bruce Fields @ 2009-02-10 15:58 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn

On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> The following two patches attempt to improve NFSv4's ability to look up
> the mount path on a remote server.
> 
> The first patch adds VFS support for walking the remote path, using a
> temporary mount namespace to represent the server's namespace, so that
> symlinks

I'm a bit confused about the symlink case--I take it you're assuming
that symlinks in the pseudofs should be interpreted as relative to the
server's namespace (in keeping with traditional implementations of
server exports), while symlinks elsewhere should continue to be
intepreted relative to the client's namespace.

Do the rfc's say anything about this?

--b.

> and referrals can be followed across remote filesystem and
> server boundaries.
> The second patch then uses this VFS helper in the NFSv4 mount code.
> 
> A later set of patches will clean out the legacy nfs4_path_walk, however
> I'm hoping for some commentary from the VFS folks on whether or not this
> approach is acceptable before I continue.
> 
> Cheers
>   Trond
> 
> ---
> 
> Trond Myklebust (2):
>       NFSv4: Use vfs_path_lookup() instead of nfs4_path_walk()
>       VFS: Add a VFS helper function vfs_remote_path_lookup()
> 
> 
>  fs/namei.c                    |   75 ++++++++++++++++++++-
>  fs/namespace.c                |   56 +++++++++++++--
>  fs/nfs/super.c                |  149 +++++++++++++++++++++++++++++++++++------
>  include/linux/mnt_namespace.h |    2 +
>  include/linux/namei.h         |    2 +
>  include/linux/nsproxy.h       |    1 
>  kernel/nsproxy.c              |   11 +++
>  7 files changed, 267 insertions(+), 29 deletions(-)
> 
> -- 
> Signature
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
@ 2009-02-10 15:58   ` J. Bruce Fields
  0 siblings, 0 replies; 18+ messages in thread
From: J. Bruce Fields @ 2009-02-10 15:58 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-fsdevel, linux-nfs, linux-kernel, viro

On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> The following two patches attempt to improve NFSv4's ability to look up
> the mount path on a remote server.
> 
> The first patch adds VFS support for walking the remote path, using a
> temporary mount namespace to represent the server's namespace, so that
> symlinks

I'm a bit confused about the symlink case--I take it you're assuming
that symlinks in the pseudofs should be interpreted as relative to the
server's namespace (in keeping with traditional implementations of
server exports), while symlinks elsewhere should continue to be
intepreted relative to the client's namespace.

Do the rfc's say anything about this?

--b.

> and referrals can be followed across remote filesystem and
> server boundaries.
> The second patch then uses this VFS helper in the NFSv4 mount code.
> 
> A later set of patches will clean out the legacy nfs4_path_walk, however
> I'm hoping for some commentary from the VFS folks on whether or not this
> approach is acceptable before I continue.
> 
> Cheers
>   Trond
> 
> ---
> 
> Trond Myklebust (2):
>       NFSv4: Use vfs_path_lookup() instead of nfs4_path_walk()
>       VFS: Add a VFS helper function vfs_remote_path_lookup()
> 
> 
>  fs/namei.c                    |   75 ++++++++++++++++++++-
>  fs/namespace.c                |   56 +++++++++++++--
>  fs/nfs/super.c                |  149 +++++++++++++++++++++++++++++++++++------
>  include/linux/mnt_namespace.h |    2 +
>  include/linux/namei.h         |    2 +
>  include/linux/nsproxy.h       |    1 
>  kernel/nsproxy.c              |   11 +++
>  7 files changed, 267 insertions(+), 29 deletions(-)
> 
> -- 
> Signature
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
  2009-02-10 15:58   ` J. Bruce Fields
  (?)
  (?)
@ 2009-02-10 18:31   ` Trond Myklebust
  2009-02-10 21:48     ` J. Bruce Fields
  -1 siblings, 1 reply; 18+ messages in thread
From: Trond Myklebust @ 2009-02-10 18:31 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-fsdevel, linux-nfs, linux-kernel, viro

On Tue, 2009-02-10 at 10:58 -0500, J. Bruce Fields wrote:
> On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> > The following two patches attempt to improve NFSv4's ability to look up
> > the mount path on a remote server.
> > 
> > The first patch adds VFS support for walking the remote path, using a
> > temporary mount namespace to represent the server's namespace, so that
> > symlinks
> 
> I'm a bit confused about the symlink case--I take it you're assuming
> that symlinks in the pseudofs should be interpreted as relative to the
> server's namespace (in keeping with traditional implementations of
> server exports), while symlinks elsewhere should continue to be
> intepreted relative to the client's namespace.
> 
> Do the rfc's say anything about this?

No, the RFCs say nothing, but interpreting symlinks as being relative to
the server namespace would be consistent with the mount behaviour of
NFSv2/v3. It also makes me uncomfortable to have a remote mount path
that could refer back to the client's namespace: that would not be an
NFS mount, but a local bind mount...

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
  2009-02-10 18:31   ` Trond Myklebust
@ 2009-02-10 21:48     ` J. Bruce Fields
  2009-02-10 22:48         ` Trond Myklebust
  0 siblings, 1 reply; 18+ messages in thread
From: J. Bruce Fields @ 2009-02-10 21:48 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-fsdevel, linux-nfs, linux-kernel, viro

On Tue, Feb 10, 2009 at 01:31:48PM -0500, Trond Myklebust wrote:
> On Tue, 2009-02-10 at 10:58 -0500, J. Bruce Fields wrote:
> > On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> > > The following two patches attempt to improve NFSv4's ability to look up
> > > the mount path on a remote server.
> > > 
> > > The first patch adds VFS support for walking the remote path, using a
> > > temporary mount namespace to represent the server's namespace, so that
> > > symlinks
> > 
> > I'm a bit confused about the symlink case--I take it you're assuming
> > that symlinks in the pseudofs should be interpreted as relative to the
> > server's namespace (in keeping with traditional implementations of
> > server exports), while symlinks elsewhere should continue to be
> > intepreted relative to the client's namespace.

Maybe I shouldn't have said "symlinks in the pseudofs", as that's not
entirely well defined--a complicated namespace may transition between
"pseudofs" and "real" filesystems multiple times.  So it's really a
statement about the client's mount behavior: symlinks found along the
mount path will be interpreted one way, symlinks found elsewhere
another.  Right?

Though put that way it's harder to decide what to store in a symlink,
since you can't necessarily control which paths a given client may
decide to mount.

> > Do the rfc's say anything about this?
> 
> No, the RFCs say nothing, but interpreting symlinks as being relative to
> the server namespace would be consistent with the mount behaviour of
> NFSv2/v3. It also makes me uncomfortable to have a remote mount path
> that could refer back to the client's namespace: that would not be an
> NFS mount, but a local bind mount...

Some may be surprised to find that /mntsymlink/ and /mnt/symlink/ will
be different after

	mount file:/path/symlink/ /mntsymlink/
	mount file:/path/	  /mnt/

I see your point, though it might also be an argument for continuing to
error out on symlinks.

It could also be argued that if a given symlink is expected to be
interpreted on the server side, then the server should just go ahead and
do that for the client, rather than returning it as a symlink.

Seems worth at least mentioning to the ietf group, as different behavior
across different clients would be confusing.

--b.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
@ 2009-02-10 22:48         ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-10 22:48 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-fsdevel, linux-nfs, linux-kernel, viro

On Tue, 2009-02-10 at 16:48 -0500, J. Bruce Fields wrote:
> On Tue, Feb 10, 2009 at 01:31:48PM -0500, Trond Myklebust wrote:
> > On Tue, 2009-02-10 at 10:58 -0500, J. Bruce Fields wrote:
> > > On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> > > > The following two patches attempt to improve NFSv4's ability to look up
> > > > the mount path on a remote server.
> > > > 
> > > > The first patch adds VFS support for walking the remote path, using a
> > > > temporary mount namespace to represent the server's namespace, so that
> > > > symlinks
> > > 
> > > I'm a bit confused about the symlink case--I take it you're assuming
> > > that symlinks in the pseudofs should be interpreted as relative to the
> > > server's namespace (in keeping with traditional implementations of
> > > server exports), while symlinks elsewhere should continue to be
> > > intepreted relative to the client's namespace.
> 
> Maybe I shouldn't have said "symlinks in the pseudofs", as that's not
> entirely well defined--a complicated namespace may transition between
> "pseudofs" and "real" filesystems multiple times.  So it's really a
> statement about the client's mount behavior: symlinks found along the
> mount path will be interpreted one way, symlinks found elsewhere
> another.  Right?
> 
> Though put that way it's harder to decide what to store in a symlink,
> since you can't necessarily control which paths a given client may
> decide to mount.

That has been the nature of an NFS mount path string since it was first
introduced in NFSv2: it refers to the server namespace. People haven't
complained about this previously, so why should we start changing the
meaning of the mount path when we move to NFSv4?

> > > Do the rfc's say anything about this?
> > 
> > No, the RFCs say nothing, but interpreting symlinks as being relative to
> > the server namespace would be consistent with the mount behaviour of
> > NFSv2/v3. It also makes me uncomfortable to have a remote mount path
> > that could refer back to the client's namespace: that would not be an
> > NFS mount, but a local bind mount...
> 
> Some may be surprised to find that /mntsymlink/ and /mnt/symlink/ will
> be different after
> 
> 	mount file:/path/symlink/ /mntsymlink/
> 	mount file:/path/	  /mnt/

So, what then if I do

	ln -s ../foo /bar/baz/symlink

on the server, then compare

	mount server:/bar/baz /mnt
and
	mount server:/bar/baz/symlink /mnt

Would you argue that those two should produce the same result? My
interpretation would be as follows:

In the first case, the symlink is visible as /mnt/symlink, and so
'cd /mnt/symlink' will take you to the local path '/foo' on the client.

In the second case, I'd be very surprised if the mount code did anything
other than to follow /bar/baz/symlink to remote path /bar/foo, and then
mount that on '/mnt'

If you agree that the above behaviour is correct, then how would you
argue that replacing '/bar/baz/symlink' with an absolute symlink
(i.e. 'ln -sf /bar/foo /bar/baz/symlink') should suddenly cause mount to
do a bind mount?

> I see your point, though it might also be an argument for continuing to
> error out on symlinks.

Again, why? We don't do that today with NFSv2/v3.

> It could also be argued that if a given symlink is expected to be
> interpreted on the server side, then the server should just go ahead and
> do that for the client, rather than returning it as a symlink.

How would the server distinguish between a client that is doing a lookup
of a mount path and one that is looking up a normal path?

> Seems worth at least mentioning to the ietf group, as different behavior
> across different clients would be confusing.

...as would different behaviour across different versions of NFS.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
@ 2009-02-10 22:48         ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-10 22:48 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn

On Tue, 2009-02-10 at 16:48 -0500, J. Bruce Fields wrote:
> On Tue, Feb 10, 2009 at 01:31:48PM -0500, Trond Myklebust wrote:
> > On Tue, 2009-02-10 at 10:58 -0500, J. Bruce Fields wrote:
> > > On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> > > > The following two patches attempt to improve NFSv4's ability to look up
> > > > the mount path on a remote server.
> > > > 
> > > > The first patch adds VFS support for walking the remote path, using a
> > > > temporary mount namespace to represent the server's namespace, so that
> > > > symlinks
> > > 
> > > I'm a bit confused about the symlink case--I take it you're assuming
> > > that symlinks in the pseudofs should be interpreted as relative to the
> > > server's namespace (in keeping with traditional implementations of
> > > server exports), while symlinks elsewhere should continue to be
> > > intepreted relative to the client's namespace.
> 
> Maybe I shouldn't have said "symlinks in the pseudofs", as that's not
> entirely well defined--a complicated namespace may transition between
> "pseudofs" and "real" filesystems multiple times.  So it's really a
> statement about the client's mount behavior: symlinks found along the
> mount path will be interpreted one way, symlinks found elsewhere
> another.  Right?
> 
> Though put that way it's harder to decide what to store in a symlink,
> since you can't necessarily control which paths a given client may
> decide to mount.

That has been the nature of an NFS mount path string since it was first
introduced in NFSv2: it refers to the server namespace. People haven't
complained about this previously, so why should we start changing the
meaning of the mount path when we move to NFSv4?

> > > Do the rfc's say anything about this?
> > 
> > No, the RFCs say nothing, but interpreting symlinks as being relative to
> > the server namespace would be consistent with the mount behaviour of
> > NFSv2/v3. It also makes me uncomfortable to have a remote mount path
> > that could refer back to the client's namespace: that would not be an
> > NFS mount, but a local bind mount...
> 
> Some may be surprised to find that /mntsymlink/ and /mnt/symlink/ will
> be different after
> 
> 	mount file:/path/symlink/ /mntsymlink/
> 	mount file:/path/	  /mnt/

So, what then if I do

	ln -s ../foo /bar/baz/symlink

on the server, then compare

	mount server:/bar/baz /mnt
and
	mount server:/bar/baz/symlink /mnt

Would you argue that those two should produce the same result? My
interpretation would be as follows:

In the first case, the symlink is visible as /mnt/symlink, and so
'cd /mnt/symlink' will take you to the local path '/foo' on the client.

In the second case, I'd be very surprised if the mount code did anything
other than to follow /bar/baz/symlink to remote path /bar/foo, and then
mount that on '/mnt'

If you agree that the above behaviour is correct, then how would you
argue that replacing '/bar/baz/symlink' with an absolute symlink
(i.e. 'ln -sf /bar/foo /bar/baz/symlink') should suddenly cause mount to
do a bind mount?

> I see your point, though it might also be an argument for continuing to
> error out on symlinks.

Again, why? We don't do that today with NFSv2/v3.

> It could also be argued that if a given symlink is expected to be
> interpreted on the server side, then the server should just go ahead and
> do that for the client, rather than returning it as a symlink.

How would the server distinguish between a client that is doing a lookup
of a mount path and one that is looking up a normal path?

> Seems worth at least mentioning to the ietf group, as different behavior
> across different clients would be confusing.

...as would different behaviour across different versions of NFS.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org
www.netapp.com
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
@ 2009-02-11 19:53           ` J. Bruce Fields
  0 siblings, 0 replies; 18+ messages in thread
From: J. Bruce Fields @ 2009-02-11 19:53 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-fsdevel, linux-nfs, linux-kernel, viro

On Tue, Feb 10, 2009 at 05:48:55PM -0500, Trond Myklebust wrote:
> On Tue, 2009-02-10 at 16:48 -0500, J. Bruce Fields wrote:
> > On Tue, Feb 10, 2009 at 01:31:48PM -0500, Trond Myklebust wrote:
> > > On Tue, 2009-02-10 at 10:58 -0500, J. Bruce Fields wrote:
> > > > On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> > > > > The following two patches attempt to improve NFSv4's ability to look up
> > > > > the mount path on a remote server.
> > > > > 
> > > > > The first patch adds VFS support for walking the remote path, using a
> > > > > temporary mount namespace to represent the server's namespace, so that
> > > > > symlinks
> > > > 
> > > > I'm a bit confused about the symlink case--I take it you're assuming
> > > > that symlinks in the pseudofs should be interpreted as relative to the
> > > > server's namespace (in keeping with traditional implementations of
> > > > server exports), while symlinks elsewhere should continue to be
> > > > intepreted relative to the client's namespace.
> > 
> > Maybe I shouldn't have said "symlinks in the pseudofs", as that's not
> > entirely well defined--a complicated namespace may transition between
> > "pseudofs" and "real" filesystems multiple times.  So it's really a
> > statement about the client's mount behavior: symlinks found along the
> > mount path will be interpreted one way, symlinks found elsewhere
> > another.  Right?
> > 
> > Though put that way it's harder to decide what to store in a symlink,
> > since you can't necessarily control which paths a given client may
> > decide to mount.
> 
> That has been the nature of an NFS mount path string since it was first
> introduced in NFSv2: it refers to the server namespace.
> People haven't complained about this previously, so why should we
> start changing the meaning of the mount path when we move to NFSv4?

It wasn't previously possible for servers to expose symlinks in the
mount path to clients, so it's not clear to me how to apply precedent.

> 
> > > > Do the rfc's say anything about this?
> > > 
> > > No, the RFCs say nothing, but interpreting symlinks as being relative to
> > > the server namespace would be consistent with the mount behaviour of
> > > NFSv2/v3. It also makes me uncomfortable to have a remote mount path
> > > that could refer back to the client's namespace: that would not be an
> > > NFS mount, but a local bind mount...
> > 
> > Some may be surprised to find that /mntsymlink/ and /mnt/symlink/ will
> > be different after
> > 
> > 	mount file:/path/symlink/ /mntsymlink/
> > 	mount file:/path/	  /mnt/
> 
> So, what then if I do
> 
> 	ln -s ../foo /bar/baz/symlink
> 
> on the server, then compare
> 
> 	mount server:/bar/baz /mnt
> and
> 	mount server:/bar/baz/symlink /mnt
> 
> Would you argue that those two should produce the same result? My
> interpretation would be as follows:
> 
> In the first case, the symlink is visible as /mnt/symlink, and so
> 'cd /mnt/symlink' will take you to the local path '/foo' on the client.
> 
> In the second case, I'd be very surprised if the mount code did anything
> other than to follow /bar/baz/symlink to remote path /bar/foo, and then
> mount that on '/mnt'
> 
> If you agree that the above behaviour is correct, then how would you
> argue that replacing '/bar/baz/symlink' with an absolute symlink
> (i.e. 'ln -sf /bar/foo /bar/baz/symlink') should suddenly cause mount to
> do a bind mount?

I certainly agree that mount shouldn't do a bind mount in that case.

> > I see your point, though it might also be an argument for continuing to
> > error out on symlinks.
> 
> Again, why? We don't do that today with NFSv2/v3.

The question doesn't arise with NFSv2/v3, since the mount protocol can't
return a symlink to the client.

> > It could also be argued that if a given symlink is expected to be
> > interpreted on the server side, then the server should just go ahead and
> > do that for the client, rather than returning it as a symlink.
> 
> How would the server distinguish between a client that is doing a lookup
> of a mount path and one that is looking up a normal path?

Exactly, it can't--that's what worries me.  Under your proposal, the
server will return symlinks to the client which the client will
sometimes interpret relative to the server namespace, and sometimes
relative to the client namespace.

Since the server can't know which the client will do, I don't see how to
make any sensible decision about what the value of that symlink should
be.

In your example, if the intention of creating /bar/baz/symlink was
really to direct clients mounting that path to mount /bar/foo, I wonder
if the most helpful thing might just be for the server to return
a filehandle for the directory /bar/foo/ instead of for the symlink
/bar/foo/symlink.

> > Seems worth at least mentioning to the ietf group, as different behavior
> > across different clients would be confusing.
> 
> ...as would different behaviour across different versions of NFS.

Again, I don't understand how v2/v3 precedent applies here.

In any case, I think this is a subtle argument, and something that other
client implementers will come across, hence worth documenting on the
ietf list.

--b.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
@ 2009-02-11 19:53           ` J. Bruce Fields
  0 siblings, 0 replies; 18+ messages in thread
From: J. Bruce Fields @ 2009-02-11 19:53 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn

On Tue, Feb 10, 2009 at 05:48:55PM -0500, Trond Myklebust wrote:
> On Tue, 2009-02-10 at 16:48 -0500, J. Bruce Fields wrote:
> > On Tue, Feb 10, 2009 at 01:31:48PM -0500, Trond Myklebust wrote:
> > > On Tue, 2009-02-10 at 10:58 -0500, J. Bruce Fields wrote:
> > > > On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> > > > > The following two patches attempt to improve NFSv4's ability to look up
> > > > > the mount path on a remote server.
> > > > > 
> > > > > The first patch adds VFS support for walking the remote path, using a
> > > > > temporary mount namespace to represent the server's namespace, so that
> > > > > symlinks
> > > > 
> > > > I'm a bit confused about the symlink case--I take it you're assuming
> > > > that symlinks in the pseudofs should be interpreted as relative to the
> > > > server's namespace (in keeping with traditional implementations of
> > > > server exports), while symlinks elsewhere should continue to be
> > > > intepreted relative to the client's namespace.
> > 
> > Maybe I shouldn't have said "symlinks in the pseudofs", as that's not
> > entirely well defined--a complicated namespace may transition between
> > "pseudofs" and "real" filesystems multiple times.  So it's really a
> > statement about the client's mount behavior: symlinks found along the
> > mount path will be interpreted one way, symlinks found elsewhere
> > another.  Right?
> > 
> > Though put that way it's harder to decide what to store in a symlink,
> > since you can't necessarily control which paths a given client may
> > decide to mount.
> 
> That has been the nature of an NFS mount path string since it was first
> introduced in NFSv2: it refers to the server namespace.
> People haven't complained about this previously, so why should we
> start changing the meaning of the mount path when we move to NFSv4?

It wasn't previously possible for servers to expose symlinks in the
mount path to clients, so it's not clear to me how to apply precedent.

> 
> > > > Do the rfc's say anything about this?
> > > 
> > > No, the RFCs say nothing, but interpreting symlinks as being relative to
> > > the server namespace would be consistent with the mount behaviour of
> > > NFSv2/v3. It also makes me uncomfortable to have a remote mount path
> > > that could refer back to the client's namespace: that would not be an
> > > NFS mount, but a local bind mount...
> > 
> > Some may be surprised to find that /mntsymlink/ and /mnt/symlink/ will
> > be different after
> > 
> > 	mount file:/path/symlink/ /mntsymlink/
> > 	mount file:/path/	  /mnt/
> 
> So, what then if I do
> 
> 	ln -s ../foo /bar/baz/symlink
> 
> on the server, then compare
> 
> 	mount server:/bar/baz /mnt
> and
> 	mount server:/bar/baz/symlink /mnt
> 
> Would you argue that those two should produce the same result? My
> interpretation would be as follows:
> 
> In the first case, the symlink is visible as /mnt/symlink, and so
> 'cd /mnt/symlink' will take you to the local path '/foo' on the client.
> 
> In the second case, I'd be very surprised if the mount code did anything
> other than to follow /bar/baz/symlink to remote path /bar/foo, and then
> mount that on '/mnt'
> 
> If you agree that the above behaviour is correct, then how would you
> argue that replacing '/bar/baz/symlink' with an absolute symlink
> (i.e. 'ln -sf /bar/foo /bar/baz/symlink') should suddenly cause mount to
> do a bind mount?

I certainly agree that mount shouldn't do a bind mount in that case.

> > I see your point, though it might also be an argument for continuing to
> > error out on symlinks.
> 
> Again, why? We don't do that today with NFSv2/v3.

The question doesn't arise with NFSv2/v3, since the mount protocol can't
return a symlink to the client.

> > It could also be argued that if a given symlink is expected to be
> > interpreted on the server side, then the server should just go ahead and
> > do that for the client, rather than returning it as a symlink.
> 
> How would the server distinguish between a client that is doing a lookup
> of a mount path and one that is looking up a normal path?

Exactly, it can't--that's what worries me.  Under your proposal, the
server will return symlinks to the client which the client will
sometimes interpret relative to the server namespace, and sometimes
relative to the client namespace.

Since the server can't know which the client will do, I don't see how to
make any sensible decision about what the value of that symlink should
be.

In your example, if the intention of creating /bar/baz/symlink was
really to direct clients mounting that path to mount /bar/foo, I wonder
if the most helpful thing might just be for the server to return
a filehandle for the directory /bar/foo/ instead of for the symlink
/bar/foo/symlink.

> > Seems worth at least mentioning to the ietf group, as different behavior
> > across different clients would be confusing.
> 
> ...as would different behaviour across different versions of NFS.

Again, I don't understand how v2/v3 precedent applies here.

In any case, I think this is a subtle argument, and something that other
client implementers will come across, hence worth documenting on the
ietf list.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
@ 2009-02-11 19:53           ` J. Bruce Fields
  0 siblings, 0 replies; 18+ messages in thread
From: J. Bruce Fields @ 2009-02-11 19:53 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-fsdevel, linux-nfs, linux-kernel, viro

On Tue, Feb 10, 2009 at 05:48:55PM -0500, Trond Myklebust wrote:
> On Tue, 2009-02-10 at 16:48 -0500, J. Bruce Fields wrote:
> > On Tue, Feb 10, 2009 at 01:31:48PM -0500, Trond Myklebust wrote:
> > > On Tue, 2009-02-10 at 10:58 -0500, J. Bruce Fields wrote:
> > > > On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> > > > > The following two patches attempt to improve NFSv4's ability to look up
> > > > > the mount path on a remote server.
> > > > > 
> > > > > The first patch adds VFS support for walking the remote path, using a
> > > > > temporary mount namespace to represent the server's namespace, so that
> > > > > symlinks
> > > > 
> > > > I'm a bit confused about the symlink case--I take it you're assuming
> > > > that symlinks in the pseudofs should be interpreted as relative to the
> > > > server's namespace (in keeping with traditional implementations of
> > > > server exports), while symlinks elsewhere should continue to be
> > > > intepreted relative to the client's namespace.
> > 
> > Maybe I shouldn't have said "symlinks in the pseudofs", as that's not
> > entirely well defined--a complicated namespace may transition between
> > "pseudofs" and "real" filesystems multiple times.  So it's really a
> > statement about the client's mount behavior: symlinks found along the
> > mount path will be interpreted one way, symlinks found elsewhere
> > another.  Right?
> > 
> > Though put that way it's harder to decide what to store in a symlink,
> > since you can't necessarily control which paths a given client may
> > decide to mount.
> 
> That has been the nature of an NFS mount path string since it was first
> introduced in NFSv2: it refers to the server namespace.
> People haven't complained about this previously, so why should we
> start changing the meaning of the mount path when we move to NFSv4?

It wasn't previously possible for servers to expose symlinks in the
mount path to clients, so it's not clear to me how to apply precedent.

> 
> > > > Do the rfc's say anything about this?
> > > 
> > > No, the RFCs say nothing, but interpreting symlinks as being relative to
> > > the server namespace would be consistent with the mount behaviour of
> > > NFSv2/v3. It also makes me uncomfortable to have a remote mount path
> > > that could refer back to the client's namespace: that would not be an
> > > NFS mount, but a local bind mount...
> > 
> > Some may be surprised to find that /mntsymlink/ and /mnt/symlink/ will
> > be different after
> > 
> > 	mount file:/path/symlink/ /mntsymlink/
> > 	mount file:/path/	  /mnt/
> 
> So, what then if I do
> 
> 	ln -s ../foo /bar/baz/symlink
> 
> on the server, then compare
> 
> 	mount server:/bar/baz /mnt
> and
> 	mount server:/bar/baz/symlink /mnt
> 
> Would you argue that those two should produce the same result? My
> interpretation would be as follows:
> 
> In the first case, the symlink is visible as /mnt/symlink, and so
> 'cd /mnt/symlink' will take you to the local path '/foo' on the client.
> 
> In the second case, I'd be very surprised if the mount code did anything
> other than to follow /bar/baz/symlink to remote path /bar/foo, and then
> mount that on '/mnt'
> 
> If you agree that the above behaviour is correct, then how would you
> argue that replacing '/bar/baz/symlink' with an absolute symlink
> (i.e. 'ln -sf /bar/foo /bar/baz/symlink') should suddenly cause mount to
> do a bind mount?

I certainly agree that mount shouldn't do a bind mount in that case.

> > I see your point, though it might also be an argument for continuing to
> > error out on symlinks.
> 
> Again, why? We don't do that today with NFSv2/v3.

The question doesn't arise with NFSv2/v3, since the mount protocol can't
return a symlink to the client.

> > It could also be argued that if a given symlink is expected to be
> > interpreted on the server side, then the server should just go ahead and
> > do that for the client, rather than returning it as a symlink.
> 
> How would the server distinguish between a client that is doing a lookup
> of a mount path and one that is looking up a normal path?

Exactly, it can't--that's what worries me.  Under your proposal, the
server will return symlinks to the client which the client will
sometimes interpret relative to the server namespace, and sometimes
relative to the client namespace.

Since the server can't know which the client will do, I don't see how to
make any sensible decision about what the value of that symlink should
be.

In your example, if the intention of creating /bar/baz/symlink was
really to direct clients mounting that path to mount /bar/foo, I wonder
if the most helpful thing might just be for the server to return
a filehandle for the directory /bar/foo/ instead of for the symlink
/bar/foo/symlink.

> > Seems worth at least mentioning to the ietf group, as different behavior
> > across different clients would be confusing.
> 
> ...as would different behaviour across different versions of NFS.

Again, I don't understand how v2/v3 precedent applies here.

In any case, I think this is a subtle argument, and something that other
client implementers will come across, hence worth documenting on the
ietf list.

--b.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace
  2009-02-11 19:53           ` J. Bruce Fields
  (?)
  (?)
@ 2009-02-11 20:59           ` Trond Myklebust
  -1 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2009-02-11 20:59 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-fsdevel, linux-nfs, linux-kernel, viro

On Wed, 2009-02-11 at 14:53 -0500, J. Bruce Fields wrote:
> On Tue, Feb 10, 2009 at 05:48:55PM -0500, Trond Myklebust wrote:
> > On Tue, 2009-02-10 at 16:48 -0500, J. Bruce Fields wrote:
> > > On Tue, Feb 10, 2009 at 01:31:48PM -0500, Trond Myklebust wrote:
> > > > On Tue, 2009-02-10 at 10:58 -0500, J. Bruce Fields wrote:
> > > > > On Mon, Feb 09, 2009 at 01:45:34PM -0500, Trond Myklebust wrote:
> > > > > > The following two patches attempt to improve NFSv4's ability to look up
> > > > > > the mount path on a remote server.
> > > > > > 
> > > > > > The first patch adds VFS support for walking the remote path, using a
> > > > > > temporary mount namespace to represent the server's namespace, so that
> > > > > > symlinks
> > > > > 
> > > > > I'm a bit confused about the symlink case--I take it you're assuming
> > > > > that symlinks in the pseudofs should be interpreted as relative to the
> > > > > server's namespace (in keeping with traditional implementations of
> > > > > server exports), while symlinks elsewhere should continue to be
> > > > > intepreted relative to the client's namespace.
> > > 
> > > Maybe I shouldn't have said "symlinks in the pseudofs", as that's not
> > > entirely well defined--a complicated namespace may transition between
> > > "pseudofs" and "real" filesystems multiple times.  So it's really a
> > > statement about the client's mount behavior: symlinks found along the
> > > mount path will be interpreted one way, symlinks found elsewhere
> > > another.  Right?
> > > 
> > > Though put that way it's harder to decide what to store in a symlink,
> > > since you can't necessarily control which paths a given client may
> > > decide to mount.
> > 
> > That has been the nature of an NFS mount path string since it was first
> > introduced in NFSv2: it refers to the server namespace.
> > People haven't complained about this previously, so why should we
> > start changing the meaning of the mount path when we move to NFSv4?
> 
> It wasn't previously possible for servers to expose symlinks in the
> mount path to clients, so it's not clear to me how to apply precedent.

What are you talking about? Of course it was possible! What is
the /bar/baz mount example below doing that is specific to NFSv4? The
only difference here is the question of who is interpreting the mount
path.

The precedent is that in NFSv2/v3, the client sent _all_ mount path
lookup requests to the server's mount daemon. The mount daemon then did
a path lookup in its namespace, following symlinks if necessary, and
returned an NFSv2/v3 filehandle for the endpoint.
In all lookup requests that were sent to the NFS server, symlinks were
never followed, but were returned to the NFS client for interpretation
relative to the user's namespace.

In NFSv4, the mount path is looked up by the client. Why should it act
any differently to the NFSv2/v3 mount daemon, and interpret symlinks
relative to some other namespace?
After the mount is complete, then all subsequent lookups are interpreted
relative to the user's namespace, and so are the symlinks.

> > > > > Do the rfc's say anything about this?
> > > > 
> > > > No, the RFCs say nothing, but interpreting symlinks as being relative to
> > > > the server namespace would be consistent with the mount behaviour of
> > > > NFSv2/v3. It also makes me uncomfortable to have a remote mount path
> > > > that could refer back to the client's namespace: that would not be an
> > > > NFS mount, but a local bind mount...
> > > 
> > > Some may be surprised to find that /mntsymlink/ and /mnt/symlink/ will
> > > be different after
> > > 
> > > 	mount file:/path/symlink/ /mntsymlink/
> > > 	mount file:/path/	  /mnt/
> > 
> > So, what then if I do
> > 
> > 	ln -s ../foo /bar/baz/symlink
> > 
> > on the server, then compare
> > 
> > 	mount server:/bar/baz /mnt
> > and
> > 	mount server:/bar/baz/symlink /mnt
> > 
> > Would you argue that those two should produce the same result? My
> > interpretation would be as follows:
> > 
> > In the first case, the symlink is visible as /mnt/symlink, and so
> > 'cd /mnt/symlink' will take you to the local path '/foo' on the client.
> > 
> > In the second case, I'd be very surprised if the mount code did anything
> > other than to follow /bar/baz/symlink to remote path /bar/foo, and then
> > mount that on '/mnt'
> > 
> > If you agree that the above behaviour is correct, then how would you
> > argue that replacing '/bar/baz/symlink' with an absolute symlink
> > (i.e. 'ln -sf /bar/foo /bar/baz/symlink') should suddenly cause mount to
> > do a bind mount?
> 
> I certainly agree that mount shouldn't do a bind mount in that case.
> 
> > > I see your point, though it might also be an argument for continuing to
> > > error out on symlinks.
> > 
> > Again, why? We don't do that today with NFSv2/v3.
> 
> The question doesn't arise with NFSv2/v3, since the mount protocol can't
> return a symlink to the client.

The mount daemon returns the filehandle of the end point mount path
after interpreting the symlink relative to its namespace. It doesn't
error out.

Why should an NFSv4 mount that uses the exact same mount parameters
suddenly have to return an EINVAL when it could do exactly the same
thing as NFSv2/v3 did?

> > > It could also be argued that if a given symlink is expected to be
> > > interpreted on the server side, then the server should just go ahead and
> > > do that for the client, rather than returning it as a symlink.
> > 
> > How would the server distinguish between a client that is doing a lookup
> > of a mount path and one that is looking up a normal path?
> 
> Exactly, it can't--that's what worries me.  Under your proposal, the
> server will return symlinks to the client which the client will
> sometimes interpret relative to the server namespace, and sometimes
> relative to the client namespace.

No! It means that we _always_ interprets mount paths as being relative
to the server namespace, and that we _always_ interprets user path
lookups as being relative to the user's namespace.

That is fully consistent with all previous practice, and means that

1)
    mount -t nfs -overs=2 server:/bar/baz/mnt
    mount -t nfs -overs=3 server:/bar/baz /mnt
and
    mount -t nfs4 server:/bar/baz /mnt

always works and produces the same result. A subsequent
'ls /mnt/symlink' will refer to the resulting user namespace, and so the
symlink may point to a local file or directory

2)
    mount -t nfs -overs=2 server:/bar/baz/symlink /mnt
    mount -t nfs -overs=3 server:/bar/baz/symlink /mnt
and
    mount -t nfs4 server:/bar/baz/symlink /mnt

always produce the same result (but different to case 1). We mount the
directory that 'symlink' points to on the server. A subsequent 'ls /mnt'
shows no symlink, but points to the same NFS mounted directory in all 3
cases.

3)
    mount -t nfs -overs=2 server:/bar/baz/symlink/foo /mnt
    mount -t nfs -overs=3 server:/bar/baz/symlink/foo /mnt
and
    mount -t nfs4 server:/bar/baz/symlink/foo /mnt

always produce the same result (but different to cases 1 and 2). We
mount the same sub-directory of case 2 onto /mnt.

> Since the server can't know which the client will do, I don't see how to
> make any sensible decision about what the value of that symlink should
> be.
> 
> In your example, if the intention of creating /bar/baz/symlink was
> really to direct clients mounting that path to mount /bar/foo, I wonder
> if the most helpful thing might just be for the server to return
> a filehandle for the directory /bar/foo/ instead of for the symlink
> /bar/foo/symlink.

It can't do that. It has no idea whether or not this is a mount path
lookup or a user path lookup, and as I've said before, the two refer to
completely different namespaces.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2009-02-11 21:00 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-09 18:45 [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace Trond Myklebust
2009-02-09 18:45 ` [RFC PATCH 2/2] NFSv4: Use vfs_path_lookup() instead of nfs4_path_walk() Trond Myklebust
2009-02-09 18:45   ` Trond Myklebust
2009-02-09 18:45   ` Trond Myklebust
2009-02-09 18:45 ` [RFC PATCH 1/2] VFS: Add a VFS helper function vfs_remote_path_lookup() Trond Myklebust
2009-02-09 18:45   ` Trond Myklebust
2009-02-09 18:45   ` Trond Myklebust
2009-02-10 15:58 ` [RFC PATCH 0/2] Add VFS support for looking up paths on remote servers using a temporary mount namespace J. Bruce Fields
2009-02-10 15:58   ` J. Bruce Fields
2009-02-10 15:58   ` J. Bruce Fields
2009-02-10 18:31   ` Trond Myklebust
2009-02-10 21:48     ` J. Bruce Fields
2009-02-10 22:48       ` Trond Myklebust
2009-02-10 22:48         ` Trond Myklebust
2009-02-11 19:53         ` J. Bruce Fields
2009-02-11 19:53           ` J. Bruce Fields
2009-02-11 19:53           ` J. Bruce Fields
2009-02-11 20:59           ` Trond Myklebust

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.