linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/5] Second attempt at contained helper execution
@ 2015-01-14  9:32 Ian Kent
  2015-01-14  9:32 ` [RFC PATCH 1/5] nsproxy - refactor setns() Ian Kent
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Ian Kent @ 2015-01-14  9:32 UTC (permalink / raw)
  To: Kernel Mailing List
  Cc: David Howells, Oleg Nesterov, Trond Myklebust, J. Bruce Fields,
	Benjamin Coddington, Al Viro, Jeff Layton, Eric W. Biederman

This series is a further attempt to find how (or even an acceptable
way) to execute a usermode helper in a contained environment.

Being an attempt to find how to do this no testing has been done and
won't be until a suitable approach can be agreed on, if at all.

>From previous discussion seperation between the caller and the
execution environment is required for security reasons.

It was suggested that a thread be created for each mount and be used
as the basis for the execution environment. There are a number of
problems with this, not the least of which is scaling to a large
numbers of mounts, and there may not be a mount corresponding the the
needed callback which amounts to creating the process from the context
of the caller which we don't want to do.

But now, when a usermode helper is executed the root init namespace is
used and has proven to be adequate. So perhaps it will also be adequate
to use the same approach for contained execution by using the container
init namespace as the basis for the execution.

That's essentially all this series attempts to do.

There are other difficulties to tackle as well, such as how to decide
if contained helper execution is needed. For example, if a mount has
been propagated to a container or bound into the container tree (such
as with the --volume option of "docker run") the root init namespace
may need to be used and not the container namespace.

There's also the rather resource heavy method that is used here to
enter the target namespace which probably needs work but is out of
scope for this series if in fact this approach is even acceptable.

Comments please?

---

Ian Kent (5):
      nsproxy - refactor setns()
      kmod - rename call_usermodehelper() flags parameter
      kmod - teach call_usermodehelper() to use a namespace
      KEYS - rename call_usermodehelper_keys() flags parameter
      KEYS: exec request-key within the requesting task's init namespace


 include/linux/kmod.h        |   21 ++++++-
 include/linux/nsproxy.h     |    1 
 kernel/kmod.c               |  135 +++++++++++++++++++++++++++++++++++++++----
 kernel/nsproxy.c            |   21 ++++---
 security/keys/request_key.c |   51 ++++++++++++++--
 5 files changed, 201 insertions(+), 28 deletions(-)

--
Ian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [RFC PATCH 1/5] nsproxy - refactor setns()
  2015-01-14  9:32 [RFC PATCH 0/5] Second attempt at contained helper execution Ian Kent
@ 2015-01-14  9:32 ` Ian Kent
  2015-01-14  9:32 ` [RFC PATCH 2/5] kmod - rename call_usermodehelper() flags parameter Ian Kent
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Ian Kent @ 2015-01-14  9:32 UTC (permalink / raw)
  To: Kernel Mailing List
  Cc: David Howells, Oleg Nesterov, Trond Myklebust, J. Bruce Fields,
	Benjamin Coddington, Al Viro, Jeff Layton, Eric W. Biederman

For usermode helpers to execute within a namspace a slightly different
entry point to setns() that takes a namspace inode is needed.

Signed-off-by: Ian Kent <ikent@redhat.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jeff Layton <jeff.layton@primarydata.com>
---
 include/linux/nsproxy.h |    1 +
 kernel/nsproxy.c        |   21 ++++++++++++++-------
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index 35fa08f..c75bf12 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -62,6 +62,7 @@ extern struct nsproxy init_nsproxy;
  *
  */
 
+int setns_inode(struct inode *inode, int nstype);
 int copy_namespaces(unsigned long flags, struct task_struct *tsk);
 void exit_task_namespaces(struct task_struct *tsk);
 void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 49746c8..27cc544 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -218,20 +218,15 @@ void exit_task_namespaces(struct task_struct *p)
 	switch_task_namespaces(p, NULL);
 }
 
-SYSCALL_DEFINE2(setns, int, fd, int, nstype)
+int setns_inode(struct inode *inode, int nstype)
 {
 	struct task_struct *tsk = current;
 	struct nsproxy *new_nsproxy;
-	struct file *file;
 	struct ns_common *ns;
 	int err;
 
-	file = proc_ns_fget(fd);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-
 	err = -EINVAL;
-	ns = get_proc_ns(file_inode(file));
+	ns = get_proc_ns(inode);
 	if (nstype && (ns->ops->type != nstype))
 		goto out;
 
@@ -248,6 +243,18 @@ SYSCALL_DEFINE2(setns, int, fd, int, nstype)
 	}
 	switch_task_namespaces(tsk, new_nsproxy);
 out:
+	return err;
+}
+
+SYSCALL_DEFINE2(setns, int, fd, int, nstype)
+{
+	struct file *file;
+	int err;
+
+	file = proc_ns_fget(fd);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+	err = setns_inode(file_inode(file), nstype);
 	fput(file);
 	return err;
 }


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 2/5] kmod - rename call_usermodehelper() flags parameter
  2015-01-14  9:32 [RFC PATCH 0/5] Second attempt at contained helper execution Ian Kent
  2015-01-14  9:32 ` [RFC PATCH 1/5] nsproxy - refactor setns() Ian Kent
@ 2015-01-14  9:32 ` Ian Kent
  2015-01-14  9:32 ` [RFC PATCH 3/5] kmod - teach call_usermodehelper() to use a namespace Ian Kent
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Ian Kent @ 2015-01-14  9:32 UTC (permalink / raw)
  To: Kernel Mailing List
  Cc: David Howells, Oleg Nesterov, Trond Myklebust, J. Bruce Fields,
	Benjamin Coddington, Al Viro, Jeff Layton, Eric W. Biederman

The wait parameter of call_usermodehelper() is not quite a parameter
that describes the wait behaviour alone and will later be used to
request exec within a namespace.

So change its name to flags.

Signed-off-by: Ian Kent <ikent@redhat.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jeff Layton <jeff.layton@primarydata.com>
---
 include/linux/kmod.h |    4 ++--
 kernel/kmod.c        |   16 ++++++++--------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 0555cc6..15bdeed 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -67,7 +67,7 @@ struct subprocess_info {
 };
 
 extern int
-call_usermodehelper(char *path, char **argv, char **envp, int wait);
+call_usermodehelper(char *path, char **argv, char **envp, int flags);
 
 extern struct subprocess_info *
 call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
@@ -75,7 +75,7 @@ call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
 			  void (*cleanup)(struct subprocess_info *), void *data);
 
 extern int
-call_usermodehelper_exec(struct subprocess_info *info, int wait);
+call_usermodehelper_exec(struct subprocess_info *info, int flags);
 
 extern struct ctl_table usermodehelper_table[];
 
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 2777f40..14c0188 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -534,7 +534,7 @@ EXPORT_SYMBOL(call_usermodehelper_setup);
  * asynchronously if wait is not set, and runs as a child of keventd.
  * (ie. it runs with full root capabilities).
  */
-int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
+int call_usermodehelper_exec(struct subprocess_info *sub_info, int flags)
 {
 	DECLARE_COMPLETION_ONSTACK(done);
 	int retval = 0;
@@ -553,14 +553,14 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 	 * This makes it possible to use umh_complete to free
 	 * the data structure in case of UMH_NO_WAIT.
 	 */
-	sub_info->complete = (wait == UMH_NO_WAIT) ? NULL : &done;
-	sub_info->wait = wait;
+	sub_info->complete = (flags == UMH_NO_WAIT) ? NULL : &done;
+	sub_info->wait = flags;
 
 	queue_work(khelper_wq, &sub_info->work);
-	if (wait == UMH_NO_WAIT)	/* task has freed sub_info */
+	if (flags == UMH_NO_WAIT)	/* task has freed sub_info */
 		goto unlock;
 
-	if (wait & UMH_KILLABLE) {
+	if (flags & UMH_KILLABLE) {
 		retval = wait_for_completion_killable(&done);
 		if (!retval)
 			goto wait_done;
@@ -595,17 +595,17 @@ EXPORT_SYMBOL(call_usermodehelper_exec);
  * This function is the equivalent to use call_usermodehelper_setup() and
  * call_usermodehelper_exec().
  */
-int call_usermodehelper(char *path, char **argv, char **envp, int wait)
+int call_usermodehelper(char *path, char **argv, char **envp, int flags)
 {
 	struct subprocess_info *info;
-	gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
+	gfp_t gfp_mask = (flags == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
 
 	info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
 					 NULL, NULL, NULL);
 	if (info == NULL)
 		return -ENOMEM;
 
-	return call_usermodehelper_exec(info, wait);
+	return call_usermodehelper_exec(info, flags);
 }
 EXPORT_SYMBOL(call_usermodehelper);
 


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 3/5] kmod - teach call_usermodehelper() to use a namespace
  2015-01-14  9:32 [RFC PATCH 0/5] Second attempt at contained helper execution Ian Kent
  2015-01-14  9:32 ` [RFC PATCH 1/5] nsproxy - refactor setns() Ian Kent
  2015-01-14  9:32 ` [RFC PATCH 2/5] kmod - rename call_usermodehelper() flags parameter Ian Kent
@ 2015-01-14  9:32 ` Ian Kent
  2015-01-15 16:45   ` Jeff Layton
  2015-01-14  9:32 ` [RFC PATCH 4/5] KEYS - rename call_usermodehelper_keys() flags parameter Ian Kent
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Ian Kent @ 2015-01-14  9:32 UTC (permalink / raw)
  To: Kernel Mailing List
  Cc: David Howells, Oleg Nesterov, Trond Myklebust, J. Bruce Fields,
	Benjamin Coddington, Al Viro, Jeff Layton, Eric W. Biederman

The call_usermodehelper() function executes all binaries in the
global "init" root context. This doesn't allow a binary to be run
within a namespace (eg. the namespace of a container).

Both containerized NFS client and NFS server need the ability to
execute a binary in a container's context. To do this use the init
process of the callers environment is used to setup the namespaces
in the same way the root init process is used otherwise.

Signed-off-by: Ian Kent <ikent@redhat.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jeff Layton <jeff.layton@primarydata.com>
---
 include/linux/kmod.h |   17 +++++++
 kernel/kmod.c        |  119 +++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 133 insertions(+), 3 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 15bdeed..487e68e 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -52,6 +52,7 @@ struct file;
 #define UMH_WAIT_EXEC	1	/* wait for the exec, but not the process */
 #define UMH_WAIT_PROC	2	/* wait for the process to complete */
 #define UMH_KILLABLE	4	/* wait for EXEC/PROC killable */
+#define UMH_USE_NS	8	/* exec using caller's init namespace */
 
 struct subprocess_info {
 	struct work_struct work;
@@ -69,6 +70,22 @@ struct subprocess_info {
 extern int
 call_usermodehelper(char *path, char **argv, char **envp, int flags);
 
+#if !defined(CONFIG_PROC_FS) || !defined(CONFIG_NAMESPACES)
+inline int umh_get_init_pid(pid_t *pid)
+{
+	*pid = 0;
+	return 0;
+}
+
+inline int umh_enter_ns(pid_t pid, struct cred *new)
+{
+	return -ENOTSUP;
+}
+#else
+int umh_get_init_pid(pid_t *pid);
+int umh_enter_ns(pid_t pid, struct cred *new);
+#endif
+
 extern struct subprocess_info *
 call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
 			  int (*init)(struct subprocess_info *info, struct cred *new),
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 14c0188..2179e58 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -582,6 +582,97 @@ unlock:
 }
 EXPORT_SYMBOL(call_usermodehelper_exec);
 
+#if defined(CONFIG_PROC_FS) && defined(CONFIG_NAMESPACES)
+#define NS_PATH_MAX	35
+#define NS_PATH_FMT	"%lu/ns/%s"
+
+/* Note namespace name order is significant */
+static const char *ns_names[] = { "user", "ipc", "uts", "net", "pid", "mnt", NULL };
+
+int umh_get_init_pid(pid_t *pid)
+{
+	struct task_struct *tsk;
+	pid_t init_pid;
+
+	rcu_read_lock();
+	tsk = find_task_by_vpid(1);
+	if (tsk)
+		get_task_struct(tsk);
+	rcu_read_unlock();
+	if (!tsk)
+		return -ESRCH;
+	init_pid = task_pid_nr(tsk);
+	put_task_struct(tsk);
+
+	*pid = init_pid;
+
+	return 0;
+}
+EXPORT_SYMBOL(umh_get_init_pid);
+
+int umh_enter_ns(pid_t pid, struct cred *new)
+{
+	char path[NS_PATH_MAX];
+	struct vfsmount *mnt;
+	const char *name;
+	int err = 0;
+
+	/*
+	 * The user mode thread runner runs in the root init namespace
+	 * so it will see all system pids.
+	 */
+	mnt = task_active_pid_ns(current)->proc_mnt;
+
+	for (name = ns_names[0]; *name; name++) {
+		struct file *this;
+		int len;
+
+		len = snprintf(path,
+			       NS_PATH_MAX, NS_PATH_FMT,
+			       (unsigned long) pid, name);
+		if (len >= NS_PATH_MAX) {
+			err = -ENAMETOOLONG;
+			break;
+		}
+
+		this = file_open_root(mnt->mnt_root, mnt, path, O_RDONLY);
+		if (unlikely(IS_ERR(this))) {
+			err = PTR_ERR(this);
+			break;
+		}
+
+		err = setns_inode(file_inode(this), 0);
+		fput(this);
+		if (err)
+			break;
+	}
+
+	return err;
+}
+EXPORT_SYMBOL(umh_enter_ns);
+
+static int umh_set_ns(struct subprocess_info *info, struct cred *new)
+{
+	pid_t *init_pid = (pid_t *) info->data;
+
+	return umh_enter_ns(*init_pid, new);
+}
+
+static void umh_free_ns(struct subprocess_info *info)
+{
+	kfree(info->data);
+}
+#else
+static int umh_set_ns(struct subprocess_info *info, struct cred *new)
+{
+	return 0;
+}
+
+static void umh_free_ns(struct subprocess_info *info)
+{
+}
+#endif
+
 /**
  * call_usermodehelper() - prepare and start a usermode application
  * @path: path to usermode executable
@@ -599,11 +690,33 @@ int call_usermodehelper(char *path, char **argv, char **envp, int flags)
 {
 	struct subprocess_info *info;
 	gfp_t gfp_mask = (flags == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
+	unsigned int use_ns = flags & UMH_USE_NS;
+	pid_t init_pid = 0;
+	pid_t *pid = NULL;
 
-	info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
-					 NULL, NULL, NULL);
-	if (info == NULL)
+	if (use_ns) {
+		int ret = umh_get_init_pid(&init_pid);
+		if (ret)
+			return ret;
+	}
+
+	if (!init_pid)
+		info = call_usermodehelper_setup(path, argv, envp,
+						 gfp_mask, NULL, NULL, NULL);
+	else {
+		pid = kmalloc(sizeof(pid_t), gfp_mask);
+		if (!pid)
+			return -ENOMEM;
+		*pid = init_pid;
+		info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
+						 umh_set_ns, umh_free_ns,
+						 pid);
+	}
+	if (info == NULL) {
+		if (pid)
+			kfree(pid);
 		return -ENOMEM;
+	}
 
 	return call_usermodehelper_exec(info, flags);
 }


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 4/5] KEYS - rename call_usermodehelper_keys() flags parameter
  2015-01-14  9:32 [RFC PATCH 0/5] Second attempt at contained helper execution Ian Kent
                   ` (2 preceding siblings ...)
  2015-01-14  9:32 ` [RFC PATCH 3/5] kmod - teach call_usermodehelper() to use a namespace Ian Kent
@ 2015-01-14  9:32 ` Ian Kent
  2015-01-14  9:32 ` [RFC PATCH 5/5] KEYS: exec request-key within the requesting task's init namespace Ian Kent
  2015-01-14 21:55 ` [RFC PATCH 0/5] Second attempt at contained helper execution J. Bruce Fields
  5 siblings, 0 replies; 18+ messages in thread
From: Ian Kent @ 2015-01-14  9:32 UTC (permalink / raw)
  To: Kernel Mailing List
  Cc: David Howells, Oleg Nesterov, Trond Myklebust, J. Bruce Fields,
	Benjamin Coddington, Al Viro, Jeff Layton, Eric W. Biederman

The wait parameter of call_usermodehelper_keys() will later be used to
request exec within a namespace.

So change its name to flags.

Signed-off-by: Ian Kent <ikent@redhat.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jeff Layton <jeff.layton@primarydata.com>
---
 security/keys/request_key.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 0c7aea4..9e79bbf 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -73,7 +73,7 @@ static void umh_keys_cleanup(struct subprocess_info *info)
  * Call a usermode helper with a specific session keyring.
  */
 static int call_usermodehelper_keys(char *path, char **argv, char **envp,
-					struct key *session_keyring, int wait)
+					struct key *session_keyring, int flags)
 {
 	struct subprocess_info *info;
 
@@ -84,7 +84,7 @@ static int call_usermodehelper_keys(char *path, char **argv, char **envp,
 		return -ENOMEM;
 
 	key_get(session_keyring);
-	return call_usermodehelper_exec(info, wait);
+	return call_usermodehelper_exec(info, flags);
 }
 
 /*


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 5/5] KEYS: exec request-key within the requesting task's init namespace
  2015-01-14  9:32 [RFC PATCH 0/5] Second attempt at contained helper execution Ian Kent
                   ` (3 preceding siblings ...)
  2015-01-14  9:32 ` [RFC PATCH 4/5] KEYS - rename call_usermodehelper_keys() flags parameter Ian Kent
@ 2015-01-14  9:32 ` Ian Kent
  2015-01-14 21:55 ` [RFC PATCH 0/5] Second attempt at contained helper execution J. Bruce Fields
  5 siblings, 0 replies; 18+ messages in thread
From: Ian Kent @ 2015-01-14  9:32 UTC (permalink / raw)
  To: Kernel Mailing List
  Cc: David Howells, Oleg Nesterov, Trond Myklebust, J. Bruce Fields,
	Benjamin Coddington, Al Viro, Jeff Layton, Eric W. Biederman

Containerized request key helper callbacks need the ability to execute
a binary in a container's context. To do this calling an in kernel
equivalent of setns(2) should be sufficient since the user mode helper
execution kernel thread ultimately calls do_execve().

Signed-off-by: Ian Kent <ikent@redhat.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jeff Layton <jeff.layton@primarydata.com>
---
 security/keys/request_key.c |   47 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 9e79bbf..d986ef3 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -17,6 +17,8 @@
 #include <linux/err.h>
 #include <linux/keyctl.h>
 #include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/nsproxy.h>
 #include "internal.h"
 
 #define key_negative_timeout	60	/* default timeout on a negative key's existence */
@@ -46,6 +48,11 @@ void complete_request_key(struct key_construction *cons, int error)
 }
 EXPORT_SYMBOL(complete_request_key);
 
+struct request_key_info {
+	struct key	*keyring;
+	pid_t		init_pid;
+};
+
 /*
  * Initialise a usermode helper that is going to have a specific session
  * keyring.
@@ -55,9 +62,18 @@ EXPORT_SYMBOL(complete_request_key);
  */
 static int umh_keys_init(struct subprocess_info *info, struct cred *cred)
 {
-	struct key *keyring = info->data;
+	struct request_key_info *rki = info->data;
+	pid_t init_pid = rki->init_pid;
+
+	if (init_pid) {
+		int err;
+
+		err = umh_enter_ns(init_pid, cred);
+		if (err)
+			return err;
+	}
 
-	return install_session_keyring_to_cred(cred, keyring);
+	return install_session_keyring_to_cred(cred, rki->keyring);
 }
 
 /*
@@ -65,8 +81,9 @@ static int umh_keys_init(struct subprocess_info *info, struct cred *cred)
  */
 static void umh_keys_cleanup(struct subprocess_info *info)
 {
-	struct key *keyring = info->data;
-	key_put(keyring);
+	struct request_key_info *rki = info->data;
+	key_put(rki->keyring);
+	kfree(rki);
 }
 
 /*
@@ -76,12 +93,30 @@ static int call_usermodehelper_keys(char *path, char **argv, char **envp,
 					struct key *session_keyring, int flags)
 {
 	struct subprocess_info *info;
+	struct request_key_info *rki;
+	unsigned int use_ns = flags & UMH_USE_NS;
+	pid_t init_pid = 0;
+
+	rki = kmalloc(sizeof(*rki), GFP_KERNEL);
+	if (!rki)
+		return -ENOMEM;
+
+	if (use_ns) {
+		int ret = umh_get_init_pid(&init_pid);
+		if (ret)
+			return ret;
+	}
+
+	rki->keyring = session_keyring;
+	rki->init_pid = init_pid;
 
 	info = call_usermodehelper_setup(path, argv, envp, GFP_KERNEL,
 					  umh_keys_init, umh_keys_cleanup,
-					  session_keyring);
-	if (!info)
+					  rki);
+	if (!info) {
+		kfree(rki);
 		return -ENOMEM;
+	}
 
 	key_get(session_keyring);
 	return call_usermodehelper_exec(info, flags);


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-14  9:32 [RFC PATCH 0/5] Second attempt at contained helper execution Ian Kent
                   ` (4 preceding siblings ...)
  2015-01-14  9:32 ` [RFC PATCH 5/5] KEYS: exec request-key within the requesting task's init namespace Ian Kent
@ 2015-01-14 21:55 ` J. Bruce Fields
  2015-01-14 22:10   ` J. Bruce Fields
  5 siblings, 1 reply; 18+ messages in thread
From: J. Bruce Fields @ 2015-01-14 21:55 UTC (permalink / raw)
  To: Ian Kent
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> This series is a further attempt to find how (or even an acceptable
> way) to execute a usermode helper in a contained environment.
> 
> Being an attempt to find how to do this no testing has been done and
> won't be until a suitable approach can be agreed on, if at all.
> 
> >From previous discussion seperation between the caller and the
> execution environment is required for security reasons.
> 
> It was suggested that a thread be created for each mount and be used
> as the basis for the execution environment. There are a number of
> problems with this, not the least of which is scaling to a large
> numbers of mounts, and there may not be a mount corresponding the the
> needed callback

Remind me what example you're thinking of here?

--b.

> which amounts to creating the process from the context
> of the caller which we don't want to do.
> 
> But now, when a usermode helper is executed the root init namespace is
> used and has proven to be adequate. So perhaps it will also be adequate
> to use the same approach for contained execution by using the container
> init namespace as the basis for the execution.
> 
> That's essentially all this series attempts to do.
> 
> There are other difficulties to tackle as well, such as how to decide
> if contained helper execution is needed. For example, if a mount has
> been propagated to a container or bound into the container tree (such
> as with the --volume option of "docker run") the root init namespace
> may need to be used and not the container namespace.
> 
> There's also the rather resource heavy method that is used here to
> enter the target namespace which probably needs work but is out of
> scope for this series if in fact this approach is even acceptable.
> 
> Comments please?
> 
> ---
> 
> Ian Kent (5):
>       nsproxy - refactor setns()
>       kmod - rename call_usermodehelper() flags parameter
>       kmod - teach call_usermodehelper() to use a namespace
>       KEYS - rename call_usermodehelper_keys() flags parameter
>       KEYS: exec request-key within the requesting task's init namespace
> 
> 
>  include/linux/kmod.h        |   21 ++++++-
>  include/linux/nsproxy.h     |    1 
>  kernel/kmod.c               |  135 +++++++++++++++++++++++++++++++++++++++----
>  kernel/nsproxy.c            |   21 ++++---
>  security/keys/request_key.c |   51 ++++++++++++++--
>  5 files changed, 201 insertions(+), 28 deletions(-)
> 
> --
> Ian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-14 21:55 ` [RFC PATCH 0/5] Second attempt at contained helper execution J. Bruce Fields
@ 2015-01-14 22:10   ` J. Bruce Fields
  2015-01-15  0:26     ` Ian Kent
  0 siblings, 1 reply; 18+ messages in thread
From: J. Bruce Fields @ 2015-01-14 22:10 UTC (permalink / raw)
  To: Ian Kent
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

> On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> > There are other difficulties to tackle as well, such as how to decide
> > if contained helper execution is needed. For example, if a mount has
> > been propagated to a container or bound into the container tree (such
> > as with the --volume option of "docker run") the root init namespace
> > may need to be used and not the container namespace.

I think you have to go through each of the existing upcall examples and
decide what's needed for each.

At least for the nfsv4 idmapper I would've thought the namespace the
mount was done in would be the right choice, hence my previous question.

--b.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-14 22:10   ` J. Bruce Fields
@ 2015-01-15  0:26     ` Ian Kent
  2015-01-15 16:27       ` J. Bruce Fields
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Kent @ 2015-01-15  0:26 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

On Wed, 2015-01-14 at 17:10 -0500, J. Bruce Fields wrote:
> > On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> > > There are other difficulties to tackle as well, such as how to decide
> > > if contained helper execution is needed. For example, if a mount has
> > > been propagated to a container or bound into the container tree (such
> > > as with the --volume option of "docker run") the root init namespace
> > > may need to be used and not the container namespace.
> 
> I think you have to go through each of the existing upcall examples and
> decide what's needed for each.
> 
> At least for the nfsv4 idmapper I would've thought the namespace the
> mount was done in would be the right choice, hence my previous question.

Probably but you don't necessarily know what namespace the mount was
done in. It may have been propagated from another namespace or (although
I don't think it works yet) bound from another container using the
volumes-from docker option.

At least I believe that's a problem and I agree that, once a suitable
method of running helpers is found each case will need to be looked at.

Ian



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-15  0:26     ` Ian Kent
@ 2015-01-15 16:27       ` J. Bruce Fields
  2015-01-16  1:01         ` Ian Kent
  0 siblings, 1 reply; 18+ messages in thread
From: J. Bruce Fields @ 2015-01-15 16:27 UTC (permalink / raw)
  To: Ian Kent
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

On Thu, Jan 15, 2015 at 08:26:12AM +0800, Ian Kent wrote:
> On Wed, 2015-01-14 at 17:10 -0500, J. Bruce Fields wrote:
> > > On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> > > > There are other difficulties to tackle as well, such as how to decide
> > > > if contained helper execution is needed. For example, if a mount has
> > > > been propagated to a container or bound into the container tree (such
> > > > as with the --volume option of "docker run") the root init namespace
> > > > may need to be used and not the container namespace.
> > 
> > I think you have to go through each of the existing upcall examples and
> > decide what's needed for each.
> > 
> > At least for the nfsv4 idmapper I would've thought the namespace the
> > mount was done in would be the right choice, hence my previous question.
> 
> Probably but you don't necessarily know what namespace the mount was
> done in. It may have been propagated from another namespace or (although
> I don't think it works yet) bound from another container using the
> volumes-from docker option.

Name-id mappings should be associated with the superblock, I guess--so
don't you store a pointer to the right thing there?

--b.

> 
> At least I believe that's a problem and I agree that, once a suitable
> method of running helpers is found each case will need to be looked at.
> 
> Ian
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 3/5] kmod - teach call_usermodehelper() to use a namespace
  2015-01-14  9:32 ` [RFC PATCH 3/5] kmod - teach call_usermodehelper() to use a namespace Ian Kent
@ 2015-01-15 16:45   ` Jeff Layton
  2015-01-16  1:18     ` Ian Kent
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Layton @ 2015-01-15 16:45 UTC (permalink / raw)
  To: Ian Kent
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, J. Bruce Fields, Benjamin Coddington, Al Viro,
	Jeff Layton, Eric W. Biederman

On Wed, 14 Jan 2015 17:32:43 +0800
Ian Kent <ikent@redhat.com> wrote:

> The call_usermodehelper() function executes all binaries in the
> global "init" root context. This doesn't allow a binary to be run
> within a namespace (eg. the namespace of a container).
> 
> Both containerized NFS client and NFS server need the ability to
> execute a binary in a container's context. To do this use the init
> process of the callers environment is used to setup the namespaces
> in the same way the root init process is used otherwise.
> 
> Signed-off-by: Ian Kent <ikent@redhat.com>
> Cc: Benjamin Coddington <bcodding@redhat.com>
> Cc: Al Viro <viro@ZenIV.linux.org.uk>
> Cc: J. Bruce Fields <bfields@fieldses.org>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Trond Myklebust <trond.myklebust@primarydata.com>
> Cc: Oleg Nesterov <onestero@redhat.com>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Jeff Layton <jeff.layton@primarydata.com>
> ---
>  include/linux/kmod.h |   17 +++++++
>  kernel/kmod.c        |  119 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 133 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index 15bdeed..487e68e 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -52,6 +52,7 @@ struct file;
>  #define UMH_WAIT_EXEC	1	/* wait for the exec, but not the process */
>  #define UMH_WAIT_PROC	2	/* wait for the process to complete */
>  #define UMH_KILLABLE	4	/* wait for EXEC/PROC killable */
> +#define UMH_USE_NS	8	/* exec using caller's init namespace */
>  
>  struct subprocess_info {
>  	struct work_struct work;
> @@ -69,6 +70,22 @@ struct subprocess_info {
>  extern int
>  call_usermodehelper(char *path, char **argv, char **envp, int flags);
>  
> +#if !defined(CONFIG_PROC_FS) || !defined(CONFIG_NAMESPACES)
> +inline int umh_get_init_pid(pid_t *pid)
> +{
> +	*pid = 0;
> +	return 0;
> +}
> +
> +inline int umh_enter_ns(pid_t pid, struct cred *new)
> +{
> +	return -ENOTSUP;
> +}
> +#else
> +int umh_get_init_pid(pid_t *pid);
> +int umh_enter_ns(pid_t pid, struct cred *new);
> +#endif
> +
>  extern struct subprocess_info *
>  call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
>  			  int (*init)(struct subprocess_info *info, struct cred *new),
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 14c0188..2179e58 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -582,6 +582,97 @@ unlock:
>  }
>  EXPORT_SYMBOL(call_usermodehelper_exec);
>  
> +#if defined(CONFIG_PROC_FS) && defined(CONFIG_NAMESPACES)
> +#define NS_PATH_MAX	35
> +#define NS_PATH_FMT	"%lu/ns/%s"
> +
> +/* Note namespace name order is significant */
> +static const char *ns_names[] = { "user", "ipc", "uts", "net", "pid", "mnt", NULL };
> +
> +int umh_get_init_pid(pid_t *pid)
> +{
> +	struct task_struct *tsk;
> +	pid_t init_pid;
> +
> +	rcu_read_lock();
> +	tsk = find_task_by_vpid(1);
> +	if (tsk)
> +		get_task_struct(tsk);
> +	rcu_read_unlock();
> +	if (!tsk)
> +		return -ESRCH;
> +	init_pid = task_pid_nr(tsk);
> +	put_task_struct(tsk);
> +
> +	*pid = init_pid;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(umh_get_init_pid);
> +

nit: pid_t == int, so you could just make this function return the
pid or a negative error code instead of dealing with a return argument.

> +int umh_enter_ns(pid_t pid, struct cred *new)
> +{
> +	char path[NS_PATH_MAX];
> +	struct vfsmount *mnt;
> +	const char *name;
> +	int err = 0;
> +
> +	/*
> +	 * The user mode thread runner runs in the root init namespace
> +	 * so it will see all system pids.
> +	 */
> +	mnt = task_active_pid_ns(current)->proc_mnt;
> +
> +	for (name = ns_names[0]; *name; name++) {
> +		struct file *this;
> +		int len;
> +
> +		len = snprintf(path,
> +			       NS_PATH_MAX, NS_PATH_FMT,
> +			       (unsigned long) pid, name);
> +		if (len >= NS_PATH_MAX) {
> +			err = -ENAMETOOLONG;
> +			break;
> +		}
> +
> +		this = file_open_root(mnt->mnt_root, mnt, path, O_RDONLY);
> +		if (unlikely(IS_ERR(this))) {
> +			err = PTR_ERR(this);
> +			break;
> +		}
> +
> +		err = setns_inode(file_inode(this), 0);
> +		fput(this);
> +		if (err)
> +			break;
> +	}
> +
> +	return err;
> +}
> +EXPORT_SYMBOL(umh_enter_ns);
> +
> +static int umh_set_ns(struct subprocess_info *info, struct cred *new)
> +{
> +	pid_t *init_pid = (pid_t *) info->data;
> +
> +	return umh_enter_ns(*init_pid, new);
> +}
> +
> +static void umh_free_ns(struct subprocess_info *info)
> +{
> +	kfree(info->data);
> +}
> +#else
> +static int umh_set_ns(struct subprocess_info *info, struct cred *new)
> +{
> +	return 0;
> +}
> +
> +static void umh_free_ns(struct subprocess_info *info)
> +{
> +}
> +#endif
> +
>  /**
>   * call_usermodehelper() - prepare and start a usermode application
>   * @path: path to usermode executable
> @@ -599,11 +690,33 @@ int call_usermodehelper(char *path, char **argv, char **envp, int flags)
>  {
>  	struct subprocess_info *info;
>  	gfp_t gfp_mask = (flags == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
> +	unsigned int use_ns = flags & UMH_USE_NS;
> +	pid_t init_pid = 0;
> +	pid_t *pid = NULL;
>  
> -	info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
> -					 NULL, NULL, NULL);
> -	if (info == NULL)
> +	if (use_ns) {
> +		int ret = umh_get_init_pid(&init_pid);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	if (!init_pid)
> +		info = call_usermodehelper_setup(path, argv, envp,
> +						 gfp_mask, NULL, NULL, NULL);
> +	else {
> +		pid = kmalloc(sizeof(pid_t), gfp_mask);
> +		if (!pid)
> +			return -ENOMEM;
> +		*pid = init_pid;
> +		info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
> +						 umh_set_ns, umh_free_ns,
> +						 pid);
> +	}

Is this racy?

What guarantees that that pid will still be there (and in its correct
incarnation) once you get around to spawning the helper? After all,
this is just done via workqueues. Between the time that the work is
scheduled and picked up by the workqueue the whole namespace could
be torn down and a new process could grab that pid.

Maybe it would be better instead to deal with task_structs directly,
and ensure that the correct init process task_struct is pinned until
the new process is spawned? (or maybe even until it exits?)

> +	if (info == NULL) {
> +		if (pid)
> +			kfree(pid);
>  		return -ENOMEM;
> +	}
>  
>  	return call_usermodehelper_exec(info, flags);
>  }
> 


-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-15 16:27       ` J. Bruce Fields
@ 2015-01-16  1:01         ` Ian Kent
  2015-01-16 15:25           ` J. Bruce Fields
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Kent @ 2015-01-16  1:01 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

On Thu, 2015-01-15 at 11:27 -0500, J. Bruce Fields wrote:
> On Thu, Jan 15, 2015 at 08:26:12AM +0800, Ian Kent wrote:
> > On Wed, 2015-01-14 at 17:10 -0500, J. Bruce Fields wrote:
> > > > On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> > > > > There are other difficulties to tackle as well, such as how to decide
> > > > > if contained helper execution is needed. For example, if a mount has
> > > > > been propagated to a container or bound into the container tree (such
> > > > > as with the --volume option of "docker run") the root init namespace
> > > > > may need to be used and not the container namespace.
> > > 
> > > I think you have to go through each of the existing upcall examples and
> > > decide what's needed for each.
> > > 
> > > At least for the nfsv4 idmapper I would've thought the namespace the
> > > mount was done in would be the right choice, hence my previous question.
> > 
> > Probably but you don't necessarily know what namespace the mount was
> > done in. It may have been propagated from another namespace or (although
> > I don't think it works yet) bound from another container using the
> > volumes-from docker option.
> 
> Name-id mappings should be associated with the superblock, I guess--so
> don't you store a pointer to the right thing there?

Quite possibly but my original point was, without an acceptable
mechanism to execute the helper we can't know what might need to be done
to use it.

> 
> --b.
> 
> > 
> > At least I believe that's a problem and I agree that, once a suitable
> > method of running helpers is found each case will need to be looked at.
> > 
> > Ian
> > 
> > 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 3/5] kmod - teach call_usermodehelper() to use a namespace
  2015-01-15 16:45   ` Jeff Layton
@ 2015-01-16  1:18     ` Ian Kent
  0 siblings, 0 replies; 18+ messages in thread
From: Ian Kent @ 2015-01-16  1:18 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, J. Bruce Fields, Benjamin Coddington, Al Viro,
	Eric W. Biederman

On Thu, 2015-01-15 at 11:45 -0500, Jeff Layton wrote:
> On Wed, 14 Jan 2015 17:32:43 +0800
> Ian Kent <ikent@redhat.com> wrote:
> 
> > The call_usermodehelper() function executes all binaries in the
> > global "init" root context. This doesn't allow a binary to be run
> > within a namespace (eg. the namespace of a container).
> > 
> > Both containerized NFS client and NFS server need the ability to
> > execute a binary in a container's context. To do this use the init
> > process of the callers environment is used to setup the namespaces
> > in the same way the root init process is used otherwise.
> > 
> > Signed-off-by: Ian Kent <ikent@redhat.com>
> > Cc: Benjamin Coddington <bcodding@redhat.com>
> > Cc: Al Viro <viro@ZenIV.linux.org.uk>
> > Cc: J. Bruce Fields <bfields@fieldses.org>
> > Cc: David Howells <dhowells@redhat.com>
> > Cc: Trond Myklebust <trond.myklebust@primarydata.com>
> > Cc: Oleg Nesterov <onestero@redhat.com>
> > Cc: Eric W. Biederman <ebiederm@xmission.com>
> > Cc: Jeff Layton <jeff.layton@primarydata.com>
> > ---
> >  include/linux/kmod.h |   17 +++++++
> >  kernel/kmod.c        |  119 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 133 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> > index 15bdeed..487e68e 100644
> > --- a/include/linux/kmod.h
> > +++ b/include/linux/kmod.h
> > @@ -52,6 +52,7 @@ struct file;
> >  #define UMH_WAIT_EXEC	1	/* wait for the exec, but not the process */
> >  #define UMH_WAIT_PROC	2	/* wait for the process to complete */
> >  #define UMH_KILLABLE	4	/* wait for EXEC/PROC killable */
> > +#define UMH_USE_NS	8	/* exec using caller's init namespace */
> >  
> >  struct subprocess_info {
> >  	struct work_struct work;
> > @@ -69,6 +70,22 @@ struct subprocess_info {
> >  extern int
> >  call_usermodehelper(char *path, char **argv, char **envp, int flags);
> >  
> > +#if !defined(CONFIG_PROC_FS) || !defined(CONFIG_NAMESPACES)
> > +inline int umh_get_init_pid(pid_t *pid)
> > +{
> > +	*pid = 0;
> > +	return 0;
> > +}
> > +
> > +inline int umh_enter_ns(pid_t pid, struct cred *new)
> > +{
> > +	return -ENOTSUP;
> > +}
> > +#else
> > +int umh_get_init_pid(pid_t *pid);
> > +int umh_enter_ns(pid_t pid, struct cred *new);
> > +#endif
> > +
> >  extern struct subprocess_info *
> >  call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
> >  			  int (*init)(struct subprocess_info *info, struct cred *new),
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index 14c0188..2179e58 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -582,6 +582,97 @@ unlock:
> >  }
> >  EXPORT_SYMBOL(call_usermodehelper_exec);
> >  
> > +#if defined(CONFIG_PROC_FS) && defined(CONFIG_NAMESPACES)
> > +#define NS_PATH_MAX	35
> > +#define NS_PATH_FMT	"%lu/ns/%s"
> > +
> > +/* Note namespace name order is significant */
> > +static const char *ns_names[] = { "user", "ipc", "uts", "net", "pid", "mnt", NULL };
> > +
> > +int umh_get_init_pid(pid_t *pid)
> > +{
> > +	struct task_struct *tsk;
> > +	pid_t init_pid;
> > +
> > +	rcu_read_lock();
> > +	tsk = find_task_by_vpid(1);
> > +	if (tsk)
> > +		get_task_struct(tsk);
> > +	rcu_read_unlock();
> > +	if (!tsk)
> > +		return -ESRCH;
> > +	init_pid = task_pid_nr(tsk);
> > +	put_task_struct(tsk);
> > +
> > +	*pid = init_pid;
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL(umh_get_init_pid);
> > +
> 
> nit: pid_t == int, so you could just make this function return the
> pid or a negative error code instead of dealing with a return argument.

True, but it worries me that pid_t might one day become an unsigned int
or long and something like that would be a hidden problem.

> 
> > +int umh_enter_ns(pid_t pid, struct cred *new)
> > +{
> > +	char path[NS_PATH_MAX];
> > +	struct vfsmount *mnt;
> > +	const char *name;
> > +	int err = 0;
> > +
> > +	/*
> > +	 * The user mode thread runner runs in the root init namespace
> > +	 * so it will see all system pids.
> > +	 */
> > +	mnt = task_active_pid_ns(current)->proc_mnt;
> > +
> > +	for (name = ns_names[0]; *name; name++) {
> > +		struct file *this;
> > +		int len;
> > +
> > +		len = snprintf(path,
> > +			       NS_PATH_MAX, NS_PATH_FMT,
> > +			       (unsigned long) pid, name);
> > +		if (len >= NS_PATH_MAX) {
> > +			err = -ENAMETOOLONG;
> > +			break;
> > +		}
> > +
> > +		this = file_open_root(mnt->mnt_root, mnt, path, O_RDONLY);
> > +		if (unlikely(IS_ERR(this))) {
> > +			err = PTR_ERR(this);
> > +			break;
> > +		}
> > +
> > +		err = setns_inode(file_inode(this), 0);
> > +		fput(this);
> > +		if (err)
> > +			break;
> > +	}
> > +
> > +	return err;
> > +}
> > +EXPORT_SYMBOL(umh_enter_ns);
> > +
> > +static int umh_set_ns(struct subprocess_info *info, struct cred *new)
> > +{
> > +	pid_t *init_pid = (pid_t *) info->data;
> > +
> > +	return umh_enter_ns(*init_pid, new);
> > +}
> > +
> > +static void umh_free_ns(struct subprocess_info *info)
> > +{
> > +	kfree(info->data);
> > +}
> > +#else
> > +static int umh_set_ns(struct subprocess_info *info, struct cred *new)
> > +{
> > +	return 0;
> > +}
> > +
> > +static void umh_free_ns(struct subprocess_info *info)
> > +{
> > +}
> > +#endif
> > +
> >  /**
> >   * call_usermodehelper() - prepare and start a usermode application
> >   * @path: path to usermode executable
> > @@ -599,11 +690,33 @@ int call_usermodehelper(char *path, char **argv, char **envp, int flags)
> >  {
> >  	struct subprocess_info *info;
> >  	gfp_t gfp_mask = (flags == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
> > +	unsigned int use_ns = flags & UMH_USE_NS;
> > +	pid_t init_pid = 0;
> > +	pid_t *pid = NULL;
> >  
> > -	info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
> > -					 NULL, NULL, NULL);
> > -	if (info == NULL)
> > +	if (use_ns) {
> > +		int ret = umh_get_init_pid(&init_pid);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	if (!init_pid)
> > +		info = call_usermodehelper_setup(path, argv, envp,
> > +						 gfp_mask, NULL, NULL, NULL);
> > +	else {
> > +		pid = kmalloc(sizeof(pid_t), gfp_mask);
> > +		if (!pid)
> > +			return -ENOMEM;
> > +		*pid = init_pid;
> > +		info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
> > +						 umh_set_ns, umh_free_ns,
> > +						 pid);
> > +	}
> 
> Is this racy?
> 
> What guarantees that that pid will still be there (and in its correct
> incarnation) once you get around to spawning the helper? After all,
> this is just done via workqueues. Between the time that the work is
> scheduled and picked up by the workqueue the whole namespace could
> be torn down and a new process could grab that pid.

Yeah, it is racy, it'll need more work.

> 
> Maybe it would be better instead to deal with task_structs directly,
> and ensure that the correct init process task_struct is pinned until
> the new process is spawned? (or maybe even until it exits?)

That might be the way to do it but lets establish if the approach is
viable before putting time into fixing it.

> 
> > +	if (info == NULL) {
> > +		if (pid)
> > +			kfree(pid);
> >  		return -ENOMEM;
> > +	}
> >  
> >  	return call_usermodehelper_exec(info, flags);
> >  }
> > 
> 
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-16  1:01         ` Ian Kent
@ 2015-01-16 15:25           ` J. Bruce Fields
  2015-01-21  7:05             ` Ian Kent
  0 siblings, 1 reply; 18+ messages in thread
From: J. Bruce Fields @ 2015-01-16 15:25 UTC (permalink / raw)
  To: Ian Kent
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

On Fri, Jan 16, 2015 at 09:01:13AM +0800, Ian Kent wrote:
> On Thu, 2015-01-15 at 11:27 -0500, J. Bruce Fields wrote:
> > On Thu, Jan 15, 2015 at 08:26:12AM +0800, Ian Kent wrote:
> > > On Wed, 2015-01-14 at 17:10 -0500, J. Bruce Fields wrote:
> > > > > On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> > > > > > There are other difficulties to tackle as well, such as how to decide
> > > > > > if contained helper execution is needed. For example, if a mount has
> > > > > > been propagated to a container or bound into the container tree (such
> > > > > > as with the --volume option of "docker run") the root init namespace
> > > > > > may need to be used and not the container namespace.
> > > > 
> > > > I think you have to go through each of the existing upcall examples and
> > > > decide what's needed for each.
> > > > 
> > > > At least for the nfsv4 idmapper I would've thought the namespace the
> > > > mount was done in would be the right choice, hence my previous question.
> > > 
> > > Probably but you don't necessarily know what namespace the mount was
> > > done in. It may have been propagated from another namespace or (although
> > > I don't think it works yet) bound from another container using the
> > > volumes-from docker option.
> > 
> > Name-id mappings should be associated with the superblock, I guess--so
> > don't you store a pointer to the right thing there?
> 
> Quite possibly but my original point was, without an acceptable
> mechanism to execute the helper we can't know what might need to be done
> to use it.

At least for me it would be easier to review if it came with at least
one example user.

--b.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-16 15:25           ` J. Bruce Fields
@ 2015-01-21  7:05             ` Ian Kent
  2015-01-21 14:38               ` J. Bruce Fields
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Kent @ 2015-01-21  7:05 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

On Fri, 2015-01-16 at 10:25 -0500, J. Bruce Fields wrote:
> On Fri, Jan 16, 2015 at 09:01:13AM +0800, Ian Kent wrote:
> > On Thu, 2015-01-15 at 11:27 -0500, J. Bruce Fields wrote:
> > > On Thu, Jan 15, 2015 at 08:26:12AM +0800, Ian Kent wrote:
> > > > On Wed, 2015-01-14 at 17:10 -0500, J. Bruce Fields wrote:
> > > > > > On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> > > > > > > There are other difficulties to tackle as well, such as how to decide
> > > > > > > if contained helper execution is needed. For example, if a mount has
> > > > > > > been propagated to a container or bound into the container tree (such
> > > > > > > as with the --volume option of "docker run") the root init namespace
> > > > > > > may need to be used and not the container namespace.
> > > > > 
> > > > > I think you have to go through each of the existing upcall examples and
> > > > > decide what's needed for each.
> > > > > 
> > > > > At least for the nfsv4 idmapper I would've thought the namespace the
> > > > > mount was done in would be the right choice, hence my previous question.
> > > > 
> > > > Probably but you don't necessarily know what namespace the mount was
> > > > done in. It may have been propagated from another namespace or (although
> > > > I don't think it works yet) bound from another container using the
> > > > volumes-from docker option.
> > > 
> > > Name-id mappings should be associated with the superblock, I guess--so
> > > don't you store a pointer to the right thing there?
> > 
> > Quite possibly but my original point was, without an acceptable
> > mechanism to execute the helper we can't know what might need to be done
> > to use it.
> 
> At least for me it would be easier to review if it came with at least
> one example user.

Haven't seen any negative responses but perhaps people are still away
over Xmas.

In the mean time it's probably a good idea to add some use cases to the
series in case the approach is OK.

I'll have a look at the nfsd code and see if I can spot the places.

Ian



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-21  7:05             ` Ian Kent
@ 2015-01-21 14:38               ` J. Bruce Fields
  2015-01-22  1:28                 ` Ian Kent
  0 siblings, 1 reply; 18+ messages in thread
From: J. Bruce Fields @ 2015-01-21 14:38 UTC (permalink / raw)
  To: Ian Kent
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

On Wed, Jan 21, 2015 at 03:05:25PM +0800, Ian Kent wrote:
> On Fri, 2015-01-16 at 10:25 -0500, J. Bruce Fields wrote:
> > On Fri, Jan 16, 2015 at 09:01:13AM +0800, Ian Kent wrote:
> > > On Thu, 2015-01-15 at 11:27 -0500, J. Bruce Fields wrote:
> > > > On Thu, Jan 15, 2015 at 08:26:12AM +0800, Ian Kent wrote:
> > > > > On Wed, 2015-01-14 at 17:10 -0500, J. Bruce Fields wrote:
> > > > > > > On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> > > > > > > > There are other difficulties to tackle as well, such as how to decide
> > > > > > > > if contained helper execution is needed. For example, if a mount has
> > > > > > > > been propagated to a container or bound into the container tree (such
> > > > > > > > as with the --volume option of "docker run") the root init namespace
> > > > > > > > may need to be used and not the container namespace.
> > > > > > 
> > > > > > I think you have to go through each of the existing upcall examples and
> > > > > > decide what's needed for each.
> > > > > > 
> > > > > > At least for the nfsv4 idmapper I would've thought the namespace the
> > > > > > mount was done in would be the right choice, hence my previous question.
> > > > > 
> > > > > Probably but you don't necessarily know what namespace the mount was
> > > > > done in. It may have been propagated from another namespace or (although
> > > > > I don't think it works yet) bound from another container using the
> > > > > volumes-from docker option.
> > > > 
> > > > Name-id mappings should be associated with the superblock, I guess--so
> > > > don't you store a pointer to the right thing there?
> > > 
> > > Quite possibly but my original point was, without an acceptable
> > > mechanism to execute the helper we can't know what might need to be done
> > > to use it.
> > 
> > At least for me it would be easier to review if it came with at least
> > one example user.
> 
> Haven't seen any negative responses but perhaps people are still away
> over Xmas.
> 
> In the mean time it's probably a good idea to add some use cases to the
> series in case the approach is OK.
> 
> I'll have a look at the nfsd code and see if I can spot the places.

On the nfsd side it's just the one call_usermodehelper in
fs/nfsd/nfs4recover.c.  The tricky part is figuring out where the
namespace information should come from.

--b.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-21 14:38               ` J. Bruce Fields
@ 2015-01-22  1:28                 ` Ian Kent
  2015-02-18 20:44                   ` J. Bruce Fields
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Kent @ 2015-01-22  1:28 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

On Wed, 2015-01-21 at 09:38 -0500, J. Bruce Fields wrote:
> On Wed, Jan 21, 2015 at 03:05:25PM +0800, Ian Kent wrote:
> > On Fri, 2015-01-16 at 10:25 -0500, J. Bruce Fields wrote:
> > > On Fri, Jan 16, 2015 at 09:01:13AM +0800, Ian Kent wrote:
> > > > On Thu, 2015-01-15 at 11:27 -0500, J. Bruce Fields wrote:
> > > > > On Thu, Jan 15, 2015 at 08:26:12AM +0800, Ian Kent wrote:
> > > > > > On Wed, 2015-01-14 at 17:10 -0500, J. Bruce Fields wrote:
> > > > > > > > On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> > > > > > > > > There are other difficulties to tackle as well, such as how to decide
> > > > > > > > > if contained helper execution is needed. For example, if a mount has
> > > > > > > > > been propagated to a container or bound into the container tree (such
> > > > > > > > > as with the --volume option of "docker run") the root init namespace
> > > > > > > > > may need to be used and not the container namespace.
> > > > > > > 
> > > > > > > I think you have to go through each of the existing upcall examples and
> > > > > > > decide what's needed for each.
> > > > > > > 
> > > > > > > At least for the nfsv4 idmapper I would've thought the namespace the
> > > > > > > mount was done in would be the right choice, hence my previous question.
> > > > > > 
> > > > > > Probably but you don't necessarily know what namespace the mount was
> > > > > > done in. It may have been propagated from another namespace or (although
> > > > > > I don't think it works yet) bound from another container using the
> > > > > > volumes-from docker option.
> > > > > 
> > > > > Name-id mappings should be associated with the superblock, I guess--so
> > > > > don't you store a pointer to the right thing there?
> > > > 
> > > > Quite possibly but my original point was, without an acceptable
> > > > mechanism to execute the helper we can't know what might need to be done
> > > > to use it.
> > > 
> > > At least for me it would be easier to review if it came with at least
> > > one example user.
> > 
> > Haven't seen any negative responses but perhaps people are still away
> > over Xmas.
> > 
> > In the mean time it's probably a good idea to add some use cases to the
> > series in case the approach is OK.
> > 
> > I'll have a look at the nfsd code and see if I can spot the places.
> 
> On the nfsd side it's just the one call_usermodehelper in
> fs/nfsd/nfs4recover.c.  The tricky part is figuring out where the
> namespace information should come from.

I had a look at the nfsd code but haven't looked at the nfsdcltrack to
see what it does and if it expects anything that wouldn't be available.
There's also the assumption that the external application is present
within the container filesystem at the same location.

The whole idea of the current approach is that the namespace information
comes from the init process of the container it's executing in.

I believe that if nfsd is running in a container it should be able to
function entirely within the container, except for the callback issue.

I see there's a check in the callback init function to see if the net
namespace in use is the root net namespace. It looks like that check
would be enough to determine that container execution is needed.

Assuming that the various locations all contain the same struct net,
such as is stored in (struct nfs4_client) clp, then it's probably enough
to change nfsd4_umh_cltrack_upcall() to also take net as a parameter. 

Then the same check (which would be removed) used in the init function
could be used to determine if the UMH_USE_NS flag needs to be passed to
call_usermodehelper().

Passing the UMH_USE_NS flag to call_usermodehelper() will cause the
helper to be executed within the init namespace (including the net
namespace) of the container.

If that is what's needed then it might be sensible to change
nfsd4_umh_cltrack_upcall() to take a structure containing parameters to
keep it clean.

I could create patches to demonstrate the procedure but we probably
should keep that discussion separate from this one for the moment.

Ian


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/5] Second attempt at contained helper execution
  2015-01-22  1:28                 ` Ian Kent
@ 2015-02-18 20:44                   ` J. Bruce Fields
  0 siblings, 0 replies; 18+ messages in thread
From: J. Bruce Fields @ 2015-02-18 20:44 UTC (permalink / raw)
  To: Ian Kent
  Cc: Kernel Mailing List, David Howells, Oleg Nesterov,
	Trond Myklebust, Benjamin Coddington, Al Viro, Jeff Layton,
	Eric W. Biederman

On Thu, Jan 22, 2015 at 09:28:45AM +0800, Ian Kent wrote:
> On Wed, 2015-01-21 at 09:38 -0500, J. Bruce Fields wrote:
> > On Wed, Jan 21, 2015 at 03:05:25PM +0800, Ian Kent wrote:
> > > On Fri, 2015-01-16 at 10:25 -0500, J. Bruce Fields wrote:
> > > > On Fri, Jan 16, 2015 at 09:01:13AM +0800, Ian Kent wrote:
> > > > > On Thu, 2015-01-15 at 11:27 -0500, J. Bruce Fields wrote:
> > > > > > On Thu, Jan 15, 2015 at 08:26:12AM +0800, Ian Kent wrote:
> > > > > > > On Wed, 2015-01-14 at 17:10 -0500, J. Bruce Fields wrote:
> > > > > > > > > On Wed, Jan 14, 2015 at 05:32:22PM +0800, Ian Kent wrote:
> > > > > > > > > > There are other difficulties to tackle as well, such as how to decide
> > > > > > > > > > if contained helper execution is needed. For example, if a mount has
> > > > > > > > > > been propagated to a container or bound into the container tree (such
> > > > > > > > > > as with the --volume option of "docker run") the root init namespace
> > > > > > > > > > may need to be used and not the container namespace.
> > > > > > > > 
> > > > > > > > I think you have to go through each of the existing upcall examples and
> > > > > > > > decide what's needed for each.
> > > > > > > > 
> > > > > > > > At least for the nfsv4 idmapper I would've thought the namespace the
> > > > > > > > mount was done in would be the right choice, hence my previous question.
> > > > > > > 
> > > > > > > Probably but you don't necessarily know what namespace the mount was
> > > > > > > done in. It may have been propagated from another namespace or (although
> > > > > > > I don't think it works yet) bound from another container using the
> > > > > > > volumes-from docker option.
> > > > > > 
> > > > > > Name-id mappings should be associated with the superblock, I guess--so
> > > > > > don't you store a pointer to the right thing there?
> > > > > 
> > > > > Quite possibly but my original point was, without an acceptable
> > > > > mechanism to execute the helper we can't know what might need to be done
> > > > > to use it.
> > > > 
> > > > At least for me it would be easier to review if it came with at least
> > > > one example user.
> > > 
> > > Haven't seen any negative responses but perhaps people are still away
> > > over Xmas.
> > > 
> > > In the mean time it's probably a good idea to add some use cases to the
> > > series in case the approach is OK.
> > > 
> > > I'll have a look at the nfsd code and see if I can spot the places.
> > 
> > On the nfsd side it's just the one call_usermodehelper in
> > fs/nfsd/nfs4recover.c.  The tricky part is figuring out where the
> > namespace information should come from.
> 
> I had a look at the nfsd code but haven't looked at the nfsdcltrack to
> see what it does and if it expects anything that wouldn't be available.
> There's also the assumption that the external application is present
> within the container filesystem at the same location.
> 
> The whole idea of the current approach is that the namespace information
> comes from the init process of the container it's executing in.

Note "it's executing in" doesn't make much sense for an nfsd kernel
thread.  What typically matters the rpc that the thread is current
handling, and the network interface that that rpc arrived over.

> I believe that if nfsd is running in a container it should be able to
> function entirely within the container, except for the callback issue.

nfsd has a common pool of threads which may handle requests associated
with multiple containers.

> I see there's a check in the callback init function to see if the net
> namespace in use is the root net namespace. It looks like that check
> would be enough to determine that container execution is needed.

Yes, this code is actually called from a process, the one that starts
nfsd.  Could we take a reference on the correct init task here and use
that throughout?  That would get us the right behavior.

--b.

> 
> Assuming that the various locations all contain the same struct net,
> such as is stored in (struct nfs4_client) clp, then it's probably enough
> to change nfsd4_umh_cltrack_upcall() to also take net as a parameter. 
> 
> Then the same check (which would be removed) used in the init function
> could be used to determine if the UMH_USE_NS flag needs to be passed to
> call_usermodehelper().
> 
> Passing the UMH_USE_NS flag to call_usermodehelper() will cause the
> helper to be executed within the init namespace (including the net
> namespace) of the container.
> 
> If that is what's needed then it might be sensible to change
> nfsd4_umh_cltrack_upcall() to take a structure containing parameters to
> keep it clean.
> 
> I could create patches to demonstrate the procedure but we probably
> should keep that discussion separate from this one for the moment.
> 
> Ian
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-02-18 20:45 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-14  9:32 [RFC PATCH 0/5] Second attempt at contained helper execution Ian Kent
2015-01-14  9:32 ` [RFC PATCH 1/5] nsproxy - refactor setns() Ian Kent
2015-01-14  9:32 ` [RFC PATCH 2/5] kmod - rename call_usermodehelper() flags parameter Ian Kent
2015-01-14  9:32 ` [RFC PATCH 3/5] kmod - teach call_usermodehelper() to use a namespace Ian Kent
2015-01-15 16:45   ` Jeff Layton
2015-01-16  1:18     ` Ian Kent
2015-01-14  9:32 ` [RFC PATCH 4/5] KEYS - rename call_usermodehelper_keys() flags parameter Ian Kent
2015-01-14  9:32 ` [RFC PATCH 5/5] KEYS: exec request-key within the requesting task's init namespace Ian Kent
2015-01-14 21:55 ` [RFC PATCH 0/5] Second attempt at contained helper execution J. Bruce Fields
2015-01-14 22:10   ` J. Bruce Fields
2015-01-15  0:26     ` Ian Kent
2015-01-15 16:27       ` J. Bruce Fields
2015-01-16  1:01         ` Ian Kent
2015-01-16 15:25           ` J. Bruce Fields
2015-01-21  7:05             ` Ian Kent
2015-01-21 14:38               ` J. Bruce Fields
2015-01-22  1:28                 ` Ian Kent
2015-02-18 20:44                   ` J. Bruce Fields

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).