All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC v5] pidns: introduce syscall translate_pid
@ 2018-04-04 19:11 Konstantin Khlebnikov
  2018-04-04 20:31 ` Nagarathnam Muthusamy
  0 siblings, 1 reply; 20+ messages in thread
From: Konstantin Khlebnikov @ 2018-04-04 19:11 UTC (permalink / raw)
  To: linux-api, linux-kernel
  Cc: Jann Horn, Serge Hallyn, Oleg Nesterov, Andy Lutomirski,
	Nagarathnam Muthusamy, Eric W. Biederman, Prakash Sangappa,
	Andrew Morton

Each process have different pids, one for each pid namespace it belongs.
When interaction happens within single pid-ns translation isn't required.
More complicated scenarios needs special handling.

For example:
- reading pid-files or logs written inside container with pid namespace
- attaching with ptrace to tasks from different pid namespace
- passing pids across pid namespaces in any kind of API

Currently there are several interfaces that could be used here:

Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.

Pids for nested Pid namespaces are shown in file /proc/[pid]/status.
In some cases conversion pid -> vpid could be easily done using this
information, but backward translation requires scanning all tasks.

Unix socket automatically translates pid attached to SCM_CREDENTIALS.
This requires CAP_SYS_ADMIN for sending arbitrary pids and entering
into pid namespace, this expose process and could be insecure.

This patch adds new syscall for converting pids between pid namespaces:

pid_t translate_pid(pid_t pid, int source_type, int source,
                               int target_type, int target);

@source_type and @target_type defines type of following arguments:

TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument is unused
TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
TRANSLATE_PID_FD_PIDNS       - pidns fd, argument is file descriptor

Syscall returns pid in target pid-ns or zero if task have no pid there.

Error codes:
-EINVAL   - @source or @target couldn't be resolved into pid namespace
-ESRCH    - task with @pid is not found in @source pid-namespace

Other pid namespaces are referenced either by pid of any process who
lives inside it or by file descriptor pointing to /proc/[pid]/ns/pid.
Latter method provides better protection against races but in some
cases requires CAP_SYS_PTRACE.

Translate_pid could breach pid isolation and return pids from outer pid
namespaces iff process already has file descriptor pointing to them.


Examples:

- get pid in current pid namespace

translate_pid(pid, TRANSLATE_PID_FD_PIDNS, ns_fd,
                   TRANSLATE_PID_CURRENT_PIDNS, 0)
or
translate_pid(pid, TRANSLATE_PID_TASK_PIDNS, ns_pid,
                   TRANSLATE_PID_CURRENT_PIDNS, 0)

- get pid in other pid namespace

translate_pid(pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
                   TRANSLATE_PID_FD_PIDNS, ns_fd)
or
translate_pid(pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
                   TRANSLATE_PID_TASK_PIDNS, ns_pid)

- get deepest pid

translate_pid(pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
                   TRANSLATE_PID_TASK_PIDNS, pid)

- get pid of init task for namespace

translate_pid(1, TRANSLATE_PID_FD_PIDNS, ns_fd,
                 TRANSLATE_PID_CURRENT_PIDNS, 0)


This syscall also could be used for checking topology of pid namespaces:

- ns1 nests inside ns2

translate_pid(1, TRANSLATE_PID_FD_PIDNS, ns1_fd,
                 TRANSLATE_PID_FD_PIDNS, ns2_fd) > 1

- task1 lives in same pid-namespace as task2

translate_pid(1, TRANSLATE_PID_TASK_PIDNS, task1_pid,
                 TRANSLATE_PID_TASK_PIDNS, task2_pid) == 1

- task1 is isolated from task2

translate_pid(task1_pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
                         TRANSLATE_PID_TASK_PIDNS, task2_pid) == 0

- pid is reachable from ns

translate_pid(pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
                   TRANSLATE_PID_FD_PIDNS, ns_fd) > 0

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

---

v1: https://lkml.org/lkml/2015/9/15/411
v2: https://lkml.org/lkml/2015/9/24/278
 * use namespace-fd as second/third argument
 * add -pid for getting parent pid
 * move code into kernel/sys.c next to getppid
 * drop ifdef CONFIG_PID_NS
 * add generic syscall
v3: https://lkml.org/lkml/2015/9/28/3
 * use proc_ns_fdget()
 * update description
 * rebase to next-20150925
 * fix conflict with mlock2
v4: https://lkml.org/lkml/2017/10/16/852
 * rename into translate_pid()
 * remove syscall if CONFIG_PID_NS=n
 * drop -pid for parent task
 * drop fget-fdget optimizations
 * add helper get_pid_ns_by_fd()
 * wire only into x86
v5:
 * rewrite commit message
 * resolve pidns by task pid or by pidns fd
 * add arguments source_type and target_type

--- sample tool translate_pid.c ---

#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sched.h>
#include <fcntl.h>
#include <err.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

#ifndef SYS_translate_pid
#ifdef __x86_64__
#define SYS_translate_pid 333
#endif
#endif

#ifndef TRANSLATE_PID_CURRENT_PIDNS
#define TRANSLATE_PID_CURRENT_PIDNS	0
#define TRANSLATE_PID_TASK_PIDNS	1
#define TRANSLATE_PID_FD_PIDNS		2
#endif

pid_t translate_pid(pid_t pid, int source_type, int source,
			       int target_type, int target) {
	return syscall(SYS_translate_pid, pid, source_type, source,
					       target_type, target);
}

int main(int argc, char **argv) {
	int pid, source, target;
	char buf[64];

	if (argc != 4)
		errx(1, "usage: %s <pid> <source> <traget>", argv[0]);

	pid = atoi(argv[1]);
	int source_type, target_type;
	source = atoi(argv[2]);
	target = atoi(argv[3]);

	if (source < 0) {
		source_type = TRANSLATE_PID_TASK_PIDNS;
		source = -source;
	} else if (source > 0) {
		source_type = TRANSLATE_PID_FD_PIDNS;
		sprintf(buf, "/proc/%d/ns/pid", source);
		source = open(buf, O_RDONLY);
		if (source < 0)
			err(2, "open source %s", buf);
	} else {
		source_type = TRANSLATE_PID_CURRENT_PIDNS;
	}

	if (target < 0) {
		target_type = TRANSLATE_PID_TASK_PIDNS;
		target = -target;
	} else if (target > 0) {
		target_type = TRANSLATE_PID_FD_PIDNS;
		sprintf(buf, "/proc/%d/ns/pid", target);
		target = open(buf, O_RDONLY);
		if (target < 0)
			err(2, "open target %s", buf);
	} else {
		target_type = TRANSLATE_PID_CURRENT_PIDNS;
	}

	pid = translate_pid(pid, source_type, source, target_type, target);
	if (pid < 0)
		err(2, "translate");

	printf("%d\n", pid);
	return 0;
}

---
---
 arch/x86/entry/syscalls/syscall_32.tbl |    1 +
 arch/x86/entry/syscalls/syscall_64.tbl |    1 +
 include/linux/syscalls.h               |    4 ++
 include/uapi/linux/sched.h             |    7 ++++
 kernel/pid_namespace.c                 |   64 ++++++++++++++++++++++++++++++++
 kernel/sys_ni.c                        |    3 ++
 6 files changed, 80 insertions(+)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index c58f75b088c5..aef52c709845 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -391,3 +391,4 @@
 382	i386	pkey_free		sys_pkey_free
 383	i386	statx			sys_statx
 384	i386	arch_prctl		sys_arch_prctl			compat_sys_arch_prctl
+385	i386	translate_pid		sys_translate_pid
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 5aef183e2f85..1ebdab83c6f4 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -339,6 +339,7 @@
 330	common	pkey_alloc		sys_pkey_alloc
 331	common	pkey_free		sys_pkey_free
 332	common	statx			sys_statx
+333	common	translate_pid		sys_translate_pid
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b961184f597a..d189a1f61160 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -553,6 +553,10 @@ asmlinkage long sys_clock_nanosleep(clockid_t which_clock, int flags,
 /* kernel/printk.c */
 asmlinkage long sys_syslog(int type, char __user *buf, int len);
 
+/* kernel/pid_namespace.c */
+asmlinkage long sys_translate_pid(pid_t pid, int source_type, int source,
+				  int target_type, int target);
+
 /* kernel/ptrace.c */
 asmlinkage long sys_ptrace(long request, long pid, unsigned long addr,
 			   unsigned long data);
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 22627f80063e..7c45fd8d33d7 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -55,4 +55,11 @@
 			 SCHED_FLAG_RECLAIM		| \
 			 SCHED_FLAG_DL_OVERRUN)
 
+/*
+ * For translate_pid()
+ */
+#define TRANSLATE_PID_CURRENT_PIDNS	0	/* Current pid namespace */
+#define TRANSLATE_PID_TASK_PIDNS	1	/* Namespace by task pid */
+#define TRANSLATE_PID_FD_PIDNS		2	/* Namespace by pidns fd */
+
 #endif /* _UAPI_LINUX_SCHED_H */
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index 2a2ac53d8b8b..84c8b47289d5 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -13,6 +13,7 @@
 #include <linux/user_namespace.h>
 #include <linux/syscalls.h>
 #include <linux/cred.h>
+#include <linux/file.h>
 #include <linux/err.h>
 #include <linux/acct.h>
 #include <linux/slab.h>
@@ -380,6 +381,69 @@ static void pidns_put(struct ns_common *ns)
 	put_pid_ns(to_pid_ns(ns));
 }
 
+/* Under rcu_read_lock(). Returns pointer to pid_namespace or NULL. */
+static struct pid_namespace *resolve_pid_ns(int type, int fd_or_pid)
+{
+	struct pid_namespace *current_ns = task_active_pid_ns(current);
+	struct pid_namespace *pidns = NULL;
+	struct ns_common *ns;
+	struct file *file;
+
+	switch (type) {
+	case TRANSLATE_PID_CURRENT_PIDNS:
+		pidns = current_ns;
+		break;
+	case TRANSLATE_PID_TASK_PIDNS:
+		pidns = ns_of_pid(find_pid_ns(fd_or_pid, current_ns));
+		break;
+	case TRANSLATE_PID_FD_PIDNS:
+		file = proc_ns_fget(fd_or_pid);
+		if (!IS_ERR(file)) {
+			ns = get_proc_ns(file_inode(file));
+			if (ns->ops->type == CLONE_NEWPID)
+				pidns = to_pid_ns(ns);
+			fput(file);
+		}
+		break;
+	}
+
+	return pidns;
+}
+
+/*
+ * translate_pid - convert pid in source pid-ns into target pid-ns.
+ * @pid:    pid for translation
+ * @source_type: one of TRANSLATE_PID_*
+ * @source: depending on @source_type pid-ns fd, pid, or nothing
+ * @target_type: one of TRANSLATE_PID_*
+ * @target: depending on @target_type pid-ns fd, pid, or nothing
+ *
+ * Returns pid in @target pid-ns, zero if task have no pid there,
+ * or -ESRCH if task with @pid does not found in @source pid-ns,
+ * or -EINVAL if @source or @target couldn't be resolved into pid-ns.
+ */
+SYSCALL_DEFINE5(translate_pid, pid_t, pid,
+		int, source_type, int, source,
+		int, target_type, int, target)
+{
+	struct pid_namespace *source_ns, *target_ns;
+	struct pid *struct_pid;
+	pid_t result = -EINVAL;
+
+	rcu_read_lock();
+	source_ns = resolve_pid_ns(source_type, source);
+	if (!source_ns)
+		goto out;
+	target_ns = resolve_pid_ns(target_type, target);
+	if (!target_ns)
+		goto out;
+	struct_pid = find_pid_ns(pid, source_ns);
+	result = struct_pid ? pid_nr_ns(struct_pid, target_ns) : -ESRCH;
+out:
+	rcu_read_unlock();
+	return result;
+}
+
 static int pidns_install(struct nsproxy *nsproxy, struct ns_common *ns)
 {
 	struct pid_namespace *active = task_active_pid_ns(current);
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 6cafc008f6db..777689bce406 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -146,6 +146,9 @@ COND_SYSCALL(delete_module);
 /* kernel/printk.c */
 COND_SYSCALL(syslog);
 
+/* kernel/pid_namespace.c */
+COND_SYSCALL(sys_translate_pid);
+
 /* kernel/ptrace.c */
 
 /* kernel/sched/core.c */


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-04-04 19:11 [PATCH RFC v5] pidns: introduce syscall translate_pid Konstantin Khlebnikov
@ 2018-04-04 20:31 ` Nagarathnam Muthusamy
  2018-04-04 22:29   ` Eric W. Biederman
  0 siblings, 1 reply; 20+ messages in thread
From: Nagarathnam Muthusamy @ 2018-04-04 20:31 UTC (permalink / raw)
  To: Konstantin Khlebnikov, linux-api, linux-kernel
  Cc: Jann Horn, Serge Hallyn, Oleg Nesterov, Andy Lutomirski,
	Eric W. Biederman, Prakash Sangappa, Andrew Morton



On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
> Each process have different pids, one for each pid namespace it belongs.
> When interaction happens within single pid-ns translation isn't required.
> More complicated scenarios needs special handling.
>
> For example:
> - reading pid-files or logs written inside container with pid namespace
> - attaching with ptrace to tasks from different pid namespace
> - passing pids across pid namespaces in any kind of API
>
> Currently there are several interfaces that could be used here:
>
> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.
>
> Pids for nested Pid namespaces are shown in file /proc/[pid]/status.
> In some cases conversion pid -> vpid could be easily done using this
> information, but backward translation requires scanning all tasks.
>
> Unix socket automatically translates pid attached to SCM_CREDENTIALS.
> This requires CAP_SYS_ADMIN for sending arbitrary pids and entering
> into pid namespace, this expose process and could be insecure.
>
> This patch adds new syscall for converting pids between pid namespaces:
>
> pid_t translate_pid(pid_t pid, int source_type, int source,
>                                 int target_type, int target);
>
> @source_type and @target_type defines type of following arguments:
>
> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument is unused
> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid

I believe using pid to represent the namespace has been already
discussed in V1 of this patch in https://lkml.org/lkml/2015/9/22/1087
after which we moved on to fd based version of this interface.

-Nagarathnam.

> TRANSLATE_PID_FD_PIDNS       - pidns fd, argument is file descriptor
>
> Syscall returns pid in target pid-ns or zero if task have no pid there.
>
> Error codes:
> -EINVAL   - @source or @target couldn't be resolved into pid namespace
> -ESRCH    - task with @pid is not found in @source pid-namespace
>
> Other pid namespaces are referenced either by pid of any process who
> lives inside it or by file descriptor pointing to /proc/[pid]/ns/pid.
> Latter method provides better protection against races but in some
> cases requires CAP_SYS_PTRACE.
>
> Translate_pid could breach pid isolation and return pids from outer pid
> namespaces iff process already has file descriptor pointing to them.
>
>
> Examples:
>
> - get pid in current pid namespace
>
> translate_pid(pid, TRANSLATE_PID_FD_PIDNS, ns_fd,
>                     TRANSLATE_PID_CURRENT_PIDNS, 0)
> or
> translate_pid(pid, TRANSLATE_PID_TASK_PIDNS, ns_pid,
>                     TRANSLATE_PID_CURRENT_PIDNS, 0)
>
> - get pid in other pid namespace
>
> translate_pid(pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
>                     TRANSLATE_PID_FD_PIDNS, ns_fd)
> or
> translate_pid(pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
>                     TRANSLATE_PID_TASK_PIDNS, ns_pid)
>
> - get deepest pid
>
> translate_pid(pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
>                     TRANSLATE_PID_TASK_PIDNS, pid)
>
> - get pid of init task for namespace
>
> translate_pid(1, TRANSLATE_PID_FD_PIDNS, ns_fd,
>                   TRANSLATE_PID_CURRENT_PIDNS, 0)
>
>
> This syscall also could be used for checking topology of pid namespaces:
>
> - ns1 nests inside ns2
>
> translate_pid(1, TRANSLATE_PID_FD_PIDNS, ns1_fd,
>                   TRANSLATE_PID_FD_PIDNS, ns2_fd) > 1
>
> - task1 lives in same pid-namespace as task2
>
> translate_pid(1, TRANSLATE_PID_TASK_PIDNS, task1_pid,
>                   TRANSLATE_PID_TASK_PIDNS, task2_pid) == 1
>
> - task1 is isolated from task2
>
> translate_pid(task1_pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
>                           TRANSLATE_PID_TASK_PIDNS, task2_pid) == 0
>
> - pid is reachable from ns
>
> translate_pid(pid, TRANSLATE_PID_CURRENT_PIDNS, 0,
>                     TRANSLATE_PID_FD_PIDNS, ns_fd) > 0
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>
> ---
>
> v1: https://lkml.org/lkml/2015/9/15/411
> v2: https://lkml.org/lkml/2015/9/24/278
>   * use namespace-fd as second/third argument
>   * add -pid for getting parent pid
>   * move code into kernel/sys.c next to getppid
>   * drop ifdef CONFIG_PID_NS
>   * add generic syscall
> v3: https://lkml.org/lkml/2015/9/28/3
>   * use proc_ns_fdget()
>   * update description
>   * rebase to next-20150925
>   * fix conflict with mlock2
> v4: https://lkml.org/lkml/2017/10/16/852
>   * rename into translate_pid()
>   * remove syscall if CONFIG_PID_NS=n
>   * drop -pid for parent task
>   * drop fget-fdget optimizations
>   * add helper get_pid_ns_by_fd()
>   * wire only into x86
> v5:
>   * rewrite commit message
>   * resolve pidns by task pid or by pidns fd
>   * add arguments source_type and target_type
>
> --- sample tool translate_pid.c ---
>
> #define _GNU_SOURCE
> #include <sys/syscall.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <sched.h>
> #include <fcntl.h>
> #include <err.h>
> #include <unistd.h>
> #include <stdlib.h>
> #include <stdio.h>
>
> #ifndef SYS_translate_pid
> #ifdef __x86_64__
> #define SYS_translate_pid 333
> #endif
> #endif
>
> #ifndef TRANSLATE_PID_CURRENT_PIDNS
> #define TRANSLATE_PID_CURRENT_PIDNS	0
> #define TRANSLATE_PID_TASK_PIDNS	1
> #define TRANSLATE_PID_FD_PIDNS		2
> #endif
>
> pid_t translate_pid(pid_t pid, int source_type, int source,
> 			       int target_type, int target) {
> 	return syscall(SYS_translate_pid, pid, source_type, source,
> 					       target_type, target);
> }
>
> int main(int argc, char **argv) {
> 	int pid, source, target;
> 	char buf[64];
>
> 	if (argc != 4)
> 		errx(1, "usage: %s <pid> <source> <traget>", argv[0]);
>
> 	pid = atoi(argv[1]);
> 	int source_type, target_type;
> 	source = atoi(argv[2]);
> 	target = atoi(argv[3]);
>
> 	if (source < 0) {
> 		source_type = TRANSLATE_PID_TASK_PIDNS;
> 		source = -source;
> 	} else if (source > 0) {
> 		source_type = TRANSLATE_PID_FD_PIDNS;
> 		sprintf(buf, "/proc/%d/ns/pid", source);
> 		source = open(buf, O_RDONLY);
> 		if (source < 0)
> 			err(2, "open source %s", buf);
> 	} else {
> 		source_type = TRANSLATE_PID_CURRENT_PIDNS;
> 	}
>
> 	if (target < 0) {
> 		target_type = TRANSLATE_PID_TASK_PIDNS;
> 		target = -target;
> 	} else if (target > 0) {
> 		target_type = TRANSLATE_PID_FD_PIDNS;
> 		sprintf(buf, "/proc/%d/ns/pid", target);
> 		target = open(buf, O_RDONLY);
> 		if (target < 0)
> 			err(2, "open target %s", buf);
> 	} else {
> 		target_type = TRANSLATE_PID_CURRENT_PIDNS;
> 	}
>
> 	pid = translate_pid(pid, source_type, source, target_type, target);
> 	if (pid < 0)
> 		err(2, "translate");
>
> 	printf("%d\n", pid);
> 	return 0;
> }
>
> ---
> ---
>   arch/x86/entry/syscalls/syscall_32.tbl |    1 +
>   arch/x86/entry/syscalls/syscall_64.tbl |    1 +
>   include/linux/syscalls.h               |    4 ++
>   include/uapi/linux/sched.h             |    7 ++++
>   kernel/pid_namespace.c                 |   64 ++++++++++++++++++++++++++++++++
>   kernel/sys_ni.c                        |    3 ++
>   6 files changed, 80 insertions(+)
>
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index c58f75b088c5..aef52c709845 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -391,3 +391,4 @@
>   382	i386	pkey_free		sys_pkey_free
>   383	i386	statx			sys_statx
>   384	i386	arch_prctl		sys_arch_prctl			compat_sys_arch_prctl
> +385	i386	translate_pid		sys_translate_pid
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index 5aef183e2f85..1ebdab83c6f4 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -339,6 +339,7 @@
>   330	common	pkey_alloc		sys_pkey_alloc
>   331	common	pkey_free		sys_pkey_free
>   332	common	statx			sys_statx
> +333	common	translate_pid		sys_translate_pid
>   
>   #
>   # x32-specific system call numbers start at 512 to avoid cache impact
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index b961184f597a..d189a1f61160 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -553,6 +553,10 @@ asmlinkage long sys_clock_nanosleep(clockid_t which_clock, int flags,
>   /* kernel/printk.c */
>   asmlinkage long sys_syslog(int type, char __user *buf, int len);
>   
> +/* kernel/pid_namespace.c */
> +asmlinkage long sys_translate_pid(pid_t pid, int source_type, int source,
> +				  int target_type, int target);
> +
>   /* kernel/ptrace.c */
>   asmlinkage long sys_ptrace(long request, long pid, unsigned long addr,
>   			   unsigned long data);
> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> index 22627f80063e..7c45fd8d33d7 100644
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -55,4 +55,11 @@
>   			 SCHED_FLAG_RECLAIM		| \
>   			 SCHED_FLAG_DL_OVERRUN)
>   
> +/*
> + * For translate_pid()
> + */
> +#define TRANSLATE_PID_CURRENT_PIDNS	0	/* Current pid namespace */
> +#define TRANSLATE_PID_TASK_PIDNS	1	/* Namespace by task pid */
> +#define TRANSLATE_PID_FD_PIDNS		2	/* Namespace by pidns fd */
> +
>   #endif /* _UAPI_LINUX_SCHED_H */
> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> index 2a2ac53d8b8b..84c8b47289d5 100644
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -13,6 +13,7 @@
>   #include <linux/user_namespace.h>
>   #include <linux/syscalls.h>
>   #include <linux/cred.h>
> +#include <linux/file.h>
>   #include <linux/err.h>
>   #include <linux/acct.h>
>   #include <linux/slab.h>
> @@ -380,6 +381,69 @@ static void pidns_put(struct ns_common *ns)
>   	put_pid_ns(to_pid_ns(ns));
>   }
>   
> +/* Under rcu_read_lock(). Returns pointer to pid_namespace or NULL. */
> +static struct pid_namespace *resolve_pid_ns(int type, int fd_or_pid)
> +{
> +	struct pid_namespace *current_ns = task_active_pid_ns(current);
> +	struct pid_namespace *pidns = NULL;
> +	struct ns_common *ns;
> +	struct file *file;
> +
> +	switch (type) {
> +	case TRANSLATE_PID_CURRENT_PIDNS:
> +		pidns = current_ns;
> +		break;
> +	case TRANSLATE_PID_TASK_PIDNS:
> +		pidns = ns_of_pid(find_pid_ns(fd_or_pid, current_ns));
> +		break;
> +	case TRANSLATE_PID_FD_PIDNS:
> +		file = proc_ns_fget(fd_or_pid);
> +		if (!IS_ERR(file)) {
> +			ns = get_proc_ns(file_inode(file));
> +			if (ns->ops->type == CLONE_NEWPID)
> +				pidns = to_pid_ns(ns);
> +			fput(file);
> +		}
> +		break;
> +	}
> +
> +	return pidns;
> +}
> +
> +/*
> + * translate_pid - convert pid in source pid-ns into target pid-ns.
> + * @pid:    pid for translation
> + * @source_type: one of TRANSLATE_PID_*
> + * @source: depending on @source_type pid-ns fd, pid, or nothing
> + * @target_type: one of TRANSLATE_PID_*
> + * @target: depending on @target_type pid-ns fd, pid, or nothing
> + *
> + * Returns pid in @target pid-ns, zero if task have no pid there,
> + * or -ESRCH if task with @pid does not found in @source pid-ns,
> + * or -EINVAL if @source or @target couldn't be resolved into pid-ns.
> + */
> +SYSCALL_DEFINE5(translate_pid, pid_t, pid,
> +		int, source_type, int, source,
> +		int, target_type, int, target)
> +{
> +	struct pid_namespace *source_ns, *target_ns;
> +	struct pid *struct_pid;
> +	pid_t result = -EINVAL;
> +
> +	rcu_read_lock();
> +	source_ns = resolve_pid_ns(source_type, source);
> +	if (!source_ns)
> +		goto out;
> +	target_ns = resolve_pid_ns(target_type, target);
> +	if (!target_ns)
> +		goto out;
> +	struct_pid = find_pid_ns(pid, source_ns);
> +	result = struct_pid ? pid_nr_ns(struct_pid, target_ns) : -ESRCH;
> +out:
> +	rcu_read_unlock();
> +	return result;
> +}
> +
>   static int pidns_install(struct nsproxy *nsproxy, struct ns_common *ns)
>   {
>   	struct pid_namespace *active = task_active_pid_ns(current);
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index 6cafc008f6db..777689bce406 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -146,6 +146,9 @@ COND_SYSCALL(delete_module);
>   /* kernel/printk.c */
>   COND_SYSCALL(syslog);
>   
> +/* kernel/pid_namespace.c */
> +COND_SYSCALL(sys_translate_pid);
> +
>   /* kernel/ptrace.c */
>   
>   /* kernel/sched/core.c */
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-04-04 20:31 ` Nagarathnam Muthusamy
@ 2018-04-04 22:29   ` Eric W. Biederman
  2018-04-05  7:02     ` Konstantin Khlebnikov
  0 siblings, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2018-04-04 22:29 UTC (permalink / raw)
  To: Nagarathnam Muthusamy
  Cc: Konstantin Khlebnikov, linux-api, linux-kernel, Jann Horn,
	Serge Hallyn, Oleg Nesterov, Andy Lutomirski, Prakash Sangappa,
	Andrew Morton

Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:

> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>> Each process have different pids, one for each pid namespace it belongs.
>> When interaction happens within single pid-ns translation isn't required.
>> More complicated scenarios needs special handling.
>>
>> For example:
>> - reading pid-files or logs written inside container with pid namespace
>> - attaching with ptrace to tasks from different pid namespace
>> - passing pids across pid namespaces in any kind of API
>>
>> Currently there are several interfaces that could be used here:
>>
>> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.

Using the inode number in interfaces is not an option.  Especially not
withou referencing the device number for the filesystem as well.

>> Pids for nested Pid namespaces are shown in file /proc/[pid]/status.
>> In some cases conversion pid -> vpid could be easily done using this
>> information, but backward translation requires scanning all tasks.
>>
>> Unix socket automatically translates pid attached to SCM_CREDENTIALS.
>> This requires CAP_SYS_ADMIN for sending arbitrary pids and entering
>> into pid namespace, this expose process and could be insecure.
>>
>> This patch adds new syscall for converting pids between pid namespaces:
>>
>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>                                 int target_type, int target);
>>
>> @source_type and @target_type defines type of following arguments:
>>
>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument is unused
>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>
> I believe using pid to represent the namespace has been already
> discussed in V1 of this patch in https://lkml.org/lkml/2015/9/22/1087
> after which we moved on to fd based version of this interface.

Or in short why is the case of pids important?

You Konstantin you almost said why they were important in your message
saying you were going to send this one.  However you don't explain in
your description why you want to identify pid namespaces by pid.

Eric

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-04-04 22:29   ` Eric W. Biederman
@ 2018-04-05  7:02     ` Konstantin Khlebnikov
  2018-04-23 17:37       ` Nagarathnam Muthusamy
  2018-04-27 12:15       ` Michael Kerrisk (man-pages)
  0 siblings, 2 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2018-04-05  7:02 UTC (permalink / raw)
  To: Eric W. Biederman, Nagarathnam Muthusamy
  Cc: linux-api, linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton

On 05.04.2018 01:29, Eric W. Biederman wrote:
> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
> 
>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>> Each process have different pids, one for each pid namespace it belongs.
>>> When interaction happens within single pid-ns translation isn't required.
>>> More complicated scenarios needs special handling.
>>>
>>> For example:
>>> - reading pid-files or logs written inside container with pid namespace
>>> - attaching with ptrace to tasks from different pid namespace
>>> - passing pids across pid namespaces in any kind of API
>>>
>>> Currently there are several interfaces that could be used here:
>>>
>>> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.
> 
> Using the inode number in interfaces is not an option.  Especially not
> withou referencing the device number for the filesystem as well.

This is supposed to be single-instance fs,
not part of proc but referenced but its magic "symlinks".

Device numbers are not mentioned in "man namespaces".

> 
>>> Pids for nested Pid namespaces are shown in file /proc/[pid]/status.
>>> In some cases conversion pid -> vpid could be easily done using this
>>> information, but backward translation requires scanning all tasks.
>>>
>>> Unix socket automatically translates pid attached to SCM_CREDENTIALS.
>>> This requires CAP_SYS_ADMIN for sending arbitrary pids and entering
>>> into pid namespace, this expose process and could be insecure.
>>>
>>> This patch adds new syscall for converting pids between pid namespaces:
>>>
>>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>>                                  int target_type, int target);
>>>
>>> @source_type and @target_type defines type of following arguments:
>>>
>>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument is unused
>>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>>
>> I believe using pid to represent the namespace has been already
>> discussed in V1 of this patch in https://lkml.org/lkml/2015/9/22/1087
>> after which we moved on to fd based version of this interface.
> 
> Or in short why is the case of pids important?
> 
> You Konstantin you almost said why they were important in your message
> saying you were going to send this one.  However you don't explain in
> your description why you want to identify pid namespaces by pid.
> 

Open of /proc/[pid]/ns/pid requires same permissions as ptrace,
pid based variant doesn't have such restrictions.
Most pid-based syscalls are racy in some cases but they are
here for decades and everybody knowns how to deal with it.
So, I've decided to merge both worlds in one interface which clearly tells what to expect.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-04-05  7:02     ` Konstantin Khlebnikov
@ 2018-04-23 17:37       ` Nagarathnam Muthusamy
  2018-04-25  5:36         ` Konstantin Khlebnikov
  2018-04-27 12:15       ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 20+ messages in thread
From: Nagarathnam Muthusamy @ 2018-04-23 17:37 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Eric W. Biederman
  Cc: linux-api, linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton



On 04/05/2018 12:02 AM, Konstantin Khlebnikov wrote:
> On 05.04.2018 01:29, Eric W. Biederman wrote:
>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>
>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>> Each process have different pids, one for each pid namespace it 
>>>> belongs.
>>>> When interaction happens within single pid-ns translation isn't 
>>>> required.
>>>> More complicated scenarios needs special handling.
>>>>
>>>> For example:
>>>> - reading pid-files or logs written inside container with pid 
>>>> namespace
>>>> - attaching with ptrace to tasks from different pid namespace
>>>> - passing pids across pid namespaces in any kind of API
>>>>
>>>> Currently there are several interfaces that could be used here:
>>>>
>>>> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.
>>
>> Using the inode number in interfaces is not an option. Especially not
>> withou referencing the device number for the filesystem as well.
>
> This is supposed to be single-instance fs,
> not part of proc but referenced but its magic "symlinks".
>
> Device numbers are not mentioned in "man namespaces".
>
>>
>>>> Pids for nested Pid namespaces are shown in file /proc/[pid]/status.
>>>> In some cases conversion pid -> vpid could be easily done using this
>>>> information, but backward translation requires scanning all tasks.
>>>>
>>>> Unix socket automatically translates pid attached to SCM_CREDENTIALS.
>>>> This requires CAP_SYS_ADMIN for sending arbitrary pids and entering
>>>> into pid namespace, this expose process and could be insecure.
>>>>
>>>> This patch adds new syscall for converting pids between pid 
>>>> namespaces:
>>>>
>>>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>>>                                  int target_type, int target);
>>>>
>>>> @source_type and @target_type defines type of following arguments:
>>>>
>>>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument is 
>>>> unused
>>>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>>>
>>> I believe using pid to represent the namespace has been already
>>> discussed in V1 of this patch in https://lkml.org/lkml/2015/9/22/1087
>>> after which we moved on to fd based version of this interface.
>>
>> Or in short why is the case of pids important?
>>
>> You Konstantin you almost said why they were important in your message
>> saying you were going to send this one.  However you don't explain in
>> your description why you want to identify pid namespaces by pid.
>>
>
> Open of /proc/[pid]/ns/pid requires same permissions as ptrace,
> pid based variant doesn't have such restrictions.

Can you provide more information on usecase requiring PID translation 
but not used for tracing related purposes?
On a side note, can we have the types TRANSLATE_PID_CURRENT_PIDNS and 
TRANSLATE_PID_FD_PIDNS integrated first and then possibly extend the 
interface to include TRANSLATE_PID_TASK_PIDNS in future?

Thanks,
Nagarathnam.
> Most pid-based syscalls are racy in some cases but they are
> here for decades and everybody knowns how to deal with it.
> So, I've decided to merge both worlds in one interface which clearly 
> tells what to expect.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-04-23 17:37       ` Nagarathnam Muthusamy
@ 2018-04-25  5:36         ` Konstantin Khlebnikov
  2018-05-15 17:19           ` Nagarathnam Muthusamy
  0 siblings, 1 reply; 20+ messages in thread
From: Konstantin Khlebnikov @ 2018-04-25  5:36 UTC (permalink / raw)
  To: Nagarathnam Muthusamy, Eric W. Biederman
  Cc: linux-api, linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton

On 23.04.2018 20:37, Nagarathnam Muthusamy wrote:
> 
> 
> On 04/05/2018 12:02 AM, Konstantin Khlebnikov wrote:
>> On 05.04.2018 01:29, Eric W. Biederman wrote:
>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>
>>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>>> Each process have different pids, one for each pid namespace it belongs.
>>>>> When interaction happens within single pid-ns translation isn't required.
>>>>> More complicated scenarios needs special handling.
>>>>>
>>>>> For example:
>>>>> - reading pid-files or logs written inside container with pid namespace
>>>>> - attaching with ptrace to tasks from different pid namespace
>>>>> - passing pids across pid namespaces in any kind of API
>>>>>
>>>>> Currently there are several interfaces that could be used here:
>>>>>
>>>>> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.
>>>
>>> Using the inode number in interfaces is not an option. Especially not
>>> withou referencing the device number for the filesystem as well.
>>
>> This is supposed to be single-instance fs,
>> not part of proc but referenced but its magic "symlinks".
>>
>> Device numbers are not mentioned in "man namespaces".
>>
>>>
>>>>> Pids for nested Pid namespaces are shown in file /proc/[pid]/status.
>>>>> In some cases conversion pid -> vpid could be easily done using this
>>>>> information, but backward translation requires scanning all tasks.
>>>>>
>>>>> Unix socket automatically translates pid attached to SCM_CREDENTIALS.
>>>>> This requires CAP_SYS_ADMIN for sending arbitrary pids and entering
>>>>> into pid namespace, this expose process and could be insecure.
>>>>>
>>>>> This patch adds new syscall for converting pids between pid namespaces:
>>>>>
>>>>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>>>>                                  int target_type, int target);
>>>>>
>>>>> @source_type and @target_type defines type of following arguments:
>>>>>
>>>>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument is unused
>>>>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>>>>
>>>> I believe using pid to represent the namespace has been already
>>>> discussed in V1 of this patch in https://lkml.org/lkml/2015/9/22/1087
>>>> after which we moved on to fd based version of this interface.
>>>
>>> Or in short why is the case of pids important?
>>>
>>> You Konstantin you almost said why they were important in your message
>>> saying you were going to send this one.  However you don't explain in
>>> your description why you want to identify pid namespaces by pid.
>>>
>>
>> Open of /proc/[pid]/ns/pid requires same permissions as ptrace,
>> pid based variant doesn't have such restrictions.
> 
> Can you provide more information on usecase requiring PID translation but not used for tracing related purposes?

Any introspection for [nested] containers. It's easier to work when you have all information when you don't have any.
For example our CMS https://github.com/yandex/porto allows to start nested sub-container (or even deeper) by request from any container and 
have to tell back which pid task is have. And it could translate any pid inside into accessible by client and vice versa.

> On a side note, can we have the types TRANSLATE_PID_CURRENT_PIDNS and TRANSLATE_PID_FD_PIDNS integrated first and then possibly extend the 
> interface to include TRANSLATE_PID_TASK_PIDNS in future?

I don't see reason for this separation.
Pids and pid namespaces are part of the API for a long time.

> 
> Thanks,
> Nagarathnam.
>> Most pid-based syscalls are racy in some cases but they are
>> here for decades and everybody knowns how to deal with it.
>> So, I've decided to merge both worlds in one interface which clearly tells what to expect.
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-04-05  7:02     ` Konstantin Khlebnikov
  2018-04-23 17:37       ` Nagarathnam Muthusamy
@ 2018-04-27 12:15       ` Michael Kerrisk (man-pages)
  2018-05-31 18:02           ` Eric W. Biederman
  1 sibling, 1 reply; 20+ messages in thread
From: Michael Kerrisk (man-pages) @ 2018-04-27 12:15 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Eric W. Biederman, Nagarathnam Muthusamy
  Cc: mtk.manpages, linux-api, linux-kernel, Jann Horn, Serge Hallyn,
	Oleg Nesterov, Andy Lutomirski, Prakash Sangappa, Andrew Morton

On 04/05/2018 09:02 AM, Konstantin Khlebnikov wrote:
> On 05.04.2018 01:29, Eric W. Biederman wrote:
>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>
>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>> Each process have different pids, one for each pid namespace it belongs.
>>>> When interaction happens within single pid-ns translation isn't required.
>>>> More complicated scenarios needs special handling.
>>>>
>>>> For example:
>>>> - reading pid-files or logs written inside container with pid namespace
>>>> - attaching with ptrace to tasks from different pid namespace
>>>> - passing pids across pid namespaces in any kind of API
>>>>
>>>> Currently there are several interfaces that could be used here:
>>>>
>>>> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.
>>
>> Using the inode number in interfaces is not an option.  Especially not
>> withou referencing the device number for the filesystem as well.
> 
> This is supposed to be single-instance fs,
> not part of proc but referenced but its magic "symlinks".
> 
> Device numbers are not mentioned in "man namespaces".

Thanks for the heads-up!

That was a bug in the man-page. ioctl_ns(2) already says the right thing.
Now I patches namespaces(7), as below.

Cheers,

Michael

diff --git a/man7/namespaces.7 b/man7/namespaces.7
index 725ebaff6..3c155de7e 100644
--- a/man7/namespaces.7
+++ b/man7/namespaces.7
@@ -154,11 +154,14 @@ In Linux 3.7 and earlier, these files were visible as hard links.
 Since Linux 3.8,
 .\" commit bf056bfa80596a5d14b26b17276a56a0dcb080e5
 they appear as symbolic links.
-If two processes are in the same namespace, then the inode numbers of their
+If two processes are in the same namespace,
+then the device IDs and inode numbers of their
 .IR /proc/[pid]/ns/xxx
 symbolic links will be the same; an application can check this using the
+.I stat.st_dev
+and
 .I stat.st_ino
-field returned by
+fields returned by
 .BR stat (2).
 The content of this symbolic link is a string containing
 the namespace type and inode number as in the following example:


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-04-25  5:36         ` Konstantin Khlebnikov
@ 2018-05-15 17:19           ` Nagarathnam Muthusamy
  2018-05-15 17:36             ` Konstantin Khlebnikov
  0 siblings, 1 reply; 20+ messages in thread
From: Nagarathnam Muthusamy @ 2018-05-15 17:19 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Eric W. Biederman
  Cc: linux-api, linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton



On 04/24/2018 10:36 PM, Konstantin Khlebnikov wrote:
> On 23.04.2018 20:37, Nagarathnam Muthusamy wrote:
>>
>>
>> On 04/05/2018 12:02 AM, Konstantin Khlebnikov wrote:
>>> On 05.04.2018 01:29, Eric W. Biederman wrote:
>>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>>
>>>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>>>> Each process have different pids, one for each pid namespace it 
>>>>>> belongs.
>>>>>> When interaction happens within single pid-ns translation isn't 
>>>>>> required.
>>>>>> More complicated scenarios needs special handling.
>>>>>>
>>>>>> For example:
>>>>>> - reading pid-files or logs written inside container with pid 
>>>>>> namespace
>>>>>> - attaching with ptrace to tasks from different pid namespace
>>>>>> - passing pids across pid namespaces in any kind of API
>>>>>>
>>>>>> Currently there are several interfaces that could be used here:
>>>>>>
>>>>>> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.
>>>>
>>>> Using the inode number in interfaces is not an option. Especially not
>>>> withou referencing the device number for the filesystem as well.
>>>
>>> This is supposed to be single-instance fs,
>>> not part of proc but referenced but its magic "symlinks".
>>>
>>> Device numbers are not mentioned in "man namespaces".
>>>
>>>>
>>>>>> Pids for nested Pid namespaces are shown in file /proc/[pid]/status.
>>>>>> In some cases conversion pid -> vpid could be easily done using this
>>>>>> information, but backward translation requires scanning all tasks.
>>>>>>
>>>>>> Unix socket automatically translates pid attached to 
>>>>>> SCM_CREDENTIALS.
>>>>>> This requires CAP_SYS_ADMIN for sending arbitrary pids and entering
>>>>>> into pid namespace, this expose process and could be insecure.
>>>>>>
>>>>>> This patch adds new syscall for converting pids between pid 
>>>>>> namespaces:
>>>>>>
>>>>>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>>>>>                                  int target_type, int target);
>>>>>>
>>>>>> @source_type and @target_type defines type of following arguments:
>>>>>>
>>>>>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument is 
>>>>>> unused
>>>>>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>>>>>
>>>>> I believe using pid to represent the namespace has been already
>>>>> discussed in V1 of this patch in https://lkml.org/lkml/2015/9/22/1087
>>>>> after which we moved on to fd based version of this interface.
>>>>
>>>> Or in short why is the case of pids important?
>>>>
>>>> You Konstantin you almost said why they were important in your message
>>>> saying you were going to send this one.  However you don't explain in
>>>> your description why you want to identify pid namespaces by pid.
>>>>
>>>
>>> Open of /proc/[pid]/ns/pid requires same permissions as ptrace,
>>> pid based variant doesn't have such restrictions.
>>
>> Can you provide more information on usecase requiring PID translation 
>> but not used for tracing related purposes?
>
> Any introspection for [nested] containers. It's easier to work when 
> you have all information when you don't have any.
> For example our CMS https://github.com/yandex/porto allows to start 
> nested sub-container (or even deeper) by request from any container 
> and have to tell back which pid task is have. And it could translate 
> any pid inside into accessible by client and vice versa.
>

I still dont get the exact reason why PID based approach to identify the 
namespace during pid translation process is absolutely required compared 
to fd based approach. From your version of TranslatePid in

https://github.com/yandex/porto/blob/0d7e6e7e1830dcd0038a057b2ab9964cec5b8fab/src/util/unix.cpp

I see that you are going through the trouble of forking a process and 
sending SMC_CREDENTIALS for pid translation. Even your existing API 
could be extremely simplified if translate_pid based on file descriptors 
make it to the gate and I believe from the last discussion it was almost 
there https://patchwork.kernel.org/patch/10305439/


>> On a side note, can we have the types TRANSLATE_PID_CURRENT_PIDNS and 
>> TRANSLATE_PID_FD_PIDNS integrated first and then possibly extend the 
>> interface to include TRANSLATE_PID_TASK_PIDNS in future?
>
> I don't see reason for this separation.
> Pids and pid namespaces are part of the API for a long time.

If you are talking about the translate_pid API proposed, I believe the 
V4 proposed under https://patchwork.kernel.org/patch/10003935/ had only 
fd based API before a mix of PID and fd based is proposed in V5. Again, 
I was just wondering if we can get the FD based approach in first and 
then extend the API to include PID based approach later as fd based 
approach could provide a lot of immediate benefits?

Thanks,
Nagarathnam.
>
>>
>> Thanks,
>> Nagarathnam.
>>> Most pid-based syscalls are racy in some cases but they are
>>> here for decades and everybody knowns how to deal with it.
>>> So, I've decided to merge both worlds in one interface which clearly 
>>> tells what to expect.
>>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-05-15 17:19           ` Nagarathnam Muthusamy
@ 2018-05-15 17:36             ` Konstantin Khlebnikov
  2018-05-15 17:40               ` Nagarathnam Muthusamy
  2018-05-31 17:41               ` Nagarathnam Muthusamy
  0 siblings, 2 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2018-05-15 17:36 UTC (permalink / raw)
  To: Nagarathnam Muthusamy, Eric W. Biederman
  Cc: linux-api, linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton



On 15.05.2018 20:19, Nagarathnam Muthusamy wrote:
> 
> 
> On 04/24/2018 10:36 PM, Konstantin Khlebnikov wrote:
>> On 23.04.2018 20:37, Nagarathnam Muthusamy wrote:
>>>
>>>
>>> On 04/05/2018 12:02 AM, Konstantin Khlebnikov wrote:
>>>> On 05.04.2018 01:29, Eric W. Biederman wrote:
>>>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>>>
>>>>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>>>>> Each process have different pids, one for each pid namespace it belongs.
>>>>>>> When interaction happens within single pid-ns translation isn't required.
>>>>>>> More complicated scenarios needs special handling.
>>>>>>>
>>>>>>> For example:
>>>>>>> - reading pid-files or logs written inside container with pid namespace
>>>>>>> - attaching with ptrace to tasks from different pid namespace
>>>>>>> - passing pids across pid namespaces in any kind of API
>>>>>>>
>>>>>>> Currently there are several interfaces that could be used here:
>>>>>>>
>>>>>>> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.
>>>>>
>>>>> Using the inode number in interfaces is not an option. Especially not
>>>>> withou referencing the device number for the filesystem as well.
>>>>
>>>> This is supposed to be single-instance fs,
>>>> not part of proc but referenced but its magic "symlinks".
>>>>
>>>> Device numbers are not mentioned in "man namespaces".
>>>>
>>>>>
>>>>>>> Pids for nested Pid namespaces are shown in file /proc/[pid]/status.
>>>>>>> In some cases conversion pid -> vpid could be easily done using this
>>>>>>> information, but backward translation requires scanning all tasks.
>>>>>>>
>>>>>>> Unix socket automatically translates pid attached to SCM_CREDENTIALS.
>>>>>>> This requires CAP_SYS_ADMIN for sending arbitrary pids and entering
>>>>>>> into pid namespace, this expose process and could be insecure.
>>>>>>>
>>>>>>> This patch adds new syscall for converting pids between pid namespaces:
>>>>>>>
>>>>>>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>>>>>>                                  int target_type, int target);
>>>>>>>
>>>>>>> @source_type and @target_type defines type of following arguments:
>>>>>>>
>>>>>>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument is unused
>>>>>>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>>>>>>
>>>>>> I believe using pid to represent the namespace has been already
>>>>>> discussed in V1 of this patch in https://lkml.org/lkml/2015/9/22/1087
>>>>>> after which we moved on to fd based version of this interface.
>>>>>
>>>>> Or in short why is the case of pids important?
>>>>>
>>>>> You Konstantin you almost said why they were important in your message
>>>>> saying you were going to send this one.  However you don't explain in
>>>>> your description why you want to identify pid namespaces by pid.
>>>>>
>>>>
>>>> Open of /proc/[pid]/ns/pid requires same permissions as ptrace,
>>>> pid based variant doesn't have such restrictions.
>>>
>>> Can you provide more information on usecase requiring PID translation but not used for tracing related purposes?
>>
>> Any introspection for [nested] containers. It's easier to work when you have all information when you don't have any.
>> For example our CMS https://github.com/yandex/porto allows to start nested sub-container (or even deeper) by request from any container 
>> and have to tell back which pid task is have. And it could translate any pid inside into accessible by client and vice versa.
>>
> 
> I still dont get the exact reason why PID based approach to identify the namespace during pid translation process is absolutely required 
> compared to fd based approach. 

As I told open(/proc/%d/ns/pid) have security restrictions - same uid/CAP_SYS_PTRACE/whatever
Pidns-fd holds pid-namespace and without restrictions could be abused.
Pid based API is racy but always available without any restrictions.


> From your version of TranslatePid in
> 
> https://github.com/yandex/porto/blob/0d7e6e7e1830dcd0038a057b2ab9964cec5b8fab/src/util/unix.cpp
> 
> I see that you are going through the trouble of forking a process and sending SMC_CREDENTIALS for pid translation. Even your existing API 
> could be extremely simplified if translate_pid based on file descriptors make it to the gate and I believe from the last discussion it was 
> almost there https://patchwork.kernel.org/patch/10305439/
> 
> 
>>> On a side note, can we have the types TRANSLATE_PID_CURRENT_PIDNS and TRANSLATE_PID_FD_PIDNS integrated first and then possibly extend 
>>> the interface to include TRANSLATE_PID_TASK_PIDNS in future?
>>
>> I don't see reason for this separation.
>> Pids and pid namespaces are part of the API for a long time.
> 
> If you are talking about the translate_pid API proposed, I believe the V4 proposed under https://patchwork.kernel.org/patch/10003935/ had 
> only fd based API before a mix of PID and fd based is proposed in V5. Again, I was just wondering if we can get the FD based approach in 
> first and then extend the API to include PID based approach later as fd based approach could provide a lot of immediate benefits?
> 
> Thanks,
> Nagarathnam.
>>
>>>
>>> Thanks,
>>> Nagarathnam.
>>>> Most pid-based syscalls are racy in some cases but they are
>>>> here for decades and everybody knowns how to deal with it.
>>>> So, I've decided to merge both worlds in one interface which clearly tells what to expect.
>>>
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-05-15 17:36             ` Konstantin Khlebnikov
@ 2018-05-15 17:40               ` Nagarathnam Muthusamy
  2018-05-15 17:44                 ` Nagarathnam Muthusamy
  2018-05-31 17:41               ` Nagarathnam Muthusamy
  1 sibling, 1 reply; 20+ messages in thread
From: Nagarathnam Muthusamy @ 2018-05-15 17:40 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Eric W. Biederman
  Cc: linux-api, linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton



On 05/15/2018 10:36 AM, Konstantin Khlebnikov wrote:
>
>
> On 15.05.2018 20:19, Nagarathnam Muthusamy wrote:
>>
>>
>> On 04/24/2018 10:36 PM, Konstantin Khlebnikov wrote:
>>> On 23.04.2018 20:37, Nagarathnam Muthusamy wrote:
>>>>
>>>>
>>>> On 04/05/2018 12:02 AM, Konstantin Khlebnikov wrote:
>>>>> On 05.04.2018 01:29, Eric W. Biederman wrote:
>>>>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>>>>
>>>>>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>>>>>> Each process have different pids, one for each pid namespace it 
>>>>>>>> belongs.
>>>>>>>> When interaction happens within single pid-ns translation isn't 
>>>>>>>> required.
>>>>>>>> More complicated scenarios needs special handling.
>>>>>>>>
>>>>>>>> For example:
>>>>>>>> - reading pid-files or logs written inside container with pid 
>>>>>>>> namespace
>>>>>>>> - attaching with ptrace to tasks from different pid namespace
>>>>>>>> - passing pids across pid namespaces in any kind of API
>>>>>>>>
>>>>>>>> Currently there are several interfaces that could be used here:
>>>>>>>>
>>>>>>>> Pid namespaces are identified by inode number of 
>>>>>>>> /proc/[pid]/ns/pid.
>>>>>>
>>>>>> Using the inode number in interfaces is not an option. Especially 
>>>>>> not
>>>>>> withou referencing the device number for the filesystem as well.
>>>>>
>>>>> This is supposed to be single-instance fs,
>>>>> not part of proc but referenced but its magic "symlinks".
>>>>>
>>>>> Device numbers are not mentioned in "man namespaces".
>>>>>
>>>>>>
>>>>>>>> Pids for nested Pid namespaces are shown in file 
>>>>>>>> /proc/[pid]/status.
>>>>>>>> In some cases conversion pid -> vpid could be easily done using 
>>>>>>>> this
>>>>>>>> information, but backward translation requires scanning all tasks.
>>>>>>>>
>>>>>>>> Unix socket automatically translates pid attached to 
>>>>>>>> SCM_CREDENTIALS.
>>>>>>>> This requires CAP_SYS_ADMIN for sending arbitrary pids and 
>>>>>>>> entering
>>>>>>>> into pid namespace, this expose process and could be insecure.
>>>>>>>>
>>>>>>>> This patch adds new syscall for converting pids between pid 
>>>>>>>> namespaces:
>>>>>>>>
>>>>>>>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>>>>>>>                                  int target_type, int target);
>>>>>>>>
>>>>>>>> @source_type and @target_type defines type of following arguments:
>>>>>>>>
>>>>>>>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument 
>>>>>>>> is unused
>>>>>>>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>>>>>>>
>>>>>>> I believe using pid to represent the namespace has been already
>>>>>>> discussed in V1 of this patch in 
>>>>>>> https://lkml.org/lkml/2015/9/22/1087
>>>>>>> after which we moved on to fd based version of this interface.
>>>>>>
>>>>>> Or in short why is the case of pids important?
>>>>>>
>>>>>> You Konstantin you almost said why they were important in your 
>>>>>> message
>>>>>> saying you were going to send this one.  However you don't 
>>>>>> explain in
>>>>>> your description why you want to identify pid namespaces by pid.
>>>>>>
>>>>>
>>>>> Open of /proc/[pid]/ns/pid requires same permissions as ptrace,
>>>>> pid based variant doesn't have such restrictions.
>>>>
>>>> Can you provide more information on usecase requiring PID 
>>>> translation but not used for tracing related purposes?
>>>
>>> Any introspection for [nested] containers. It's easier to work when 
>>> you have all information when you don't have any.
>>> For example our CMS https://github.com/yandex/porto allows to start 
>>> nested sub-container (or even deeper) by request from any container 
>>> and have to tell back which pid task is have. And it could translate 
>>> any pid inside into accessible by client and vice versa.
>>>
>>
>> I still dont get the exact reason why PID based approach to identify 
>> the namespace during pid translation process is absolutely required 
>> compared to fd based approach. 
>
> As I told open(/proc/%d/ns/pid) have security restrictions - same 
> uid/CAP_SYS_PTRACE/whatever
> Pidns-fd holds pid-namespace and without restrictions could be abused.
> Pid based API is racy but always available without any restrictions.

I get that Pid based API is available without any restrictions but do we 
have any existing usecase which requires Pid based API but cannot use 
Pidns-fd based API? Most of the usecases discussed in this thread deals 
with introspection of a process by another process and I believe that 
security requirement for opening (/proc/%d/ns/pid) is required for all 
such usecases. In other words, Why would a process which does not belong 
to same uid of the process observed or have CAP_SYS_PTRACE be allowed to 
translate PID?

Thanks,
Nagarathnam.
>
>
>> From your version of TranslatePid in
>>
>> https://github.com/yandex/porto/blob/0d7e6e7e1830dcd0038a057b2ab9964cec5b8fab/src/util/unix.cpp 
>>
>>
>> I see that you are going through the trouble of forking a process and 
>> sending SMC_CREDENTIALS for pid translation. Even your existing API 
>> could be extremely simplified if translate_pid based on file 
>> descriptors make it to the gate and I believe from the last 
>> discussion it was almost there 
>> https://patchwork.kernel.org/patch/10305439/
>>
>>
>>>> On a side note, can we have the types TRANSLATE_PID_CURRENT_PIDNS 
>>>> and TRANSLATE_PID_FD_PIDNS integrated first and then possibly 
>>>> extend the interface to include TRANSLATE_PID_TASK_PIDNS in future?
>>>
>>> I don't see reason for this separation.
>>> Pids and pid namespaces are part of the API for a long time.
>>
>> If you are talking about the translate_pid API proposed, I believe 
>> the V4 proposed under https://patchwork.kernel.org/patch/10003935/ 
>> had only fd based API before a mix of PID and fd based is proposed in 
>> V5. Again, I was just wondering if we can get the FD based approach 
>> in first and then extend the API to include PID based approach later 
>> as fd based approach could provide a lot of immediate benefits?
>>
>> Thanks,
>> Nagarathnam.
>>>
>>>>
>>>> Thanks,
>>>> Nagarathnam.
>>>>> Most pid-based syscalls are racy in some cases but they are
>>>>> here for decades and everybody knowns how to deal with it.
>>>>> So, I've decided to merge both worlds in one interface which 
>>>>> clearly tells what to expect.
>>>>
>>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-05-15 17:40               ` Nagarathnam Muthusamy
@ 2018-05-15 17:44                 ` Nagarathnam Muthusamy
  0 siblings, 0 replies; 20+ messages in thread
From: Nagarathnam Muthusamy @ 2018-05-15 17:44 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Eric W. Biederman
  Cc: linux-api, linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton



On 05/15/2018 10:40 AM, Nagarathnam Muthusamy wrote:
>
>
> On 05/15/2018 10:36 AM, Konstantin Khlebnikov wrote:
>>
>>
>> On 15.05.2018 20:19, Nagarathnam Muthusamy wrote:
>>>
>>>
>>> On 04/24/2018 10:36 PM, Konstantin Khlebnikov wrote:
>>>> On 23.04.2018 20:37, Nagarathnam Muthusamy wrote:
>>>>>
>>>>>
>>>>> On 04/05/2018 12:02 AM, Konstantin Khlebnikov wrote:
>>>>>> On 05.04.2018 01:29, Eric W. Biederman wrote:
>>>>>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>>>>>
>>>>>>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>>>>>>> Each process have different pids, one for each pid namespace 
>>>>>>>>> it belongs.
>>>>>>>>> When interaction happens within single pid-ns translation 
>>>>>>>>> isn't required.
>>>>>>>>> More complicated scenarios needs special handling.
>>>>>>>>>
>>>>>>>>> For example:
>>>>>>>>> - reading pid-files or logs written inside container with pid 
>>>>>>>>> namespace
>>>>>>>>> - attaching with ptrace to tasks from different pid namespace
>>>>>>>>> - passing pids across pid namespaces in any kind of API
>>>>>>>>>
>>>>>>>>> Currently there are several interfaces that could be used here:
>>>>>>>>>
>>>>>>>>> Pid namespaces are identified by inode number of 
>>>>>>>>> /proc/[pid]/ns/pid.
>>>>>>>
>>>>>>> Using the inode number in interfaces is not an option. 
>>>>>>> Especially not
>>>>>>> withou referencing the device number for the filesystem as well.
>>>>>>
>>>>>> This is supposed to be single-instance fs,
>>>>>> not part of proc but referenced but its magic "symlinks".
>>>>>>
>>>>>> Device numbers are not mentioned in "man namespaces".
>>>>>>
>>>>>>>
>>>>>>>>> Pids for nested Pid namespaces are shown in file 
>>>>>>>>> /proc/[pid]/status.
>>>>>>>>> In some cases conversion pid -> vpid could be easily done 
>>>>>>>>> using this
>>>>>>>>> information, but backward translation requires scanning all 
>>>>>>>>> tasks.
>>>>>>>>>
>>>>>>>>> Unix socket automatically translates pid attached to 
>>>>>>>>> SCM_CREDENTIALS.
>>>>>>>>> This requires CAP_SYS_ADMIN for sending arbitrary pids and 
>>>>>>>>> entering
>>>>>>>>> into pid namespace, this expose process and could be insecure.
>>>>>>>>>
>>>>>>>>> This patch adds new syscall for converting pids between pid 
>>>>>>>>> namespaces:
>>>>>>>>>
>>>>>>>>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>>>>>>>>                                  int target_type, int target);
>>>>>>>>>
>>>>>>>>> @source_type and @target_type defines type of following 
>>>>>>>>> arguments:
>>>>>>>>>
>>>>>>>>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument 
>>>>>>>>> is unused
>>>>>>>>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>>>>>>>>
>>>>>>>> I believe using pid to represent the namespace has been already
>>>>>>>> discussed in V1 of this patch in 
>>>>>>>> https://lkml.org/lkml/2015/9/22/1087
>>>>>>>> after which we moved on to fd based version of this interface.
>>>>>>>
>>>>>>> Or in short why is the case of pids important?
>>>>>>>
>>>>>>> You Konstantin you almost said why they were important in your 
>>>>>>> message
>>>>>>> saying you were going to send this one.  However you don't 
>>>>>>> explain in
>>>>>>> your description why you want to identify pid namespaces by pid.
>>>>>>>
>>>>>>
>>>>>> Open of /proc/[pid]/ns/pid requires same permissions as ptrace,
>>>>>> pid based variant doesn't have such restrictions.
>>>>>
>>>>> Can you provide more information on usecase requiring PID 
>>>>> translation but not used for tracing related purposes?
>>>>
>>>> Any introspection for [nested] containers. It's easier to work when 
>>>> you have all information when you don't have any.
>>>> For example our CMS https://github.com/yandex/porto allows to start 
>>>> nested sub-container (or even deeper) by request from any container 
>>>> and have to tell back which pid task is have. And it could 
>>>> translate any pid inside into accessible by client and vice versa.
>>>>
>>>
>>> I still dont get the exact reason why PID based approach to identify 
>>> the namespace during pid translation process is absolutely required 
>>> compared to fd based approach. 
>>
>> As I told open(/proc/%d/ns/pid) have security restrictions - same 
>> uid/CAP_SYS_PTRACE/whatever
>> Pidns-fd holds pid-namespace and without restrictions could be abused.
>> Pid based API is racy but always available without any restrictions.
>
> I get that Pid based API is available without any restrictions but do 
> we have any existing usecase which requires Pid based API but cannot 
> use Pidns-fd based API? Most of the usecases discussed in this thread 
> deals with introspection of a process by another process and I believe 
> that security requirement for opening (/proc/%d/ns/pid) is required 
> for all such usecases. In other words, Why would a process which does 
> not belong to same uid 

Typo: inspection of a process by another process

Thanks,
Nagarathnam.

> of the process observed or have CAP_SYS_PTRACE be allowed to translate 
> PID?
>
> Thanks,
> Nagarathnam.
>>
>>
>>> From your version of TranslatePid in
>>>
>>> https://github.com/yandex/porto/blob/0d7e6e7e1830dcd0038a057b2ab9964cec5b8fab/src/util/unix.cpp 
>>>
>>>
>>> I see that you are going through the trouble of forking a process 
>>> and sending SMC_CREDENTIALS for pid translation. Even your existing 
>>> API could be extremely simplified if translate_pid based on file 
>>> descriptors make it to the gate and I believe from the last 
>>> discussion it was almost there 
>>> https://patchwork.kernel.org/patch/10305439/
>>>
>>>
>>>>> On a side note, can we have the types TRANSLATE_PID_CURRENT_PIDNS 
>>>>> and TRANSLATE_PID_FD_PIDNS integrated first and then possibly 
>>>>> extend the interface to include TRANSLATE_PID_TASK_PIDNS in future?
>>>>
>>>> I don't see reason for this separation.
>>>> Pids and pid namespaces are part of the API for a long time.
>>>
>>> If you are talking about the translate_pid API proposed, I believe 
>>> the V4 proposed under https://patchwork.kernel.org/patch/10003935/ 
>>> had only fd based API before a mix of PID and fd based is proposed 
>>> in V5. Again, I was just wondering if we can get the FD based 
>>> approach in first and then extend the API to include PID based 
>>> approach later as fd based approach could provide a lot of immediate 
>>> benefits?
>>>
>>> Thanks,
>>> Nagarathnam.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Nagarathnam.
>>>>>> Most pid-based syscalls are racy in some cases but they are
>>>>>> here for decades and everybody knowns how to deal with it.
>>>>>> So, I've decided to merge both worlds in one interface which 
>>>>>> clearly tells what to expect.
>>>>>
>>>
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-05-15 17:36             ` Konstantin Khlebnikov
  2018-05-15 17:40               ` Nagarathnam Muthusamy
@ 2018-05-31 17:41               ` Nagarathnam Muthusamy
  2018-05-31 17:52                 ` Eric W. Biederman
  2018-05-31 18:05                 ` Eric W. Biederman
  1 sibling, 2 replies; 20+ messages in thread
From: Nagarathnam Muthusamy @ 2018-05-31 17:41 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Eric W. Biederman
  Cc: linux-api, linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton



On 05/15/2018 10:36 AM, Konstantin Khlebnikov wrote:
>
>
> On 15.05.2018 20:19, Nagarathnam Muthusamy wrote:
>>
>>
>> On 04/24/2018 10:36 PM, Konstantin Khlebnikov wrote:
>>> On 23.04.2018 20:37, Nagarathnam Muthusamy wrote:
>>>>
>>>>
>>>> On 04/05/2018 12:02 AM, Konstantin Khlebnikov wrote:
>>>>> On 05.04.2018 01:29, Eric W. Biederman wrote:
>>>>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>>>>
>>>>>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>>>>>> Each process have different pids, one for each pid namespace it 
>>>>>>>> belongs.
>>>>>>>> When interaction happens within single pid-ns translation isn't 
>>>>>>>> required.
>>>>>>>> More complicated scenarios needs special handling.
>>>>>>>>
>>>>>>>> For example:
>>>>>>>> - reading pid-files or logs written inside container with pid 
>>>>>>>> namespace
>>>>>>>> - attaching with ptrace to tasks from different pid namespace
>>>>>>>> - passing pids across pid namespaces in any kind of API
>>>>>>>>
>>>>>>>> Currently there are several interfaces that could be used here:
>>>>>>>>
>>>>>>>> Pid namespaces are identified by inode number of 
>>>>>>>> /proc/[pid]/ns/pid.
>>>>>>
>>>>>> Using the inode number in interfaces is not an option. Especially 
>>>>>> not
>>>>>> withou referencing the device number for the filesystem as well.
>>>>>
>>>>> This is supposed to be single-instance fs,
>>>>> not part of proc but referenced but its magic "symlinks".
>>>>>
>>>>> Device numbers are not mentioned in "man namespaces".
>>>>>
>>>>>>
>>>>>>>> Pids for nested Pid namespaces are shown in file 
>>>>>>>> /proc/[pid]/status.
>>>>>>>> In some cases conversion pid -> vpid could be easily done using 
>>>>>>>> this
>>>>>>>> information, but backward translation requires scanning all tasks.
>>>>>>>>
>>>>>>>> Unix socket automatically translates pid attached to 
>>>>>>>> SCM_CREDENTIALS.
>>>>>>>> This requires CAP_SYS_ADMIN for sending arbitrary pids and 
>>>>>>>> entering
>>>>>>>> into pid namespace, this expose process and could be insecure.
>>>>>>>>
>>>>>>>> This patch adds new syscall for converting pids between pid 
>>>>>>>> namespaces:
>>>>>>>>
>>>>>>>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>>>>>>>                                  int target_type, int target);
>>>>>>>>
>>>>>>>> @source_type and @target_type defines type of following arguments:
>>>>>>>>
>>>>>>>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace, argument 
>>>>>>>> is unused
>>>>>>>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>>>>>>>
>>>>>>> I believe using pid to represent the namespace has been already
>>>>>>> discussed in V1 of this patch in 
>>>>>>> https://lkml.org/lkml/2015/9/22/1087
>>>>>>> after which we moved on to fd based version of this interface.
>>>>>>
>>>>>> Or in short why is the case of pids important?
>>>>>>
>>>>>> You Konstantin you almost said why they were important in your 
>>>>>> message
>>>>>> saying you were going to send this one.  However you don't 
>>>>>> explain in
>>>>>> your description why you want to identify pid namespaces by pid.
>>>>>>
>>>>>
>>>>> Open of /proc/[pid]/ns/pid requires same permissions as ptrace,
>>>>> pid based variant doesn't have such restrictions.
>>>>
>>>> Can you provide more information on usecase requiring PID 
>>>> translation but not used for tracing related purposes?
>>>
>>> Any introspection for [nested] containers. It's easier to work when 
>>> you have all information when you don't have any.
>>> For example our CMS https://github.com/yandex/porto allows to start 
>>> nested sub-container (or even deeper) by request from any container 
>>> and have to tell back which pid task is have. And it could translate 
>>> any pid inside into accessible by client and vice versa.
>>>
>>
>> I still dont get the exact reason why PID based approach to identify 
>> the namespace during pid translation process is absolutely required 
>> compared to fd based approach. 
>
> As I told open(/proc/%d/ns/pid) have security restrictions - same 
> uid/CAP_SYS_PTRACE/whatever
> Pidns-fd holds pid-namespace and without restrictions could be abused.
> Pid based API is racy but always available without any restrictions.
>
>
>> From your version of TranslatePid in
>>
>> https://github.com/yandex/porto/blob/0d7e6e7e1830dcd0038a057b2ab9964cec5b8fab/src/util/unix.cpp 
>>
>>
>> I see that you are going through the trouble of forking a process and 
>> sending SMC_CREDENTIALS for pid translation. Even your existing API 
>> could be extremely simplified if translate_pid based on file 
>> descriptors make it to the gate and I believe from the last 
>> discussion it was almost there 
>> https://patchwork.kernel.org/patch/10305439/
>>
>>
>>>> On a side note, can we have the types TRANSLATE_PID_CURRENT_PIDNS 
>>>> and TRANSLATE_PID_FD_PIDNS integrated first and then possibly 
>>>> extend the interface to include TRANSLATE_PID_TASK_PIDNS in future?
>>>
>>> I don't see reason for this separation.
>>> Pids and pid namespaces are part of the API for a long time.
>>
>> If you are talking about the translate_pid API proposed, I believe 
>> the V4 proposed under https://patchwork.kernel.org/patch/10003935/ 
>> had only fd based API before a mix of PID and fd based is proposed in 
>> V5. Again, I was just wondering if we can get the FD based approach 
>> in first and then extend the API to include PID based approach later 
>> as fd based approach could provide a lot of immediate benefits?
>>
>> Thanks,
>> Nagarathnam.
>>>
>>>>
>>>> Thanks,
>>>> Nagarathnam.
>>>>> Most pid-based syscalls are racy in some cases but they are
>>>>> here for decades and everybody knowns how to deal with it.
>>>>> So, I've decided to merge both worlds in one interface which 
>>>>> clearly tells what to expect.
>>>>
>>

Ping? Any additional comments on this patch?

Thanks,
Nagarathnam.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-05-31 17:41               ` Nagarathnam Muthusamy
@ 2018-05-31 17:52                 ` Eric W. Biederman
  2018-05-31 18:05                 ` Eric W. Biederman
  1 sibling, 0 replies; 20+ messages in thread
From: Eric W. Biederman @ 2018-05-31 17:52 UTC (permalink / raw)
  To: Nagarathnam Muthusamy
  Cc: Konstantin Khlebnikov, linux-api, linux-kernel, Jann Horn,
	Serge Hallyn, Oleg Nesterov, Andy Lutomirski, Prakash Sangappa,
	Andrew Morton

Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:

> On 05/15/2018 10:36 AM, Konstantin Khlebnikov wrote:
>>
>>
>> On 15.05.2018 20:19, Nagarathnam Muthusamy wrote:
>>>
>>>
>>> On 04/24/2018 10:36 PM, Konstantin Khlebnikov wrote:
>>>> On 23.04.2018 20:37, Nagarathnam Muthusamy wrote:
>>>>>
>>>>>
>>>>> On 04/05/2018 12:02 AM, Konstantin Khlebnikov wrote:
>>>>>> On 05.04.2018 01:29, Eric W. Biederman wrote:
>>>>>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>>>>>
>>>>>>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>>>>>>> Each process have different pids, one for each pid namespace
>>>>>>>>> it belongs.
>>>>>>>>> When interaction happens within single pid-ns translation
>>>>>>>>> isn't required.
>>>>>>>>> More complicated scenarios needs special handling.
>>>>>>>>>
>>>>>>>>> For example:
>>>>>>>>> - reading pid-files or logs written inside container with pid
>>>>>>>>> namespace
>>>>>>>>> - attaching with ptrace to tasks from different pid namespace
>>>>>>>>> - passing pids across pid namespaces in any kind of API
>>>>>>>>>
>>>>>>>>> Currently there are several interfaces that could be used here:
>>>>>>>>>
>>>>>>>>> Pid namespaces are identified by inode number of
>>>>>>>>> /proc/[pid]/ns/pid.
>>>>>>>
>>>>>>> Using the inode number in interfaces is not an
>>>>>>> option. Especially not
>>>>>>> withou referencing the device number for the filesystem as well.
>>>>>>
>>>>>> This is supposed to be single-instance fs,
>>>>>> not part of proc but referenced but its magic "symlinks".
>>>>>>
>>>>>> Device numbers are not mentioned in "man namespaces".
>>>>>>
>>>>>>>
>>>>>>>>> Pids for nested Pid namespaces are shown in file
>>>>>>>>> /proc/[pid]/status.
>>>>>>>>> In some cases conversion pid -> vpid could be easily done
>>>>>>>>> using this
>>>>>>>>> information, but backward translation requires scanning all tasks.
>>>>>>>>>
>>>>>>>>> Unix socket automatically translates pid attached to
>>>>>>>>> SCM_CREDENTIALS.
>>>>>>>>> This requires CAP_SYS_ADMIN for sending arbitrary pids and
>>>>>>>>> entering
>>>>>>>>> into pid namespace, this expose process and could be insecure.
>>>>>>>>>
>>>>>>>>> This patch adds new syscall for converting pids between pid
>>>>>>>>> namespaces:
>>>>>>>>>
>>>>>>>>> pid_t translate_pid(pid_t pid, int source_type, int source,
>>>>>>>>>                                  int target_type, int target);
>>>>>>>>>
>>>>>>>>> @source_type and @target_type defines type of following arguments:
>>>>>>>>>
>>>>>>>>> TRANSLATE_PID_CURRENT_PIDNS  - current pid namespace,
>>>>>>>>> argument is unused
>>>>>>>>> TRANSLATE_PID_TASK_PIDNS     - task pid-ns, argument is task pid
>>>>>>>>
>>>>>>>> I believe using pid to represent the namespace has been already
>>>>>>>> discussed in V1 of this patch in
>>>>>>>> https://lkml.org/lkml/2015/9/22/1087
>>>>>>>> after which we moved on to fd based version of this interface.
>>>>>>>
>>>>>>> Or in short why is the case of pids important?
>>>>>>>
>>>>>>> You Konstantin you almost said why they were important in your
>>>>>>> message
>>>>>>> saying you were going to send this one.  However you don't
>>>>>>> explain in
>>>>>>> your description why you want to identify pid namespaces by pid.
>>>>>>>
>>>>>>
>>>>>> Open of /proc/[pid]/ns/pid requires same permissions as ptrace,
>>>>>> pid based variant doesn't have such restrictions.
>>>>>
>>>>> Can you provide more information on usecase requiring PID
>>>>> translation but not used for tracing related purposes?
>>>>
>>>> Any introspection for [nested] containers. It's easier to work
>>>> when you have all information when you don't have any.
>>>> For example our CMS https://github.com/yandex/porto allows to
>>>> start nested sub-container (or even deeper) by request from any
>>>> container and have to tell back which pid task is have. And it
>>>> could translate any pid inside into accessible by client and vice
>>>> versa.
>>>>
>>>
>>> I still dont get the exact reason why PID based approach to
>>> identify the namespace during pid translation process is absolutely
>>> required compared to fd based approach. 
>>
>> As I told open(/proc/%d/ns/pid) have security restrictions - same
>> uid/CAP_SYS_PTRACE/whatever
>> Pidns-fd holds pid-namespace and without restrictions could be abused.
>> Pid based API is racy but always available without any restrictions.
>>
>>
>>> From your version of TranslatePid in
>>>
>>> https://github.com/yandex/porto/blob/0d7e6e7e1830dcd0038a057b2ab9964cec5b8fab/src/util/unix.cpp 
>>>
>>>
>>> I see that you are going through the trouble of forking a process
>>> and sending SMC_CREDENTIALS for pid translation. Even your existing
>>> API could be extremely simplified if translate_pid based on file
>>> descriptors make it to the gate and I believe from the last
>>> discussion it was almost there
>>> https://patchwork.kernel.org/patch/10305439/
>>>
>>>
>>>>> On a side note, can we have the types TRANSLATE_PID_CURRENT_PIDNS
>>>>> and TRANSLATE_PID_FD_PIDNS integrated first and then possibly
>>>>> extend the interface to include TRANSLATE_PID_TASK_PIDNS in
>>>>> future?
>>>>
>>>> I don't see reason for this separation.
>>>> Pids and pid namespaces are part of the API for a long time.
>>>
>>> If you are talking about the translate_pid API proposed, I believe
>>> the V4 proposed under https://patchwork.kernel.org/patch/10003935/
>>> had only fd based API before a mix of PID and fd based is proposed
>>> in V5. Again, I was just wondering if we can get the FD based
>>> approach in first and then extend the API to include PID based
>>> approach later as fd based approach could provide a lot of
>>> immediate benefits?
>>>
>>> Thanks,
>>> Nagarathnam.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Nagarathnam.
>>>>>> Most pid-based syscalls are racy in some cases but they are
>>>>>> here for decades and everybody knowns how to deal with it.
>>>>>> So, I've decided to merge both worlds in one interface which
>>>>>> clearly tells what to expect.
>>>>>
>>>
>
> Ping? Any additional comments on this patch?

I have totally lost the thread.  Let me see if I can find enough of the
thread to see what is going on.

The whole let's use pids instead of fds was a major distraction.

Eric

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-04-27 12:15       ` Michael Kerrisk (man-pages)
@ 2018-05-31 18:02           ` Eric W. Biederman
  0 siblings, 0 replies; 20+ messages in thread
From: Eric W. Biederman @ 2018-05-31 18:02 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Konstantin Khlebnikov, Nagarathnam Muthusamy, linux-api,
	linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> On 04/05/2018 09:02 AM, Konstantin Khlebnikov wrote:
>> On 05.04.2018 01:29, Eric W. Biederman wrote:
>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>
>>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>>> Each process have different pids, one for each pid namespace it belongs.
>>>>> When interaction happens within single pid-ns translation isn't required.
>>>>> More complicated scenarios needs special handling.
>>>>>
>>>>> For example:
>>>>> - reading pid-files or logs written inside container with pid namespace
>>>>> - attaching with ptrace to tasks from different pid namespace
>>>>> - passing pids across pid namespaces in any kind of API
>>>>>
>>>>> Currently there are several interfaces that could be used here:
>>>>>
>>>>> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.
>>>
>>> Using the inode number in interfaces is not an option.  Especially not
>>> withou referencing the device number for the filesystem as well.
>> 
>> This is supposed to be single-instance fs,
>> not part of proc but referenced but its magic "symlinks".
>> 
>> Device numbers are not mentioned in "man namespaces".
>
> Thanks for the heads-up!
>
> That was a bug in the man-page. ioctl_ns(2) already says the right thing.
> Now I patches namespaces(7), as below.

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

For the changes to namespaces.7.  I suspect you have already applied
them by now, but if not.

Eric


> Cheers,
>
> Michael
>
> diff --git a/man7/namespaces.7 b/man7/namespaces.7
> index 725ebaff6..3c155de7e 100644
> --- a/man7/namespaces.7
> +++ b/man7/namespaces.7
> @@ -154,11 +154,14 @@ In Linux 3.7 and earlier, these files were visible as hard links.
>  Since Linux 3.8,
>  .\" commit bf056bfa80596a5d14b26b17276a56a0dcb080e5
>  they appear as symbolic links.
> -If two processes are in the same namespace, then the inode numbers of their
> +If two processes are in the same namespace,
> +then the device IDs and inode numbers of their
>  .IR /proc/[pid]/ns/xxx
>  symbolic links will be the same; an application can check this using the
> +.I stat.st_dev
> +and
>  .I stat.st_ino
> -field returned by
> +fields returned by
>  .BR stat (2).
>  The content of this symbolic link is a string containing
>  the namespace type and inode number as in the following example:

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
@ 2018-05-31 18:02           ` Eric W. Biederman
  0 siblings, 0 replies; 20+ messages in thread
From: Eric W. Biederman @ 2018-05-31 18:02 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Konstantin Khlebnikov, Nagarathnam Muthusamy, linux-api,
	linux-kernel, Jann Horn, Serge Hallyn, Oleg Nesterov,
	Andy Lutomirski, Prakash Sangappa, Andrew Morton

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> On 04/05/2018 09:02 AM, Konstantin Khlebnikov wrote:
>> On 05.04.2018 01:29, Eric W. Biederman wrote:
>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>
>>>> On 04/04/2018 12:11 PM, Konstantin Khlebnikov wrote:
>>>>> Each process have different pids, one for each pid namespace it belongs.
>>>>> When interaction happens within single pid-ns translation isn't required.
>>>>> More complicated scenarios needs special handling.
>>>>>
>>>>> For example:
>>>>> - reading pid-files or logs written inside container with pid namespace
>>>>> - attaching with ptrace to tasks from different pid namespace
>>>>> - passing pids across pid namespaces in any kind of API
>>>>>
>>>>> Currently there are several interfaces that could be used here:
>>>>>
>>>>> Pid namespaces are identified by inode number of /proc/[pid]/ns/pid.
>>>
>>> Using the inode number in interfaces is not an option.  Especially not
>>> withou referencing the device number for the filesystem as well.
>> 
>> This is supposed to be single-instance fs,
>> not part of proc but referenced but its magic "symlinks".
>> 
>> Device numbers are not mentioned in "man namespaces".
>
> Thanks for the heads-up!
>
> That was a bug in the man-page. ioctl_ns(2) already says the right thing.
> Now I patches namespaces(7), as below.

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

For the changes to namespaces.7.  I suspect you have already applied
them by now, but if not.

Eric


> Cheers,
>
> Michael
>
> diff --git a/man7/namespaces.7 b/man7/namespaces.7
> index 725ebaff6..3c155de7e 100644
> --- a/man7/namespaces.7
> +++ b/man7/namespaces.7
> @@ -154,11 +154,14 @@ In Linux 3.7 and earlier, these files were visible as hard links.
>  Since Linux 3.8,
>  .\" commit bf056bfa80596a5d14b26b17276a56a0dcb080e5
>  they appear as symbolic links.
> -If two processes are in the same namespace, then the inode numbers of their
> +If two processes are in the same namespace,
> +then the device IDs and inode numbers of their
>  .IR /proc/[pid]/ns/xxx
>  symbolic links will be the same; an application can check this using the
> +.I stat.st_dev
> +and
>  .I stat.st_ino
> -field returned by
> +fields returned by
>  .BR stat (2).
>  The content of this symbolic link is a string containing
>  the namespace type and inode number as in the following example:

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-05-31 17:41               ` Nagarathnam Muthusamy
  2018-05-31 17:52                 ` Eric W. Biederman
@ 2018-05-31 18:05                 ` Eric W. Biederman
  2018-06-01  6:58                   ` Konstantin Khlebnikov
  1 sibling, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2018-05-31 18:05 UTC (permalink / raw)
  To: Nagarathnam Muthusamy
  Cc: Konstantin Khlebnikov, linux-api, linux-kernel, Jann Horn,
	Serge Hallyn, Oleg Nesterov, Andy Lutomirski, Prakash Sangappa,
	Andrew Morton

Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:

> Ping? Any additional comments on this patch?

Konstantin's v5 no.

  I am uncomfortable with unnecessary flexibility.  I don't want to
  encourage the increased usage of pids to identify namespaces.  I saw
  no arguments in favor of pids to identify namespaces from Konstantin.

Your v4 updated to reflect Andrew Morton's concerns in the changelog
yes.

Eric

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-05-31 18:05                 ` Eric W. Biederman
@ 2018-06-01  6:58                   ` Konstantin Khlebnikov
  2018-06-01 15:55                     ` NAGARATHNAM MUTHUSAMY
  0 siblings, 1 reply; 20+ messages in thread
From: Konstantin Khlebnikov @ 2018-06-01  6:58 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Nagarathnam Muthusamy, Konstantin Khlebnikov, Linux API,
	Linux Kernel Mailing List, Jann Horn, Serge Hallyn,
	Oleg Nesterov, Andy Lutomirski, Prakash Sangappa, Andrew Morton

On Thu, May 31, 2018 at 9:05 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>
>> Ping? Any additional comments on this patch?
>
> Konstantin's v5 no.
>
>   I am uncomfortable with unnecessary flexibility.  I don't want to
>   encourage the increased usage of pids to identify namespaces.  I saw
>   no arguments in favor of pids to identify namespaces from Konstantin.
>
> Your v4 updated to reflect Andrew Morton's concerns in the changelog
> yes.

Ok, whatever. If I the only blocker then let it be v4 design.

>
> Eric
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-06-01  6:58                   ` Konstantin Khlebnikov
@ 2018-06-01 15:55                     ` NAGARATHNAM MUTHUSAMY
  2018-06-01 16:14                       ` Eric W. Biederman
  2018-06-01 16:15                       ` Konstantin Khlebnikov
  0 siblings, 2 replies; 20+ messages in thread
From: NAGARATHNAM MUTHUSAMY @ 2018-06-01 15:55 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Eric W. Biederman
  Cc: Konstantin Khlebnikov, Linux API, Linux Kernel Mailing List,
	Jann Horn, Serge Hallyn, Oleg Nesterov, Andy Lutomirski,
	Prakash Sangappa, Andrew Morton



On 5/31/2018 11:58 PM, Konstantin Khlebnikov wrote:
> On Thu, May 31, 2018 at 9:05 PM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>
>>> Ping? Any additional comments on this patch?
>> Konstantin's v5 no.
>>
>>    I am uncomfortable with unnecessary flexibility.  I don't want to
>>    encourage the increased usage of pids to identify namespaces.  I saw
>>    no arguments in favor of pids to identify namespaces from Konstantin.
>>
>> Your v4 updated to reflect Andrew Morton's concerns in the changelog
>> yes.
> Ok, whatever. If I the only blocker then let it be v4 design.

Thanks! I will resend the V4 patch with a plain text man page for 
translate_pid.

Thanks,
Nagarathnam.
>
>> Eric
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-api" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-06-01 15:55                     ` NAGARATHNAM MUTHUSAMY
@ 2018-06-01 16:14                       ` Eric W. Biederman
  2018-06-01 16:15                       ` Konstantin Khlebnikov
  1 sibling, 0 replies; 20+ messages in thread
From: Eric W. Biederman @ 2018-06-01 16:14 UTC (permalink / raw)
  To: NAGARATHNAM MUTHUSAMY
  Cc: Konstantin Khlebnikov, Konstantin Khlebnikov, Linux API,
	Linux Kernel Mailing List, Jann Horn, Serge Hallyn,
	Oleg Nesterov, Andy Lutomirski, Prakash Sangappa, Andrew Morton

NAGARATHNAM MUTHUSAMY <nagarathnam.muthusamy@oracle.com> writes:

> On 5/31/2018 11:58 PM, Konstantin Khlebnikov wrote:
>> On Thu, May 31, 2018 at 9:05 PM, Eric W. Biederman
>> <ebiederm@xmission.com> wrote:
>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>
>>>> Ping? Any additional comments on this patch?
>>> Konstantin's v5 no.
>>>
>>>    I am uncomfortable with unnecessary flexibility.  I don't want to
>>>    encourage the increased usage of pids to identify namespaces.  I saw
>>>    no arguments in favor of pids to identify namespaces from Konstantin.
>>>
>>> Your v4 updated to reflect Andrew Morton's concerns in the changelog
>>> yes.
>> Ok, whatever. If I the only blocker then let it be v4 design.
>
> Thanks! I will resend the V4 patch with a plain text man page for
> translate_pid.

I believe Andrew was also asking for the motivating use case in the
description of the patch.

Thank you,
Eric

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v5] pidns: introduce syscall translate_pid
  2018-06-01 15:55                     ` NAGARATHNAM MUTHUSAMY
  2018-06-01 16:14                       ` Eric W. Biederman
@ 2018-06-01 16:15                       ` Konstantin Khlebnikov
  1 sibling, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2018-06-01 16:15 UTC (permalink / raw)
  To: NAGARATHNAM MUTHUSAMY, Konstantin Khlebnikov, Eric W. Biederman
  Cc: Linux API, Linux Kernel Mailing List, Jann Horn, Serge Hallyn,
	Oleg Nesterov, Andy Lutomirski, Prakash Sangappa, Andrew Morton

On 01.06.2018 18:55, NAGARATHNAM MUTHUSAMY wrote:
> 
> 
> On 5/31/2018 11:58 PM, Konstantin Khlebnikov wrote:
>> On Thu, May 31, 2018 at 9:05 PM, Eric W. Biederman
>> <ebiederm@xmission.com> wrote:
>>> Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> writes:
>>>
>>>> Ping? Any additional comments on this patch?
>>> Konstantin's v5 no.
>>>
>>>    I am uncomfortable with unnecessary flexibility.  I don't want to
>>>    encourage the increased usage of pids to identify namespaces.  I saw
>>>    no arguments in favor of pids to identify namespaces from Konstantin.
>>>
>>> Your v4 updated to reflect Andrew Morton's concerns in the changelog
>>> yes.
>> Ok, whatever. If I the only blocker then let it be v4 design.
> 
> Thanks! I will resend the V4 patch with a plain text man page for translate_pid.

Syscall integration have changed since then.
I've prepared v6 patch with v4 design and updated comment.

> 
> Thanks,
> Nagarathnam.
>>
>>> Eric
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe linux-api" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2018-06-01 16:16 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-04 19:11 [PATCH RFC v5] pidns: introduce syscall translate_pid Konstantin Khlebnikov
2018-04-04 20:31 ` Nagarathnam Muthusamy
2018-04-04 22:29   ` Eric W. Biederman
2018-04-05  7:02     ` Konstantin Khlebnikov
2018-04-23 17:37       ` Nagarathnam Muthusamy
2018-04-25  5:36         ` Konstantin Khlebnikov
2018-05-15 17:19           ` Nagarathnam Muthusamy
2018-05-15 17:36             ` Konstantin Khlebnikov
2018-05-15 17:40               ` Nagarathnam Muthusamy
2018-05-15 17:44                 ` Nagarathnam Muthusamy
2018-05-31 17:41               ` Nagarathnam Muthusamy
2018-05-31 17:52                 ` Eric W. Biederman
2018-05-31 18:05                 ` Eric W. Biederman
2018-06-01  6:58                   ` Konstantin Khlebnikov
2018-06-01 15:55                     ` NAGARATHNAM MUTHUSAMY
2018-06-01 16:14                       ` Eric W. Biederman
2018-06-01 16:15                       ` Konstantin Khlebnikov
2018-04-27 12:15       ` Michael Kerrisk (man-pages)
2018-05-31 18:02         ` Eric W. Biederman
2018-05-31 18:02           ` Eric W. Biederman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.