All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] pidfd: show pids for nested pid namespaces in fdinfo
@ 2019-10-08 13:36 Christian Kellner
  2019-10-08 13:52 ` Christian Brauner
  2019-10-09 16:05 ` [PATCH v2 1/2] " Christian Kellner
  0 siblings, 2 replies; 36+ messages in thread
From: Christian Kellner @ 2019-10-08 13:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Christian Kellner, Christian Brauner, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Elena Reshetova, Thomas Gleixner,
	Roman Gushchin, Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin

From: Christian Kellner <christian@kellner.me>

The fdinfo file for a process file descriptor already contains the
pid of the process in the callers namespaces. Additionally, if pid
namespaces are configured, show the process ids of the process in
all nested namespaces in the same format as in the procfs status
file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
of the processes in nested namespaces.

Signed-off-by: Christian Kellner <christian@kellner.me>
---
 kernel/fork.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index 5a0fd518e04e..801793789f33 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1685,8 +1685,18 @@ static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
 {
 	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
 	struct pid *pid = f->private_data;
+#ifdef CONFIG_PID_NS
+	int i;
+#endif
 
 	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
+#ifdef CONFIG_PID_NS
+	seq_puts(m, "\nNSpid:");
+	for (i = ns->level; i <= pid->level; i++) {
+		ns = pid_nr_ns(pid, pid->numbers[i].ns);
+		seq_put_decimal_ull(m, "\t", ns);
+	}
+#endif
 	seq_putc(m, '\n');
 }
 #endif
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-08 13:36 [PATCH] pidfd: show pids for nested pid namespaces in fdinfo Christian Kellner
@ 2019-10-08 13:52 ` Christian Brauner
  2019-10-08 14:00   ` Michal Hocko
  2019-10-09 16:05 ` [PATCH v2 1/2] " Christian Kellner
  1 sibling, 1 reply; 36+ messages in thread
From: Christian Brauner @ 2019-10-08 13:52 UTC (permalink / raw)
  To: Christian Kellner
  Cc: linux-kernel, Christian Kellner, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Elena Reshetova, Thomas Gleixner,
	Roman Gushchin, Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin

On Tue, Oct 08, 2019 at 03:36:37PM +0200, Christian Kellner wrote:
> From: Christian Kellner <christian@kellner.me>
> 
> The fdinfo file for a process file descriptor already contains the
> pid of the process in the callers namespaces. Additionally, if pid
> namespaces are configured, show the process ids of the process in
> all nested namespaces in the same format as in the procfs status
> file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
> of the processes in nested namespaces.
> 
> Signed-off-by: Christian Kellner <christian@kellner.me>

Yeah, makes sense to me.
Note that if you send the pidfd to a sibling pid namespace NSpid won't
show you anything useful. But that's what I'd expect security wise. You
should only be able to snoop on descendant pid namespaces.

Please add a test for this to verify that this all works correctly and
then resend. The tests live in tools/testing/selftests/pidfd/ and should
already have most of the infrastructure there. The fdinfo parsing code
should be in samples/pidfd/ which

For the patch itself:

Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>

You can resend with my Reviewed-by retained if you don't change
anything. Before I see tests I'll hold off on merging this. ;)

Thanks!
Christian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-08 13:52 ` Christian Brauner
@ 2019-10-08 14:00   ` Michal Hocko
  0 siblings, 0 replies; 36+ messages in thread
From: Michal Hocko @ 2019-10-08 14:00 UTC (permalink / raw)
  To: Christian Kellner, Christian Brauner
  Cc: linux-kernel, Christian Kellner, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Elena Reshetova, Thomas Gleixner, Roman Gushchin,
	Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin

On Tue 08-10-19 15:52:59, Christian Brauner wrote:
> On Tue, Oct 08, 2019 at 03:36:37PM +0200, Christian Kellner wrote:
> > From: Christian Kellner <christian@kellner.me>
> > 
> > The fdinfo file for a process file descriptor already contains the
> > pid of the process in the callers namespaces. Additionally, if pid
> > namespaces are configured, show the process ids of the process in
> > all nested namespaces in the same format as in the procfs status
> > file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
> > of the processes in nested namespaces.
> > 
> > Signed-off-by: Christian Kellner <christian@kellner.me>
> 
> Yeah, makes sense to me.
> Note that if you send the pidfd to a sibling pid namespace NSpid won't
> show you anything useful. But that's what I'd expect security wise. You
> should only be able to snoop on descendant pid namespaces.
> 
> Please add a test for this to verify that this all works correctly and
> then resend. The tests live in tools/testing/selftests/pidfd/ and should
> already have most of the infrastructure there. The fdinfo parsing code
> should be in samples/pidfd/ which
> 
> For the patch itself:
> 
> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
> 
> You can resend with my Reviewed-by retained if you don't change
> anything. Before I see tests I'll hold off on merging this. ;)

This is also forming a new user visible "api" right? So the make sure
that linux-api is on the Cc list.

And one minore note. The ifdefery is just ugly, could you just make it a
separate function with ifdef hidden inside?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v2 1/2] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-08 13:36 [PATCH] pidfd: show pids for nested pid namespaces in fdinfo Christian Kellner
  2019-10-08 13:52 ` Christian Brauner
@ 2019-10-09 16:05 ` Christian Kellner
  2019-10-09 16:05   ` [PATCH v2 2/2] pidfd: add tests for NSpid info " Christian Kellner
                     ` (2 more replies)
  1 sibling, 3 replies; 36+ messages in thread
From: Christian Kellner @ 2019-10-09 16:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-api, Christian Kellner, Christian Brauner, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Elena Reshetova, Thomas Gleixner,
	Roman Gushchin, Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin

From: Christian Kellner <christian@kellner.me>

The fdinfo file for a process file descriptor already contains the
pid of the process in the callers namespaces. Additionally, if pid
namespaces are configured, show the process ids of the process in
all nested namespaces in the same format as in the procfs status
file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
of the processes in nested namespaces.

Signed-off-by: Christian Kellner <christian@kellner.me>
---

Changes in v2:
- Moved into separate function to avoid multiple ifdefs as suggested
  by Michal Hocko

 kernel/fork.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index 5a0fd518e04e..f7a59ef046e9 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1681,12 +1681,27 @@ static int pidfd_release(struct inode *inode, struct file *file)
 }
 
 #ifdef CONFIG_PROC_FS
+static void pidfd_nspid(struct seq_file *m, struct pid *pid)
+{
+#ifdef CONFIG_PID_NS
+	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
+	int i;
+
+	seq_puts(m, "\nNSpid:");
+	for (i = ns->level; i <= pid->level; i++) {
+		ns = pid->numbers[i].ns;
+		seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
+	}
+#endif
+}
+
 static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
 {
 	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
 	struct pid *pid = f->private_data;
 
 	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
+	pidfd_nspid(m, pid);
 	seq_putc(m, '\n');
 }
 #endif
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 2/2] pidfd: add tests for NSpid info in fdinfo
  2019-10-09 16:05 ` [PATCH v2 1/2] " Christian Kellner
@ 2019-10-09 16:05   ` Christian Kellner
  2019-10-11 15:09       ` Jann Horn
  2019-10-09 17:29   ` [PATCH v2 1/2] pidfd: show pids for nested pid namespaces " Christian Brauner
  2019-10-11 12:23   ` [PATCH v3 " Christian Kellner
  2 siblings, 1 reply; 36+ messages in thread
From: Christian Kellner @ 2019-10-09 16:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-api, Christian Kellner, Christian Brauner, Shuah Khan,
	Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin, linux-kselftest

From: Christian Kellner <christian@kellner.me>

Add tests that check that if pid namespaces are configured the fdinfo
file of a pidfd contains an NSpid: entry containing the process id
in the current and additionally all nested namespaces.

Signed-off-by: Christian Kellner <christian@kellner.me>
---
 tools/testing/selftests/pidfd/Makefile        |  2 +-
 tools/testing/selftests/pidfd/pidfd.h         | 12 +++
 .../selftests/pidfd/pidfd_fdinfo_test.c       | 98 +++++++++++++++++++
 tools/testing/selftests/pidfd/pidfd_test.c    | 12 ---
 4 files changed, 111 insertions(+), 13 deletions(-)
 create mode 100644 tools/testing/selftests/pidfd/pidfd_fdinfo_test.c

diff --git a/tools/testing/selftests/pidfd/Makefile b/tools/testing/selftests/pidfd/Makefile
index 464c9b76148f..b7784dc488b8 100644
--- a/tools/testing/selftests/pidfd/Makefile
+++ b/tools/testing/selftests/pidfd/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 CFLAGS += -g -I../../../../usr/include/ -lpthread
 
-TEST_GEN_PROGS := pidfd_test pidfd_open_test pidfd_poll_test pidfd_wait
+TEST_GEN_PROGS := pidfd_test  pdfd_fdinfo_test pidfd_open_test pidfd_poll_test pidfd_wait
 
 include ../lib.mk
 
diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
index c6bc68329f4b..2946d788645b 100644
--- a/tools/testing/selftests/pidfd/pidfd.h
+++ b/tools/testing/selftests/pidfd/pidfd.h
@@ -84,4 +84,16 @@ static inline int sys_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
 	return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
 }
 
+static pid_t pidfd_clone(int flags, int *pidfd, int (*fn)(void *))
+{
+	size_t stack_size = 1024;
+	char *stack[1024] = { 0 };
+
+#ifdef __ia64__
+	return __clone2(fn, stack, stack_size, flags | SIGCHLD, NULL, pidfd);
+#else
+	return clone(fn, stack + stack_size, flags | SIGCHLD, NULL, pidfd);
+#endif
+}
+
 #endif /* __PIDFD_H */
diff --git a/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c b/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c
new file mode 100644
index 000000000000..fbae502ad8ad
--- /dev/null
+++ b/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <linux/types.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <syscall.h>
+#include <sys/wait.h>
+
+#include "pidfd.h"
+#include "../kselftest.h"
+
+static int child_fdinfo_nspid_test(void *args)
+{
+	ksft_print_msg("Child: pid %d\n", getpid());
+	return 0;
+}
+
+static int compare_fdinfo_nspid(int pidfd, char *expect, size_t len)
+{
+	char path[512];
+	FILE *f;
+	size_t n = 0;
+	ssize_t k;
+	char *line = NULL;
+	int r = -1;
+
+	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
+
+	f = fopen(path, "re");
+	if (!f)
+		return -1;
+
+	while ((k = getline(&line, &n, f)) != -1) {
+		if (strncmp(line, "NSpid:", 6))
+			continue;
+
+		line[k - 1] = '\0';
+		ksft_print_msg("Child: fdinfo NSpid line: '%s'.\n", line);
+		r = strncmp(line + 6, expect, len);
+		break;
+	}
+
+	free(line);
+	fclose(f);
+
+	return r;
+}
+
+static void test_pidfd_fdinfo_nspid(void)
+{
+	char expect[512];
+	int pid, pidfd = 0;
+	int n, r;
+	const char *test_name = "pidfd check for NSpid information in fdinfo";
+
+	pid = pidfd_clone(CLONE_PIDFD | CLONE_NEWPID | CLONE_NEWUSER, &pidfd,
+			  child_fdinfo_nspid_test);
+	if (pid < 0)
+		ksft_exit_fail_msg(
+			"%s test: pidfd_clone failed (ret %d, errno %d)\n",
+			test_name, pid, errno);
+
+	ksft_print_msg("Parent: child-pid: %d\n", pid);
+
+	/* The child will have pid 1 in the new pid namespace,
+	 * so the line must be 'NSPid:\t<pid>\t1'
+	 */
+	n = snprintf(expect, sizeof(expect), "\t%d\t%d", pid, 1);
+	r = compare_fdinfo_nspid(pidfd, expect, n);
+
+	(void)close(pidfd);
+
+	if (wait_for_pid(pid))
+		ksft_exit_fail_msg(
+			"%s test: waitpid failed (ret %d, errno %d)\n",
+			test_name, r, errno);
+
+	if (r != 0)
+		ksft_exit_fail_msg("%s test: Failed\n", test_name);
+	else
+		ksft_test_result_pass("%s test: Passed\n", test_name);
+}
+
+int main(int argc, char **argv)
+{
+	ksft_print_header();
+	ksft_set_plan(1);
+
+	test_pidfd_fdinfo_nspid();
+
+	return ksft_exit_pass();
+}
diff --git a/tools/testing/selftests/pidfd/pidfd_test.c b/tools/testing/selftests/pidfd/pidfd_test.c
index 7aff2d3b42c0..9cf0b6b3e389 100644
--- a/tools/testing/selftests/pidfd/pidfd_test.c
+++ b/tools/testing/selftests/pidfd/pidfd_test.c
@@ -27,18 +27,6 @@
 
 #define MAX_EVENTS 5
 
-static pid_t pidfd_clone(int flags, int *pidfd, int (*fn)(void *))
-{
-	size_t stack_size = 1024;
-	char *stack[1024] = { 0 };
-
-#ifdef __ia64__
-	return __clone2(fn, stack, stack_size, flags | SIGCHLD, NULL, pidfd);
-#else
-	return clone(fn, stack + stack_size, flags | SIGCHLD, NULL, pidfd);
-#endif
-}
-
 static int signal_received;
 
 static void set_signal_received_on_sigusr1(int sig)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/2] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-09 16:05 ` [PATCH v2 1/2] " Christian Kellner
  2019-10-09 16:05   ` [PATCH v2 2/2] pidfd: add tests for NSpid info " Christian Kellner
@ 2019-10-09 17:29   ` Christian Brauner
  2019-10-11 12:23   ` [PATCH v3 " Christian Kellner
  2 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-10-09 17:29 UTC (permalink / raw)
  To: Christian Kellner
  Cc: linux-kernel, linux-api, Christian Kellner, Christian Brauner,
	Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Elena Reshetova, Thomas Gleixner,
	Roman Gushchin, Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin

On Wed, Oct 09, 2019 at 06:05:30PM +0200, Christian Kellner wrote:
> From: Christian Kellner <christian@kellner.me>
> 
> The fdinfo file for a process file descriptor already contains the
> pid of the process in the callers namespaces. Additionally, if pid
> namespaces are configured, show the process ids of the process in
> all nested namespaces in the same format as in the procfs status
> file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
> of the processes in nested namespaces.
> 
> Signed-off-by: Christian Kellner <christian@kellner.me>
> ---
> 
> Changes in v2:
> - Moved into separate function to avoid multiple ifdefs as suggested
>   by Michal Hocko
> 
>  kernel/fork.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 5a0fd518e04e..f7a59ef046e9 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1681,12 +1681,27 @@ static int pidfd_release(struct inode *inode, struct file *file)
>  }
>  
>  #ifdef CONFIG_PROC_FS
> +static void pidfd_nspid(struct seq_file *m, struct pid *pid)

If it has to be a separate helper then please make it:

static inline void print_pidfd_nspid(struct seq_file *m, struct pid_namespace *ns, struct pid *pid)
{
#ifdef CONFIG_PID_NS
	int i;

	seq_puts(m, "\nNSpid:");
	for (i = ns->level; i <= pid->level; i++) {
		ns = pid->numbers[i].ns;
		seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
	}
#endif
}

It's called nowhere else and we've already retrieved the pid_namespace
in pidfd_show_fdinfo().

> +{
> +#ifdef CONFIG_PID_NS
> +	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
> +	int i;
> +
> +	seq_puts(m, "\nNSpid:");
> +	for (i = ns->level; i <= pid->level; i++) {
> +		ns = pid->numbers[i].ns;
> +		seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
> +	}
> +#endif
> +}
> +
>  static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
>  {
>  	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
>  	struct pid *pid = f->private_data;
>  
>  	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
> +	pidfd_nspid(m, pid);
>  	seq_putc(m, '\n');
>  }
>  #endif
> -- 
> 2.21.0
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v3 1/2] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-09 16:05 ` [PATCH v2 1/2] " Christian Kellner
  2019-10-09 16:05   ` [PATCH v2 2/2] pidfd: add tests for NSpid info " Christian Kellner
  2019-10-09 17:29   ` [PATCH v2 1/2] pidfd: show pids for nested pid namespaces " Christian Brauner
@ 2019-10-11 12:23   ` Christian Kellner
  2019-10-11 12:23     ` [PATCH v3 2/2] pidfd: add tests for NSpid info " Christian Kellner
                       ` (3 more replies)
  2 siblings, 4 replies; 36+ messages in thread
From: Christian Kellner @ 2019-10-11 12:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-api, Christian Kellner, Christian Brauner, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin

From: Christian Kellner <christian@kellner.me>

The fdinfo file for a process file descriptor already contains the
pid of the process in the callers namespaces. Additionally, if pid
namespaces are configured, show the process ids of the process in
all nested namespaces in the same format as in the procfs status
file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
of the processes in nested namespaces.

Signed-off-by: Christian Kellner <christian@kellner.me>
---

Changes in v2:
- Moved into separate function to avoid multiple ifdefs as suggested
  by Michal Hocko
Changes in v3:
- Helper function takes struct pid_namespace *ns param and got a new
  name

 kernel/fork.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index bcdf53125210..183950aad82b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1695,12 +1695,27 @@ static int pidfd_release(struct inode *inode, struct file *file)
 }
 
 #ifdef CONFIG_PROC_FS
+static inline void print_pidfd_nspid(struct seq_file *m, struct pid *pid,
+				     struct pid_namespace *ns)
+{
+#ifdef CONFIG_PID_NS
+	int i;
+
+	seq_puts(m, "\nNSpid:");
+	for (i = ns->level; i <= pid->level; i++) {
+		ns = pid->numbers[i].ns;
+		seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
+	}
+#endif
+}
+
 static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
 {
 	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
 	struct pid *pid = f->private_data;
 
 	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
+	print_pidfd_nspid(m, pid, ns);
 	seq_putc(m, '\n');
 }
 #endif
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 2/2] pidfd: add tests for NSpid info in fdinfo
  2019-10-11 12:23   ` [PATCH v3 " Christian Kellner
@ 2019-10-11 12:23     ` Christian Kellner
  2019-10-11 13:18         ` Christian Brauner
  2019-10-11 13:17     ` [PATCH v3 1/2] pidfd: show pids for nested pid namespaces " Christian Brauner
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 36+ messages in thread
From: Christian Kellner @ 2019-10-11 12:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-api, Christian Kellner, Christian Brauner, Shuah Khan,
	Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin, linux-kselftest

From: Christian Kellner <christian@kellner.me>

Add tests that check that if pid namespaces are configured the fdinfo
file of a pidfd contains an NSpid: entry containing the process id
in the current and additionally all nested namespaces.

Signed-off-by: Christian Kellner <christian@kellner.me>
---
 tools/testing/selftests/pidfd/Makefile        |  2 +-
 tools/testing/selftests/pidfd/pidfd.h         | 12 +++
 .../selftests/pidfd/pidfd_fdinfo_test.c       | 98 +++++++++++++++++++
 tools/testing/selftests/pidfd/pidfd_test.c    | 12 ---
 4 files changed, 111 insertions(+), 13 deletions(-)
 create mode 100644 tools/testing/selftests/pidfd/pidfd_fdinfo_test.c

diff --git a/tools/testing/selftests/pidfd/Makefile b/tools/testing/selftests/pidfd/Makefile
index 7550f08822a3..43db1b98e845 100644
--- a/tools/testing/selftests/pidfd/Makefile
+++ b/tools/testing/selftests/pidfd/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 CFLAGS += -g -I../../../../usr/include/ -pthread
 
-TEST_GEN_PROGS := pidfd_test pidfd_open_test pidfd_poll_test pidfd_wait
+TEST_GEN_PROGS := pidfd_test pidfd_fdinfo_test pidfd_open_test pidfd_poll_test pidfd_wait
 
 include ../lib.mk
 
diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
index c6bc68329f4b..2946d788645b 100644
--- a/tools/testing/selftests/pidfd/pidfd.h
+++ b/tools/testing/selftests/pidfd/pidfd.h
@@ -84,4 +84,16 @@ static inline int sys_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
 	return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
 }
 
+static pid_t pidfd_clone(int flags, int *pidfd, int (*fn)(void *))
+{
+	size_t stack_size = 1024;
+	char *stack[1024] = { 0 };
+
+#ifdef __ia64__
+	return __clone2(fn, stack, stack_size, flags | SIGCHLD, NULL, pidfd);
+#else
+	return clone(fn, stack + stack_size, flags | SIGCHLD, NULL, pidfd);
+#endif
+}
+
 #endif /* __PIDFD_H */
diff --git a/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c b/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c
new file mode 100644
index 000000000000..fbae502ad8ad
--- /dev/null
+++ b/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <linux/types.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <syscall.h>
+#include <sys/wait.h>
+
+#include "pidfd.h"
+#include "../kselftest.h"
+
+static int child_fdinfo_nspid_test(void *args)
+{
+	ksft_print_msg("Child: pid %d\n", getpid());
+	return 0;
+}
+
+static int compare_fdinfo_nspid(int pidfd, char *expect, size_t len)
+{
+	char path[512];
+	FILE *f;
+	size_t n = 0;
+	ssize_t k;
+	char *line = NULL;
+	int r = -1;
+
+	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
+
+	f = fopen(path, "re");
+	if (!f)
+		return -1;
+
+	while ((k = getline(&line, &n, f)) != -1) {
+		if (strncmp(line, "NSpid:", 6))
+			continue;
+
+		line[k - 1] = '\0';
+		ksft_print_msg("Child: fdinfo NSpid line: '%s'.\n", line);
+		r = strncmp(line + 6, expect, len);
+		break;
+	}
+
+	free(line);
+	fclose(f);
+
+	return r;
+}
+
+static void test_pidfd_fdinfo_nspid(void)
+{
+	char expect[512];
+	int pid, pidfd = 0;
+	int n, r;
+	const char *test_name = "pidfd check for NSpid information in fdinfo";
+
+	pid = pidfd_clone(CLONE_PIDFD | CLONE_NEWPID | CLONE_NEWUSER, &pidfd,
+			  child_fdinfo_nspid_test);
+	if (pid < 0)
+		ksft_exit_fail_msg(
+			"%s test: pidfd_clone failed (ret %d, errno %d)\n",
+			test_name, pid, errno);
+
+	ksft_print_msg("Parent: child-pid: %d\n", pid);
+
+	/* The child will have pid 1 in the new pid namespace,
+	 * so the line must be 'NSPid:\t<pid>\t1'
+	 */
+	n = snprintf(expect, sizeof(expect), "\t%d\t%d", pid, 1);
+	r = compare_fdinfo_nspid(pidfd, expect, n);
+
+	(void)close(pidfd);
+
+	if (wait_for_pid(pid))
+		ksft_exit_fail_msg(
+			"%s test: waitpid failed (ret %d, errno %d)\n",
+			test_name, r, errno);
+
+	if (r != 0)
+		ksft_exit_fail_msg("%s test: Failed\n", test_name);
+	else
+		ksft_test_result_pass("%s test: Passed\n", test_name);
+}
+
+int main(int argc, char **argv)
+{
+	ksft_print_header();
+	ksft_set_plan(1);
+
+	test_pidfd_fdinfo_nspid();
+
+	return ksft_exit_pass();
+}
diff --git a/tools/testing/selftests/pidfd/pidfd_test.c b/tools/testing/selftests/pidfd/pidfd_test.c
index 7aff2d3b42c0..9cf0b6b3e389 100644
--- a/tools/testing/selftests/pidfd/pidfd_test.c
+++ b/tools/testing/selftests/pidfd/pidfd_test.c
@@ -27,18 +27,6 @@
 
 #define MAX_EVENTS 5
 
-static pid_t pidfd_clone(int flags, int *pidfd, int (*fn)(void *))
-{
-	size_t stack_size = 1024;
-	char *stack[1024] = { 0 };
-
-#ifdef __ia64__
-	return __clone2(fn, stack, stack_size, flags | SIGCHLD, NULL, pidfd);
-#else
-	return clone(fn, stack + stack_size, flags | SIGCHLD, NULL, pidfd);
-#endif
-}
-
 static int signal_received;
 
 static void set_signal_received_on_sigusr1(int sig)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 1/2] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-11 12:23   ` [PATCH v3 " Christian Kellner
  2019-10-11 12:23     ` [PATCH v3 2/2] pidfd: add tests for NSpid info " Christian Kellner
@ 2019-10-11 13:17     ` Christian Brauner
  2019-10-11 14:55     ` Jann Horn
  2019-10-14 16:20     ` [PATCH v4 1/2] " Christian Kellner
  3 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-10-11 13:17 UTC (permalink / raw)
  To: Christian Kellner
  Cc: linux-kernel, linux-api, Christian Kellner, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin

On Fri, Oct 11, 2019 at 02:23:20PM +0200, Christian Kellner wrote:
> From: Christian Kellner <christian@kellner.me>
> 
> The fdinfo file for a process file descriptor already contains the
> pid of the process in the callers namespaces. Additionally, if pid
> namespaces are configured, show the process ids of the process in
> all nested namespaces in the same format as in the procfs status
> file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
> of the processes in nested namespaces.
> 
> Signed-off-by: Christian Kellner <christian@kellner.me>

Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>

If I hear no technical objections I'll pick this up targeting the 5.5
merge window.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 2/2] pidfd: add tests for NSpid info in fdinfo
  2019-10-11 12:23     ` [PATCH v3 2/2] pidfd: add tests for NSpid info " Christian Kellner
@ 2019-10-11 13:18         ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-10-11 13:18 UTC (permalink / raw)
  To: Christian Kellner, Shuah Khan
  Cc: linux-kernel, linux-api, Christian Kellner, Shuah Khan,
	Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin, linux-kselftest

On Fri, Oct 11, 2019 at 02:23:21PM +0200, Christian Kellner wrote:
> From: Christian Kellner <christian@kellner.me>
> 
> Add tests that check that if pid namespaces are configured the fdinfo
> file of a pidfd contains an NSpid: entry containing the process id
> in the current and additionally all nested namespaces.
> 
> Signed-off-by: Christian Kellner <christian@kellner.me>

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

Shuah, can I get an Ack for you from this. If you have no objections I'd
queue up this patchset for the 5.5 merge window.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 2/2] pidfd: add tests for NSpid info in fdinfo
@ 2019-10-11 13:18         ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-10-11 13:18 UTC (permalink / raw)
  To: Christian Kellner
  Cc: linux-kernel, linux-api, Christian Kellner, Shuah Khan,
	Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin, linux-kselftest

On Fri, Oct 11, 2019 at 02:23:21PM +0200, Christian Kellner wrote:
> From: Christian Kellner <christian@kellner.me>
> 
> Add tests that check that if pid namespaces are configured the fdinfo
> file of a pidfd contains an NSpid: entry containing the process id
> in the current and additionally all nested namespaces.
> 
> Signed-off-by: Christian Kellner <christian@kellner.me>

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

Shuah, can I get an Ack for you from this. If you have no objections I'd
queue up this patchset for the 5.5 merge window.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 1/2] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-11 12:23   ` [PATCH v3 " Christian Kellner
  2019-10-11 12:23     ` [PATCH v3 2/2] pidfd: add tests for NSpid info " Christian Kellner
  2019-10-11 13:17     ` [PATCH v3 1/2] pidfd: show pids for nested pid namespaces " Christian Brauner
@ 2019-10-11 14:55     ` Jann Horn
  2019-10-11 15:17       ` Christian Brauner
  2019-10-14 16:20     ` [PATCH v4 1/2] " Christian Kellner
  3 siblings, 1 reply; 36+ messages in thread
From: Jann Horn @ 2019-10-11 14:55 UTC (permalink / raw)
  To: Christian Kellner, Christian Brauner
  Cc: kernel list, Linux API, Christian Kellner, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin

On Fri, Oct 11, 2019 at 2:23 PM Christian Kellner <ckellner@redhat.com> wrote:
> The fdinfo file for a process file descriptor already contains the
> pid of the process in the callers namespaces. Additionally, if pid
> namespaces are configured, show the process ids of the process in
> all nested namespaces in the same format as in the procfs status
> file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
> of the processes in nested namespaces.
[...]
>  #ifdef CONFIG_PROC_FS
> +static inline void print_pidfd_nspid(struct seq_file *m, struct pid *pid,
> +                                    struct pid_namespace *ns)

`ns` is the namespace of the PID namespace of the procfs instance
through which the file descriptor is being viewed.

> +{
> +#ifdef CONFIG_PID_NS
> +       int i;
> +
> +       seq_puts(m, "\nNSpid:");
> +       for (i = ns->level; i <= pid->level; i++) {

ns->level is the level of the PID namespace associated with the procfs
instance through which the file descriptor is being viewed. pid->level
is the level of the PID associated with the pidfd.

> +               ns = pid->numbers[i].ns;
> +               seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
> +       }
> +#endif
> +}

I think you assumed that `ns` is always going to contain `pid`.
However, that's not the case. Consider the following scenario:

 - the init_pid_ns has two child PID namespaces, A and B (each with
its own mount namespace and procfs instance)
 - process P1 lives in A
 - process P2 lives in B
 - P1 opens a pidfd for itself
 - P1 passes the pidfd to P2 (e.g. via a unix domain socket)
 - P2 reads /proc/self/fdinfo/$pidfd

Now the loop will print the ID of P1 in A. I don't think that's what
you intended? You might want to bail out if "pid_nr_ns(pid, ns) == 0",
or something like that.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 2/2] pidfd: add tests for NSpid info in fdinfo
  2019-10-09 16:05   ` [PATCH v2 2/2] pidfd: add tests for NSpid info " Christian Kellner
@ 2019-10-11 15:09       ` Jann Horn
  0 siblings, 0 replies; 36+ messages in thread
From: Jann Horn @ 2019-10-11 15:09 UTC (permalink / raw)
  To: Christian Kellner
  Cc: kernel list, Linux API, Christian Kellner, Christian Brauner,
	Shuah Khan, Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin, open list:KERNEL SELFTEST FRAMEWORK

On Wed, Oct 9, 2019 at 6:10 PM Christian Kellner <ckellner@redhat.com> wrote:
> Add tests that check that if pid namespaces are configured the fdinfo
> file of a pidfd contains an NSpid: entry containing the process id
> in the current and additionally all nested namespaces.
[...]
> +static int compare_fdinfo_nspid(int pidfd, char *expect, size_t len)
> +{
> +       char path[512];
> +       FILE *f;
> +       size_t n = 0;
> +       ssize_t k;
> +       char *line = NULL;
> +       int r = -1;
> +
> +       snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);

(Maybe at some point the selftests code should add some more concise
alternative to snprintf() calls on separate lines. A macro or
something like that so that you can write stuff like `f =
fopen(tprintf("/proc/self/fdinfo/%d", pidfd), "re")`.)

> +       f = fopen(path, "re");
> +       if (!f)
> +               return -1;
> +
> +       while ((k = getline(&line, &n, f)) != -1) {
> +               if (strncmp(line, "NSpid:", 6))
> +                       continue;
> +
> +               line[k - 1] = '\0';
> +               ksft_print_msg("Child: fdinfo NSpid line: '%s'.\n", line);
> +               r = strncmp(line + 6, expect, len);

Wouldn't it be better to get rid of the nullbyte assignment and change
the strncmp() into a strcmp() here...

[...]
> +       /* The child will have pid 1 in the new pid namespace,
> +        * so the line must be 'NSPid:\t<pid>\t1'
> +        */
> +       n = snprintf(expect, sizeof(expect), "\t%d\t%d", pid, 1);

... and add a "\n" to the format string? It's shorter and doesn't
silently ignore it if the line doesn't end at that point.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 2/2] pidfd: add tests for NSpid info in fdinfo
@ 2019-10-11 15:09       ` Jann Horn
  0 siblings, 0 replies; 36+ messages in thread
From: Jann Horn @ 2019-10-11 15:09 UTC (permalink / raw)
  To: Christian Kellner
  Cc: kernel list, Linux API, Christian Kellner, Christian Brauner,
	Shuah Khan, Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin, open list:KERNEL SELFTEST FRAMEWORK

On Wed, Oct 9, 2019 at 6:10 PM Christian Kellner <ckellner@redhat.com> wrote:
> Add tests that check that if pid namespaces are configured the fdinfo
> file of a pidfd contains an NSpid: entry containing the process id
> in the current and additionally all nested namespaces.
[...]
> +static int compare_fdinfo_nspid(int pidfd, char *expect, size_t len)
> +{
> +       char path[512];
> +       FILE *f;
> +       size_t n = 0;
> +       ssize_t k;
> +       char *line = NULL;
> +       int r = -1;
> +
> +       snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);

(Maybe at some point the selftests code should add some more concise
alternative to snprintf() calls on separate lines. A macro or
something like that so that you can write stuff like `f =
fopen(tprintf("/proc/self/fdinfo/%d", pidfd), "re")`.)

> +       f = fopen(path, "re");
> +       if (!f)
> +               return -1;
> +
> +       while ((k = getline(&line, &n, f)) != -1) {
> +               if (strncmp(line, "NSpid:", 6))
> +                       continue;
> +
> +               line[k - 1] = '\0';
> +               ksft_print_msg("Child: fdinfo NSpid line: '%s'.\n", line);
> +               r = strncmp(line + 6, expect, len);

Wouldn't it be better to get rid of the nullbyte assignment and change
the strncmp() into a strcmp() here...

[...]
> +       /* The child will have pid 1 in the new pid namespace,
> +        * so the line must be 'NSPid:\t<pid>\t1'
> +        */
> +       n = snprintf(expect, sizeof(expect), "\t%d\t%d", pid, 1);

... and add a "\n" to the format string? It's shorter and doesn't
silently ignore it if the line doesn't end at that point.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 1/2] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-11 14:55     ` Jann Horn
@ 2019-10-11 15:17       ` Christian Brauner
  2019-10-11 15:30         ` Jann Horn
  0 siblings, 1 reply; 36+ messages in thread
From: Christian Brauner @ 2019-10-11 15:17 UTC (permalink / raw)
  To: Jann Horn
  Cc: Christian Kellner, Christian Brauner, kernel list, Linux API,
	Christian Kellner, Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin

On Fri, Oct 11, 2019 at 04:55:59PM +0200, Jann Horn wrote:
> On Fri, Oct 11, 2019 at 2:23 PM Christian Kellner <ckellner@redhat.com> wrote:
> > The fdinfo file for a process file descriptor already contains the
> > pid of the process in the callers namespaces. Additionally, if pid
> > namespaces are configured, show the process ids of the process in
> > all nested namespaces in the same format as in the procfs status
> > file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
> > of the processes in nested namespaces.
> [...]
> >  #ifdef CONFIG_PROC_FS
> > +static inline void print_pidfd_nspid(struct seq_file *m, struct pid *pid,
> > +                                    struct pid_namespace *ns)
> 
> `ns` is the namespace of the PID namespace of the procfs instance
> through which the file descriptor is being viewed.
> 
> > +{
> > +#ifdef CONFIG_PID_NS
> > +       int i;
> > +
> > +       seq_puts(m, "\nNSpid:");
> > +       for (i = ns->level; i <= pid->level; i++) {
> 
> ns->level is the level of the PID namespace associated with the procfs
> instance through which the file descriptor is being viewed. pid->level
> is the level of the PID associated with the pidfd.
> 
> > +               ns = pid->numbers[i].ns;
> > +               seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
> > +       }
> > +#endif
> > +}
> 
> I think you assumed that `ns` is always going to contain `pid`.
> However, that's not the case. Consider the following scenario:
> 
>  - the init_pid_ns has two child PID namespaces, A and B (each with
> its own mount namespace and procfs instance)
>  - process P1 lives in A
>  - process P2 lives in B
>  - P1 opens a pidfd for itself
>  - P1 passes the pidfd to P2 (e.g. via a unix domain socket)
>  - P2 reads /proc/self/fdinfo/$pidfd
> 
> Now the loop will print the ID of P1 in A. I don't think that's what
> you intended? You might want to bail out if "pid_nr_ns(pid, ns) == 0",
> or something like that.

I assumed the same thing happens when you pass around an fd for
/proc/self/status and that's why I didn't object to this behavior.

Christian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 1/2] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-11 15:17       ` Christian Brauner
@ 2019-10-11 15:30         ` Jann Horn
  2019-10-11 16:58           ` Christian Brauner
  0 siblings, 1 reply; 36+ messages in thread
From: Jann Horn @ 2019-10-11 15:30 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christian Kellner, Christian Brauner, kernel list, Linux API,
	Christian Kellner, Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin

On Fri, Oct 11, 2019 at 5:17 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Fri, Oct 11, 2019 at 04:55:59PM +0200, Jann Horn wrote:
> > On Fri, Oct 11, 2019 at 2:23 PM Christian Kellner <ckellner@redhat.com> wrote:
> > > The fdinfo file for a process file descriptor already contains the
> > > pid of the process in the callers namespaces. Additionally, if pid
> > > namespaces are configured, show the process ids of the process in
> > > all nested namespaces in the same format as in the procfs status
> > > file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
> > > of the processes in nested namespaces.
> > [...]
> > >  #ifdef CONFIG_PROC_FS
> > > +static inline void print_pidfd_nspid(struct seq_file *m, struct pid *pid,
> > > +                                    struct pid_namespace *ns)
> >
> > `ns` is the namespace of the PID namespace of the procfs instance
> > through which the file descriptor is being viewed.
> >
> > > +{
> > > +#ifdef CONFIG_PID_NS
> > > +       int i;
> > > +
> > > +       seq_puts(m, "\nNSpid:");
> > > +       for (i = ns->level; i <= pid->level; i++) {
> >
> > ns->level is the level of the PID namespace associated with the procfs
> > instance through which the file descriptor is being viewed. pid->level
> > is the level of the PID associated with the pidfd.
> >
> > > +               ns = pid->numbers[i].ns;
> > > +               seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
> > > +       }
> > > +#endif
> > > +}
> >
> > I think you assumed that `ns` is always going to contain `pid`.
> > However, that's not the case. Consider the following scenario:
> >
> >  - the init_pid_ns has two child PID namespaces, A and B (each with
> > its own mount namespace and procfs instance)
> >  - process P1 lives in A
> >  - process P2 lives in B
> >  - P1 opens a pidfd for itself
> >  - P1 passes the pidfd to P2 (e.g. via a unix domain socket)
> >  - P2 reads /proc/self/fdinfo/$pidfd
> >
> > Now the loop will print the ID of P1 in A. I don't think that's what
> > you intended? You might want to bail out if "pid_nr_ns(pid, ns) == 0",
> > or something like that.
>
> I assumed the same thing happens when you pass around an fd for
> /proc/self/status and that's why I didn't object to this behavior.

I don't see how /proc/$pid/status is relevant. In the
/proc/$pid/status case, the output is the list of PIDs starting at the
PID namespace the procfs is associated with; and the process is always
contained in that namespace, which also means that the first PID
listed is the one in the PID namespace of the procfs instance. In the
pidfd case, the process is not necessarily contained in that
namespace, and the output doesn't make sense.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 1/2] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-11 15:30         ` Jann Horn
@ 2019-10-11 16:58           ` Christian Brauner
  2019-10-11 18:20             ` Jann Horn
  0 siblings, 1 reply; 36+ messages in thread
From: Christian Brauner @ 2019-10-11 16:58 UTC (permalink / raw)
  To: Jann Horn
  Cc: Christian Kellner, kernel list, Linux API, Christian Kellner,
	Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin

On Fri, Oct 11, 2019 at 05:30:03PM +0200, Jann Horn wrote:
> On Fri, Oct 11, 2019 at 5:17 PM Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > On Fri, Oct 11, 2019 at 04:55:59PM +0200, Jann Horn wrote:
> > > On Fri, Oct 11, 2019 at 2:23 PM Christian Kellner <ckellner@redhat.com> wrote:
> > > > The fdinfo file for a process file descriptor already contains the
> > > > pid of the process in the callers namespaces. Additionally, if pid
> > > > namespaces are configured, show the process ids of the process in
> > > > all nested namespaces in the same format as in the procfs status
> > > > file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
> > > > of the processes in nested namespaces.
> > > [...]
> > > >  #ifdef CONFIG_PROC_FS
> > > > +static inline void print_pidfd_nspid(struct seq_file *m, struct pid *pid,
> > > > +                                    struct pid_namespace *ns)
> > >
> > > `ns` is the namespace of the PID namespace of the procfs instance
> > > through which the file descriptor is being viewed.
> > >
> > > > +{
> > > > +#ifdef CONFIG_PID_NS
> > > > +       int i;
> > > > +
> > > > +       seq_puts(m, "\nNSpid:");
> > > > +       for (i = ns->level; i <= pid->level; i++) {
> > >
> > > ns->level is the level of the PID namespace associated with the procfs
> > > instance through which the file descriptor is being viewed. pid->level
> > > is the level of the PID associated with the pidfd.
> > >
> > > > +               ns = pid->numbers[i].ns;
> > > > +               seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
> > > > +       }
> > > > +#endif
> > > > +}
> > >
> > > I think you assumed that `ns` is always going to contain `pid`.
> > > However, that's not the case. Consider the following scenario:
> > >
> > >  - the init_pid_ns has two child PID namespaces, A and B (each with
> > > its own mount namespace and procfs instance)
> > >  - process P1 lives in A
> > >  - process P2 lives in B
> > >  - P1 opens a pidfd for itself
> > >  - P1 passes the pidfd to P2 (e.g. via a unix domain socket)
> > >  - P2 reads /proc/self/fdinfo/$pidfd
> > >
> > > Now the loop will print the ID of P1 in A. I don't think that's what
> > > you intended? You might want to bail out if "pid_nr_ns(pid, ns) == 0",
> > > or something like that.
> >
> > I assumed the same thing happens when you pass around an fd for
> > /proc/self/status and that's why I didn't object to this behavior.
> 
> I don't see how /proc/$pid/status is relevant. In the
> /proc/$pid/status case, the output is the list of PIDs starting at the
> PID namespace the procfs is associated with; and the process is always
> contained in that namespace, which also means that the first PID
> listed is the one in the PID namespace of the procfs instance. In the
> pidfd case, the process is not necessarily contained in that
> namespace, and the output doesn't make sense.

I might be misreading what you're saying.
(Maybe I'm doing something obviously wrong.)
If I compile the following two programs:
b2: https://paste.ubuntu.com/p/xthMsCXy3s/
c2: https://paste.ubuntu.com/p/y5HSzyMQJr/

Then in shell1
sudo unshare --mount --pid --fork --mount-proc

and in shell2
sudo unshare --mount --pid --fork --mount-proc

and run b2 in shell1 and c2 in shell2 which sends around an fd for
/proc/b2/status to c2. Now c2 reads b2's status file via the fd it
received. The c2 will see the pid of b2 in b2's pid namespace even
though the process is not contained in the pid namespace of c2.

Christian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 2/2] pidfd: add tests for NSpid info in fdinfo
  2019-10-11 15:09       ` Jann Horn
@ 2019-10-11 17:08         ` Christian Brauner
  -1 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-10-11 17:08 UTC (permalink / raw)
  To: Jann Horn
  Cc: Christian Kellner, kernel list, Linux API, Christian Kellner,
	Shuah Khan, Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin, open list:KERNEL SELFTEST FRAMEWORK

On Fri, Oct 11, 2019 at 05:09:29PM +0200, Jann Horn wrote:
> On Wed, Oct 9, 2019 at 6:10 PM Christian Kellner <ckellner@redhat.com> wrote:
> > Add tests that check that if pid namespaces are configured the fdinfo
> > file of a pidfd contains an NSpid: entry containing the process id
> > in the current and additionally all nested namespaces.
> [...]
> > +static int compare_fdinfo_nspid(int pidfd, char *expect, size_t len)
> > +{
> > +       char path[512];
> > +       FILE *f;
> > +       size_t n = 0;
> > +       ssize_t k;
> > +       char *line = NULL;
> > +       int r = -1;
> > +
> > +       snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
> 
> (Maybe at some point the selftests code should add some more concise
> alternative to snprintf() calls on separate lines. A macro or
> something like that so that you can write stuff like `f =
> fopen(tprintf("/proc/self/fdinfo/%d", pidfd), "re")`.)
> 
> > +       f = fopen(path, "re");
> > +       if (!f)
> > +               return -1;
> > +
> > +       while ((k = getline(&line, &n, f)) != -1) {
> > +               if (strncmp(line, "NSpid:", 6))
> > +                       continue;
> > +
> > +               line[k - 1] = '\0';
> > +               ksft_print_msg("Child: fdinfo NSpid line: '%s'.\n", line);
> > +               r = strncmp(line + 6, expect, len);
> 
> Wouldn't it be better to get rid of the nullbyte assignment and change
> the strncmp() into a strcmp() here...
> 
> [...]
> > +       /* The child will have pid 1 in the new pid namespace,
> > +        * so the line must be 'NSPid:\t<pid>\t1'
> > +        */
> > +       n = snprintf(expect, sizeof(expect), "\t%d\t%d", pid, 1);
> 
> ... and add a "\n" to the format string? It's shorter and doesn't
> silently ignore it if the line doesn't end at that point.

Also, what Christian just told me and what I wanted to suggest is that
we add tests for sending around pidfds and reading fdinfo too.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 2/2] pidfd: add tests for NSpid info in fdinfo
@ 2019-10-11 17:08         ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-10-11 17:08 UTC (permalink / raw)
  To: Jann Horn
  Cc: Christian Kellner, kernel list, Linux API, Christian Kellner,
	Shuah Khan, Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Joel Fernandes (Google),
	Al Viro, Dmitry V. Levin, open list:KERNEL SELFTEST FRAMEWORK

On Fri, Oct 11, 2019 at 05:09:29PM +0200, Jann Horn wrote:
> On Wed, Oct 9, 2019 at 6:10 PM Christian Kellner <ckellner@redhat.com> wrote:
> > Add tests that check that if pid namespaces are configured the fdinfo
> > file of a pidfd contains an NSpid: entry containing the process id
> > in the current and additionally all nested namespaces.
> [...]
> > +static int compare_fdinfo_nspid(int pidfd, char *expect, size_t len)
> > +{
> > +       char path[512];
> > +       FILE *f;
> > +       size_t n = 0;
> > +       ssize_t k;
> > +       char *line = NULL;
> > +       int r = -1;
> > +
> > +       snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
> 
> (Maybe at some point the selftests code should add some more concise
> alternative to snprintf() calls on separate lines. A macro or
> something like that so that you can write stuff like `f =
> fopen(tprintf("/proc/self/fdinfo/%d", pidfd), "re")`.)
> 
> > +       f = fopen(path, "re");
> > +       if (!f)
> > +               return -1;
> > +
> > +       while ((k = getline(&line, &n, f)) != -1) {
> > +               if (strncmp(line, "NSpid:", 6))
> > +                       continue;
> > +
> > +               line[k - 1] = '\0';
> > +               ksft_print_msg("Child: fdinfo NSpid line: '%s'.\n", line);
> > +               r = strncmp(line + 6, expect, len);
> 
> Wouldn't it be better to get rid of the nullbyte assignment and change
> the strncmp() into a strcmp() here...
> 
> [...]
> > +       /* The child will have pid 1 in the new pid namespace,
> > +        * so the line must be 'NSPid:\t<pid>\t1'
> > +        */
> > +       n = snprintf(expect, sizeof(expect), "\t%d\t%d", pid, 1);
> 
> ... and add a "\n" to the format string? It's shorter and doesn't
> silently ignore it if the line doesn't end at that point.

Also, what Christian just told me and what I wanted to suggest is that
we add tests for sending around pidfds and reading fdinfo too.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 1/2] pidfd: show pids for nested pid namespaces in fdinfo
  2019-10-11 16:58           ` Christian Brauner
@ 2019-10-11 18:20             ` Jann Horn
  2019-10-12 10:19               ` [PATCH] pidfd: add NSpid entries to fdinfo Christian Brauner
  0 siblings, 1 reply; 36+ messages in thread
From: Jann Horn @ 2019-10-11 18:20 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christian Kellner, kernel list, Linux API, Christian Kellner,
	Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin

On Fri, Oct 11, 2019 at 6:58 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
> On Fri, Oct 11, 2019 at 05:30:03PM +0200, Jann Horn wrote:
> > On Fri, Oct 11, 2019 at 5:17 PM Christian Brauner
> > <christian.brauner@ubuntu.com> wrote:
> > >
> > > On Fri, Oct 11, 2019 at 04:55:59PM +0200, Jann Horn wrote:
> > > > On Fri, Oct 11, 2019 at 2:23 PM Christian Kellner <ckellner@redhat.com> wrote:
> > > > > The fdinfo file for a process file descriptor already contains the
> > > > > pid of the process in the callers namespaces. Additionally, if pid
> > > > > namespaces are configured, show the process ids of the process in
> > > > > all nested namespaces in the same format as in the procfs status
> > > > > file, i.e. "NSPid:\t%d\%d...". This allows the easy identification
> > > > > of the processes in nested namespaces.
> > > > [...]
> > > > >  #ifdef CONFIG_PROC_FS
> > > > > +static inline void print_pidfd_nspid(struct seq_file *m, struct pid *pid,
> > > > > +                                    struct pid_namespace *ns)
> > > >
> > > > `ns` is the namespace of the PID namespace of the procfs instance
> > > > through which the file descriptor is being viewed.
> > > >
> > > > > +{
> > > > > +#ifdef CONFIG_PID_NS
> > > > > +       int i;
> > > > > +
> > > > > +       seq_puts(m, "\nNSpid:");
> > > > > +       for (i = ns->level; i <= pid->level; i++) {
> > > >
> > > > ns->level is the level of the PID namespace associated with the procfs
> > > > instance through which the file descriptor is being viewed. pid->level
> > > > is the level of the PID associated with the pidfd.
> > > >
> > > > > +               ns = pid->numbers[i].ns;
> > > > > +               seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
> > > > > +       }
> > > > > +#endif
> > > > > +}
> > > >
> > > > I think you assumed that `ns` is always going to contain `pid`.
> > > > However, that's not the case. Consider the following scenario:
> > > >
> > > >  - the init_pid_ns has two child PID namespaces, A and B (each with
> > > > its own mount namespace and procfs instance)
> > > >  - process P1 lives in A
> > > >  - process P2 lives in B
> > > >  - P1 opens a pidfd for itself
> > > >  - P1 passes the pidfd to P2 (e.g. via a unix domain socket)
> > > >  - P2 reads /proc/self/fdinfo/$pidfd
> > > >
> > > > Now the loop will print the ID of P1 in A. I don't think that's what
> > > > you intended? You might want to bail out if "pid_nr_ns(pid, ns) == 0",
> > > > or something like that.
> > >
> > > I assumed the same thing happens when you pass around an fd for
> > > /proc/self/status and that's why I didn't object to this behavior.
> >
> > I don't see how /proc/$pid/status is relevant. In the
> > /proc/$pid/status case, the output is the list of PIDs starting at the
> > PID namespace the procfs is associated with; and the process is always
> > contained in that namespace, which also means that the first PID
> > listed is the one in the PID namespace of the procfs instance. In the
> > pidfd case, the process is not necessarily contained in that
> > namespace, and the output doesn't make sense.
>
> I might be misreading what you're saying.
> (Maybe I'm doing something obviously wrong.)
> If I compile the following two programs:
> b2: https://paste.ubuntu.com/p/xthMsCXy3s/
> c2: https://paste.ubuntu.com/p/y5HSzyMQJr/
>
> Then in shell1
> sudo unshare --mount --pid --fork --mount-proc
>
> and in shell2
> sudo unshare --mount --pid --fork --mount-proc
>
> and run b2 in shell1 and c2 in shell2 which sends around an fd for
> /proc/b2/status to c2. Now c2 reads b2's status file via the fd it
> received. The c2 will see the pid of b2 in b2's pid namespace even
> though the process is not contained in the pid namespace of c2.

Because the reader doesn't matter; the perspective you have on the
system is defined by which pidns the procfs instance you're looking
through is associated with, and here you're looking through shell1's
procfs. It's normal that when you look through another procfs, you see
PIDs differently.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH] pidfd: add NSpid entries to fdinfo
  2019-10-11 18:20             ` Jann Horn
@ 2019-10-12 10:19               ` Christian Brauner
  2019-10-12 10:21                 ` Christian Brauner
  2019-10-14 15:09                 ` Jann Horn
  0 siblings, 2 replies; 36+ messages in thread
From: Christian Brauner @ 2019-10-12 10:19 UTC (permalink / raw)
  To: jannh
  Cc: aarcange, akpm, christian.brauner, christian, ckellner, cyphar,
	elena.reshetova, guro, ldv, linux-api, linux-kernel, mhocko,
	mingo, peterz, tglx, viro

Currently, the fdinfo file of contains the field Pid:
It contains the pid a given pidfd refers to in the pid namespace of the
opener's procfs instance.
If the pid namespace of the process is not a descendant of the pid
namespace of the procfs instance 0 will be shown as its pid. This is
similar to calling getppid() on a process who's parent is out of it's
pid namespace (e.g. when moving a process into a sibling pid namespace
via setns()).

Add an NSpid field for easy retrieval of the pid in all descendant pid
namespaces:
If pid namespaces are supported this field will contain the pid a given
pidfd refers to for all descendant pid namespaces starting from the
current pid namespace of the opener's procfs instance, i.e. the first
pid entry for Pid and NSpid will be identical.
If the pid namespace of the process is not a descendant of the pid
namespace of the procfs instance 0 will be shown as its first NSpid and
no other NSpid entries will be shown.
Note that this differs from the Pid and NSpid fields in
/proc/<pid>/status where Pid and NSpid are always shown relative to the
pid namespace of the opener's procfs instace. The difference becomes
obvious when sending around a pidfd between pid namespaces from
different trees, i.e. where no ancestoral relation is present between
the pid namespaces:
1. sending around pidfd:
- create two new pid namespaces ns1 and ns2 in the initial pid namespace
  (Also take care to create new mount namespaces in the new pid
  namespace and mount procfs.)
- create a process with a pidfd in ns1
- send pidfd from ns1 to ns2
- read /proc/self/fdinfo/<pidfd> and observe that Pid and NSpid entry
  are 0
- create a process with a pidfd in
- open a pidfd for a process in the initial pid namespace
2. sending around /proc/<pid>/status fd:
- create two new pid namespaces ns1 and ns2 in the initial pid namespace
  (Also take care to create new mount namespaces in the new pid
  namespace and mount procfs.)
- create a process in ns1
- open /proc/<pid>/status in the initial pid namespace for the process
  you created in ns1
- send statusfd from initial pid namespace to ns2
- read statusfd and observe:
  - that Pid will contain the pid of the process as seen from the init
  - that NSpid will contain the pids of the process for all descendant
    pid namespaces starting from the initial pid namespace

Cc: Jann Horn <jannh@google.com>
Cc: linux-api@vger.kernel.org
Co-Developed-by: Christian Kellner <christian@kellner.me>
Signed-off-by: Christian Kellner <christian@kellner.me>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 kernel/fork.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 1f6c45f6a734..b155bad92d9c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1695,12 +1695,83 @@ static int pidfd_release(struct inode *inode, struct file *file)
 }
 
 #ifdef CONFIG_PROC_FS
+/**
+ * pidfd_show_fdinfo - print information about a pidfd
+ * @m: proc fdinfo file
+ * @f: file referencing a pidfd
+ *
+ * Pid:
+ * This function will print the pid a given pidfd refers to in the pid
+ * namespace of the opener's procfs instance.
+ * If the pid namespace of the process is not a descendant of the pid
+ * namespace of the procfs instance 0 will be shown as its pid. This is
+ * similar to calling getppid() on a process who's parent is out of it's
+ * pid namespace (e.g. when moving a process into a sibling pid namespace
+ * via setns()).
+ *
+ * NSpid:
+ * If pid namespaces are supported then this function will also print the
+ * pid a given pidfd refers to for all descendant pid namespaces starting
+ * from the current pid namespace of the opener's procfs instance, i.e. the
+ * first pid entry for Pid and NSpid will be identical.
+ * If the pid namespace of the process is not a descendant of the pid
+ * namespace of the procfs instance 0 will be shown as its first NSpid and
+ * no other NSpid entries will be shown.
+ * Note that this differs from the Pid and NSpid fields in
+ * /proc/<pid>/status where Pid and NSpid are always shown relative to the
+ * pid namespace of the opener's procfs instace. The difference becomes
+ * obvious when sending around a pidfd between pid namespaces from
+ * different trees, i.e. where no ancestoral relation is present between
+ * the pid namespaces:
+ * 1. sending around pidfd:
+ * - create two new pid namespaces ns1 and ns2 in the initial pid namespace
+ *   (Also take care to create new mount namespaces in the new pid
+ *   namespace and mount procfs.)
+ * - create a process with a pidfd in ns1
+ * - send pidfd from ns1 to ns2
+ * - read /proc/self/fdinfo/<pidfd> and observe that Pid and NSpid entry
+ *   are 0
+ * - create a process with a pidfd in
+ * - open a pidfd for a process in the initial pid namespace
+ * 2. sending around /proc/<pid>/status fd:
+ * - create two new pid namespaces ns1 and ns2 in the initial pid namespace
+ *   (Also take care to create new mount namespaces in the new pid
+ *   namespace and mount procfs.)
+ * - create a process in ns1
+ * - open /proc/<pid>/status in the initial pid namespace for the process
+ *   you created in ns1
+ * - send statusfd from initial pid namespace to ns2
+ * - read statusfd and observe:
+ *   - that Pid will contain the pid of the process as seen from the init
+ *   - that NSpid will contain the pids of the process for all descendant
+ *     pid namespaces starting from the initial pid namespace
+ */
 static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
 {
 	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
 	struct pid *pid = f->private_data;
+	pid_t nr = pid_nr_ns(pid, ns);
+
+	seq_put_decimal_ull(m, "Pid:\t", nr);
+
+#ifdef CONFIG_PID_NS
+	seq_puts(m, "\nNSpid:");
+	if (nr == 0) {
+		/*
+		 * If nr is zero the pid namespace of the procfs and the
+		 * pid namespace of the pidfd are neither the same pid
+		 * namespace nor are they ancestors. Since NSpid and Pid
+		 * are always identical in their first entry shortcut it
+		 * and simply print 0.
+		 */
+		seq_put_decimal_ull(m, "\t", nr);
+	} else {
+		int i;
+		for (i = ns->level; i <= pid->level; i++)
+			seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, pid->numbers[i].ns));
+	}
+#endif
 
-	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
 	seq_putc(m, '\n');
 }
 #endif
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] pidfd: add NSpid entries to fdinfo
  2019-10-12 10:19               ` [PATCH] pidfd: add NSpid entries to fdinfo Christian Brauner
@ 2019-10-12 10:21                 ` Christian Brauner
  2019-10-14  9:43                   ` Christian Kellner
  2019-10-14 15:09                 ` Jann Horn
  1 sibling, 1 reply; 36+ messages in thread
From: Christian Brauner @ 2019-10-12 10:21 UTC (permalink / raw)
  To: jannh
  Cc: aarcange, akpm, christian, ckellner, cyphar, elena.reshetova,
	guro, ldv, linux-api, linux-kernel, mhocko, mingo, peterz, tglx,
	viro

On Sat, Oct 12, 2019 at 12:19:22PM +0200, Christian Brauner wrote:
> Currently, the fdinfo file of contains the field Pid:
> It contains the pid a given pidfd refers to in the pid namespace of the
> opener's procfs instance.
> If the pid namespace of the process is not a descendant of the pid
> namespace of the procfs instance 0 will be shown as its pid. This is
> similar to calling getppid() on a process who's parent is out of it's
> pid namespace (e.g. when moving a process into a sibling pid namespace
> via setns()).
> 
> Add an NSpid field for easy retrieval of the pid in all descendant pid
> namespaces:
> If pid namespaces are supported this field will contain the pid a given
> pidfd refers to for all descendant pid namespaces starting from the
> current pid namespace of the opener's procfs instance, i.e. the first
> pid entry for Pid and NSpid will be identical.
> If the pid namespace of the process is not a descendant of the pid
> namespace of the procfs instance 0 will be shown as its first NSpid and
> no other NSpid entries will be shown.
> Note that this differs from the Pid and NSpid fields in
> /proc/<pid>/status where Pid and NSpid are always shown relative to the
> pid namespace of the opener's procfs instace. The difference becomes
> obvious when sending around a pidfd between pid namespaces from
> different trees, i.e. where no ancestoral relation is present between
> the pid namespaces:
> 1. sending around pidfd:
> - create two new pid namespaces ns1 and ns2 in the initial pid namespace
>   (Also take care to create new mount namespaces in the new pid
>   namespace and mount procfs.)
> - create a process with a pidfd in ns1
> - send pidfd from ns1 to ns2
> - read /proc/self/fdinfo/<pidfd> and observe that Pid and NSpid entry
>   are 0
> - create a process with a pidfd in
> - open a pidfd for a process in the initial pid namespace
> 2. sending around /proc/<pid>/status fd:
> - create two new pid namespaces ns1 and ns2 in the initial pid namespace
>   (Also take care to create new mount namespaces in the new pid
>   namespace and mount procfs.)
> - create a process in ns1
> - open /proc/<pid>/status in the initial pid namespace for the process
>   you created in ns1
> - send statusfd from initial pid namespace to ns2
> - read statusfd and observe:
>   - that Pid will contain the pid of the process as seen from the init
>   - that NSpid will contain the pids of the process for all descendant
>     pid namespaces starting from the initial pid namespace
> 
> Cc: Jann Horn <jannh@google.com>
> Cc: linux-api@vger.kernel.org
> Co-Developed-by: Christian Kellner <christian@kellner.me>
> Signed-off-by: Christian Kellner <christian@kellner.me>
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

I think this might be more what we want.
I tried to think of cases where the first entry of Pid is not identical
to the first entry of NSpid but I came up with none. Maybe you do, Jann?

Christian, this is just a quick stab I took. Feel free to pick this up
as a template.

Thanks!
Christian

> ---
>  kernel/fork.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 72 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 1f6c45f6a734..b155bad92d9c 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1695,12 +1695,83 @@ static int pidfd_release(struct inode *inode, struct file *file)
>  }
>  
>  #ifdef CONFIG_PROC_FS
> +/**
> + * pidfd_show_fdinfo - print information about a pidfd
> + * @m: proc fdinfo file
> + * @f: file referencing a pidfd
> + *
> + * Pid:
> + * This function will print the pid a given pidfd refers to in the pid
> + * namespace of the opener's procfs instance.
> + * If the pid namespace of the process is not a descendant of the pid
> + * namespace of the procfs instance 0 will be shown as its pid. This is
> + * similar to calling getppid() on a process who's parent is out of it's
> + * pid namespace (e.g. when moving a process into a sibling pid namespace
> + * via setns()).
> + *
> + * NSpid:
> + * If pid namespaces are supported then this function will also print the
> + * pid a given pidfd refers to for all descendant pid namespaces starting
> + * from the current pid namespace of the opener's procfs instance, i.e. the
> + * first pid entry for Pid and NSpid will be identical.
> + * If the pid namespace of the process is not a descendant of the pid
> + * namespace of the procfs instance 0 will be shown as its first NSpid and
> + * no other NSpid entries will be shown.
> + * Note that this differs from the Pid and NSpid fields in
> + * /proc/<pid>/status where Pid and NSpid are always shown relative to the
> + * pid namespace of the opener's procfs instace. The difference becomes
> + * obvious when sending around a pidfd between pid namespaces from
> + * different trees, i.e. where no ancestoral relation is present between
> + * the pid namespaces:
> + * 1. sending around pidfd:
> + * - create two new pid namespaces ns1 and ns2 in the initial pid namespace
> + *   (Also take care to create new mount namespaces in the new pid
> + *   namespace and mount procfs.)
> + * - create a process with a pidfd in ns1
> + * - send pidfd from ns1 to ns2
> + * - read /proc/self/fdinfo/<pidfd> and observe that Pid and NSpid entry
> + *   are 0
> + * - create a process with a pidfd in
> + * - open a pidfd for a process in the initial pid namespace
> + * 2. sending around /proc/<pid>/status fd:
> + * - create two new pid namespaces ns1 and ns2 in the initial pid namespace
> + *   (Also take care to create new mount namespaces in the new pid
> + *   namespace and mount procfs.)
> + * - create a process in ns1
> + * - open /proc/<pid>/status in the initial pid namespace for the process
> + *   you created in ns1
> + * - send statusfd from initial pid namespace to ns2
> + * - read statusfd and observe:
> + *   - that Pid will contain the pid of the process as seen from the init
> + *   - that NSpid will contain the pids of the process for all descendant
> + *     pid namespaces starting from the initial pid namespace
> + */
>  static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
>  {
>  	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
>  	struct pid *pid = f->private_data;
> +	pid_t nr = pid_nr_ns(pid, ns);
> +
> +	seq_put_decimal_ull(m, "Pid:\t", nr);
> +
> +#ifdef CONFIG_PID_NS
> +	seq_puts(m, "\nNSpid:");
> +	if (nr == 0) {
> +		/*
> +		 * If nr is zero the pid namespace of the procfs and the
> +		 * pid namespace of the pidfd are neither the same pid
> +		 * namespace nor are they ancestors. Since NSpid and Pid
> +		 * are always identical in their first entry shortcut it
> +		 * and simply print 0.
> +		 */
> +		seq_put_decimal_ull(m, "\t", nr);
> +	} else {
> +		int i;
> +		for (i = ns->level; i <= pid->level; i++)
> +			seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, pid->numbers[i].ns));
> +	}
> +#endif
>  
> -	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
>  	seq_putc(m, '\n');
>  }
>  #endif
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] pidfd: add NSpid entries to fdinfo
  2019-10-12 10:21                 ` Christian Brauner
@ 2019-10-14  9:43                   ` Christian Kellner
  2019-10-14 10:31                     ` Christian Brauner
  0 siblings, 1 reply; 36+ messages in thread
From: Christian Kellner @ 2019-10-14  9:43 UTC (permalink / raw)
  To: Christian Brauner, jannh
  Cc: aarcange, akpm, cyphar, elena.reshetova, guro, ldv, linux-api,
	linux-kernel, mhocko, mingo, peterz, tglx, viro

On Sat, 2019-10-12 at 12:21 +0200, Christian Brauner wrote:
> I think this might be more what we want.
Yep, indeed.

> I tried to think of cases where the first entry of Pid is not
> identical
> to the first entry of NSpid but I came up with none. Maybe you do,
> Jann?
Yeah, I don't think that can be the case. By looking at the source of
'pid_nr_ns(pid, ns)' a non-zero return means that a) 'pid' valid, ie.
non-null and b) 'ns' is in the pid namespace hierarchy of 'pid' (at
pid->level, i.e. "pid->numbers[ns->level].ns == ns").

> Christian, this is just a quick stab I took. Feel free to pick this
> up as a template.
Thanks! I slightly re-worked it, with the reasoning above in mind, to
get rid of one of the branches:

+#ifdef CONFIG_PID_NS
+	seq_put_decimal_ull(m, "\nNSpid:\t", nr);
+	if (nr) {
+		int i;
+
+		/* If nr is non-zero it means that 'pid' is valid and that
+		 * ns, i.e. the pid namespace associated with the procfs
+		 * instance, is in the pid namespace hierarchy of pid.
+		 * Start at one level below and print all descending pids.
+		 */
+		for (i = ns->level + 1; i <= pid->level; i++) {
+			ns = pid->numbers[i].ns;
+			seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
+		}
+	}
+#endif

But I now just realized that with the very same reasoning, if nr is
non-zero, we don't need to redo all the checks and can just do:

for (i = ns->level + 1; i <= pid->level; i++)
	seq_put_decimal_ull(m, "\t", pid->numbers[i].nr);

If this sounds good to you I resend the patches with the change above.

Thanks,
Christian


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] pidfd: add NSpid entries to fdinfo
  2019-10-14  9:43                   ` Christian Kellner
@ 2019-10-14 10:31                     ` Christian Brauner
  2019-10-14 15:10                       ` Jann Horn
  0 siblings, 1 reply; 36+ messages in thread
From: Christian Brauner @ 2019-10-14 10:31 UTC (permalink / raw)
  To: Christian Kellner
  Cc: jannh, aarcange, akpm, cyphar, elena.reshetova, guro, ldv,
	linux-api, linux-kernel, mhocko, mingo, peterz, tglx, viro

On Mon, Oct 14, 2019 at 11:43:01AM +0200, Christian Kellner wrote:
> On Sat, 2019-10-12 at 12:21 +0200, Christian Brauner wrote:
> > I think this might be more what we want.
> Yep, indeed.
> 
> > I tried to think of cases where the first entry of Pid is not
> > identical
> > to the first entry of NSpid but I came up with none. Maybe you do,
> > Jann?
> Yeah, I don't think that can be the case. By looking at the source of
> 'pid_nr_ns(pid, ns)' a non-zero return means that a) 'pid' valid, ie.
> non-null and b) 'ns' is in the pid namespace hierarchy of 'pid' (at
> pid->level, i.e. "pid->numbers[ns->level].ns == ns").
> 
> > Christian, this is just a quick stab I took. Feel free to pick this
> > up as a template.
> Thanks! I slightly re-worked it, with the reasoning above in mind, to
> get rid of one of the branches:

Thanks!

> 
> +#ifdef CONFIG_PID_NS
> +	seq_put_decimal_ull(m, "\nNSpid:\t", nr);
> +	if (nr) {
> +		int i;
> +
> +		/* If nr is non-zero it means that 'pid' is valid and that
> +		 * ns, i.e. the pid namespace associated with the procfs
> +		 * instance, is in the pid namespace hierarchy of pid.
> +		 * Start at one level below and print all descending pids.
> +		 */
> +		for (i = ns->level + 1; i <= pid->level; i++) {
> +			ns = pid->numbers[i].ns;

I'm not a fan of overriding the "ns" pointer. It's not a huge deal but
it's rather subtle.

> +			seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, ns));
> +		}
> +	}
> +#endif
> 
> But I now just realized that with the very same reasoning, if nr is
> non-zero, we don't need to redo all the checks and can just do:
> 
> for (i = ns->level + 1; i <= pid->level; i++)
> 	seq_put_decimal_ull(m, "\t", pid->numbers[i].nr);
> 
> If this sounds good to you I resend the patches with the change above.

You could probably do:

#ifdef CONFIG_PID_NS
seq_put_decimal_ull(m, "\nNSpid:\t", nr);
for (i = ns->level + 1; i <= pid->level && nr; i++)
	seq_put_decimal_ull(m, "\t", pid->numbers[i].nr);
#endif

Christian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] pidfd: add NSpid entries to fdinfo
  2019-10-12 10:19               ` [PATCH] pidfd: add NSpid entries to fdinfo Christian Brauner
  2019-10-12 10:21                 ` Christian Brauner
@ 2019-10-14 15:09                 ` Jann Horn
  2019-10-14 17:06                   ` Christian Brauner
  1 sibling, 1 reply; 36+ messages in thread
From: Jann Horn @ 2019-10-14 15:09 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Andrea Arcangeli, Andrew Morton, Christian Kellner,
	Christian Kellner, Aleksa Sarai, Elena Reshetova, Roman Gushchin,
	Dmitry V. Levin, Linux API, kernel list, Michal Hocko,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Al Viro

On Sat, Oct 12, 2019 at 12:19 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
> Currently, the fdinfo file of contains the field Pid:

nit: something missing after "of"?

> It contains the pid a given pidfd refers to in the pid namespace of the
> opener's procfs instance.

s/opener's // here and in various places below? "the opener's" is
kinda misleading since it sounds as if it has something to do with
task identity.

> If the pid namespace of the process is not a descendant of the pid
> namespace of the procfs instance 0 will be shown as its pid. This is
> similar to calling getppid() on a process who's parent is out of it's

nit: s/who's/whose/

nit: s/it's/its/

> pid namespace (e.g. when moving a process into a sibling pid namespace
> via setns()).

You can't move a process into a PID namespace via setns(), you can
only set the PID namespace for its children; and even then, you can't
set that to a sibling PID namespace, you can only move into descendant
PID namespaces.

> Add an NSpid field for easy retrieval of the pid in all descendant pid
> namespaces:
> If pid namespaces are supported this field will contain the pid a given
> pidfd refers to for all descendant pid namespaces starting from the
> current pid namespace of the opener's procfs instance, i.e. the first

s/current // - neither tasks nor procfs instances can change which pid
namespace they're associated with

> pid entry for Pid and NSpid will be identical.

*the Pid field and the first entry in the NSpid field?

> If the pid namespace of the process is not a descendant of the pid
> namespace of the procfs instance 0 will be shown as its first NSpid and
> no other NSpid entries will be shown.
> Note that this differs from the Pid and NSpid fields in
> /proc/<pid>/status where Pid and NSpid are always shown relative to the
> pid namespace of the opener's procfs instace. The difference becomes
> obvious when sending around a pidfd between pid namespaces from
> different trees, i.e. where no ancestoral relation is present between
> the pid namespaces:
> 1. sending around pidfd:
> - create two new pid namespaces ns1 and ns2 in the initial pid namespace
>   (Also take care to create new mount namespaces in the new pid
>   namespace and mount procfs.)
> - create a process with a pidfd in ns1
> - send pidfd from ns1 to ns2
> - read /proc/self/fdinfo/<pidfd> and observe that Pid and NSpid entry
>   are 0
> - create a process with a pidfd in

truncated phrase?

> - open a pidfd for a process in the initial pid namespace
> 2. sending around /proc/<pid>/status fd:
> - create two new pid namespaces ns1 and ns2 in the initial pid namespace
>   (Also take care to create new mount namespaces in the new pid
>   namespace and mount procfs.)
> - create a process in ns1
> - open /proc/<pid>/status in the initial pid namespace for the process
>   you created in ns1
> - send statusfd from initial pid namespace to ns2
> - read statusfd and observe:
>   - that Pid will contain the pid of the process as seen from the init
>   - that NSpid will contain the pids of the process for all descendant
>     pid namespaces starting from the initial pid namespace

You don't need fd passing for case 2, you can just look at the procfs
instance that's mounted in the initial mount namespace. Using fd
passing in this example kinda obfuscates what's going on, in my
opinion.


> +/**
> + * pidfd_show_fdinfo - print information about a pidfd
> + * @m: proc fdinfo file
> + * @f: file referencing a pidfd
> + *
> + * Pid:
> + * This function will print the pid a given pidfd refers to in the pid
> + * namespace of the opener's procfs instance.
> + * If the pid namespace of the process is not a descendant of the pid
> + * namespace of the procfs instance 0 will be shown as its pid. This is
> + * similar to calling getppid() on a process who's parent is out of it's
> + * pid namespace (e.g. when moving a process into a sibling pid namespace
> + * via setns()).
> + * NSpid:
> + * If pid namespaces are supported then this function will also print the
> + * pid a given pidfd refers to for all descendant pid namespaces starting
> + * from the current pid namespace of the opener's procfs instance, i.e. the
> + * first pid entry for Pid and NSpid will be identical.
> + * If the pid namespace of the process is not a descendant of the pid
> + * namespace of the procfs instance 0 will be shown as its first NSpid and
> + * no other NSpid entries will be shown.
> + * Note that this differs from the Pid and NSpid fields in
> + * /proc/<pid>/status where Pid and NSpid are always shown relative to the
> + * pid namespace of the opener's procfs instace. The difference becomes
> + * obvious when sending around a pidfd between pid namespaces from
> + * different trees, i.e. where no ancestoral relation is present between
> + * the pid namespaces:
> + * 1. sending around pidfd:
> + * - create two new pid namespaces ns1 and ns2 in the initial pid namespace
> + *   (Also take care to create new mount namespaces in the new pid
> + *   namespace and mount procfs.)
> + * - create a process with a pidfd in ns1
> + * - send pidfd from ns1 to ns2
> + * - read /proc/self/fdinfo/<pidfd> and observe that Pid and NSpid entry
> + *   are 0
> + * - create a process with a pidfd in
> + * - open a pidfd for a process in the initial pid namespace
> + * 2. sending around /proc/<pid>/status fd:
> + * - create two new pid namespaces ns1 and ns2 in the initial pid namespace
> + *   (Also take care to create new mount namespaces in the new pid
> + *   namespace and mount procfs.)
> + * - create a process in ns1
> + * - open /proc/<pid>/status in the initial pid namespace for the process
> + *   you created in ns1
> + * - send statusfd from initial pid namespace to ns2
> + * - read statusfd and observe:
> + *   - that Pid will contain the pid of the process as seen from the init
> + *   - that NSpid will contain the pids of the process for all descendant
> + *     pid namespaces starting from the initial pid namespace
> + */

See above, same problems as in the commit message. Actually, since you
have a big comment block with this text, there's no reason to repeat
it in the commit message, right?

(By the way, only slightly related to this patch: At the moment, if
you have a pidfd to a task that has already been reaped, and the PID
has been reallocated to another task, the pidfd will still show up in
/proc/*/fdinfo/* as if it referred to a non-dead process, right? Might
be worth considering whether you want to probe ->tasks[PIDTYPE_PID] or
->tasks[PIDTYPE_TGID] for NULL-ness when printing fdinfo, or something
like that.)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] pidfd: add NSpid entries to fdinfo
  2019-10-14 10:31                     ` Christian Brauner
@ 2019-10-14 15:10                       ` Jann Horn
  2019-10-14 15:20                         ` Christian Kellner
  0 siblings, 1 reply; 36+ messages in thread
From: Jann Horn @ 2019-10-14 15:10 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christian Kellner, Andrea Arcangeli, Andrew Morton, Aleksa Sarai,
	Elena Reshetova, Roman Gushchin, Dmitry V. Levin, Linux API,
	kernel list, Michal Hocko, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Al Viro

On Mon, Oct 14, 2019 at 12:32 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
> On Mon, Oct 14, 2019 at 11:43:01AM +0200, Christian Kellner wrote:
> > On Sat, 2019-10-12 at 12:21 +0200, Christian Brauner wrote:
> > > I tried to think of cases where the first entry of Pid is not
> > > identical
> > > to the first entry of NSpid but I came up with none. Maybe you do,
> > > Jann?
> > Yeah, I don't think that can be the case. By looking at the source of
> > 'pid_nr_ns(pid, ns)' a non-zero return means that a) 'pid' valid, ie.
> > non-null and b) 'ns' is in the pid namespace hierarchy of 'pid' (at
> > pid->level, i.e. "pid->numbers[ns->level].ns == ns").

Agreed.

> You could probably do:
>
> #ifdef CONFIG_PID_NS
> seq_put_decimal_ull(m, "\nNSpid:\t", nr);
> for (i = ns->level + 1; i <= pid->level && nr; i++)
>         seq_put_decimal_ull(m, "\t", pid->numbers[i].nr);
> #endif

Personally, I dislike hiding the precondition for running the loop in
the loop statement like that. While it makes the code more concise, it
somewhat obfuscates the high-level logic at a first glance.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] pidfd: add NSpid entries to fdinfo
  2019-10-14 15:10                       ` Jann Horn
@ 2019-10-14 15:20                         ` Christian Kellner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Kellner @ 2019-10-14 15:20 UTC (permalink / raw)
  To: Jann Horn, Christian Brauner
  Cc: Andrea Arcangeli, Andrew Morton, Aleksa Sarai, Elena Reshetova,
	Roman Gushchin, Dmitry V. Levin, Linux API, kernel list,
	Michal Hocko, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Al Viro

On Mon, 2019-10-14 at 17:10 +0200, Jann Horn wrote:
> On Mon, Oct 14, 2019 at 12:32 PM Christian Brauner wrote:
> > On Mon, Oct 14, 2019 at 11:43:01AM +0200, Christian Kellner wrote:
> > You could probably do:
> > 
> > #ifdef CONFIG_PID_NS
> > seq_put_decimal_ull(m, "\nNSpid:\t", nr);
> > for (i = ns->level + 1; i <= pid->level && nr; i++)
> >         seq_put_decimal_ull(m, "\t", pid->numbers[i].nr);
> > #endif
> 
> Personally, I dislike hiding the precondition for running the loop in
> the loop statement like that. While it makes the code more concise,
> it somewhat obfuscates the high-level logic at a first glance.
I agree and it has the side-effect of needing another #ifdef at the end
of the variable block for "i". I think I will go with:

if (nr) {
	int i;

	/* If nr is non-zero it means that 'pid' is valid and that
	 * ns, i.e. the pid namespace associated with the procfs
	 * instance, is in the pid namespace hierarchy of pid.
	 * Start at one below the already printed level.
	 */
	for (i = ns->level + 1; i <= pid->level; i++)
		seq_put_decimal_ull(m, "\t", pid->numbers[i].nr);
}

I will re-work the comment block and then send a new version of
the patch.

Thanks,
Christian


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 1/2] pidfd: add NSpid entries to fdinfo
  2019-10-11 12:23   ` [PATCH v3 " Christian Kellner
                       ` (2 preceding siblings ...)
  2019-10-11 14:55     ` Jann Horn
@ 2019-10-14 16:20     ` Christian Kellner
  2019-10-14 16:20       ` [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo Christian Kellner
  2019-10-15  9:40       ` [PATCH v4 1/2] pidfd: add NSpid entries to fdinfo Christian Brauner
  3 siblings, 2 replies; 36+ messages in thread
From: Christian Kellner @ 2019-10-14 16:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-api, Jann Horn, Christian Kellner, Christian Brauner,
	Christian Brauner, Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin

From: Christian Kellner <christian@kellner.me>

Currently, the fdinfo file contains the Pid field which shows the
pid a given pidfd refers to in the pid namespace of the procfs
instance. If pid namespaces are configured, also show an NSpid field
for easy retrieval of the pid in all descendant pid namespaces. If
the pid namespace of the process is not a descendant of the pid
namespace of the procfs instance 0 will be shown as its first NSpid
entry and no other entries will be shown. Add a block comment to
pidfd_show_fdinfo with a detailed explanation of Pid and NSpid fields.

Co-developed-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Kellner <christian@kellner.me>
---
Changes in v4:
- Reworked to properly handle the case where the pidfd is from a
  different branch in the pid namespace hierarchy; also add block
  comment with an in-depth explanation (Christian Brauner)

 kernel/fork.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index bcdf53125210..782986962d47 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1695,12 +1695,63 @@ static int pidfd_release(struct inode *inode, struct file *file)
 }
 
 #ifdef CONFIG_PROC_FS
+/**
+ * pidfd_show_fdinfo - print information about a pidfd
+ * @m: proc fdinfo file
+ * @f: file referencing a pidfd
+ *
+ * Pid:
+ * This function will print the pid that a given pidfd refers to in the
+ * pid namespace of the procfs instance.
+ * If the pid namespace of the process is not a descendant of the pid
+ * namespace of the procfs instance 0 will be shown as its pid. This is
+ * similar to calling getppid() on a process whose parent is outside of
+ * its pid namespace.
+ *
+ * NSpid:
+ * If pid namespaces are supported then this function will also print
+ * the pid of a given pidfd refers to for all descendant pid namespaces
+ * starting from the current pid namespace of the instance, i.e. the
+ * Pid field and the first entry in the NSpid field will be identical.
+ * If the pid namespace of the process is not a descendant of the pid
+ * namespace of the procfs instance 0 will be shown as its first NSpid
+ * entry and no others will be shown.
+ * Note that this differs from the Pid and NSpid fields in
+ * /proc/<pid>/status where Pid and NSpid are always shown relative to
+ * the  pid namespace of the procfs instance. The difference becomes
+ * obvious when sending around a pidfd between pid namespaces from a
+ * different branch of the tree, i.e. where no ancestoral relation is
+ * present between the pid namespaces:
+ * - create two new pid namespaces ns1 and ns2 in the initial pid
+ *   namespace (also take care to create new mount namespaces in the
+ *   new pid namespace and mount procfs)
+ * - create a process with a pidfd in ns1
+ * - send pidfd from ns1 to ns2
+ * - read /proc/self/fdinfo/<pidfd> and observe that both Pid and NSpid
+ *   have exactly one entry, which is 0
+ */
 static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
 {
 	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
 	struct pid *pid = f->private_data;
+	pid_t nr = pid_nr_ns(pid, ns);
+
+	seq_put_decimal_ull(m, "Pid:\t", nr);
 
-	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
+#ifdef CONFIG_PID_NS
+	seq_put_decimal_ull(m, "\nNSpid:\t", nr);
+	if (nr) {
+		int i;
+
+		/* If nr is non-zero it means that 'pid' is valid and that
+		 * ns, i.e. the pid namespace associated with the procfs
+		 * instance, is in the pid namespace hierarchy of pid.
+		 * Start at one below the already printed level.
+		 */
+		for (i = ns->level + 1; i <= pid->level; i++)
+			seq_put_decimal_ull(m, "\t", pid->numbers[i].nr);
+	}
+#endif
 	seq_putc(m, '\n');
 }
 #endif
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo
  2019-10-14 16:20     ` [PATCH v4 1/2] " Christian Kellner
@ 2019-10-14 16:20       ` Christian Kellner
  2019-10-15 10:07         ` Christian Brauner
  2019-10-15  9:40       ` [PATCH v4 1/2] pidfd: add NSpid entries to fdinfo Christian Brauner
  1 sibling, 1 reply; 36+ messages in thread
From: Christian Kellner @ 2019-10-14 16:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-api, Jann Horn, Christian Kellner, Christian Brauner,
	Shuah Khan, Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin, linux-kselftest

From: Christian Kellner <christian@kellner.me>

Add a test that checks that if pid namespaces are configured the fdinfo
file of a pidfd contains an NSpid: entry containing the process id in
the current and additionally all nested namespaces. In the case that
a pidfd is from a pid namespace not in the same namespace hierarchy as
the process accessing the fdinfo file, ensure the 'NSpid' shows 0 for
that pidfd, analogous to the 'Pid' entry.

Signed-off-by: Christian Kellner <christian@kellner.me>
---
Changes in v4:
- Rework to test include a the situation where the fdinfo for a pidfd
  of a process in a sibling pid namespace is being read and ensure
  the NSpid field only contains one entry, being 0.

 tools/testing/selftests/pidfd/Makefile        |   2 +-
 .../selftests/pidfd/pidfd_fdinfo_test.c       | 265 ++++++++++++++++++
 2 files changed, 266 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/pidfd/pidfd_fdinfo_test.c

diff --git a/tools/testing/selftests/pidfd/Makefile b/tools/testing/selftests/pidfd/Makefile
index 7550f08822a3..43db1b98e845 100644
--- a/tools/testing/selftests/pidfd/Makefile
+++ b/tools/testing/selftests/pidfd/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 CFLAGS += -g -I../../../../usr/include/ -pthread
 
-TEST_GEN_PROGS := pidfd_test pidfd_open_test pidfd_poll_test pidfd_wait
+TEST_GEN_PROGS := pidfd_test pidfd_fdinfo_test pidfd_open_test pidfd_poll_test pidfd_wait
 
 include ../lib.mk
 
diff --git a/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c b/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c
new file mode 100644
index 000000000000..3721be994abd
--- /dev/null
+++ b/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c
@@ -0,0 +1,265 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <assert.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <linux/types.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <syscall.h>
+#include <sys/wait.h>
+
+#include "pidfd.h"
+#include "../kselftest.h"
+
+struct error {
+	int  code;
+	char msg[512];
+};
+
+static int error_set(struct error *err, int code, const char *fmt, ...)
+{
+	va_list args;
+	int r;
+
+	if (code == PIDFD_PASS || !err || err->code != PIDFD_PASS)
+		return code;
+
+	err->code = code;
+	va_start(args, fmt);
+	r = vsnprintf(err->msg, sizeof(err->msg), fmt, args);
+	assert((size_t)r < sizeof(err->msg));
+	va_end(args);
+
+	return code;
+}
+
+static void error_report(struct error *err, const char *test_name)
+{
+	switch (err->code) {
+	case PIDFD_ERROR:
+		ksft_exit_fail_msg("%s test: Fatal: %s\n", test_name, err->msg);
+		break;
+
+	case PIDFD_FAIL:
+		/* will be: not ok %d # error %s test: %s */
+		ksft_test_result_error("%s test: %s\n", test_name, err->msg);
+		break;
+
+	case PIDFD_SKIP:
+		/* will be: not ok %d # SKIP %s test: %s */
+		ksft_test_result_skip("%s test: %s\n", test_name, err->msg);
+		break;
+
+	case PIDFD_XFAIL:
+		ksft_test_result_pass("%s test: Expected failure: %s\n",
+				      test_name, err->msg);
+		break;
+
+	case PIDFD_PASS:
+		ksft_test_result_pass("%s test: Passed\n");
+		break;
+
+	default:
+		ksft_exit_fail_msg("%s test: Unknown code: %d %s\n",
+				   test_name, err->code, err->msg);
+		break;
+	}
+}
+
+static inline int error_check(struct error *err, const char *test_name)
+{
+	/* In case of error we bail out and terminate the test program */
+	if (err->code == PIDFD_ERROR)
+		error_report(err, test_name);
+
+	return err->code;
+}
+
+struct child {
+	pid_t pid;
+	int   fd;
+};
+
+static struct child clone_newns(int (*fn)(void *), void *args,
+				struct error *err)
+{
+	static int flags = CLONE_PIDFD | CLONE_NEWPID | CLONE_NEWNS | SIGCHLD;
+	size_t stack_size = 1024;
+	char *stack[1024] = { 0 };
+	struct child ret;
+
+	if (!(flags & CLONE_NEWUSER) && geteuid() != 0)
+		flags |= CLONE_NEWUSER;
+
+#ifdef __ia64__
+	ret.pid = __clone2(fn, stack, stack_size, flags, args, &ret.fd);
+#else
+	ret.pid = clone(fn, stack + stack_size, flags, args, &ret.fd);
+#endif
+
+	if (ret.pid < 0) {
+		error_set(err, PIDFD_ERROR, "clone failed (ret %d, errno %d)",
+			  ret.fd, errno);
+		return ret;
+	}
+
+	ksft_print_msg("New child: %d, fd: %d\n", ret.pid, ret.fd);
+
+	return ret;
+}
+
+static inline int child_join(struct child *child, struct error *err)
+{
+	int r;
+
+	(void)close(child->fd);
+	r = wait_for_pid(child->pid);
+	if (r < 0)
+		error_set(err, PIDFD_ERROR, "waitpid failed (ret %d, errno %d)",
+			  r, errno);
+	else if (r > 0)
+		error_set(err, r, "child %d reported: %d", child->pid, r);
+
+	return r;
+}
+
+static inline void trim_newline(char *str)
+{
+	char *pos = strrchr(str, '\n');
+
+	if (pos)
+		*pos = '\0';
+}
+
+static int verify_fdinfo_nspid(int pidfd, struct error *err,
+			       const char *expect, ...)
+{
+	char buffer[512] = {0, };
+	char path[512] = {0, };
+	va_list args;
+	FILE *f;
+	char *line = NULL;
+	size_t n = 0;
+	int found = 0;
+	int r;
+
+	va_start(args, expect);
+	r = vsnprintf(buffer, sizeof(buffer), expect, args);
+	assert((size_t)r < sizeof(buffer));
+	va_end(args);
+
+	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
+	f = fopen(path, "re");
+	if (!f)
+		return error_set(err, PIDFD_ERROR, "fdinfo open failed for %d",
+				 pidfd);
+
+	while (getline(&line, &n, f) != -1) {
+		if (strncmp(line, "NSpid:", 6))
+			continue;
+
+		found = 1;
+
+		r = strcmp(line + 6, buffer);
+		if (r != 0) {
+			trim_newline(line);
+			trim_newline(buffer);
+			error_set(err, PIDFD_FAIL, "NSpid: '%s' != '%s'",
+				  line + 6, buffer);
+		}
+		break;
+	}
+
+	free(line);
+	fclose(f);
+
+	if (found == 0)
+		return error_set(err, PIDFD_FAIL, "NSpid not found for fd %d",
+				 pidfd);
+
+	return PIDFD_PASS;
+}
+
+static int child_fdinfo_nspid_test(void *args)
+{
+	struct error err;
+	int pidfd;
+	int r;
+
+	/* if we got no fd for the sibling, we are done */
+	if (!args)
+		return PIDFD_PASS;
+
+	/* verify that we can not resolve the pidfd for a process
+	 * in a sibling pid namespace, i.e. a pid namespace it is
+	 * not in our or a descended namespace
+	 */
+	r = mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, 0);
+	if (r < 0) {
+		ksft_print_msg("Failed to remount / private\n");
+		return PIDFD_ERROR;
+	}
+
+	(void)umount2("/proc", MNT_DETACH);
+	r = mount("proc", "/proc", "proc", 0, NULL);
+	if (r < 0) {
+		ksft_print_msg("Failed to remount /proc\n");
+		return PIDFD_ERROR;
+	}
+
+	pidfd = *(int *)args;
+	r = verify_fdinfo_nspid(pidfd, &err, "\t0\n");
+
+	if (r != PIDFD_PASS)
+		ksft_print_msg("NSpid fdinfo check failed: %s\n", err.msg);
+
+	return r;
+}
+
+static void test_pidfd_fdinfo_nspid(void)
+{
+	struct child a, b;
+	struct error err = {0, };
+	const char *test_name = "pidfd check for NSpid in fdinfo";
+
+	/* Create a new child in a new pid and mount namespace */
+	a = clone_newns(child_fdinfo_nspid_test, NULL, &err);
+	error_check(&err, test_name);
+
+	/* Pass the pidfd representing the first child to the
+	 * second child, which will be in a sibling pid namespace,
+	 * which means that the fdinfo NSpid entry for the pidfd
+	 * should only contain '0'.
+	 */
+	b = clone_newns(child_fdinfo_nspid_test, &a.fd, &err);
+	error_check(&err, test_name);
+
+	/* The children will have pid 1 in the new pid namespace,
+	 * so the line must be 'NSPid:\t<pid>\t1'.
+	 */
+	verify_fdinfo_nspid(a.fd, &err, "\t%d\t%d\n", a.pid, 1);
+	verify_fdinfo_nspid(b.fd, &err, "\t%d\t%d\n", b.pid, 1);
+
+	/* wait for the process, check the exit status and set
+	 * 'err' accordingly, if it is not already set.
+	 */
+	child_join(&a, &err);
+	child_join(&b, &err);
+
+	error_report(&err, test_name);
+}
+
+int main(int argc, char **argv)
+{
+	ksft_print_header();
+	ksft_set_plan(1);
+
+	test_pidfd_fdinfo_nspid();
+
+	return ksft_exit_pass();
+}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] pidfd: add NSpid entries to fdinfo
  2019-10-14 15:09                 ` Jann Horn
@ 2019-10-14 17:06                   ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-10-14 17:06 UTC (permalink / raw)
  To: Jann Horn
  Cc: Andrea Arcangeli, Andrew Morton, Christian Kellner,
	Christian Kellner, Aleksa Sarai, Elena Reshetova, Roman Gushchin,
	Dmitry V. Levin, Linux API, kernel list, Michal Hocko,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Al Viro

On October 14, 2019 5:09:58 PM GMT+02:00, Jann Horn <jannh@google.com> wrote:
>On Sat, Oct 12, 2019 at 12:19 PM Christian Brauner
><christian.brauner@ubuntu.com> wrote:
>> Currently, the fdinfo file of contains the field Pid:
>
>nit: something missing after "of"?
>
>> It contains the pid a given pidfd refers to in the pid namespace of
>the
>> opener's procfs instance.
>
>s/opener's // here and in various places below? "the opener's" is
>kinda misleading since it sounds as if it has something to do with
>task identity.
>
>> If the pid namespace of the process is not a descendant of the pid
>> namespace of the procfs instance 0 will be shown as its pid. This is
>> similar to calling getppid() on a process who's parent is out of it's
>
>nit: s/who's/whose/
>
>nit: s/it's/its/
>
>> pid namespace (e.g. when moving a process into a sibling pid
>namespace
>> via setns()).
>
>You can't move a process into a PID namespace via setns(), you can
>only set the PID namespace for its children; and even then, you can't
>set that to a sibling PID namespace, you can only move into descendant
>PID namespaces.

Yes, I know. This was sloppy changelogging on my part.

>
>> Add an NSpid field for easy retrieval of the pid in all descendant
>pid
>> namespaces:
>> If pid namespaces are supported this field will contain the pid a
>given
>> pidfd refers to for all descendant pid namespaces starting from the
>> current pid namespace of the opener's procfs instance, i.e. the first
>
>s/current // - neither tasks nor procfs instances can change which pid
>namespace they're associated with

Yes.

>
>> pid entry for Pid and NSpid will be identical.
>
>*the Pid field and the first entry in the NSpid field?

Yes.

>
>> If the pid namespace of the process is not a descendant of the pid
>> namespace of the procfs instance 0 will be shown as its first NSpid
>and
>> no other NSpid entries will be shown.
>> Note that this differs from the Pid and NSpid fields in
>> /proc/<pid>/status where Pid and NSpid are always shown relative to
>the
>> pid namespace of the opener's procfs instace. The difference becomes
>> obvious when sending around a pidfd between pid namespaces from
>> different trees, i.e. where no ancestoral relation is present between
>> the pid namespaces:
>> 1. sending around pidfd:
>> - create two new pid namespaces ns1 and ns2 in the initial pid
>namespace
>>   (Also take care to create new mount namespaces in the new pid
>>   namespace and mount procfs.)
>> - create a process with a pidfd in ns1
>> - send pidfd from ns1 to ns2
>> - read /proc/self/fdinfo/<pidfd> and observe that Pid and NSpid entry
>>   are 0
>> - create a process with a pidfd in
>
>truncated phrase?

Yeah, as I said this was really just a rough draft for Christian (the other one :)).

>
>> - open a pidfd for a process in the initial pid namespace
>> 2. sending around /proc/<pid>/status fd:
>> - create two new pid namespaces ns1 and ns2 in the initial pid
>namespace
>>   (Also take care to create new mount namespaces in the new pid
>>   namespace and mount procfs.)
>> - create a process in ns1
>> - open /proc/<pid>/status in the initial pid namespace for the
>process
>>   you created in ns1
>> - send statusfd from initial pid namespace to ns2
>> - read statusfd and observe:
>>   - that Pid will contain the pid of the process as seen from the
>init
>>   - that NSpid will contain the pids of the process for all
>descendant
>>     pid namespaces starting from the initial pid namespace
>
>You don't need fd passing for case 2, you can just look at the procfs
>instance that's mounted in the initial mount namespace. Using fd
>passing in this example kinda obfuscates what's going on, in my
>opinion.

My goal was to illustrate the same mechanism leading to different results.
But I don't care much about how this is described as long as I illustrates the difference.

>
>
>> +/**
>> + * pidfd_show_fdinfo - print information about a pidfd
>> + * @m: proc fdinfo file
>> + * @f: file referencing a pidfd
>> + *
>> + * Pid:
>> + * This function will print the pid a given pidfd refers to in the
>pid
>> + * namespace of the opener's procfs instance.
>> + * If the pid namespace of the process is not a descendant of the
>pid
>> + * namespace of the procfs instance 0 will be shown as its pid. This
>is
>> + * similar to calling getppid() on a process who's parent is out of
>it's
>> + * pid namespace (e.g. when moving a process into a sibling pid
>namespace
>> + * via setns()).
>> + * NSpid:
>> + * If pid namespaces are supported then this function will also
>print the
>> + * pid a given pidfd refers to for all descendant pid namespaces
>starting
>> + * from the current pid namespace of the opener's procfs instance,
>i.e. the
>> + * first pid entry for Pid and NSpid will be identical.
>> + * If the pid namespace of the process is not a descendant of the
>pid
>> + * namespace of the procfs instance 0 will be shown as its first
>NSpid and
>> + * no other NSpid entries will be shown.
>> + * Note that this differs from the Pid and NSpid fields in
>> + * /proc/<pid>/status where Pid and NSpid are always shown relative
>to the
>> + * pid namespace of the opener's procfs instace. The difference
>becomes
>> + * obvious when sending around a pidfd between pid namespaces from
>> + * different trees, i.e. where no ancestoral relation is present
>between
>> + * the pid namespaces:
>> + * 1. sending around pidfd:
>> + * - create two new pid namespaces ns1 and ns2 in the initial pid
>namespace
>> + *   (Also take care to create new mount namespaces in the new pid
>> + *   namespace and mount procfs.)
>> + * - create a process with a pidfd in ns1
>> + * - send pidfd from ns1 to ns2
>> + * - read /proc/self/fdinfo/<pidfd> and observe that Pid and NSpid
>entry
>> + *   are 0
>> + * - create a process with a pidfd in
>> + * - open a pidfd for a process in the initial pid namespace
>> + * 2. sending around /proc/<pid>/status fd:
>> + * - create two new pid namespaces ns1 and ns2 in the initial pid
>namespace
>> + *   (Also take care to create new mount namespaces in the new pid
>> + *   namespace and mount procfs.)
>> + * - create a process in ns1
>> + * - open /proc/<pid>/status in the initial pid namespace for the
>process
>> + *   you created in ns1
>> + * - send statusfd from initial pid namespace to ns2
>> + * - read statusfd and observe:
>> + *   - that Pid will contain the pid of the process as seen from the
>init
>> + *   - that NSpid will contain the pids of the process for all
>descendant
>> + *     pid namespaces starting from the initial pid namespace
>> + */
>
>See above, same problems as in the commit message. Actually, since you
>have a big comment block with this text, there's no reason to repeat
>it in the commit message, right?

If the comment gets modified or the logic changes I'd still like to have the actual context recorded in the changelog too.
I think that's good practice.

>
>(By the way, only slightly related to this patch: At the moment, if
>you have a pidfd to a task that has already been reaped, and the PID
>has been reallocated to another task, the pidfd will still show up in
>/proc/*/fdinfo/* as if it referred to a non-dead process, right? Might
>be worth considering whether you want to probe ->tasks[PIDTYPE_PID] or
>->tasks[PIDTYPE_TGID] for NULL-ness when printing fdinfo, or something
>like that.)

Not a big deal but yes, let me put this on my list.
Or do you feel in the mood for a patch? ;)

Christian


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 1/2] pidfd: add NSpid entries to fdinfo
  2019-10-14 16:20     ` [PATCH v4 1/2] " Christian Kellner
  2019-10-14 16:20       ` [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo Christian Kellner
@ 2019-10-15  9:40       ` Christian Brauner
  1 sibling, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-10-15  9:40 UTC (permalink / raw)
  To: Christian Kellner
  Cc: linux-kernel, linux-api, Jann Horn, Christian Kellner,
	Christian Brauner, Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin

On Mon, Oct 14, 2019 at 06:20:32PM +0200, Christian Kellner wrote:
> From: Christian Kellner <christian@kellner.me>
> 
> Currently, the fdinfo file contains the Pid field which shows the
> pid a given pidfd refers to in the pid namespace of the procfs
> instance. If pid namespaces are configured, also show an NSpid field
> for easy retrieval of the pid in all descendant pid namespaces. If
> the pid namespace of the process is not a descendant of the pid
> namespace of the procfs instance 0 will be shown as its first NSpid
> entry and no other entries will be shown. Add a block comment to
> pidfd_show_fdinfo with a detailed explanation of Pid and NSpid fields.
> 
> Co-developed-by: Christian Brauner <christian.brauner@ubuntu.com>
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> Signed-off-by: Christian Kellner <christian@kellner.me>

Thanks!
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>

> ---
> Changes in v4:
> - Reworked to properly handle the case where the pidfd is from a
>   different branch in the pid namespace hierarchy; also add block
>   comment with an in-depth explanation (Christian Brauner)
> 
>  kernel/fork.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 52 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index bcdf53125210..782986962d47 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1695,12 +1695,63 @@ static int pidfd_release(struct inode *inode, struct file *file)
>  }
>  
>  #ifdef CONFIG_PROC_FS
> +/**
> + * pidfd_show_fdinfo - print information about a pidfd
> + * @m: proc fdinfo file
> + * @f: file referencing a pidfd
> + *
> + * Pid:
> + * This function will print the pid that a given pidfd refers to in the
> + * pid namespace of the procfs instance.
> + * If the pid namespace of the process is not a descendant of the pid
> + * namespace of the procfs instance 0 will be shown as its pid. This is
> + * similar to calling getppid() on a process whose parent is outside of
> + * its pid namespace.
> + *
> + * NSpid:
> + * If pid namespaces are supported then this function will also print
> + * the pid of a given pidfd refers to for all descendant pid namespaces
> + * starting from the current pid namespace of the instance, i.e. the
> + * Pid field and the first entry in the NSpid field will be identical.
> + * If the pid namespace of the process is not a descendant of the pid
> + * namespace of the procfs instance 0 will be shown as its first NSpid
> + * entry and no others will be shown.
> + * Note that this differs from the Pid and NSpid fields in
> + * /proc/<pid>/status where Pid and NSpid are always shown relative to
> + * the  pid namespace of the procfs instance. The difference becomes
> + * obvious when sending around a pidfd between pid namespaces from a
> + * different branch of the tree, i.e. where no ancestoral relation is
> + * present between the pid namespaces:
> + * - create two new pid namespaces ns1 and ns2 in the initial pid
> + *   namespace (also take care to create new mount namespaces in the
> + *   new pid namespace and mount procfs)
> + * - create a process with a pidfd in ns1
> + * - send pidfd from ns1 to ns2
> + * - read /proc/self/fdinfo/<pidfd> and observe that both Pid and NSpid
> + *   have exactly one entry, which is 0
> + */
>  static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
>  {
>  	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
>  	struct pid *pid = f->private_data;
> +	pid_t nr = pid_nr_ns(pid, ns);
> +
> +	seq_put_decimal_ull(m, "Pid:\t", nr);
>  
> -	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
> +#ifdef CONFIG_PID_NS
> +	seq_put_decimal_ull(m, "\nNSpid:\t", nr);
> +	if (nr) {
> +		int i;
> +
> +		/* If nr is non-zero it means that 'pid' is valid and that

Nit: multiline kernel comment style is usually

/* 
 * bla
 * bla
 */

but I'll just fix this up when applying. No need to resend.

> +		 * ns, i.e. the pid namespace associated with the procfs
> +		 * instance, is in the pid namespace hierarchy of pid.
> +		 * Start at one below the already printed level.
> +		 */
> +		for (i = ns->level + 1; i <= pid->level; i++)
> +			seq_put_decimal_ull(m, "\t", pid->numbers[i].nr);
> +	}
> +#endif
>  	seq_putc(m, '\n');
>  }
>  #endif
> -- 
> 2.21.0
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo
  2019-10-14 16:20       ` [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo Christian Kellner
@ 2019-10-15 10:07         ` Christian Brauner
  2019-11-13 11:52             ` Naresh Kamboju
  0 siblings, 1 reply; 36+ messages in thread
From: Christian Brauner @ 2019-10-15 10:07 UTC (permalink / raw)
  To: Christian Kellner
  Cc: linux-kernel, linux-api, Jann Horn, Christian Kellner,
	Christian Brauner, Shuah Khan, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin, linux-kselftest

On Mon, Oct 14, 2019 at 06:20:33PM +0200, Christian Kellner wrote:
> From: Christian Kellner <christian@kellner.me>
> 
> Add a test that checks that if pid namespaces are configured the fdinfo
> file of a pidfd contains an NSpid: entry containing the process id in
> the current and additionally all nested namespaces. In the case that
> a pidfd is from a pid namespace not in the same namespace hierarchy as
> the process accessing the fdinfo file, ensure the 'NSpid' shows 0 for
> that pidfd, analogous to the 'Pid' entry.
> 
> Signed-off-by: Christian Kellner <christian@kellner.me>

That looks reasonable to me.

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo
  2019-10-15 10:07         ` Christian Brauner
@ 2019-11-13 11:52             ` Naresh Kamboju
  0 siblings, 0 replies; 36+ messages in thread
From: Naresh Kamboju @ 2019-11-13 11:52 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christian Kellner, open list, linux-api, Jann Horn,
	Christian Kellner, Christian Brauner, Shuah Khan, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin, open list:KERNEL SELFTEST FRAMEWORK

Hi Christian,

On Tue, 15 Oct 2019 at 15:37, Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Mon, Oct 14, 2019 at 06:20:33PM +0200, Christian Kellner wrote:
> > From: Christian Kellner <christian@kellner.me>
> >
> > Add a test that checks that if pid namespaces are configured the fdinfo
> > file of a pidfd contains an NSpid: entry containing the process id in
> > the current and additionally all nested namespaces. In the case that
> > a pidfd is from a pid namespace not in the same namespace hierarchy as
> > the process accessing the fdinfo file, ensure the 'NSpid' shows 0 for
> > that pidfd, analogous to the 'Pid' entry.
> >
> > Signed-off-by: Christian Kellner <christian@kellner.me>
>
> That looks reasonable to me.

on arm64 Juno-r2, Hikey (hi6220) and dragonboard 410c and arm32
Beagleboard x15 test pidfd_test failed.
and on x86_64 and i386 test fails intermittently with TIMEOUT error.
Test code is being used from linux next tree.

Juno-r2 test output:
--------------------------
# selftests pidfd pidfd_test
pidfd: pidfd_test_ #
# TAP version 13
version: 13_ #
# 1..4
: _ #
# # Parent pid 10586
Parent: pid_10586 #
# # Parent Waiting for Child (10587) to complete.
Parent: Waiting_for #
# # Time waited for child 0
Time: waited_for #
# Bail out! pidfd_poll check for premature notification on child
thread exec test Failed
out!: pidfd_poll_check #
# # Planned tests != run tests (4 != 0)
Planned: tests_!= #
# # Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
Pass: 0_Fail #
[FAIL] 1 selftests pidfd pidfd_test # exit=1
selftests: pidfd_pidfd_test [FAIL]

arm32 x15 output log,
-----------------------------
# selftests pidfd pidfd_test
pidfd: pidfd_test_ #
[FAIL] 1 selftests pidfd pidfd_test # TIMEOUT
selftests: pidfd_pidfd_test [FAIL]

x86_64 output log,
-------------------------
# selftests pidfd pidfd_test
pidfd: pidfd_test_ #
[FAIL] 1 selftests pidfd pidfd_test # TIMEOUT
selftests: pidfd_pidfd_test [FAIL]

Test results comparison,
https://qa-reports.linaro.org/lkft/linux-mainline-oe/tests/kselftest/pidfd_pidfd_test
https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/pidfd_pidfd_test

link,
https://lkft.validation.linaro.org/scheduler/job/993549#L17835

- Naresh

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo
@ 2019-11-13 11:52             ` Naresh Kamboju
  0 siblings, 0 replies; 36+ messages in thread
From: Naresh Kamboju @ 2019-11-13 11:52 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christian Kellner, open list, linux-api, Jann Horn,
	Christian Kellner, Christian Brauner, Shuah Khan, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin, open list:KERNEL SELFTEST FRAMEWORK

Hi Christian,

On Tue, 15 Oct 2019 at 15:37, Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Mon, Oct 14, 2019 at 06:20:33PM +0200, Christian Kellner wrote:
> > From: Christian Kellner <christian@kellner.me>
> >
> > Add a test that checks that if pid namespaces are configured the fdinfo
> > file of a pidfd contains an NSpid: entry containing the process id in
> > the current and additionally all nested namespaces. In the case that
> > a pidfd is from a pid namespace not in the same namespace hierarchy as
> > the process accessing the fdinfo file, ensure the 'NSpid' shows 0 for
> > that pidfd, analogous to the 'Pid' entry.
> >
> > Signed-off-by: Christian Kellner <christian@kellner.me>
>
> That looks reasonable to me.

on arm64 Juno-r2, Hikey (hi6220) and dragonboard 410c and arm32
Beagleboard x15 test pidfd_test failed.
and on x86_64 and i386 test fails intermittently with TIMEOUT error.
Test code is being used from linux next tree.

Juno-r2 test output:
--------------------------
# selftests pidfd pidfd_test
pidfd: pidfd_test_ #
# TAP version 13
version: 13_ #
# 1..4
: _ #
# # Parent pid 10586
Parent: pid_10586 #
# # Parent Waiting for Child (10587) to complete.
Parent: Waiting_for #
# # Time waited for child 0
Time: waited_for #
# Bail out! pidfd_poll check for premature notification on child
thread exec test Failed
out!: pidfd_poll_check #
# # Planned tests != run tests (4 != 0)
Planned: tests_!= #
# # Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
Pass: 0_Fail #
[FAIL] 1 selftests pidfd pidfd_test # exit=1
selftests: pidfd_pidfd_test [FAIL]

arm32 x15 output log,
-----------------------------
# selftests pidfd pidfd_test
pidfd: pidfd_test_ #
[FAIL] 1 selftests pidfd pidfd_test # TIMEOUT
selftests: pidfd_pidfd_test [FAIL]

x86_64 output log,
-------------------------
# selftests pidfd pidfd_test
pidfd: pidfd_test_ #
[FAIL] 1 selftests pidfd pidfd_test # TIMEOUT
selftests: pidfd_pidfd_test [FAIL]

Test results comparison,
https://qa-reports.linaro.org/lkft/linux-mainline-oe/tests/kselftest/pidfd_pidfd_test
https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/pidfd_pidfd_test

link,
https://lkft.validation.linaro.org/scheduler/job/993549#L17835

- Naresh

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo
  2019-11-13 11:52             ` Naresh Kamboju
@ 2019-11-13 12:20               ` Christian Brauner
  -1 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-11-13 12:20 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Christian Kellner, open list, linux-api, Jann Horn,
	Christian Kellner, Christian Brauner, Shuah Khan, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin, open list:KERNEL SELFTEST FRAMEWORK, joel

On Wed, Nov 13, 2019 at 05:22:33PM +0530, Naresh Kamboju wrote:
> Hi Christian,

Hi Naresh,


[+Cc Joel since this is _not related_ to the fdinfo patches but rather
 the polling tests]

Thanks for following up here. See for more comments below.

> 
> On Tue, 15 Oct 2019 at 15:37, Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > On Mon, Oct 14, 2019 at 06:20:33PM +0200, Christian Kellner wrote:
> > > From: Christian Kellner <christian@kellner.me>
> > >
> > > Add a test that checks that if pid namespaces are configured the fdinfo
> > > file of a pidfd contains an NSpid: entry containing the process id in
> > > the current and additionally all nested namespaces. In the case that
> > > a pidfd is from a pid namespace not in the same namespace hierarchy as
> > > the process accessing the fdinfo file, ensure the 'NSpid' shows 0 for
> > > that pidfd, analogous to the 'Pid' entry.
> > >
> > > Signed-off-by: Christian Kellner <christian@kellner.me>
> >
> > That looks reasonable to me.
> 
> on arm64 Juno-r2, Hikey (hi6220) and dragonboard 410c and arm32
> Beagleboard x15 test pidfd_test failed.
> and on x86_64 and i386 test fails intermittently with TIMEOUT error.
> Test code is being used from linux next tree.
> 
> Juno-r2 test output:
> --------------------------
> # selftests pidfd pidfd_test
> pidfd: pidfd_test_ #
> # TAP version 13
> version: 13_ #
> # 1..4
> : _ #
> # # Parent pid 10586
> Parent: pid_10586 #
> # # Parent Waiting for Child (10587) to complete.
> Parent: Waiting_for #
> # # Time waited for child 0
> Time: waited_for #
> # Bail out! pidfd_poll check for premature notification on child
> thread exec test Failed
> out!: pidfd_poll_check #
> # # Planned tests != run tests (4 != 0)
> Planned: tests_!= #
> # # Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
> Pass: 0_Fail #
> [FAIL] 1 selftests pidfd pidfd_test # exit=1
> selftests: pidfd_pidfd_test [FAIL]
> 
> arm32 x15 output log,
> -----------------------------
> # selftests pidfd pidfd_test
> pidfd: pidfd_test_ #
> [FAIL] 1 selftests pidfd pidfd_test # TIMEOUT
> selftests: pidfd_pidfd_test [FAIL]
> 
> x86_64 output log,
> -------------------------
> # selftests pidfd pidfd_test
> pidfd: pidfd_test_ #
> [FAIL] 1 selftests pidfd pidfd_test # TIMEOUT
> selftests: pidfd_pidfd_test [FAIL]
> 
> Test results comparison,
> https://qa-reports.linaro.org/lkft/linux-mainline-oe/tests/kselftest/pidfd_pidfd_test
> https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/pidfd_pidfd_test
> 
> link,
> https://lkft.validation.linaro.org/scheduler/job/993549#L17835

Note, that this failure is _not_ related to the fdinfo and NSpid patches
here.
It's rather related to the polling testing code that Joel added. Iirc,
then it is timing sensitive.
I'll try to make some room this week to look into this.

	Christian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo
@ 2019-11-13 12:20               ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2019-11-13 12:20 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Christian Kellner, open list, linux-api, Jann Horn,
	Christian Kellner, Christian Brauner, Shuah Khan, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Michal Hocko, Thomas Gleixner, Elena Reshetova,
	Roman Gushchin, Andrea Arcangeli, Al Viro, Aleksa Sarai,
	Dmitry V. Levin, open list:KERNEL SELFTEST FRAMEWORK, joel

On Wed, Nov 13, 2019 at 05:22:33PM +0530, Naresh Kamboju wrote:
> Hi Christian,

Hi Naresh,


[+Cc Joel since this is _not related_ to the fdinfo patches but rather
 the polling tests]

Thanks for following up here. See for more comments below.

> 
> On Tue, 15 Oct 2019 at 15:37, Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > On Mon, Oct 14, 2019 at 06:20:33PM +0200, Christian Kellner wrote:
> > > From: Christian Kellner <christian@kellner.me>
> > >
> > > Add a test that checks that if pid namespaces are configured the fdinfo
> > > file of a pidfd contains an NSpid: entry containing the process id in
> > > the current and additionally all nested namespaces. In the case that
> > > a pidfd is from a pid namespace not in the same namespace hierarchy as
> > > the process accessing the fdinfo file, ensure the 'NSpid' shows 0 for
> > > that pidfd, analogous to the 'Pid' entry.
> > >
> > > Signed-off-by: Christian Kellner <christian@kellner.me>
> >
> > That looks reasonable to me.
> 
> on arm64 Juno-r2, Hikey (hi6220) and dragonboard 410c and arm32
> Beagleboard x15 test pidfd_test failed.
> and on x86_64 and i386 test fails intermittently with TIMEOUT error.
> Test code is being used from linux next tree.
> 
> Juno-r2 test output:
> --------------------------
> # selftests pidfd pidfd_test
> pidfd: pidfd_test_ #
> # TAP version 13
> version: 13_ #
> # 1..4
> : _ #
> # # Parent pid 10586
> Parent: pid_10586 #
> # # Parent Waiting for Child (10587) to complete.
> Parent: Waiting_for #
> # # Time waited for child 0
> Time: waited_for #
> # Bail out! pidfd_poll check for premature notification on child
> thread exec test Failed
> out!: pidfd_poll_check #
> # # Planned tests != run tests (4 != 0)
> Planned: tests_!= #
> # # Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
> Pass: 0_Fail #
> [FAIL] 1 selftests pidfd pidfd_test # exit=1
> selftests: pidfd_pidfd_test [FAIL]
> 
> arm32 x15 output log,
> -----------------------------
> # selftests pidfd pidfd_test
> pidfd: pidfd_test_ #
> [FAIL] 1 selftests pidfd pidfd_test # TIMEOUT
> selftests: pidfd_pidfd_test [FAIL]
> 
> x86_64 output log,
> -------------------------
> # selftests pidfd pidfd_test
> pidfd: pidfd_test_ #
> [FAIL] 1 selftests pidfd pidfd_test # TIMEOUT
> selftests: pidfd_pidfd_test [FAIL]
> 
> Test results comparison,
> https://qa-reports.linaro.org/lkft/linux-mainline-oe/tests/kselftest/pidfd_pidfd_test
> https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/pidfd_pidfd_test
> 
> link,
> https://lkft.validation.linaro.org/scheduler/job/993549#L17835

Note, that this failure is _not_ related to the fdinfo and NSpid patches
here.
It's rather related to the polling testing code that Joel added. Iirc,
then it is timing sensitive.
I'll try to make some room this week to look into this.

	Christian

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2019-11-13 12:20 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-08 13:36 [PATCH] pidfd: show pids for nested pid namespaces in fdinfo Christian Kellner
2019-10-08 13:52 ` Christian Brauner
2019-10-08 14:00   ` Michal Hocko
2019-10-09 16:05 ` [PATCH v2 1/2] " Christian Kellner
2019-10-09 16:05   ` [PATCH v2 2/2] pidfd: add tests for NSpid info " Christian Kellner
2019-10-11 15:09     ` Jann Horn
2019-10-11 15:09       ` Jann Horn
2019-10-11 17:08       ` Christian Brauner
2019-10-11 17:08         ` Christian Brauner
2019-10-09 17:29   ` [PATCH v2 1/2] pidfd: show pids for nested pid namespaces " Christian Brauner
2019-10-11 12:23   ` [PATCH v3 " Christian Kellner
2019-10-11 12:23     ` [PATCH v3 2/2] pidfd: add tests for NSpid info " Christian Kellner
2019-10-11 13:18       ` Christian Brauner
2019-10-11 13:18         ` Christian Brauner
2019-10-11 13:17     ` [PATCH v3 1/2] pidfd: show pids for nested pid namespaces " Christian Brauner
2019-10-11 14:55     ` Jann Horn
2019-10-11 15:17       ` Christian Brauner
2019-10-11 15:30         ` Jann Horn
2019-10-11 16:58           ` Christian Brauner
2019-10-11 18:20             ` Jann Horn
2019-10-12 10:19               ` [PATCH] pidfd: add NSpid entries to fdinfo Christian Brauner
2019-10-12 10:21                 ` Christian Brauner
2019-10-14  9:43                   ` Christian Kellner
2019-10-14 10:31                     ` Christian Brauner
2019-10-14 15:10                       ` Jann Horn
2019-10-14 15:20                         ` Christian Kellner
2019-10-14 15:09                 ` Jann Horn
2019-10-14 17:06                   ` Christian Brauner
2019-10-14 16:20     ` [PATCH v4 1/2] " Christian Kellner
2019-10-14 16:20       ` [PATCH v4 2/2] pidfd: add tests for NSpid info in fdinfo Christian Kellner
2019-10-15 10:07         ` Christian Brauner
2019-11-13 11:52           ` Naresh Kamboju
2019-11-13 11:52             ` Naresh Kamboju
2019-11-13 12:20             ` Christian Brauner
2019-11-13 12:20               ` Christian Brauner
2019-10-15  9:40       ` [PATCH v4 1/2] pidfd: add NSpid entries to fdinfo Christian Brauner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.