* [PATCH v2 0/2] pidfd: waiting on processes through pidfds
@ 2019-07-27  8:51 Christian Brauner
  2019-07-27  8:52 ` [PATCH v2 1/2] pidfd: add P_PIDFD to waitid() Christian Brauner
  2019-07-27  8:52 ` [PATCH v2 2/2] pidfd: add pidfd_wait tests Christian Brauner
  0 siblings, 2 replies; 9+ messages in thread
From: Christian Brauner @ 2019-07-27  8:51 UTC (permalink / raw)
  To: linux-kernel, oleg
  Cc: arnd, ebiederm, keescook, joel, tglx, tj, dhowells, jannh, luto,
	akpm, cyphar, torvalds, viro, kernel-team, Christian Brauner

Hey everyone,

/* v2 */
This adds the ability to wait on processes using pidfds. This is one of
the few missing pieces to make it possible to manage processes using
only pidfds.

No major changes have occurred since v1. The only change has been to
move all find_get_pid() calls into the switch statement to avoid
checking the type argument twice, as suggested by Linus.

The core patch for waitid is pleasantly small. The largest change is
caused by adding proper tests for waitid(P_PIDFD).

/* v1 */
Link: https://lore.kernel.org/lkml/20190726093934.13557-1-christian@brauner.io/

/* v0 */
Link: https://lore.kernel.org/lkml/20190724144651.28272-1-christian@brauner.io

Christian

Christian Brauner (2):
  pidfd: add P_PIDFD to waitid()
  pidfd: add pidfd_wait tests

 include/linux/pid.h                        |   4 +
 include/uapi/linux/wait.h                  |   1 +
 kernel/exit.c                              |  29 ++-
 kernel/fork.c                              |   8 +
 kernel/signal.c                            |   7 +-
 tools/testing/selftests/pidfd/pidfd.h      |  25 +++
 tools/testing/selftests/pidfd/pidfd_test.c |  14 --
 tools/testing/selftests/pidfd/pidfd_wait.c | 245 +++++++++++++++++++++
 8 files changed, 313 insertions(+), 20 deletions(-)
 create mode 100644 tools/testing/selftests/pidfd/pidfd_wait.c

-- 
2.22.0



* [PATCH v2 1/2] pidfd: add P_PIDFD to waitid()
  2019-07-27  8:51 [PATCH v2 0/2] pidfd: waiting on processes through pidfds Christian Brauner
@ 2019-07-27  8:52 ` Christian Brauner
  2019-07-27 16:28   ` Linus Torvalds
  2019-07-27  8:52 ` [PATCH v2 2/2] pidfd: add pidfd_wait tests Christian Brauner
  1 sibling, 1 reply; 9+ messages in thread
From: Christian Brauner @ 2019-07-27  8:52 UTC (permalink / raw)
  To: linux-kernel, oleg
  Cc: arnd, ebiederm, keescook, joel, tglx, tj, dhowells, jannh, luto,
	akpm, cyphar, torvalds, viro, kernel-team, Christian Brauner

This adds the P_PIDFD type to waitid().
One of the last remaining bits for the pidfd api is to make it possible
to wait on pidfds. With P_PIDFD added to waitid() the parts of userspace
that want to use the pidfd api to exclusively manage processes can do so
now.

One of the things this will unblock in the future is the ability to make
it possible to retrieve the exit status via waitid(P_PIDFD) for
non-parent processes if handed a _suitable_ pidfd that has this feature
set. This is similar to what you can do on FreeBSD with kqueue(). It
might even end up being possible to wait on a process as a non-parent if
an appropriate property is enabled on the pidfd.

With P_PIDFD no scoping of the process identified by the pidfd is
possible, i.e. it explicitly blocks things such as wait4(-1), wait4(0),
waitid(P_ALL), waitid(P_PGID) etc. It only allows for semantics
equivalent to wait4(pid), waitid(P_PID). Users that need scoping should
rely on pid-based wait*() syscalls for now.
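
The semantics described above can be tried from userspace with a short
sketch. This is only an illustration, not part of the patch: the helper name
wait_child_via_pidfd() and the EXAMPLE_P_PIDFD macro are made up for this
example, the pidfd is obtained with pidfd_open() rather than
clone3(CLONE_PIDFD) to keep it short, and the raw waitid syscall is used on
the assumption that the libc wrapper may not know P_PIDFD yet. On a kernel
without either feature the helper simply reports failure.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallback for older headers; 434 is the upstream pidfd_open number. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif

/* Hypothetical name; the value matches P_PIDFD as added by this patch. */
#define EXAMPLE_P_PIDFD 3

/*
 * Hypothetical helper: fork a child that exits with `code`, wait for it
 * through a pidfd and return the exit code reported in siginfo_t.
 * Returns -1 with errno set if the kernel lacks pidfd_open() or P_PIDFD.
 */
int wait_child_via_pidfd(int code)
{
	siginfo_t info = { 0 };
	int pidfd, ret, saved;
	pid_t pid;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);

	pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (pidfd < 0) {
		saved = errno;
		waitpid(pid, NULL, 0);	/* still reap the child */
		errno = saved;
		return -1;
	}

	/* Only the one process behind the pidfd is waited on, as with P_PID. */
	ret = (int)syscall(SYS_waitid, EXAMPLE_P_PIDFD, pidfd, &info,
			   WEXITED, NULL);
	saved = errno;
	close(pidfd);
	if (ret < 0) {
		waitpid(pid, NULL, 0);	/* fall back to a pid-based wait */
		errno = saved;
		return -1;
	}

	if (info.si_signo != SIGCHLD || info.si_code != CLD_EXITED ||
	    info.si_pid != pid) {
		errno = EPROTO;
		return -1;
	}

	return info.si_status;	/* the exit code when si_code is CLD_EXITED */
}
```

A caller would treat a return of -1 with errno set to ENOSYS or EINVAL as
"not supported here" and keep using pid-based wait*() calls.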

Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
---
v1:
- Linus Torvalds <torvalds@linux-foundation.org>:
  - use flag as discussed before, not a dedicated pidfd_wait() syscall
- Oleg Nesterov <oleg@redhat.com>:
  - use flag as discussed before, not a dedicated pidfd_wait() syscall

v2:
- Linus Torvalds <torvalds@linux-foundation.org>:
  - move find_get_pid() calls into switch statements to avoid checking
    the type argument twice
---
 include/linux/pid.h       |  4 ++++
 include/uapi/linux/wait.h |  1 +
 kernel/exit.c             | 29 +++++++++++++++++++++++++----
 kernel/fork.c             |  8 ++++++++
 kernel/signal.c           |  7 +++++--
 5 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/include/linux/pid.h b/include/linux/pid.h
index 2a83e434db9d..9645b1194c98 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -72,6 +72,10 @@ extern struct pid init_struct_pid;
 
 extern const struct file_operations pidfd_fops;
 
+struct file;
+
+extern struct pid *pidfd_pid(const struct file *file);
+
 static inline struct pid *get_pid(struct pid *pid)
 {
 	if (pid)
diff --git a/include/uapi/linux/wait.h b/include/uapi/linux/wait.h
index ac49a220cf2a..85b809fc9f11 100644
--- a/include/uapi/linux/wait.h
+++ b/include/uapi/linux/wait.h
@@ -17,6 +17,7 @@
 #define P_ALL		0
 #define P_PID		1
 #define P_PGID		2
+#define P_PIDFD		3
 
 
 #endif /* _UAPI_LINUX_WAIT_H */
diff --git a/kernel/exit.c b/kernel/exit.c
index a75b6a7f458a..207f7a37b2d0 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1555,6 +1555,7 @@ static long do_wait(struct wait_opts *wo)
 static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop,
 			  int options, struct rusage *ru)
 {
+	struct fd f;
 	struct wait_opts wo;
 	struct pid *pid = NULL;
 	enum pid_type type;
@@ -1574,19 +1575,35 @@ static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop,
 		type = PIDTYPE_PID;
 		if (upid <= 0)
 			return -EINVAL;
+
+		pid = find_get_pid(upid);
 		break;
 	case P_PGID:
 		type = PIDTYPE_PGID;
 		if (upid <= 0)
 			return -EINVAL;
+
+		pid = find_get_pid(upid);
+		break;
+	case P_PIDFD:
+		type = PIDTYPE_PID;
+		if (upid < 0)
+			return -EINVAL;
+
+		f = fdget(upid);
+		if (!f.file)
+			return -EBADF;
+
+		pid = pidfd_pid(f.file);
+		if (IS_ERR(pid)) {
+			fdput(f);
+			return PTR_ERR(pid);
+		}
 		break;
 	default:
 		return -EINVAL;
 	}
 
-	if (type < PIDTYPE_MAX)
-		pid = find_get_pid(upid);
-
 	wo.wo_type	= type;
 	wo.wo_pid	= pid;
 	wo.wo_flags	= options;
@@ -1594,7 +1611,11 @@ static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop,
 	wo.wo_rusage	= ru;
 	ret = do_wait(&wo);
 
-	put_pid(pid);
+	if (which == P_PIDFD)
+		fdput(f);
+	else
+		put_pid(pid);
+
 	return ret;
 }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index d8ae0f1b4148..b169e2ca2d84 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1690,6 +1690,14 @@ static inline void rcu_copy_process(struct task_struct *p)
 #endif /* #ifdef CONFIG_TASKS_RCU */
 }
 
+struct pid *pidfd_pid(const struct file *file)
+{
+	if (file->f_op == &pidfd_fops)
+		return file->private_data;
+
+	return ERR_PTR(-EBADF);
+}
+
 static int pidfd_release(struct inode *inode, struct file *file)
 {
 	struct pid *pid = file->private_data;
diff --git a/kernel/signal.c b/kernel/signal.c
index 91b789dd6e72..2e567f64812f 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3672,8 +3672,11 @@ static int copy_siginfo_from_user_any(kernel_siginfo_t *kinfo, siginfo_t *info)
 
 static struct pid *pidfd_to_pid(const struct file *file)
 {
-	if (file->f_op == &pidfd_fops)
-		return file->private_data;
+	struct pid *pid;
+
+	pid = pidfd_pid(file);
+	if (!IS_ERR(pid))
+		return pid;
 
 	return tgid_pidfd_to_pid(file);
 }
-- 
2.22.0



* [PATCH v2 2/2] pidfd: add pidfd_wait tests
  2019-07-27  8:51 [PATCH v2 0/2] pidfd: waiting on processes through pidfds Christian Brauner
  2019-07-27  8:52 ` [PATCH v2 1/2] pidfd: add P_PIDFD to waitid() Christian Brauner
@ 2019-07-27  8:52 ` Christian Brauner
  1 sibling, 0 replies; 9+ messages in thread
From: Christian Brauner @ 2019-07-27  8:52 UTC (permalink / raw)
  To: linux-kernel, oleg
  Cc: arnd, ebiederm, keescook, joel, tglx, tj, dhowells, jannh, luto,
	akpm, cyphar, torvalds, viro, kernel-team, Christian Brauner

Add tests for waitid(P_PIDFD):
- test that waitid(P_PIDFD) can wait on a pidfd
- test that waitid(P_PIDFD) can wait on a pidfd and return siginfo_t
- test that waitid(P_PIDFD) works with WEXITED
- test that waitid(P_PIDFD) works with WSTOPPED
- test that waitid(P_PIDFD) works with WUNTRACED
- test that waitid(P_PIDFD) works with WCONTINUED
- test that waitid(P_PIDFD) works with WNOWAIT
- test that waitid(P_PIDFD) works with WNOHANG

Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
---
v1:
- Christian Brauner <christian@brauner.io>:
  - adapt tests to new P_PIDFD concept

v2: unchanged
---
 tools/testing/selftests/pidfd/pidfd.h      |  25 +++
 tools/testing/selftests/pidfd/pidfd_test.c |  14 --
 tools/testing/selftests/pidfd/pidfd_wait.c | 245 +++++++++++++++++++++
 3 files changed, 270 insertions(+), 14 deletions(-)
 create mode 100644 tools/testing/selftests/pidfd/pidfd_wait.c

diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
index 8452e910463f..7d7d0ca05e0b 100644
--- a/tools/testing/selftests/pidfd/pidfd.h
+++ b/tools/testing/selftests/pidfd/pidfd.h
@@ -16,6 +16,26 @@
 
 #include "../kselftest.h"
 
+#ifndef P_PIDFD
+#define P_PIDFD 3
+#endif
+
+#ifndef CLONE_PIDFD
+#define CLONE_PIDFD 0x00001000
+#endif
+
+#ifndef __NR_pidfd_open
+#define __NR_pidfd_open -1
+#endif
+
+#ifndef __NR_pidfd_send_signal
+#define __NR_pidfd_send_signal -1
+#endif
+
+#ifndef __NR_clone3
+#define __NR_clone3 -1
+#endif
+
 /*
  * The kernel reserves 300 pids via RESERVED_PIDS in kernel/pid.c
  * That means, when it wraps around any pid < 300 will be skipped.
@@ -53,5 +73,10 @@ int wait_for_pid(pid_t pid)
 	return WEXITSTATUS(status);
 }
 
+static inline int sys_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
+					unsigned int flags)
+{
+	return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
+}
 
 #endif /* __PIDFD_H */
diff --git a/tools/testing/selftests/pidfd/pidfd_test.c b/tools/testing/selftests/pidfd/pidfd_test.c
index 7eaa8a3de262..42e3eb494d72 100644
--- a/tools/testing/selftests/pidfd/pidfd_test.c
+++ b/tools/testing/selftests/pidfd/pidfd_test.c
@@ -21,20 +21,12 @@
 #include "pidfd.h"
 #include "../kselftest.h"
 
-#ifndef __NR_pidfd_send_signal
-#define __NR_pidfd_send_signal -1
-#endif
-
 #define str(s) _str(s)
 #define _str(s) #s
 #define CHILD_THREAD_MIN_WAIT 3 /* seconds */
 
 #define MAX_EVENTS 5
 
-#ifndef CLONE_PIDFD
-#define CLONE_PIDFD 0x00001000
-#endif
-
 static pid_t pidfd_clone(int flags, int *pidfd, int (*fn)(void *))
 {
 	size_t stack_size = 1024;
@@ -47,12 +39,6 @@ static pid_t pidfd_clone(int flags, int *pidfd, int (*fn)(void *))
 #endif
 }
 
-static inline int sys_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
-					unsigned int flags)
-{
-	return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
-}
-
 static int signal_received;
 
 static void set_signal_received_on_sigusr1(int sig)
diff --git a/tools/testing/selftests/pidfd/pidfd_wait.c b/tools/testing/selftests/pidfd/pidfd_wait.c
new file mode 100644
index 000000000000..018d806032c0
--- /dev/null
+++ b/tools/testing/selftests/pidfd/pidfd_wait.c
@@ -0,0 +1,245 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+#include <signal.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sched.h>
+#include <string.h>
+#include <sys/resource.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#include "pidfd.h"
+#include "../kselftest.h"
+
+#define ptr_to_u64(ptr) ((__u64)((uintptr_t)(ptr)))
+
+static pid_t sys_clone3(struct clone_args *args)
+{
+	return syscall(__NR_clone3, args, sizeof(struct clone_args));
+}
+
+static int sys_waitid(int which, pid_t pid, siginfo_t *info, int options,
+		      struct rusage *ru)
+{
+	return syscall(__NR_waitid, which, pid, info, options, ru);
+}
+
+static int test_pidfd_wait_simple(void)
+{
+	const char *test_name = "pidfd wait siginfo";
+	int pidfd = -1, status = 0;
+	pid_t parent_tid = -1;
+	struct clone_args args = {
+		.parent_tid = ptr_to_u64(&parent_tid),
+		.pidfd = ptr_to_u64(&pidfd),
+		.flags = CLONE_PIDFD | CLONE_PARENT_SETTID,
+		.exit_signal = SIGCHLD,
+	};
+	int ret;
+	pid_t pid;
+	siginfo_t info = {
+		.si_signo = 0,
+	};
+
+	pid = sys_clone3(&args);
+	if (pid < 0)
+		ksft_exit_fail_msg("%s test: failed to create new process %s\n",
+				   test_name, strerror(errno));
+
+	if (pid == 0)
+		exit(EXIT_SUCCESS);
+
+	pid = sys_waitid(P_PIDFD, pidfd, &info, WEXITED, NULL);
+	if (pid < 0)
+		ksft_exit_fail_msg(
+			"%s test: failed to wait on process with pid %d and pidfd %d: %s\n",
+			test_name, parent_tid, pidfd, strerror(errno));
+
+	if (!WIFEXITED(info.si_status) || WEXITSTATUS(info.si_status))
+		ksft_exit_fail_msg(
+			"%s test: unexpected status received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, parent_tid, pidfd, strerror(errno));
+	close(pidfd);
+
+	if (info.si_signo != SIGCHLD)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_signo value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_signo, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_code != CLD_EXITED)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_code value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_code, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_pid != parent_tid)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_pid value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_pid, parent_tid, pidfd,
+			strerror(errno));
+
+	ksft_test_result_pass("%s test: Passed\n", test_name);
+	return 0;
+}
+
+static int test_pidfd_wait_states(void)
+{
+	const char *test_name = "pidfd wait states";
+	int pidfd = -1, status = 0;
+	pid_t parent_tid = -1;
+	struct clone_args args = {
+		.parent_tid = ptr_to_u64(&parent_tid),
+		.pidfd = ptr_to_u64(&pidfd),
+		.flags = CLONE_PIDFD | CLONE_PARENT_SETTID,
+		.exit_signal = SIGCHLD,
+	};
+	int ret;
+	pid_t pid;
+	siginfo_t info = {
+		.si_signo = 0,
+	};
+
+	pid = sys_clone3(&args);
+	if (pid < 0)
+		ksft_exit_fail_msg("%s test: failed to create new process %s\n",
+				   test_name, strerror(errno));
+
+	if (pid == 0) {
+		kill(getpid(), SIGSTOP);
+		kill(getpid(), SIGSTOP);
+		exit(EXIT_SUCCESS);
+	}
+
+	ret = sys_waitid(P_PIDFD, pidfd, &info, WSTOPPED, NULL);
+	if (ret < 0)
+		ksft_exit_fail_msg(
+			"%s test: failed to wait on process with pid %d and pidfd %d: %s\n",
+			test_name, parent_tid, pidfd, strerror(errno));
+
+	if (info.si_signo != SIGCHLD)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_signo value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_signo, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_code != CLD_STOPPED)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_code value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_code, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_pid != parent_tid)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_pid value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_pid, parent_tid, pidfd,
+			strerror(errno));
+
+	ret = sys_pidfd_send_signal(pidfd, SIGCONT, NULL, 0);
+	if (ret < 0)
+		ksft_exit_fail_msg(
+			"%s test: failed to send signal to process with pid %d and pidfd %d: %s\n",
+			test_name, parent_tid, pidfd, strerror(errno));
+
+	ret = sys_waitid(P_PIDFD, pidfd, &info, WCONTINUED, NULL);
+	if (ret < 0)
+		ksft_exit_fail_msg(
+			"%s test: failed to wait on process with pid %d and pidfd %d: %s\n",
+			test_name, parent_tid, pidfd, strerror(errno));
+
+	if (info.si_signo != SIGCHLD)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_signo value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_signo, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_code != CLD_CONTINUED)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_code value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_code, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_pid != parent_tid)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_pid value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_pid, parent_tid, pidfd,
+			strerror(errno));
+
+	ret = sys_waitid(P_PIDFD, pidfd, &info, WUNTRACED, NULL);
+	if (ret < 0)
+		ksft_exit_fail_msg(
+			"%s test: failed to wait on process with pid %d and pidfd %d: %s\n",
+			test_name, parent_tid, pidfd, strerror(errno));
+
+	if (info.si_signo != SIGCHLD)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_signo value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_signo, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_code != CLD_STOPPED)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_code value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_code, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_pid != parent_tid)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_pid value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_pid, parent_tid, pidfd,
+			strerror(errno));
+
+	ret = sys_pidfd_send_signal(pidfd, SIGKILL, NULL, 0);
+	if (ret < 0)
+		ksft_exit_fail_msg(
+			"%s test: failed to send signal to process with pid %d and pidfd %d: %s\n",
+			test_name, parent_tid, pidfd, strerror(errno));
+
+	ret = sys_waitid(P_PIDFD, pidfd, &info, WEXITED, NULL);
+	if (ret < 0)
+		ksft_exit_fail_msg(
+			"%s test: failed to wait on process with pid %d and pidfd %d: %s\n",
+			test_name, parent_tid, pidfd, strerror(errno));
+
+	if (info.si_signo != SIGCHLD)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_signo value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_signo, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_code != CLD_KILLED)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_code value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_code, parent_tid, pidfd,
+			strerror(errno));
+
+	if (info.si_pid != parent_tid)
+		ksft_exit_fail_msg(
+			"%s test: unexpected si_pid value %d received after waiting on process with pid %d and pidfd %d: %s\n",
+			test_name, info.si_pid, parent_tid, pidfd,
+			strerror(errno));
+
+	close(pidfd);
+
+	ksft_test_result_pass("%s test: Passed\n", test_name);
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	ksft_print_header();
+	ksft_set_plan(2);
+
+	test_pidfd_wait_simple();
+	test_pidfd_wait_states();
+
+	return ksft_exit_pass();
+}
-- 
2.22.0



* Re: [PATCH v2 1/2] pidfd: add P_PIDFD to waitid()
  2019-07-27  8:52 ` [PATCH v2 1/2] pidfd: add P_PIDFD to waitid() Christian Brauner
@ 2019-07-27 16:28   ` Linus Torvalds
  2019-07-27 16:41     ` Linus Torvalds
                       ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Linus Torvalds @ 2019-07-27 16:28 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linux List Kernel Mailing, Oleg Nesterov, Arnd Bergmann,
	Eric W. Biederman, Kees Cook, Joel Fernandes, Thomas Gleixner,
	Tejun Heo, David Howells, Jann Horn, Andrew Lutomirski,
	Andrew Morton, Aleksa Sarai, Al Viro, Android Kernel Team

Sorry to keep pestering about the patch series, but with the addition
of P_PIDFD, I react once again..

On Sat, Jul 27, 2019 at 1:53 AM Christian Brauner <christian@brauner.io> wrote:
>
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1555,6 +1555,7 @@ static long do_wait(struct wait_opts *wo)
>  static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop,
>                           int options, struct rusage *ru)
>  {
> +       struct fd f;

Please don't do 'struct fd' at this level. That results in this ugly code later:

> -       put_pid(pid);
> +       if (which == P_PIDFD)
> +               fdput(f);
> +       else
> +               put_pid(pid);

which just looks nasty.

Instead, do all the 'file descriptor to pid' games here:

> +       case P_PIDFD:
> +               type = PIDTYPE_PID;
> +               if (upid < 0)
> +                       return -EINVAL;
> +
> +               f = fdget(upid);
> +               if (!f.file)
> +                       return -EBADF;
> +
> +               pid = pidfd_pid(f.file);
> +               if (IS_ERR(pid)) {
> +                       fdput(f);
> +                       return PTR_ERR(pid);
> +               }
>                 break;

and make this just do something like

        pid = get_pid_from_fd(upid);
        if (IS_ERR(pid))
                return PTR_ERR(pid);

and now do that "fd to pid" in that helper function, and get the
reference to 'struct pid *' there instead.

Which you can actually do efficiently and lightly without even getting
a ref to the 'struct file'. Something like

  struct pid *fd_to_pid(unsigned int fd)
  {
        struct fd f;
        struct pid *pid;

        f = fdget(fd);
        if (!f.file)
                return ERR_PTR(-EBADF);
        pid = pidfd_pid(f.file);
        if (!IS_ERR(pid))
                get_pid(pid);
        fdput(f);
        return pid;
  }

is the stupid and straightforward thing, but if you want to be
*clever* you can actually avoid getting a reference to the 'struct
file *' entirely, and do the fd->pid lookup under rcu_read_lock()
instead. It's slightly more complex, but it avoids the fdget/fdput
reference count games entirely.
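
The reference ordering in the straightforward fd_to_pid() variant can be
modelled in plain userspace C. This is a toy sketch under loud assumptions:
every toy_* name is invented for illustration, the refcounts are plain
integers with no locking, and ERR_PTR(), PTR_ERR() and IS_ERR() are local
re-implementations that only imitate the kernel helpers of the same name. It
shows the one idea that matters: pin the pid while the file reference is
still held, then drop the file reference.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Toy refcounted objects; the real struct pid and struct file differ. */
struct toy_pid { int count; int nr; };
struct toy_file { int count; int is_pidfd; struct toy_pid *private_data; };

#define TOY_NR_FDS 4
struct toy_file *toy_fd_table[TOY_NR_FDS];

/* Look up an fd and take a reference on the file, or return NULL. */
static struct toy_file *toy_fget(unsigned int fd)
{
	if (fd >= TOY_NR_FDS || !toy_fd_table[fd])
		return NULL;
	toy_fd_table[fd]->count++;
	return toy_fd_table[fd];
}

static void toy_fput(struct toy_file *file)
{
	file->count--;
}

/* Mirrors pidfd_pid(): only a pidfd carries a pid in private_data. */
static struct toy_pid *toy_pidfd_pid(const struct toy_file *file)
{
	if (file->is_pidfd)
		return file->private_data;
	return ERR_PTR(-EBADF);
}

/* On success the caller owns one pid reference and no file reference. */
struct toy_pid *toy_fd_to_pid(unsigned int fd)
{
	struct toy_file *file;
	struct toy_pid *pid;

	file = toy_fget(fd);
	if (!file)
		return ERR_PTR(-EBADF);

	pid = toy_pidfd_pid(file);
	if (!IS_ERR(pid))
		pid->count++;	/* pin the pid before dropping the file */
	toy_fput(file);
	return pid;
}
```

After a successful call the file's count is back where it started and the
pid's count is one higher, so the caller only has to balance it with a
single put later.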

And then all that kernel_waitid() ever worries about is "struct pid
*", and the ending goes back to just that simple

        put_pid(pid);
        return ret;

instead.

This was kind of my point of doing all the "find_get_pid()" games in
the "switch()" statement - the different cases have different ways to
look up what the "struct pid *" pointer should be, but they should all
just look up a pid pointer, and then nothing else needs to care about
'type' any more. See?
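
The shape being argued for can be sketched as a small userspace model. It is
not kernel code and every model_* name is hypothetical: the lookups return a
fixed object instead of consulting real pid tables, P_PID and P_PGID are
folded into one case for brevity, and the wait itself is elided. The point is
only that each case ends with a pinned pid pointer, so a single put at the
end serves all of them.

```c
#include <errno.h>
#include <stddef.h>

enum model_which { MODEL_P_ALL, MODEL_P_PID, MODEL_P_PGID, MODEL_P_PIDFD };

struct model_pid { int count; };

/* Stand-ins for whatever the real lookups would find. */
struct model_pid model_by_nr = { 1 };
struct model_pid model_by_fd = { 1 };

/* Number-based lookup: always succeeds in this model and pins the pid. */
static struct model_pid *model_find_get_pid(int nr)
{
	(void)nr;
	model_by_nr.count++;
	return &model_by_nr;
}

/* fd-based lookup: only fd 3 is a "pidfd" in this model. */
static struct model_pid *model_fd_get_pid(int fd)
{
	if (fd != 3)
		return NULL;
	model_by_fd.count++;
	return &model_by_fd;
}

static void model_put_pid(struct model_pid *pid)
{
	if (pid)
		pid->count--;
}

long model_waitid(int which, int upid)
{
	struct model_pid *pid = NULL;
	long ret;

	switch (which) {
	case MODEL_P_ALL:
		break;
	case MODEL_P_PID:
	case MODEL_P_PGID:
		if (upid <= 0)
			return -EINVAL;
		pid = model_find_get_pid(upid);
		break;
	case MODEL_P_PIDFD:
		if (upid < 0)
			return -EINVAL;
		pid = model_fd_get_pid(upid);
		if (!pid)
			return -EBADF;
		break;
	default:
		return -EINVAL;
	}

	ret = 0;	/* the actual wait would run here, using only `pid` */

	model_put_pid(pid);	/* one release path, whatever `which` was */
	return ret;
}
```

Nothing after the switch looks at `which` again, which is the property the
original `if (which == P_PIDFD)` ending lacked.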

Hmm?

                Linus


* Re: [PATCH v2 1/2] pidfd: add P_PIDFD to waitid()
  2019-07-27 16:28   ` Linus Torvalds
@ 2019-07-27 16:41     ` Linus Torvalds
  2019-07-27 19:42       ` Christian Brauner
  2019-07-27 16:49     ` Al Viro
  2019-07-27 19:45     ` Christian Brauner
  2 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2019-07-27 16:41 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linux List Kernel Mailing, Oleg Nesterov, Arnd Bergmann,
	Eric W. Biederman, Kees Cook, Joel Fernandes, Thomas Gleixner,
	Tejun Heo, David Howells, Jann Horn, Andrew Lutomirski,
	Andrew Morton, Aleksa Sarai, Al Viro, Android Kernel Team

On Sat, Jul 27, 2019 at 9:28 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Something like
>
>   struct pid *fd_to_pid(unsigned int fd)
>   {
>         struct fd f;
>         struct pid *pid;
...

I forgot to put my usual disclaimer about TOTALLY UNTESTED GARBAGE in
that email. I want to make that part clear: that code snippet was
meant as a rough guide of direction, not as a "this works".

Hopefully that was clear.

Also note again that one of the reasons I would prefer that
"fd_to_pid()" interface is that you _can_ do it cleverly with RCU
lookup, but that requires a lot of care.

In particular, I think all of our _existing_
"proc_pid(file_inode(file))" users are done while you actually hold a
reference to "struct file *", so they don't have to worry about races
with another thread doing the final ->release(). So the "clever" thing
is possible, but might need a _lot_ of care to make sure the 'struct
pid *' associated with the file still exists.

The example code sequence was not doing the clever thing, obviously.
So it was untested _and_ simple-stupid. But it has the interface that
I'd prefer.

              Linus


* Re: [PATCH v2 1/2] pidfd: add P_PIDFD to waitid()
  2019-07-27 16:28   ` Linus Torvalds
  2019-07-27 16:41     ` Linus Torvalds
@ 2019-07-27 16:49     ` Al Viro
  2019-07-27 19:46       ` Christian Brauner
  2019-07-27 19:45     ` Christian Brauner
  2 siblings, 1 reply; 9+ messages in thread
From: Al Viro @ 2019-07-27 16:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christian Brauner, Linux List Kernel Mailing, Oleg Nesterov,
	Arnd Bergmann, Eric W. Biederman, Kees Cook, Joel Fernandes,
	Thomas Gleixner, Tejun Heo, David Howells, Jann Horn,
	Andrew Lutomirski, Andrew Morton, Aleksa Sarai,
	Android Kernel Team

On Sat, Jul 27, 2019 at 09:28:40AM -0700, Linus Torvalds wrote:

> is the stupid and straightforward thing, but if you want to be
> *clever* you can actually avoid getting a reference to the 'struct
> file *" entirely, and do the fd->pid lookup under rcu_read_lock()
> instead. It's slightly more complex, but it avoids the fdget/fdput
> reference count games entirely.

Yecchhh...  Please, don't do the last part - at least not unless
we really see that in profiles.


* Re: [PATCH v2 1/2] pidfd: add P_PIDFD to waitid()
  2019-07-27 16:41     ` Linus Torvalds
@ 2019-07-27 19:42       ` Christian Brauner
  0 siblings, 0 replies; 9+ messages in thread
From: Christian Brauner @ 2019-07-27 19:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux List Kernel Mailing, Oleg Nesterov, Arnd Bergmann,
	Eric W. Biederman, Kees Cook, Joel Fernandes, Thomas Gleixner,
	Tejun Heo, David Howells, Jann Horn, Andrew Lutomirski,
	Andrew Morton, Aleksa Sarai, Al Viro, Android Kernel Team

On Sat, Jul 27, 2019 at 09:41:25AM -0700, Linus Torvalds wrote:
> On Sat, Jul 27, 2019 at 9:28 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Something like
> >
> >   struct pid *fd_to_pid(unsigned int fd)
> >   {
> >         struct fd f;
> >         struct pid *pid;
> ...
> 
> I forgot to put my usual disclaimer about TOTALLY UNTESTED GARBAGE in
> that email. I want to make that part clear: that code snippet was
> meant as a rough guide of direction, not as a "this works".
> 
> Hopefully that was clear.

Yeah. I don't take code someone else has written without verifying or
testing into my own code. And I hope people do the same with mine. :)

Christian


* Re: [PATCH v2 1/2] pidfd: add P_PIDFD to waitid()
  2019-07-27 16:28   ` Linus Torvalds
  2019-07-27 16:41     ` Linus Torvalds
  2019-07-27 16:49     ` Al Viro
@ 2019-07-27 19:45     ` Christian Brauner
  2 siblings, 0 replies; 9+ messages in thread
From: Christian Brauner @ 2019-07-27 19:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux List Kernel Mailing, Oleg Nesterov, Arnd Bergmann,
	Eric W. Biederman, Kees Cook, Joel Fernandes, Thomas Gleixner,
	Tejun Heo, David Howells, Jann Horn, Andrew Lutomirski,
	Andrew Morton, Aleksa Sarai, Al Viro, Android Kernel Team

On Sat, Jul 27, 2019 at 09:28:40AM -0700, Linus Torvalds wrote:
> Sorry to keep pestering about the patch series, but with the addition
> of P_PIDFD, I react once again..

That's fine. I don't at all mind being particular about how something
has to be done as long as the result is functional. In this case it
seems we'll end up with something cleaner overall, so sure.

I'll rework the snippets into the actual patch and resend. I'll leave
out the rcu-cleverness you suggested in the other mail though.

Christian


* Re: [PATCH v2 1/2] pidfd: add P_PIDFD to waitid()
  2019-07-27 16:49     ` Al Viro
@ 2019-07-27 19:46       ` Christian Brauner
  0 siblings, 0 replies; 9+ messages in thread
From: Christian Brauner @ 2019-07-27 19:46 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Linux List Kernel Mailing, Oleg Nesterov,
	Arnd Bergmann, Eric W. Biederman, Kees Cook, Joel Fernandes,
	Thomas Gleixner, Tejun Heo, David Howells, Jann Horn,
	Andrew Lutomirski, Andrew Morton, Aleksa Sarai,
	Android Kernel Team

On Sat, Jul 27, 2019 at 05:49:32PM +0100, Al Viro wrote:
> On Sat, Jul 27, 2019 at 09:28:40AM -0700, Linus Torvalds wrote:
> 
> > is the stupid and straightforward thing, but if you want to be
> > *clever* you can actually avoid getting a reference to the 'struct
> > file *" entirely, and do the fd->pid lookup under rcu_read_lock()
> > instead. It's slightly more complex, but it avoids the fdget/fdput
> > reference count games entirely.
> 
> Yecchhh...  Please, don't do the last part - at least not unless
> we really see that in profiles.

Yeah, I will leave this out for now.

Christian
