BPF Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/4] seccomp: continue syscall from notifier
@ 2019-09-18  8:48 Christian Brauner
  2019-09-18  8:48 ` [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW Christian Brauner
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Christian Brauner @ 2019-09-18  8:48 UTC (permalink / raw)
  To: keescook, luto
  Cc: jannh, wad, shuah, ast, daniel, kafai, songliubraving, yhs,
	linux-kernel, linux-kselftest, netdev, bpf, Christian Brauner

Hey everyone,

This is the patchset coming out of the KSummit session Kees and I gave
in Lisbon last week (cf. [3] which also contains slides with more
details on related things such as deep argument inspection).
The simple idea is to extend the seccomp notifier to allow for the
continuation of a syscall. The rationale for this can be found in the
commit message to [1]. For the curious there is more detail in [2].
This patchset would unblock supervising an extended set of syscalls such
as mount() where a privileged process is supervising the syscalls of a
lesser privileged process and emulates the syscall for the latter in
userspace.
For more comments on security see [1].

Thanks!
Christian

/* References */
[1]: [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW
[2]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns33qx@brauner.io
[3]: https://linuxplumbersconf.org/event/4/contributions/560

Christian Brauner (4):
  seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW
  seccomp: add two missing ptrace ifdefines
  seccomp: avoid overflow in implicit constant conversion
  seccomp: test SECCOMP_RET_USER_NOTIF_ALLOW

 include/uapi/linux/seccomp.h                  |   2 +
 kernel/seccomp.c                              |  24 +++-
 tools/testing/selftests/seccomp/seccomp_bpf.c | 110 +++++++++++++++++-
 3 files changed, 131 insertions(+), 5 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW
  2019-09-18  8:48 [PATCH 0/4] seccomp: continue syscall from notifier Christian Brauner
@ 2019-09-18  8:48 ` Christian Brauner
  2019-09-18 17:30   ` Kees Cook
  2019-09-18  8:48 ` [PATCH 2/4] seccomp: add two missing ptrace ifdefines Christian Brauner
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Christian Brauner @ 2019-09-18  8:48 UTC (permalink / raw)
  To: keescook, luto
  Cc: jannh, wad, shuah, ast, daniel, kafai, songliubraving, yhs,
	linux-kernel, linux-kselftest, netdev, bpf, Christian Brauner,
	Tycho Andersen, Tyler Hicks

This allows the seccomp notifier to continue a syscall. A positive
discussion about this feature was triggered by a post to the
ksummit-discuss mailing list (cf. [3]) and took place during KSummit
(cf. [1]) and again at the containers/checkpoint-restore
micro-conference at Linux Plumbers.

Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
which enables a process (watchee) to retrieve an fd for its seccomp
filter. This fd can then be handed to another (usually more privileged)
process (watcher). The watcher will then be able to receive seccomp
messages about the syscalls having been performed by the watchee.

This feature is heavily used in some userspace workloads. For example,
it is currently used to intercept mknod() syscalls in user namespaces
aka in containers.
The mknod() syscall can be easily filtered based on dev_t. This allows
us to only intercept a very specific subset of mknod() syscalls.
Furthermore, mknod() is not possible in user namespaces toto coelo and
so intercepting and denying syscalls that are not in the whitelist on
accident is not a big deal. The watchee won't notice a difference.

In contrast to mknod(), a lot of other syscall we intercept (e.g.
setxattr()) cannot be easily filtered like mknod() because they have
pointer arguments. Additionally, some of them might actually succeed in
user namespaces (e.g. setxattr() for all "user.*" xattrs). Since we
currently cannot tell seccomp to continue from a user notifier we are
stuck with performing all of the syscalls in lieu of the container. This
is a huge security liability since it is extremely difficult to
correctly assume all of the necessary privileges of the calling task
such that the syscall can be successfully emulated without escaping
other additional security restrictions (think missing CAP_MKNOD for
mknod(), or MS_NODEV on a filesystem etc.). This can be solved by
telling seccomp to resume the syscall.

One thing that came up in the discussion was the problem that another
thread could change the memory after userspace has decided to let the
syscall continue which is a well known TOCTOU with seccomp which is
present in other ways already.
The discussion showed that this feature is already very useful for any
syscall without pointer arguments. For any accidentally intercepted
non-pointer syscall it is safe to continue.
For syscalls with pointer arguments there is a race but for any cautious
userspace and the main usec cases the race doesn't matter. The notifier
is intended to be used in a scenario where a more privileged watcher
supervises the syscalls of lesser privileged watchee to allow it to get
around kernel-enforced limitations by performing the syscall for it
whenever deemed save by the watcher. Hence, if a user tricks the watcher
into allowing a syscall they will either get a deny based on
kernel-enforced restrictions later or they will have changed the
arguments in such a way that they manage to perform a syscall with
arguments that they would've been allowed to do anyway.
In general, it is good to point out again, that the notifier fd was not
intended to allow userspace to implement a security policy but rather to
work around kernel security mechanisms in cases where the watcher knows
that a given action is safe to perform.

/* References */
[1]: https://linuxplumbersconf.org/event/4/contributions/560
[2]: https://linuxplumbersconf.org/event/4/contributions/477
[3]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns33qx@brauner.io
[4]: commit 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Tycho Andersen <tycho@tycho.ws>
CC: Tyler Hicks <tyhicks@canonical.com>
Cc: Jann Horn <jannh@google.com>
---
 include/uapi/linux/seccomp.h |  2 ++
 kernel/seccomp.c             | 24 ++++++++++++++++++++----
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 90734aa5aa36..2c23b9aa6383 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -76,6 +76,8 @@ struct seccomp_notif {
 	struct seccomp_data data;
 };
 
+#define SECCOMP_RET_USER_NOTIF_ALLOW 0x00000001
+
 struct seccomp_notif_resp {
 	__u64 id;
 	__s64 val;
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index dba52a7db5e8..cdb90184d6d7 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -75,6 +75,7 @@ struct seccomp_knotif {
 	/* The return values, only valid when in SECCOMP_NOTIFY_REPLIED */
 	int error;
 	long val;
+	u32 flags;
 
 	/* Signals when this has entered SECCOMP_NOTIFY_REPLIED */
 	struct completion ready;
@@ -732,11 +733,12 @@ static u64 seccomp_next_notify_id(struct seccomp_filter *filter)
 	return filter->notif->next_id++;
 }
 
-static void seccomp_do_user_notification(int this_syscall,
+static bool seccomp_do_user_notification(int this_syscall,
 					 struct seccomp_filter *match,
 					 const struct seccomp_data *sd)
 {
 	int err;
+	u32 flags = 0;
 	long ret = 0;
 	struct seccomp_knotif n = {};
 
@@ -764,6 +766,7 @@ static void seccomp_do_user_notification(int this_syscall,
 	if (err == 0) {
 		ret = n.val;
 		err = n.error;
+		flags = n.flags;
 	}
 
 	/*
@@ -780,8 +783,14 @@ static void seccomp_do_user_notification(int this_syscall,
 		list_del(&n.list);
 out:
 	mutex_unlock(&match->notify_lock);
+
+	/* perform syscall */
+	if (flags & SECCOMP_RET_USER_NOTIF_ALLOW)
+		return false;
+
 	syscall_set_return_value(current, task_pt_regs(current),
 				 err, ret);
+	return true;
 }
 
 static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
@@ -867,8 +876,10 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
 		return 0;
 
 	case SECCOMP_RET_USER_NOTIF:
-		seccomp_do_user_notification(this_syscall, match, sd);
-		goto skip;
+		if (seccomp_do_user_notification(this_syscall, match, sd))
+			goto skip;
+
+		return 0;
 
 	case SECCOMP_RET_LOG:
 		seccomp_log(this_syscall, 0, action, true);
@@ -1087,7 +1098,11 @@ static long seccomp_notify_send(struct seccomp_filter *filter,
 	if (copy_from_user(&resp, buf, sizeof(resp)))
 		return -EFAULT;
 
-	if (resp.flags)
+	if (resp.flags & ~SECCOMP_RET_USER_NOTIF_ALLOW)
+		return -EINVAL;
+
+	if ((resp.flags & SECCOMP_RET_USER_NOTIF_ALLOW) &&
+	    (resp.error || resp.val))
 		return -EINVAL;
 
 	ret = mutex_lock_interruptible(&filter->notify_lock);
@@ -1116,6 +1131,7 @@ static long seccomp_notify_send(struct seccomp_filter *filter,
 	knotif->state = SECCOMP_NOTIFY_REPLIED;
 	knotif->error = resp.error;
 	knotif->val = resp.val;
+	knotif->flags = resp.flags;
 	complete(&knotif->ready);
 out:
 	mutex_unlock(&filter->notify_lock);
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/4] seccomp: add two missing ptrace ifdefines
  2019-09-18  8:48 [PATCH 0/4] seccomp: continue syscall from notifier Christian Brauner
  2019-09-18  8:48 ` [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW Christian Brauner
@ 2019-09-18  8:48 ` Christian Brauner
  2019-09-18  9:15   ` Tyler Hicks
  2019-09-18  8:48 ` [PATCH 3/4] seccomp: avoid overflow in implicit constant conversion Christian Brauner
  2019-09-18  8:48 ` [PATCH 4/4] seccomp: test SECCOMP_RET_USER_NOTIF_ALLOW Christian Brauner
  3 siblings, 1 reply; 16+ messages in thread
From: Christian Brauner @ 2019-09-18  8:48 UTC (permalink / raw)
  To: keescook, luto
  Cc: jannh, wad, shuah, ast, daniel, kafai, songliubraving, yhs,
	linux-kernel, linux-kselftest, netdev, bpf, Christian Brauner,
	Tycho Andersen, Tyler Hicks, stable

Add tw missing ptrace ifdefines to avoid compilation errors on systems
that do not provide PTRACE_EVENTMSG_SYSCALL_ENTRY or
PTRACE_EVENTMSG_SYSCALL_EXIT or:

gcc -Wl,-no-as-needed -Wall  seccomp_bpf.c -lpthread -o seccomp_bpf
In file included from seccomp_bpf.c:52:0:
seccomp_bpf.c: In function ‘tracer_ptrace’:
seccomp_bpf.c:1792:20: error: ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’ undeclared (first use in this function); did you mean ‘PTRACE_EVENT_CLONE’?
  EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
                    ^
../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
  __typeof__(_expected) __exp = (_expected); \
             ^~~~~~~~~
seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
  EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
  ^~~~~~~~~
seccomp_bpf.c:1792:20: note: each undeclared identifier is reported only once for each function it appears in
  EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
                    ^
../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
  __typeof__(_expected) __exp = (_expected); \
             ^~~~~~~~~
seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
  EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
  ^~~~~~~~~
seccomp_bpf.c:1793:6: error: ‘PTRACE_EVENTMSG_SYSCALL_EXIT’ undeclared (first use in this function); did you mean ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’?
    : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
      ^
../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
  __typeof__(_expected) __exp = (_expected); \
             ^~~~~~~~~
seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
  EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
  ^~~~~~~~~

Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: Tycho Andersen <tycho@tycho.ws>
CC: Tyler Hicks <tyhicks@canonical.com>
Cc: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 6ef7f16c4cf5..ee52eab01800 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -155,6 +155,14 @@ struct seccomp_data {
 #ifndef PTRACE_SECCOMP_GET_METADATA
 #define PTRACE_SECCOMP_GET_METADATA	0x420d
 
+#ifndef PTRACE_EVENTMSG_SYSCALL_ENTRY
+#define PTRACE_EVENTMSG_SYSCALL_ENTRY 1
+#endif
+
+#ifndef PTRACE_EVENTMSG_SYSCALL_EXIT
+#define PTRACE_EVENTMSG_SYSCALL_EXIT 2
+#endif
+
 struct seccomp_metadata {
 	__u64 filter_off;       /* Input: which filter */
 	__u64 flags;             /* Output: filter's flags */
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 3/4] seccomp: avoid overflow in implicit constant conversion
  2019-09-18  8:48 [PATCH 0/4] seccomp: continue syscall from notifier Christian Brauner
  2019-09-18  8:48 ` [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW Christian Brauner
  2019-09-18  8:48 ` [PATCH 2/4] seccomp: add two missing ptrace ifdefines Christian Brauner
@ 2019-09-18  8:48 ` Christian Brauner
  2019-09-18 10:01   ` Tyler Hicks
  2019-09-18  8:48 ` [PATCH 4/4] seccomp: test SECCOMP_RET_USER_NOTIF_ALLOW Christian Brauner
  3 siblings, 1 reply; 16+ messages in thread
From: Christian Brauner @ 2019-09-18  8:48 UTC (permalink / raw)
  To: keescook, luto
  Cc: jannh, wad, shuah, ast, daniel, kafai, songliubraving, yhs,
	linux-kernel, linux-kselftest, netdev, bpf, Christian Brauner,
	Tycho Andersen, Tyler Hicks, stable

USER_NOTIF_MAGIC is assigned to int variables in this test so set it to INT_MAX
to avoid warnings:

seccomp_bpf.c: In function ‘user_notification_continue’:
seccomp_bpf.c:3088:26: warning: overflow in implicit constant conversion [-Woverflow]
 #define USER_NOTIF_MAGIC 116983961184613L
                          ^
seccomp_bpf.c:3572:15: note: in expansion of macro ‘USER_NOTIF_MAGIC’
  resp.error = USER_NOTIF_MAGIC;
               ^~~~~~~~~~~~~~~~

Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: Tycho Andersen <tycho@tycho.ws>
CC: Tyler Hicks <tyhicks@canonical.com>
Cc: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index ee52eab01800..921f0e26f835 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -35,6 +35,7 @@
 #include <stdbool.h>
 #include <string.h>
 #include <time.h>
+#include <limits.h>
 #include <linux/elf.h>
 #include <sys/uio.h>
 #include <sys/utsname.h>
@@ -3080,7 +3081,7 @@ static int user_trap_syscall(int nr, unsigned int flags)
 	return seccomp(SECCOMP_SET_MODE_FILTER, flags, &prog);
 }
 
-#define USER_NOTIF_MAGIC 116983961184613L
+#define USER_NOTIF_MAGIC INT_MAX
 TEST(user_notification_basic)
 {
 	pid_t pid;
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 4/4] seccomp: test SECCOMP_RET_USER_NOTIF_ALLOW
  2019-09-18  8:48 [PATCH 0/4] seccomp: continue syscall from notifier Christian Brauner
                   ` (2 preceding siblings ...)
  2019-09-18  8:48 ` [PATCH 3/4] seccomp: avoid overflow in implicit constant conversion Christian Brauner
@ 2019-09-18  8:48 ` Christian Brauner
  3 siblings, 0 replies; 16+ messages in thread
From: Christian Brauner @ 2019-09-18  8:48 UTC (permalink / raw)
  To: keescook, luto
  Cc: jannh, wad, shuah, ast, daniel, kafai, songliubraving, yhs,
	linux-kernel, linux-kselftest, netdev, bpf, Christian Brauner,
	Tycho Andersen, Tyler Hicks, stable

Test whether a syscall can be performed after having been intercepted by
the seccomp notifier. The test uses dup() and kcmp() since it allows us to
nicely test whether the dup() syscall actually succeeded by comparing whether
the fd refers to the same underlying struct file.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: Tycho Andersen <tycho@tycho.ws>
CC: Tyler Hicks <tyhicks@canonical.com>
Cc: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 99 +++++++++++++++++++
 1 file changed, 99 insertions(+)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 921f0e26f835..788d7e9007d5 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -44,6 +44,7 @@
 #include <sys/times.h>
 #include <sys/socket.h>
 #include <sys/ioctl.h>
+#include <linux/kcmp.h>
 
 #include <unistd.h>
 #include <sys/syscall.h>
@@ -175,6 +176,10 @@ struct seccomp_metadata {
 
 #define SECCOMP_RET_USER_NOTIF 0x7fc00000U
 
+#ifndef SECCOMP_RET_USER_NOTIF_ALLOW
+#define SECCOMP_RET_USER_NOTIF_ALLOW 0x00000001
+#endif
+
 #define SECCOMP_IOC_MAGIC		'!'
 #define SECCOMP_IO(nr)			_IO(SECCOMP_IOC_MAGIC, nr)
 #define SECCOMP_IOR(nr, type)		_IOR(SECCOMP_IOC_MAGIC, nr, type)
@@ -3489,6 +3494,100 @@ TEST(seccomp_get_notif_sizes)
 	EXPECT_EQ(sizes.seccomp_notif_resp, sizeof(struct seccomp_notif_resp));
 }
 
+static int filecmp(pid_t pid1, pid_t pid2, int fd1, int fd2)
+{
+#ifdef __NR_kcmp
+	return syscall(__NR_kcmp, pid1, pid2, KCMP_FILE, fd1, fd2);
+#else
+	errno = ENOSYS;
+	return -1;
+#endif
+}
+
+TEST(user_notification_continue)
+{
+	pid_t pid;
+	long ret;
+	int status, listener;
+	struct seccomp_notif req = {};
+	struct seccomp_notif_resp resp = {};
+	struct pollfd pollfd;
+
+	ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+	ASSERT_EQ(0, ret) {
+		TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS!");
+	}
+
+	listener = user_trap_syscall(__NR_dup, SECCOMP_FILTER_FLAG_NEW_LISTENER);
+	ASSERT_GE(listener, 0);
+
+	pid = fork();
+	ASSERT_GE(pid, 0);
+
+	if (pid == 0) {
+		int dup_fd, pipe_fds[2];
+		pid_t self;
+
+		ret = pipe(pipe_fds);
+		if (ret < 0)
+			exit(EXIT_FAILURE);
+
+		dup_fd = dup(pipe_fds[0]);
+		if (dup_fd < 0)
+			exit(EXIT_FAILURE);
+
+		self = getpid();
+
+		ret = filecmp(self, self, pipe_fds[0], dup_fd);
+		if (ret)
+			exit(EXIT_FAILURE);
+
+		exit(EXIT_SUCCESS);
+	}
+
+	pollfd.fd = listener;
+	pollfd.events = POLLIN | POLLOUT;
+
+	EXPECT_GT(poll(&pollfd, 1, -1), 0);
+	EXPECT_EQ(pollfd.revents, POLLIN);
+
+	EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_RECV, &req), 0);
+
+	pollfd.fd = listener;
+	pollfd.events = POLLIN | POLLOUT;
+
+	EXPECT_GT(poll(&pollfd, 1, -1), 0);
+	EXPECT_EQ(pollfd.revents, POLLOUT);
+
+	EXPECT_EQ(req.data.nr, __NR_dup);
+
+	resp.id = req.id;
+	resp.flags = SECCOMP_RET_USER_NOTIF_ALLOW;
+
+	/* check that if (flags & SECCOMP_RET_USER_NOTIF_ALLOW) the rest is 0 */
+	resp.error = 0;
+	resp.val = USER_NOTIF_MAGIC;
+	EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, &resp), -1);
+	EXPECT_EQ(errno, EINVAL);
+
+	resp.error = USER_NOTIF_MAGIC;
+	resp.val = 0;
+	EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, &resp), -1);
+	EXPECT_EQ(errno, EINVAL);
+
+	resp.error = 0;
+	resp.val = 0;
+	EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, &resp), 0) {
+		if (errno == EINVAL)
+			XFAIL(goto skip, "Kernel does not support SECCOMP_RET_USER_NOTIF_ALLOW");
+	}
+
+skip:
+	EXPECT_EQ(waitpid(pid, &status, 0), pid);
+	EXPECT_EQ(true, WIFEXITED(status));
+	EXPECT_EQ(0, WEXITSTATUS(status));
+}
+
 /*
  * TODO:
  * - add microbenchmarks
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] seccomp: add two missing ptrace ifdefines
  2019-09-18  8:48 ` [PATCH 2/4] seccomp: add two missing ptrace ifdefines Christian Brauner
@ 2019-09-18  9:15   ` Tyler Hicks
  2019-09-18 17:33     ` Kees Cook
  0 siblings, 1 reply; 16+ messages in thread
From: Tyler Hicks @ 2019-09-18  9:15 UTC (permalink / raw)
  To: Christian Brauner
  Cc: keescook, luto, jannh, wad, shuah, ast, daniel, kafai,
	songliubraving, yhs, linux-kernel, linux-kselftest, netdev, bpf,
	Tycho Andersen, stable

On 2019-09-18 10:48:31, Christian Brauner wrote:
> Add tw missing ptrace ifdefines to avoid compilation errors on systems
> that do not provide PTRACE_EVENTMSG_SYSCALL_ENTRY or
> PTRACE_EVENTMSG_SYSCALL_EXIT or:
> 
> gcc -Wl,-no-as-needed -Wall  seccomp_bpf.c -lpthread -o seccomp_bpf
> In file included from seccomp_bpf.c:52:0:
> seccomp_bpf.c: In function ‘tracer_ptrace’:
> seccomp_bpf.c:1792:20: error: ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’ undeclared (first use in this function); did you mean ‘PTRACE_EVENT_CLONE’?
>   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
>                     ^
> ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
>   __typeof__(_expected) __exp = (_expected); \
>              ^~~~~~~~~
> seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
>   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
>   ^~~~~~~~~
> seccomp_bpf.c:1792:20: note: each undeclared identifier is reported only once for each function it appears in
>   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
>                     ^
> ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
>   __typeof__(_expected) __exp = (_expected); \
>              ^~~~~~~~~
> seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
>   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
>   ^~~~~~~~~
> seccomp_bpf.c:1793:6: error: ‘PTRACE_EVENTMSG_SYSCALL_EXIT’ undeclared (first use in this function); did you mean ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’?
>     : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
>       ^
> ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
>   __typeof__(_expected) __exp = (_expected); \
>              ^~~~~~~~~
> seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
>   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
>   ^~~~~~~~~
> 
> Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")

I think this Fixes line is incorrect and should be changed to:

Fixes: 201766a20e30 ("ptrace: add PTRACE_GET_SYSCALL_INFO request")

With that changed,

Reviewed-by: Tyler Hicks <tyhicks@canonical.com>

Tyler

> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Will Drewry <wad@chromium.org>
> Cc: Shuah Khan <shuah@kernel.org>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Martin KaFai Lau <kafai@fb.com>
> Cc: Song Liu <songliubraving@fb.com>
> Cc: Yonghong Song <yhs@fb.com>
> Cc: Tycho Andersen <tycho@tycho.ws>
> CC: Tyler Hicks <tyhicks@canonical.com>
> Cc: Jann Horn <jannh@google.com>
> Cc: stable@vger.kernel.org
> Cc: linux-kselftest@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Cc: bpf@vger.kernel.org
> ---
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 6ef7f16c4cf5..ee52eab01800 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -155,6 +155,14 @@ struct seccomp_data {
>  #ifndef PTRACE_SECCOMP_GET_METADATA
>  #define PTRACE_SECCOMP_GET_METADATA	0x420d
>  
> +#ifndef PTRACE_EVENTMSG_SYSCALL_ENTRY
> +#define PTRACE_EVENTMSG_SYSCALL_ENTRY 1
> +#endif
> +
> +#ifndef PTRACE_EVENTMSG_SYSCALL_EXIT
> +#define PTRACE_EVENTMSG_SYSCALL_EXIT 2
> +#endif
> +
>  struct seccomp_metadata {
>  	__u64 filter_off;       /* Input: which filter */
>  	__u64 flags;             /* Output: filter's flags */
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] seccomp: avoid overflow in implicit constant conversion
  2019-09-18  8:48 ` [PATCH 3/4] seccomp: avoid overflow in implicit constant conversion Christian Brauner
@ 2019-09-18 10:01   ` Tyler Hicks
  0 siblings, 0 replies; 16+ messages in thread
From: Tyler Hicks @ 2019-09-18 10:01 UTC (permalink / raw)
  To: Christian Brauner
  Cc: keescook, luto, jannh, wad, shuah, ast, daniel, kafai,
	songliubraving, yhs, linux-kernel, linux-kselftest, netdev, bpf,
	Tycho Andersen, stable

On 2019-09-18 10:48:32, Christian Brauner wrote:
> USER_NOTIF_MAGIC is assigned to int variables in this test so set it to INT_MAX
> to avoid warnings:
> 
> seccomp_bpf.c: In function ‘user_notification_continue’:
> seccomp_bpf.c:3088:26: warning: overflow in implicit constant conversion [-Woverflow]
>  #define USER_NOTIF_MAGIC 116983961184613L
>                           ^
> seccomp_bpf.c:3572:15: note: in expansion of macro ‘USER_NOTIF_MAGIC’
>   resp.error = USER_NOTIF_MAGIC;
>                ^~~~~~~~~~~~~~~~
> 
> Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Will Drewry <wad@chromium.org>
> Cc: Shuah Khan <shuah@kernel.org>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Martin KaFai Lau <kafai@fb.com>
> Cc: Song Liu <songliubraving@fb.com>
> Cc: Yonghong Song <yhs@fb.com>
> Cc: Tycho Andersen <tycho@tycho.ws>
> CC: Tyler Hicks <tyhicks@canonical.com>

INT_MAX should be a safe value to use.

Reviewed-by: Tyler Hicks <tyhicks@canonical.com>

Tyler

> Cc: Jann Horn <jannh@google.com>
> Cc: stable@vger.kernel.org
> Cc: linux-kselftest@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Cc: bpf@vger.kernel.org
> ---
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index ee52eab01800..921f0e26f835 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -35,6 +35,7 @@
>  #include <stdbool.h>
>  #include <string.h>
>  #include <time.h>
> +#include <limits.h>
>  #include <linux/elf.h>
>  #include <sys/uio.h>
>  #include <sys/utsname.h>
> @@ -3080,7 +3081,7 @@ static int user_trap_syscall(int nr, unsigned int flags)
>  	return seccomp(SECCOMP_SET_MODE_FILTER, flags, &prog);
>  }
>  
> -#define USER_NOTIF_MAGIC 116983961184613L
> +#define USER_NOTIF_MAGIC INT_MAX
>  TEST(user_notification_basic)
>  {
>  	pid_t pid;
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW
  2019-09-18  8:48 ` [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW Christian Brauner
@ 2019-09-18 17:30   ` Kees Cook
  2019-09-18 18:07     ` Tycho Andersen
  2019-09-19  6:53     ` Christian Brauner
  0 siblings, 2 replies; 16+ messages in thread
From: Kees Cook @ 2019-09-18 17:30 UTC (permalink / raw)
  To: Christian Brauner
  Cc: luto, jannh, wad, shuah, ast, daniel, kafai, songliubraving, yhs,
	linux-kernel, linux-kselftest, netdev, bpf, Tycho Andersen,
	Tyler Hicks

On Wed, Sep 18, 2019 at 10:48:30AM +0200, Christian Brauner wrote:
> This allows the seccomp notifier to continue a syscall. A positive
> discussion about this feature was triggered by a post to the
> ksummit-discuss mailing list (cf. [3]) and took place during KSummit
> (cf. [1]) and again at the containers/checkpoint-restore
> micro-conference at Linux Plumbers.
> 
> Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
> which enables a process (watchee) to retrieve an fd for its seccomp
> filter. This fd can then be handed to another (usually more privileged)
> process (watcher). The watcher will then be able to receive seccomp
> messages about the syscalls having been performed by the watchee.
> 
> This feature is heavily used in some userspace workloads. For example,
> it is currently used to intercept mknod() syscalls in user namespaces
> aka in containers.
> The mknod() syscall can be easily filtered based on dev_t. This allows
> us to only intercept a very specific subset of mknod() syscalls.
> Furthermore, mknod() is not possible in user namespaces toto coelo and
> so intercepting and denying syscalls that are not in the whitelist on
> accident is not a big deal. The watchee won't notice a difference.
> 
> In contrast to mknod(), a lot of other syscall we intercept (e.g.
> setxattr()) cannot be easily filtered like mknod() because they have
> pointer arguments. Additionally, some of them might actually succeed in
> user namespaces (e.g. setxattr() for all "user.*" xattrs). Since we
> currently cannot tell seccomp to continue from a user notifier we are
> stuck with performing all of the syscalls in lieu of the container. This
> is a huge security liability since it is extremely difficult to
> correctly assume all of the necessary privileges of the calling task
> such that the syscall can be successfully emulated without escaping
> other additional security restrictions (think missing CAP_MKNOD for
> mknod(), or MS_NODEV on a filesystem etc.). This can be solved by
> telling seccomp to resume the syscall.
> 
> One thing that came up in the discussion was the problem that another
> thread could change the memory after userspace has decided to let the
> syscall continue which is a well known TOCTOU with seccomp which is
> present in other ways already.
> The discussion showed that this feature is already very useful for any
> syscall without pointer arguments. For any accidentally intercepted
> non-pointer syscall it is safe to continue.
> For syscalls with pointer arguments there is a race but for any cautious
> userspace and the main usec cases the race doesn't matter. The notifier
> is intended to be used in a scenario where a more privileged watcher
> supervises the syscalls of lesser privileged watchee to allow it to get
> around kernel-enforced limitations by performing the syscall for it
> whenever deemed save by the watcher. Hence, if a user tricks the watcher
> into allowing a syscall they will either get a deny based on
> kernel-enforced restrictions later or they will have changed the
> arguments in such a way that they manage to perform a syscall with
> arguments that they would've been allowed to do anyway.
> In general, it is good to point out again, that the notifier fd was not
> intended to allow userspace to implement a security policy but rather to
> work around kernel security mechanisms in cases where the watcher knows
> that a given action is safe to perform.
> 
> /* References */
> [1]: https://linuxplumbersconf.org/event/4/contributions/560
> [2]: https://linuxplumbersconf.org/event/4/contributions/477
> [3]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns33qx@brauner.io
> [4]: commit 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
> 
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Will Drewry <wad@chromium.org>
> Cc: Tycho Andersen <tycho@tycho.ws>
> CC: Tyler Hicks <tyhicks@canonical.com>
> Cc: Jann Horn <jannh@google.com>
> ---
>  include/uapi/linux/seccomp.h |  2 ++
>  kernel/seccomp.c             | 24 ++++++++++++++++++++----
>  2 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> index 90734aa5aa36..2c23b9aa6383 100644
> --- a/include/uapi/linux/seccomp.h
> +++ b/include/uapi/linux/seccomp.h
> @@ -76,6 +76,8 @@ struct seccomp_notif {
>  	struct seccomp_data data;
>  };
>  
> +#define SECCOMP_RET_USER_NOTIF_ALLOW 0x00000001

nit: I'd like to avoid confusion here about what "family" these flags
belong to. "SECCOMP_RET_..." is used for the cBPF filter return action
value, so let's instead call this:

#define SECCOMP_USER_NOTIF_CONTINUE	BIT(0)

I'm thinking of "continue" as slightly different from "allow", in the
sense that I'd like to hint that this doesn't mean arguments could have
been reliably "filtered" via user notification.

And at the same time, please add a giant comment about this in the
header that details the purpose ("check if I should do something on
behalf of the process") and not "is this safe to allow?", due to the
argument parsing ToCToU.

> -static void seccomp_do_user_notification(int this_syscall,
> +static bool seccomp_do_user_notification(int this_syscall,

I'd prefer this stay an "int", just to keep it similar to the other
functions that are checked in __seccomp_filter().

> +	/* perform syscall */

nit: expand this commit to something like "Userspace requests we
continue and perform syscall".

> +	if (flags & SECCOMP_RET_USER_NOTIF_ALLOW)
> +		return false;

return 0;

> +
>  	syscall_set_return_value(current, task_pt_regs(current),
>  				 err, ret);
> +	return true;

return -1;

(This makes it look more like a "skip on failure")

> +	if (resp.flags & ~SECCOMP_RET_USER_NOTIF_ALLOW)
> +		return -EINVAL;
> +
> +	if ((resp.flags & SECCOMP_RET_USER_NOTIF_ALLOW) &&
> +	    (resp.error || resp.val))
>  		return -EINVAL;

Ah yeah, good idea.

Beyond these nits, yes, looks good and should help the usability of this
feature. Thanks for getting it written and tested!

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] seccomp: add two missing ptrace ifdefines
  2019-09-18  9:15   ` Tyler Hicks
@ 2019-09-18 17:33     ` Kees Cook
  2019-09-19 10:42       ` Dmitry V. Levin
  0 siblings, 1 reply; 16+ messages in thread
From: Kees Cook @ 2019-09-18 17:33 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Christian Brauner, luto, jannh, wad, shuah, ast, daniel, kafai,
	songliubraving, yhs, linux-kernel, linux-kselftest, netdev, bpf,
	Tycho Andersen, stable

On Wed, Sep 18, 2019 at 11:15:12AM +0200, Tyler Hicks wrote:
> On 2019-09-18 10:48:31, Christian Brauner wrote:
> > Add tw missing ptrace ifdefines to avoid compilation errors on systems
> > that do not provide PTRACE_EVENTMSG_SYSCALL_ENTRY or
> > PTRACE_EVENTMSG_SYSCALL_EXIT or:
> > 
> > gcc -Wl,-no-as-needed -Wall  seccomp_bpf.c -lpthread -o seccomp_bpf
> > In file included from seccomp_bpf.c:52:0:
> > seccomp_bpf.c: In function ‘tracer_ptrace’:
> > seccomp_bpf.c:1792:20: error: ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’ undeclared (first use in this function); did you mean ‘PTRACE_EVENT_CLONE’?
> >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> >                     ^
> > ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
> >   __typeof__(_expected) __exp = (_expected); \
> >              ^~~~~~~~~
> > seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
> >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> >   ^~~~~~~~~
> > seccomp_bpf.c:1792:20: note: each undeclared identifier is reported only once for each function it appears in
> >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> >                     ^
> > ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
> >   __typeof__(_expected) __exp = (_expected); \
> >              ^~~~~~~~~
> > seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
> >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> >   ^~~~~~~~~
> > seccomp_bpf.c:1793:6: error: ‘PTRACE_EVENTMSG_SYSCALL_EXIT’ undeclared (first use in this function); did you mean ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’?
> >     : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
> >       ^
> > ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
> >   __typeof__(_expected) __exp = (_expected); \
> >              ^~~~~~~~~
> > seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
> >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> >   ^~~~~~~~~
> > 
> > Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
> 
> I think this Fixes line is incorrect and should be changed to:
> 
> Fixes: 201766a20e30 ("ptrace: add PTRACE_GET_SYSCALL_INFO request")
> 
> With that changed,
> 
> Reviewed-by: Tyler Hicks <tyhicks@canonical.com>

This is actually fixed in -next already (and, yes, with the Fixes line
Tyler has mentioned):

https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/commit/?h=next&id=69b2d3c5924273a0ae968d3818210fc57a1b9d07

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW
  2019-09-18 17:30   ` Kees Cook
@ 2019-09-18 18:07     ` Tycho Andersen
  2019-09-19  6:53       ` Christian Brauner
  2019-09-19  6:53     ` Christian Brauner
  1 sibling, 1 reply; 16+ messages in thread
From: Tycho Andersen @ 2019-09-18 18:07 UTC (permalink / raw)
  To: Kees Cook
  Cc: Christian Brauner, luto, jannh, wad, shuah, ast, daniel, kafai,
	songliubraving, yhs, linux-kernel, linux-kselftest, netdev, bpf,
	Tyler Hicks

On Wed, Sep 18, 2019 at 10:30:00AM -0700, Kees Cook wrote:
> On Wed, Sep 18, 2019 at 10:48:30AM +0200, Christian Brauner wrote:
> > This allows the seccomp notifier to continue a syscall. A positive
> > discussion about this feature was triggered by a post to the
> > ksummit-discuss mailing list (cf. [3]) and took place during KSummit
> > (cf. [1]) and again at the containers/checkpoint-restore
> > micro-conference at Linux Plumbers.
> > 
> > Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
> > which enables a process (watchee) to retrieve an fd for its seccomp
> > filter. This fd can then be handed to another (usually more privileged)
> > process (watcher). The watcher will then be able to receive seccomp
> > messages about the syscalls having been performed by the watchee.
> > 
> > This feature is heavily used in some userspace workloads. For example,
> > it is currently used to intercept mknod() syscalls in user namespaces
> > aka in containers.
> > The mknod() syscall can be easily filtered based on dev_t. This allows
> > us to only intercept a very specific subset of mknod() syscalls.
> > Furthermore, mknod() is not possible in user namespaces toto coelo and
> > so intercepting and denying syscalls that are not in the whitelist on
> > accident is not a big deal. The watchee won't notice a difference.
> > 
> > In contrast to mknod(), a lot of other syscall we intercept (e.g.
> > setxattr()) cannot be easily filtered like mknod() because they have
> > pointer arguments. Additionally, some of them might actually succeed in
> > user namespaces (e.g. setxattr() for all "user.*" xattrs). Since we
> > currently cannot tell seccomp to continue from a user notifier we are
> > stuck with performing all of the syscalls in lieu of the container. This
> > is a huge security liability since it is extremely difficult to
> > correctly assume all of the necessary privileges of the calling task
> > such that the syscall can be successfully emulated without escaping
> > other additional security restrictions (think missing CAP_MKNOD for
> > mknod(), or MS_NODEV on a filesystem etc.). This can be solved by
> > telling seccomp to resume the syscall.
> > 
> > One thing that came up in the discussion was the problem that another
> > thread could change the memory after userspace has decided to let the
> > syscall continue which is a well known TOCTOU with seccomp which is
> > present in other ways already.
> > The discussion showed that this feature is already very useful for any
> > syscall without pointer arguments. For any accidentally intercepted
> > non-pointer syscall it is safe to continue.
> > For syscalls with pointer arguments there is a race but for any cautious
> > userspace and the main usec cases the race doesn't matter. The notifier
> > is intended to be used in a scenario where a more privileged watcher
> > supervises the syscalls of lesser privileged watchee to allow it to get
> > around kernel-enforced limitations by performing the syscall for it
> > whenever deemed save by the watcher. Hence, if a user tricks the watcher
> > into allowing a syscall they will either get a deny based on
> > kernel-enforced restrictions later or they will have changed the
> > arguments in such a way that they manage to perform a syscall with
> > arguments that they would've been allowed to do anyway.
> > In general, it is good to point out again, that the notifier fd was not
> > intended to allow userspace to implement a security policy but rather to
> > work around kernel security mechanisms in cases where the watcher knows
> > that a given action is safe to perform.
> > 
> > /* References */
> > [1]: https://linuxplumbersconf.org/event/4/contributions/560
> > [2]: https://linuxplumbersconf.org/event/4/contributions/477
> > [3]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns33qx@brauner.io
> > [4]: commit 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
> > 
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Andy Lutomirski <luto@amacapital.net>
> > Cc: Will Drewry <wad@chromium.org>
> > Cc: Tycho Andersen <tycho@tycho.ws>
> > CC: Tyler Hicks <tyhicks@canonical.com>
> > Cc: Jann Horn <jannh@google.com>
> > ---
> >  include/uapi/linux/seccomp.h |  2 ++
> >  kernel/seccomp.c             | 24 ++++++++++++++++++++----
> >  2 files changed, 22 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> > index 90734aa5aa36..2c23b9aa6383 100644
> > --- a/include/uapi/linux/seccomp.h
> > +++ b/include/uapi/linux/seccomp.h
> > @@ -76,6 +76,8 @@ struct seccomp_notif {
> >  	struct seccomp_data data;
> >  };
> >  
> > +#define SECCOMP_RET_USER_NOTIF_ALLOW 0x00000001
> 
> nit: I'd like to avoid confusion here about what "family" these flags
> belong to. "SECCOMP_RET_..." is used for the cBPF filter return action
> value, so let's instead call this:
> 
> #define SECCOMP_USER_NOTIF_CONTINUE	BIT(0)

+1, I was thinking maybe even SECCOMP_USER_NOTIF_FLAG_CONTINUE.

But the whole series (minus the patch that already exists) looks good
to me if we make this change:

Reviewed-by: Tycho Andersen <tycho@tycho.ws>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW
  2019-09-18 17:30   ` Kees Cook
  2019-09-18 18:07     ` Tycho Andersen
@ 2019-09-19  6:53     ` Christian Brauner
  1 sibling, 0 replies; 16+ messages in thread
From: Christian Brauner @ 2019-09-19  6:53 UTC (permalink / raw)
  To: Kees Cook
  Cc: luto, jannh, wad, shuah, ast, daniel, kafai, songliubraving, yhs,
	linux-kernel, linux-kselftest, netdev, bpf, Tycho Andersen,
	Tyler Hicks

On Wed, Sep 18, 2019 at 10:30:00AM -0700, Kees Cook wrote:
> On Wed, Sep 18, 2019 at 10:48:30AM +0200, Christian Brauner wrote:
> > This allows the seccomp notifier to continue a syscall. A positive
> > discussion about this feature was triggered by a post to the
> > ksummit-discuss mailing list (cf. [3]) and took place during KSummit
> > (cf. [1]) and again at the containers/checkpoint-restore
> > micro-conference at Linux Plumbers.
> > 
> > Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
> > which enables a process (watchee) to retrieve an fd for its seccomp
> > filter. This fd can then be handed to another (usually more privileged)
> > process (watcher). The watcher will then be able to receive seccomp
> > messages about the syscalls having been performed by the watchee.
> > 
> > This feature is heavily used in some userspace workloads. For example,
> > it is currently used to intercept mknod() syscalls in user namespaces
> > aka in containers.
> > The mknod() syscall can be easily filtered based on dev_t. This allows
> > us to only intercept a very specific subset of mknod() syscalls.
> > Furthermore, mknod() is not possible in user namespaces toto coelo and
> > so intercepting and denying syscalls that are not in the whitelist on
> > accident is not a big deal. The watchee won't notice a difference.
> > 
> > In contrast to mknod(), a lot of other syscall we intercept (e.g.
> > setxattr()) cannot be easily filtered like mknod() because they have
> > pointer arguments. Additionally, some of them might actually succeed in
> > user namespaces (e.g. setxattr() for all "user.*" xattrs). Since we
> > currently cannot tell seccomp to continue from a user notifier we are
> > stuck with performing all of the syscalls in lieu of the container. This
> > is a huge security liability since it is extremely difficult to
> > correctly assume all of the necessary privileges of the calling task
> > such that the syscall can be successfully emulated without escaping
> > other additional security restrictions (think missing CAP_MKNOD for
> > mknod(), or MS_NODEV on a filesystem etc.). This can be solved by
> > telling seccomp to resume the syscall.
> > 
> > One thing that came up in the discussion was the problem that another
> > thread could change the memory after userspace has decided to let the
> > syscall continue which is a well known TOCTOU with seccomp which is
> > present in other ways already.
> > The discussion showed that this feature is already very useful for any
> > syscall without pointer arguments. For any accidentally intercepted
> > non-pointer syscall it is safe to continue.
> > For syscalls with pointer arguments there is a race but for any cautious
> > userspace and the main usec cases the race doesn't matter. The notifier
> > is intended to be used in a scenario where a more privileged watcher
> > supervises the syscalls of lesser privileged watchee to allow it to get
> > around kernel-enforced limitations by performing the syscall for it
> > whenever deemed save by the watcher. Hence, if a user tricks the watcher
> > into allowing a syscall they will either get a deny based on
> > kernel-enforced restrictions later or they will have changed the
> > arguments in such a way that they manage to perform a syscall with
> > arguments that they would've been allowed to do anyway.
> > In general, it is good to point out again, that the notifier fd was not
> > intended to allow userspace to implement a security policy but rather to
> > work around kernel security mechanisms in cases where the watcher knows
> > that a given action is safe to perform.
> > 
> > /* References */
> > [1]: https://linuxplumbersconf.org/event/4/contributions/560
> > [2]: https://linuxplumbersconf.org/event/4/contributions/477
> > [3]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns33qx@brauner.io
> > [4]: commit 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
> > 
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Andy Lutomirski <luto@amacapital.net>
> > Cc: Will Drewry <wad@chromium.org>
> > Cc: Tycho Andersen <tycho@tycho.ws>
> > CC: Tyler Hicks <tyhicks@canonical.com>
> > Cc: Jann Horn <jannh@google.com>
> > ---
> >  include/uapi/linux/seccomp.h |  2 ++
> >  kernel/seccomp.c             | 24 ++++++++++++++++++++----
> >  2 files changed, 22 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> > index 90734aa5aa36..2c23b9aa6383 100644
> > --- a/include/uapi/linux/seccomp.h
> > +++ b/include/uapi/linux/seccomp.h
> > @@ -76,6 +76,8 @@ struct seccomp_notif {
> >  	struct seccomp_data data;
> >  };
> >  
> > +#define SECCOMP_RET_USER_NOTIF_ALLOW 0x00000001
> 
> nit: I'd like to avoid confusion here about what "family" these flags
> belong to. "SECCOMP_RET_..." is used for the cBPF filter return action
> value, so let's instead call this:
> 
> #define SECCOMP_USER_NOTIF_CONTINUE	BIT(0)

Ack.

> 
> I'm thinking of "continue" as slightly different from "allow", in the
> sense that I'd like to hint that this doesn't mean arguments could have
> been reliably "filtered" via user notification.

Good point.

> 
> And at the same time, please add a giant comment about this in the
> header that details the purpose ("check if I should do something on
> behalf of the process") and not "is this safe to allow?", due to the
> argument parsing ToCToU.

Yeah, I'll copy parts of what I described in the commit message down
into the code.

> 
> > -static void seccomp_do_user_notification(int this_syscall,
> > +static bool seccomp_do_user_notification(int this_syscall,
> 
> I'd prefer this stay an "int", just to keep it similar to the other
> functions that are checked in __seccomp_filter().

Ack.

> 
> > +	/* perform syscall */
> 
> nit: expand this commit to something like "Userspace requests we
> continue and perform syscall".

Ack.

> 
> > +	if (flags & SECCOMP_RET_USER_NOTIF_ALLOW)
> > +		return false;
> 
> return 0;
> 
> > +
> >  	syscall_set_return_value(current, task_pt_regs(current),
> >  				 err, ret);
> > +	return true;
> 
> return -1;
> 
> (This makes it look more like a "skip on failure")

Ack.

> 
> > +	if (resp.flags & ~SECCOMP_RET_USER_NOTIF_ALLOW)
> > +		return -EINVAL;
> > +
> > +	if ((resp.flags & SECCOMP_RET_USER_NOTIF_ALLOW) &&
> > +	    (resp.error || resp.val))
> >  		return -EINVAL;
> 
> Ah yeah, good idea.
> 
> Beyond these nits, yes, looks good and should help the usability of this
> feature. Thanks for getting it written and tested!

Will rework and resend!

Thanks for the review! :)
Christian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW
  2019-09-18 18:07     ` Tycho Andersen
@ 2019-09-19  6:53       ` Christian Brauner
  0 siblings, 0 replies; 16+ messages in thread
From: Christian Brauner @ 2019-09-19  6:53 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: Kees Cook, luto, jannh, wad, shuah, ast, daniel, kafai,
	songliubraving, yhs, linux-kernel, linux-kselftest, netdev, bpf,
	Tyler Hicks

On Wed, Sep 18, 2019 at 12:07:12PM -0600, Tycho Andersen wrote:
> On Wed, Sep 18, 2019 at 10:30:00AM -0700, Kees Cook wrote:
> > On Wed, Sep 18, 2019 at 10:48:30AM +0200, Christian Brauner wrote:
> > > This allows the seccomp notifier to continue a syscall. A positive
> > > discussion about this feature was triggered by a post to the
> > > ksummit-discuss mailing list (cf. [3]) and took place during KSummit
> > > (cf. [1]) and again at the containers/checkpoint-restore
> > > micro-conference at Linux Plumbers.
> > > 
> > > Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
> > > which enables a process (watchee) to retrieve an fd for its seccomp
> > > filter. This fd can then be handed to another (usually more privileged)
> > > process (watcher). The watcher will then be able to receive seccomp
> > > messages about the syscalls having been performed by the watchee.
> > > 
> > > This feature is heavily used in some userspace workloads. For example,
> > > it is currently used to intercept mknod() syscalls in user namespaces
> > > aka in containers.
> > > The mknod() syscall can be easily filtered based on dev_t. This allows
> > > us to only intercept a very specific subset of mknod() syscalls.
> > > Furthermore, mknod() is not possible in user namespaces toto coelo and
> > > so intercepting and denying syscalls that are not in the whitelist on
> > > accident is not a big deal. The watchee won't notice a difference.
> > > 
> > > In contrast to mknod(), a lot of other syscall we intercept (e.g.
> > > setxattr()) cannot be easily filtered like mknod() because they have
> > > pointer arguments. Additionally, some of them might actually succeed in
> > > user namespaces (e.g. setxattr() for all "user.*" xattrs). Since we
> > > currently cannot tell seccomp to continue from a user notifier we are
> > > stuck with performing all of the syscalls in lieu of the container. This
> > > is a huge security liability since it is extremely difficult to
> > > correctly assume all of the necessary privileges of the calling task
> > > such that the syscall can be successfully emulated without escaping
> > > other additional security restrictions (think missing CAP_MKNOD for
> > > mknod(), or MS_NODEV on a filesystem etc.). This can be solved by
> > > telling seccomp to resume the syscall.
> > > 
> > > One thing that came up in the discussion was the problem that another
> > > thread could change the memory after userspace has decided to let the
> > > syscall continue which is a well known TOCTOU with seccomp which is
> > > present in other ways already.
> > > The discussion showed that this feature is already very useful for any
> > > syscall without pointer arguments. For any accidentally intercepted
> > > non-pointer syscall it is safe to continue.
> > > For syscalls with pointer arguments there is a race but for any cautious
> > > userspace and the main usec cases the race doesn't matter. The notifier
> > > is intended to be used in a scenario where a more privileged watcher
> > > supervises the syscalls of lesser privileged watchee to allow it to get
> > > around kernel-enforced limitations by performing the syscall for it
> > > whenever deemed save by the watcher. Hence, if a user tricks the watcher
> > > into allowing a syscall they will either get a deny based on
> > > kernel-enforced restrictions later or they will have changed the
> > > arguments in such a way that they manage to perform a syscall with
> > > arguments that they would've been allowed to do anyway.
> > > In general, it is good to point out again, that the notifier fd was not
> > > intended to allow userspace to implement a security policy but rather to
> > > work around kernel security mechanisms in cases where the watcher knows
> > > that a given action is safe to perform.
> > > 
> > > /* References */
> > > [1]: https://linuxplumbersconf.org/event/4/contributions/560
> > > [2]: https://linuxplumbersconf.org/event/4/contributions/477
> > > [3]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns33qx@brauner.io
> > > [4]: commit 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
> > > 
> > > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > > Cc: Kees Cook <keescook@chromium.org>
> > > Cc: Andy Lutomirski <luto@amacapital.net>
> > > Cc: Will Drewry <wad@chromium.org>
> > > Cc: Tycho Andersen <tycho@tycho.ws>
> > > CC: Tyler Hicks <tyhicks@canonical.com>
> > > Cc: Jann Horn <jannh@google.com>
> > > ---
> > >  include/uapi/linux/seccomp.h |  2 ++
> > >  kernel/seccomp.c             | 24 ++++++++++++++++++++----
> > >  2 files changed, 22 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> > > index 90734aa5aa36..2c23b9aa6383 100644
> > > --- a/include/uapi/linux/seccomp.h
> > > +++ b/include/uapi/linux/seccomp.h
> > > @@ -76,6 +76,8 @@ struct seccomp_notif {
> > >  	struct seccomp_data data;
> > >  };
> > >  
> > > +#define SECCOMP_RET_USER_NOTIF_ALLOW 0x00000001
> > 
> > nit: I'd like to avoid confusion here about what "family" these flags
> > belong to. "SECCOMP_RET_..." is used for the cBPF filter return action
> > value, so let's instead call this:
> > 
> > #define SECCOMP_USER_NOTIF_CONTINUE	BIT(0)
> 
> +1, I was thinking maybe even SECCOMP_USER_NOTIF_FLAG_CONTINUE.

I'll flip a coin between yours and Kees suggestion. :)

> 
> But the whole series (minus the patch that already exists) looks good
> to me if we make this change:
> 
> Reviewed-by: Tycho Andersen <tycho@tycho.ws>

Thanks for the review! :)
Christian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] seccomp: add two missing ptrace ifdefines
  2019-09-18 17:33     ` Kees Cook
@ 2019-09-19 10:42       ` Dmitry V. Levin
  2019-09-19 16:55         ` Kees Cook
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry V. Levin @ 2019-09-19 10:42 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tyler Hicks, Christian Brauner, luto, jannh, wad, shuah, ast,
	daniel, kafai, songliubraving, yhs, linux-kernel,
	linux-kselftest, netdev, bpf, Tycho Andersen, stable

On Wed, Sep 18, 2019 at 10:33:09AM -0700, Kees Cook wrote:
> On Wed, Sep 18, 2019 at 11:15:12AM +0200, Tyler Hicks wrote:
> > On 2019-09-18 10:48:31, Christian Brauner wrote:
> > > Add tw missing ptrace ifdefines to avoid compilation errors on systems
> > > that do not provide PTRACE_EVENTMSG_SYSCALL_ENTRY or
> > > PTRACE_EVENTMSG_SYSCALL_EXIT or:
> > > 
> > > gcc -Wl,-no-as-needed -Wall  seccomp_bpf.c -lpthread -o seccomp_bpf
> > > In file included from seccomp_bpf.c:52:0:
> > > seccomp_bpf.c: In function ‘tracer_ptrace’:
> > > seccomp_bpf.c:1792:20: error: ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’ undeclared (first use in this function); did you mean ‘PTRACE_EVENT_CLONE’?
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > >                     ^
> > > ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
> > >   __typeof__(_expected) __exp = (_expected); \
> > >              ^~~~~~~~~
> > > seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > >   ^~~~~~~~~
> > > seccomp_bpf.c:1792:20: note: each undeclared identifier is reported only once for each function it appears in
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > >                     ^
> > > ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
> > >   __typeof__(_expected) __exp = (_expected); \
> > >              ^~~~~~~~~
> > > seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > >   ^~~~~~~~~
> > > seccomp_bpf.c:1793:6: error: ‘PTRACE_EVENTMSG_SYSCALL_EXIT’ undeclared (first use in this function); did you mean ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’?
> > >     : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
> > >       ^
> > > ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
> > >   __typeof__(_expected) __exp = (_expected); \
> > >              ^~~~~~~~~
> > > seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > >   ^~~~~~~~~
> > > 
> > > Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
> > 
> > I think this Fixes line is incorrect and should be changed to:
> > 
> > Fixes: 201766a20e30 ("ptrace: add PTRACE_GET_SYSCALL_INFO request")
> > 
> > With that changed,
> > 
> > Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
> 
> This is actually fixed in -next already (and, yes, with the Fixes line
> Tyler has mentioned):
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/commit/?h=next&id=69b2d3c5924273a0ae968d3818210fc57a1b9d07

Excuse me, does it mean that you expect each selftest to be self-hosted?
I was (and still is) under impression that selftests should be built
with headers installed from the tree. Is it the case, or is it not?


-- 
ldv

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] seccomp: add two missing ptrace ifdefines
  2019-09-19 10:42       ` Dmitry V. Levin
@ 2019-09-19 16:55         ` Kees Cook
  2019-09-19 17:04           ` shuah
  2019-09-19 18:30           ` Dmitry V. Levin
  0 siblings, 2 replies; 16+ messages in thread
From: Kees Cook @ 2019-09-19 16:55 UTC (permalink / raw)
  To: Dmitry V. Levin
  Cc: Tyler Hicks, Christian Brauner, luto, jannh, wad, shuah, ast,
	daniel, kafai, songliubraving, yhs, linux-kernel,
	linux-kselftest, netdev, bpf, Tycho Andersen, stable

On Thu, Sep 19, 2019 at 01:42:51PM +0300, Dmitry V. Levin wrote:
> On Wed, Sep 18, 2019 at 10:33:09AM -0700, Kees Cook wrote:
> > This is actually fixed in -next already (and, yes, with the Fixes line
> > Tyler has mentioned):
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/commit/?h=next&id=69b2d3c5924273a0ae968d3818210fc57a1b9d07
> 
> Excuse me, does it mean that you expect each selftest to be self-hosted?
> I was (and still is) under impression that selftests should be built
> with headers installed from the tree. Is it the case, or is it not?

As you know (but to give others some context) there is a long-standing
bug in the selftest build environment that causes these problems (it
isn't including the uAPI headers) which you'd proposed to be fixed
recently[1]. Did that ever get sent as a "real" patch? I don't see it
in Shuah's tree; can you send it to Shuah?

But even with that fixed, since the seccomp selftest has a history of
being built stand-alone, I've continued to take these kinds of fixes.

-Kees

[1] https://lore.kernel.org/lkml/20190805094719.GA1693@altlinux.org/

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] seccomp: add two missing ptrace ifdefines
  2019-09-19 16:55         ` Kees Cook
@ 2019-09-19 17:04           ` shuah
  2019-09-19 18:30           ` Dmitry V. Levin
  1 sibling, 0 replies; 16+ messages in thread
From: shuah @ 2019-09-19 17:04 UTC (permalink / raw)
  To: Kees Cook, Dmitry V. Levin
  Cc: Tyler Hicks, Christian Brauner, luto, jannh, wad, ast, daniel,
	kafai, songliubraving, yhs, linux-kernel, linux-kselftest,
	netdev, bpf, Tycho Andersen, stable, shuah

On 9/19/19 10:55 AM, Kees Cook wrote:
> On Thu, Sep 19, 2019 at 01:42:51PM +0300, Dmitry V. Levin wrote:
>> On Wed, Sep 18, 2019 at 10:33:09AM -0700, Kees Cook wrote:
>>> This is actually fixed in -next already (and, yes, with the Fixes line
>>> Tyler has mentioned):
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/commit/?h=next&id=69b2d3c5924273a0ae968d3818210fc57a1b9d07
>>
>> Excuse me, does it mean that you expect each selftest to be self-hosted?
>> I was (and still is) under impression that selftests should be built
>> with headers installed from the tree. Is it the case, or is it not?
> 
> As you know (but to give others some context) there is a long-standing
> bug in the selftest build environment that causes these problems (it
> isn't including the uAPI headers) which you'd proposed to be fixed
> recently[1]. Did that ever get sent as a "real" patch? I don't see it
> in Shuah's tree; can you send it to Shuah?
> 
> But even with that fixed, since the seccomp selftest has a history of
> being built stand-alone, I've continued to take these kinds of fixes.
> 
> -Kees
> 
> [1] https://lore.kernel.org/lkml/20190805094719.GA1693@altlinux.org/
> 

It has been sent to kselftest list yesterday. I will pull this in for
my next update.

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] seccomp: add two missing ptrace ifdefines
  2019-09-19 16:55         ` Kees Cook
  2019-09-19 17:04           ` shuah
@ 2019-09-19 18:30           ` Dmitry V. Levin
  1 sibling, 0 replies; 16+ messages in thread
From: Dmitry V. Levin @ 2019-09-19 18:30 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tyler Hicks, Christian Brauner, luto, jannh, wad, shuah, ast,
	daniel, kafai, songliubraving, yhs, linux-kernel,
	linux-kselftest, netdev, bpf, Tycho Andersen, stable

On Thu, Sep 19, 2019 at 09:55:30AM -0700, Kees Cook wrote:
> On Thu, Sep 19, 2019 at 01:42:51PM +0300, Dmitry V. Levin wrote:
> > On Wed, Sep 18, 2019 at 10:33:09AM -0700, Kees Cook wrote:
> > > This is actually fixed in -next already (and, yes, with the Fixes line
> > > Tyler has mentioned):
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/commit/?h=next&id=69b2d3c5924273a0ae968d3818210fc57a1b9d07
> > 
> > Excuse me, does it mean that you expect each selftest to be self-hosted?
> > I was (and still is) under impression that selftests should be built
> > with headers installed from the tree. Is it the case, or is it not?
> 
> As you know (but to give others some context) there is a long-standing
> bug in the selftest build environment that causes these problems (it
> isn't including the uAPI headers) which you'd proposed to be fixed
> recently[1]. Did that ever get sent as a "real" patch? I don't see it
> in Shuah's tree; can you send it to Shuah?
> 
> [1] https://lore.kernel.org/lkml/20190805094719.GA1693@altlinux.org/

The [1] was an idea rather than a patch, it didn't take arch uapi headers
into account.  OK, I'll try to come up with a proper fix then.


-- 
ldv

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, back to index

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-18  8:48 [PATCH 0/4] seccomp: continue syscall from notifier Christian Brauner
2019-09-18  8:48 ` [PATCH 1/4] seccomp: add SECCOMP_RET_USER_NOTIF_ALLOW Christian Brauner
2019-09-18 17:30   ` Kees Cook
2019-09-18 18:07     ` Tycho Andersen
2019-09-19  6:53       ` Christian Brauner
2019-09-19  6:53     ` Christian Brauner
2019-09-18  8:48 ` [PATCH 2/4] seccomp: add two missing ptrace ifdefines Christian Brauner
2019-09-18  9:15   ` Tyler Hicks
2019-09-18 17:33     ` Kees Cook
2019-09-19 10:42       ` Dmitry V. Levin
2019-09-19 16:55         ` Kees Cook
2019-09-19 17:04           ` shuah
2019-09-19 18:30           ` Dmitry V. Levin
2019-09-18  8:48 ` [PATCH 3/4] seccomp: avoid overflow in implicit constant conversion Christian Brauner
2019-09-18 10:01   ` Tyler Hicks
2019-09-18  8:48 ` [PATCH 4/4] seccomp: test SECCOMP_RET_USER_NOTIF_ALLOW Christian Brauner

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org bpf@archiver.kernel.org
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/ public-inbox